Why can't we capture the design of software more effectively?

Question

As engineers, we all "design" artifacts (buildings, programs, circuits, molecules...). That's an activity (design-the-verb) that produces some kind of result (design-the-noun).

I think we all agree that design-the-noun is a different entity than the artifact itself.

A key activity in the software business (indeed, in any business where the resulting product artifact needs to be enhanced) is to understand the "design (the-noun)". Yet we seem, as a community, to be pretty much complete failures at recording it, as evidenced by the amount of effort people put into rediscovering facts about their code base. Ask somebody to show you the design of their code and see what you get.

I think of a design for software as having:

An explicit specification for what the software is supposed to do and how well it does it
An explicit version of the code (this part is easy, everybody has it)
An explanation for how each part of the code serves to achieve the specification (e.g., a relation between spec fragments and code fragments)
A rationale as to why the code is the way it is (e.g., why a particular choice rather than another)
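
As a rough illustration of the four parts listed above, here is a minimal Python sketch of a design record that a tool could interpret. Everything in it is an assumption made for illustration: the class names (`SpecFragment`, `CodeFragment`, `Justification`, `Design`), the `file:symbol` location strings, and the sample entries are hypothetical and do not come from any existing tool or from the scheme hinted at in this question.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class SpecFragment:
    ident: str
    text: str  # what the software must do, or how well it must do it


@dataclass(frozen=True)
class CodeFragment:
    ident: str
    location: str  # hypothetical "file:symbol" reference into the code base


@dataclass(frozen=True)
class Justification:
    """Links one spec fragment to one code fragment and records why."""
    spec_id: str
    code_id: str
    rationale: str
    rejected: Tuple[str, ...] = ()  # alternatives considered and dropped


@dataclass
class Design:
    spec: List[SpecFragment] = field(default_factory=list)
    code: List[CodeFragment] = field(default_factory=list)
    links: List[Justification] = field(default_factory=list)

    def unimplemented_spec(self) -> List[str]:
        """Spec fragments that no code fragment is recorded as serving."""
        covered = {link.spec_id for link in self.links}
        return [s.ident for s in self.spec if s.ident not in covered]

    def unjustified_code(self) -> List[str]:
        """Code fragments with no recorded reason for existing."""
        explained = {link.code_id for link in self.links}
        return [c.ident for c in self.code if c.ident not in explained]


def example_design() -> Design:
    """A tiny, made-up design with one gap on each side of the relation."""
    return Design(
        spec=[
            SpecFragment("S1", "Report the principal frequencies of a signal"),
            SpecFragment("S2", "Respond within 100 ms on average"),
        ],
        code=[
            CodeFragment("C1", "signal.py:principal_frequencies"),
            CodeFragment("C2", "signal.py:print_at_sign"),
        ],
        links=[
            Justification(
                spec_id="S1",
                code_id="C1",
                rationale="Use an FFT to estimate the spectrum in O(n log n)",
                rejected=("Direct DFT: O(n^2) for the same result",),
            ),
        ],
    )
```

Because the record has structure, a tool can ask questions that a text document cannot answer mechanically: `example_design().unimplemented_spec()` lists spec fragments that nothing claims to satisfy, and `unjustified_code()` lists code that nobody has explained.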

What is NOT a design is a particular perspective on the code. For example (not to pick on them specifically), UML diagrams are not designs. Rather, they are properties you can derive from the code or, arguably, properties you wish you could derive from the code. But as a general rule, you can't derive the code from UML.

Why is it that, after 50+ years of building software, we don't have regular ways to express this? (Feel free to contradict me with explicit examples!)

Even if we do, most of the community seems so focused on getting "code" that design-the-noun gets lost anyway. (IMHO, until design becomes the purpose of engineering, with the artifact extracted from the design, we're not going to get around this).

What have you seen as means for recording designs (in the sense I have described)? Explicit references to papers would be good. Why do you think specific and general means have not been successful? How can we change this?

[I have my own ideas that flesh out the bulleted viewpoint above, but I'm interested in other people's answers... and it's hard to implement my scheme [[and maybe that's the real problem :-]] ]

EDIT 2011/1/3: One of the answer threads hints that "documentation" (presumably textual, and in particular not formal) might be adequate. I guess I should clarify that I don't believe this. CASE tools appeared on the scene starting in the 80s, but the early tools mostly just captured pixels for the diagrams of whatever you drew; while the tools were arguably commercially successful, they really weren't very helpful. A key insight was that if the additional "design" artifacts are not formally interpretable, you can't get any serious tool help. I believe the same insight applies to any long-term useful form of design capture: if it hasn't got a formal structure, it won't be of any real use. Text documents pretty much fail this test.

Agreed on UML - a communication tool at best, contributing to the design description, but not in itself being the design. At worst, though, UML is graphical source code. — , Dec 31 '10 at 13:13
When I build systems, I have to meet a lot of "nonfunctional" requirements: coded in *this* language, uses *that* database, handles 1E10 records with 100 ms average response time, ... You can't leave these out of the specification. (Without nonfunctional requirements a forever-loop is an adequate program for any functional spec.) My whole point about "design" capture is to handle another nonfunctional requirement: "maintainable". — Ira Baxter, Dec 31 '10 at 16:49
Your discussion looks interesting, but I'm still not sure what the question is exactly about. Do you think you could give a concrete example to clarify what exactly you are interested in? I'm thinking about something like the FFT example, where you could give the full picture of the 4 bullet points in your question as you see them, and maybe what kind of things you would like to do with the results once captured. — n1ckp, Jan 03 '11 at 00:27
I’ve no idea on the whys of this issue, but it’s the subject of [Fred Brooks’ ‘The Design of Design’](http://www.amazon.co.uk/Design-Essays-Computer-Scientist/dp/0201362988). (Apologies if you’re already familiar.) He discusses examples from programming and architecture. He particularly notes that capturing rationales (for both the design, and rejected alternate designs) is really useful, and not well-served by current tools. — Paul D. Waite, Feb 13 '12 at 08:42

score 9 · Answer 1 · answered Dec 31 '10 at 14:08

I think there are several reasons why we still aren't good at this.

1. For a long time people thought software was like houses, and used processes and ideas from construction. "Software architect" was a title all programmers wanted. During the last ten years the software architect has almost died out. The idea of waterfall processes, where you first have an architect saying how the software should work and look, then get people making diagrams of how it should be constructed, and lastly have code monkeys implementing these nice workflows/UML diagrams to spec, is now widely derided. So in fact, the whole industry was barking up the wrong tree for 40 years.
2. The tools we use constantly change and improve. Programming is a logical puzzle, and we come up with better ideas and techniques to abstract that puzzle and make it understandable. The way we model this must evolve at the same speed, but it lags behind. The improved techniques to abstract the puzzle of programming also mean we can increase the complexity. And so we do. Programming always lies on the edge of the complexity we as programmers can handle.
3. Making ways of describing the program is a sort of abstraction. If we can come up with a good way of abstracting the software, we can also put that abstraction directly into the development tools, and therefore add another abstraction/simplification to the programming. This has happened many times over. Examples of such abstractions are functions, classes and libraries.
4. Therefore, if you have a successful and accurate model of the software, that model will be equivalent to the software. This kind of makes the whole effort pointless, which in turn corroborates point 1 above: modeling the software is much less useful than previously thought. It is instead better to extract data about the software from the code. Creating a UML model from how the code actually looks is much more enlightening than creating a UML model and trying to implement that theoretical model.

answered Dec 31 '10 at 14:08

Lennart Regebro


Don't agree with your last point. Yes, it will be rather equivalent to the software, but it's still easier to redraw a model than to refactor the actual software when (not if) you find out something wasn't such a good idea after all. I wouldn't underestimate the importance of the designing step. – Joris Meys Dec 31 '10 at 14:13
@Joris Meys: The problem is that you won't know what is and what isn't a good idea until you have implemented it. (Except in trivial cases, but then you won't have much use for a UML diagram anyway.) Also, you shouldn't overestimate how hard it is to refactor code. I recommend reading Kent Beck's books on XP and Test Driven Design. – Lennart Regebro Dec 31 '10 at 14:17
@Lennart : thx for the tip – Joris Meys Dec 31 '10 at 14:36
"accurate model..."? I'm interested in *design*; unless you want to define "model" as synonymous with *design*, I don't want to drag in another not-well-defined term. If you insist they are the same, I don't buy your argument that model-cum-design is the code, at least not according to my definition. Where's the specification? Where's the rationale? Maybe you don't accept my definition; in that case, how does anybody get any understanding? What I *will* agree with is that it would be ideal if we could provide all the design elements to our tools. The code would still be separate. – Ira Baxter Dec 31 '10 at 16:26
+1, I will take a fugly, but proven and functional prototype over 200 pages of text and diagrams of what has not been created yet just about every time. Of course, ideally, I would want to leave it as a clean, documented, and universally loved project. – Job Dec 31 '10 at 21:17
@Ira Baxter, Clarification: If you are designing software, in the sense you are clearly talking about, then you need a model, and that model is the central part of the design from the programming point of view (and other parts are mockups and UI designs, etc). The specification is not a part of the design. The design is how you intend to implement the specification. In short: I don't see in your comment any substantial criticism of what I said. – Lennart Regebro Dec 31 '10 at 22:43
@Lennart: I think of "design" as the explanatory relation between what is desired (the specification) and the artifact (the code). If it really is a mathematical relation, you'll have a hard time leaving the specification and the code out, and having anything left for the relation to reference. Having said that, I don't want to quibble about whether the spec/code is in/out of the "design". What I am more interested in is what we should record to describe how and why the artifact is put together. I have trouble with your word "model" because everybody has a different meaning for it. – Ira Baxter Jan 02 '11 at 06:38
@Job: OK, your fugly prototype prints an "@". Is that fundamental and you'll be fired if you take it out? Or is it just some extra junk that leaked into the prototype? I agree you don't want 200 pages of *flaky* text and diagrams. My question is trying to focus on what you *should* have as descriptions that characterize the productized version of your fugly prototype. You want to leave it as "clean and documented"; what do you mean by *documented*? – Ira Baxter Jan 02 '11 at 06:41
@Ira Baxter: Right. If done before the code, the explanatory relation between specification and code is the design. If done after the code it's documentation. :) Your question, of why we aren't good at making formal designs, is what I tried to explain above. I don't see why the verb "model" is problematic in that explanation. Do you get fired for a "@"? That's *specification*, not design. – Lennart Regebro Jan 02 '11 at 06:59
@Lennart: The difference between you and Job is that you seem to agree that a specification expressed somehow might be necessary, although I don't know how your set of currently-programmable abstractions does that. But imagine that I need a signal processing program that extracts the principal frequencies. Notice I didn't suggest how. You might decide to *use* a Fast Fourier Transform, and I'll surely find footprints all over the code that implements FFT bits. But where is the fact that you decided to use an FFT explicitly recorded? I believe you need it unless your readers are all-knowing. – Ira Baxter Jan 02 '11 at 07:12
@Ira Baxter: It's recorded in the code. If the system is complex enough, there might be developers documentation that also would mention it. I don't know what a "specification expressed" means. – Lennart Regebro Jan 02 '11 at 07:19
@Lennart: What is the exact source code phrase you used to record "we chose to implement signal processing with FFT"? I hope you're not going to tell me it's the text, "CALL FFT(...)", nor do I think it reasonable to say, "Oh, I found a butterfly step so it must be using FFT". (By "specification expressed" I meant "specification expressed explicitly" but SO won't let you edit comments after a few minutes, sorry). – Ira Baxter Jan 02 '11 at 07:38
@Ira Baxter: How about "We chose to implement signal processing with FFT"? Source code has comments, you know. And I can also write it in a README file. The explicit expression of specification is the code. Text lines like "We chose to implement it with FFT" are not explicit, nor design in the sense of your original post. They are documentation of the implementation. You seem to have ended up in an argumentative mode, but I don't understand what you are trying to argue against. – Lennart Regebro Jan 02 '11 at 11:30
So again, my point is that although you can *describe* the code/design/specification/model, as in your "We use FFT" sentence, that does not *capture* the code/design/specification/model. To do that you need a form of abstraction of the code/design/specification/model, and that's what programming is about: abstraction. So if you have such an abstraction, it will not "capture" the code/design/specification/model, it will be a part of it. – Lennart Regebro Jan 02 '11 at 11:35
@Lennart: I've revised my question to make clear that the kind of design information I'm looking for is formally based, so I can get tools to help work with it. I'm confused by your assertion that you should have "some kind of abstraction" that will become part of the code (and must therefore be formally defined), given the approach you suggested for recording "we decided to use FFT" as source comments or (text in a) README file. I also note that something that says, however it does it, "We decided to use FFT" doesn't seem like it has to be part of the code to make the program executable. – Ira Baxter Jan 03 '11 at 00:10
@Ira Baxter: I can't help you there. I'm on the other hand "confused" by your assertion that the sentence "We decided to use FFT" is a formal description of the code. – Lennart Regebro Jan 03 '11 at 08:59

score 5 · Answer 2 · edited Mar 04 '14 at 16:28

You might be interested in reviewing the software traceability literature. In no particular order:

CUBRANIC, D., MURPHY, G. C., SINGER, J., AND BOOTH KELLOGG, S. Hipikat: a project memory for software development. Transactions on Software Engineering 31, 6 (2005), 446–65.
TANG, A., BABAR, M. A., GORTON, I., AND HAN, J. A survey of the use and documentation of architecture design rationale. In Proc of the 5th Working IEEE/IFIP Conference on Software Architecture (2005).
RAMESH, B., POWERS, T., STUBBS, C., AND EDWARDS, M. Implementing requirements traceability: A case study. In Proc of the Int’l Symp on Requirements Engineering (York, 1995).
HORNER, J., AND ATWOOD, M. E. Design rationale: the rationale and the barriers. In Proc of the 4th Nordic Conference on Human-Computer Interaction: Changing Roles (Oslo, Norway, 2006).
CLELAND-HUANG, J., SETTIMI, R., ROMANOVA, E., BERENBACH, B., AND CLARK, S. Best practices for automated traceability. Computer 40, 6 (June 2007), 27–35.
ASUNCION, H., FRANÇOIS, F., AND TAYLOR, R. N. An end-to-end industrial software traceability tool. In Proc of the 6th Joint Meeting of the European Software Eng Conf and the ACM SIGSOFT Int’l Symp on the Foundations of Software Engineering (ESEC/FSE) (Dubrovnik, 2007).

Note that this is just the tip of the iceberg, and I'm sure I've left out some key papers.

On a separate note, my own work on Arcum was a means for programmers to express to the IDE the use of crosscutting design idioms. Once expressed, programmers could then transform their source code to use alternative implementations:

Macneil Shonle, William G. Griswold, and Sorin Lerner. Beyond refactoring: a framework for modular maintenance of crosscutting design idioms. In Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC-FSE '07; September 3-7, 2007). 175-184.

Incidentally, Arcum is also related to your DMS work.

+1 for this. RT isn't everything but it's one of several positive steps toward actually *solving* the problem instead of making the same old excuses for it. — Aaronaught, Jan 02 '11 at 18:32

score 3 · Answer 3 · answered Jan 03 '11 at 15:40

I see two problems.

The first is that it is bloody difficult to keep code and documentation in sync. If they are separate, they will diverge and the documentation will become useless. Programmers have tried to use tools to do the work of keeping them in sync (such as CASE-tools), but these tools got between the programmers and their code, which did more damage than good. One of the key insights of domain driven design (Evans, 2004) is that good design is really hard, so in order to get something out of it, you must:

choose the smallest possible area of your program where good design will yield great benefits, the so called core domain
work really hard to get a good design in form of a ubiquitous language that all team members use all the time
as much as possible, make the design part of the code

The other big problem with the way we do design is that our design methods are not mathematical enough. Leaky abstractions do not lend themselves to deriving solid conclusions, and the world of strictly applied logic and clear truth is called math, which programmers mostly shy away from.

The few mathematical tools we have, such as formal methods, are very unwieldy.

Map-reduce is a nice example of math in programming. The core idea is this: when you have an associative binary operation, you can distribute its execution very easily. A binary operation is a function with two parameters; associativity means that (a+b)+c = a+(b+c).

a1+a2+...+a99+b1+b2+...+b99+c1+c2+...+c99 is

(a1+a2+...+a99) + (b1+b2+...+b99) + (c1+c2+...+c99), where the As, Bs and Cs can trivially be added at different locations, and their results collected and summed up in no time.
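
The regrouping argument can be sketched in a few lines of Python. This is only an illustration of the associativity idea, not of any real map-reduce framework: the helper names `chunked` and `map_reduce` are made up here, and the "different locations" are simulated by plain list slices processed one after another.

```python
from functools import reduce
from operator import add, sub


def chunked(items, size):
    """Split a list into consecutive, order-preserving slices."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_reduce(op, values, chunk_size):
    """Reduce each chunk separately, then reduce the partial results.

    Each partial result could be computed on a different machine.
    The regrouping is only valid when `op` is associative.
    `values` must be non-empty.
    """
    partials = [reduce(op, chunk) for chunk in chunked(values, chunk_size)]
    return reduce(op, partials)


numbers = list(range(1, 298))  # stands in for a1..a99, b1..b99, c1..c99

# Associative operation: regrouping does not change the answer.
grouped_total = map_reduce(add, numbers, 99)
plain_total = reduce(add, numbers)

# Non-associative operation: regrouping silently changes the answer.
grouped_diff = map_reduce(sub, [10, 3, 2, 1], 2)  # (10-3) - (2-1)
plain_diff = reduce(sub, [10, 3, 2, 1])           # ((10-3)-2)-1
```

Here `grouped_total` equals `plain_total`, while `grouped_diff` and `plain_diff` differ. Only associativity is needed, not commutativity, because the chunks keep their order; string concatenation, for example, also works. Floating-point addition is only approximately associative, so the grouped and plain sums of floats can differ slightly.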

Map-reduce is a ridiculously simple idea. It can be described on one piece of paper. If you can assume that the reader has a firm grasp of the concept of associativity, it fits on quite a small piece of paper. Now try to explain map-reduce to somebody without using the term associativity or referring to it indirectly. I dare you.

Software design without mathematical abstractions is like trying to do architecture without ever bothering to learn geometry.

Maybe Haskell can fix this over time. The use of concepts from category theory to describe programs looks promising to me. Category theory is so abstract that even mathematicians had little use for it, but apparently categories, which are abstract beyond recognition, seem to be abstract enough to describe the structure of software. We'll find out. Slowly.

Joris Meys · Answer 4 · 2010-12-31T14:10:53.013

UML is to a program what the plans are to a building, in my humble view. Plans alone aren't a design, of course; for that you also need material specifications (the code tools used), a general view of the building (some schematic representation of the whole software, including GUI designs), how the building is planted in its surroundings (a clear scheme of how the software interacts with others / is planted within the OS), how it stands up to the climate and soil (interaction with hardware), ... Plenty of books on design try to define it, but as with so many things in science, every scientist has more or less his own definition.

Now, I also don't agree with your observation that you can't derive the code from UML. You can, if you have the additional information mentioned. But the real code isn't the design any more, that's the artifact. You can't extract the real stones and concrete from a plan either, but you need the plan to put the real stones and concrete in the correct form and the correct place.

In that light, I found the following article interesting (I met it in a different context when I was looking for graph software, but nonetheless...). The graph approach to describing a design made sense to me, although, again, this is only part of the design in my opinion. The interesting thing is that this approach gives a framework to understand and refactor designs (as opposed to refactoring the software), as indicated in the following papers:

There are a whole lot of other approaches to describe (part of) the design, like structured design (HIPO Charts) or integrated program design, design patterns, ...

Still, as long as there's no industry standard set, it's unlikely to get a "regular" way to express this. Even after 50+ years. And be honest, if your company finds a good way to express a design, would you share it with the world?

edited Dec 31 '10 at 14:10

answered Dec 31 '10 at 14:04

Joris Meys


If your company finds a good way of doing this, the programmers would tell everyone else pretty darn quickly. :-) – Lennart Regebro Dec 31 '10 at 14:53
I think you miss my point about UML (and any other "single" notation). UML-before-code is a constraint on how you want to build the software. So are all the other notations (yes, I've seen a lot of these, I've been around awhile). Given a constraint, it is arguably possible to produce code meeting that constraint (joke method: produce all possible programs and check to see which ones match the constraint, pick one of those). In this sense you can "generate code from UML". But (if we stick to class diagrams) that code won't implement the function you want... – Ira Baxter Dec 31 '10 at 16:32
... and most of the other notational schemes suffer from this too; they don't really capture a specification of what the program is supposed to do. Nor do they provide anything like a rationale; *why* is your UML chart the shape it is? What part of the code can I change without breaking the chart? Can I change it in a way that doesn't damage the intention behind the chart? – Ira Baxter Dec 31 '10 at 16:36
@Ira: After visiting your webpage, it became clearer to me. But I have to admit that a high-level semantic discussion on these matters is beyond my expertise. Yet, if you consider the _real_ code as part of the design, then where is the actual artifact? UML, or any other notation, is in my humble opinion a blueprint of the code structure, and that's something I like to call _part_ of the design. More than the actual code, in fact. Mind the _part_. It's not the design, but still an essential contribution to it. Again, in my humble opinion. The rationale etc. can be added to that as explained. – Joris Meys Dec 31 '10 at 16:39
@Joris: Most diagrammatic notations can be considered as projections (inferred artifacts) from the code (at least after it's finished) or could be considered as guidance for building the code (blueprint). There are lots of possible diagrams (some listed in your answer). Which ones are fundamental to the code you have, and which are just accidents? Flowcharts are diagrams, so they should qualify; yet I'm pretty confident that flowcharts of some code chunk would *not* be considered part of its design. So, what's fundamental? What's accidental? – Ira Baxter Dec 31 '10 at 17:20
@Joris: The Fiadeiro paper looks interesting; they apply graph transformations to abstract architectures. I actually build source-to-source program transformation tools; we have applied transformations to reshape real architectures of large C++ systems. This still doesn't address the "design" issue, though. – Ira Baxter Dec 31 '10 at 20:24

score 2 · Answer 5 · answered Dec 31 '10 at 22:49

From my own personal experience, I would argue that we are good at capturing the design of software. We have a database of requirement and design documents for every single feature that we have ever implemented on our platform. I suppose my circumstances may be unique. Here are some things to think about, though.

Every single person on my team has an engineering degree...mostly EE or CE. Engineering teaches you design as part of the curriculum.

I think that there are a lot of so-called software engineers who come from CS backgrounds. Software design is not an integral part of most CS programs. I am not saying that all CS majors are bad at design, but I would wager that most have no formal education that taught it to them. I think a lot of people assume that if you can program, you can design software, which is not true. Given that many programmers don't have an engineering background, it is not really surprising that many software projects don't have a team that is good at capturing design.

answered Dec 31 '10 at 22:49

Pemdas


So what specific method of writing requirements and design documents do you use? (I looked at your bio and was expecting to see somebody from the defense space and was surprised). I assume by requirements you mean some natural-language text? ... if so, you have no arguments about them? (I distinguish natural language requirements from formal specs). Are they complete? And the design documents? Are they completely up to date for the currently shipping system? I'll agree that lots of so-called programmers and software engineers are not, so we can stick to discussing what the ones that are should do. – Ira Baxter Jan 02 '11 at 06:50
I am not sure that there is a specific name for our method. Our requirements are what I would call a hybrid between natural language requirements and a high-level design document. We usually have two rounds of editing. First, we document what a feature needs to do in plain English. Then we specify exactly how it is going to work from the user's perspective. The document has two goals. One, we want to provide a document that can be reviewed by our Marketing team to make sure we meet our business needs. Two, we want to provide a document that can be used by our QA team to test against. – Pemdas Jan 02 '11 at 08:05
Our design documents are much more formal and detailed. They always include the following: some sort of use case analysis, an explanation of any trade-offs or reasons we chose a particular way of doing things, references to other designs, explicit interface definitions, data structures, etc. We don't necessarily have explicit comments regarding how specific code meets certain requirements. I am undecided about whether or not I think this is important. – Pemdas Jan 02 '11 at 08:15
I would say that our documents are 95% up to date. A few things here and there slip through the cracks. – Pemdas Jan 02 '11 at 08:23
