This post is going to be pretty long.
Feel free to scroll down or skim the paragraphs if you like.
I’m pretty sure you will come back and read it in full later, since it is really interesting.
In case of panic, click here to jump to conclusions.
Managing The Development Of Large Software Systems
I am going to describe my personal views about managing large software developments. I have had various assignments during the past nine years, mostly concerned with the development of software packages for spacecraft mission planning, commanding and post-flight analysis.
In these assignments I have experienced different degrees of success with respect to arriving at an operational state, on-time, and within costs. I have become prejudiced by my experiences and I am going to relate some of these prejudices in this presentation.
Computer Program Development Functions
There are two essential steps common to all computer program developments, regardless of size or complexity. There is first an analysis step, followed second by a coding step as depicted in Figure 1.
Figure 1. Implementation steps to deliver a small computer program for internal operations.
This sort of very simple implementation concept is in fact all that is required if the effort is sufficiently small and if the final product is to be operated by those who built it – as is typically done with computer programs for internal use. It is also the kind of development effort for which most customers are happy to pay, since both steps involve genuinely creative work which directly contributes to the usefulness of the final product.
An implementation plan to manufacture larger software systems, and keyed only to these steps, however, is doomed to failure.
Many additional development steps are required, none contribute as directly to the final product as analysis and coding, and all drive up the development costs. Customer personnel typically would rather not pay for them, and development personnel would rather not implement them. The prime function of management is to sell these concepts to both groups and then enforce compliance on the part of development personnel.
A more grandiose approach to software development is illustrated in Figure 2.
The analysis and coding steps are still in the picture, but they are preceded by two levels of requirements analysis, are separated by a program design step, and followed by a testing step. These additions are treated separately from analysis and coding because they are distinctly different in the way they are executed. They must be planned and staffed differently for best utilization of program resources.
Figure 3 portrays the iterative relationship between successive development phases for this scheme.
The ordering of steps is based on the following concept: that as each step progresses and the design is further detailed, there is an iteration with the preceding and succeeding steps but rarely with the more remote steps in the sequence. The virtue of all of this is that as the design proceeds the change process is scoped down to manageable limits. At any point in the design process after the requirements analysis is completed there exists a firm and closeup moving baseline to which to return in the event of unforeseen design difficulties.
What we have is an effective fallback position that tends to maximize the extent of early work that is salvageable and preserved.
I believe in this concept, but the implementation described above is risky and invites failure. The problem is illustrated in Figure 4. The testing phase which occurs at the end of the development cycle is the first event for which timing, storage, input/output transfers, etc., are experienced as distinguished from analyzed. These phenomena are not precisely analyzable. They are not the solutions to the standard partial differential equations of mathematical physics for instance. Yet if these phenomena fail to satisfy the various external constraints, then invariably a major redesign is required.
A simple octal patch or redo of some isolated code will not fix these kinds of difficulties. The required design changes are likely to be so disruptive that the software requirements upon which the design is based and which provides the rationale for everything are violated. Either the requirements must be modified, or a substantial change in the design is required. In effect the development process has returned to the origin and one can expect up to a 100% overrun in schedule and/or costs.
One might note that there has been a skipping-over of the analysis and code phases. One cannot, of course, produce software without these steps, but generally these phases are managed with relative ease and have little impact on requirements, design, and testing. In my experience there are whole departments consumed with the analysis of orbit mechanics, spacecraft attitude determination, mathematical optimization of payload activity and so forth, but when these departments have completed their difficult and complex work, the resultant program steps involve few lines of serial arithmetic code. If in the execution of their difficult and complex work the analysts have made a mistake, the correction is invariably implemented by a minor change in the code with no disruptive feedback into the other development bases.
However, I believe the illustrated approach to be fundamentally sound. The remainder of this discussion presents five additional features that must be added to this basic approach to eliminate most of the development risks.
Step #1: Program Design Comes First
The first step towards a fix is illustrated in Figure 5. A preliminary program design phase has been inserted between the software requirements generation phase and the analysis phase. This procedure can be criticized on the basis that the program designer is forced to design in the relative vacuum of initial software requirements without any existing analysis. As a result, his preliminary design may be substantially in error as compared to his design if he were to wait until the analysis was complete. This criticism is correct but it misses the point. By this technique the program designer assures that the software will not fail because of storage, timing, and data flux reasons. As the analysis proceeds in the succeeding phase the program designer must impose on the analyst the storage, timing, and operational constraints in such a way that he senses the consequences. When he justifiably requires more of this kind of resource in order to implement his equations it must be simultaneously snatched from his analyst compatriots. In this way all the analysts and all the program designers will contribute to a meaningful design process which will culminate in the proper allocation of execution time and storage resources. If the total resources to be applied are insufficient or if the embryo operational design is wrong it will be recognized at this earlier stage and the iteration with requirements and preliminary design can be redone before final design, coding and test commences.
How is this procedure implemented? The following steps are required.
- Begin the design process with program designers, not analysts or programmers.
- Design, define and allocate the data processing modes even at the risk of being wrong. Allocate processing functions, design the data base, define data base processing, allocate execution time, define interfaces and processing modes with the operating system, describe input and output processing, and define preliminary operating procedures.
- Write an overview document that is understandable, informative and current. Each and every worker must have an elemental understanding of the system. At least one person must have a deep understanding of the system which comes partially from having had to write an overview document.
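The allocation idea in this step can be made concrete with a small bookkeeping sketch. This is only an illustration, not something taken from the paper: the module names, the numbers, and the `check_budget` helper are all hypothetical. It shows how a fixed total of storage and execution time forces any extra request from one analyst to be taken from another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Preliminary resource allocation for one module (illustrative values)."""
    module: str
    storage_words: int
    time_ms: float


def check_budget(allocations, total_storage_words, total_time_ms):
    """Compare the summed allocations with the fixed machine totals."""
    used_storage = sum(a.storage_words for a in allocations)
    used_time = sum(a.time_ms for a in allocations)
    return {
        "storage_left": total_storage_words - used_storage,
        "time_left": total_time_ms - used_time,
        "fits": used_storage <= total_storage_words and used_time <= total_time_ms,
    }


# Hypothetical preliminary design for a machine with 4096 words of storage
# and a 100 ms processing cycle.
plan = [
    Allocation("orbit_determination", 1500, 40.0),
    Allocation("attitude", 1200, 25.0),
    Allocation("payload_scheduling", 1000, 20.0),
]
report = check_budget(plan, total_storage_words=4096, total_time_ms=100.0)

# One analyst asks for 500 more words without anyone giving storage back.
grown_plan = [
    Allocation("orbit_determination", 1500, 40.0),
    Allocation("attitude", 1700, 25.0),
    Allocation("payload_scheduling", 1000, 20.0),
]
grown_report = check_budget(grown_plan, total_storage_words=4096, total_time_ms=100.0)
```

With these made-up numbers the first plan fits, while the grown plan overruns storage unless another module gives some back, which is the kind of conflict the preliminary design is meant to expose before coding starts.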
Step #2: Document The Design
At this point it is appropriate to raise the issue of – “how much documentation?”
My own view is “quite a lot;” certainly more than most programmers, analysts, or program designers are willing to do if left to their own devices. The first rule of managing software development is ruthless enforcement of documentation requirements.
Occasionally I am called upon to review the progress of other software design efforts. My first step is to investigate the state of the documentation. If the documentation is in serious default my first recommendation is simple. Replace project management. Stop all activities not related to documentation. Bring the documentation up to acceptable standards.
Management of software is simply impossible without a very high degree of documentation. As an example, let me offer the following estimates for comparison. In order to procure a 5 million dollar hardware device, I would expect that a 30 page specification would provide adequate detail to control the procurement. In order to procure 5 million dollars of software I would estimate a 1500 page specification is about right in order to achieve comparable control.
Why so much documentation?
- Each designer must communicate with interfacing designers, with his management and possibly with the customer. A verbal record is too intangible to provide an adequate basis for an interface or management decision. An acceptable written description forces the designer to take an unequivocal position and provide tangible evidence of completion. It prevents the designer from hiding behind the “I am 90-percent finished” syndrome month after month.
- During the early phase of software development the documentation is the specification and is the design. Until coding begins these three nouns (documentation, specification, design) denote a single thing. If the documentation is bad the design is bad. If the documentation does not yet exist there is as yet no design, only people thinking and talking about the design which is of some value, but not much.
- The real monetary value of good documentation begins downstream in the development process during the testing phase and continues through operations and redesign. The value of documentation can be described in terms of three concrete, tangible situations that every program manager faces.
- During the testing phase, with good documentation the manager can concentrate personnel on the mistakes in the program. Without good documentation every mistake, large or small, is analyzed by one man who probably made the mistake in the first place because he is the only man who understands the program area.
- During the operational phase, with good documentation the manager can use operation-oriented personnel to operate the program and to do a better job, cheaper. Without good documentation the software must be operated by those who built it. Generally these people are relatively disinterested in operations and do not do as effective a job as operations-oriented personnel. It should be pointed out in this connection that in an operational situation, if there is some hangup the software is always blamed first. In order either to absolve the software or to fix the blame, the software documentation must speak clearly.
- Following initial operations, when system improvements are in order, good documentation permits effective redesign, updating, and retrofitting in the field. If documentation does not exist, generally the entire existing framework of operating software must be junked, even for relatively modest changes.
Figure 6 shows a documentation plan which is keyed to the steps previously shown. Note that six documents are produced, and at the time of delivery of the final product, Documents No. 1, No. 3, No. 4, No. 5, and No. 6 are updated and current.
Step #3 – Do it twice
After documentation, the second most important criterion for success revolves around whether the product is totally original. If the computer program in question is being developed for the first time, arrange matters so that the version finally delivered to the customer for operational deployment is actually the second version insofar as critical design/operations areas are concerned. Figure 7 illustrates how this might be carried out by means of a simulation. Note that it is simply the entire process done in miniature, to a time scale that is relatively small with respect to the overall effort. The nature of this effort can vary widely depending primarily on the overall time scale and the nature of the critical problem areas to be modeled. If the effort runs 30 months then this early development of a pilot model might be scheduled for 10 months. For this schedule, fairly formal controls, documentation procedures, etc., can be utilized. If, however, the overall effort were reduced to 12 months, then the pilot effort could be compressed to three months perhaps, in order to gain sufficient leverage on the mainline development.
In this case a very special kind of broad competence is required on the part of the personnel involved. They must have an intuitive feel for analysis, coding, and program design. They must quickly sense the trouble spots in the design, model them, model their alternatives, forget the straightforward aspects of the design which aren't worth studying at this early point, and finally arrive at an error-free program.
In either case the point of all this, as with a simulation, is that questions of timing, storage, etc. which are otherwise matters of judgment, can now be studied with precision.
Without this simulation the project manager is at the mercy of human judgment. With the simulation he can at least perform experimental tests of some key hypotheses and scope down what remains for human judgment, which in the area of computer program design (as in the estimation of takeoff gross weight, costs to complete, or the daily double) is invariably and seriously optimistic.
Step #4 – Plan, Control and Monitor Testing
Without question the biggest user of project resources, whether it be manpower, computer time, or management judgment, is the test phase. It is the phase of greatest risk in terms of dollars and schedule. It occurs at the latest point in the schedule when backup alternatives are least available, if at all.
The previous three recommendations to design the program before beginning analysis and coding, to document it completely, and to build a pilot model are all aimed at uncovering and solving problems before entering the test phase. However, even after doing these things there is still a test phase and there are still important things to be done. Figure 8 lists some additional aspects to testing. In planning for testing, I would suggest the following for consideration.
- Many parts of the test process are best handled by test specialists who did not necessarily contribute to the original design. If it is argued that only the designer can perform a thorough test because only he understands the area he built, this is a sure sign of a failure to document properly. With good documentation it is feasible to use specialists in software product assurance who will, in my judgment, do a better job of testing than the designer.
- Most errors are of an obvious nature that can be easily spotted by visual inspection. Every bit of an analysis and every bit of code should be subjected to a simple visual scan by a second party who did not do the original analysis or code but who would spot things like dropped minus signs, missing factors of two, jumps to wrong addresses, etc., which are in the nature of proofreading the analysis and code. Do not use the computer to detect this kind of thing – it is too expensive.
- Test every logic path in the computer program at least once with some kind of numerical check. If I were a customer, I would not accept delivery until this procedure was completed and certified. This step will uncover the majority of coding errors. While this test procedure sounds simple, for a large, complex computer program it is relatively difficult to plow through every logic path with controlled values of input. In fact there are those who will argue that it is very nearly impossible. In spite of this I would persist in my recommendation that every logic path be subjected to at least one authentic check.
- After the simple errors (which are in the majority, and which obscure the big mistakes) are removed, then it is time to turn over the software to the test area for checkout purposes. At the proper time during the course of development and in the hands of the proper person the computer itself is the best device for checkout. Key management decisions are: when is the time and who is the person to do final checkout?
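As a rough illustration of the "every logic path, at least once, with a numerical check" recommendation above, here is a minimal sketch. The `clamp_thrust` function and its values are hypothetical, chosen only because the function has exactly three paths; a real program would need one controlled input per path in the same spirit.

```python
def clamp_thrust(command, minimum, maximum):
    """Limit a commanded value to the allowed range (three logic paths)."""
    if command < minimum:
        return minimum      # path 1: command below the lower limit
    if command > maximum:
        return maximum      # path 2: command above the upper limit
    return command          # path 3: command already within range


# One controlled numerical check per logic path:
# (description, arguments, expected result).
PATH_CHECKS = [
    ("below minimum", (-5.0, 0.0, 10.0), 0.0),
    ("above maximum", (12.5, 0.0, 10.0), 10.0),
    ("within range", (7.25, 0.0, 10.0), 7.25),
]


def run_path_checks():
    """Run every check and return the list of failures (empty means all pass)."""
    failures = []
    for name, args, expected in PATH_CHECKS:
        actual = clamp_thrust(*args)
        if actual != expected:
            failures.append((name, expected, actual))
    return failures
```

Keeping the checks in a table makes it easy for a second party to verify that each path has an entry, which fits the paper's point that certification should not depend on the original designer.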
Step #5 – Involve The Customer
For some reason what a software design is going to do is subject to wide interpretation even after previous agreement. It is important to involve the customer in a formal way so that he has committed himself at earlier points before final delivery. To give the contractor free rein between requirement definition and operation is inviting trouble. Figure 9 indicates three points following requirements definition where the insight, judgment, and commitment of the customer can bolster the development effort.
Figure 10 summarizes the five steps that I feel necessary to transform a risky development process into one that will provide the desired product. I would emphasize that each item costs some additional sum of money. If the relatively simpler process without the five complexities described here would work successfully, then of course the additional money is not well spent. In my experience, however, the simpler method has never worked on large software development efforts and the costs to recover far exceeded those required to finance the five-step process listed.
Ok, I cheated
The above is the exact transcription of a paper by Dr. Winston W. Royce.
He wrote it in 1970, 41 years ago.
In case you have never read it (it’s actually pretty famous), it’s the original article in which what is now known as the Waterfall approach was presented. (Pretty funny: the word Waterfall is never used in that paper.)
In fact, it sounds a bit old. Yet, I personally found it very different from the description most Agile coaches give.
Can you imagine 1970’s IT?
Prolog and Smalltalk had yet to come. C would appear in 2 years.
A typical 1970 computer was the Imlac PDS-1, which had 4096 16-bit words of memory.
Agile was a revolution: it introduced the concept of On-Site Customer
The On-site Customer (in XP slang) or Product Owner (in Scrum) is not involved just in analysis. Customer involvement is continuing. That’s a revolution!
Have you noticed figure 9’s caption?
Involve the customer – the involvement should be formal, in depth and continuing
Nevertheless, hordes of Agile coaches taught us that before Agile the customer was involved in the first and last phases only. From the customer’s point of view, with Waterfall, the life-cycle of software was a great black hole, ending with a disastrous acceptance test where she would discover that the team had produced software far from her expectations.
Pretty funny, since there’s nothing like this in the original paper.
Surely, after 41 years our ability to communicate with the customer has evolved. But this is not an excuse for using FUD and propaganda.
What about the iterative nature of Agile?
Did you notice how many times the word iterative is used in Royce’s paper?
If the total resources to be applied are insufficient or if the embryo operational design is wrong it will be recognized at this earlier stage and the iteration with requirements and preliminary design can be redone before final design, coding and test commences.
John Russell in a comment to Christopher Goldsbury’s post Agile Hybridization on InfoQ wrote (thank you, John: it’s the second time I’m quoting your words):
There is no explicit methodology called “waterfall” and there never has been.
It is nothing more than a straw-man used to criticize well established and proven analysis, design, and development practices.
No “rules” are being broken by going back and changing requirements models or artefacts when something is discovered during later phases for example. Neither is there any “rule” which says you must design everything in minute detail before starting to code, or that you can’t use OOAD, UI prototypes, feature backlogs (prioritized lists), or that developers can’t have direct end-user contact when using a “waterfall” process
Agile has TDD!
Sure, that article is old.
I bet most developers would disagree with the part claiming that the testing phase must come after the coding phase.
I like to read “testing phase” as the part where end-to-end and acceptance tests are done.
It’s hard to believe that TDD is not that new.
Try this little game. Read the following passages, and try to guess when they were written:
1 A software system can best be designed if the testing is interlaced with the designing instead of being used after the design. Through successive repetitions of this process of interlaced testing and design the model ultimately becomes the software system itself. I think that it is the key of the approach that has been suggested, that there is no such question as testing things after the fact with simulation models, but that in effect the testing and the replacement of simulations with modules that are deeper and more detailed goes on with the simulation model controlling, as it were, the place and order in which these things are done.
2 But one should not first make the program and then prove its correctness, because then the requirement of providing the proof would only increase the poor programmer’s burden. On the contrary: the programmer should let correctness proof and program grow hand in hand. […] If one first asks oneself what the structure of a convincing proof would be and, having found this, then constructs a program satisfying this proof’s requirements, then these correctness concerns turn out to be a very effective heuristic guidance. By definition this approach is only applicable when we restrict ourselves to intellectually manageable programs, but it provides us with effective means for finding a satisfactory one among these.
1. 1968, excerpt from the Report of the NATO Software Engineering Conference
2. 1972, excerpt from The Humble Programmer, Edsger W. Dijkstra
It’s easy for us to do TDD with xUnit, Continuous Testing tools and super fast IDEs.
Think about doing TDD on a computer with 4096 words of memory.
Those men were able to do it. At least, they were able to think about the possibility to do it. They invented it when no tool was even conceivable.
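For readers who have never seen the test-first rhythm mentioned above, here is a minimal sketch using Python's standard `unittest` module (one member of the xUnit family). The `memory_bytes` function is a hypothetical example, not something from either quoted text: the test case is written first and states what is expected, and the function is then written to satisfy it.

```python
import unittest


class MemoryBytesTest(unittest.TestCase):
    """Written first: these tests state what memory_bytes must do."""

    def test_16_bit_words(self):
        # 4096 words of 16 bits each occupy 8192 eight-bit bytes.
        self.assertEqual(memory_bytes(4096, 16), 8192)

    def test_rejects_negative_word_count(self):
        with self.assertRaises(ValueError):
            memory_bytes(-1, 16)


def memory_bytes(words, bits_per_word):
    """Written second: the simplest code that makes the tests above pass."""
    if words < 0 or bits_per_word <= 0:
        raise ValueError("words must be >= 0 and bits_per_word must be > 0")
    return words * bits_per_word // 8


# To run the tests, save this as a module and use: python -m unittest <module>
```

The point of the sketch is only the ordering: the expectation exists before the code does, which is close in spirit to letting "correctness proof and program grow hand in hand".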
Conclusion (this time for real)
Please, don’t misunderstand me. I’m a big Agile fan. I am a big Agile objector, sometimes. Most of the time, I just think I have so much to learn from both Agile and older methodologies.
What I think is that, as software developers, we are standing on the shoulders of giants. We should learn from the past, study it without conceit and try to evolve it, rather than always rejecting it just because it’s the past.