Followup To The Big Bang Disaster: Microsoft Vista

December 04, 2006

The development of Microsoft Vista apparently took 10,000 employees 5 years. This is even more people than I would have expected (assuming it's close to the truth; estimates are usually way off). Big Bang projects always seem to attract huge budgets and massive headcount, but there are few companies out there who could even field such a large workforce and invest so many billions.

The Vista development story has been covered heavily over the 5 year period, starting off with a bang (of course) of new directions, technologies and features, and ending with a whimper of reduced expectations. That it shipped at all is a miracle, although the real test will be when people start using it for real. Big Bang projects usually fail completely.

It would be interesting to know how many actual programmers and architects were part of this horde of people. Copland's engineering team was about 500 people, but I know not all were actually programmers. The largest team I ever worked with in my entire career was about a dozen people, in two states, and even that was fraught with communications issues. 12 people have 66 potential one-on-one conversations; 10,000 people have roughly 50,000,000. I have a hard time imagining any way to effectively manage a group of people that size working on a single project in any industry.
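For anyone who wants to check the conversation math, here is a quick sketch in Python. It assumes a "potential conversation" means one distinct pair of people, which gives n * (n - 1) / 2 pairs for a team of n. The function name is just something I made up for illustration.

```python
def potential_conversations(people: int) -> int:
    """Count the distinct pairs among `people`, i.e. n * (n - 1) / 2."""
    return people * (people - 1) // 2


# Team sizes mentioned in this post: my largest team, Copland, and Vista.
for team_size in (12, 500, 10_000):
    pairs = potential_conversations(team_size)
    print(f"{team_size:>6,} people -> {pairs:>12,} potential conversations")
```

A dozen people comes out to 66 pairs, a Copland-sized team of 500 to 124,750, and 10,000 people to 49,995,000, which is where the "about 50 million" figure comes from. The count grows with the square of the headcount, so adding people makes the communication problem worse much faster than it adds hands.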

The Manhattan Project ultimately employed around 130,000 people and cost (in today's dollars) something like $20 billion; however, the core team was relatively small (a few hundred at most) and run under heavy military discipline. In the earliest days it was really the work of a handful of people, and this team essentially created the basis of the entire project.

I think this is the core (!) of how to organize a software project of Big Bang size, an inverted pyramid: a small team of highly capable, experienced folks, building (not just designing) the core of the system on which the lower, larger teams build additional functionality. I still think the size of the team needs to be as small as possible, even to the point of emaciation. When people are faced with too few resources, but are highly skilled, and most importantly, have a high latitude in their choice of technologies and tools, they will find a way to make it work. The sad part of this is that in most places, your choice of technologies and tools, and even the way you go about building a software system, is highly constrained.

In the case of Vista, Microsoft's team at first was excited about using managed code (basically the .Net CLR) but apparently not enough of the team agreed, and ultimately they fell back to using native code (C++), which in my experience is a pain in the butt in large teams. Of course any technology they wanted to use either had to be in-house or developed from scratch; unlike Apple's OSX, which uses many open source technologies (BSD, Mach, etc.) in the core OS, Microsoft never considered such an approach.

So why was Vista such a nightmare? Too many people, too much "must be invented here", too many new untried technologies, and the ultimately difficult requirement of supporting old APIs in the core OS (a fundamental issue with Copland).

Oddly enough the Big Bang project that was The Manhattan Project actually produced two successful big bangs (killing 150,000 people is a sad success) while Vista is mostly producing a Big Thud.