How Microsoft dragged its development practices into the 21st century (Part 1)

Diem Do


For the longest time, Microsoft had something of a poor reputation as a software developer. The issue wasn’t so much the quality of the company’s software but the way it was developed and delivered. The company’s traditional model involved cranking out a new major version of Office, Windows, SQL Server, Exchange, and so on every three or so years.


The releases may have been infrequent, but delays, or at least perceived delays, were not. Microsoft’s reputation in this regard never quite matched the reality; the company tended to shy away from officially announcing a ship date until it knew it could hit it. Even so, leaks, assumptions, and speculation were routine. Windows 95 was late. Windows 2000 was late. Windows Vista was very late, shipping only after the original development effort was scrapped and restarted.


In spite of this, Microsoft became tremendously successful. After all, many of its competitors worked in more or less the same way, releasing paid software upgrades every few years. Microsoft didn’t do anything particularly different. Even the delays weren’t that unusual, with both Microsoft’s competitors and all manner of custom software development projects suffering the same.


There’s no single cause for these periodic releases and the delays they suffered. Software development is a complex and surprisingly poorly understood business; there’s no one “right way” to develop and manage a project. That is, there’s no reliable process or methodology that ensures a competent team will produce working, correct software on time and on budget. A system that works well for one team or on one project can easily fail when used on a different team or project.


Nonetheless, computer scientists, software engineers, and developers have tried to formalize and describe different processes for building software. The process historically associated with Microsoft, and the one most closely identified with these long development cycles and their delays, is known as the waterfall process.


The basic premise is that progress flows in one direction. The requirements for a piece of software are gathered, then the software is designed, then the design is implemented, then the implementation is tested and verified, and then, once the software has shipped, it goes into maintenance mode.
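To make the one-way flow concrete, here is a minimal sketch in Python. It is purely illustrative and not anything Microsoft used; the phase names come from the description above, and the advance helper is hypothetical.

    from enum import IntEnum

    class Phase(IntEnum):
        """The classic waterfall phases, in their fixed order."""
        REQUIREMENTS = 1
        DESIGN = 2
        IMPLEMENTATION = 3
        VERIFICATION = 4
        MAINTENANCE = 5

    def advance(current: Phase) -> Phase:
        """Move to the next phase; the model defines no way back upstream."""
        if current is Phase.MAINTENANCE:
            return current  # terminal phase: the shipped product is only maintained
        return Phase(current + 1)

The point of the sketch is simply that there is no transition from, say, VERIFICATION back to DESIGN; discovering late in the process that the design is wrong has no place in the model.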


The wretched waterfall


The waterfall process has always been regarded with suspicion. Even when it was first named and described in the 1970s, it was not held up as an ideal that organizations should aspire to. Rather, it was a description of a process that organizations already used, one with a number of flaws that made it unsuitable for most development tasks.


It has, however, persisted. It’s still commonly used today because it has a kind of intuitive appeal. In industries such as manufacturing and construction, design must be done up front because things like cars and buildings are extremely hard to change once they’ve been built. In these fields, it’s imperative to get the design as correct as possible right from the start; it’s the only way to avoid the cost of recalling vehicles or tearing down buildings.


Software is cheaper and easier to change than buildings are, but it’s still much more effective to write the right software in the first place than to build something and then change it later. In spite of this, the waterfall process is widely criticized. Perhaps the biggest problem is that, unlike cars and buildings, software is generally very poorly understood before it’s built. While some programs (flight control software, say) have very tight requirements and strict parameters, most are far more fluid.


For example, lots of companies develop in-house applications to automate various business processes. In the course of developing these applications, it’s often discovered that the old process just isn’t that great: there are redundant steps, two processes that should be merged into one, or one process that should be split into two. Electronic forms that mirror paper forms in their layout and sequence can provide familiarity, but it often turns out that rearranging them makes more sense. Processes that were thought to be well understood and performed by the book turn out to work a little differently in practice.


Often, these things are only discovered after the development process has begun, either during development or even after deployment to end users.


This presents a serious problem when attempting to do all the design work up front. A design can be perfectly well-intentioned, but if it turns out to be wrong, needs to change in response to user feedback, or doesn’t actually solve the problem people were hoping it would solve (and this is extremely common), the project is in deep trouble. Waiting until the end of the waterfall to discover these problems means pouring a lot of time and money into something that isn’t right.


Waterfalls in action: Developing Visual Studio


Microsoft didn’t practice waterfall in the purest sense; its software development process was slightly iterative. But it was very waterfall-like.


A good example of how this worked comes from the Visual Studio team. For the last few years, Visual Studio has been on a somewhat quicker release cycle than Windows and Office, with major releases coming every two or so years rather than every three.


This two-year cycle was broken into a number of stages. At the start there would be four to six months of planning and design work. The goal was to figure out what features the team wanted to add to the product and how to add them. Next came six to eight weeks of actual coding, after which the project would be “code complete,” followed by a four-month “stabilization” cycle of testing and debugging.


During this stage, the test team would file hundreds upon hundreds of bugs, and the developers would have to go through and fix as many as they could. No new development occurred during stabilization, only debugging and bug fixing.


At the conclusion of this stabilization phase, a public beta would be produced. There would then be a second six- to eight-week cycle of development, followed by another four months of stabilization. From this, the finished product would emerge.


With a few more weeks for managing the transitions between the phases of development, some extra time for last-minute fixes to both the beta and the final build, and a few weeks to recover between versions, the result was a two-year development process in which only about four months would be spent writing new code. Twice as long would be spent fixing that code.
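As a rough tally (a sketch only: the phase lengths come from the description above, the midpoints of the quoted ranges are an assumption, and the leftover “overhead” bucket simply absorbs the transitions, last-minute fixes, and recovery time), the arithmetic works out something like this:

    MONTHS_IN_CYCLE = 24                  # roughly two years per major release

    planning      = 5                     # four to six months of planning and design
    coding        = 2 * 1.75              # two coding milestones of six to eight weeks each, in months
    stabilization = 2 * 4                 # two four-month stabilization phases

    overhead = MONTHS_IN_CYCLE - (planning + coding + stabilization)

    print(f"new feature coding:     {coding:.1f} months")      # ~3.5 months
    print(f"stabilization:          {stabilization} months")   # 8 months, over twice the coding time
    print(f"transitions and so on:  {overhead:.1f} months")    # everything else in the two years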


Microsoft’s organizational structure tends to reflect this development approach. The company has three relevant roles: the program manager (PM), responsible for specifying and designing features; the developer, responsible for building them; and QA, responsible for making sure the features do what they’re supposed to. The three roles have parallel management structures (PMs reporting to PMs, and so on).


Part 2


Source: arstechnica.com