The problem with moar testing

I’m going to start this off with a story, and when it’s done, we’ll come back to software testing.

The Boeing 737 is the most popular passenger aircraft in the world. The type first flew in 1967, and with four major revisions, is still being produced and flown today.

On March 3, 1991, United Airlines Flight 585, a 737-200, crashed in Colorado Springs, killing 25 people. The plane was on final approach into Colorado Springs when it suddenly banked hard, pitched nose down and plunged into the ground at 245 MPH. Following the tragedy, the NTSB sifted through the rubble and was unable to determine the cause.

On September 8, 1994, USAir Flight 427, a 737-300, crashed near Pittsburgh, PA, killing 132 people. Again the NTSB, the premier accident investigation agency in the world, was stumped. Boeing 737s were falling out of the sky, and over 150 people were dead. On June 9, 1996, a third 737, Eastwind Airlines Flight 517, experienced the same hard bank as the other two planes. Miraculously, the pilot was able to recover the plane and land safely. The break for investigators was that the plane was intact, and could be investigated.

The investigators focused on a piece of hydraulic equipment called the Power Control Unit (PCU), which is responsible for controlling the rudder on the plane’s vertical stabilizer. Undamaged, the unit was put through the standard battery of tests and performed flawlessly. Ultimately, a test was performed where the unit was chilled to -40 degrees and tested using heated hydraulic fluid. Finally the unit jammed, which, had it been installed, would have pushed the rudder to its blowdown limit and crashed an airplane.

The story here, though, isn’t the investigation or the tragic loss of life, but rather the story of all the testing that DID happen. The PCU had passed all its individual tests: operating under load, under certain temperatures, number of duty cycles, etc.; what developers would call “unit tests”. The PCU had also functioned properly during the mandatory pre-flight tests; what developers would call “integration tests”. Finally, the PCU had performed thousands of take-offs and landings on Flight 585 prior to its crash; what developers would call “in-production testing”.

The airplane was built by Boeing, and it took a year from the prototype rolling out to the plane getting its type certification from the FAA. The PCU is built by Parker Hannifin, an industry leader in the area of aerospace and hydraulics. So here is an airplane, built by established industry leaders, the model tested for a year and totally unchanged, subjected to a battery of individual component tests, flown for almost a decade without incident, and yet it still fell out of the sky. How? Simple. The edge case. Very hot fluid into a very cold part wasn’t something engineers expected, and it didn’t happen often, so it was never tested.

How does this apply to software development? Well, for starters, try telling product owners or company owners that the design is going to freeze for a year for extensive testing, and then will only change once per decade after that. Good luck selling it. Somehow software developers (and IT engineers) are expected to meet or exceed the level of testing reserved for the aerospace industry, and maintain a nearly constant rate of change, even when all that testing still doesn’t fully prevent plane crashes.

The moral of the story here is that no matter how much testing you do, it’s the edge cases that will get you: the infinite number of human decisions combined with environmental conditions that are totally impossible to predict. No one ever thought to consider the effect of thermal shock on a hydraulic valve, but it happened. I’m not suggesting that testing is pointless, just that there is no such thing as perfect testing. The only response is to investigate and to make changes to prevent it in the future. Software developers will get caught on the edge cases, but at least your website isn’t going to kill anyone.
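For the developers in the room, here is roughly what that looks like in code. This is a toy sketch in Python, not real PCU physics: the `valve_jams` function, the 100-degree threshold and the temperature lists are all numbers and names I made up for illustration. Each condition passes when tested on its own, and the failure only shows up when someone thinks to test the combinations.

```python
import itertools

# Hypothetical toy model, NOT real hydraulics: the valve jams only when
# the fluid is much hotter than the valve body. The threshold is invented.
JAM_DELTA = 100


def valve_jams(body_temp: float, fluid_temp: float) -> bool:
    """Return True if the (made-up) valve model jams under these conditions."""
    return (fluid_temp - body_temp) > JAM_DELTA


def cold_soak_ok() -> bool:
    # "Unit test" 1: everything cold. Body and fluid at the same temperature.
    return not valve_jams(-40, -40)


def hot_run_ok() -> bool:
    # "Unit test" 2: everything hot. Body and fluid at the same temperature.
    return not valve_jams(80, 80)


def find_jams(body_temps, fluid_temps):
    # Try every body/fluid combination instead of one condition at a time.
    return [
        (body, fluid)
        for body, fluid in itertools.product(body_temps, fluid_temps)
        if valve_jams(body, fluid)
    ]


if __name__ == "__main__":
    temps = [-40, 20, 80]  # assumed test points, chosen for illustration
    print("cold soak passes:", cold_soak_ok())
    print("hot run passes:", hot_run_ok())
    print("jamming combinations:", find_jams(temps, temps))
```

Both single-condition checks pass, and only the cold-body, hot-fluid pair jams. Of course this only works because I already knew which two variables to cross. With three test points and two variables that is nine combinations; add a few more variables and nobody has the time or budget to run them all.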

Mayday S04E04 does a pretty good job of summarizing the Boeing 737 rudder issues, and Wikipedia has an article on the subject if you are interested in learning more.

Praxis Institute in the future

Apparently the Praxis Institute is a south Florida massage therapy school, but as a Star Trek fan I couldn’t resist googling my way through video editing and cutting this together.