Automated Testing

Writing the Right Stuff, in the 21st Century


I was recently discussing the problems of software security and code quality. These articles inspired the conversation:

There are lessons for us software engineers/teams in these articles, especially in “They Write the Right Stuff”. But all of these quality control mechanisms come at a cost.

But first, let's put the article in perspective:

  1. It was published in 1996, 20 years ago. Git didn't exist. (And CVS sucked.) "Agile" meant something about your sporting ability, not software. And a tonne of software tools and modern high-level languages didn't exist or were in their infancy. The Shuttle Group probably had little or no choice of libraries to use, and had to write their own GPS software, among other things.
  2. Further, NASA is probably the perfect client/employer: big budget, high technical capability, commitment, and detailed specs. Few software teams/engineers are lucky enough to have such a client/employer.
  3. On the flip side, if the software fails, astronauts die. And the world’s media looks for someone to blame. And the US government will make sure some heads roll. The consequences of buggy software are high.

    "If the software isn't perfect, some of the people we go to meetings with might die.

  4. Finally, Fast Company is a media company. They publish high-quality content that attracts the eyes of consumers, which allows Fast Company to sell advertising or subscriptions. That gives them an incentive to sensationalise the story. NASA’s own “Computers in Spaceflight” appears to be a more realistic account of the history of NASA shuttle software.

Now: what are the costs?

Most notably, the Shuttle Group followed a strict waterfall process:

“...about one-third of the process of writing software happens before anyone writes a line of code.”

“The specs for [a change that involves 6,366 lines of code] run 2,500 pages.”

Second, the Shuttle Group was expensive to run:

“... the group’s $35 million per year budget is a trivial slice of the NASA pie, but on a dollars-per-line basis, it makes the group among the nation's most expensive software organizations.”

Extrapolating from some other numbers: the program had roughly 424,400 lines of code in 1996, which had been worked on for 21 years at a total cost of about $700M. That's roughly $1,650 per line of code ($700M ÷ 424,400 lines). (Not that lines of code is a meaningful measure of code complexity.) I've not accounted for inflation.

Those economic costs are just the cost of producing the software itself: travel and office expenses, but predominantly people's time (salaries).


I first read that article about 9 years ago. After rereading it last weekend, I realised it had a profound impact on how highly I value software quality; I have always strived for the most elegant, simple and readable code.

More recently I've learned that it's almost never worth the effort to get software to its highest possible quality. The problem with making software right in the first round of development is that you probably haven't found the best solution yet. You have probably missed a more elegant solution. And you probably have not fully understood the entirety of the problem and all its edge cases.

Further, the more time you spend refining a work, the more emotional value you give it. This makes it harder to recognize or admit that it might be the wrong solution, and much harder to delete.

(Good code deletion skills are highly valuable, by the way!)

In other words, we have to find the balance between "It works" and "It is right" (correct, easy to read, simple, elegant, deduplicated, etcetera), considering the risk of not making it right (yet) along with the time, people and money available.

Usually it is better to get something out which only mostly works but is not totally “right”, and only make it “right” once we understand the problem and possible solutions better. That might be hours or months later, depending on the scope and scale of the problem and solutions, how fast the feedback loop is, and what other priorities come up in the meantime. Or it might be never.

Often "making it right" happens when that code needs more work for a new feature. That is why we groan at the old smelly corners of our code that we have to work with that were "so awfully written". And why we like blame the since-departed colleague. We fail to recognize those smelly corners were probably perfect at their time (given the other constraints).

This illustrates the importance of a fast feedback loop. Automated testing, good logging & analysis tools, frequently publishing code (even if it's not quite right), and a small core of highly engaged customers (and/or awesome testers) all help tighten that feedback loop.

Automated testing is especially important when refactoring old smelly corners. Every edge case and subtlety must be captured by tests before you can safely start refactoring or modifying that code. But once you have complete test coverage, you can unleash the hounds on the refactoring without fear of breaking anything.
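For example, before touching a legacy checkout page you might pin its current behaviour down in characterisation scenarios like the sketch below. It is written in Gherkin; the URL, step wording and values are hypothetical, not any particular tool's vocabulary.

    Feature: Legacy checkout (characterisation tests)
      Pin down the current behaviour, quirks included, before refactoring.

      Scenario: A valid discount code reduces the order total
        Given I am on "https://example.com/checkout"
        When I enter the discount code "SAVE10"
        And I submit the order form
        Then I should see "Discount applied: 10%"

      Scenario: An expired discount code is silently ignored
        # Today's behaviour: no discount and no error message. Capture it
        # as-is so the refactoring doesn't change it by accident.
        Given I am on "https://example.com/checkout"
        When I enter the discount code "EXPIRED2015"
        And I submit the order form
        Then I should not see "Discount applied"
        And I should not see "Invalid code"

If the refactored code still passes both scenarios, including the quirky second one, the behaviour your users depend on hasn't changed.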

Obehave offers automated website tests for everyone. Anyone can write automated tests with Obehave, not just programmers. Tests are written in Gherkin, a plain-English syntax pioneered for Behaviour-Driven Development (BDD). Obehave tests can run on a schedule (every hour, day or week) and can integrate with your continuous integration environment.
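As a taste of the plain-English style, here is a hypothetical Gherkin scenario that could run every hour against a sign-in page; the URL, credentials and step wording are illustrative only, not Obehave's documented step library.

    Feature: Sign-in stays healthy
      Run hourly so a broken deploy is noticed quickly.

      Scenario: A registered user can sign in
        Given I am on "https://example.com/login"
        When I fill in "Email" with "tester@example.com"
        And I fill in "Password" with "a-known-test-password"
        And I press "Sign in"
        Then I should see "Welcome back"

Because the scenario reads as plain English, a product owner or support person can review it, or write the next one, without touching any code.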
