They say that two heads are better than one. So it’s no wonder that many software companies are taking a friendlier look at pair programming. Two people work together on a single block of code: one is the driver, the other the navigator. The driver is responsible for carefully writing the code, and the navigator’s job is to review it and keep the roadmap in view. Much research points to significant gains in productivity and in the security and robustness of the resulting software. But does that hold for every problem?
Take the case of two programmers sitting over a single piece of code and thinking about an extreme case that might occur only in a very rare accident. They are working on an algorithm responsible for the behavior of an autonomous car. All they have in front of their eyes is a block of code, from which they try in vain to picture the scene that will one day really happen.
They don’t realize that they are working on a piece of software that will take a specific action for which, in real life, we would have only a second. The situation they are looking at is hypothetical, not describable in words or pictures; it just is. They are sitting and programming, drinking their fourth coffee, completely unaware of the gravity of the situation. Suddenly they make an irreversible decision without consulting anyone. Even if they had consulted their supervisor, would the supervisor know what risk the decision carried? In the automotive industry, software quality is not overseen as strictly as it is for airplanes. No one quite knows how to deal with this subject, and who would even ask how small decisions, made jointly by two programmers without consultation or a point of reference, can affect the world around us? After all, it’s just one line of code, a few characters, seemingly nothing. However, a few months later, someone dies on the road because of this small mistake.
The situation described is purely hypothetical and embellished. However, many years ago, it was a software error of this kind that left the driver of a Toyota Camry unable to stop. The female passenger was killed in the resulting crash. And not so long ago, in 2018, a self-driving Uber car killed a woman. The National Transportation Safety Board later revealed that the company’s autonomous test vehicles had been involved in 37 accidents in the 18 months leading up to that incident.
Try calling for help using software
On April 10, 2014, a woman living in Seattle called 911 over 35 times. All she heard in the receiver was a busy signal, and meanwhile, an unknown perpetrator was trying to break into her home. The same busy signal appeared throughout Washington state. 11 million Americans no longer had the ability to call for help.
A few years earlier, local support systems were abandoned in favor of more complex Internet-based software. A server located in Colorado handled dispatch from across the country, and that’s where all calls to the emergency number went.
It turned out that this major failure was caused by a small piece of code and a trivially simple error. Programmers had set a threshold for the total number of connections the software could accept. When the count reached that threshold, everything simply stopped working. There were no warning systems, and since software cannot be examined with the naked eye, nothing indicated that anything was wrong with it.
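To make the failure mode concrete, here is a minimal sketch of a lifetime connection ceiling, with and without an early warning. It is not the code of the actual dispatch system; the `CallRouter` class, its `warn_at` parameter, and the numbers used are hypothetical and chosen only for illustration.

```python
class CallRouter:
    """Hypothetical router that hands out sequential call IDs up to a ceiling."""

    def __init__(self, max_total_calls, warn_at=None):
        self.max_total_calls = max_total_calls
        self.warn_at = warn_at      # None reproduces the "no warning" behaviour
        self.total_calls = 0        # lifetime counter, never reset
        self.alerts = []            # stands in for a real monitoring channel

    def accept(self):
        """Return a call ID, or None once the lifetime ceiling is reached."""
        if self.total_calls >= self.max_total_calls:
            if self.warn_at is not None:
                self.alerts.append("ceiling reached: calls are being rejected")
            return None
        self.total_calls += 1
        if self.warn_at is not None and self.total_calls == self.warn_at:
            self.alerts.append(
                f"{self.total_calls}/{self.max_total_calls} call IDs used"
            )
        return self.total_calls


# Without a warning threshold, calls 4 and 5 are dropped and nobody is told.
silent = CallRouter(max_total_calls=3)
silent_results = [silent.accept() for _ in range(5)]

# With a warning threshold, operators hear about the problem before the limit.
guarded = CallRouter(max_total_calls=3, warn_at=2)
guarded_results = [guarded.accept() for _ in range(5)]
```

Both routers reject the same calls. The only difference is that the second one leaves a trail in `alerts`, which is exactly what was missing in the outage described above.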
What’s going on over our heads
Depending on the time of year and the time of day, there are between 8,000 and 20,000 airplanes over our heads. Statistically, flying is the safest form of transportation, even though each of these planes is a huge metal can with hundreds of miles of cables and a huge computing machine onboard (and another working on the ground). Since the 1980s, airplanes have become increasingly dependent on software. Fortunately, the Federal Aviation Administration takes a very strict approach to testing that software and making sure it’s documented. Thanks to that, the number of errors is relatively low. However, we have to remember that no matter how scrupulously we test software, it is impossible to find all the bugs and anticipate every possible cause of their occurrence.
In recent years alone, we have had dozens of cases of aircraft groundings due to miswritten code. In 2019, British Airways canceled more than 100 flights, and more than 200 were delayed. In May 2017, the same airline had problems with over 1,000 flights, its hotline, website, and mobile app.
However, this is nothing compared to the two Boeing 737 Max crashes, which were attributed to a miswritten program. Their cause was most likely a flaw in the MCAS software, which is meant to automatically prevent a stall (loss of lift). It is said that the defect stemmed from pressure to outsource work to less expensive contractors. The software was reportedly produced by a company that hired temporary workers from India at a rate of $9/hour. At the same time, Boeing was laying off its experienced engineers, resulting in a loss of control over the software being developed. Boeing would prepare the specifications, and the contractor would develop the software according to them. Several rows of workers sitting at desks worked on the software. What was delivered to Boeing reportedly did not pass even the simplest of tests, and its code was of very poor quality. Even though a huge number of irregularities were reported, it was impossible to catch them all. The cost-cutting decision proved dramatic in its consequences. We can only imagine the magnitude of the problem and wonder how safe the software on airplanes, which is updated quite often, really is. Boeing reported a $636 million loss for 2019.
Since we’re already so high
Wherever we go, we hear the opinion that everything should now run in the cloud. It’s where the future is. It’s the only thing that will give us high security, scalability, and quality for our business. The world around us, the devices connected to the Internet, and most software, websites, and mobile applications are based on such services. Of course, I’m not questioning the validity of this solution, but it’s worth asking whether it is being glorified too much.
One of the cloud solution providers is Amazon, which is regarded as the most reliable player in the market offering hosting services. Many companies of all sizes and industries store their data in AWS data centers. This includes well-known brands such as Netflix, Slack, Business Insider, IFTTT, Nest, Trello, Quora, and Splitwise. Many startups and other companies also base their operations on computing or microservices from this provider. So you can easily imagine what happened when, in 2017, a seemingly minor failure of AWS servers on the east coast of the United States contributed to hundreds of Internet problems around the world. Devices that depended on an Internet connection to operate also failed.
The coldest day of the year and the card walks free from jail
Google-owned Nest released a software update for its thermostats in December 2015. Unfortunately, the update failed, and 99.5% of users of these smart devices were left without access to hot water and heating on one of the coldest weekends that year.
This is not a problem of recent years
Between 1985 and 1987, at least 5 patients died, and many others were critically injured, because of a software error in radiation therapy equipment. A similar incident occurred later, in 2000. In 1991, a software error prevented the interception of a ballistic missile that struck a US barracks in Saudi Arabia, killing 28 people and injuring another 96.
Why such changes
Every time we as a society have ascended to a higher level of civilization, newer and newer solutions have appeared in our surroundings to make our lives easier, moving from less complex devices to more complex ones. For a long time, these devices had clear physical limitations. Whether something would happen, and whether it was possible at all, was determined by parts, gears, and mechanical elements, and later by uncomplicated electronics. So it was simple to predict all the possible states that a machine could be in. With software, it’s different. It is so flexible and so easy to change that it carries enormous possibilities but also enormous danger.
However, it seems that we know how to program. We know how to do it, at least in theory. There are dozens of software technologies and ways of making software. We have dedicated environments for this. We don’t have to program with zeros and ones anymore (it’s worth mentioning that one of the largest programs for writing software, Visual Studio, has 55 million lines of code). The software knows exactly what it is supposed to do, provided it is well defined. Flaws in those definitions are what we call software bugs, and these bugs are born in the minds of the people who write the code. After all, there is nothing physical about software; you can’t see its reflection. Software is a thousand times more complex than mechanical elements, and as a result, we often create something beyond our cognitive and intellectual capacity.
Code is a kind of handiwork. When you are manually producing 10,000 lines of code, there is little problem with controlling it. But once we’re talking about 30 million or 100 million lines, as in the case of Tesla’s software or other high-end cars, it becomes enormously expensive and complicated not only to create but also to test and maintain.
There is a recent trend to replace programmers with more software so that machines create the code. I am very skeptical about this. After all, the machines that create the software will still be built by the same people who currently write the code. One bug introduced by those programmers could eventually be reproduced by the machine over and over, without end. Another idea for software development is to create it through flowcharts that describe the program’s rules, with the computer itself generating the code from those rules.
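As a rough sketch of that last idea, the snippet below turns a small rule table into Python source and then loads the generated function. Everything here is hypothetical: the rule names, the `generate_source` helper, and the default action are invented for illustration and do not come from any real tool.

```python
# Hypothetical rule table: (condition name, action), checked in order.
RULES = [
    ("obstacle_ahead", "brake"),
    ("light_is_red", "brake"),
    ("lane_clear", "accelerate"),
]


def generate_source(rules, default):
    """Emit the source of a decide(state) function from the rule table."""
    lines = ["def decide(state):"]
    for condition, action in rules:
        lines.append(f"    if state.get({condition!r}, False):")
        lines.append(f"        return {action!r}")
    lines.append(f"    return {default!r}")
    return "\n".join(lines) + "\n"


# Generate the code, then load it into an isolated namespace.
source = generate_source(RULES, default="hold_speed")
namespace = {}
exec(source, namespace)
decide = namespace["decide"]
```

Calling `decide({"obstacle_ahead": True})` returns `"brake"`, and an empty state falls through to `"hold_speed"`. The generator copies the table faithfully, so a wrong rule in the table becomes a wrong branch in every program generated from it.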
I think that the changes around us, and the fact that software is eating the world, can no longer be undone. However, it’s worth remembering the limitations of software and that the people creating it are not infallible. We have to be careful not to fill the gaps in the programming world with people who may not be suitable for it (after all, not everyone is predisposed to be a doctor, teacher, kindergarten teacher, or nuclear physicist), or who may have a negative impact on the created software and its security. This is one of the many current dangers we face.