
Does OpenSSL bug prove that open source code doesn’t work?

April 10, 2014

By now most of you have read about the major bug that was found in OpenSSL, an open source security software toolkit. The bug itself is called the Heartbleed Bug, and there’s lots of information about it and how to fix it here. People are super upset about this, and lots of questions remain.
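
As far as I understand it, the bug boils down to a missing bounds check: the heartbeat code trusted the length field in the other side’s own message and echoed back that many bytes of memory. Here’s a stripped-down sketch of the pattern, with made-up names (a simplification, not the actual OpenSSL code):

```c
#include <stdlib.h>
#include <string.h>

/* A stripped-down sketch of the Heartbleed pattern. This is a simplification
 * with made-up names, not the actual OpenSSL code. The peer sends a heartbeat
 * message containing a payload plus a length field the peer itself chose. */
unsigned char *build_heartbeat_reply(const unsigned char *payload,
                                     size_t claimed_len, /* length field from the peer */
                                     size_t actual_len)  /* bytes actually received */
{
    unsigned char *reply = malloc(claimed_len);
    if (reply == NULL)
        return NULL;

    /* BUG: nothing checks claimed_len against actual_len, so a peer that
     * claims 64KB while sending one byte gets back up to ~64KB of whatever
     * happened to sit next to its payload in memory (keys, cookies, passwords). */
    memcpy(reply, payload, claimed_len);

    /* The fix is essentially one guard:
     *   if (claimed_len > actual_len) { free(reply); return NULL; } */
    return reply;
}
```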

For example, was it intentionally undermined? Has the NSA deliberately inserted weaknesses into this as well? It seems like the jury is out right now, but if I’m the guy who put in the bug, I’m changing my name and going undercover just in case.

Next, how widely was the weakness exploited? If you’re super worried about stuff, or if you are a particular target of attack, the answer is probably “widely.” The frustrating thing is that there’s seemingly no way to measure or test that assumption, since the attackers would leave no trace.

Here’s what I find the most interesting question: what will the long-term reaction be to open source software? People might think that open source code is a bust after this. They will complain that something like this should never have been allowed to happen – that the whole point of open software is that people should be checking this stuff as it comes in – and that it never would have happened if there were people getting paid to test the software.

First of all, the open source process did work as intended, even though it took two years instead of the two days people might have wanted. And maybe this shouldn’t have happened the way it did, but I suspect that people will learn this particular lesson really well from now on.

But in general terms, bugs are everywhere. Think about Knight Capital’s trading debacle or the ObamaCare website, just two famous recent problems with large-scale coding projects that aren’t open source.

Even when people are paid to fix bugs, they fix the kind of bugs that cause the software to stop a lot sooner than the kind of bug that doesn’t make anything explode, lets people see information they shouldn’t see, and leaves no trace. So for every Knight Capital there are tons of other bugs in software that continue to exist.

In other words it’s more a question of who knows about the bugs and who can exploit them. And of course, whether those weaknesses will ever be exposed to the public at all.

It would be great to see the OpenSSL bug story become, over time, a success story. This would mean, on the one hand, the nerds becoming more vigilant in checking vitally important code and learning to think like assholes, but on the other hand it would also mean the public acknowledging how freaking hard it is to program.

Categories: musing, open source tools
  1. April 10, 2014 at 7:05 am

    I think it’s not just that nerds need to become more vigilant. Some programming languages, like C, make it especially hard to avoid bugs like this, and especially hard to notice them no matter how many people look at the code. I’m hoping that the recent series of security flaws due to C coding errors will motivate the wholesale abandonment of C in favour of safer programming languages.

  2. RS
    April 10, 2014 at 7:51 am

    As good as the excellent explanation at http://nakedsecurity.sophos.com/2014/04/08/anatomy-of-a-data-leak-bug-openssl-heartbleed/ is at covering the heartbleed bug in detail, it also shows that tracing the feature to NSA intervention would be easy. If someone from the NSA really had tried to sneak this bug past people, it would be easy to spot the attempt in the logs of the version tracking software used. No names have been mentioned, which implies this is a genuine bug and not an attempt by the NSA. If this bug had been present in commercial software, the responsible company would probably have lied and denied it existed in the first place, increasing the risk and damage to everybody affected. If anything, the heartbleed bug shows why you should run open source and not commercial software.

    • April 10, 2014 at 7:56 am

      Great point!

    • araybold
      April 10, 2014 at 8:24 am

      On the contrary, the open source model would make it possible for an agent to establish a reputation as a contributor and then make a simple ‘mistake’.

      I followed the link that you provide, and found that nowhere in the original post or the comments are there statements supporting your claim that identifying any covert intent, let alone by the NSA specifically, would be easy. I suppose it is possible that it has been redacted, but as this is a blog from a respected security vendor, I would be very surprised if anything so patently wrong had been published.

      • April 10, 2014 at 9:13 am

        The code containing the bug was added by a German programmer, Robin Seggelmann, who worked for the German telecom T-Systems. I’m guessing it’s rather unlikely that the NSA would employ a German. His comments on the bug: http://www.smh.com.au/it-pro/security-it/man-who-introduced-serious-heartbleed-security-flaw-denies-he-inserted-it-deliberately-20140410-zqta1.html

        • araybold
          April 10, 2014 at 10:14 am

          In the words of Mandy Rice-Davies, “He would, wouldn’t he?” Also, espionage organizations have been known to recruit foreigners as agents.

          Nevertheless, I agree that it is more likely to be accidental than deliberate – these sorts of things happen all the time. I merely take issue with the claim that a deliberate attempt to subvert the code would be easily traced.

          It is also possible that one or more espionage or security agencies discovered, and perhaps exploited, this bug before we learned of it.

  3. araybold
    April 10, 2014 at 8:46 am

    Whether an attack could proceed without leaving a trace depends on the thoroughness of activity logging, how well secured that logging is, and the retention period of the logs.

    Traces of the attack have been seen, in server audit logs, going back at least to November:
    http://arstechnica.com/security/2014/04/heartbleed-vulnerability-may-have-been-exploited-months-before-patch/

  4. rageofreason
    April 10, 2014 at 8:51 am

    What’s remarkable about open source code, esp. core stuff like OpenSSL, is how well it works and how robust it is. It’s a bit of a generalization of course, but one huge driver of quality is intrinsic interest, pride of the developers and kudos of the community. I’ve spent over 20 years developing systems – and I certainly think I’m careful and methodical – but working on your own stuff (or ‘our’ stuff in open source) takes it to another level. You care more because you’re contributing to the commons, not just to further your organization’s ends.

  5. araybold
    April 10, 2014 at 9:02 am

    Your claim that the open-source ‘many eyes’ auditing mechanism worked as intended is a denial of the reality of what happened here. As was pointed out elsewhere, it is like saying the Titanic’s lifeboats worked.

    There is a persistent myth in the open source community, sometimes called Linus’ Law, that says ‘with enough eyes, all bugs are shallow’. This event shows that it is simplistic wishful thinking (and also that the criminal community has rather more, and more dedicated, eyes.) There is no open road to security.

    • Count Dracula
      April 10, 2014 at 7:33 pm

      Yes, I think in some strong sense it shows that Linus’ Law does not apply to finding this type of bug, i.e., one that does not show up ordinarily. It is very hard to get people to carefully review code. It is incredibly boring and yet it is not a trivial thing to do (there is a theorem saying one cannot write a program that always finds all bugs). The boring aspect makes it less likely that good coders will do this kind of job, etc, etc.

      This got me very depressed about the whole thing for a while.

      But looking into it more, I agree with Bruce Schneier that the jury is still out on whether this was exploited widely in the past. At this point there is some evidence (in the link you gave above — basically one case), but one would expect more logs to contain examples if it was used widely (and since we now know what to look for, you expect various people to have looked or be looking). We’ll see.

      • araybold
        April 11, 2014 at 9:15 am

        Agreed – we seem to have dodged a bullet here, at least as far as simple crime is concerned. While that is better than the alternative, it is not particularly reassuring.

      • Larry Headlund
        April 11, 2014 at 11:10 am

        “there is a theorem saying one cannot write a program that always finds all bugs”

        Variant of Gödel’s Incompleteness, as I recall.

  6. David18
    April 10, 2014 at 10:27 am

    There is mission critical software (eg, nuclear power plant control systems, airplane flight control computers) that requires a higher level of systemized development and testing (eg, with checklists and standardized procedures). Software that is not mission critical works well enough in the open source model. One might argue that security software such as OpenSSL belongs in the mission critical software that requires a more standardized model of development and testing.

  7. April 10, 2014 at 3:49 pm

    ” they fix the kind of bugs that cause the software to stop a lot sooner than the kind of bug that doesn’t make anything explode, lets people see information they shouldn’t see, and leaves no trace. ”

    Very good point. How can you fix something that you don’t know is broken? It’d be like looking for one leaky pipe in a building when there are no signs that anything is amiss with the plumbing. Even with thousands of eyes you still have to be looking in the right place to catch something.

    • April 10, 2014 at 11:10 pm

      In this case, a design decision by the developers (using their homebrew malloc instead of a library one) meant exactly that this bug didn’t cause the software to crash.

  8. April 10, 2014 at 10:48 pm

    “Even when people are paid to fix bugs, they fix the kind of bugs that cause the software to stop a lot sooner than the kind of bug that doesn’t make anything explode, lets people see information they shouldn’t see, and leaves no trace”

    Which is precisely why relying on testing to remove a large number of bugs is doomed. Eric Raymond liked to say “with enough eyeballs, all bugs are shallow.” That’s kind of a moot point if no more than a handful of people do anything more than read, rather than inspect, the code using the formal approaches pioneered at IBM and NASA in the 1970s. The problems are not unique to OSS: Apple’s recent “goto fail” bug should have been found by inspection or by simple static checks for dead code. There should have been a test, but sometimes it is hard to foresee or invoke the specific input conditions required to trigger the fault.
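
    From memory, the shape of the “goto fail” bug was roughly this (a paraphrase to show the pattern, not Apple’s exact source):

    ```c
    #include <stdio.h>

    /* Paraphrased sketch of the shape of the "goto fail" bug, not Apple's
     * exact source. The accidentally duplicated "goto fail;" is unconditional,
     * so the check after it is dead code, and err still holds 0 ("success")
     * when control jumps to the fail label. */
    static int check_steps(int step1_err, int step2_err, int step3_err)
    {
        int err = 0;

        if ((err = step1_err) != 0)
            goto fail;
        if ((err = step2_err) != 0)
            goto fail;
            goto fail;              /* <-- the duplicated line */
        if ((err = step3_err) != 0) /* never reached: dead code */
            goto fail;

    fail:
        return err;                 /* can return 0 even though step3 failed */
    }

    int main(void)
    {
        /* step3 reports an error (nonzero), yet the function returns 0. */
        printf("%d\n", check_steps(0, 0, 1));
        return 0;
    }
    ```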

    As for “mission critical”, does the code have to work? If it doesn’t matter if it works, why bother writing it at all? And asking which part of the code is critical is a little like asking which edge of the scissors cuts the paper.

    The data I have seen in peer-reviewed studies do indicate that some very widely used OSS products have higher levels of latent defects and security vulnerabilities than their commercial counterparts. There are multiple factors that may lead to this, only one of which is testing.

    • araybold
      April 22, 2014 at 9:15 pm

      “As for “mission critical”, does the code have to work? If it doesn’t matter if it works, why bother writing it at all?”
      There’s quite a lot of software that is useful even if not perfect, and will not do any real harm in failing – like the software running this blog, for example. Unfortunately, the superficial similarity of critical and non-critical applications sometimes leads to the former being developed with methods and skills that are only suitable for the latter. This is particularly likely when the critical parts are embedded in a generally non-critical context.
      With regard to security and safety, the open/closed divide is a false dichotomy. The divide that does matter, between the methods and skills required to create mission-critical software and those that are adequate for non-critical work, is under-appreciated, even by many people in software development.

  9. April 10, 2014 at 11:09 pm

    It’s important to realize that the bug itself is not even the real scandal, as far as the openssl project is concerned. The reason this bug took so long to discover and fix is that they decided to implement their own memory-allocation code rather than use the system C library. Bugs happen, but bad design is intentional.

    Reinventing the wheel is bad practice. In this case, it means that users couldn’t link openssl against secure malloc libraries (which, for example, zero allocated memory to prevent exactly this kind of leak, or place guard pages around allocated blocks).
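
    To make that concrete, here is a deliberately naive caching allocator of my own invention (an illustration of the pattern, with made-up names, not OpenSSL’s actual code). Because freed buffers are recycled without being scrubbed, a hardened system malloc never even sees them:

    ```c
    #include <stddef.h>
    #include <stdlib.h>

    /* A deliberately naive caching allocator, as an illustration only (this is
     * not OpenSSL's code, and the names are made up). Freed buffers go on a
     * freelist and are handed out again WITHOUT being scrubbed, so whatever
     * secrets they held are still in them, and a hardened system malloc (one
     * that zeroes memory or adds guard pages) never gets a chance to help,
     * because malloc/free are bypassed for recycled buffers. */
    struct cached_buf {
        struct cached_buf *next;
        size_t size;
        unsigned char data[];
    };

    static struct cached_buf *freelist = NULL;

    void *cache_alloc(size_t size)
    {
        if (freelist != NULL && freelist->size >= size) {
            struct cached_buf *reused = freelist;
            freelist = reused->next;
            return reused->data;   /* stale contents come along for the ride */
        }
        struct cached_buf *buf = malloc(sizeof(*buf) + size);
        if (buf == NULL)
            return NULL;
        buf->size = size;
        return buf->data;
    }

    void cache_free(void *p)
    {
        /* Recycle instead of releasing; a scrubbing allocator would zero
         * the buffer here before anything else. */
        struct cached_buf *buf = (struct cached_buf *)
            ((unsigned char *)p - offsetof(struct cached_buf, data));
        buf->next = freelist;
        freelist = buf;
    }
    ```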

    • April 11, 2014 at 7:59 am

      I agree strongly with the importance of design, and that includes good design practice, design with intent, and recording the design in a way that it can be analyzed and reviewed.

      A somewhat successful approach to delivering functionality with low defect density is to include a high proportion of “certified” components. That is, components that are not only reused, but have been certified through both development AND test to behave as specified. I have not analyzed the concept with respect to OSS, but while open source and community ownership have some advantages in reuse volume, it is not clear how certification would work with this model, since it would inhibit changes to the base and relies upon more than just testing.

      I cannot emphasize this enough: testing is necessary but insufficient. The test-and-fix method is NOT the scientific method, and it is not a very effective engineering method. There are a couple of deeper and longer essays here. The short version is that testing exercises the code under a very small portion of the total possible conditions, both internal and external. One cannot test to completion. On the other hand, faults discovered in test are a pretty good predictor of more latent, as yet undiscovered, problems. Rather than “proof,” think of testing as a poor man’s form of statistical verification.

  10. April 11, 2014 at 1:16 pm

    Thank you for this article, and for helping me find my identity too: after reading this I have identified that I have been thinking like an asshole for a while now (grin). Well, thinking and acting are two different things, so hopefully I’m not out trying to act like one:)

    I used your post, with full credits and link backs, over at my blog, as I thought it was important to get something out there other than the standard stuff you read in the news. I have quite a few consumer readers, and you are spot on about consumers needing to realize how difficult programming is, and I beat my readers over the head about this all the time. Those that have not come to this conclusion are still looking for the Algo Fairies, as I call it:) Hope I’m not “acting” like an asshole with that reference:)

    Cheers, and thank you for the great article. I love the reality here, as that’s a big issue I feel today, and I have written about it a few times too: people are confused and can’t tell where virtual world values begin and end and when the real world decides to intersect. Of course the real world is what matters, and we can use the virtual worlds to manage and get smarter, which is what they were meant for to begin with. I call it “The Grays” when priorities seem to sometimes get reversed out there, and we have steroid marketing and inflated stock values from intangibles as contributors to the growth of “The Grays” too.

  11. April 16, 2014 at 2:29 pm

    No, it only shows that programmers make mistakes – lots of them. The number of bug-free programs in the wild is very small, and the ratio of bug-free to buggy programs is effectively zero. This holds for all programs, open and closed source. The SSL problem shows software is a bit like strains of corn: if you plant all your fields with the same strain, the first pest that takes a liking to it could wipe out your crop.

  12. April 19, 2014 at 9:57 pm

    As a software developer of 10 years, I’d like to respond to some of your comments.

    @Benjamin Geer: The C language isn’t going anywhere. C maps very closely onto assembly language, which means that by using C the programmer can get about as close to the hardware as one can get. Things such as direct manipulation of memory by address are possible in C. This allows for very high performance. Most video games on your computer are written in C or C++, and so are many hardware drivers. When raw performance and maximum flexibility count, C is very often the language of choice. Other languages, such as Java, that run on top of a runtime that hides the lower levels have a serious overhead that impedes performance. This isn’t to say these languages are slow, but they aren’t the first ones you’ll use for complicated operations such as encryption. For an interesting read on unmanaged (C, C++) versus managed (C#) speed, see this write-up regarding a showdown between Raymond Chen (using unmanaged code) and Rico Mariani (using managed code). While Raymond had to do considerably more work to eventually beat the .Net runtime, he did beat it, and beat it badly. It took a lot of work that will most certainly not be worth it for many applications, but for things like SSL that work is worth it.

    http://blog.codinghorror.com/on-managed-code-performance-again/

    @Everyone: I’ll not be responding to the NSA paranoia.

    @araybold, @CountDracula, @LarryHedlund, @David18: I’m not an open source developer. I’m Microsoft all the way. Regardless, whether closed source or open source, there are issues on either side. No matter how many checks and balances you have, no matter how many eyes you have, after all these years of development something was bound to get through. This isn’t some failure of open source, an incompetent programmer, or some massive failure. This is a mistake. Plain and simple. The open source guys have their “many eyes” and the closed source guys have their methodologies as well, but things still get through. The problem is that if you employ enough double-checking to catch every possible error and guarantee 100% quality, you’ll never see another new car get made, another new house get built, or another program get written. 100% is not feasible, nor is it achievable. The open source guys did a great job for well over a decade before something like this happened. I say good job.

    @Liorsilberman: There may have been a good reason why it was decided to make a specialized memory allocator. Performance, or maybe some feature was required that the native allocator did not support. Since we are talking encryption, it is very possible the developers wanted memory allocated in a certain way so that the encryption keys would be preserved in RAM and guaranteed not to be swapped out to disk if memory got low for some reason. I can think of many good reasons why some programmers make their own memory allocators.
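
    As a sketch of that general idea (my own example, not what OpenSSL actually does), a “secrets” allocator might pin its pages in RAM so they can never be written to swap, and scrub them before handing the memory back:

    ```c
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    /* Sketch only: pin the pages holding key material into RAM so they
     * cannot be written out to swap, and wipe them before freeing. */
    void *secret_alloc(size_t len)
    {
        void *p = malloc(len);
        if (p == NULL)
            return NULL;
        if (mlock(p, len) != 0) {   /* POSIX: lock pages into physical memory */
            free(p);
            return NULL;
        }
        return p;
    }

    void secret_free(void *p, size_t len)
    {
        if (p == NULL)
            return;
        memset(p, 0, len);  /* wipe the secret (real code would use something
                               like explicit_bzero so the compiler can't
                               optimize the wipe away) */
        munlock(p, len);    /* allow the pages to be swappable again */
        free(p);
    }
    ```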

    @JohnBaker: Yes, we make mistakes. Some of us are new and just graduated college. Others of us are using a cool new library for the first time. Still others are dealing with documentation originally written in another language, so there are some translation issues. And, my personal favorite, we are all dealing with trying to write software that fits the needs of end users who have no idea what they are asking for or how to ask for it. Trying to work with business needs is HARD. Knowing four different languages (SQL, C#, HTML, VB) just to be effective in my job is HARD. And knowing all the different versions of those languages and the quirks between them is also HARD. There are days I’m surprised any of this stuff works at all.

    I hope I’ve helped. Everyone have a great weekend.

    JamesNT

  13. merian
    April 20, 2014 at 10:13 pm

    Even though more than a week has passed, I’m adding a late comment because I’m not convinced this comment thread has hit the main issue. And, I hope you’ll forgive me, but I think the post was one of your weaker ones. (And I feel I should apologize, as I’m an enormous fan of yours and all your writing.)

    The point I think is missing, despite being crucial, is that in the current open-source software field there are pieces of highly critical software that are the work of a very small group of (often specialist) authors, yet are either used by a huge number of other products or deployed on a huge number of systems, or both (as with OpenSSL). They are crucial bits of infrastructure, in both the architectural and the build-out sense. Yet they do not get anywhere near the same attention that many of the third-party products relying on them do, especially the commercial ones. The main thing I would wish to see coming out of this (and what I hear in some quarters goes in this direction) is that if a corporation has a commercial product (or indeed an open-source product) that relies critically on one of these pieces of open-source infrastructure, it might want to consider dedicating some of its expertise, or developing the necessary expertise, to get involved in that community: not only by reporting upstream any bugs it may find, but also by reviewing, vetting, understanding, and indeed improving the code. Indeed, some notable open-source projects have drawn exactly this conclusion, such as OpenBSD, some of the devs of which are currently reviewing OpenSSL (and are blogging what they’re finding: http://opensslrampage.org/ … well, nothing says you can’t be a bit of an asshole AND do the right thing).

    This would mean that commercial companies have an incentive to increase their involvement with open source.

    • April 21, 2014 at 6:56 am

      Great point! And yes I am not an expert here nor do I think about this topic very much in my daily life, so I really appreciate people like you!

  14. merian
    April 24, 2014 at 1:01 pm

    :blush:

    Anyhow, this just fluttered over my readers. The Open Source ecosystem (or socio-economic system, sometimes metaphors merge) is complicated. http://arstechnica.com/information-technology/2014/04/tech-giants-chastened-by-heartbleed-finally-agree-to-fund-openssl/
