Hey, as a starting point and a simplest-thing-that-could-possibly-work, how about making use of the publicity and pain of Heartbleed by launching a timely crowdfunding project to greatly increase the Internet Bug Bounty money for OpenSSL, and maybe (as a stretch goal?) start similar bounty pools for GnuTLS and other core crypto programs? Just stick it up on Kickstarter or wherever. There have to be plenty of unhappy sysadmins and IT managers who would readily chip in right at the moment, and the publicity should also help attract support from a much broader group of people who normally wouldn't even think about these things. The hoopla should also hopefully embarrass some big companies into making useful contributions to the pot. (Due credit to FB and Google for funding the IBB in the first place.) Strike while the iron is hot.
Yes, bug bounties on their own aren't a sufficient solution: more systematic, less ad hoc approaches are needed too, no doubt. But nonetheless I think it makes sense to double down on bug bounties, for two reasons. First, bug bounties are an approach that seems to be fairly successful, though the IBB money probably wasn't part of Neel Mehta's motivation to start inspecting OpenSSL at all. There may be better, though more ambitious, approaches, but if there's one thing that seems evident from the recent SSL/TLS bug fiasco it's that making the perfect, or even the great, the enemy of the pretty good is counterproductive here. Second, probably nothing will motivate the maintainers of OpenSSL etc. to actually implement (and keep implementing) systematic, proactive changes better than the constant, realistic threat of being pantsed again by bounty-hunting outsiders. (Remember: all technical problems are social problems in disguise, just as all social problems are technical problems in disguise. ;) ) Or failing that, fairly frequent bounty-funded bug finds would at least serve as a warning to the rest of us that the maintainers aren't doing very well.
I'd suggest that Plan A would be just to throw the money at HackerOne, as a donation to their Internet Bug Bounty program https://hackerone.com/ibb which is already set up and running.
JT Olds asks: "Why did Go implement its own SSL library? I've scanned the list and the website and I might have missed a design doc somewhere, but as far as I can tell I can't find any motivational reason for why Go implemented its own SSL library."
agl responds: "Have you seen the OpenSSL code? I do, indeed, work with it a lot and it doesn't exactly fill one with positive feelings. NSS has a different set of issues. (I'm not familiar with alternatives like Polar or GnuTLS.)"
Since Go's SSL library is written in Go, it should be a lot more resistant to Heartbleed-type errors. In Go you manipulate memory through slices, which are basically (pointer, length, capacity) triples (IIRC), and every slice access is bounds-checked at runtime. This is a nice feature of Go -- a new balance between safety and performance.
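To make that concrete, here's a minimal C sketch of the overread pattern in question (the names are mine, not OpenSSL's); the equivalent copy through a Go slice would panic at runtime instead of reading past the buffer:

    #include <stdint.h>
    #include <string.h>

    /* The attacker controls claimed_len; nothing ties it to the
       actual payload size, so memcpy happily reads past the end
       of the buffer and leaks whatever follows it in memory. */
    void echo_payload(uint8_t *out, const uint8_t *payload,
                      size_t payload_len, size_t claimed_len) {
        /* BUG: trusts claimed_len instead of payload_len */
        memcpy(out, payload, claimed_len);
    }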
> In general it does not help the fact that the system that is the de facto standard in today’s servers infrastructure, that is, Linux, has had, and still has, one of the worst allocators you will find around, mostly for licensing concerns, since the better allocators are not GPL but BSD licensed.
I couldn't find any really good, recent benchmarks of different allocators, but it does seem to be the case that ptmalloc is not as fast as tcmalloc. However, I suspect that at the time it was adopted, ptmalloc was one of the fastest options available.
Sorry, I was not aware that glibc malloc is also BSD licensed; thanks for providing more information about this. Given that, to me it is a big mystery why they don't migrate to something more reasonable like jemalloc, which provides less fragmentation and top performance. jemalloc is also very actively developed.
I don't think the Linux malloc implementation is why OpenSSL has some of its own memory management stuff built in. I think it's more due to the wide range of other platforms it runs on, including some embedded ones that have very strict allocation policies.
jemalloc may very well be a good thing for Linux in general but I don't think it would have changed this.
Yes, I don't want to imply that glibc malloc was the culprit here, but glibc malloc is one of the reasons why certain systems have their own malloc / slab allocator / whatever.
About the embedded systems: usually you don't try to cover everything; it is often a more valid approach to have specialized libraries for very resource-constrained systems. In the case of OpenSSL I guess the pressure was strong enough that the project bent to accommodate them instead. This is where being able to say "no" is a good idea.
I think this comes under the heading of squabbling over details.
To quote from your blog post:
"This is a sign that still performances, even in security critical code, is regarded with too much respect over safety. In this specific instance, it must be admitted that probably when the OpenSSL developers wrapped malloc, they never though of security implications by doing so. However the fact that they cared about a low-level detail like the allocation functions in some system is a sign of deep concerns about performances, while they should be more deeply concerned about the correctness / safety of the system."
As you stated in this bit of your post this is a sign that the OpenSSL team have their priorities backward. In other words they are NOT fit to run a security critical project. Period.
Historically, performance has ALWAYS been a major security issue. It's been claimed that it's easy to build highly secure ciphers and tools; what's difficult is building ones that also perform well. Performance was a major factor in the AES and SHA selection processes. Performance is a major factor in the interest in ECC. Arguably, performance is why you'd ever use C for anything at all.
It's easy to dismiss them and say their priorities are backwards, but fundamentally, at the end of the day, the huge majority of users will simply pick the benchmark winner and use it. It's a free project and their priorities are what they are; what are the priorities of the legions that used it instead of writing their own?
I believe the stress on performance is largely community-driven in a project like OpenSSL. A lot of environments did not switch to encrypted connections because of cost concerns, and unfortunately costs are related to OpenSSL performance. So it's not like they are crazy and have their priorities scrambled, IMHO. But it is time to re-evaluate.
Don't get me wrong -- performance can matter, but in the end it's up to the user of your cryptographic solution/process/abstraction to decide what tradeoff is acceptable to them. If you're responsible about it, you try to explain to potential users what the tradeoffs are -- you don't try to optimize before you can be sure that you're doing so sensibly (see next paragraph).
Of course you should try to optimize as much as is sensible, but if you don't have any processes in place (unit tests, integration tests, fuzz testing, formal methods, timing-attack analysis, whatever you can throw at it) that lend some assurance that Implementation #1 (optimized) actually does exactly the same as Implementation #0 (less optimized or unoptimized) under all circumstances, then you should be as conservative as humanly possible... and choose NOT to optimize unless/until you have those processes in place.
(I'll happily admit that this is somewhat armchairy -- I'm glad I don't work in a particularly security sensitive area. I'm not sure I could handle all the obligations/pressures myself.)
What would be the heap protection features you need? [1] lists a couple features that I find useful. Of course, in Firefox, we use valgrind and ASAN extensively in automation and during development, so it matters less for us.
(edit: to reduce confusion, I just want to say that Firefox uses jemalloc, otherwise my comment makes no sense. It's however disabled if you compile Firefox for use with valgrind).
Those are the debugging features. Those features (plus the tools you listed) help you during the test phase of your SDLC.
In contrast, heap protection / exploit mitigation is intended to protect your software in the field from the bugs you did not know about until after the software shipped. Correct software would not need ASLR, stack canaries, NX or heap protection. But these things are widely accepted as required these days.
Desirable features are mainly heap overflow protection and double free protection (see [1] for a list of what ptmalloc does). The situation is not entirely simple because the specific protections you want (say in terms of heap metadata protection) depend on the internal design of the allocator and there are a lot of devils in the details.
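As a tiny concrete example of the double-free side: on a typical glibc system this aborts with a double-free diagnostic instead of silently corrupting the heap (that behavior is allocator-specific; the C standard just calls it undefined):

    #include <stdlib.h>

    int main(void) {
        char *p = malloc(64);
        free(p);
        free(p);   /* ptmalloc's sanity checks catch this and abort */
        return 0;
    }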
There is some information around on jemalloc exploitation [2] [3], but I do not know if that information is current. I'm not aware of any jemalloc hardening which has happened since then. If you could provide any information in that area I'd be interested to know about it.
A bit of a tangent, but have you looked at musl libc's malloc? It's way simpler, primarily because it doesn't use per-thread heaps, yet its author claims that it's competitive with jemalloc on performance. I'm inclined to favor the simpler malloc.
Given the attention OpenSSL just "received" wouldn't it make sense to start a mass audit effort?
Why not counterbalance the "crypto software is hard, don't touch it" idea with some simple tasks, like: check all allocs in openssl for bounds checks? "mark" missing ones? "mark" fishy ones etc.
Can a software security audit be crowd-sourced and be meaningful?
The problem with #3 (Abstract C with libraries) is that, while there is less code to audit, in case of failure (e.g. a security hole in said library), damages are everywhere.
A good example is Daniel J. Bernstein. The man doesn't even trust the libc and wrote his own layer for all of his software (djbdns, qmail, ...). (source: http://cr.yp.to/qmail/guarantee.html, #7)
Though with the "weakest link" dynamics, I'm not sure "damages are everywhere" isn't a feature. If your library primitives are supposed to be designed such that they can only be used safely, and if there is a flaw in your primitives, the fact that everything uses those same primitives means that there's more testing of that same code in different situations, and so "damages are everywhere" means that errors are more likely to show up in some test. That is on top, of course, of DRY naturally shrinking the code that needs to be most carefully audited.
I think there is a good balance to strike between using third-party libraries (possibly untrusted) and plain C. Kind of like the tradeoff between centralized and decentralized networks.
So far as I can see TFA isn't even insisting on third party libs (even though he himself has released e.g. "a sane dynamic string library"); he just wants a refactoring, so that overflows and other problems are avoided in DRY fashion, using libraries. memcpy() isn't quite as smelly as gets(), but a well-designed library can certainly help even expert coders use it more safely.
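For illustration, a minimal sketch of that kind of wrapper, with hypothetical names (buf_t, buf_append) that aren't from any real library: the buffer carries its own capacity, so the bounds check lives in one well-audited place instead of at every memcpy() call site:

    #include <stddef.h>
    #include <string.h>

    typedef struct {
        unsigned char *data;
        size_t len;   /* bytes in use */
        size_t cap;   /* bytes allocated */
    } buf_t;

    /* Returns 0 on success, -1 if the copy would overflow.
       Assumes the invariant len <= cap, so cap - len can't wrap. */
    int buf_append(buf_t *b, const void *src, size_t n) {
        if (n > b->cap - b->len)
            return -1;
        memcpy(b->data + b->len, src, n);
        b->len += n;
        return 0;
    }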
I can't see how the DJB example supports the claim that "damages are everywhere". It's trivial to say that there could conceivably be a "balance" here (what human activity doesn't include the concept of balance?), but a cursory glance suggests that OpenSSL code is currently far from that point.
What about not designing protocols that will echo anything back? All that was really needed here was a UUID. Fixed length. Type-checked, preferably. They didn't need a client-specified size and client-specified content of 64k or under echoed back to them. That isn't the SSL server's job.
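For reference, a hedged sketch of the check the heartbeat handler was missing (my reconstruction, not the actual OpenSSL patch): validate the claimed payload length against the size of the record that actually arrived before echoing anything:

    #include <stdint.h>
    #include <string.h>

    #define HB_HDR 3    /* 1 type byte + 2 length bytes */
    #define HB_PAD 16   /* minimum padding per RFC 6520 */

    /* record/record_len: the raw heartbeat record as received.
       reply must have room for at least `claimed` bytes. */
    int handle_heartbeat(const uint8_t *record, size_t record_len,
                         uint8_t *reply) {
        if (record_len < HB_HDR + HB_PAD)
            return -1;                      /* runt record */
        size_t claimed = (record[1] << 8) | record[2];
        if (HB_HDR + claimed + HB_PAD > record_len)
            return -1;                      /* the Heartbleed check */
        memcpy(reply, record + HB_HDR, claimed);  /* now a safe echo */
        return 0;
    }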
It's amazing how much noise to how little action the programming field produces. If the aerospace industry was like the programming industry, stalled planes would be falling out of the sky 100x more often, there would be no stick shakers and automated warning facilities, and the bulk of the reaction would be devoted to saying how incompetent the crashed pilots were.
We need to take pragmatic actions that fix problems, not exchange recriminations and continue to live with the problems.
In other words, we as a field sometimes use this algorithm:
1) Identify the problem
2) Repeat
We should have fixed bounds checking and use after free errors by now.
>If the aerospace industry was like the programming industry, stalled planes would be falling out of the sky 100x more often
On the other hand, if the aerospace industry was like the programming industry, we'd now be flying 10,000,000,000x faster planes with warp drives that take us to A-Centauri in an hour or so. And everybody would have one such plane, folded in his pocket.
If you're going to carry the analogy to that ridiculous corner then a black-hat hacker with a heartbleed equivalent blows up Earth in an antimatter explosion.
Not any less ridiculous than the original (old and tired) analogy.
You brought up the rotten analogy. I'm talking about what humans comparatively actually do or don't do in different fields, not making an analogy. (Please look "analogy" up.) Historically, aerospace is very good about taking concrete actions that actually reduce accident rates. Historically, computer programming is pretty rotten at this. Our algorithm is literally: 1) identify the problem, 2) repeat.
It surprises me that something as important as OpenSSL doesn't have hundreds of coders studying every check in to find potential bugs, in particular obvious C errors.
Valgrind alone wouldn't have caught this bug. You would also need to trigger the bug, which would involve either specifically testing for it, or sending randomly invalid requests hoping to hit bugs.
Granted, security software should be tested against invalid requests.
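A toy sketch of that kind of test, assuming a parse_record() stand-in for whatever handler is under scrutiny: take a valid record, flip a random bit, feed it back in, and run the whole thing under valgrind or ASan so overreads actually fault:

    #include <stdlib.h>
    #include <string.h>

    int parse_record(const unsigned char *buf, size_t len); /* stand-in */

    void fuzz(const unsigned char *valid, size_t len, int iters) {
        unsigned char *m = malloc(len);
        for (int i = 0; i < iters; i++) {
            memcpy(m, valid, len);
            m[rand() % len] ^= (unsigned char)(1 << (rand() % 8));
            parse_record(m, len);   /* crash/ASan report == bug found */
        }
        free(m);
    }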
I think you are greatly exaggerating how hard it is to find invalid memory use in openssl.
According to Ted Unangst, all you have to do is turn off the custom allocator in openssl to immediately hit a use-after-free bug.
He writes:
"This bug would have been utterly trivial to detect when introduced had the OpenSSL developers bothered testing with a normal malloc (not even a security focused malloc, just one that frees memory every now and again). Instead, it lay dormant for years until I went looking for a way to disable their Heartbleed accelerating custom allocator."
The bug he is referring to is not in heartbeat, but rather in OpenSSL's use of its own allocator.
I am not claiming that it would be difficult to find invalid memory usage in OpenSSL; rather, I am claiming that the particular invalid usage that led to Heartbleed would be difficult to find, because it does not occur under normal circumstances.
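To illustrate why it stays hidden, here's a hypothetical freelist wrapper in the spirit of what Ted describes (not OpenSSL's actual code): "freed" blocks are parked on a private list with their contents intact, so a dangling pointer still reads plausible data, valgrind sees no invalid access, and recycled buffers full of old secrets are exactly what Heartbleed got to echo back:

    #include <stdlib.h>

    static void *freelist = NULL;   /* assume one fixed block size */

    void *my_alloc(size_t n) {
        if (freelist) {
            void *p = freelist;
            freelist = *(void **)p; /* pop a recycled block */
            return p;               /* old contents still in place */
        }
        return malloc(n);
    }

    void my_free(void *p) {
        *(void **)p = freelist;     /* push; memory never really freed */
        freelist = p;
    }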
Regarding point #5: if the performance of OpenSSL were terrible, we would not be having this discussion. This is a system-level component which must pass all encrypted traffic.
OpenSSL is ubiquitous and runs everywhere, including phones and the low-powered VPSs that everyone is using. If OpenSSL burned RAM and CPU cycles for the sake of correctness, alternatives would appear. The hard part about developing a library like OpenSSL is that it has to be fast AND secure.
It seems like companies that can afford to work on goggles & thermostats or drones & microblogs could set up small teams to review and test the core open source projects used by the larger Internet community.