Hey, as a starting point and a simplest-thing-that-could-possibly-work, how about making use of the publicity and pain of Heartbleed by launching a timely crowdfunding project to greatly increase the Internet Bug Bounty money for OpenSSL, and maybe (as a stretch goal?) start similar bounty pools for GnuTLS and other core crypto programs? Just stick it up on Kickstarter or wherever. There have to be plenty of unhappy sysadmins and IT managers who would readily chip in right at the moment, and the publicity should also help attract support from a much broader group of people who normally wouldn't even think about these things. The hoopla should also hopefully embarrass some big companies into making useful contributions to the pot. (Due credit to FB and Google for funding the IBB in the first place.) Strike while the iron is hot.
Yes, bug bounties on their own aren't a sufficient solution: more systematic, less ad hoc approaches are needed too, no doubt. But nonetheless I think it makes sense to double down on bug bounties, for two reasons. First, bug bounties are an approach that seems to be fairly successful, though the IBB money probably wasn't part of Neel Mehta's motivation to start inspecting OpenSSL at all. There may be better, though more ambitious, approaches, but if there's one thing that seems evident from the recent SSL/TLS bug fiasco it's that making the perfect, or even the great, the enemy of the pretty good is counterproductive here. Second, probably nothing will motivate the maintainers of OpenSSL etc. to actually implement (and keep implementing) systematic, proactive changes better than the constant, realistic threat of being pantsed again by bounty-hunting outsiders. (Remember: all technical problems are social problems in disguise, just as all social problems are technical problems in disguise. ;) ) Or failing that, fairly frequent bounty-funded bug finds would at least serve as a warning to the rest of us that the maintainers aren't doing very well.
I'd suggest that Plan A would be just to throw the money at HackerOne, as a donation to their Internet Bug Bounty program https://hackerone.com/ibb which is already set up and running.
JT Olds asks: "Why did Go implement its own SSL library? I've scanned the list and the website and I might have missed a design doc somewhere, but as far as I can tell I can't find any motivational reason for why Go implemented its own SSL library."
agl responds: "Have you seen the OpenSSL code? I do, indeed, work with it a lot and it doesn't exactly fill one with positive feelings. NSS has a different set of issues. (I'm not familiar with alternatives like Polar or GnuTLS.)"
Since Go's SSL library is written in Go, it should be a lot more resistant to Heartbleed-type errors. In Go you manipulate memory through slices, which are basically (pointer, length, capacity) triples (IIRC), and every slice access is bounds-checked at runtime. This is a nice feature of Go -- a new balance between safety and performance.
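To make that concrete, here's a minimal C sketch of the overread pattern in question (the names are mine, not OpenSSL's); the equivalent copy through a Go slice would panic at runtime instead of reading past the buffer:

    #include <stdint.h>
    #include <string.h>

    /* The attacker controls claimed_len; nothing ties it to the
       actual payload size, so memcpy happily reads past the end
       of the buffer and leaks whatever follows it in memory. */
    void echo_payload(uint8_t *out, const uint8_t *payload,
                      size_t payload_len, size_t claimed_len) {
        /* BUG: trusts claimed_len instead of payload_len */
        memcpy(out, payload, claimed_len);
    }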
> In general it does not help the fact that the system that is the de facto standard in today’s servers infrastructure, that is, Linux, has had, and still has, one of the worst allocators you will find around, mostly for licensing concerns, since the better allocators are not GPL but BSD licensed.
I couldn't find any really good, recent benchmarks of different allocators, but it does seem to be the case that ptmalloc is not as fast as tcmalloc. However, I suspect that at the time it was adopted, ptmalloc was one of the fastest options available.
Sorry, I was not aware that glibc malloc is also BSD licensed; thanks for providing more information about this. Given that, to me it is a big mystery why they don't migrate to something more reasonable like jemalloc, which provides less fragmentation and top performance. jemalloc is also very actively developed.
I don't think the Linux malloc implementation is why OpenSSL has some of its own memory management stuff built in. I think it's more due to the wide range of other platforms it runs on, including some embedded ones that have very strict allocation policies.
jemalloc may very well be a good thing for Linux in general but I don't think it would have changed this.
Yes, I don't want to imply that glibc malloc was the culprit here, but glibc malloc is one of the reasons why certain systems have their own malloc / slab allocator / whatever.
About the embedded systems: usually you don't try to cover everything; it is often a more valid approach to have specialized libraries for very resource-constrained systems. In the case of OpenSSL I guess the pressure was strong enough that the project bent to accommodate them instead. This is where being able to say "no" is a good idea.
I think this comes under the heading of squabbling over details.
To quote from your blog post:
"This is a sign that still performances, even in security critical code, is regarded with too much respect over safety. In this specific instance, it must be admitted that probably when the OpenSSL developers wrapped malloc, they never though of security implications by doing so. However the fact that they cared about a low-level detail like the allocation functions in some system is a sign of deep concerns about performances, while they should be more deeply concerned about the correctness / safety of the system."
As you stated in this bit of your post this is a sign that the OpenSSL team have their priorities backward. In other words they are NOT fit to run a security critical project. Period.
Historically, performance has ALWAYS been a major security issue. It's been claimed that it's easy to build highly secure ciphers and tools; what's difficult is building ones that also perform well. Performance was a major factor in the AES and SHA selection processes. Performance is a major factor in the interest in ECC. Arguably, performance is why you'd ever use C for anything at all.
It's easy to dismiss them and say their priorities are backwards, but fundamentally, at the end of the day, the huge majority of users will simply pick the benchmark winner and use it. It's a free project and their priorities are what they are; what are the priorities of the legions that used it instead of writing their own?
I believe the stress on performance is largely community-driven in a project like OpenSSL. A lot of environments did not switch to encrypted connections because of cost concerns, and unfortunately costs are related to OpenSSL performance. So it's not like they are crazy and have their priorities scrambled, IMHO. But it is time to re-evaluate.
Don't get me wrong -- performance can matter, but in the end it's up to the user of your cryptographic solution/process/abstraction to decide what tradeoff is acceptable to them. If you're responsible about it, you try to explain to potential users what the tradeoffs are -- you don't try to optimize before you can be sure that you're doing so sensibly (see next paragraph).
Of course you should try to optimize as much as is sensible, but if you don't have any processes in place (unit tests, integration tests, fuzz testing, formal methods, timing-attack analysis, whatever you can throw at it) that lend some assurance that Implementation #1 (optimized) actually does exactly the same as Implementation #0 (less optimized or unoptimized) under all circumstances, then you should be as conservative as humanly possible... and choose NOT to optimize unless/until you have those processes in place.
(I'll happily admit that this is somewhat armchairy -- I'm glad I don't work in a particularly security sensitive area. I'm not sure I could handle all the obligations/pressures myself.)
What would be the heap protection features you need? [1] lists a couple features that I find useful. Of course, in Firefox, we use valgrind and ASAN extensively in automation and during development, so it matters less for us.
(edit: to reduce confusion, I just want to say that Firefox uses jemalloc, otherwise my comment makes no sense. It's however disabled if you compile Firefox for use with valgrind).
Those are the debugging features. Those features (plus the tools you listed) help you during the test phase of your SDLC.
In contrast, heap protection / exploit mitigation is intended to protect your software in the field from the bugs you did not know about until after the software shipped. Correct software would not need ASLR, stack canaries, NX or heap protection. But these things are widely accepted as required these days.
Desirable features are mainly heap overflow protection and double free protection (see [1] for a list of what ptmalloc does). The situation is not entirely simple because the specific protections you want (say in terms of heap metadata protection) depend on the internal design of the allocator and there are a lot of devils in the details.
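As a tiny concrete example of the double-free side: on a typical glibc system this aborts with a double-free diagnostic instead of silently corrupting the heap (that behavior is allocator-specific; the C standard just calls it undefined):

    #include <stdlib.h>

    int main(void) {
        char *p = malloc(64);
        free(p);
        free(p);   /* ptmalloc's sanity checks catch this and abort */
        return 0;
    }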
There is some information around on jemalloc exploitation [2] [3], but I do not know if that information is current. I'm not aware of any jemalloc hardening which has happened since then. If you could provide any information in that area I'd be interested to know about it.
A bit of a tangent, but have you looked at musl libc's malloc? It's way simpler, primarily because it doesn't use per-thread heaps, yet its author claims that it's competitive with jemalloc on performance. I'm inclined to favor the simpler malloc.
Given the attention OpenSSL just "received" wouldn't it make sense to start a mass audit effort?
Why not counterbalance the "crypto software is hard, don't touch it" idea with some simple tasks, like: check all allocs in openssl for bounds checks? "mark" missing ones? "mark" fishy ones etc.
Can a software security audit be crowd-sourced and be meaningful?
The problem with #3 (Abstract C with libraries) is that, while there is less code to audit, in case of failure (e.g. a security hole in said library), damages are everywhere.
A good example is Daniel J. Bernstein. The man doesn't even trust the libc and wrote his own layer for all of his software (djbdns, qmail, ...). (source: http://cr.yp.to/qmail/guarantee.html, #7)
Though with the "weakest link" dynamics, I'm not sure "damages are everywhere" isn't a feature. If your library primitives are supposed to be designed such that they can only be used safely, and if there is a flaw in your primitives, the fact that everything uses those same primitives means that there's more testing of that same code in different situations, and so "damages are everywhere" means that errors are more likely to show up in some test. That is on top, of course, of DRY naturally shrinking the code that needs to be most carefully audited.
I think there is a good balance to strike between using third-party libraries (possibly untrusted) and plain C. Kind of like the tradeoff between centralized and decentralized networks.
So far as I can see TFA isn't even insisting on third party libs (even though he himself has released e.g. "a sane dynamic string library"); he just wants a refactoring, so that overflows and other problems are avoided in DRY fashion, using libraries. memcpy() isn't quite as smelly as gets(), but a well-designed library can certainly help even expert coders use it more safely.
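For illustration, a minimal sketch of that kind of wrapper, with hypothetical names (buf_t, buf_append) that aren't from any real library: the buffer carries its own capacity, so the bounds check lives in one well-audited place instead of at every memcpy() call site:

    #include <stddef.h>
    #include <string.h>

    typedef struct {
        unsigned char *data;
        size_t len;   /* bytes in use */
        size_t cap;   /* bytes allocated */
    } buf_t;

    /* Returns 0 on success, -1 if the copy would overflow.
       Assumes the invariant len <= cap, so cap - len can't wrap. */
    int buf_append(buf_t *b, const void *src, size_t n) {
        if (n > b->cap - b->len)
            return -1;
        memcpy(b->data + b->len, src, n);
        b->len += n;
        return 0;
    }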
I can't see how the DJB example supports the claim that "damages are everywhere". It's trivial to say that there could conceivably be a "balance" here (what human activity doesn't include the concept of balance?), but a cursory glance suggests that OpenSSL code is currently far from that point.
What about not designing protocols that will echo anything back? All that was really needed here was a UUID. Fixed length. Type-checked, preferably. They didn't need a client-specified size and client-specified content of 64k or under echoed back to them. That isn't the SSL server's job.
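For reference, a hedged sketch of the check the heartbeat handler was missing (my reconstruction, not the actual OpenSSL patch): validate the claimed payload length against the size of the record that actually arrived before echoing anything:

    #include <stdint.h>
    #include <string.h>

    #define HB_HDR 3    /* 1 type byte + 2 length bytes */
    #define HB_PAD 16   /* minimum padding per RFC 6520 */

    /* record/record_len: the raw heartbeat record as received.
       reply must have room for at least `claimed` bytes. */
    int handle_heartbeat(const uint8_t *record, size_t record_len,
                         uint8_t *reply) {
        if (record_len < HB_HDR + HB_PAD)
            return -1;                      /* runt record */
        size_t claimed = (record[1] << 8) | record[2];
        if (HB_HDR + claimed + HB_PAD > record_len)
            return -1;                      /* the Heartbleed check */
        memcpy(reply, record + HB_HDR, claimed);  /* now a safe echo */
        return 0;
    }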
It's amazing how much noise to how little action the programming field produces. If the aerospace industry was like the programming industry, stalled planes would be falling out of the sky 100x more often, there would be no stick shakers and automated warning facilities, and the bulk of the reaction would be devoted to saying how incompetent the crashed pilots were.
We need to take pragmatic actions that fix problems, not exchange recriminations and continue to live with the problems.
In other words, we as a field sometimes use this algorithm:
1) Identify the problem
2) Repeat
We should have fixed bounds checking and use after free errors by now.
>If the aerospace industry was like the programming industry, stalled planes would be falling out of the sky 100x more often
On the other hand, if the aerospace industry was like the programming industry, we'd now be flying 10,000,000,000x faster planes with warp drives that take us to A-Centauri in an hour or so. And everybody would have one such plane, folded in his pocket.
If you're going to carry the analogy to that ridiculous corner then a black-hat hacker with a heartbleed equivalent blows up Earth in an antimatter explosion.
Not any less ridiculous than the original (old and tired) analogy.
You brought up the rotten analogy. I'm talking about what humans comparatively actually do or don't do in different fields, not making an analogy. (Please look "analogy" up.) Historically, aerospace is very good about taking concrete actions that actually reduce accident rates. Historically, computer programming is pretty rotten at this. Our algorithm is literally: 1) identify the problem, 2) repeat.
It surprises me that something as important as OpenSSL doesn't have hundreds of coders studying every check in to find potential bugs, in particular obvious C errors.
Valgrind alone wouldn't have caught this bug. You would also need to trigger the bug, which would involve either specifically testing for it, or sending randomly invalid requests hoping to hit bugs.
Granted, security software should be tested against invalid requests.
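A toy sketch of that kind of test, assuming a parse_record() stand-in for whatever handler is under scrutiny: take a valid record, flip a random bit, feed it back in, and run the whole thing under valgrind or ASan so overreads actually fault:

    #include <stdlib.h>
    #include <string.h>

    int parse_record(const unsigned char *buf, size_t len); /* stand-in */

    void fuzz(const unsigned char *valid, size_t len, int iters) {
        unsigned char *m = malloc(len);
        for (int i = 0; i < iters; i++) {
            memcpy(m, valid, len);
            m[rand() % len] ^= (unsigned char)(1 << (rand() % 8));
            parse_record(m, len);   /* crash/ASan report == bug found */
        }
        free(m);
    }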
I think you are greatly exaggerating how hard it is to find invalid memory use in openssl.
According to Ted Unangst, all you have to do is turn off the custom allocator in openssl to immediately hit a use-after-free bug.
He writes:
"This bug would have been utterly trivial to detect when introduced had the OpenSSL developers bothered testing with a normal malloc (not even a security focused malloc, just one that frees memory every now and again). Instead, it lay dormant for years until I went looking for a way to disable their Heartbleed accelerating custom allocator."
The bug he is referring to is not in heartbeat, but rather in OpenSSL's use of its own allocator.
I am not claiming that it would be difficult to find invalid memory usage in OpenSSL; rather, I am claiming that the particular invalid usage that led to Heartbleed would be difficult to find, because it does not occur under normal circumstances.
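To illustrate why it stays hidden, here's a hypothetical freelist wrapper in the spirit of what Ted describes (not OpenSSL's actual code): "freed" blocks are parked on a private list with their contents intact, so a dangling pointer still reads plausible data, valgrind sees no invalid access, and recycled buffers full of old secrets are exactly what Heartbleed got to echo back:

    #include <stdlib.h>

    static void *freelist = NULL;   /* assume one fixed block size */

    void *my_alloc(size_t n) {
        if (freelist) {
            void *p = freelist;
            freelist = *(void **)p; /* pop a recycled block */
            return p;               /* old contents still in place */
        }
        return malloc(n);
    }

    void my_free(void *p) {
        *(void **)p = freelist;     /* push; memory never really freed */
        freelist = p;
    }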
Regarding point #5: if the performance of OpenSSL were terrible, we would not be having this discussion. This is a system-level component which must pass all encrypted traffic.
OpenSSL is ubiquitous and runs everywhere, including phones and the low-powered VPSs that everyone is using. If OpenSSL burned RAM and CPU cycles for the sake of correctness, alternatives would appear. The hard part about developing a library like OpenSSL is that it has to be fast AND secure.
It seems like companies that can afford to work on goggles & thermostats or drones & microblogs could set up small teams to review and test the core open source projects used by the larger Internet community.