
> Mistrust of commercial solutions does not translate into trust for open-source ones.

Well, how much can you trust the commercial ones? At least with open source you can look into the code and eventually find security holes. It's a step towards trust. With commercial solutions there is no trust to be gained, but with open source it's at least possible.

The fact that proprietary software fits a sound market economy may make it more functional and more attractive, but if you're concerned about ethics, that's an entirely different question.




> how much can you trust the commercial ones? At least with open source you can look into the code and eventually find security holes. It's a step towards trust. With commercial solutions there is no trust to be gained, but with open source it's at least possible.

Ever heard of reverse engineering? It turns out you'd need that approach even with open source, as soon as you use binaries you haven't compiled yourself. And you'd have to verify the compiler and your disassembler that way too. It's all possible, but it requires more than is currently being done, at least with the tools openly available.

And even if you manage to verify everything, you still have to check the computer itself. Modern computers, be they servers or notebooks, are starting to ship with BIOSes that can phone home and allow remote access outside your control (holding keys you can't control!).


"Ever heard of reverse engineering? It turns out you'd need even that approach even with open source as soon as you use binaries you haven't compiled yourself."

This is true: reverse engineering can be used for verification, but it's a whole lot more work than inspecting source.

"And you'd have to verify the compiler and your disassembler that way too."

This is false. You can verify the compiler with diverse double-compiling (DDC): http://www.dwheeler.com/trusting-trust


Am I missing something? Does this mean that to verify a compiler with DDC you need a trusted compiler that always produces the same binary output as the untrusted one, so that to verify GCC you need a trusted compiler that duplicates the whole of GCC's functionality? How practical is that approach? Proving that "hello world" produces the same output doesn't prove that the crypto functions haven't been patched.

Please give a specific example of what would be needed to prove GCC and LLVM trustworthy, as they are now.

EDIT: I'm not interested in toy compilers and theoretical pie-in-the-sky examples; I want to know how practical this is for systems in real use, GCC and LLVM as they are now, please. If the proposition is "suppose we have something that can compile the GCC sources and we trust it", tell me what that something is, whether it exists, and how hard it would be to make. Don't point me to your experiment where you change one line in tcc and then prove it's changed by comparing the binaries.


You're missing something.

The idea is to take one compiler source (S) and compile it with a diverse collection of compilers (Ck being a compiler in C0-CK), producing a diverse collection of binaries that are compilations of S: (Bk = Ck(S)). Because the different compilers are almost certainly not functionally identical, the various Bk should not be expected to be bitwise identical. However, because they are compilations of the same source, they should be functionally identical, or else one of the original compilers was broken (accidentally or deliberately).

So now we can compile that original source with the Bk compilers, and because these compilers are functionally identical, the results (Bk(S)) should be bitwise identical. There is certainly some chance of a false positive, due to bugs in the Ck compilers or exploitation of undefined behavior in S, but if you do get the same output (Bk(S)) from all of the Bk compilers then you can be pretty confident that there is no Trusting Trust style attack present: exceedingly so when the various compilers have diverse histories, making it exceedingly unlikely that all Ck compilers contain the same attack.

If there are any differences, you can manually inspect them to determine what the issue is and either file a bug report with the appropriate compiler, change the source (S) to avoid undefined behavior, or notify people of the attack present in the compiler in question, depending on what you find. This does involve some binary digging, but quite targeted compared to a full audit, and it may well not be necessary at all.

Obviously, if you do have a trusted compiler, including it in the mix is great, but the technique doesn't rely on this, nor on any two compilers returning the same binary output except when they are compilations of the same source.
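
For the procedurally minded, here's a minimal sketch of that two-stage check in Python. The build() helper and its --build-compiler flag are hypothetical stand-ins for each project's real configure/make procedure; the point is just the shape of the check:

    import hashlib
    import subprocess
    import tempfile
    from pathlib import Path

    def build(compiler: Path, source_dir: Path) -> Path:
        # Compile the compiler sources in source_dir with `compiler` and
        # return the resulting binary. (--build-compiler is a made-up flag
        # standing in for the real build procedure.)
        out = Path(tempfile.mkdtemp()) / "compiler.bin"
        subprocess.run([str(compiler), "--build-compiler", str(source_dir),
                        "-o", str(out)], check=True)
        return out

    def sha256(path: Path) -> str:
        return hashlib.sha256(path.read_bytes()).hexdigest()

    def ddc_check(source_dir: Path, compiler_pool: list[Path]) -> bool:
        # Stage 1: Bk = Ck(S). Built by different compilers, the Bk will
        # normally differ bitwise -- that's expected and fine.
        stage1 = [build(c, source_dir) for c in compiler_pool]
        # Stage 2: Bk(S). The Bk are functionally identical (same source),
        # so on identical input their outputs should be bitwise identical.
        stage2 = {sha256(build(b, source_dir)) for b in stage1}
        # One unique hash: no divergence. More than one: go inspect.
        return len(stage2) == 1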


Please explain exactly which steps and which assumptions would be needed to get a trusted GCC 4.8.1, both gcc and g++, and then to keep it trusted as new releases appear.

Then the same for LLVM.


I don't know enough about the details of the build dependencies of any of these projects to give exact steps. To get a known-clean build (that is, a build guaranteed to match the source) of GCC 4.8.1, plug the GCC 4.8.1 source into the procedure I gave above:

In case it wasn't clear, k is used for indexing, and I use "function application" f(x) to mean compilation of x by compiler f.

"Take one compiler source (GCC 4.8.1), and compile it with a diverse collection of compilers (Ck being a compiler in { C0 = GCC 4.8.1, C2 = LLVM, C3 = icc, C4 = visual c/c++, ...}[1]), producing a diverse collection of binaries that are compilations of GCC 4.8.1: (Bk = Ck(GCC 4.8.1)). Because the different compilers are almost certainly not functionally identical, the various Bk should not be expected to be bitwise identical. However, because they are compilations of the same source, they should be functionally identical, or one of the original compilers was broken (accidentally or deliberately). So now we can compile that original source with the Bk compilers, and because these compilers are functionally identical, the results (Bk(GCC 4.8.1)) should be bitwise identical. If there are any differences, you can manually inspect them to determine what the issue is and either issue a bug report to the appropriate compiler, change the source (GCC 4.8.1) to avoid undefined behavior, or notify people of the attack present in the compiler in question, depending on what you find. This does involve some binary digging, but quite targeted compared to a full audit and it may well not be necessary at all."

Likewise for any of the others, but note that once you've got a known-clean build of any (sufficiently capable) compiler you could use it to build known-clean builds of the others.

[1] The more compilers, and the more diverse their backgrounds, the better; it may well be worth including quite slow compilers that are proven correct and/or implemented in other (possibly interpreted) languages for a high degree of confidence.
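
In terms of the sketch upthread, and with purely illustrative install paths, that boils down to something like:

    from pathlib import Path

    ok = ddc_check(
        source_dir=Path("gcc-4.8.1/"),               # the source under test (S)
        compiler_pool=[Path("/usr/bin/gcc"),         # an existing GCC binary
                       Path("/usr/bin/clang"),       # LLVM
                       Path("/opt/intel/bin/icc")],  # icc
    )
    print("known-clean" if ok else "divergence: inspect the stage-2 binaries")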


One of the most useful forms of diversity is the "my opponent does not have access to a time machine" defense. E.g., use some C compiler for the Amiga, or a 1980s DEC Unix, or whatever, to bootstrap gcc3 for Windows, and use that to bootstrap clang for Linux, etc. The odds that hardware and binaries you've had for 30 years carry a trojan that successfully applies to a compiler that had not yet been written, for an architecture that had not yet been designed, inserting a trojan for yet another such pair, seem low. Feel free to follow more than one such path if paranoia dictates. When you arrive at the end (some compiler, built with itself), the binaries should all match however you got there, presuming no undefined behavior in the compiler itself. If there is some, fix it.

And better yet if the chosen starting point(s), being old, are also small and simple.
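
A rough sketch of following two such paths, reusing the hypothetical build()/sha256() helpers from upthread (compiler names and paths purely illustrative):

    from pathlib import Path

    def bootstrap(ancient: Path, chain: list[Path]) -> str:
        # Walk the chain: each hop's sources are compiled by the compiler
        # produced in the previous hop, starting from a decades-old binary.
        compiler = ancient
        for source in chain:
            compiler = build(compiler, source)
        # Finish with a self-compilation: the fixpoint every path should reach.
        return sha256(build(compiler, chain[-1]))

    # Two chains rooted in old binaries of very different provenance.
    a = bootstrap(Path("amiga-cc"), [Path("gcc3-src/"), Path("clang-src/")])
    b = bootstrap(Path("dec-ultrix-cc"), [Path("gcc3-src/"), Path("clang-src/")])
    assert a == b, "paths diverge: some ancestor may carry a trojan"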


I mostly agree, although be careful about cross-contamination if you intend to actually use DDC: clang bootstrapped by gcc3 is not going to be independent of gcc3.


You're not giving me a usable procedure. Let's say that only GCC can compile itself and its own libraries (e.g. version n-1 can compile version n). How can I make a trusted GCC 4.8.1 if other compilers won't compile the GCC sources? Do you agree that I would have to implement all the GCC features used in the GCC sources in one or more other compilers? If not, wouldn't I have to have a trusted GCC from the start? And if I had such a GCC, I wouldn't need the other implementations anyway.


I am not sure whether gcc has always been able to compile itself, but if it has, you can argue that at some point there existed a smallest kernel of gcc that did not depend on any of the gcc "features" that make it impossible for other compilers to compile gcc. Now, if such a kernel existed before, it probably still exists, because any incremental "feature" that made gcc impossible for other compilers to compile would have made it impossible for the previous gcc too. My bet is that there is a logical separation somewhere, and still a small kernel that you can bootstrap with other compilers, from which point you can do what your parent says.


You do need other compilers that can compile the GCC source. These do not need to be trusted, just diverse in origin so that they are unlikely to contain the same attacks.

If GCC is in fact the only thing that can compile GCC, then you cannot use DDC to get a trusted version of GCC.


Yes, you're missing something, unfortunately. The author apparently states it several times, but many people must miss it when reading.

"I say it in the ACSAC paper, and again in the dissertation, but somehow it does not sink in, so let me try again.

Both the ACSAC paper and dissertation do not assume that different compilers produce equal results. In fact, both specifically state that different compilers normally produce different results. In fact, as noted in the paper, it’s an improvement if the trusted compiler generates code for a different CPU architecture than the compiler under test (say, M68000 and 80x86). Clearly, if they’re generating code for different CPUs, the binary output of the two compilers cannot always be identical in the general case!

This approach does require that the trusted compiler be able to compile the source code of the parent of the compiler under test. You can’t use a Java compiler to directly compile C code."


You seem quite paranoid.

Open source was supposed to sweep away hidden code. I really doubt GCC or other compilers contain special code that gets reproduced each time you recompile a compiler with them.

If there were such self-reproducing code in a compiled GCC, it would be quite easy to find. There are many eyes looking at a program like GCC.

And even granting such a conspiracy theory, which is still possible, open source has a better margin than proprietary. It's not perfect, but it's much more transparent, if you get what I mean.




