
"LLVM IR at some fixed version" seems on the face of it like a reasonable attack on the problem, yes, and you can get close to solving the reproducibility problem that way. But you can't actually solve it, because LLVM IR is two to four orders of magnitude more complicated than Nock, so it's full of definitional ambiguities, which means that in practice it is defined by the behavior of its implementation, which is full of bugs, some of which must be fixed because they are security holes. Even without fixing the bugs, its behavior will vary depending on what compiler it's built with and what CPU and OS it runs on. Operations as commonplace as sequences of two floating-point additions can give different results, sometimes radically different, for example in the presence or absence of guard digits, to say nothing of non-IEEE-754-compliant hardware like Intel Gen. In this particular case GCC has a flag, -ffloat-store, to get deterministic behavior on some hardware, so you can read about the issue.
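To make the floating-point point concrete, here is a minimal Python sketch that imitates the effect. It is not taken from GCC or LLVM: single precision stands in for the narrow storage format, and Python's ordinary double stands in for a wider intermediate register. The helper name to_f32 and the chosen values are illustrative assumptions; real excess precision on x87 involves 80-bit registers rather than this pairing.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (an IEEE 754 double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 2**24 is where the spacing between adjacent single-precision values becomes 2.
a, b, c = 16777216.0, 1.0, 1.0

# Case 1: every intermediate result is stored back to the narrow format.
narrow = to_f32(to_f32(a + b) + c)

# Case 2: the intermediate sum stays in the wider format and is rounded only once.
wide = to_f32((a + b) + c)

print(narrow, wide)  # same inputs, same two additions, different answers
```

Both computations perform the same two additions on the same inputs; the only difference is whether the intermediate sum is rounded to the narrow format, which is roughly the kind of thing a store-to-memory flag controls.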

The same flaw (for this purpose) is present in JVM bytecode, i386 machine code, ARM2 machine code, CIL, and wasm. (e)BPF and the Z-machine might be plausible bases for a solution on those grounds, but you'd still need a lot of work to be sure you'd rooted out the sources of irreproducibility.

Now, as I said, I'm not convinced that Nock in practice solves the problem it sets out to, and Faré's criticisms that you have helpfully linked to are a major reason for that. But nothing else that exists does in fact set out to solve that problem, much less succeed. It's true that some other computational basis would also have worked, given enough effort; but I think you are seriously underestimating the amount of effort required.

I do think my previous comment already explained why I think rigorous computational reproducibility is valuable.




I mean, I'm happy to say that neither Nock nor anything else solves the problem, and I think that gets to the same conclusion: Nock is not a useful/worthwhile invention, it is an obscurantist rehashing of old ideas with more emphasis on looking elegant than being elegant.

That's what I'm pushing back on. There's an idea that because Nock exists and looks cool, it must mean something. That is no more true than the idea that because the rest of the blog where Nock was introduced exists and looks cool, it must be a meaningful philosophical contribution.

For instance - Nock, as both its advocates and detractors readily point out, is too low-level to be a language to actually write in. So that means you're writing in some other language which has an implementation in Nock (or perhaps in Nock-and-hopefully-equivalent-C, how's that for security risk surface). And that other language is rich enough to have security issues just like LLVM, and it's either frozen and you accept the potential insecurity, or it's patchable and you accept the potential nonreproducibility. You haven't solved the problem, you've just shoved it up one level so you can get a cool blog post about your elegant lower level.

I agree, I'd love to have a language that solves the problem you describe. I don't think Nock is tangibly closer to solving it than other things are, nor do I think it is tangibly closer to solving the Urbit system's needs than other things are, nor do I think it has some other unspecified merit. And I certainly feel no obligation to figure out some use case which would make Nock look meritorious.

(And, again, Google shipped pNaCl as a way to run untrusted native code. They had to do something to deal with the very concerns you raise. And unlike Urbit, the stakes for getting it wrong were very real.)


Well, it's good that we've gotten to the point of both agreeing that what Nock is designed to do is novel and useful. It's certainly up in the air whether Nock will succeed in doing it, but I think it has a better chance of doing it than systems that don't even try.

I want to point out an error in your reasoning here, though, which results in you seriously underestimating the importance of Nock's goal:

> you're writing in some other language which… [i]s either frozen and you accept the potential insecurity, or it's patchable and you accept the potential nonreproducibility

Suppose I gather some data, do some statistical predictions based on it, plot the results, write them up, and publish the whole bundle as a set of programs for some reproducible computing system, including a build script that rebuilds the article from the source text, the observation data, the statistical code, and so on, with a secure hash of the whole thing. I don't even need to include a PNG of the plot in the bundle — anyone in the future, whether in 02021, 02030, 03020, or 12020, can recompute that bitwise-identical plot from the source data. Moreover, if I made an error in my analysis — either due to a software bug or for any other reason — they can reproduce that error.
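As a rough sketch of what "a secure hash of the whole thing" could look like, assume the bundle is just a mapping from file names to bytes. The function name bundle_digest, the file names, and the length-prefixed encoding are hypothetical choices for illustration, not part of Nock, Urbit, or any existing system.

```python
import hashlib


def bundle_digest(files: dict[str, bytes]) -> str:
    """Hash a bundle so the digest depends only on names and contents,
    not on the order in which the files happen to be listed."""
    h = hashlib.sha256()
    for name in sorted(files):  # fixed order makes the digest deterministic
        encoded_name = name.encode("utf-8")
        data = files[name]
        # Length prefixes keep ("ab", "c") distinct from ("a", "bc").
        h.update(len(encoded_name).to_bytes(8, "big"))
        h.update(encoded_name)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


# Hypothetical bundle contents, standing in for the article's real inputs.
bundle = {
    "article.txt": b"source text of the write-up",
    "observations.csv": b"t,value\n0,1.5\n1,2.5\n",
    "stats.src": b"statistical code in some Julia-like language",
    "build.script": b"steps to rebuild the article",
}
print(bundle_digest(bundle))
```

The hash only pins down the inputs; whether recomputing the plot from them is bitwise-identical still depends entirely on the VM underneath being reproducible.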

Suppose the statistical code is compiled from a Julia-like language to bytecode for the reproducible VM. To achieve reproducibility, I could include the compiled bytecode in the bundle, but a better solution is to include the version of the compiler that I was using. That way, someone who wants to criticize my conclusions in the future can recompile my statistical source code with a new version of the compiler to see if my results were due to a compiler bug. Lacking that, they can reproduce my incorrect results, and they can recompile my source with a new compiler and get a different executable that produces different results, but it will be harder to tell why they were different.

In neither case, though, does fixing compiler bugs destroy the reproducibility of the computation, as you say it does. The executable compiled by the buggy compiler will continue to produce unchanged results on future versions of the VM, as long as they are not buggy. Shoving the problem up one level makes it into a completely different kind of problem with completely different implications.

I don't think there's any other system out there that attempts to deliver this level of reproducibility today, although some video game emulators come close. I'm probably going to have to write one myself, because I'm not convinced Nock is going to achieve it. Substantial parts of Dercuano and BubbleOS are steps toward this goal.



