Hacker News

Probably the code used to generate the 200TB was not 200TB itself, right?

So in a sense, the code is the proof, and the 200TB is just something it generates as a side effect.




In fact you can encode any proof in a few hundred lines. Just write a program that brute-forces through all possible proofs. If that counts, then no proof need be longer than its conjecture plus a constant.
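A minimal sketch of that brute-force idea in Python. The proof checker is a hypothetical stand-in (a real one would verify a formal derivation); the point is only the enumeration-in-length-order structure:

```python
from itertools import count, product

def brute_force_proof(conjecture, is_valid_proof, alphabet="ab"):
    """Enumerate all strings in length order; return the first one the
    checker accepts.  Halts iff some proof exists; if none exists, the
    loop runs forever (proof search is only semi-decidable)."""
    for n in count(0):
        for chars in product(alphabet, repeat=n):
            candidate = "".join(chars)
            if is_valid_proof(candidate, conjecture):
                return candidate

# Toy demonstration with a stand-in checker: here the "proof" of a
# conjecture is simply the conjecture string itself.
toy_checker = lambda candidate, conjecture: candidate == conjecture
print(brute_force_proof("aab", toy_checker))  # -> aab
```

The program itself is tiny; all the size has been traded for (potentially unbounded) running time.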


And then if we get enough monkeys/children in a room tapping away, eventually they'll write enough programs to solve every problem. That's it, maths is solved.


I don't know, they tried that with Windows Phone apps...


I'm not sure about that. Those "all possible proofs" form a countably infinite set. To pick a specific proof from the set, you would need an arbitrarily large number as the index, so your few hundred lines will not be enough (assuming line sizes are finite).


I think the "plus a constant" here refers to a bound on the size of the proof. So you don't need the exact index, just a really rough upper bound. (This would, of course, be unreasonably inconvenient to check.)


Hooray for proof search techniques. How much data would the Gödel-numbering process generate, though? I'm assuming such a search would use it, unless I'm mistaken about my mathematical logic.


I'm not sure: doesn't the halting problem apply to this? In other words, can you write such a brute force program that is guaranteed to halt after a finite number of steps, providing a yes or no? I guess not.


It doesn't, and it's interesting why. You can enumerate all possible proofs. If there is a proof of a given statement, you will eventually find it. But if there is no proof, you will never know, because it always could be the next proof you haven't looked at.

For the same reason, for any program that halts, you can run it long enough, and see that it halts. But (unless you are lucky enough to detect a loop), if it doesn't halt, you will never witness that, you will only be able to say "it hasn't halted yet".

It definitely can take a while until that kind of asymmetry becomes intuitive.
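The asymmetry can be seen in a small Python sketch: a step-bounded runner can positively confirm halting, but for a non-halting program it can only ever report "hasn't halted yet". The toy generators here stand in for arbitrary computations, yielding once per step:

```python
def run_for(steps, program):
    """Run `program` (a generator yielding once per step, returning on
    halt) for at most `steps` steps.  Can return ("halted", n) or
    ("still running", steps) -- but never "loops forever"."""
    for n in range(steps):
        try:
            next(program)
        except StopIteration:
            return ("halted", n)
    return ("still running", steps)

def halts_quickly():        # halts after 3 steps
    for _ in range(3):
        yield

def loops_forever():        # never halts
    while True:
        yield

print(run_for(100, halts_quickly()))   # ('halted', 3)
print(run_for(100, loops_forever()))   # ('still running', 100)
```

No matter how large the step budget, the second answer never strengthens to "does not halt"; that is exactly the semi-decidability described above.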


> But if there is no proof, you will never know, because it always could be the next proof you haven't looked at.

But that's exactly what I meant. So the brute force program proposed in the post I replied to doesn't exist, right?

Let's say you have proposition A. You write a program to brute force proofs for A. If it finds a proof, it will halt and output Yes; if it finds a disproof, it will halt and output No; otherwise it will check the next proof.

So if you say that your brute force program can prove/disprove proposition A (and A1, A2, A3, ... for that matter), you will have to prove that your program is guaranteed to halt after a finite number of steps. Now that's proposition B.

If you find a clever proof of proposition B, then you can claim that you wrote a program that can prove or disprove proposition A - maybe in a zillion years, but it can.

If you can't find such a proof of proposition B, you can always try to write a brute force program to prove/disprove proposition B... then C, D, E, etc. :)

And here comes the halting problem: you may be able to prove that a specific program halts on a specific input, though even that can be very hard. But there is no general method to prove, for an arbitrary program, that it halts on every input. This is directly related to Gödel's incompleteness of formal systems.


Every conjecture that is provable can be proved by the proposed program of length (conjecture + constant) in finite time.

That was the original statement:

> you can encode any proof to a few hundred lines.

This type of encoding is funny, of course. You cannot determine from the encoding whether it is the encoding of a proof (as you point out). But if you have a proof, and thus know that a proof exists, you can "encode" it very succinctly in this program, for whatever that is worth.


> you can encode any proof to a few hundred lines.

Yeah, I misread that. The sentence implies the existence of a proof. So I was clearly talking about something different. Thanks.


There's also the problem of identifying a proof when you have it. Hooray, after running a super-exponential number of steps in the real universe somehow it turns out Turing Machine #38,384,768,209,040,100 halts after fifteen googolplex and change steps. What does that mean?

(This is not criticism, this is elaboration. The approach is useless in practice, but deeply understanding all the reasons why is a good exercise in computer science.)


This was an issue in another recent story. https://news.ycombinator.com/item?id=11676918 The incompleteness theorem bites you before the halting problem does.


You just look for all proofs up to a certain length. This is equivalent to checking whether a Turing machine halts within a bounded number of steps, which is an NP problem.
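A sketch of that bounded search, again with a hypothetical stand-in checker: capping the proof length makes the search finite and always terminating, at the price that a negative answer says nothing about longer proofs.

```python
from itertools import product

def bounded_proof_search(conjecture, is_valid_proof, max_len, alphabet="ab"):
    """Try every candidate string of length <= max_len.
    Always halts; returns a proof if one exists within the bound,
    else None (which says nothing about proofs longer than max_len)."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            candidate = "".join(chars)
            if is_valid_proof(candidate, conjecture):
                return candidate
    return None
```

Each candidate is checkable, but the number of candidates grows exponentially in the bound, which is why this shape of question lands in NP rather than P.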


You are not trying to prove the thing, just trying to encode it. The assumption is that the proof already exists.


There's a lot of fine distinction, and a hell of a lot of study, of these kinds of questions. One thing you might ask of your proofs is that they are immediate, which is to say they can be (a) grasped and (b) believed, at least in principle, after only a finite amount of work. Further, you should be able to prove that your proof has these properties.

Without these properties, it's hard to call something a proof since it might simply go on forever and never resolve the question in the way you wish it would. This is absolutely the realm of things like the Halting Problem.

So 200TB of data plus a short proof that the search completed could at least in principle be immediately checked in finite work. On the other hand, the mere program which generates this data could not be unless it also comes with a proof of its own termination. We have no general process to generate these, so you must create a specific one for your generation program.

If it's cheaper to just spit out 200TB than communicate why you're sure that your program won't spin, then the better proof is 200TB long.

---

As a side note, this is sort of odd-sounding because we're implicitly saying that "small is better" in proofs. In an ultimate sense this is probably true: if I had before me all of the proofs of something, I would want to order them and pick the smallest.

But in practice, we cannot enumerate proofs and must expend energy to achieve new ones. Thus, there are at least two kinds of energy optimization which prevail. The first is how much effort is required to achieve a proof and the second is how much effort is required to communicate/transmit the proof to someone else.

This 200TB proof might be well-optimizing the search question and it might also be failing on the communication question.


Once you know the answer, you say "Take this program and run it with n=7825". That is your compact finite proof, not the untested program. You don't have to meta-prove halting; you just give it the required CPU hours.


While I think what you're saying is true, it's frowned upon. It's saying "I've discovered a truly marvelous proof, and it's at the bottom of this cave". We've dodged the infinity risk, but not the risk that the resources needed to communicate the proof are prohibitive.

So it's kind of just crap UI at this point. I think that given that proofs are essentially social objects, this is a pretty big negative on that style of proof even if it doesn't totally sink the ship.


The communication is easy. You show someone your nice compact program, and they don't have to run it to understand it.

A 200TB proof is literally impossible to communicate to a human. Where is the benefit to that formulation?

The huge version is not better for a machine either. 200TB is a lot of storage, and you could rerun 30000 CPU-hours of calculations in under a month on a single semi-high-end 4U server.


But if I don't trust you, then your program and its step limit can only be verified by my executing it. That might be incredibly, or even infeasibly, expensive, and unless you've also got a resource-bound proof, I might be sunk.

On the other hand, code which verifies a fully expanded proof may be very simple and easy to check/trust/believe.


Depends on how you look at it. The 200TB seems to be the data required to prove that none of the possible combinations works.

If it had been temporary data to get to a simple proof of something, then I'd agree that this data was just a side-effect.

But if I'm understanding it correctly, as it stands even if you're told that it stops at 7,825, you'd still need all of this data to prove that it's true.



