I'm pretty certain that DeepMind (and all other labs) will try their frontier (a...

blinding-streak · 2026-02-13T08:05:44 1770969944

As a non-mathematician, reading these problems feels like reading a completely foreign language.

ky3 · 2026-02-13T17:55:17 1771005317

LLM to the rescue. Feed in a problem and ask it to explain it to a layperson. Also feed in any sentences that remain obscure and ask it to unpack them.

zozbot234 · 2026-02-12T18:27:26 1770920846

The original 1stproof solutions are due to be published in about 24h, AIUI.

energy123 · 2026-02-13T04:37:03 1770957423

Feels like an unforced blunder to make the time window so short after going to so much effort and coming up with something so useful.

sinuhe69 · 2026-02-13T10:06:50 1770977210

5 days for AI is by no means short! If it can solve a problem, it would need perhaps 1-2 hours. If it cannot, 5 days of continuous running would produce only gibberish. We can safely assume that such private models will run inference entirely on dedicated hardware, shared with nobody. So if they could not solve the problems, it's not due to any artificial constraint or lack of resources, far from it.

The 5-day window, however, is a sweet spot because it likely prevents cheating by hiring a math PhD to feed the AI hints and ideas.

energy123 · 2026-02-13T10:27:41 1770978461

5 days is short for memetic propagation on social media to reach everyone who has their own harness and agentic setup and wants to have a go.

zozbot234 · 2026-02-13T13:34:18 1770989658

That's not really how it works. The recent Erdos proofs in Lean were done by a specialized proprietary model (Aristotle by Harmonic) that's specifically trained for this task. Normal agents are not effective.

energy123 · 2026-02-13T14:06:00 1770991560

Why did you omit the other AI-generated Erdos proofs that were not done by a proprietary model, and which took significantly longer than 5 days?

zozbot234 · 2026-02-13T14:55:10 1770994510

Those were not really "proofs" by the standard of 1stproof. The only way an AI can possibly convince an unsympathetic peer reviewer that its proof is correct is to write it completely in a formal system like Lean. The so-called "proofs" done with GPT were half-baked and required significant human input, hints, fixing after the fact, etc., which is enough to disqualify them from this effort.

energy123 · 2026-02-14T09:28:03 1771061283

That isn't my recollection. The individual who generated one of the proofs did a write-up of his methodology, and it didn't involve a human correcting the model.

octoberfranklin · 2026-02-12T22:44:35 1770936275

Really surprised that 1stproof.org was submitted three times and never made the front page of HN.

https://hn.algolia.com/?q=1stproof

This is exactly the kind of challenge I would want to judge AI systems based on. It required ten bleeding-edge-research mathematicians to publish a problem they've solved but hold back the answer. I appreciate the huge amount of social capital and coordination that must have taken.

I'm really glad they did it.

lofaszvanitt · 2026-02-13T07:45:04 1770968704

Of course it hasn't made the front page. If something is promising they hunt it down, and when conquered they post about it. A lot of the time the new category has much better results than the default HN view.