Lean4 helped Terence Tao discover a small bug in his recent paper

X6S1x6Okd1st · on Oct 27, 2023

He started learning lean4 with the help of GPT4 just at the start of the month:

https://mathstodon.xyz/@tao/111208692505811257

Many of his mastodon posts this month have been about his learning progress.

Certainly an interesting case study of how LLMs can accelerate the work of even of the most extremely successful people

mensetmanusman · on Oct 27, 2023

I have found that good communicators that don’t code can quickly make functional automation.

Interestingly, LLMs may end up contributing to more inequality if only the highly skilled can leverage them effectively.

anonylizard · on Oct 27, 2023

LLMs CURRENTLY favor the more curious and open individuals (High on openness in big 5 scale). Half of the population is not open, does not want to try new things, unless it leads to very direct and proven benefit.

However, over time, the overwhelming benefits of using LLMs will be well understood, and these ladder climbers will absolutely master LLMs, no matter their intelligence. People can become experts at taking exams despite how boring and soul sucking that can be, let alone using something way funner and useful like LLMs.

marshray · on Oct 27, 2023

My friend had never written anything more than an Excel formula a few months ago and now he's using GPT-4 to write very nontrivial Python applications and automate large parts of his job.

I (having 30 years experience as a professional Software Developer^TM) am begging him to teach me his techniques.

Now that you mention it, I met him and we became friends in large part due to his communications abilities.

an_aparallel · on Oct 28, 2023

5 months ago - a friend wrote me a python script and sent it to me...i couldnt get it to work. Used phind.com to explain what to do...it worked out my windows environment variables needed to be changed, told me how to structure a folder schema to place the src script...mindblowing stuff. And i have been using it - when it turn the same friend told me to write a similar script in python myself...it has been amazing to forego stackoverflow/google to get answers. When the GPT-4 model kicks in - my questions are answered with beautiful clarity...whether or not this technology "hallucinates" or not...without a doubt it's empowered me to learn to program. If it's helped me out of 50% of my ruts - i count that as overwhelmingly positive!

heavenlyblue · on Oct 28, 2023

I am not going to visit the website you just posted here unless you explain how the mechanism works, otherwise I will assume it's just crappy advertisement in this post.

hobs · on Oct 28, 2023

Phind is just prompt engineering, as is anything sitting on top of anything OpenAI is building right now - if you are doing Q&A then maaaaybe it does some resource augmented generation as well (say, from Stack Overflow or other technical resources)

Phind is often recommended to me by programmers who find it produces "better" results than a naive GPT-4 session, but I don't know that anyone has done any real world testing.

an_aparallel · on Oct 28, 2023

as explained before me phind is a website built ontop of GPT-4 + a free tier (which is what i use) - which uses what they call the Phind Model...though it changes often.

I understand your skepticism (because the internet is fully broken right now/and forum posts are more or less paid for content at this point)

but...its a bit funny you'd like someone who just explained they're a beginning programmer who got into programming because an LLM helped remove insane amounts of friction from the learning process to "explain a mechanism" to them.

astrange · on Oct 28, 2023

Nobody knows how LLMs work, much less LLMs behind an OpenAI server API, so that's a hard question.

melagonster · on Oct 28, 2023

sorry, everyone is losing their job. we have no chances of living in future.

strikelaserclaw · on Oct 27, 2023

gpt4 is amazing, i rarely use google as as starting point for my programming related queries these days.

kaba0 · on Oct 28, 2023

Google has actively become catastrophically bad to the point that it will ignore the only special keyterm I deliberately looking for for more general dumb results.. like it is literally unusable - I used to be able to find that very specific stackoverflow answer I have read years ago, or some barely read blogpost, and now I’m happy if it actually finds the website if I search for the domain name..

GPT would be good as a search engine, but who would want a search engine that stopped indexing a few years back? Also, it is not good at ultra niche topics, which would be the whole point of a search engine.

justinclift · on Oct 29, 2023

Yeah. DuckDuckGo also seems similarly bad these days too. :(

RealityVoid · on Oct 27, 2023

Huh, I can't seem to get in the groove of using it, maybe I'm old or something, but it annoys me all the subtle ways it's wrong and I feel I have a much better grasp if I think through it myself supported by Google.

popularonion · on Oct 27, 2023

I get a lot of mileage out of ChatGPT just treating it like an intern who turns around work instantly. You don't expect interns to write perfect code, but they can save you a ton of time if you set them loose on the right problems.

For any relatively simple task I can say "Write a Python script to do X" and it will almost always spit out working code, even if it has subtle mistakes. Fixing mistakes is fine and part of the process. I don't have to read StackOverflow posts saying "Do you really want to do X?", or sift through documentation that follows the author's approach of how they want to introduce the material but doesn't directly address my question.

amelius · on Oct 28, 2023

Not OP, but I always want to read the code generated by chatgpt before I run it. And I dislike reading other people's code much more than writing it myself.

danenania · on Oct 27, 2023

Especially now that the training cutoff is being moved up to April 2023. Questions requiring more recent results were the main ones I’ve been going back to google for.

shusaku · on Oct 28, 2023

Part of this is that ad infested AI generated blog spam is flooding Google! But it’s also my go to. I also really liked GPT to bring me up to speed on a libraries I’ve never used.

colinhb · on Oct 28, 2023

Agree in part but also I think Terry is such an outlier (though also generous and humble) that it’s hard to extrapolate from this example to a more general case

lulznews · on Oct 27, 2023

They’re an easy 100x for the elite. Top engineers are now 10000x’ers.

voxl · on Oct 28, 2023

The only benefit Terrence Tao has received is in correcting syntax. He has mentioned that he has gotten (in my opinion better) benefits from talking to existing Lean4 specialists.

rowanG077 · on Oct 28, 2023

I don't think you realize what you are saying here. I agree that it is a large boost but 100x is just too ridiculous to take serious. Do you really believe an engineer can now finish in 20 hours what would have taking them a year before?

ghshephard · on Oct 28, 2023

As someone who uses ChatGPT4 pretty much nonstop all day (I have a monitor dedicated to it) - it's almost certainly a 10-20x in fields I have no knowledge of (writing in languages I've never used, dealing with OS/Kernel/Network constructs that are unfamiliar to me) - I can knock off in an hour what might have taken me a day or two previously - but I don't think I've ever had a task where I could say that I completed in 1 hour what would have taken 100 hours - though I would love to hear of counterclaims from people who have been able to do so - definitely not saying impossible, just that I haven't had that experience yet.

pradn · on Oct 28, 2023

The idea of the 10x programmer can mean 1) someone who produces a ton of code quickly 2) someone who can solve seemingly-intractable problems 3) someone who’s presence on a team improves everyone’s productivity quite a bit 4) someone who chooses technical decisions that save a ton of time down the line.

amelius · on Oct 28, 2023

(1) is the generally accepted meaning. The others need different words.

hobs · on Oct 28, 2023

Disagree, those are all valid interpretations, and depending on your experience you will have a vastly different understanding.

To me, the first one is the most basic and frankly, most silly definition of a 10x programmer - produces more code? really? Code sucks. Nobody needs more code, people need solutions. Solving a problem with no more code, or understanding it so a _tiny_ change solves all problems? Way better than adding code, actually much harder to synthesize and produce new results. Way more efficient.

Now can you remove things and solve the problem? Realize you can build an even more simple and generic system that solves your problem? Even more amazing.

amelius · on Oct 28, 2023

> Nobody needs more code, people need solutions.

More code per unit of time, not necessarily more code in an absolute sense.

firedaemon · on Oct 28, 2023

I have always used it to mean “solves problems 10x faster” not “produces 10x code”.

The latter would be a bad thing.

kaba0 · on Oct 28, 2023

I would argue that it only helps in very beginner questions as it is beyond useless on anything more complex.

Like if the answer is not a leetcode/CRUD code sample you can easily find on the internet either way, it can’t do anything useful. Like, my problems mostly constitute things like reasoning about the correctness of a given change in the context of the whole codebase, its threading model, etc. It will at best regurgitate some bullshit like “race conditions can be this and that” here, not helping at all.

tux3 · on Oct 27, 2023

For people looking for an easy introduction to Lean4, the Natural Number Game is great: https://adam.math.hhu.de/#/g/hhu-adam/NNG4

And if you just want to read without playing a game: https://lean-lang.org/theorem_proving_in_lean4/introduction....

ohduran · on Oct 27, 2023

Noob here, how is Lean4 different from TLA+ or Alloy? Is that even a reasonable comparison?

Edit: Wrote Allow the first time, good stuff.

tux3 · on Oct 27, 2023

I'd say TLA+ is designed more for software people trying to design systems, write down specs, and reason about how the thing behaves dynamically.

Lean is used mostly for writing down math proofs, and a lot less for software (although by the Curry–Howard correspondence, math proofs and programs have an equivalence, so the line is a little blurry). Lean has "mathlib", which is like a standard library of formally verified math that people can contribute to and use in new proofs.

A big multi-year effort to start formalizing the proof of Fermat's Last Theorem in Lean 4 was approved recently: https://www.ma.imperial.ac.uk/~buzzard/xena/pdfs/AITP_2022_F...

kmill · on Oct 27, 2023

Leo de Moura wants Lean 4 to be used for software verification too.

A cool thing about Lean 4 is that it's also a programming language, using the same syntax as for proofs, making it easy to consider proving correctness properties of programs you write. Most of Lean 4 and its tactics are written in Lean 4 (though at this point almost none of this code has any associated proofs).

dunham · on Oct 27, 2023

Yeah, I believe they said they intend for it to be used as a general purpose programming language. I used it to complete Advent of Code last year.

There are some really interesting features for general purpose programming in there. For example: you can code updates to arrays in a functional style (change a value, get a new array back), but if the refcount is 1, it updates in place. This works for inductive types and structures, too. So I was able to efficiently use C-style arrays (O(1) update/lookup) while writing functional code. (paper: https://arxiv.org/abs/1908.05647 )

Another interesting feature is that the "do" blocks include mutable variables and for loops (with continue / break / return), that gets compiled down to monad operations. (paper: https://dl.acm.org/doi/10.1145/3547640 )

And I'm impressed that you can add to the syntax of the language, in the same way that the language is implemented, and then use that syntax in the next line of code. (paper: https://lmcs.episciences.org/9362/pdf ). There is an example in the source repository that adds and then uses a JSX-like syntax. (https://github.com/leanprover/lean4/blob/master/tests/playgr... )

pjmlp · on Oct 27, 2023

Yeah, here is where TLA+ fails to gain adoption, because it is only a model, how people actually implement the architecture is completely unrelated to how the model was validated.

nextaccountic · on Oct 28, 2023

> Most of Lean 4 and its tactics are written in Lean 4 (though at this point almost none of this code has any associated proofs).

Do you have to actually prove a tactic work for it to be sound? A tactic will rewrite the program into a simplified program that will verified by a kernel; if the tactic wrongly expands something, the kernel will later reject the program. Only the itself kernel need to be verified for the whole thing to be sound.

Or am I missing something?

kmill · on Oct 28, 2023

You're not missing anything. I was going to mention this but decided not to get into it. (One detail: you can't verify the kernel exactly because of Gödel incompleteness issues.)

ohduran · on Oct 27, 2023

Thank you so much for your answer

krsrhe · on Oct 27, 2023

Lean is for verifying proofs, not writing them. It helps you find mistakes, but doesn’t help you understand or express ideas in a human readable way.

kaba0 · on Oct 28, 2023

I am absolutely not an expert at either of them, but I believe TLA+ is interested mostly about all the possible ordering of “code”, that models your program, verifying the absence of concurrency bugs.

Do correct me if I’m wrong, but Lean is more about total functions/FP code, not sure how well does it handle threading, if at all. It might be more related to the correctness of the actual implementation of a serial algorithm.

atomicnature · on Oct 27, 2023

A few years back, I was trying to find out how to reduce mistakes in the programs I write.

I got introduced to Lamport's TLA+ for creating formal specifications, thinking of program behaviors in state machines. TLA+ taught me about abstraction in a clear manner.

Then I also discovered the book series "software foundations", which uses the Coq proof assistant to build formally correct software. The exercises in this book are little games and I found them quite enjoyable to work through.

https://softwarefoundations.cis.upenn.edu/

Genbox · on Oct 27, 2023

Code correctness is a lost art. I requirement to think in abstractions is what scares a lot of devs to avoid it. The higher abstraction language (formal specs) focus on a dedicated language to describe code, whereas lower abstractions (code contracts) basically replace validation logic with a better model.

C# once had Code Contracts[1]; a simple yet powerful way to make formal specifications. The contracts was checked at compile time using the Z3 SMT solver[2]. It was unfortunately deprecated after a few years[3] and once removed from the .NET Runtime it was declared dead.

The closest thing C# now have is probably Dafny[4] while the C# dev guys still try to figure out how to implement it directly in the language[5].

[1] https://www.microsoft.com/en-us/research/project/code-contra...

[2] https://github.com/Z3Prover/z3

[3] https://github.com/microsoft/CodeContracts

[4] https://github.com/dafny-lang/dafny

[5] https://github.com/dotnet/csharplang/issues/105

mjburgess · on Oct 27, 2023

The issue is that modern software fails because it's part of a complex system with many moving parts, rather than, it is inherently complex at the source-level.

Certain sorts of algorithmically complex development (games, cars, medical hardware, etc.) would benefit from a 'closed-world verification' -- but that's not most software, and they have alternatives.

'Code correctness', including unit testing, ends up being a big misdirection here. What you need is comprehensive end-to-end tests, and instrumentation to identify where failures occur in that end-to-end.

The effort to source-level-check source-level-code is largely a huge waste of time and creates an illusion of reliability which rarely exists.

throw0101c · on Oct 27, 2023

> The issue is that modern software fails because it's part of a complex system with many moving parts, rather than, it is inherently complex at the source-level.

A reminder of Gall's law:

> A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.[8]

* https://en.wikipedia.org/wiki/John_Gall_(author)

mrkeen · on Oct 27, 2023

Strong disagree.

> The issue is that modern software fails because it's part of a complex system with many moving parts, rather than, it is inherently complex at the source-level.

The choice to run a system as many different moving parts is a decision taken by the team in order to avoid failure.

> Certain sorts of algorithmically complex development -- but that's not most software

It's all software.

> 'Code correctness', including unit testing, ends up being a big misdirection here. What you need is comprehensive end-to-end tests, and instrumentation to identify where failures occur in that end-to-end.

No and no. I have comprehensive end-to-end tests. They take forever, don't fit into RAM (for some services I need to run them on my home PC because my work laptop only has 16GB), and most importantly: they show that the code is not correct. Now I have to change incorrect code to correct code (while not breaking any public interfaces. I wish my predecessors did not put incorrect code into the live system.

mjburgess · on Oct 27, 2023

I'm not really sure what you're saying here -- the parent commenter is talking about closed-world source-level code verification.

My claim is that these techniques help in relatively few areas of software development. In the main, software is built with open-world interactions across a range of devices (network, disk, etc.) where those interactions dominate the net failure modes of the system.

In the latter case, source-level verification is not only of no help, but it's a huge resource misallocation. You end up writing tests that just create the illusion of success; and then the whole thing constantly falls over.

Spend all that time instead in designing tests that do fit into ram (eg., sample your datasets); and instrumentation that does reveal errors (eg., sampling of memory, requests, etc.).

One of my first software jobs was writing a reverse proxy -- one of my co-devs wrote unit tests that simply established the in/out of various functions was "as expected". Pretty useless -- the issue is whether the proxy actually worked.

Likewise most source-level correctness efforts are 'testing theatre'.

Devs are looking for cargo-cult solutions they can copy/paste. No, testing software is an actual area of development -- and you need to develop tests, not cargo-cult

snovv_crash · on Oct 27, 2023

Tests only show that it is correct for the sets of values and code paths you exercise. It's quite possible for other aspects to be incorrect, and this is what theorem provers like Coq help with.

charcircuit · on Oct 27, 2023

If you have an incomplete or buggy specification Coq won't actually prove the absence of bugs.

Jweb_Guru · on Oct 27, 2023

This is true but mostly meaningless in practice. I have never found the people who say this every time formal verification comes up to be the same people who actually work in formal verification. The reality is that it's far far harder to mess up a formal spec that you actually have to prove a concrete instance of, than it is to mess up the code. With a buggy spec, in the course of trying to prove it you'll arrive at cases that seem nonsensical, or be unable to prove basic lemmas you'd expect to hold true. Formal spec bugs happen, but they're quite rare. Usually even when there are edge cases that aren't covered, they will be known by the authors of the spec due to the dynamics I mentioned earlier (cases encountered during the proof that are clearly being solved in ways that don't follow the "spirit" of the spec, only the letter). Of course, "informal" specs, or paper proofs, will have subtle bugs all the time, but that's not what we're talking about here.

charcircuit · on Oct 27, 2023

>Formal spec bugs happen, but they're quite rare

A lot of specs bugs happen all the time. If you think people can account for all edge cases in massively complex projects you are wrong. There are many behaviors that you can't predict will be nonsensical ahead of time until you actually hit them.

Formal verification is not a silver bullet. Despite all of the extra effort bugs will still happen. It's better to invest in making things safe by default, or by aborting when getting into a bad state.

Jweb_Guru · on Oct 27, 2023

Can you point me to some of the concrete bugs in formal specs with associated proofs that you are thinking of? Specifically in the part that was formally verified, not in a larger project that incorporates both verified and unverified components. Because my experience is that when asked to actually provide concrete examples of how formally verified code has bugs in practice, people find it quite difficult to do so. Formal verification is about as close to a silver bullet as it gets for eliminating bugs, even if it's not 100% (nothing is). It's certainly far better at eliminating bugs than vague philosophical positions like "making things safe by default" (what does that mean?) or "abort when you're in a bad state" (how do you know you're in a "bad" state? Isn't it in general at least as difficult as figuring out what the correct formal spec is? How easy is it to detect? Is the performance cost acceptable?).

In fact, what usually happens is the opposite of a formal proof being undermined by a bad spec--when an informal spec is formally verified, inconsistencies are often found in the original spec during the course of the proof process, and bugs are subsequently found in older implementations of that spec. Fixes are then incorporated into both the original spec and the existing implementations. Formal verification is a spec-hardening process.

charcircuit · on Oct 27, 2023

>Specifically in the part that was formally verified

No, I am saying that is that it is easy to not verify something because you don't know the requirements up front.

>"making things safe by default" (what does that mean?)

It mean that complexity is abstracted away such that it is hard to the wrong thing.

>"abort when you're in a bad state" (how do you know you're in a "bad" state?

There are invariants which you can assume that are always true. If they aren't true for whatever reason you can abort and later track down what caused you to enter this state. It could be as simple as some logic bug or obscure as hardware malfunctioning and causing a bit flip (at scale you need to deal with hardware misbehaving).

>inconsistencies are often found in the original spec during the course of the proof process

Bugs are found in the process of testing too.

Jweb_Guru · on Oct 27, 2023

> No, I am saying that is that it is easy to not verify something because you don't know the requirements up front.

Granted, but "this spec doesn't attempt to cover [X]" is very different from "this spec claims to cover [X] but actually doesn't, in ways the authors are unaware of." The former is quite common in formal specs with machine-verified correctness proofs and I've never heard anyone call that a "spec bug". The latter is not common in them at all. Perhaps you didn't intend this yourself, but the way people generally talk about "there could be a bug in the spec" is usually used to imply that the latter case happens often enough to make formal verification comparably effective to other defense mechanisms, when empirically formal verification is far, far more effective than any other bug prevention mechanism--to the point that people struggle to find even individual examples of these kinds of spec bugs.

> There are invariants which you can assume that are always true. If they aren't true for whatever reason you can abort and later track down what caused you to enter this state. It could be as simple as some logic bug or obscure as hardware malfunctioning and causing a bit flip (at scale you need to deal with hardware misbehaving).

Yes, and stating these invariants precisely is generally exactly what you need to do as part of writing the formal spec; getting them right is equally hard in both cases, so it seems weird to me to say that it's easier to check the invariants at runtime than to get the spec right. In fact, many formal specs have invariants that are exactly of the form "assert that if I ran this line of code, it would not throw an exception." Moreover, you can use much more "expensive" invariants when you're creating a spec, since you don't actually have to check them at runtime, which in practice lets you catch far more and more subtle bad states than you can with software checks. For example, your proof state can include a full history of every action taken by the application in any thread, and prove that invariants on them are maintained during every instruction, even when those instructions proceed concurrently and might witness partial or stale views of the latest values in memory; storing and checking all of this information on every step on an actual machine would be wholly impractical, even for high level languages and non-performance-critical software.

Obviously, things are defined according to some base model of the world, so if you are concerned about hardware bit flips (which in rare contexts you are) your spec should take that into account, but the vast majority of bugs in modern applications are not bugs of this form, and most defensive runtime checks are equally susceptible to hardware issues or other unexpected environment changes.

> Bugs are found in the process of testing too.

Yes, and I never said testing was bad. Formal verification and testing are complementary techniques, and tests are important in a formal verification context because you should be able to prove the tests will pass if you've gotten your spec right. However, them being complementary techniques doesn't mean that they are equally effective at reducing bugs.

jerf · on Oct 27, 2023

"Code correctness is a lost art."

No, it really isn't. It's doing better now than it ever has... and I mean all the negative implications of that. It was not an art that was ever "found" in the sense you mean.

throwalean · on Oct 27, 2023

Note that the Z3 SMT solver was written by Leonardo de Moura, who also is the lead dev of Lean 4. Not a coincidence (-;

Lean 4 seems to be used in production at AWS: https://github.com/cedar-policy/cedar-spec/pull/138

pjmlp · on Oct 27, 2023

Or using F* and then generate F# code, https://www.fstar-lang.org/

kccqzy · on Oct 27, 2023

Formal verification is often not the best tool for ensuring code correctness from an ROI perspective. Things like unit tests (including property based tests) and ensuring 100% code coverage often achieve adequate results with less effort.

jlouis · on Oct 27, 2023

The key point is that most software don't need correctness to the level formal verification provides. There's a subset of software however, for which there's no substitute for a formal verification process.

aSanchezStern · on Oct 27, 2023

Additionally, formal verification usability is an area of constant research, and the set of software for which it is the best ROI increases over time.

pfdietz · on Oct 28, 2023

If you can specify your program (or a part of it) well enough to prove it correct, you can also use that specification for property-based testing.

gmadsen · on Oct 27, 2023

Ada is gaining popularity in safety critical systems

gilcot · on Oct 27, 2023

Ada/SPARK is already popular in safety critical systems

aeonik · on Oct 27, 2023

Have you looked into Idris2 at all. While looking into these theorum provers, it always felt like they had an impedance mismatch with normal programming.

Idris2 portends to a general purpose language that also has a more advanced type system for the theorum proving.

https://github.com/idris-lang/Idris2

atomicnature · on Oct 27, 2023

That's interesting; no I wasn't aware of it. Will check it out sometime, thanks.

iso8859-1 · on Oct 27, 2023

There's a list of packages here: https://github.com/stefan-hoeck/idris2-pack-db/blob/main/STA...

samvher · on Oct 27, 2023

I had the same positive experience with Software Foundations.

There is another book somewhat derived from it (if I understand correctly) using Agda instead of Coq: https://plfa.github.io/

I haven't had the chance to go through it yet, but it's on my list - I think Agda (and as mentioned by another commenter, Idris) is likely to feel more like a programming language than Coq.

user3939382 · on Oct 27, 2023

I’m really excited about dependent types. I’m expecting we won’t get them for a while though. Dependent Haskell is progressing but apparently it’s hard to retrofit. Idris’ own creator has said he expects it to be a model for other languages, I don’t think it will ever have mainstream adoption. Coq and Agda, F* aren’t really designed to be general purpose.

Although the implementation for the compiler is complex, and the syntax can get complex and verbose, to me my requirement is simple: I want to encode everything about input and output that I know.

Right now in mainstream languages I often know more about my arguments or output than the type system will allow me to specify.

valyagolev · on Oct 27, 2023

I totally share your excitement about dependent types, but it seems that, unlike the type systems we're used to, theorems about the dependent types are much harder to prove, which makes them not very comfortable to use for the whole program.

If only there was some kind of a gradual, perhaps typescript-like approach to adding arbitrary type-level value-limiting information in random places without having to have everything proven everywhere...

skulk · on Oct 27, 2023

Every non-dependent typing relation is also a dependently typed relation so I think things are already the way you want, unless you have a certain example in mind.

valyagolev · on Oct 27, 2023

Sure, in the same sense that every "untyped" program is already typed with some kind of a universal type, but what's the point?

What I want is to be able to specify and have (when possible) automatically proven arbitrary assertions anywhere in the code without necessarily making sure that every possible presupposition is proven from the ground up. Just like in Typescript, I can add a type at any point where there was only "any", and have a small portion of the program typed without typing the whole thing

dunham · on Oct 27, 2023

Not sure if it's what you're asking for, but in lean, you can put `sorry` to skip a proof. It's `believe_me` in Idris2, and I think the Agda version is `trustMe`.

You can also use stuff like ! when you don't want to prove your array indices (it might crash or substitute a dummy value if you're wrong).

skulk · on Oct 27, 2023

> What I want is to be able to specify and have (when possible) automatically proven arbitrary assertions anywhere in the code

This doesn't even exist in TypeScript. If I change

    function foo(a: any) { return baz(a); }

to

    function foo(a: number) { return baz(a); }

whoever calls foo has to still prove (or assert) that the argument is a number.

Is that what you're after, asserting a dependent type? For example being able to change:

    function divByTwo(a: number): number { return a/2; }

to

    function divByTwo(a: even_number): number { return a/2; }

You want every place divByTwo is called to also automatically supply a proof that the argument is even?

valyagolev · on Oct 27, 2023

i don't even need it to go beyond the insides of a function. something like a proof of an invariant that's only relevant inside the function's body would be fine. e.g. in Rust, almost every place where you see something like ".unwrap()" with "// unwrap safety:" comment, this comment could be an assertion easily proven from the few lines above

mettamage · on Oct 27, 2023

Eli5 dependent types?

Chatgpt:

Dependent types are a concept in computer science and programming languages where the type of a variable can depend on the value of another variable. In simpler terms, imagine you have a list of numbers and you also have information about how long that list is. With dependent types, you could create a type for the list that explicitly includes its length, ensuring at compile time that operations respect this length. This makes it possible to catch certain kinds of errors before the code even runs.

For example, in a language with dependent types, you could specify a function to take a list of length 3 and no other sizes. If you tried to pass a list of length 4, the program wouldn't compile, preventing this kind of mistake early on.

It's a bit like having an extra layer of safety checks, where the types are more expressive and can encode more intricate relationships between variables.

howenterprisey · on Oct 27, 2023

I never care much for chatgpt answers. I don't know why people post them on here.

In the first sentence, "another" is wrong because you don't need two variables, you just need one. Final paragraph's wrong for the same reason.

The example given is poor given that I can write [i8; 3] or int[3] in Rust or C and those very much do not have "dependent types" in the popular sense (Rust's const generics excepted). To be fair, those examples are technically dependent types, but it would be better to give an example that's impossible in those languages, such as "array with length at most 3" or "even integer".

Finally, to nitpick, "a bit like" is unneeded hedging.

Stack Overflow did a much better job: https://stackoverflow.com/questions/9338709/what-is-dependen.... Wikipedia's article is pretty bad and maybe I'll get around to fixing it at some point.

ChadNauseam · on Oct 27, 2023

I like ChatGPT’s answer better than yours. Rust’s cost generics and C++ aren’t dependent types in any sense. And saying “the type of one variable can depend on the value of another” is vague but a good intuition pump.

Since I guess we’re doing dependent types explanations I’ll give it a go. Dependent types extend a type system with two new type thingies: 2-tuples where the type of the second element depends on the value of the first, and functions where the type of the function’s return value depends on the value that was passed to it.

howenterprisey · on Oct 29, 2023

Ha! Looks like I was wrong and confusing refinement types with dependent types: https://cs.stackexchange.com/a/21733 (unless of course this person is also confidently wrong, heh)

> Dependent types are types which depend on values in any way. A classic example is "the type of vectors of length n", where n is a value. Refinement types, as you say in the question, consist of all values of a given type which satisfy a given predicate. E.g. the type of positive numbers. These concepts aren't particularly related (that I know of). Of course, you can also reasonably have dependent refinement types, like "type of all numbers greater than n".

mettamage · on Oct 27, 2023

Because I had the question and I figured this was quicker. I didn’t know what dependent types were.

So now you know why I do it. Also, I believe this is my first time doing it. I might be wrong.

Is it better to ask and wait for an answer instead?

There is nothing in the guidelines on HN about it. I don’t know what’s reasonable and I haven’t seen strong cultural norms from HN yet. I at least labeled that the text was from chatgpt as to not confuse it was my own text.

howenterprisey · on Oct 27, 2023

It's of course fine to ask chatgpt. I also appreciate that you wrote that it was from chatgpt.

However, I wouldn't recommend posting the result here if you don't know if it's correct. Moreover, anyone can ask chatgpt themselves. It's better to wait for someone here to post an answer.

Yes, there's nothing in the guidelines, but they're (deliberately) not all-encompassing. Besides, I would hope it's part of basic Internet etiquette; just because we now have access to tools that generate plausible-sounding but not necessarily correct answers to questions doesn't mean we need to post what they create in the place of genuine answers.

mettamage · on Nov 2, 2023

> Besides, I would hope it's part of basic Internet etiquette

Yea that got lost on me. I think I view these things a bit differently than most on HN. I've noticed in general that the more I'm not at uni, the more my thinking has become heuristic and quick-ish. It used to be more thorough and in-depth. The trade-off of it being that such type of thinking is more time consuming but the answer is more comprehensive and/or accurate.

gosub100 · on Oct 27, 2023

> Is it better to ask and wait for an answer instead?

No, around here it's better to say "So dependent types are pretty much $SOMETHING_COMPLETLY_WRONG ?" and wait for all the "corrections" (aka Cunningham's law someone linked to nearby).

kccqzy · on Oct 27, 2023

Did you forget about search engines that enable you to do your own research rather than asking a human or an LLM?

mettamage · on Oct 27, 2023

I could use google but chatgpt wins on speed and asking on HN wins on vine. When someone asks me a question I am mostly quite enthusiastic in explaining it. You can’t get that as easily from a search engine.

It all has trade offs.

kccqzy · on Oct 27, 2023

I don't fault you if you explicitly emphasize speed rather than accuracy. Just don't post these on HN.

xeonmc · on Oct 27, 2023

ChatGPT = Cunningham bait

gosub100 · on Oct 27, 2023

I don't see anything wrong with GPT's answer other than pedantic quibbling. I just don't like it because I don't want to share this space with non-humans.

mettamage · on Oct 27, 2023

What does that mean?

Jtsummers · on Oct 27, 2023

https://meta.wikimedia.org/wiki/Cunningham%27s_Law

Cunningham's Law: Post something wrong online so you can get a correct answer.

Half the ChatGPT answers on here seem to be wrong in obvious ways or, worse, subtle but critical ways. When people post them they get downvoted, and other people chime in with "Why are you trusting a fabulist like ChatGPT instead of going to actual resources with definitions and explanations that aren't garbage? Here's what it actually is..."

gosub100 · on Oct 27, 2023

I wonder if it would create a singularity in spacetime by posting the GPT answer to "what is cunningham's law" here?

howenterprisey · on Oct 27, 2023

Worked for me :<

bjornasm · on Oct 27, 2023

That is really interesting. Whish there was some better name for that, as I feel like it isn't that descriptive. However their benefit seem really obvious, saying that something is an areay, or even an array of integers is barely half the picture if in reality it is a array of length 5 of even integers to loan from your example. I guess you would try to implement it in other languages creating objects?

icen · on Oct 27, 2023

Another classic use case is that you can have expressions in the types of your function, for example this won't compile:

    stringOrInt : (x : boolean) -> int -> (if x then String else int)
    stringOrInt true x = toString x
    stringOrInt false x = x + 1

    1 + stringOrInt true 37 # this will error, because it knows you've not returned an int

The other example that you can do in depedently typed languages, but is too involved to write out here, is make a type-safe printf, where the format string produces the types for the other arguments.

turndown · on Oct 27, 2023

It’s simplest explanation is that dependent types are types whose exact type use some kind of parameterized value in the definition. So a dependently typed array would use a parameterized length value as part of it’s definition(this is usually the easiest example to understand). So an array of length 4(of whatever type) could be identified as different from an array of length 5, etc.

omneity · on Oct 27, 2023

That one of the brightest minds of our generation is able to increase his bandwidth with the combination of LLMs and automated proofs makes me super bullish on this tech combo in the future!

It starts with bug-fixing, then supports verification, until it starts propelling new discoveries and push the envelope.

We need a term when a dynamic like Moore's Law "infects" a field that had no such compounding properties before.

EDIT:

There's additional context that Terence Tao is using Copilot to help him learn Lean. As shared by adbachman: https://mathstodon.xyz/@tao/111271244206606941

Could Terence have done it without Copilot? Sure, but like many of us he might not have initiated it due to the friction of adopting a new tool. I think LLM tech has great potential for this "bicycle for the mind" kind of scenarios.

aylmao · on Oct 27, 2023

Lean 4 is a theorem prover, and has nothing to do with LLMs as far as I know though.

aylmao · on Oct 27, 2023

Lean 4 is a programming language and theorem prover, and has nothing to do with LLMs as far as I know though.

adbachman · on Oct 27, 2023

the missing context from the previous comment is that Tao used GitHub Copilot to help with learning Lean.

He's been writing about it as he goes, most recently: https://mathstodon.xyz/@tao/111271244206606941

omneity · on Oct 27, 2023

Thanks for adding the clarification. I thought this was more common knowledge. I will update my comment accordingly.

passion__desire · on Oct 27, 2023

Not to beat the point: LLMs are compiler for english (natural) language.

syngrog66 · on Oct 27, 2023

I found a "bug" in one of Terence Tao's math blog posts too, years ago. I told him about it, he fixed it, and thanked me.

I didnt make the front page of Hacker News, of course. lol

syngrog66 · on Oct 27, 2023

If anyone interested, I went into more detail about it, over in a post on my blog:

https://news.ycombinator.com/item?id=38040982

Dudester230602 · on Oct 27, 2023

I was worried that Lean4 is a yet another LLM, but it's actually some hard and reliable stuff.

jamesrom · on Oct 27, 2023

Anything that helps Terence Tao find bugs in his papers is hard and reliable in my books.

ykonstant · on Oct 27, 2023

The Lean 4 community is quite optimistic about combining LLMs and theorem provers for proof assistance and formalization.

woolion · on Oct 27, 2023

"Terry Tao finds ChatGPT very helpful to prove new theorems" would actually be bigger news than this one, IMO.

throwalean · on Oct 27, 2023

"Terry Tao finds ChatGPT very helpful to formally verify his new theorems" seems to be a true statement. See some of his other recent mathstodon toots.

woolion · on Oct 27, 2023

The point is that LLMs and similar tools tend to be very good at automating the trifle but not very useful at what would be considered really "interesting" work.

So while your point is somewhat true [0], as he mentions that these tools could become good enough to do the formal verification part, it's precisely not the interesting part. See [1] and [2]; in particular some things that are very easy to do in real maths can be very challenging in an automated theorem prover, quoting from [2]:

>In the analyst's dialect of Mathematical English, this is a one-line proof, namely "by the standard limiting argument". Unpacking this into a formal proof required me to go through a fair bit of the Mathlib documentation [...]

It's impressive to be able to do such mathematics in Lean/Coq..; at all, but it is very tedious mechanical work [3].

>It was more tedious than I expected, with each line of proof taking about an hour to formalize

So I think that rather proves the point of what LLMs are currently good for, and what tools can help for really difficult tasks, rather than invalidate it.

[0] https://mathstodon.xyz/@tao/111305365372766606

[1] https://mathstodon.xyz/@tao/111305336701455719

[2] https://mathstodon.xyz/@tao/111259986983504485

[3] https://mathstodon.xyz/@tao/111305336701455719

OscarCunningham · on Oct 27, 2023

He is finding Copilot useful to help with the formalisation: https://mathstodon.xyz/@tao/111271244206606941.

cubefox · on Oct 27, 2023

I wonder whether we could combine formal proof checkers (like the Lean proof checker) with language models that generate synthetic conjecture-proof pairs in a formal language like Lean.

The Lean proof checker could be used to automatically verify whether the synthetic proofs written by the language model are correct. This information could be used to provide an RL reward signal applied to the original language model, which would result in it writing better proofs. (Or we train a new model using the correct synthetic proofs of the previous round as training data.) And then the process repeats. So the model would self-train using its synthetic training data, without further human intervention.

We could even make this process more adversarial. First we split the generator language model into two: One which generates conjectures, and one which tries to prove/disprove them in Lean. Then add a predictor model which tries to predict whether a synthetic proof is verified by the Lean proof checker. The lower the predicted probability that the proof will be correct, the more reward gets the proof-generator model if it did indeed provide a correct proof.

Finally, we add another model which tries to predict the reward the proof-generator model will get for a given synthetic conjecture. Then the conjecture-generator model is rewarded for conjectures that are predicted to yield a high reward in the proof-generator model. So conjectures that are neither too hard not too easy for the proof-generator model.

So we would expect that the whole system would progressively create harder and harder synthetic proofs, which in turn allows for better and better self-training of the proof-generator.

It seems this could in principle scale to superhuman ability in generating proofs. The process would be somewhat similar GANs or to self-play in AlphaGo Zero.

I think the hard part is the initial bootstrapping part, to get the whole process off the ground. Because the initial training of the generator models has to be done with human provided training data (Lean proofs). But once the synthetic proofs are good enough, the system would self-train itself automatically.

ykonstant · on Oct 27, 2023

This workflow is an explicit goal of the Lean 4 devs. The official Zulip chat has a channel dedicated to the interface between the two:

https://leanprover.zulipchat.com/#streams/219941/Machine%20L...

cubefox · on Oct 27, 2023

That would be interesting, though this link requires an account.

pfdietz · on Oct 28, 2023

It would also be interesting to train these models on the entire math literature. One would want to use this to verify the math literature, translating the proofs there to checkable forms. Eventually, we would get a model that knows every theorem and every correct proof in the math literature.

aSanchezStern · on Oct 27, 2023

Sounds like you've stumbled into the wonderful world of machine-learning guided proof synthesis! While I don't think the full system you're describing has been built yet, many similar systems and pieces have. In terms of the initial phase of supervised learning on existing proofs to prove new ones, there's TacticToe (https://arxiv.org/abs/1804.00596), Tactician (https://arxiv.org/pdf/2008.00120.pdf), CoqGym/ASTactic (https://arxiv.org/abs/1905.09381), Proverbot9001 (https://arxiv.org/abs/1907.07794), and Diva (https://dl.acm.org/doi/10.1145/3510003.3510138#sec-terms), among others. Most of these have some sort of language model within them, but if you're specifically looking for the LLM's that have been big recently, there's GPT-f (https://arxiv.org/abs/2009.03393), Baldur (https://arxiv.org/abs/2303.04910), and COPRA (https://arxiv.org/abs/2310.04353), though currently these models don't seem as effective as the specialized non-LLM tools. In terms of using reinforcement learning to learn beyond human written proofs, there's TacticZero (https://openreview.net/forum?id=edmYVRkYZv), this paper from OpenAI (https://arxiv.org/pdf/2202.01344.pdf), rlCoP (https://arxiv.org/abs/1805.07563), the HOList line of work (https://arxiv.org/pdf/1905.10006.pdf), and HyperTree Proof Search (https://arxiv.org/abs/2205.11491), as well as some in progress work I'm working on with a team at the University of Massachusetts.

cubefox · on Oct 27, 2023

Wow, that's a great overview. While most of it goes over my head, it is interesting that quite a lot of research has been done in this direction.

The system I outlined above was inspired by this intriguing paper:

https://arxiv.org/abs/2207.14502

It is superficially about a different topic, namely generating and solving "programming puzzles", basically a type of unit test, that are then verified by the python interpreter. This system seems quite analogous to one where the programming puzzles are replaced by conjectures in a language like Lean, the proposed puzzle solutions are instead proposed proofs, and the python interpreter is replaced by the proof assistant.

I just discovered that there seems to be at least one guy working on applying a similar system to formal mathematics: https://www.cs.cmu.edu/~jlaurent/pdf/reports/thesis-proposal... He actually cites the paper above.

staunton · on Oct 27, 2023

There are people doing this already. For Lean, an example is https://morph.so/blog/the-personal-ai-proof-engineer/

cubefox · on Oct 27, 2023

If I understand this correctly, they want to use an AI model for "autoformalization", which sounds like "using a language model to translate natural language proofs into Lean proofs". Which is cool but much less ambitious than the self-train system I described above. I guess AI technology isn't yet far enough to make such a proposal work.

staunton · on Oct 28, 2023

Given a billion dollars or so, I guess one moght get pretty far. The roadblock might be more that nobody is seeing a lot of market potential.

pama · on Oct 27, 2023

Here is earlier context for how Tao used LLM tools, including GPT-4, to help him in this journey.

https://mathstodon.xyz/@tao/111233986893287137

ocfnash · on Oct 27, 2023

You can even follow his progress on GitHub here:

https://github.com/teorth/symmetric_project/

deterministic · on Oct 27, 2023

Lean4 is brilliant. Worth digging into as a programmer. Coq, Lean4, Agda etc. made my brain explode in a good way. Making me a better software developer.

gridentio · on Oct 27, 2023

I'm kind of interested in how useful Lean4 is as a programming language, and if it's easy to prove things about a program written in Lean. I should probably look into that when I have a minute.

ykonstant · on Oct 27, 2023

Regarding usefulness: Lean is very nice to program in, if you care about pure functional languages; its FFI allows you to incorporate fast C routines very easily if pure Lean is not performant enough or lacks features. However, in some domains, Lean is within a decimal order of magnitude of (not hand-optimized) C; some benchmarks I hand-made recently impressed me.

Regarding proving things about programs, no, it is not easy, and the developers do not seem to consider it a core goal of Lean.

staunton · on Oct 27, 2023

> the developers do not seem to consider it a core goal of Lean

I guess it depends on who you ask. The original devs of Lean wanted to do "everything" (because that's how you start projects, I guess). Since then it has attracted a lot of mathematicians (especially those working on Mathlib, a library that aspires to formalize "all known math") who are happy to have "algorithm objects" and prove things about them without being able to actually run an algorithm on any input.

This goes together with mostly embracing classical logic (which breaks the original and most powerful version of Curry-Howard, which allowed you to extract programs from proofs). However, in practical situations, algorithms extracted in this way tend to be too slow to be useful, so maybe that's not actually a downside for programming purposes.

Finally, Lean4 "compiles" to C-code, so at least it is (or can reasonably easily be made) portable. People have been trying to use it for real applications, like the AWS stuff others have linked in this thread.

kmill · on Oct 27, 2023

The core Lean 4 developers do want proving properties about programs to be easy. In the short term maybe priorities have been elsewhere due to limited resources, but that doesn't mean they do not consider this to be a core goal. My understanding is that there are still some research-level problems here that need to be worked out. (Proving things about do notation is currently a real pain for example.)

tgv · on Oct 27, 2023

Does anyone understand what the proof is about? An improved boundary on the mean of geometric series? And why it's only a problem for n=3, k=2, and not for all k=n-1?

pa7x1 · on Oct 27, 2023

From a very quick glance at page 6 of https://arxiv.org/pdf/2310.05328.pdf

You can see he is treating the case of k=1,2 with that formula and uses induction to extend it to 1 ≤ 𝑘 ≤ 𝑛 − 2.

For k = n - 1 he uses a different bound defined in equation 2.2. So he bypasses the issue.

pmarreck · on Oct 27, 2023

If you have a denominator composed of “n - k - 1”, I hardly find it surprising that if n=3 and k=2 that you have a slight problem…

Anyway, check out Idris[2], it’s cool for this sort of thing

SmartestUnknown · on Oct 28, 2023

Exactly lol. Not sure why everyone is taking the contribution of lean in finding the error so seriously.

I would be more impressed when some mathematician finds a more severe error in their proofs with the help of theorem provers (meaning a mistake in their own intuition).

GEBBL · on Oct 27, 2023

Is it possible that small bugs or assumptions in a root paper could cascade through referencing papers leading to wildly inaccurate outcomes 5 or 6 papers down the line?

eigenket · on Oct 27, 2023

Yes! I work in quantum information theory and this recently happened in a subfield called resource theories.

The generalised quantum Stein's lemma [1][2] is (or was) a very powerful result that was used for over 10 years to prove things in this subfield. However last year it was noticed that there was an error in the proof of this lemma, and a whole bunch of results based on it weren't valid [3][4]. One of the authors of the paper where they wrote about the error gave a talk at the conference QIP 2023 this year, and there is a video of that talk available here [5]. Bartosz is a good speaker and I recommend watching the talk if you're interested, if you go to about 10 minutes, 30 seconds in the talk he discusses the consequences of this result now being not known to be true.

[1] Published paper https://link.springer.com/article/10.1007/s00220-010-1005-z

[2] Arvix version: https://arxiv.org/abs/0904.0281

[3] Published version: https://quantum-journal.org/papers/q-2023-09-07-1103/

[4] Arxiv version: https://arxiv.org/abs/2205.02813

[5] Youtube link: https://www.youtube.com/watch?app=desktop&v=2Xyodvh6DSY

zmgsabst · on Oct 27, 2023

This is arguably what caused the push for formalism in mathematics:

Multiple papers on calculus claimed results about continuity and derivatives, but we’re using subtly different definitions.

The conflict between those results, and the counter-examples to demonstrate the difference, led to mathematicians building the modern machinery around proofs.

> The Weierstrass function has historically served the role of a pathological function, being the first published example (1872) specifically concocted to challenge the notion that every continuous function is differentiable except on a set of isolated points. Weierstrass's demonstration that continuity did not imply almost-everywhere differentiability upended mathematics, overturning several proofs that relied on geometric intuition and vague definitions of smoothness.

https://en.wikipedia.org/wiki/Weierstrass_function

lmm · on Oct 27, 2023

In theory yes. In practice mathematicians tend to have a good instinct for which things are true (although not always - some false theorems stood for decades if not centuries) and will avoid looking into ideas that don't pass the sniff test. Plus if you keep building on the consequences of a false result then you'll likely eventually reach a contradiction, which might inspire you to spot the bug in the original result.

pbhjpbhj · on Oct 27, 2023

If mathematicians have "a good instinct" does that suggest that their brains can somehow apply an informal proof?

I wonder if they are good, or if it's selection/confirmation/other biases that lead is to think say.

Also, aren't surprising results the most interesting!?

eigenket · on Oct 27, 2023

In an informal sense disproving things is easier than proving things. For example theorems often say thing like "all objects of type Y have property X", its very difficult to work out even where to start proving such a statement unless you're an expert in Y and X, but to disprove it all you have to do is find some example Y which doesn't have property X. If you've worked with Y things for a while you probably have a bunch of "pet" Y objects you can think of and if someone proves something you kinda automatically check to see what their proof says about the ones you're familiar with.

Ar-Curunir · on Oct 27, 2023

> In an informal sense disproving things is easier than proving things.

Note that this is not true in general, and depends on the type of theorem. The idea is that while it’s easy to show why a particular proof is incorrect, it’s much more difficult to show that every proof is incorrect.

Formally, this idea is captured by CoNP, which is believed to be different from NP and hence a strict superset of P.

drdeca · on Oct 27, 2023

> which is believed to be different from NP and hence a strict superset of P

“a strict superset of P” doesn’t really follow from that. P is also believed to be different from NP, and P is certainly not a strict superset of P.

Of course, I assume you just misspoke slightly, and that whatever it is that you actually meant to say, is correct.

E.g. maybe you meant to say “coNP is believed to both be strict superset of P (as is also believed about NP), and distinct from NP.”

I think it is believed that the intersection of NP and coNP is also a strict superset of P, but that this is believed with less confidence than that NP and coNP are distinct?

I imagine I’m not telling you anything you don’t already know, but for some reason I wrote the following parenthetical, and I’d rather leave it in this comment than delete it.

(If P=NP, then, as P=coP, then NP=P=coP=coNP , but this is considered unlikely.

It is also possible (in the sense of “no-one has yet found a way to prove otherwise” that coNP = NP without them being equal to P.

As a separate alternative, it is also possible (in the same sense) that they are distinct, but that their intersection is equal to P.

So, the possibilities: P=NP=coNP, P≠cocapNP=NP=coNP, P=cocapNP≠NP,coNP, All 4 are different, (cocapNP is the intersection of NP and coNP)

Where the last of these is I think considered most likely? )

eigenket · on Oct 27, 2023

This is a beautifully and slightly ridiculously formal comment to read two comments deep in response to my comment beginning "In an informal sense".

thfuran · on Oct 27, 2023

I work in medical software and a few years ago fixed a bug that was an incorrect parameter value in a model used for computing diagnostic criteria that had proliferated throughout the literature after seemingly being incorrectly transcribed in one influential paper. The difference was relatively small, but it did make results somewhat worse.

isaacfrond · on Oct 27, 2023

More likely, wildly inaccurate outcomes will cause a re-examination of the cited theorems, which will probably flush out the bug.

By the way it is not clear to me, if the theorem was false or if only proof was wrong.

eigenket · on Oct 27, 2023

The theorem was mostly correct. As stated it was false, but it was true for n >= 8. If you change some not very interesting constants it becomes true for all n. All you need change is the constants for n < 8.

gridentio · on Oct 27, 2023

https://proofwiki.org/wiki/False_Statement_implies_Every_Sta...

pfdietz · on Oct 28, 2023

There was an entire school of mathematicians in Italy that went off the rails with incorrect results.

https://en.wikipedia.org/wiki/Italian_school_of_algebraic_ge...

forward-slashed · on Oct 27, 2023

Also check out Morph Labs, which is working with Lean to create an AI proof assistant. Cool startup by ex-OpenAI folks.

Essentially a strong type system of Lean can help with constrained generation. Thus every token would always lead to some valid (if not correct) proof in Lean, iiuc. Maybe people @ Morph can comment.

https://x.com/morph_labs

0xpgm · on Oct 27, 2023

Is there a way to do lightweight incremental proof checking in a typical say Python or Javascript codebase?

Maybe specifying some conditions/assertions in comments and have it verified using some static analysis tool? Though I recognize it could be quite a challenge in dynamically typed languages.

logicchains · on Oct 27, 2023

>Is there a way to do lightweight incremental proof checking in a typical say Python or Javascript codebase

Running a JavaScript codebase through the Typescript compiler is a lightweight way to do incremental proof checking, albeit it can only check proofs about the soundness of the code.

ianandrich · on Oct 27, 2023

Python's Deal library provides contracts and a small portable formal proofs area of the codebase.

Additionally, Deal integrates with CrossHair which does concolic execution of the tests and functions annotated with contracts. It's integrated with Z3, and most of the Python primitives are covered.

It just works surprisingly well for incrementally building up provable code properties.

0xpgm · on Oct 28, 2023

Thank you! This is great, sounds like what I'm looking for.

gtirloni · on Oct 27, 2023

Tangent useless comment: while checking this Mastodon instance, I noticed a specific user is spamming the global feed by replying to his own posts every other day. I only saw his posts, mostly, and wondered if this was a personal instance of sorts.

olddustytrail · on Oct 27, 2023

Did you notice that the domain is mathstadon.xyz so it's probably a pretty small user base.

gtirloni · on Oct 27, 2023

Yes, I noticed the domain and the small instance, but I missed your point.

clircle · on Oct 27, 2023

Never heard of a mathematical error called a bug before

hanche · on Oct 27, 2023

Me neither, but it makes sense to me in the case of an error that is relatively easily corrected. If a proof is fatally flawed, however, I would not use that term.

gsuuon · on Oct 27, 2023

Is there value in learning verified programming for application developers? I've always been curious but they mostly seem like tools for academics.

deepsun · on Oct 27, 2023

I believe the most value is taken when there's an asynchronous system and you need to verify it's correctness. Backend dev

For app dev I'd say the main problem is always "what we even need to build?", and then polishing over user experience.

neeleshs · on Oct 27, 2023

Lean4 looks like a great language. Has anyone used it in production capacity for "normal" products/applications?

spootydooty · on Oct 28, 2023

AWS has started using it internally recently, but the biggest Lean software project remains Lean 4 itself, which is almost entirely written in Lean 4 (though not verified except for some index operations and data structures). Galois has used Lean 4 internally, too, and I know of another smaller verification project by a German company for verifying some internals that were very difficult to understand.

One reason for lack of adoption is that the verified standard library for programming is still rather small. Fortunately, it is expected to grow much more quickly now that there are developers who are getting paid to work on it and I expect that we will likely see a lot more on this front in the coming years.

SushiHippie · on Oct 27, 2023

For people that know neither (like me 5 minutes ago):

>Lean4

> Lean is a functional programming language that makes it easy to write correct and maintainable code. You can also use Lean as an interactive theorem prover.

https://lean-lang.org/about/

> Terence Tao

> [...] is an Australian mathematician. He is a professor of mathematics at the University of California, Los Angeles (UCLA), where he holds the James and Carol Collins chair.

https://en.wikipedia.org/wiki/Terence_Tao

j7ake · on Oct 27, 2023

Saying Terence Tao is a math prof at UCLA is like saying John Carmack is a programmer at Keen Technologies.

jvm___ · on Oct 27, 2023

If you're going for a PhD or researching some obscure, furthest reach study of some mathematical principle and you get stuck. One of the ways to move forward is to hope that Terence Tao finds your topic interesting and can give you a few minutes of his time - because he'll probably have some ideas for moving your research forward.

daynthelife · on Oct 27, 2023

Conversely, there’s a running joke that you’re in big trouble if Terry Tao takes too much of an interest in your topic of study, since he will inevitably solve all but the hardest problems in the field.

benibela · on Oct 30, 2023

I got stuck solving a polynomial equation system for years

Perhaps I should try sending him a mail

btreecat · on Oct 27, 2023

So that's an excuse for such a low context post?

andrewflnr · on Oct 27, 2023

You can't expect the entire Internet to always provide enough context for you personally, with your exact level of knowledge, to understand a social media post with no additional effort. How do you expect that to work at scale? Meanwhile, Google is just a tab away.

thornewolf · on Oct 27, 2023

For the HN crowd, explaining who Terrence Tao is would be ike explaining who Obama is

swatcoder · on Oct 27, 2023

This may be hard to believe, but a lot of us here in "the HN crowd" have zero interest in celebrity and do not keep a catalog of them all in our heads. Even intellectual ones. There's nothing wrong with that, but it's also just its own interest one or may not have.

People approach things in a lot of different ways and it would be nice if we can just respect that instead of digging on each other or making unfounded assumptions.

ndsipa_pomu · on Oct 27, 2023

I don't have much interest in celebrity either, but I feel that you're short-changing Terence Tao. He's not really famous due to being celebrated, but he's celebrated due to his level of skill and breadth of mathematical knowledge (sometimes known as the Mozart of Mathematics).

I think your sentiment is misplaced as I would expect HN commenters to have heard of Tao as he often features in posts about mathematics. I'm sure I recall seeing some comments from him too.

There's an approachable Numberphile video featuring him here: https://www.numberphile.com/videos/the-worlds-best-mathemati...

VirusNewbie · on Oct 27, 2023

I started to agree with you, but let's keep in mind that there are new generations of tech geeks out here who are just learning about the things we live and breathe.

fifilura · on Oct 27, 2023

I am sure 100% of thornewolf and 99% of people reading the same news as you know who Terence Tao is, but extrapolating from your own perspective is treacherous.

https://xkcd.com/1053/

LoganDark · on Oct 27, 2023

https://xkcd.com/2501

hiddencost · on Oct 27, 2023

Yes. Not knowing who Terry Tao is,, if you're on HN, makes you an outlier.

LoganDark · on Oct 27, 2023

Hi, it's me, I'm the outlier, it's me

omginternets · on Oct 27, 2023

Welcome! You now know who Terence Tao is :)

wholinator2 · on Oct 27, 2023

Not any more

pfdietz · on Oct 28, 2023

Congratulations! You're one of today's lucky 10,000.

https://xkcd.com/1053/

omginternets · on Oct 27, 2023

Lack of culture on your part?

c7b · on Oct 27, 2023

It's correct that he is a professor at UCLA, but it's also worth mentioning that he's regularly called nicknames like 'greatest mathematician alive' (just try googling that phrase): https://academicinfluence.com/rankings/people/most-influenti...

meowface · on Oct 27, 2023

Famous photo of him studying with Paul Erdős at age 10: https://i0.wp.com/blogs.ams.org/blogonmathblogs/files/2015/0...

yodsanklai · on Oct 27, 2023

What would make him greater than other fields laureate?

jyunwai · on Oct 27, 2023

Beyond the incredible quality and quantity of his work starting from early in his life, what makes Terence Tao memorable to me, is his approachability and willingness to write advice for mathematicians and math students in a blog: https://terrytao.wordpress.com/career-advice/

He also has an active Mastodon, which further makes him more approachable: https://mathstodon.xyz/@tao

It's rare to see a professional at the top of an academic field remain so encouraging to other people in the field, and work to make their work accessible to colleagues across different levels of mathematical maturity.

ndsipa_pomu · on Oct 27, 2023

Other mathematicians opinions of him and his work. He's very prolific and was considered a child prodigy.

chx · on Oct 27, 2023

Tao got his PhD at the age of 21 and a tenured professor chair at the age of 24. To compare, at the age of 21 ordinary people only get their bachelors and typically it's 23-24 they when they get their masters degree. A PhD takes several more years and of course very few become tenured professors even more years later.

pjmlp · on Oct 27, 2023

Depends on the country, before Bologna Portuguese degrees would be 5 years, so 23-24 would be the age of finalizing the degree, given that PhD without Msc wasn't possible back then, having everything done meant at least 30.

n4r9 · on Oct 27, 2023

In the other direction, in the UK it's quite possible to have your bachelors by 21, masters by 22 and PhD by 25. I had my mathematics PhD by 26 and am not a remarkable mathematician.

alliao · on Oct 27, 2023

I'm sure you're remarkable enough congratulations for your hard work

n4r9 · on Oct 27, 2023

Not remarkable enough to land a post doc and stay in academia. But thank you :)

poulpy123 · on Oct 27, 2023

in france the age if you follow the traditional scholl schedule to have a phd is 26.

NotYourLawyer · on Oct 27, 2023

Yeah but since he did all this in the USA, I’m not sure why that’s relevant.

pjmlp · on Oct 27, 2023

It isn't, I was making the point university degrees aren't the same everywhere.

malwrar · on Oct 27, 2023

Dumb question: how do people skip years like this in college. Like suppose you do know the material well already, does that just get recognized by the faculty at some point (you are noticed pre-college and get a faculty member interested in you? In college?) or would he have needed to specifically apply for that in some way?

I ask in jealousy, I felt like my college put as many administrative barriers as possible to ensure they got at least 4 years of tuition out of me. I’ve always wondered how people somehow managed to speedrun to the top like this, rather than just being bored by their classes (esp gen-ed classes ugh) until they reach top rank the normal way.

waythenewsgoes · on Oct 27, 2023

He had a lot of acclaim as a young mathematician, according to Wikipedia he was taking college level math at 9 years of age. At this level of talent I believe you begin to be courted by top universities similar to D1 athlete's, not only that, but you are likely genius-level smart, so sticking you in a classroom of people your age would be doing you an intellectual disservice, similar to sticking a college freshman in a 2nd grade classroom. At this level, you are probably seen as the next Newton or Gauss, and professors will want to attach themselves to you and work alongside you. At this point you bring a ton of value to the school just for attending and they won't mind if you are testing out of course material since they are just happy for you to be there for any length of time, and it just becomes more of a formality anyway.

dllthomas · on Oct 27, 2023

It wasn't "like this" but I just (with parental support) enrolled in the local community college after junior high, took a light load of college work, and got a homeschool program to sign off until I could test out at 16. Did 2 years worth of college work over the 4 years that would have been highschool, then transferred to a "proper" university as a junior at 18.

Scaless · on Oct 27, 2023

For non-geniuses like myself, you can just ask to test out of some of the lower tier introduction courses. I got a typically 4-year degree in 3 years this way. It's called credit-by-examination.

I'm sure for the prodigy-level students there is an even higher streamlined process for keeping them engaged.

chx · on Oct 27, 2023

In the 90s I sort of attended two universities at the same time in Hungary (given the time and the place it's not as big a deal as it sounds) and I said to the second one, hey I know the basic stuff, let me attend the more advanced one and they said, sure, no skin off my back, do whatever you want, if you fail, that's a you problem but if you pass the more advanced tests we will credit you the basic stuff.

pas · on Oct 27, 2023

Kevin Buzzard (a professional mathematician) has a few good talks about Lean

https://youtu.be/Dp-mQ3HxgDE from 2019 at MS Research

https://www.youtube.com/watch?v=SEID4XYFN7o&t=4m35s (2022 at ICM international math congress)

JonChesterfield · on Oct 27, 2023

Youtube has forgotten what "audio" is for me so noting the first talk is also at https://www.microsoft.com/en-us/research/video/the-future-of... - which is using youtube - but manages to make noises. Which I suppose is a pointwise instance arguing that software does not work as well as it might do.

opan · on Oct 27, 2023

I'm relieved that Lean4 doesn't seem to be some AI thing.

julianeon · on Oct 27, 2023

Now that you mention it, Tao did say he used ChatGPT to help teach himself Lean.

https://x.com/8teapi/status/1713867160886599920?s=46&t=4jd61...

svat · on Oct 27, 2023

Instead of a twitter post of screenshots of it, the direct link to Tao's Mathstodon is: https://mathstodon.xyz/@tao

He's been posting about it since Oct 9: https://mathstodon.xyz/@tao/111206761117553482

> I have decided to finally get acquainted with the #Lean4 interactive proof system (using AI assistance as necessary to help me use it), as I now have a sample result (in the theory of inequalities of finitely many real variables) which I recently completed (and which will be on the arXiv shortly), which should hopefully be fairly straightforward to formalize. I plan to journal here my learning process, starting as someone who has not written a single line of Lean code before.

There are several posts mentioning how GPT has been useful (though not always) at suggesting things, including one linking to this transcript with ChatGPT: https://chat.openai.com/share/857353ad-a05b-4f9e-bb10-55f15a...

sebzim4500 · on Oct 27, 2023

There is a project to connect GPT-4 and lean, but I doubt they will make a lot of progress until they can finetune GPT-4.

https://www.youtube.com/watch?v=CEwRMT0GpKo

zozbot234 · on Oct 27, 2023

Automated reasoning is usually included in AI (Lean4 has tactics so it qualifies as such) but it's also quite different from the ML stuff that's the more usual kind of AI these days.