He got a lot of things right in that video despite being just 2 years involved with mechanised theorem proving. I thought it was a fantastic and inspiring talk.
For example, when he said at the end that it is obvious to him that this will change how mathematics is done in the future. Yeah, I thought the same when I encountered this technology, but I was just a 2nd-year math student back then, and Isabelle was developed in the same building. Still, obvious.
When he said that math is more about structure, not so much about induction; yeah, said the same at some ITP more than a decade ago. Got a dismissive question from Andrew Appel when I said that. Tobias Nipkow seemed to agree with the dismissal, I never understood the question though. Didn't get a reply when trying to investigate further on stage :-)
Of course set theory can be automated. I would argue it can be automated even better than dependent types: just embed the set theory, including Grothendieck universes, in simple type theory. The problem here is how to keep the notation sane, as, for example, + could then be defined only once.
Tactic style theorem proving is sooooo old. It is how interactive theorem proving was done from the very beginning. Tactic style proofs are not readable at all except maybe via experiencing them by stepping through them. Declarative proofs are much more readable. Writing declarative proofs with the machine bridging the intermediate steps is the obvious future.
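A Lean 3 sketch of the contrast (toy statements, made up just to show the shape of each style):

```lean
-- tactic style: the proof state is invisible unless you step through it
example (a b : ℕ) : a + b = b + a :=
by rw nat.add_comm

-- declarative style: the intermediate statements are spelled out,
-- and the machine only has to bridge the (small) remaining gaps
example (a b c : ℕ) (h₁ : a = b) (h₂ : b = c) : a = c :=
calc a = b : h₁
   ... = c : h₂
```

The `calc` block is readable on its own; the `rw` line tells you nothing unless you replay it.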
I was always astonished how people can learn stuff. When taking Latin classes, I thought, sure, I can kind of learn to read it, but speak it in real time? No way! But of course, at some point in time people have. Without understanding how.
AlphaGo Zero kind of explains how people can do something really difficult, even strategy based, without exactly knowing what they are doing. This is not much different from how mathematics is done. I think we will have a deep mathematician solving at least one of the remaining millennium problems within the next 20 years. The path starts exactly with how Buzzard and Hales envision it: bring enough mathematics into one system that the millennium problems can actually be stated in the system. Machine learn that stuff. Give feedback to users; they will find it easier to use the system interactively. Create more mathematics. Machine learn that stuff. Rinse, repeat.
This can be done. Certainly with 100 million dollars in funding.
Encoding ‘all of math so far’ in some theorem prover seems like an ideal open source project where lots of people can make small contributions which don’t require a vast amount of mathematical knowledge — you’d mostly be simply encoding other people’s results. Is there such an effort now?
I tried to do something like that (http://proofpeer.net), but got lost in the details. Learnt a lot from it, but implementing your own cloud versioning system is probably more appropriate for a project with 100 million dollars of funding, not 600,000 pounds, half of which goes to university bureaucracy ;-)
The Archive of Formal Proofs (https://www.isa-afp.org) is the biggest effort I know, but the logic (Isabelle/HOL) is not powerful enough for doing advanced mathematics comfortably, and the process of contributing is quite arcane.
> Encoding ‘all of math so far’ in some theorem prover seems like an ideal open source project... Is there such an effort now?
Yes. Metamath's set.mm is expressly such a project, and I recently created a Gource visualization of set.mm's progress over time. For the first few years it was basically one person, but that increased over time and there have now been 48 different contributors.
> AlphaGo Zero kind of explains how people can do something really difficult, even strategy based, without exactly knowing what they are doing.
Yet neither does AlphaGo. Finding solutions really is an NP-class problem, while proving them is much easier.
> Machine learn that stuff. Give feedback to users, they will find it easier to use the system interactively.
This is interesting. I wonder how much the computer could learn about our natural (or mathematical) language if we were to express problems before coding them, and then feed both to the ML system.
Going a step further: have the ML system generate the code, and the user correct it. (As far as my understanding of deep learning goes, correcting a net's output and feeding it back is currently strictly equivalent to just providing that data in the training set before performing stochastic gradient descent. We might be missing something here, as correction is obviously much more effective with humans, and capturing that would make ML a lot easier to work with, requiring less training data.)
> Yet neither does Alpha go. Finding solutions really is an NP-class problem, while proving them is much easier.
That's not really true. Proving that an already found solution is correct might be easy for certain solutions (that's the whole idea of proof certificates). But finding a proof for an arbitrary theorem is certainly not easy. And that is what mathematicians do. Here, finding the proof IS finding the solution.
Of course, but when you have found a proof, verifying that it is correct is fairly trivial (if tedious by human standards, it's only verifying that each statement is correct).
I meant proving in the sense of "checking that the solution works". In a mathematical problem, the solution would be a tentative proof for a theorem. Proof which needs to be verified. I should probably have used "verify" instead of "prove" in this context.
It is not difficult to keep the notation sane in formalizations based on set theory. One way is to defer overloading of symbols to the presentation layer. You can see an example of that at [1] where the + sign is used to denote the group operation, right and left shifts on sets and the natural "addition" of subsets of the group derived from the group operation.
> It is not difficult to keep the notation sane in formalizations based on set theory.
Keeping notation sane is, of course, very important. I think keeping notation sane is harder than it might first appear, but it is certainly possible.
One interesting trick used by Metamath's set.mm is to define variables that look similar to operators but can be defined for purposes of a particular proof (e.g., as hypotheses). For example, + underline, * underline, and so on.
For an example, see theorem grpridd, "Deduce right identity from left inverse and left identity in an associative structure (such as a group)." http://us.metamath.org/mpeuni/grpridd.html - the hypotheses include statements about "+ underscore", and lead to a proof about + underscore.
I guess so too: if it can be done, it needs to be done at the presentation layer.
It has been a while since I worked with Isabelle locales. So how come, in your example, "x + y" is not confused with "A + B", although all involved entities are ZF sets? Or does A actually have a different type than x?
Isabelle allows the user to define symbols. So, those two operations are represented by different symbols in the source (the *.thy files) but the presentation layer renders them both as "+". You may want to look at the corresponding IsarMathLib source at [1] on GitHub.
Isabelle/ZF uses only two types: set ("i") and prop ("o").
Ah, I see. So, when I type in those formulas, can I use + or do I need to type in the original symbols?
Never mind, I know the answer to that. That's why set theory notation is a difficult problem: you need some sort of type presentation / input layer. Some people also refer to that as a user-space type system, or soft types: https://www21.in.tum.de/~krauss/publication/2010-soft-types-...
> Writing declarative proofs with the machine bridging the intermediate steps is the obvious future.
Do you have a take on what systems are best for this? I found Isabelle/HOL quite reasonable in this respect. There are some toy add-ons to Coq, but I don't think any of them work. There's also FStar which has Z3 integration (if I recall correctly), but last time I played with it it needed quite a lot of hand-holding.
Kind of embarrassing, but I cannot! The thing is, development of mechanised theorem proving has been driven by computer scientists, so proving program correctness, doing programming language semantics, and explaining constructive ideology are at the heart of most introductory textbooks.
I would recommend just following the "Theorem Proving in Lean" book for starters.
I found this really worth watching, as much for the sociological commentary about the way modern mathematics is done as for the presentation of the program (and the latter resonates strongly for me).
One of the more interesting bits for me was in the Q&A, where Prof Buzzard was asked about alternatives to Lean. Lean is the most evolved of the dependent-typed calculus of constructions provers, but two other approaches might work. One is univalence, which is sexy, but Prof Buzzard observed that they haven't actually got much done in their system.
The other is set theory, which is more familiar and approachable to working mathematicians, but the systems out there lack automation. He didn't mention Metamath by name, but that's currently the most developed such system, and they have managed to get a lot done, either in spite of or because of the lack of type theory sexiness of the foundations.
So the question I'd love to see answered is whether automation is inherently easier in type theory, or whether it might be possible to build automation for a set theoretical approach. John Harrison gave a talk last year at AITP on the topic, but I haven't heard much more of this.
Maybe that wasn't the best way to phrase it, but the question was asked and Prof Buzzard replied basically that in Lean it's straightforward to express quotients, while in Coq you get "setoid hell". This specific statement is at 1:04:20 in the video. The question is at 1:00:02, and is probably useful for context.
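A toy Lean 3 sketch of what "straightforward" means here (the names `swap_rel`, `upair` and `mk` are invented for this example): unordered pairs as a quotient of ordered pairs, with `quot.sound` doing the work.

```lean
-- hypothetical toy example: unordered pairs of naturals as a quotient
def swap_rel (x y : ℕ × ℕ) : Prop :=
x = y ∨ (x.1 = y.2 ∧ x.2 = y.1)

def upair : Type := quot swap_rel

def mk (a b : ℕ) : upair := quot.mk swap_rel (a, b)

-- in the quotient, the order of the components does not matter
example (a b : ℕ) : mk a b = mk b a :=
quot.sound (or.inr ⟨rfl, rfl⟩)
```

No setoid rewriting machinery is needed: equality in the quotient is ordinary `=`, which is what "no setoid hell" amounts to.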
Aren't quotients non-constructive? Univalence is probably also relevant to this issue, given that it generalizes the treatment of 'equality' in a broadly similar direction to the one needed for quotients. Groupoids are after all a fairly natural generalization of setoids.
I can't say; perhaps Lean's definition of quotient isn't constructive (I hadn't looked or noticed that).
But there is at least this construction of quotients in NuPRL.
I think quotients are not typical of constructive objects, in that if you construct some object and then project it into the quotient, you have the quotient, but you cannot project that object back out. You get subtly less: only that the resulting quotient satisfies some equivalence relation.
Under a specific set of assumptions you may very well be able to construct the quotient... but under a different set of assumptions (including the assumption of the quotient), perhaps you have the quotient while lacking the assumptions necessary to construct it yourself.
Anyhow, the argument is that they are non-constructive in a Humpty Dumpty sense: you cannot put it together -> take it apart -> put it back together again. Though it seems reasonable to consider the "put it together" phase as not entirely incompatible with constructivity.
thanks for the reply! i'm gonna need some time to mull it over...
would you be able to say how "normalizing" fits into this? by "normalizing" i mean applying some function that takes each equivalence class to a representative element (think simplifying fractions). using something like that, we could round-trip a value (object -> quotient -> object), though at the end we might end up with a different value than what we put in. so yeah, a normalizing function seems like a useful thing when looking at quotients constructively. sorry for the handwaviness, i hope i managed to get my point across!
So it seems at least that the definition of quotients in Lean is classical and justified by axioms, while NuPRL's at least seems constructive. This leads me to believe quotients as defined in Lean aren't constructive, but I'm not sure we can take the step to "quotients aren't constructive".
Anyhow it's an interesting question, and one which I too wish I had more clarity on.
I don't know whether or not Lean is the "best" (no doubt that depends on what aspects you think are important). But I completely agree with the speaker that there is a need to formalize mathematics with computer verification. The current "council of elders" model of verification is simply not able to verify proofs with any serious level of confidence today, given the explosion of complexity in modern math. This will change how math will be done in the future; we will look back on current math "proofs" in the way we look back on medicine based on the humors (woefully inadequate).
There are, of course, people working on fixing this. I just created a Gource visualization of the Metamath set.mm project, which has been working to formalize mathematics with absolute rigor. Over the years there has been increased activity; there have now been 48 contributors to set.mm.
In the end, that is what is necessary to formalize mathematics: efforts by many people working together to do it.
How is interoperability going on between those various formal systems? Couldn't we devise some common intermediate representation, or API?
Some languages (and their approaches) are probably better suited to some mathematical objects, like some demonstrations are easier to perform algebraically than geometrically, for instance.
This turns out to be quite hard, because there are irreconcilable differences in the fundamentals of the different proof systems. That said, there's the Dedukti project, which is showing promising interop results between (at least) Coq and HOL style logics, and the work Mario Carneiro is doing trying to bridge between Metamath and more type-theory approaches.
I did my undergrad at Imperial. My first lecture was supposed to be with Prof. Liebeck. Kevin came in and wrote on the blackboard: "Lemma 1 - I am not professor Liebeck". He was wearing his (I think typical) trousers. I never interacted with him personally but I did look at his Wikipedia page once. He was the top student in his undergraduate class (Mathematics) at Cambridge. Very impressive to a mere mortal like me.
I once played with Lean but found the tutorial not very approachable. It took an hour of reading through abstract explanations until it finally explained the idea of how it works. Essentially, true statements are expressions that pass the "type check" of a compiler. A function taking type A as param and returning type B is an implication A->B. To prove this implication, you need to find a function implementation that passes the corresponding type checks. This is what I would have wanted the tutorial to say at the start.
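In Lean notation the whole idea fits in a few lines (a made-up example, not from the tutorial):

```lean
-- a statement is a type; a proof is a term of that type.
-- "A implies B" is proved by any function from proofs of A to proofs of B,
-- so modus ponens is literally function application:
example (A B : Prop) : (A → B) → A → B :=
λ f a, f a
```

If the term type-checks, the statement is proved; that is the entire trick.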
> Essentially, true statements are expressions that pass the "type check" of a compiler. A function taking type A as param and returning type B is an implication A->B. To prove this implication, you need to find a function implementation that passes the corresponding type checks. This is what I would have wanted the tutorial to say at the start.
This sounds like the tutorial was geared toward those already familiar with other proof assistants, since this concept is present in most proof assistants.
That's the one I meant, I think. It is nice, but as I said, the impression it left me with was "you could have told me how the approach generally works sooner".
That book wants to build the foundation (dependent types) before making the big claim in chapter 3.
It's a bit weird because, as Haskell shows, you don't need dependent types for basic theorem proving (but dependent types do give a lot of useful power).
Well if you just have Haskell 2010 types, you're talking about really really basic theorem proving since all you have is propositional logic. The most interesting thing I can think of to prove with Haskell 2010 types is (the constructive version of) de Morgan's laws. Almost all other interesting mathematical statements are out of reach.
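For reference, the constructive version mentioned here is a few lines in Lean (a sketch; the Haskell encoding would be the same shape, with `Void` playing the role of `false`):

```lean
-- the constructive de Morgan law: ¬(p ∨ q) ↔ ¬p ∧ ¬q
-- (the dual law ¬(p ∧ q) ↔ ¬p ∨ ¬q is NOT constructively provable
-- left to right, which is why this is the interesting one)
theorem demorgan (p q : Prop) : ¬(p ∨ q) ↔ ¬p ∧ ¬q :=
iff.intro
  (λ h, and.intro (λ hp, h (or.inl hp)) (λ hq, h (or.inr hq)))
  (λ h hpq, or.elim hpq h.left h.right)
```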
I was afraid there would be a CLA, as is customary with Microsoft's projects (and the main reason I didn't contribute to any), but I couldn't find one. Good call.
> Lean 4
> We are currently developing Lean 4 in a new (private) repository. The source code will be moved here when Lean 4 is ready. Users should not expect Lean 4 will be backward compatible with Lean 3. [Committed one year ago]
Really, really not a fan of this. This basically prevents anyone from attempting to add new features or fixes, as they might be obsolete by the time the new version comes out (incompatible or already fixed).
Lean 4 WIP is public now: https://github.com/leanprover/lean4. But PRs aren't welcome. Personally I think Coq's done waaay better at community management.
Can I ask how Mizar compares to Lean, Coq, and Isabelle?
I've been wondering about that for quite a while, because I knew someone who was involved in the Mizar project but never had the time to get into automated theorem proving myself. I was impressed by the semi-natural-language proofs.
Comparing these different approaches is not trivial, of course.
One view is to look at "Formalizing 100 Theorems" by Freek Wiedijk, which lists 100 mathematical theorems and the various systems that have formalized a nontrivial number of them. It's basically a "challenge list" for these kinds of systems:
This list is discussed in "Formal Proof - Getting Started" (Freek Wiedijk, Notices of the AMS, Volume 55, Number 11, December 2008).
That list is absolutely not the only way to compare different tools. Still, it gives you a sense of how far along each one has come in actually making proofs. Here's the current status:
We missed that earlier link, and this talk is so incredibly good that I'm going to pretend I didn't see it here and so couldn't mark it as a duplicate.
FWIW, my comment wasn't a demand to mark this as duplicate. I only included the discussion link for further reference. The thread here had very valuable comments like the one by raphlinus.
Very symptomatic for the current state of mechanised theorem proving. And it has been going on like that for decades. It's mostly because computer scientists are driving development, not mathematicians, I think.
After reading this, are you really any wiser now on the topic of quotients in Lean? I didn't learn much from it except that some Coq developers are fed up with the current popularity of Lean among top mathematicians.
> After reading this, are you really any wiser now on the topic of quotients in Lean?
Only a little but I'm hoping that some weekend reading up on Canonicity and Subject Reduction (now that I know these are the issues at play) will shed some light.
I'm interested in both Lean and Coq but what I'm most excited about is the (Lean-based) Mathlib project.
I can believe that Lean may have made the wrong call on quotients (I guess with `quot.sound`), though I am not qualified to decide. If so, I imagine it will manifest as a natural limit on how far a project like Mathlib can push its borders (maybe a point where the complexity of formalising what we want to say dominates the inherent complexity of the statements themselves).
However, from what I've seen, Mathlib is by far the most successful formalisation project in terms of what seems to matter most sociologically right now: attractiveness to mathematicians. Whatever its fate, I think it will help make formalisation of mathematics much more mainstream, and will teach us a lot. I still think that a univalent type theory looks like the most promising candidate, but we'll have to wait and see.
Near the end he talks about mathematicians with published papers that say "you shouldn't believe any of my papers, they're just telling stories" and don't care if their theories are wrong. Wow
I suspect nobody who has been in academia long enough to get a masters degree at least would be surprised at this statement, no matter what their background. Every field has people like this.
Don't forget how essential story-telling is to the human condition: culture, accumulated knowledge and everything we as a species have achieved is built on top of it. With that in mind, it's much easier to see why some people might want to play the role of storyteller at all costs.
I agree that Lean or Coq or something else is the future of mechanised argument, but my opinion is that this has been the expectation for maybe a hundred years already, and was right up there in the time of Turing and Church (and Boole and Heyting and Curry and Kolmogorov).
But compare all of mathematics to just linear algebra, and specifically neural network implementations. You have a lot of people working on AI who sometimes grossly overstate the capabilities of their systems and fail to understand those systems when they do succeed. I would venture that the issue is not solving problems so much as understanding things to the level of mastery. It is always worth it to understand something to a continually exhaustive level of detail.
I think this is what artisans are. If you can make incredible hand made books [1], then surely you have underlying skills and abilities that transfer as well? If you are a grand master chess player then you may be dismayed that computers will always beat you [2], until you use a computer yourself to beat another computer (or at the very least until you become resentful towards IBM for misleading you in 1997). [3]
I watched the whole thing and it was fantastic. Professor Buzzard is an excellent presenter. I downloaded Lean and started playing around with it after this lecture.
Mainly for my own reference when I return to this story with more time, links to Professor Kevin Buzzard's project Xena [0][1]
>Basically, Lean can understand mathematics, and can check that it doesn't have any mistakes in. Most of the files here are Lean verifications of various pieces of undergraduate level mathematics.
>Some of the lean files are in a library called Xena. One could imagine Xena as currently studying mathematics at Imperial College London.
> So in the end it wasn't Gödel, it wasn't Turing, and it wasn't my results that are making mathematics go into an experimental mathematics direction, in a quasi-empirical direction. The reason why mathematicians are changing their working habits is the computer. I think that this is an excellent joke!
Looking at the code frequency, it looks like development of Lean 3 all but stopped around January 2018. The Lean 4 repo shows lots of activity since January 2019, but isn't in a usable form.
What are the future plans for the project? How will this be distributed in a form that mathematicians can use and contribute to?
This might be a very naive question but I was wondering: are there mathematical proofs that fundamentally cannot be computed? And I don't mean that in the Ackermann function[0] sense of the words, I mean is there mathematics that is inherently beyond computation?
(still watching the video)
EDIT: follow-up, since the replies helpfully point out that yes, it is: does this limit this kind of software, or can theorem proving software get around this by human intervention?
Yes, we have proofs (which probably can be proven formally!) that there are things which cannot be proven. That is essentially the incompleteness theorem [0]
The smallest 'practical' example I know is a finite state machine with a proof that we cannot prove (within a specific set of axioms) whether it will stop or not.[1]
It's not even very big: 1919 states.
> Yes, we have proofs (which probably can be proven formally!) that there are things which cannot be proven.
That is a very subtly different question than the one I was trying to ask. Or.. well.. I think it is. What I was trying to ask was if there are formally provable theories that cannot be expressed in a way that would let a computer "compute" the proof.
I think that this is not the same as proving that there are unprovable theories.
EDIT: I just noticed my phrasing was off in the original question.. gah, this reminds me of conversations with my mathematician friends IRL where I would always trip over my own sloppy use of human language. What I mean with "computing" is verification, not execution
> What I was trying to ask was if there are formally provable theories that cannot be expressed in a way that would let a computer "compute" the proof.
Do you mean "find the proof" or "verify the proof"? That difference is critical.
Theorem-provers that work to find proofs have made many strides, but as of today humans easily run rings around them. That's not to say it can never be done. Years ago good Go players could trounce machines, and that is no longer true. So: Today, most proofs cannot practically be found by a machine, but they are getting better and there's no fundamental reason computers can't be excellent at it.
Verifying proofs is something a computer is par excellence at. To verify a proof, the proof checker simply needs to verify that each step is valid. No human can compare to the ability of a computer to be picky :-).
In some systems it may not be possible to express a proof of a particular claim in a reasonably short sequence. But that would be a limitation for both humans and computers. In both cases, the answer would be the same: find a way to switch to a "different" system where it's easier to express the proof. In some sense normal math does this; people routinely define new terms, and then build on those definitions so that everything else is simpler.
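As a toy illustration of how mechanical verification is, here is a hypothetical checker for Hilbert-style proofs whose only inference rule is modus ponens (the formula representation is invented for this sketch; a real checker also handles axiom schemas, substitution, and so on):

```python
# Toy proof checker. Formulas are arbitrary Python values; an implication
# p -> q is represented as the tuple ('->', p, q).

def check_proof(axioms, proof):
    """Accept the proof iff every line is an axiom, or follows from two
    earlier lines by modus ponens (from p and p -> q, conclude q)."""
    derived = []
    for formula in proof:
        is_axiom = formula in axioms
        # modus ponens: earlier lines p and ('->', p, formula) yield formula
        by_mp = any(('->', p, formula) in derived for p in derived)
        if not (is_axiom or by_mp):
            return False  # a step that no mechanical rule justifies
        derived.append(formula)
    return True
```

Being this picky about every single line is exactly what computers are good at and humans are not.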
I really hope machine learning will succeed in producing systems that surpass contemporary mathematicians in not only finding proofs, but also suggesting definitions and theorems to simplify a database of mathematics.
It doesn't even need to use machine learning. I have written several analyses of metamath databases to look for common patterns, and then recommend definitions or theorems to simplify things. If some expression is repeatedly used over and over again, that suggests something that should be created once for reuse.
What you're talking about can be formalized via https://en.wikipedia.org/wiki/Arithmetical_hierarchy#The_ari...
where your notion of "computable" likely corresponds to some set between (Sigma_1 intersect Pi_1) and (Sigma_1 union Pi_1) depending on some nuance. In any case, you can show by diagonalization that there are statements strictly outside these sets--that is, first order propositions that cannot be verified or refuted by computer.
The question was whether there are statements provable by humans but not computers, which is either trivially impossible or guaranteed, depending on whether you allow the computer to use the same axioms as the people.
I cannot pretend to truly follow the actual mathematics you linked and summarized (I appreciate the attempt!), but the conclusion in the last sentence answers my intended question. Thank you!
> That is a very subtly different question than the one I was trying to ask. Or.. well.. I think it is. What I was trying to ask was if there are formally provable theories that cannot be expressed in a way that would let a computer "compute" the proof.
It depends on the logical system + axioms that your formally provable theory is in respect to, and "how soon" you design the computer that verifies the proof. Recall that a proof is a finite sequence of applications of rules of inference on zero or more axioms to derive a new statement that may be regarded as an axiom for applications later in the sequence, so that the last step derives the theorem.
If there are a countable number of rules of inference and a countable number of axioms, and each rule of inference has finite valence (i.e. infer from finitely many formulae-patterns as hypotheses), then yes, there is a way of encoding the rules of inferences and possible axioms of the logical system that a Turing machine (TM) will find a proof of any provable theorem (i.e. verify it), simply by enumeration of axioms in rules of inferences in the positions in a sequence that constitute the proof, while checking whether the axioms legally fit the rules of inference. (It is always possible to check in finite time whether a (finite-length) formula satisfies the pattern it needs to satisfy by a rule of inference).
On the other hand, if there are an uncountable number of rules of inference or an uncountable number of axioms, there may be a computer that cannot verify all the formally provable theories. For example, trivially, if there are an uncountable number of axioms, then the axioms admit no encoding; that is, there is no set of finite strings in a finite alphabet that is able to uniquely label all the axioms. But an axiom is itself a formally provable theorem; its proof is just introducing itself, yet that proof cannot be represented for all the axioms. (dependent on definition of proof; this example may not be so if we accept a zero sequence as a proof; but in general this illustrates the fact that there are simple theorems in some formal systems + uncountable axioms that cannot be computed).
Yet, if we know beforehand the theorem and rules of inferences and axioms used in some proof of it, and the rules of inference all have finite valence, then that proof only invokes finitely many rules of inference on finitely many axioms, such that we can encode a fragment of the logical system + axioms + expressible formulae such that the TM can verify all formally provable theorems in that fragment, including the desired theorem.
It's not a FSM but a Turing machine. It is quite easy to prove if a FSM with 1919 states reaches a certain state or not. You just run it and within 1919 steps it either reaches the target state or returns to a previous state that it already visited, creating a cycle.
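The decision procedure for the FSM case can be sketched directly (a minimal illustration, assuming a deterministic machine with no input; `transition` maps each state to its successor):

```python
def fsm_reaches(transition, start, target):
    """Decide whether a deterministic, input-free FSM starting in `start`
    ever reaches `target`. The run must revisit a state within |states|
    steps, so the loop always terminates with a definite answer."""
    state, seen = start, set()
    while state not in seen:
        if state == target:
            return True
        seen.add(state)
        state = transition[state]
    return False  # revisited a state: stuck in a cycle that avoids target
```

For a Turing machine no such bound exists, because the tape contents are part of the configuration and there are infinitely many of those.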
I think Gödel's theorem has been proven in Coq, but I also heard that it relies on library functions that don't make sense to use when proving incompleteness... and that if you weaken what's in the library to account for it, I'm not sure the proof still works. Then again, I think that thought came from a notable crank, so I'm not sure it's true.
A proof of a statement A may be valid in one formal system X but not capable of being written down in another formal system Y (and more strongly, there may be no proof at all that can be written down in Y that proves A or not-A, even if the statement A itself can be written down in Y).
It helps to be precise. Stating the incompleteness theorem imprecisely generally just makes people confused about what it could even mean. Strictly speaking, an incompleteness theorem has to be proven within a particular formal system X (the "meta-system"). It expresses--within X--that "for every consistent formal system Y expressible in X, there is a statement A expressible in Y such that there is no proof in Y of A or not-A". Here Y is the "object system".
The statement A is only unprovable within Y in particular. If it is expressible in X as well, it may be provable (or disprovable) within X. Indeed, one can trivially construct a formal system in which A is provable and which can express Y by adding A as an axiom to X.
The real, philosophically significant, non-particular-formal-system-specific import of the incompleteness theorem is not just that there are unprovable statements (which isn't really true, and is in fact kind of mathematically meaningless, outside of the context of a specific formal system, given that any statement or its negation can be taken as an axiom, and all we have to judge such choices is a non-mathematical notion of "self-evidency") but that (as the name suggests!) no formal system is complete--every formal system can express statements that it can't prove. It's all about that universal quantification.
The title mentions Gödel, but it’s a broad treatment that really discusses a lot of the context of the subject.
Keep in mind there are multiple reasons why proofs can become problematic.
Gödel's theorems cover some of them; other problems go unproven for centuries until our techniques catch up; and some proofs are impractical because they are astronomically complex.
Your question hinges on the definition of "proof" and what it means to "compute" a proof.
The shortest answer I can give is no. There are no proofs that are beyond computation. There's some subtlety here and your final question of whether there is mathematics that is beyond computation is a much deeper question.
The question becomes far more interesting if you replace "fundamentally" with "efficiently." But that's getting ahead of myself.
Modern mathematics and logic view (formal) proofs as computation. A formal proof is a purely mechanical process that should require no creativity whatsoever to follow, just as a computation is a purely mechanical process that should require no creativity whatsoever to carry out.
If you require a non-mechanical leap (say of creativity) then it is no longer a "proof" (each leap corresponds to a "leap of logic" colloquially).
Modern mathematical standards usually mean that every informal mathematical proof should in theory be formalizable into a formal proof. If this is not possible, the informal mathematical proof is most likely insufficiently rigorous.
The creativity then lies in the construction of the proof rather than in its verification.
Note though that given infinite time, again all proofs can be generated mechanically. Given that all proofs can be verified mechanically, you just keep generating every possible string and verifying it until you find a proof that either proves or disproves your statement in question. This is an extremely long and utterly impractical process, but it demonstrates that in theory this is possible.
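The generate-and-verify procedure just described can be sketched in a few lines. The `verifier` is a stand-in for any mechanical proof checker; the toy "formal system" at the end is invented purely for illustration:

```python
from itertools import count, product

def enumerate_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest first."""
    for n in count(1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def search_proof(verifier, alphabet):
    """Generate every candidate string and return the first one the
    mechanical verifier accepts. If no proof exists, this loops forever,
    which is exactly the halting-problem analogy discussed next."""
    for candidate in enumerate_strings(alphabet):
        if verifier(candidate):
            return candidate

# Toy 'formal system': a proof that 6 is even is a run of k 's'
# characters with 2*k == 6, so the search should find "sss".
```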
There is a subtlety here. Mathematical statements can be independent of the set of axioms you consider when deriving a proof; that is, the axioms may not be strong enough to either prove or disprove the statement. The mechanical enumeration process cannot discover that a statement is independent of the starting axioms, since it cannot know when to stop looking for a proof or disproof it has not yet found. This is analogous to the halting problem, where a computer cannot tell, of a program that has not yet halted, whether it will halt in the future (e.g. after just another few seconds) or never halt at all.
Hence in a certain sense the verification of mathematics is entirely bound by computation. The creation of it, given infinite time is also bound by computation. However, given that we do actually care about efficiency, that's where things get interesting.
Okay so what about the things that sibling comments touch on. Such as:
1. Gödel's incompleteness theorem(s)
2. Arithmetical hierarchy (and the closely related notions of Turing jumps and Turing degrees)
3. (Not touched upon by other comments, but potentially relevant) Tarski's undefinability of truth
4. (Also not touched upon by other comments, but also an interesting one) P =? NP
1 - 3 apply to humans as well as machines. 4 is the only one in my eyes that's interesting.
1. Gödel's incompleteness theorems are statements about the limitations of formal systems, but these limitations hold for humans as well. Modern mathematics is usually said to be built on the foundations of ZFC set theory. All the constraints that Gödel's incompleteness theorems place on ZFC (namely that there will always be statements independent of ZFC, and that we cannot use ZFC to prove its own logical consistency) also hold for every human using ZFC. The only advantage humans have is that, if ZFC is too restrictive, we can decide to switch formalisms and use something else (i.e. the creativity in constructing a proof). This is the creative spark that is not yet captured by machines (although who knows with AGI).
2. The arithmetical hierarchy is a generalization of the following phenomenon: universally quantified statements, i.e. "for all" statements such as "all cows have spots" or "all people are mortal," are easy to disprove, since that requires only a single counterexample, but hard to prove, since you need some way to show that the statement holds for all exemplars. For existentially quantified statements, i.e. "there exists" statements such as "there exists a cow which is a sphere" or "there exists a person who is mortal," the opposite is true: they are easy to prove, since you only need to find one example, but difficult to disprove, because you need to show that every potential exemplar fails. The arithmetical hierarchy generalizes this by examining the "difficulty" of statements in which "for all"s and "there exists"s interleave. It turns out that "there exists" corresponds to the halting problem (there exists a number of steps after which the machine halts) and "for all" corresponds to the non-halting problem (for all finite numbers of steps, the machine will not yet have halted). The hierarchy is the observation that each interleaving of a "there exists" and a "for all" makes the halting problem even "harder": an oracle that magically answers the halting problem at a lower number of interleavings has its own halting problem that it cannot decide at a higher number. An example of an interleaving is the question of whether a function is total. This is equivalent to the statement "for all inputs, the function halts." But we just said that halting is a "there exists" statement, so we can expand this to "for all inputs, there exists a number of steps such that the function halts within that number of steps." Notice the interleaving of "for all" and "there exists."
However, note again that all the limitations I've described here apply both to computers and humans.
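Written out, the quantifier shapes from item 2 look like this (with HaltsIn(M, x, n) an assumed decidable predicate saying that machine M on input x halts within n steps):

```latex
\begin{align*}
\mathrm{Halts}(M,x)    &\equiv \exists n.\, \mathrm{HaltsIn}(M,x,n)               && (\Sigma_1)\\
\mathrm{Diverges}(M,x) &\equiv \forall n.\, \neg\mathrm{HaltsIn}(M,x,n)           && (\Pi_1)\\
\mathrm{Total}(M)      &\equiv \forall x.\, \exists n.\, \mathrm{HaltsIn}(M,x,n)  && (\Pi_2)
\end{align*}
```

Each added alternation of quantifiers climbs one level of the hierarchy.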
3. Tarski's undefinability of truth is the statement that for essentially all interesting logical systems, there is no way to write a logical predicate that takes another logical statement as an argument and is true if and only if that statement is true (essentially a logical version of the halting problem). Again, this problem afflicts both humans and machines, and is an artifact of the attempt to formalize anything at all (the heart of logic and mathematics).
4. This is the interesting question. So far I've outlined a brute-force way for a computer to enumerate all proofs. This clearly does not help actual human mathematicians do mathematics. Can we do better than that? Especially since we don't need to brute-force the validity of a proof? Humans certainly can do this better for specific instances of proofs (in much the same way that humans can prove specific programs halt, but cannot give a general formula for doing so). Is there a globally better way of doing this? Scott Aaronson delves into this far better than I could here: https://www.scottaaronson.com/papers/philos.pdf.
There is a correspondence between proofs and programs (curry-howard) so I guess if there are non computable functions, there are non computable proofs?
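A tiny illustration of that correspondence, sketched in Lean 4 syntax: the proof term for transitivity of implication is literally the function-composition program.

```lean
-- Curry–Howard in miniature: implications are function types,
-- and a proof of (A → B) → (B → C) → (A → C) is composition.
example {A B C : Prop} (f : A → B) (g : B → C) : A → C :=
  fun a => g (f a)
```

The plain correspondence pairs constructive proofs with programs, which is roughly why a non-computable function has no counterpart on the proof side.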
Greg Egan's Diaspora has a rather good illustration of what a fully formalized mathematical system might look like. He called it the Mines[1]: a representation of the full acyclic digraph of all the theorems proved so far by a community of mathematicians. A student can examine any proposition and see whether it has yet been proven (and if so, how, and from what antecedents), as well as what deeper results might depend on it. Critically, a student might select a research topic simply by travelling to the ends of the Mines and starting to dig, or might select some not-yet-proven goal as an end point and work towards it... Today, each mathematician must build their own mental model of the mines by exhaustively reading all related papers, a time-consuming and somewhat fallible effort. Such a comprehensive map would be one of the main benefits of formalization.
The video mentioned Formal Abstracts[2], which is still getting off the ground... a similar project that has been around for a while is Metamath[3]. Metamath uses an SRS (string rewriting system[4]) to formalize much foundational mathematics. Its proof explorer is conceptually similar to Egan's Mines: a database full of theorems and connections starting from fundamental axioms and definitions. Today, Metamath only includes fairly basic results, and has not yet reached a level where it can capture the state of the art in modern research mathematics the way Formal Abstracts and other systems aspire to. To be fair, no system today is really at that level yet. Coq has a good set of packages and I think is probably in the lead today. I don't know where Lean is in comparison.
The aspect of formalization where it somehow lends additional credence to proofs seems less important. Famously, when Hilbert set out to fully formalize Euclid's geometry, he found a missing axiom[5]. (Euclid had assumed that two circles with centers closer than their radii would intersect somewhere... which is not true over the field of rational numbers! This therefore requires an explicit axiom, which had been overlooked for 2,000 years!) This seems to be the exception rather than the rule... as far as I know, no important theorem has been "overturned" as a result of formalization efforts.
I politely disagree; Metamath doesn't really have any ceiling on its complexity and can do anything that fancier, slower solvers can do. Do you have a specific example of something that Metamath cannot capture? Keep in mind that, in terms of formal power, there is no level above the one Metamath accesses.
Oh, it can capture everything. I believe that. It's just that it relies on volunteers to actually do the work of formalizing advanced theories. It has some pretty darn advanced stuff already, like complex numbers. But there's also a huge amount of research-level mathematics which nobody has gotten around to coding in the Metamath formalism yet. This is the same issue discussed in the video: with millions of dollars and a full-time team, you could get something like the proof of Fermat's last theorem in there, but it's not there yet.
This kind of reminds me of that Avengers: Endgame scene where Tony Stark (cf. Kevin Buzzard) was working with F.R.I.D.A.Y. (cf. Lean Theorem Prover) to interactively look for time travel solutions. :D
Are you referring to constructivism in terms of education? Or in terms of proving the existence of an object in mathematics? Traditionally, it is not necessary to construct an object to prove its existence. One can simply assume the object does not exist and prove a contradiction. Requiring constructive proofs does not seem to provide tangible benefits, but it does unnecessarily hinder mathematical thinking.
Constructivism/intuitionism seems to have culminated in modern-day finitism/ultrafinitism, in the likes of Doron Zeilberger (crank or genius - you decide).
Par for the course for computer scientists - if we had infinite time/space we wouldn't have to optimise anything.
This paper [1] by Ed Nelson presents a brilliant analysis on the difference between classical logicians and intuitionists.
It starts with a crucial distinction in semantics.
Assuming ∀x¬A and arriving at a contradiction is not a proof of ∃xA.
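The classical step being rejected here can be made explicit (a sketch in Lean 4 syntax; `Classical.byContradiction` is the non-constructive move):

```lean
-- Classically: from ¬∀x ¬P x we may conclude ∃x P x.
-- Intuitionistically this only yields ¬¬∃x P x; discharging the
-- double negation requires Classical.byContradiction.
example {α : Type} {P : α → Prop} (h : ¬ ∀ x, ¬ P x) : ∃ x, P x :=
  Classical.byContradiction fun hc => h fun x hx => hc ⟨x, hx⟩
```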
And yet, simply on the grounds of utility, the intuitionist perspective is far more useful to me as an every-day philosophy simply because "incomplete communications" is a mental model of how human systems/interactions work and how information flows in general. That which we call knowledge is socially constructed [2] and is always incomplete.
You won't convince anyone of this until they start working heavily in a theorem prover. With constructive proofs you can introduce certain automation that is not possible otherwise; until then, working on pen and paper, constructivism just looks like a limitation on your proof techniques.
But you are aware of this already, I'm just writing it out.
I'm surprised by this statement. Most of the research in automatic theorem proving (including for first- and higher-order logics) is based on classical logic, because it's much easier to reduce to a search for false than to try and prove an arbitrary formula. The automatic provers able to do intuitionistic proofs generally do it by encoding the intuitionistic logic into a classical logic first.
Even Isabelle/HOL, which is quite user friendly and has a lot of automation (like Sledgehammer, which can call to the automatic provers mentioned above) is based on classical logic with choice.
Hmm, I was more thinking about proof translation across isomorphisms. I'm not speaking from my own experience here, just that I have seen people grumble about it.
:-). I agree. Sure constructive proofs are sometimes longer, but it's the computer doing most of the work.
But you also end up with proofs that are much, much shorter.
I guess it's all about the degree of uncertainty. Constructive mathematics reduces the degree of uncertainty, which grows exponentially as you get in the higher level proofs.
I'd bet that we will invent a general AI which can invent math faster than we can make theorem provers do something useful.
What separates math from pure logic is that math has a set of very intuitive axioms. So mathematics is first and foremost about stretching and fixing inconsistencies in our intuition. I personally see no value in trying to refactor the foundations of mathematics, like constructing the integers out of sets; taking the integers and their operations as axiomatic hurts nobody. Theorem provers seem to be like that as well: people don't prove new things in them, they just refactor old things to fit in, kind of like people rewriting their services in languages with more street cred like Haskell or Rust.
>What separates math from pure logic is that math has a set of very intuitive axioms.
Taking certain sets of axioms can lead to very unintuitive conclusions; that's why some people care about building solid foundations. A classic example is https://en.wikipedia.org/wiki/Banach%E2%80%93Tarski_paradox , which states: "Given a solid ball in 3-dimensional space, there exists a decomposition of the ball into a finite number of disjoint subsets, which can then be put back together in a different way to yield two identical copies of the original ball." If we change the foundations to remove the axiom of choice, this paradox (and many others like it) is destroyed.
This can be done. Certainly with 100 million dollars in funding.