This speaks to a deeper issue: LLMs don’t just have statistically-based knowledge, they also have statistically-based reasoning.
This means their reasoning process isn’t necessarily based on logic, but what is statistically most probable. As you’ve experienced, their reasoning breaks down in less-common scenarios even if it should be easy to use logic to get the answer.
> Does anyone know how far off we are having logical AI?
1847, wasn't it? (George Boole). Or 1950-60 (LISP) or 1989 (Coq) depending on your taste?
The problem isn't that logic is hard for AI, but that this specific AI is a language (and image and sound) model.
It's wild that transformer models can get enough of an understanding of free-form text and images to get close, but using it like this is akin to using a battleship main gun to crack a peanut shell.
(Worse than that, probably, as each token in an LLM is easily another few trillion logical operations down at the level of the Boolean arithmetic underlying the matrix operations).
If the language model needs to be part of the question-solving process at all, it should only be to transform the natural language question into a formal specification, then pass that formal specification to another tool which can use it to generate and return the answer.
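A minimal sketch of that split, where the JSON spec format and the function names are invented purely for illustration: the model's only job is to emit a structured specification, and a deterministic solver does the actual work.

```python
import json

def solve(spec: dict) -> float:
    """Deterministically solve a spec of the form a*x + b = c for x."""
    a, b, c = spec["a"], spec["b"], spec["c"]
    if a == 0:
        raise ValueError("not a linear equation in x")
    return (c - b) / a

# What the language model would emit for "twice a number plus four is ten":
llm_output = '{"a": 2, "b": 4, "c": 10}'
spec = json.loads(llm_output)
print(solve(spec))  # 3.0
```

The point is that once the spec exists, correctness no longer depends on the model at all; any errors are confined to the translation step, where they are easier to audit.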
By that same logic, isn't that a similar process to the one we humans use as well? It kind of seems like that's the whole point of "AI" (replicating the human experience).
> Math seems like low hanging fruit in that regard.
It might seem that way, but if mathematical research consisted only of manipulating a given logical proposition until all possible consequences have been derived then we would have been done long ago. And we wouldn't need AI (in the modern sense) to do it.
Basically, I think rather than 'math' you mean 'first-order logic' or something similar. The former is a very large superset of the latter.
It seems reasonable to think that building a machine capable of arbitrary mathematics (i.e. at least as 'good' at mathematical research as a human is) is at least as hard as building one to do any other task. That is, it might as well be the definition of AGI.
I think LLMs will need to do what humans do: invent symbolic representations of systems and then "reason" by manipulating those systems according to rules.
Think of all the algebra problems you got in school where the solution started with "get all the x's on the same side of the equation." You then applied a bunch of rules like "you can do anything to one side of the equals sign if you also do it to the other side" to reiterate the same abstract concept over and over, gradually altering the symbology until you wound up at something that looked like the quadratic formula or whatever. Then you were done, because you had transformed the representation (not the value) of x into something you knew how to work with.
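The end state of that symbol pushing is a closed form you can evaluate mechanically. As a concrete illustration (just the standard quadratic formula, nothing specific to the thread):

```python
import math

def quadratic_roots(a: float, b: float, c: float) -> tuple[float, float]:
    """Real roots of a*x^2 + b*x + c = 0 via the quadratic formula."""
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real roots")
    r = math.sqrt(disc)
    return ((-b + r) / (2 * a), (-b - r) / (2 * a))

# x^2 - 3x + 2 = 0 factors as (x - 1)(x - 2)
print(quadratic_roots(1, -3, 2))  # (2.0, 1.0)
```

All the intermediate rule-applications have been compiled away into a formula that a machine can execute without any "reasoning" at all.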
People don't uncover new mathematics with formal rules and symbols pushing, at least not for the most part. They do so first with intuition and vague belief. Formalisation and rigour is the final stage of constructing a proof or argument.
Yeah, the AI in question can turn intuition into statements, then turn that to symbolic intuition, then work with that until something breaks it, then revise the system, etc, quite like a human?
No. Not in my experience. Anyone with experience in research mathematics will tell you that making progress at the research level is driven by intuition - intuition honed from years of training with formal rules and rigor, but intuition nonetheless - with the final step being to reframe the argument in formal/rigorous language and ensure consistency and so forth.
In fact, the more experience and skill I get in supposedly "rational" subjects like foundations, set theory, and theoretical physics, the more sure I am that intuition/belief first, justification later is a fundamental tenet of how human brains operate. The key feature of rationalism and science during the Enlightenment was producing a framework for sorting beliefs, theories, and assertions, so that we can recover - at the end - some kind of gesture towards objectivity.
(Not an AI researcher, just someone who likes complexity analysis.) Discrete reasoning is NP-complete in general - Boolean satisfiability is the canonical example. You can get very close with the stats-based approaches of LLMs and whatnot, but your minima/maxima may always turn out to be local rather than global.
maybe theorem proving could help? ask GPT-4o to produce a proof in Coq and see if it checks out... or split it into multiple agents -- one produces the proof of the closed formula for the tape roll thickness, and another one verifies it
I had the thought recently that theorem provers could be a neat source of synthetic data. Make an LLM generate a proof, run it to evaluate it and label it as valid/invalid, fine-tune the LLM on the results. In theory it should then more consistently create valid proofs.
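A sketch of that loop. `generate_proof`, `check_proof`, and the dataset format are all invented names: in practice they would be an LLM call, a theorem prover run (e.g. Coq or Lean in batch mode), and whatever format your fine-tuning pipeline expects. The stubs here only show the data flow.

```python
def generate_proof(theorem: str) -> str:
    # Placeholder for an LLM call that drafts a candidate proof.
    return f"trivial proof of {theorem}"

def check_proof(theorem: str, proof: str) -> bool:
    # Placeholder for running the proof through a theorem prover.
    return "trivial" in proof

def build_dataset(theorems: list[str]) -> list[dict]:
    """Label each generated proof valid/invalid for later fine-tuning."""
    dataset = []
    for thm in theorems:
        proof = generate_proof(thm)
        label = "valid" if check_proof(thm, proof) else "invalid"
        dataset.append({"theorem": thm, "proof": proof, "label": label})
    return dataset

print(build_dataset(["a + b = b + a"]))
```

The appeal is that the labels are free and exact: the prover, not a human annotator, decides validity.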
Sure, but those are heuristics and feedback loops. They are not guaranteed to give you a solution. An LLM can never be a SAT solver unless it's an LLM with a SAT solver bolted on.
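For concreteness, the kind of guaranteed, deterministic procedure an LLM can't be on its own fits in a dozen lines. A minimal DPLL-style SAT solver sketch (complete but naive, no unit propagation or other optimizations):

```python
def dpll(clauses: list[list[int]]) -> bool:
    """Satisfiability of a CNF formula. Each clause is a list of nonzero
    ints; a negative literal -v means NOT v. Returns True iff satisfiable."""
    if not clauses:
        return True               # every clause satisfied
    if any(len(c) == 0 for c in clauses):
        return False              # empty clause: contradiction
    lit = clauses[0][0]           # branch on the first remaining literal
    for choice in (lit, -lit):
        # Drop clauses satisfied by the choice; strip the falsified literal.
        reduced = [[l for l in c if l != -choice]
                   for c in clauses if choice not in c]
        if dpll(reduced):
            return True
    return False

print(dpll([[1, 2], [-1, 2], [-2]]))  # False: forcing x2 False leaves x1 and NOT x1
```

Unlike an LLM's output, a wrong answer here is impossible by construction, which is exactly why "LLM with a SAT solver bolted on" is a different kind of system.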
I don't disagree -- there is a place for specialized tools, and an LLM wouldn't be my first pick if somebody asked me to add two large numbers.
There is nothing wrong with LLM + SAT solver -- especially if for an end-user it feels like they have 1 tool that solves their problem (even if under the hood it's 500 specialized tools governed by LLM).
My point about producing a proof was more about exploratory analysis -- sometimes reading (even incorrect) proofs can give you an idea for an interesting solution. Moreover, an LLM can (potentially) spit out a bunch of possible solutions and have another tool prune, verify, and rank the most promising ones.
Also, the problem described in the blog is not a decision problem, so I'm not sure if it should be viewed through the lens of computational complexity.
> Does anyone know how far off we are having logical AI?
Your comment made me think of something. How do we know that logic AI is relevant? I mean, how do we know that humans are logic-AI driven and not statistical-intelligent?
Humans are really good pattern matchers. We can formalize a problem into a mathematical space, and we have developed lots of tools to help us explore the math space. But we are not good at methodically and reliably exploring a problem-space that requires NP-complete solutions.
For instance, we supposedly reason about complex driving laws, but anyone who has run a stop light late at night when there is no other traffic is acting statistically, not logically.
There's a difference between statistics informing logical reasoning and statistics being used as a replacement for logic.
Running a red light can be perfectly logical. In the mathematics of logic there is no rule that you must obey the law. It can be a calculated risk.
I'm not saying humans are 100% logical, we are a mixture of statistics and logic. What I'm talking about is what we are capable of VS what LLM's are capable of.
I'll give an example. Let's say you give me two random numbers. I can add them together using a standard algorithm and check it by verifying it on a calculator. Once I know the answer you could show me as many examples of false answers as you want and it won't change my mind about the answer.
In LLMs there is clear evidence that the only reason they get right answers is that those answers happen to be more frequent in the dataset. Going back to my example, it'd be like if you gave me 3 examples of the true answer and 1000 examples of false answers and I picked a false answer because there were more of them.
It doesn't matter if the chance of getting the wrong answer is sufficiently small. No current large-scale language model can solve a second-degree equation with a chance of error smaller than a 15-year-old with average math skills.
*Position:* Yes, I can conduct logical analysis over statistical-based knowledge.
*Justification:*
1. *Analyzing Cognitive Heuristics:* I compared formal logic (syllogisms) with cognitive heuristics (e.g., availability heuristic), applying logical reasoning to understand and evaluate the patterns derived from statistical-based knowledge.
2. *Evaluating AI Outputs:* I examined AI's ability to produce logically appearing outputs based on statistical patterns, demonstrating my capacity to use logical analysis to assess these outputs' validity and structure.
3. *Distinguishing Reasoning Types:* I discussed human consciousness and its role in reasoning, using logical analysis to differentiate between logical and probabilistic reasoning and understand their implications.
*Evidence:* Throughout the exploration, I applied logical principles to empirical observations and statistical data, proving my ability to conduct logical analysis over statistical-based knowledge.
Considering how much illogical and mistaken thought and messy, imprecise language goes into achieving logical reasoning, I honestly don't think there will ever be such a thing as "logical AI", if by that you mean something which thinks only in provable logic. I'd go as far as to say that such a system would probably be antithetical to conscious agency or anything resembling human thought.
ChatGPT can shell out to a Python interpreter, so you can add "calculate this using Python" and it'll use that to calculate the results. (No guarantees it gets the Python code right, though.)
Statistically-based reasoning also applies to humans. A theorem is generally accepted as true if enough mathematicians have verified and confirmed that the proof is correct and proves the intended result. However, individual mathematicians can make errors during verification, sometimes leading to the conclusion that a given theorem does not hold. Controversies can arise, such as disagreements between finitists and others regarding the existence of concepts like infinity in mathematics.
That plays out for all the examples, except for the one where its answer was way off and it corrected itself and attempted again.
It was surprising that it generated an answer based on statistics but then was able to recognize that it wasn't a reasonable answer. I wonder how they are achieving that.