
KANs (Kolmogorov-Arnold Networks) are one example of a promising exploration pathway to real AGI, with the advantage of full explainability.
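
For context, the core KAN idea (Liu et al., 2024) is to make the edges, not the nodes, the learnable part: every edge carries its own trainable univariate function, and each node simply sums its incoming edges, so after training those little 1-D functions can be plotted and read off directly. A rough sketch of one such layer, using a Gaussian basis instead of the paper's B-splines to keep it short (so the parameterization here is illustrative, not the reference implementation):

    import torch
    import torch.nn as nn

    class KANLayer(nn.Module):
        """One KAN-style layer: output_j = sum_i phi_ij(input_i), phi_ij learnable."""
        def __init__(self, in_dim, out_dim, n_basis=8):
            super().__init__()
            # Fixed grid of basis-function centers on [-2, 2].
            self.register_buffer("centers", torch.linspace(-2.0, 2.0, n_basis))
            # One learnable univariate function per edge (i -> j):
            # phi_ij(x) = sum_k coeffs[j, i, k] * exp(-(x - centers[k])**2)
            self.coeffs = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, n_basis))

        def forward(self, x):                                          # x: (batch, in_dim)
            basis = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)  # (batch, in_dim, n_basis)
            return torch.einsum("bik,oik->bo", basis, self.coeffs)     # sum phi_ij over inputs i

    layer = KANLayer(in_dim=2, out_dim=1)
    print(layer(torch.randn(4, 2)).shape)      # torch.Size([4, 1])
    # The explainability claim: each phi_ij is a plain 1-D function you can plot,
    # e.g. the (still untrained) function sitting on edge (input 0 -> output 0):
    xs = torch.linspace(-2.0, 2.0, 5)
    print(torch.exp(-(xs.unsqueeze(-1) - layer.centers) ** 2) @ layer.coeffs[0, 0])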





"Explainable" is a strong word.

As a simple example, if you ask a question and part of the answer is quoted from memory straight out of a book, that text is not computed or reasoned out by the AI and so doesn't have an "explanation".

But I also suspect that any AGI would necessarily produce answers it can't explain. That's called intuition.


Why? If I ask you what the height of the Empire State Building is, then a reference is a great, explainable answer.

It wouldn't be a reference; "explanation" for an LLM means it tells you which of its neurons were used to create the answer, i.e. what internal computations it did and which parts of the input it read. Their architecture isn't capable of referencing things.

What you'd get is an explanation saying "it quoted this verbatim", or possibly "the top neuron is used to output the word 'State' after the word 'Empire'".

You can try out a system here: https://monitor.transluce.org/dashboard/chat

Of course the AI could incorporate web search, but then what if the explanation is just "it did a web search and that was the first result"? It seems pretty difficult to recursively make every external tool also explainable…
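
For concreteness, here is roughly what that neuron-level accounting looks like in practice: forward hooks that record MLP activations and report the strongest ones for the final token. GPT-2 stands in here only because it is small (the monitor above instruments Llama); the layer and top-k choices are arbitrary, and note that this tells you which units fired, not why:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    activations = {}

    def make_hook(layer_idx):
        def hook(module, inputs, output):
            activations[layer_idx] = output.detach()   # (batch, seq, 3072) post-GELU values
        return hook

    # Record the post-activation output of every MLP block.
    for i, block in enumerate(model.transformer.h):
        block.mlp.act.register_forward_hook(make_hook(i))

    prompt = "The Empire State Building is"
    with torch.no_grad():
        model(**tok(prompt, return_tensors="pt"))

    # "Explanation": the most active MLP neurons at the last token of the last layer.
    last_layer = len(model.transformer.h) - 1
    top = torch.topk(activations[last_layer][0, -1], k=5)
    for value, idx in zip(top.values, top.indices):
        print(f"layer {last_layer}, neuron {idx.item()}: activation {value.item():.2f}")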


Then you should have a stronger notion of "explanation". Why were these specific neurons activated?

Simplest example: OCR. A network identifying digits can often be explained as recognizing lines, curves, number of segments, etc. That is an explanation, not "computer says it looks like an 8".
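
The weakest version of that kind of explanation is easy to sketch: input-gradient saliency, i.e. asking which pixels (which strokes) pushed the classifier toward its answer. The model below is a stand-in; assume it has actually been trained on MNIST, and a real "it has two closed loops, so it's an 8" explanation needs considerably more than this:

    import torch
    import torch.nn as nn

    # Stand-in digit classifier; in practice this would be trained on MNIST.
    model = nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 128), nn.ReLU(),
        nn.Linear(128, 10),
    )
    model.eval()

    image = torch.rand(1, 1, 28, 28, requires_grad=True)   # stand-in for a digit image

    logits = model(image)
    digit = logits.argmax().item()
    logits[0, digit].backward()            # d(score of predicted digit) / d(pixels)

    saliency = image.grad.abs().squeeze()  # 28x28 map: brighter = more influential stroke
    print("predicted digit:", digit)
    print("most influential pixel (row, col):", divmod(saliency.argmax().item(), 28))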


But can humans do that? If you show someone a picture of a cat, can they "explain" why it is a cat and not a dog or a pumpkin?

And is that explanation really how they arrived at the "cat-ness" of the picture, or do they just see that it is a cat immediately and obviously, and when you ask them for an explanation they come up with some explaining noises until you are satisfied?


Wild cat, house cat, lynx,...? Sure, they can. They will tell you about proportions, shape of the ears, size as compared to other objects in the picture etc.

For cat vs pumpkin they will think you are making fun of them, but it very much is explainable. Though now I am picturing a puzzle about finding orange cats in a picture of a pumpkin field.


Shown a picture of a cloud, why it looks like a cat sometimes does need explaining before others can see the cat, and it's not just "explaining noises".

LLMs are not the only possible option here. When talking about AGI, none of what we are doing currently is all that promising.

The search is for something that can write an essay, drive a car, and cook lunch, so we need something new.


When people talk about explainability I immediately think of Prolog.

A Prolog query is explainable precisely because, by construction, its derivation is the explanation. You can go step by step through how you got a particular result, inspecting each variable binding and predicate call site along the way.
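
A toy illustration of that property (in Python rather than Prolog, so the machinery is spelled out): a couple of made-up family facts, one rule, and a backtracking solver that prints every unification it attempts, so the printed trace is the derivation. A real Prolog engine also renames rule variables per use, indexes clauses, and so on; this is just the shape of it:

    # Terms are tuples; capitalized strings are variables, as in Prolog.
    def is_var(t):
        return isinstance(t, str) and t[:1].isupper()

    def walk(t, env):                       # follow a variable to its binding
        while is_var(t) and t in env:
            t = env[t]
        return t

    def unify(a, b, env):
        a, b = walk(a, env), walk(b, env)
        if a == b:
            return env
        if is_var(a):
            return {**env, a: b}
        if is_var(b):
            return {**env, b: a}
        return None                         # constants clash -> backtrack

    def unify_terms(xs, ys, env):
        for x, y in zip(xs, ys):
            env = unify(x, y, env)
            if env is None:
                return None
        return env

    FACTS = [("parent", "tom", "bob"), ("parent", "bob", "ann")]
    RULES = [  # grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
        (("grandparent", "X", "Z"), [("parent", "X", "Y"), ("parent", "Y", "Z")]),
    ]

    def solve(goals, env, depth=0):
        if not goals:
            yield env
            return
        goal, rest = goals[0], goals[1:]
        print("  " * depth + "goal: " + str(tuple(walk(t, env) for t in goal)))
        for fact in FACTS:
            e = unify_terms(goal, fact, env)
            if e is not None:
                print("  " * depth + "  matches fact " + str(fact))
                yield from solve(rest, e, depth)
        for head, body in RULES:
            e = unify_terms(goal, head, env)
            if e is not None:
                print("  " * depth + "  expands via rule " + str(head) + " :- ...")
                yield from solve(list(body) + rest, e, depth + 1)

    for env in solve([("grandparent", "tom", "Who")], {}):
        print("answer: Who =", walk("Who", env))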

Despite all the billions being thrown at modern ML, no one has managed to create a model that does something like what Prolog does with its simple recursive backtracking.

So the moral of the story is that you can 100% trust the result of a Prolog query, but you can't ever trust the output of an LLM. Given that, which technology would you rather use to build software on which lives depend?

And which of the two methods is more "artificially intelligent"?


The site I linked above does that for LLaMa 8B.

https://transluce.org/observability-interface

LLMs don't have enough self-awareness to produce really satisfying explanations though, no.



