The author, Marcus Hutter, created the "Hutter Prize" [1], which offers a €50,000 prize to anyone who can compress a 100MB Wikipedia file to under 15MB.
From the prize site:
This compression contest is motivated by the fact that being able to compress well is closely related to acting intelligently, thus reducing the slippery concept of intelligence to hard file size numbers.
Hmm, the prize is defined to be at most €50,000. The formula is €50,000 × (1 − S/L), where S is the size of your compressed file plus your (de)compressor and L is the size of the current best entry. So if you produce a compression 5% smaller than the current record holder's, you get €2,500. The whole €50,000 is only reachable with an infinite compression ratio.
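For concreteness, here is that payout formula as a quick Python sketch (the €50,000 fund and the example sizes restate the numbers above; everything else is illustrative):

    def hutter_prize(new_size_bytes, record_size_bytes, fund_eur=50_000):
        """Payout = fund * (1 - S/L), where S is the new entry's total size
        (compressed data + decompressor) and L is the current record."""
        return fund_eur * (1 - new_size_bytes / record_size_bytes)

    # Beating a 15MB record by 5% pays out 5% of the fund:
    print(hutter_prize(new_size_bytes=14_250_000,
                       record_size_bytes=15_000_000))  # ~2500 (euros)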
I might be missing something here. Where does intelligence come in? On any given deterministic computer platform, there is a finite number of 15MB executables. And there is a finite number of 100MB text files. But the second number is much larger than the first, and thus not every 100MB text file has a 15MB executable that outputs it.
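The counting argument is easy to make concrete; the byte counts below just restate the 15MB/100MB figures from the challenge:

    # Number of distinct bit strings of length at most 15MB vs exactly 100MB.
    # (We only look at the log2 of the counts; the counts themselves are astronomical.)
    bits_15mb = 15_000_000 * 8       # 1.2e8 bits
    bits_100mb = 100_000_000 * 8     # 8e8 bits

    # There are fewer than 2**(bits_15mb + 1) programs of size <= 15MB,
    # but 2**bits_100mb distinct 100MB files -- vastly more, so most
    # 100MB files have no 15MB program that outputs them.
    print(bits_15mb)     # 120000000
    print(bits_100mb)    # 800000000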
Of course, maybe we can find a 15MB executable that outputs the specific Wikipedia file he’s asking for, and there might be clever ways to search for or construct that executable on a specific computer platform, but it doesn’t strike me as some particularly generalizable sense of compression requiring intelligence.
He has a book, "Universal Artificial Intelligence", that defines what he considers AI and then solves it. The only catch is that the solution relies on Kolmogorov complexity, which is incomputable.
The Kolmogorov complexity of a string is (loosely) the size of the smallest possible Turing machine that outputs that string when run. So a string of 1000 "a"s has a lower complexity than a random string of length 1000.
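A toy way to see the asymmetry, with Python standing in for the Turing machine and program length standing in for Kolmogorov complexity (which it only upper-bounds):

    import random
    import string

    # A highly regular string has a very short program that generates it...
    regular = "a" * 1000               # generated by: print("a" * 1000)

    # ...while a random string (with overwhelming probability) has no generator
    # much shorter than the string itself: the shortest "program" is essentially
    # a literal copy of the data.
    rng = random.Random(0)
    noise = "".join(rng.choice(string.ascii_lowercase) for _ in range(1000))

    print(len('print("a" * 1000)'))    # 17 characters of "program"
    print(len(noise))                  # 1000 characters, roughly incompressible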
He sees GAI as a system that, given all its input up to now (from sensors and the like) and some value-function that defines the goal we're trying to reach, takes the action that has the highest expected value of the value-function. It's hard to give a better definition of intelligence in a completely general sense: given all input, choose the most likely best action.
That requires some guessing about the future: given all this input so far, for all possible next inputs in the next time step, what is the likelihood of each.
And his answer is: the likelihood is tied to the Kolmogorov complexity of the sequence of all inputs so far plus the possible future input we're currently considering; the continuation that gives the whole sequence the lowest Kolmogorov complexity is the most likely one. The idea is that there is information in the inputs so far, and the continuation that best exploits it keeps the complexity low. If we've seen 999 "a"s, we expect another one. If we've seen a car-shaped object slide left to right on previous camera images, we expect it to continue its movement. If the input so far is completely random, then every possible future is equally likely. Et cetera.
Given that, it becomes possible to calculate the value function for every combination of possible action and possible next sensor input, and the AI can know what to do.
Except of course for the detail that this is (wildly) incomputable.
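Restricted to a tiny hand-picked model class it does become computable, and a toy caricature (emphatically not Hutter's actual formulation, which mixes over all programs) shows the shape of the decision rule: weight each candidate model by 2^-(description length), use the mixture to predict the next observation, and take the action with the highest expected value. Everything below is invented for illustration:

    from math import inf

    # Toy "environment models": each has a description length in bits (a stand-in
    # for Kolmogorov complexity) and a predictor mapping (history, action) to a
    # distribution over the next observation.
    MODELS = [
        (10, lambda h, a: {"a": 0.9, "b": 0.1} if a == "left" else {"a": 0.2, "b": 0.8}),
        (25, lambda h, a: {"a": 0.5, "b": 0.5}),   # "the world is a coin flip"
    ]

    def value(observation):
        """Hypothetical value function: we want to see 'a'."""
        return 1.0 if observation == "a" else 0.0

    def choose_action(history, actions=("left", "right")):
        # Solomonoff-style prior: weight 2**-K for a model of description length K.
        weights = [2.0 ** -k for k, _ in MODELS]
        total = sum(weights)
        best_action, best_value = None, -inf
        for action in actions:
            expected = 0.0
            for w, (_, predict) in zip(weights, MODELS):
                for obs, p in predict(history, action).items():
                    expected += (w / total) * p * value(obs)
            if expected > best_value:
                best_action, best_value = action, expected
        return best_action

    print(choose_action(history="aaab"))   # prints: left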
But what is somewhat correlated to Kolmogorov complexity? Compressibility.
So that is why Hutter is interested in better compression algorithms.
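That substitution is easy to demo with an off-the-shelf compressor standing in as a crude, computable surrogate for the ideal one:

    import os
    import zlib

    structured = b"abc" * 100_000      # highly regular: low Kolmogorov complexity
    random_ish = os.urandom(300_000)   # incompressible with overwhelming probability

    # The compressed size upper-bounds the Kolmogorov complexity
    # (up to the fixed size of the decompressor).
    print(len(zlib.compress(structured, 9)))   # a few hundred bytes
    print(len(zlib.compress(random_ish, 9)))   # ~300,000 bytes, if not slightly more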
All this is very loosely stated from memory; I was interested in this stuff over a decade ago...
I've got a wife and kids, so I have no free time anymore, and nowadays I feel there are far more important problems in the world than AI (climate change, huge loss of biodiversity, cheap and very effective propaganda).
I don't dislike this challenge, and I don't dislike code golf. What I don't understand is the notion that compressing a 100MB Wikipedia file somehow requires an "intelligent" compression algorithm. If the challenge were to compress any English text down to a certain ratio, then perhaps making the algorithm behave "intelligently" would be a requirement. But as far as I can tell, this challenge asks for a lossless reproduction of one specific chunk of data, so it seems to me that we just have to hope that, on a given platform, there exists a 15MB executable that outputs that exact chunk of data. One might exist, but one might not. And you probably won't even be able to prove the latter, because of the halting problem (technically it doesn't apply to a computer with finite memory, but in practice no one will wait long enough).
Not just any chunk though. It (presumably) contains a lot of structure that encodes information about [English representations of] the real world.
It's not random, nor is it simple, nor is it an image of a fractal (maybe?) etc.
A system that is able to compress it well makes good predictions about what the data contain, which indicates "intelligence" about the structure of the (encoded) world.
But would you expect that program to itself be useful for anything else? It literally just takes no input and outputs that one specific Wikipedia dump. It’s not like I can tweak it slightly and have it output new predictions of things that sound like Wikipedia articles.
Unless the idea is that whatever methodology is required to build this particular program could then be adapted to do other things that seem intelligent. That seems possible, but not at all likely given my reading of the challenge requirements.
See the links in my other comment in the thread for all the gory details.
> But would you expect that program to itself be useful for anything else?
That exact program, no, but the process of creating that program would be educational, both the compressor that makes it and the process of making the compressor.
Another way of putting it is in terms of the "Twenty Questions" game. Which pre-selected twenty questions allow you to "cover" the most of the Universe of phenomena? If one set of 20 allows you to distinguish more of the Universe than some other set, then the first set is somehow a better model of the Universe. Does that help?
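For scale: twenty yes/no answers can distinguish at most 2^20, about a million, things, and a question set only approaches that bound when each question splits the remaining candidates roughly in half, i.e. carries close to one full bit. A tiny illustration (the 0.95 example is made up):

    from math import log2

    print(2 ** 20)   # 1048576: the ceiling on what 20 binary questions can distinguish

    def question_bits(p_yes):
        """Expected information, in bits, from a yes/no question that gets
        answered 'yes' with probability p_yes over the remaining candidates."""
        p_no = 1 - p_yes
        return -(p_yes * log2(p_yes) + p_no * log2(p_no))

    print(question_bits(0.5))    # 1.0 bit: an ideal, evenly splitting question
    print(question_bits(0.95))   # ~0.29 bits: a question that is almost always 'yes'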
> If one set of 20 allows you to distinguish more of the Universe than some other set, then the first set is somehow a better model of the Universe.
Is Akinator's 20q database available as a dataset for non-commercial research purposes? It'd be neat to attempt to compress it in a way where training with one half predicts the other half at the smallest trained model size.
The point of the challenge is to actually create a model that fits in 15MB together with its compressed data. Not just to prove its existence.
One of the best contenders is the open source cmix, which, among other tricks, uses LSTM neural networks to predict the next character/bit: http://www.byronknoll.com/cmix.html
I had to search out the size of the decompressor on a different page, but it proves you right. The compressed file is ~400KB smaller than the record holder and the cmix decompressor is only ~200KB.
I wonder what the memory/speed/efficiency tradeoffs look like for that kind of design.
It's related to intelligence in the sense that being able to predict well requires intelligence and helps compression. If you have an intelligent model that predicts new words with high confidence, then you can use fewer bits to encode them. If I were to be even more handwavy, I would say having the ability to summarize effectively is a sign of intelligence.
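The "fewer bits" part is just the standard arithmetic-coding bound: a symbol the model predicts with probability p costs about −log2(p) bits, so confident correct predictions are nearly free and uniform guesses are expensive. A back-of-the-envelope check (the probabilities here are made up):

    from math import log2

    def bits_to_encode(p):
        """Ideal code length, in bits, for a symbol the model assigned probability p."""
        return -log2(p)

    # A model that is 99% sure of the next word pays almost nothing when it's right...
    print(bits_to_encode(0.99))        # ~0.014 bits
    # ...while a clueless model choosing uniformly among ~50,000 words pays a lot:
    print(bits_to_encode(1 / 50_000))  # ~15.6 bits per word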
There is a much smaller set of useful 100MB text files than there is of total possible ones. If there are only 15 million such useful 100MB text files, we should be able to map them to 15MB executables.
[1] http://www.hutter1.net/prize/index.htm