The remarkable thing that this explanation doesn't really explore in depth is that the maximum information (~entropy) contained in a region scales according to its boundary area rather than to its volume. That means that "maximum information per unit volume" drops precipitously with size.[1]
Just for example, this allows you to calculate an ultimate limit on Moore's law of only ~800 years, for any possible computer functioning within the bounds of the observable universe. As sketched below[1], the observable universe can hold only about 10^123 bits of information. Processors currently contain about 10^9 transistors (each of which has to be doing computations with an independent bit of information to be useful), a factor of 10^114 less. Moore's law says that this number doubles every 2 years, which means it grows by a factor of ~10^3 every 20 years, and 20x(114/3) is about 760 years. (A more detailed calculation carried out in a paper by Krauss and Starkman at http://arxiv.org/abs/astro-ph/0404510 came up with a limit of about 600 years.) That's almost frighteningly soon.
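Here's that back-of-the-envelope arithmetic as a few lines of Python, using the same round numbers as above (nothing in it is more precise than the estimate itself):

    # Rough sketch of the ~760-year estimate; all inputs are the
    # order-of-magnitude figures from the paragraph above.
    import math

    bits_universe = 1e123      # holographic limit for the observable universe
    bits_today    = 1e9        # ~transistor count of a current processor
    doubling_time = 2          # years per doubling (Moore's law)

    doublings = math.log2(bits_universe / bits_today)   # ~379 doublings
    years     = doublings * doubling_time               # ~757 years
    print(f"{doublings:.0f} doublings -> ~{years:.0f} years")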
[1] As the link says, for a cubic centimeter (cc) of volume, the maximum entropy is about 10^66 bits. But if you consider a cubic meter instead, you find it can hold at most 10^70 bits, which comes out to only 10^64 bits per cc! For a cubic kilometer you get 10^76 bits, which is only 10^61 bits per cc. If the solar system has radius ~10^13 m, it could hold at most 10^96 bits, or 10^51 bits/cc. And the whole observable universe (with radius ~5x10^26 m) could hold at most 2.5x10^123 bits, or only about 10^37 bits/cc. That's dramatically less than the direct one-cc calculation! This behavior is exceedingly non-intuitive, at least to me.
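If you want to see where those figures come from, here's a rough sketch assuming the holographic bound of about one bit per 4*ln(2) Planck areas of bounding surface (my assumption for the bound the link uses). I'm also using spheres of the quoted radii rather than cubes, so the numbers only match to order of magnitude:

    # Max bits in a sphere of radius r, assuming the holographic bound
    # bits ~ A / (4 * ln(2) * l_p^2), with A = 4*pi*r^2.
    import math

    l_p = 1.616e-35  # Planck length in meters

    def max_bits(radius_m):
        area = 4 * math.pi * radius_m**2
        return area / (4 * math.log(2) * l_p**2)

    def bits_per_cc(radius_m):
        volume_cc = (4/3) * math.pi * radius_m**3 * 1e6  # m^3 -> cm^3
        return max_bits(radius_m) / volume_cc

    for label, r in [("1 cm", 0.01), ("1 m", 1.0), ("1 km", 1e3),
                     ("solar system", 1e13), ("observable universe", 5e26)]:
        print(f"{label:>20}: {max_bits(r):.1e} bits total, "
              f"{bits_per_cc(r):.1e} bits/cc")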
> This behavior is exceedingly non-intuitive, at least to me.
I've been thinking about the notion of the Bekenstein Bound lately, and I'm not really sure how accurate the whole idea is. There seem to be a few problems with it, and I don't get the sense that it is a particularly heavily studied problem in the physics community (compared to something like general relativity or QFT).
For one thing, nobody is certain that "actual" black holes exist. What I mean by this is a black hole where matter has actually fallen into it. Instead, to an outside observer the matter approaches the event horizon more and more slowly, until at some point it is effectively frozen an infinitesimal distance from the event horizon (but still not in the black hole). In this sense, of course the information content is proportional to the surface area of the black hole.
(I should note that mathematically, "effective" black holes and "real" black holes end up having the same properties and behavior; they'd be indistinguishable to an observer).
Another problem with the Bekenstein Bound is that quantum states aren't localized to a volume of space. You can't just hold up a beach ball and ask "What's the maximum information content of this beach ball?" Why not? Because you can't have a wavefunction of just "the beach ball". It's a pretty good approximation, but what about the electron-electron correlations at the edge of the beach ball? And what if the spin of an electron within the beach ball is entangled with an electron outside of it? The information describing that particular quantum state is then delocalized.
Don't count on finding big misconceptions here: in fact, questions about black hole entropy and the Bekenstein bound are some of the most intensely studied topics in theoretical physics today. Not that we understand the answers! But these issues are very near the heart of questions about quantum gravity.
As for your question about "actual" black holes, I think you may be missing some pieces of the puzzle, and I don't have the time to chat about them at length here. But one important point is that while outside observers may see infalling matter appear to stall out an infinitesimal distance from the black hole horizon, the standard classical or semiclassical result is that an actual infalling observer would pass through the event horizon without noticing anything special about it at all. (For a stellar-sized black hole, they would probably have already been ripped apart by tidal forces at that point ("spaghettification"), but that's unrelated to the horizon. For a truly massive galaxy-sized black hole where tidal forces are small at the horizon, the observer wouldn't notice anything special at all as she crossed the point of no return.) There is at this moment a massive debate raging in the particle physics community about whether quantum gravity effects might lead to a "firewall" at the horizon of a black hole that would change this story, but I don't think the question is settled yet.
As for localization of quantum states and information, again, all I can tell you is that people have been studying those issues a lot, too. The details of entanglement of states inside and outside a black hole are closely tied in with the current "black hole firewalls" debate that I mentioned earlier, for instance. So don't make the mistake of thinking that the physics community has neglected this sort of issue!
So I take my cubic meter and divide it into 100^3 = 10^6 1 cm^3 segments, and store 10^66 bits in each one. That brings me to 10^72 bits in the 1 m^3 volume. Does something stop me from doing this? Or take it in the other direction: store 10^66 bits in each of 10^6 little cubes and just stack them up. What gives?
See, that's the fun part. If you try to pack in more information than given by these limits, you inevitably form a black hole (and then you lose access to those individual cm^3 regions, and your plan falls apart).
How so? Those 1cc cubes all have the exact same density, and placing them together doesn't change the density of the resulting object (the 1m cube).
For what you're saying to happen, they would have to be so massive and dense in the first place that they would pull in any matter placed near them (and then form a black hole).
But does that even hold true when you run the numbers to see whether that "max-info" 1cc cube is anywhere near the mass required (given its volume of 1 cc) to form a black hole?
One of the confusing issues here is that there is no single, constant density necessary for the formation of a black hole. Instead, the Schwarzschild radius of a (potential) black hole is proportional to its total mass. That means that as you increase its mass, the actual radius of a sphere of constant density grows only as m^(1/3), while its Schwarzschild radius grows much faster, as m^1.
Thus, as you pile up more and more of your identical 1cc cubes, the size of the pile will grow more slowly than the size of its Schwarzschild radius. As soon as you add enough cubes for the Schwarzschild radius to exceed the actual radius, the system must inevitably form a black hole.
I think that this behavior is directly related to the information density limits that we've been talking about, but certainly the end result is the same: piling up lots of similar stuff in one place will eventually lead to gravitational collapse.
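Here's a small numerical illustration of that crossover. The water-like density is an arbitrary example I picked just to show the scaling; nothing in it depends on the 10^66-bit cubes specifically:

    # For a ball of constant density rho, the actual radius grows as m^(1/3)
    # while the Schwarzschild radius r_s = 2*G*m/c^2 grows as m. Find the
    # mass where they cross, i.e. where collapse becomes inevitable.
    import math

    G, c = 6.674e-11, 2.998e8
    rho  = 1000.0  # kg/m^3 (water-like density, an arbitrary example)

    def actual_radius(m):
        return (3 * m / (4 * math.pi * rho)) ** (1/3)

    def schwarzschild_radius(m):
        return 2 * G * m / c**2

    # Setting the two radii equal gives m_crit = sqrt(3*c^6 / (32*pi*G^3*rho))
    m_crit = math.sqrt(3 * c**6 / (32 * math.pi * G**3 * rho))
    print(f"critical mass   ~ {m_crit:.2e} kg")
    print(f"at that mass: r_actual ~ {actual_radius(m_crit):.2e} m, "
          f"r_s ~ {schwarzschild_radius(m_crit):.2e} m")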
Information contained in a cubic meter should be equal to the information contained in 100^3 cubic centimeters (the number of cubic-centimeter parts in a cubic meter).
> However, this figure [infinite perimeter] relies on the idea that space can be subdivided indefinitely. This fiction—which underlies Euclidean geometry and serves as a useful model in everyday measurement—almost certainly does not reflect the changing realities of 'space' and 'distance' on the atomic level.
A subatomic particle has a physical size of zero (the "size" of a particle is essentially the range of the forces it participates in, but the spatial size is zero), so depending on how you measure it (i.e. which forces) you can get an area that really is infinite.
Since the Schwarzschild radius is proportional to the mass, it is probably better to note that the maximum entropy of a black hole seems to scale with mass squared, instead of being proportional to the mass as in an ideal gas. But this can sort of be understood by noting that (using a really badly flawed model of black holes) the minimum energy states are all occupied, so to add matter, it needs to sit at an energy proportional to the mass of the black hole. Additionally, arguing about black holes assumes that we cannot get any information out of the computer.
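To make the M^2 scaling concrete, here's a quick sketch using the standard Bekenstein-Hawking entropy S = k*A*c^3/(4*G*hbar); doubling the mass quadruples the number of bits:

    # Bekenstein-Hawking entropy in bits: (S/k) / ln(2), with A = 4*pi*r_s^2
    # and r_s = 2*G*M/c^2, so the bit count scales as M^2.
    import math

    G, c, hbar = 6.674e-11, 2.998e8, 1.055e-34

    def bh_bits(mass_kg):
        r_s = 2 * G * mass_kg / c**2
        area = 4 * math.pi * r_s**2
        return area * c**3 / (4 * G * hbar) / math.log(2)

    m_sun = 1.989e30
    print(f"1 solar mass:   {bh_bits(m_sun):.1e} bits")
    print(f"2 solar masses: {bh_bits(2 * m_sun):.1e} bits (4x as many)")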
The thing I've always wondered about here is how we distinguish between information density, and our ability to calculate information density. How can we tell the difference between the limits of our mathematics, and the actual limitations of the universe?
You have a molecule, something that will "stay put", probably written on a "2d" surface like graphene.
Graphene is composed of lots of little hexagons. Each side of the hexagon can be broken and have an atom attached to it in "3d".
You have 6 sides, and the broken bond can angle "up" or "down".
You can only use 3 of the sides, however, so that each hexagon has its own data and you can tell which data belongs to which hexagon.
This gives you 3 positions, each with 3 states: up, down, or flat.
0.142 nanometers per bond...
That's 27 states per hex, and 190 hexes per nanometer... 36,100 hexes per square nanometer...
I'm sure I screwed up a calculation in there somewhere. But based on current tech this is my answer to what is possible to write. Now reading might be a bit harder at any speed... but hey, this is all theory, right?
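For what it's worth, here's one way to redo that arithmetic, assuming the hexagon side length equals the 0.142 nm bond length (I get noticeably fewer hexes per square nanometer than the figure above, so treat both estimates as rough):

    # Back-of-the-envelope check of the graphene scheme: 3 usable bonds per
    # hexagon, 3 states each (up/down/flat), hexagon side = C-C bond length.
    import math

    bond_nm   = 0.142                                # C-C bond length, nm
    hex_area  = (3 * math.sqrt(3) / 2) * bond_nm**2  # regular hexagon area, nm^2
    hexes_per_nm2 = 1 / hex_area                     # ~19 hexagons per nm^2
    bits_per_hex  = math.log2(3**3)                  # 27 states -> ~4.75 bits
    print(f"{hexes_per_nm2:.0f} hexes/nm^2, "
          f"{hexes_per_nm2 * bits_per_hex:.0f} bits/nm^2")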
One cool idea (I saw it on Charles Stross' blog) is "diamond memory": use a diamond crystal, with two different isotopes of carbon for 1 and 0 bits. This also doesn't feel too unimaginable, in theory. According to Wolfram Alpha[1] this gives 1.75*10^23 bits (20 zettabytes) per CC.
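That figure is easy to reproduce: it's roughly one bit per carbon atom (the 12C/13C choice) times the number of carbon atoms in a cc of diamond. A quick sketch:

    # Carbon atoms per cc of diamond, one bit each (isotope choice 12C/13C).
    density    = 3.51      # g/cm^3, diamond
    molar_mass = 12.011    # g/mol, carbon
    avogadro   = 6.022e23  # atoms/mol

    bits_per_cc = density / molar_mass * avogadro
    print(f"{bits_per_cc:.2e} bits/cc "
          f"(~{bits_per_cc / 8 / 1e21:.0f} zettabytes)")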
This assumes that all information is stored physically.
Let's say you get a telegram. The telegram itself can contain maybe a couple of paragraphs of text. You might think the bandwidth of the channel is at most a kilobyte or so. But there are plenty of other aspects that can raise the amount of information conveyed. Say you get one saying "Short Dow." Just going by the content, you'd have no idea what it was saying. Is there a guy named Dow somewhere who's short? But if you're a stockbroker, all of a sudden there's a whole lot more info there. When you allow for context, information density can approach infinity.
'If you can store infinite information outside of a system, then you can achieve infinite information density inside the system by using what's outside the system as context.' I hope that paraphrasing makes the problem with this line of thinking clear enough. You have to include all the information when calculating absolute density. In your case, the context has to be stored somewhere too.
It's worse than that. There are a finite number of states of a message, and you can only reference that number of states of the outside system. If you think of the message as a pointer containing n bits, then you can only reference the first 2^n positions in the dictionary.
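Here's a toy version of that pointer argument; the codebook contents are made up purely for illustration:

    # An n-bit message can only point at 2^n entries of a shared codebook.
    # All the "extra" information lives in the codebook both sides carry.
    codebook = ["Short Dow", "Buy gold", "Sell everything", "Hold"]  # made-up context

    n_bits = 2                            # message length in bits
    assert len(codebook) <= 2 ** n_bits   # can't reference more than 2^n entries

    message = 0b00                        # the 2-bit "telegram"
    print(codebook[message])              # -> "Short Dow"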
If the outside system is a probability distribution over output messages (a more general case of the dictionary you described) then the problem is synonymous with compression.
If all information has to be accounted for and stored somewhere, and context is part of the information, then you can't store any information without storing all information, everywhere. Because every bit of information exists inside the context of the entire universe.
You're getting very metaphysical here, but reality remains the same even if you expand these principles to the universe. The rules of physics still apply.
I think he has a point, although it's more about the semantics of the term 'information density'.
Shannon information is always measured relative to a receiving context in which the symbols are understood, and information content is related to the inverse of the probability of observing a particular signal, as assessed by the receiver. So from that perspective vinceguidry has a reasonable point.
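Concretely, the self-information of a message x to a given receiver is I(x) = -log2 P(x), where P(x) is the probability that receiver assigns to the message. With some made-up probabilities, the same telegram carries very different amounts of Shannon information to different receivers:

    # Self-information of the same message under two receivers' probabilities.
    # The probabilities below are invented purely for illustration.
    import math

    def self_information_bits(p):
        return -math.log2(p)

    p_stockbroker = 0.05    # "Short Dow" is a fairly routine message to them
    p_layperson   = 1e-6    # nearly inexplicable without the context
    print(f"stockbroker: {self_information_bits(p_stockbroker):.1f} bits")
    print(f"layperson:   {self_information_bits(p_layperson):.1f} bits")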
However, the question really being asked is something more like 'data density', and that is generally what people are talking about when the term 'information density' is invoked.
Edit: I see that the original article does indeed refer to Data Density, and that HN title is just wrong.
You are Dr. Who. You have a Tardis. It will convert your language into any other language telepathically. Why not create a word that means the summation of everything you know, and say it to a person? They create a word that means the summation of everything they know and everything you know and say it to the next person...
Even better - attribute a meaning, in that language, to refusing to say anything to the next person (remaining silent). Then you can put the summation of everything they know and everything you know into the case where you don't say anything (in that language).
This is similar to making a version of gcc that outputs a Tetris program every time it's asked to compile a 0-byte file, or maybe outputs all of Wikipedia. I mean, sure, you can hide Tetris, or the whole of human knowledge, in 0 bytes this way. But it's not a very useful exercise.
Remaining silent is as good as communicating one bit of information.
Even a 0-byte file has metadata. Even if it is 0 bytes long, GCC knows that it is reading an input file. How did it come to know? Because you communicated some amount of information by initiating the compilation.
> When you allow for context, information density can approach infinity.
I absolutely agree with you. You are my hero.
The amount of information stored in an object (one that is capable of storing at least one bit of information in the traditional sense) depends on the size of the context.
If the object is not even capable of storing one bit of information in the traditional sense, then the amount of information that can be stored is zero.
And for all objects that can store one bit or more of information in the traditional sense, the total amount of information that can be stored in them = the number of bits the object can store + the number of bits that can be stored in the rest of the universe (the context). So any one-bit object can store the same amount of information that can be stored in the entire universe.
And if it turns out that our universe is enclosed in yet another larger universe, then you have to include that as part of the context as well.
If you were a stockbroker, that message would contain less information - and you probably would know it already. It's useful information to a stockbroker, but it would be more informative to almost anyone else, who doesn't encounter such messages daily. As another example, the information in the Rosetta stone was diminished - if more accessible - to its carvers, and a 700 MB disk holds 700 MB regardless of what those bits are.
What you're referring to is compression. High information density is indistinguishable from noise to any recipient who cannot translate it, which is why compression contests regularly require the extractor to be included in the message.
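One concrete way to see it: the compressed payload alone understates the true size, because the recipient also needs the decompressor (and any shared dictionary) to make sense of it. A small sketch with Python's zlib, using a repeated telegram as stand-in data:

    # The compressed bytes are meaningless without the shared context
    # (here, the zlib algorithm itself plus the dictionary it builds).
    import zlib

    text = ("Short Dow. " * 100).encode()
    compressed = zlib.compress(text, 9)
    print(len(text), "bytes raw ->", len(compressed), "bytes compressed")
    # Without zlib on the receiving end, `compressed` is indistinguishable
    # from noise; the "missing" bytes live in the shared decompressor.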
Can anyone put this in terms like 100100 Petabytes or something? Or is the number so large that it really isn't conceivable at this point?
I'll admit that I don't understand the problem completely, but isn't this assuming the absolute maximum with little consideration of actual technology limits? How much does it change when we consider the limits of technology and our ability to store it on said technology?
The number is far too large to have an SI name. Some people have put forward suggestions for new prefixes, but they wouldn't help you understand the number.
Edit: I suppose you could say 1 exayottayottabit, or 100 pebiyobiyobibytes, per cc.
And yes, this relies on using qubits for storage, at greater densities than are achievable.
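Quick sanity check on those prefixes - both readings land near the 10^66 bits/cc figure:

    # 1 exa*yotta*yotta bit vs 100 pebi*yobi*yobi bytes, both vs 10^66 bits.
    exa, yotta = 10**18, 10**24
    pebi, yobi = 2**50, 2**80

    decimal_bits = exa * yotta * yotta            # = 10^66 exactly
    binary_bits  = 100 * pebi * yobi * yobi * 8   # bytes -> bits
    print(f"{decimal_bits:.2e} bits, {binary_bits:.2e} bits")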
Might be nitpicking a bit here, but "information" != "data". It is hard to tell how little or how much information I can extract from any given amount of data. data + context = information
So the theories here assume qubits are the smallest unit of storage. I'm not into physics, so how strong is the evidence that they really are the smallest possible units for storing information?
After all, atoms were said to be the smallest thing at some point.