
>Just the vision data of a baby’s first year easily adds up to petabytes

What encoding is this??




Uncompressed 2x 8K-by-8K, 24 bpp, 24 FPS video. At those numbers it comes to roughly 35 TB per hour, or about 200 PB over a year of 16-hour waking days.
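In rough Python terms (a sketch; "8k by 8k" taken literally as 8192 x 8192 and 24 bpp as 3 bytes per pixel):

    # Back-of-envelope for the raw, uncompressed video figure.
    width = height = 8192        # assumed "8k by 8k"
    bytes_per_pixel = 3          # 24 bpp
    fps = 24
    eyes = 2

    bytes_per_second = eyes * width * height * bytes_per_pixel * fps
    print(bytes_per_second / 1e9)                        # ~9.7 GB/s
    print(bytes_per_second * 3600 / 1e12)                # ~35 TB/hour
    print(bytes_per_second * 3600 * 16 * 365 / 1e15)     # ~200 PB per waking year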


It’s always interesting to think about the exact technological analog for our biological sensors, but I believe that our vision would be way less than that in terms of raw data. We have a super high-res area at the center of vision (the fovea), but the rest is extremely low resolution (but with high movement and light sensitivity).

I think we could reasonably say that if an optic nerve has about 1 million axons on average, and they can fire at 250 Hz at most, that's 250 Mbit/s, or ~31 MB/s per eye, of uncompressed data as an upper bound.
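As a sketch of that back-of-the-envelope math (the ~1M axons per optic nerve and the 250 Hz cap are the assumptions):

    # Upper bound on optic-nerve bandwidth under the 1-bit-per-spike assumption.
    axons_per_nerve = 1_000_000   # ~1M ganglion-cell axons per optic nerve
    max_rate_hz = 250             # assumed peak firing rate
    bits_per_spike = 1            # spike / no spike per tick

    bits_per_second = axons_per_nerve * max_rate_hz * bits_per_spike
    print(bits_per_second / 1e6)        # 250 Mbit/s
    print(bits_per_second / 8 / 1e6)    # ~31 MB/s per eye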


I don't think that's the right way to calculate bandwidth in this case. You assume 1 bit per neuron per tick, but the time at which it fires within the tick also carries information, and that's missing from your multipliers.

Also, there's no reason to use the optic nerve as the input, since its signal is already precompressed. You should be counting photoreceptors instead (~120,000,000).


I don't think it matters that much. The firing itself takes a couple of milliseconds, and there's a refractory period of about a millisecond; I'm approximating 250 Hz as the maximum firing rate. You're arguing that the neuron can encode more information in the phase (e.g. fire, recover, wait 2 ms, fire), but I think information theory tells us the 250 Hz still bounds the information. Maybe there's a small constant factor, but I don't think it changes the order of magnitude.
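A rough sketch of that constant-factor point, assuming (purely for illustration) 1 ms timing resolution inside each 4 ms slot:

    from math import log2

    slot_ms = 4.0                    # firing + refractory time -> ~250 Hz max rate
    slots_per_second = 1000.0 / slot_ms

    # Pure rate code: each 4 ms slot is just spike / no-spike.
    rate_code_bps = slots_per_second * 1.0            # 250 bits/s per axon

    # Phase code: spike timing resolvable to 1 ms within the slot, so each slot
    # has 5 distinguishable states (silent, or a spike in 1 of 4 bins).
    timing_resolution_ms = 1.0
    states = 1 + int(slot_ms / timing_resolution_ms)
    phase_code_bps = slots_per_second * log2(states)  # ~580 bits/s

    print(phase_code_bps / rate_code_bps)   # ~2.3x: a constant factor, same order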

I don't believe it's precompressed, as it hasn't been processed by the visual cortex yet, no? Aren't the photoreceptors simply an artifact of the "sensor design"? E.g. if the refractory period of a photoreceptor is 100x that of the neuron (or you simply need to cover a certain area: outside the fovea you probably have tons of receptors attached to a single neuron, and only a few per neuron inside it), you'd hook up 100 photoreceptors per neuron to use its full capacity. I think this is less compression and more combining a bunch of low-information channels into one higher-information channel.

All we really care about here is the amount of information reaching the brain, not what the physical eye is capable of receiving, so I think using the nerve makes the most sense. There's a direct analogy: we don't really care how many photosites the CCD in a camera has; we only care about how much information is in the video coming out of it.


I disagree with the latter point: unlike a camera sensor, the cells in question already include trainable parameters for every single one of the 100M+ inputs.

But it matters little, as even with a 100x reduction the estimate blows GPT's training data out of the water within the first year, making GPT look very sample-inefficient in comparison.

As for the signal, I'm as much of a layman here as it gets (only a hazy idea of the relationship between information and frequency), but don't those bandwidth limits only apply to fixed-rate sampling? E.g. there's a basically infinite (short of Planck limits) number of values between 4 ms and 5 ms, and as long as the receiver can separate them, they can encode information?

To put it another way: if the neurons can control the delay of the impulse peak down to a nanosecond, shouldn't the limit be measured against the 10^9 Hz of that timing control rather than the 250 Hz maximum firing rate?


I don't think the connections between the receptors (rods and cones) and the ganglion cells are "trainable"? You seem to be assuming brain-like learning functionality inside the eye. I'm not a biologist, but I don't think that's the case, unless you count evolution as training. If it were true, wouldn't people have wildly varying foveas? My sense is that these connections are anatomical and not learned, in exactly the same way that the number of arms, legs or teeth is not trainable.

Regarding the nanosecond point: I don't believe that's how information works, and there are obvious theoretical problems with the idea of an infinite-capacity channel, not to mention the practical ones (propagation variability, lack of a reference clock, etc.). There may be some optimizations, but in general the frequency (or rather the bandwidth, which is where the generic computing term comes from) bounds the information capacity, and phase modulation doesn't magically change this (it's actually what many radio systems use).
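A sketch of that bound: even with arbitrarily fine control on the sender's side, what the receiver can actually resolve (the timing jitter) caps the rate. The jitter values below are purely illustrative:

    from math import log2

    slot_s = 4e-3                  # at most one spike per ~4 ms -> 250 Hz
    spikes_per_second = 1.0 / slot_s

    def max_bits_per_second(timing_jitter_s):
        # Each spike can land in at most slot/jitter distinguishable positions,
        # so it carries at most log2(slot/jitter) bits, no matter how finely the
        # sender can place it.
        return spikes_per_second * log2(slot_s / timing_jitter_s)

    print(max_bits_per_second(1e-3))   # ~1 ms jitter: ~500 bits/s per axon
    print(max_bits_per_second(1e-9))   # 1 ns jitter (implausible): ~5,500 bits/s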


31 MB/s is already in bytes, so that's ~112 GB/hr per eye, or about 650 TB per eye per year assuming 16 hours awake per day. Call it ~1.3 PB for both eyes: the original "petabytes" claim just barely holds at this upper bound, though it no longer fits snugly on a single SSD ;)
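The conversion, sketched out (16 waking hours per day and 365 days assumed):

    # Converting the ~31 MB/s per-eye upper bound into a yearly total.
    mb_per_second = 31.25                        # 250 Mbit/s / 8
    gb_per_hour = mb_per_second * 3600 / 1000    # ~112 GB/hour per eye
    tb_per_year = gb_per_hour * 16 * 365 / 1000  # ~657 TB per eye per year

    print(gb_per_hour, tb_per_year, 2 * tb_per_year / 1000)  # ~1.3 PB both eyes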


Maybe if we store text data as sequences of 10k x 10k PNGs (one for each letter) and add an image recognition layer it would improve LLM perf


> > Just the vision data of a baby’s first year easily adds up to petabytes

Just to add to this, the human brain also encodes quite a lot of evolutionary lessons. We didn't have to learn edge detectors.
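As a toy illustration of a prior that doesn't need to be learned: a fixed Sobel kernel detects edges with zero trainable parameters (the analogy to actual retinal/V1 wiring is loose).

    import numpy as np

    # A fixed Sobel kernel: a vertical-edge detector with no learned parameters.
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)

    def edge_response(image):
        """Slide the fixed kernel over a grayscale image (cross-correlation)."""
        h, w = image.shape
        out = np.zeros((h - 2, w - 2))
        for i in range(h - 2):
            for j in range(w - 2):
                out[i, j] = np.sum(image[i:i + 3, j:j + 3] * sobel_x)
        return out

    # Toy image: dark left half, bright right half.
    img = np.hstack([np.zeros((8, 4)), np.ones((8, 4))])
    print(np.abs(edge_response(img)).max(axis=0))   # peaks at the brightness seam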



