If you think humans are "two stereoscopic cameras and a limited sonar", you need to learn more about humans.
Not only are our eyes far more versatile than the most sophisticated camera equipment, we constantly maintain a mental
model of the world that allows us to focus only on the things that matter, and accurately infer state. This affects everything from
vision, to hearing, to proprioception and situational awareness.
Machines don't come close, which is why the quality of the sensors matters so much more. This trope gets repeated in nearly every thread about self-driving cars, and reveals mainly that engineers don't know
much about biology.
Our eyes may be versatile, but I've never heard of a man-made camera that comes equipped with a sophisticated image-regeneration algorithm because the film/CCD has a giant blind spot about a third of the way out from the center.
(...And now imagine it also comes with an automatic jitter motion generator and sophisticated time-differential logic because the sensor zeros out if exposed to the same color for ten seconds.)
We have terrible sensors. We just evolved to work around the problems.
That's no different from the problems you need to engineer around in a CCD or CMOS sensor - stuff like dark current, shot noise, read noise, the Bayer RGB filter effectively dividing resolution by three, and an upper limit on integration time, which is almost the same thing as "sensor zeroes out if exposed to the same color for ten seconds".
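(For anyone who hasn't worked with image sensors, the usual back-of-envelope noise model looks something like the sketch below. The quantum efficiency, dark current, and read noise figures are illustrative placeholders, not any particular sensor's spec sheet.)

```python
import math

# Rough SNR model for a single CCD/CMOS pixel. All parameter values are
# illustrative placeholders: quantum efficiency, dark current (electrons/s),
# and read noise (electrons RMS).
def pixel_snr(photons_per_s, exposure_s, qe=0.5, dark_e_per_s=5.0, read_noise_e=3.0):
    signal = photons_per_s * exposure_s * qe   # collected photoelectrons
    shot_var = signal                          # shot noise is Poisson: variance = signal
    dark_var = dark_e_per_s * exposure_s       # dark current is Poisson too
    total_noise = math.sqrt(shot_var + dark_var + read_noise_e**2)
    return signal / total_noise

# Longer integration improves SNR, up to the full-well/saturation limit:
for t in (0.001, 0.01, 0.1):
    print(f"{t*1000:5.1f} ms exposure -> SNR ~ {pixel_snr(1e6, t):.0f}")
```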
OTOH, the eye's selective resolution and instant slew+focus of the high resolution spot together give it an effective resolution north of 600 megapixels. The auto-exposure is also pretty bad ass.
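That 600-megapixel-ish figure presumably comes from a back-of-envelope estimate along these lines: take foveal acuity and pretend it applies across the whole field the eye can sweep. Both numbers below are assumptions for illustration, not measurements.

```python
# Back-of-envelope "effective resolution" of the eye: assume foveal acuity
# (~0.3 arcminutes per pixel, an assumption) swept over a ~120 x 120 degree
# field (also an assumption).
fov_deg = 120
arcmin_per_pixel = 0.3

pixels_per_side = fov_deg * 60 / arcmin_per_pixel      # degrees -> arcminutes -> "pixels"
print(f"~{pixels_per_side**2 / 1e6:.0f} megapixels")   # ~576 MP
```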
>Not only are our eyes far more versatile than the most sophisticated camera equipment,
This is inaccurate. You do need a fairly good camera to beat the human eye, but one that costs only a couple hundred dollars. Cameras not only beat the human eye in total resolution and color depth, they beat its peak resolution, and do so over a much, much larger area.
One of the main things you hear about cameras is that they have inferior dynamic range, but that is also untrue. A camera with an actual aperture (the analogue of the eye's iris) is far superior. Cinema cameras are more than an order of magnitude better than the human eye, but even small cameras can be as good as or better than eyes.
Reports that the human eye can detect single photons are highly exaggerated. Technically true, but on a very different basis than a camera. Compared fairly, the camera will come out ahead.
The biggest "real" difference between cameras and eyes is field of view. Personally I have a 220 degree field. Although technically with two circular fisheye lenses you can get almost to 360 degrees, even individual eyes can have fields of view greater than 180. We know how it works and could reproduce it, but it's still very impressive and I don't think normal lenses do it.
Even a cheap webcam has higher "fps" and less blur than a human eye. Beyond that, cameras are also much better at estimating velocity.
Some people are more sensitive to colors, especially toward infrared, and a number of people are tetrachromats. Cameras can naturally see far deeper into the infrared than humans, although tetrachromatic/hyperspectral cameras aren't exactly common.
>we constantly maintain a mental model of the world that allows us to focus only on the things that matter, and accurately infer state.
This is not an advantage. What it means is that we aren't paying attention anywhere else. It's a very coarse first-order optimization: cut out a ton of data and only pay attention to what you're already paying attention to. It's tautological, and it means you don't get to apply that optimization algorithmically - which is a huge advantage computers have.
- I am very skeptical that even a cinema camera can exceed the dynamic range of the human eye in one exposure. It is probably more important that features never saturate or go black when the image is properly exposed than that exposures are possible in a wider range of conditions. Also, cinema cameras are as big as toaster ovens and cost kilodollars.
- You're saying the peak angular resolution of a camera with a near 180 degree field of view exceeds the angular res of clear human foveal vision?
- Dark sensitivity: yes, you can detect a lot more light if you make the aperture 200x the size of a pupil. Could a space-efficient camera compete?
- To fairly claim both higher fps AND less blur, you have to show me a camera the size of a human eye, with the dynamic range of a human eye, with the resolution of a human eye, and the low-light sensitivity of a human eye, that can produce more than about 60 fps. You can certainly have more fps, but often at the cost of size or resolution or light sensitivity. You can certainly have less blur, but then you need a bigger aperture to let in more light so you can shorten the exposure (see the quick exposure sketch below). Are you saying a small camera can beat the eye on all counts?
So yes, if we stud a car with RED Weapon 8Ks, we'll probably exceed human perception. Can the same be done cheaply, and hidden in the body of a car, with current technology?
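(To put rough numbers on that blur/aperture trade-off: light gathered per frame scales with shutter time over the square of the f-number. The specific f-numbers and frame times below are just illustrative.)

```python
# Light gathered per frame scales roughly with shutter_time / f_number^2.
# ISO then amplifies whatever was gathered - along with the noise.
def relative_exposure(f_number, shutter_s):
    return shutter_s / f_number**2

base = relative_exposure(2.8, 1/60)    # a 60 fps baseline at f/2.8
fast = relative_exposure(2.8, 1/240)   # 4x less motion blur...
print(fast / base)                     # ...but only 1/4 of the light (0.25)

# Getting that light back takes a lens ~2 stops faster (physically bigger glass)
# or ~4x the ISO (more noise):
print(relative_exposure(1.4, 1/240) / base)   # ~1.0
```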
The human eye has a dynamic range of 10-14 f-stops[1]; a Nikon D810 has 14.8.
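Stops are just powers of two of the brightest-to-darkest ratio a sensor can hold in one exposure, so for scale those figures work out to roughly:

```python
# Dynamic range in stops is log2 of the brightest:darkest ratio captured
# in a single exposure, so:
for stops in (10, 14, 14.8):
    print(f"{stops} stops ~ {2**stops:,.0f}:1")   # 1,024:1 / 16,384:1 / ~28,500:1
```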
The human eye has ~10 megapixel resolution over a 20 degree view. Over 60 degrees that's ~52 megapixels. There are DSLRs that exceed that. I don't mean cameras have that peak resolution over 180 degrees, just over more than 20 degrees.
Cameras are certainly comparable to eyes for space efficiency. Take away the frame, battery, and electronics, simplify the lensing, and you have a very small device indeed. It would be better if we had spherical sensors, of course. The human eye's sensitivity can get up to ~1000 ISO after a long adjustment period. Cameras are much better.
>To fairly claim both higher fps AND less blur, you have to show me a camera the size of a human eye, with the dynamic range of a human eye, with the resolution of a human eye, and the low-light sensitivity of a human eye, that can produce more than about 60 fps. You can certainly have more fps, but often at the cost of size or resolution or light sensitivity. You can certainly have less blur, but then you need a bigger aperture to let in more light so you can shorten the exposure. Are you saying a small camera can beat the eye on all counts?
Do cars not beat humans because they are bigger than people? There is no need for a camera the size and performance of the eye. Machine vision has no need of that resolution or dynamic range, and photography/film have no need of the size. Why would anyone make a smartphone with 50 megapixels? That's impractically large.
Even so, take the Google Pixel's camera module. By volume it's probably less than a tenth the size of a human eye. It's got comparable resolution (>13 MP), though lower peak resolution. The dynamic range is at the low end of the human eye's range despite the massive size difference. The fps and ISO are better. All it needs is a fisheye lens. If a camera company wanted to exceed the human eye, they could certainly do it.
Human visual hardware isn't very good: slow integration time, a very small (~10 degree) high-resolution area with mediocre resolving power (~1 MOA). Good dynamic range, but more than is really necessary for driving. A few tricks like ocular microtremors.
Everything else you said is a matter of software, not hardware.
Perhaps don't be so condescending unless you actually have a really good point. You don't need to know much about biology to point out that humans get by without very fancy sensor hardware.
OK, but we developed our mental model of the world through those two cameras. I agree that we still have a ways to go, but the fact is that two cameras and processing are all that is needed. We can do better with more sensors, though.
If a company's solution to self-driving cars with just two cameras requires developing a machine-learning "model of the world" (I don't think it does, but it would make it a much harder research problem), then they are going to be years behind everyone else in shipping a self-driving car.
If a company's solution is able to maintain a real-time model of the world on top of which reasoning and reaction at human-level speeds is possible, then never mind self-driving cars - that's priceless!
That's the understatement of the week right there. We've been working on that tiny little problem of "...and processing" for about a century (wall time), yet the result is still quite rudimentary.
I think there's an important difference in requirements for engineers: an artificial system shouldn't need years and years of constant training. Sure, we make do with two eyes and ears but if I need to check my blind spot, I still need to turn my head, taking my eyes off where I'm going.