Hacker News

I really do not believe lidar is going to be the right answer for self driving cars. We already know that you can be successful at driving a car in various weather conditions with two stereoscopic cameras on a swivel and limited sonar (humans). Cameras, infrared cameras, and sonar will likely be much cheaper and just as capable as lidar.



If you think humans are "two stereoscopic cameras and a limited sonar", you need to learn more about humans.

Not only are our eyes far more versatile than the most sophisticated camera equipment, we constantly maintain a mental model of the world that allows us to focus only on the things that matter, and accurately infer state. This affects everything from vision, to hearing, to proprioception and situational awareness.

Machines don't come close, which is why the quality of the sensors matters so much more. This trope gets repeated in nearly every thread about self-driving cars, and reveals mainly that engineers don't know much about biology.


Our eyes may be versatile, but I've never heard of a man-made camera that comes equipped with a sophisticated image regeneration algorithm because the film/CCD has a giant black hole about 1/3 from the center.

(...And now imagine it also comes with an automatic jitter motion generator and sophisticated time-differential logic because the sensor zeros out if exposed to the same color for ten seconds.)

We have terrible sensors. We just evolved to work around the problems.


That's no different from the problems you need to engineer around in a CCD or CMOS - stuff like dark current, shot noise, read noise, RGB filters effectively dividing resolution by 3, an upper limit on integration time which is almost the same as "sensor zeroes out if exposed to the same color for ten seconds".

OTOH, the eye's selective resolution and instant slew+focus of the high resolution spot together give it an effective resolution north of 600 megapixels. The auto-exposure is also pretty bad ass.


>Not only are our eyes far more versatile than the most sophisticated camera equipment,

This is inaccurate. You do need a fairly good camera to beat the human eye, but that's only a couple hundred dollars. Cameras not only beat the human eye in total resolution and color depth, they beat its peak resolution over a much, much larger area.

One of the main things you hear about cameras is that they have inferior dynamic range, but that is also untrue. A camera with an actual aperture (as the eye has its iris) is far superior. Cinema cameras are more than an order of magnitude better than the human eye, but even small cameras can be as good as or better than eyes.

Reports that the human eye can detect single photons are highly exaggerated. Technically true, but on a very different basis than a camera. Compared fairly the camera will come out ahead.

The biggest "real" difference between cameras and eyes is field of view. Personally I have a 220 degree field. Although technically with two circular fisheye lenses you can get almost to 360 degrees, even individual eyes can have fields of view greater than 180. We know how it works and could reproduce it, but it's still very impressive and I don't think normal lenses do it.

Even a cheap webcam has a higher "fps" and less blur than a human eye. Cameras are also much better at estimating velocity, even apart from that.

Some people are more sensitive to colors - especially infrared - and a number of people are tetrachromats. Cameras can naturally see far deeper into the infrared than humans, although tetrachromatic/hyperspectral cameras aren't exactly common.

>we constantly maintain a mental model of the world that allows us to focus only on the things that matter, and accurately infer state.

This is not an advantage. What it means is that we aren't paying attention elsewhere. It's a very coarse first-order optimization: throw away most of the data and attend only to what you've already decided matters. That's circular, and it isn't something you can tune algorithmically. This is a huge advantage for computers.


[citation needed].

- I am very skeptical that even a cinema camera can exceed the dynamic range of the human eye in one exposure. It is probably more important that features never saturate or go black when the image is properly exposed than that exposures are possible in a wider range of conditions. Also, cinema cameras are as big as toaster ovens and cost kilodollars.

- You're saying the peak angular resolution of a camera with a near 180 degree field of view exceeds the angular res of clear human foveal vision?

- Dark sensitivity: Yes, you can detect a lot more light if you make the aperture 200x the size of a pupil. Could a space-efficient camera compete?

- Fairly saying there is higher fps AND less blur in a camera means you have to show me a camera the size of a human eye, with the dynamic range of a human eye, with the resolution of a human eye, and the low light sensitivity of a human eye that can produce more than about 60 fps. You can certainly have more fps, but often at the cost of size or resolution or light sensitivity. You can certainly have less blur, but then that means you need a bigger aperture to let more light in so you can crank your shutter down. Are you saying a small camera can beat the eye on all counts?

So yes, if we stud a car with RED Weapon 8Ks, we'll probably exceed human perception. Can the same be done cheaply and hidden in the body of a car with current technology?


The human eye has a dynamic range of 10-14 f-stops[1], a Nikon D810 has 14.8.

The human eye has ~10 megapixels of resolution over a 20 degree view; over 60 degrees that's ~52 megapixels. There are DSLRs that exceed that. I don't mean cameras have that peak resolution over 180 degrees, just over more than 20 degrees.
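The back-of-envelope arithmetic behind these figures can be sketched as below. The ~0.4 arcminute acuity is an assumption chosen to roughly match the ~10 MP figure; real acuity falls off sharply outside the fovea, so scaling a uniform acuity to wider fields overestimates the total:

```python
import math

def megapixels(fov_deg, acuity_arcmin):
    """Pixel count, in megapixels, for a square field of view
    resolved at a uniform angular resolution (acuity)."""
    px_per_side = fov_deg * 60 / acuity_arcmin  # field in arcmin / acuity
    return px_per_side ** 2 / 1e6

def stops(contrast_ratio):
    """Dynamic range in f-stops: each stop is a doubling of light."""
    return math.log2(contrast_ratio)

print(megapixels(20, 0.4))  # ~9 MP over a 20 degree field
print(stops(16384))         # 14 stops corresponds to ~16000:1 contrast
```

The same conversion explains the f-stop comparison above: 14 stops of dynamic range means the brightest resolvable feature is about 2^14 times brighter than the darkest.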

Cameras are certainly comparable to eyes for space efficiency. Take away the frame, battery, and electronics, simplify the lensing, and you have a very small device indeed. It would be better if we had spherical sensors, of course. The human eye's sensitivity can get up to ~1000 ISO after a long adjustment period. Cameras are much better.

[1]: http://www.cambridgeincolour.com/tutorials/cameras-vs-human-...

>Fairly saying there is higher fps AND less blur in a camera means you have to show me a camera the size of a human eye, with the dynamic range of a human eye, with the resolution of a human eye, and the low light sensitivity of a human eye that can produce more than about 60 fps. You can certainly have more fps, but often at the cost of size or resolution or light sensitivity. You can certainly have less blur, but then that means you need a bigger aperture to let more light in so you can crank your shutter down. Are you saying a small camera can beat the eye on all counts?

Do cars not beat humans because they are bigger than people? There is no need for a camera the size and performance of the eye. Machine vision has no need of that resolution or dynamic range, and photography/film have no need of the size. Why would anyone make a smartphone with 50 megapixels? That's impractically large.

Even still, take the Google Pixel. By volume it's probably less than a tenth the size of the human eye. It's got comparable resolution (>13 MP), though lower peak resolution. The dynamic range is at the low end of a human eye despite the massive size difference. The FPS and ISO are better. All it needs is a fisheye lens. If a camera company wanted to exceed the human eye, they could certainly do it.


Human visual hardware isn't very good. Slow integration time, a very small (~10 degree) high-resolution area with mediocre resolving power (~1 arcminute). Good dynamic range, but more than is really necessary for driving. A few tricks like ocular microtremors.

Everything else you said is a function of software, not hardware.

Perhaps don't be so condescending unless you actually have a really good point. You don't need to know much about biology to point out that humans get by without very fancy sensor hardware.


"Human visual hardware isn't very good."

What an incredibly clumsy statement.


It just isn't, even in the animal world. The only thing we have an edge in is colour resolution, and even that not always.


OK, but we developed our mental model of the world through those two cameras. I agree that we still have a ways to go, but the fact is that two cameras plus processing is all that is needed. And we can do better with more sensors.


If a company's solution to self driving cars with just two cameras requires developing a machine learning "model of the world" (I don't think it does, but it does make it a much harder research problem), then they are going to be years behind everyone else in shipping a self-driving car.


If a company's solution is able to maintain a real-time model of the world on top of which reasoning and reaction at human-level speeds is possible, never mind the driving cars - that's priceless!


That's the understatement of the week right there. We've been working on that tiny little problem of "...and processing" for about a century (wall time), yet the result is still quite rudimentary.


> reveals mainly that engineers don't know much about biology.

reveals mainly that people who don't work on self driving cars know nothing about self driving cars.


why would you make assumptions like that?


I think there's an important difference in requirements for engineers: an artificial system shouldn't need years and years of constant training. Sure, we make do with two eyes and ears but if I need to check my blind spot, I still need to turn my head, taking my eyes off where I'm going.


> "stereoscopic cameras on a swivel" !?

The movement, focus, and light-aperture mechanisms of the human eye technologically outstrip this description so ridiculously that I have to assume you're joking. Try designing and building a system that can change heading and focus as rapidly as the human eye, while handling nearly as high an input dynamic lighting range. And that's before considering the physical space efficiency and impossibly low power consumption the human eye needs to do its job, compared to any man-made sensor.

AI discussions often get caught up purely in processing and learning, but the mechanical and sensory systems of humans are also non-trivial to integrate into machines, and not to be underestimated.


>The movement, focus, and light-aperture mechanisms of the human eye technologically outstrip this description so ridiculously that I have to assume you're joking. Try designing and building a system that can change heading and focus as rapidly as the human eye, while handling nearly as high an input dynamic lighting range. And that's before considering the physical space efficiency and impossibly low power consumption the human eye needs to do its job, compared to any man-made sensor.

and I'm sure the majority of what you described is completely unnecessary for the purposes of self driving cars. I hope you enjoyed the few moments of pointless outrage you had when writing your comment.


It's really a question of timeframe. You are completely right that we theoretically only need two cameras. But by limiting yourself to 2 cameras (or cameras + other non-lidar sensors) you're just making the problem a lot harder for yourself. That's a lot more computer vision tasks your research team is going to have to solve, a lot more models to develop, a lot more failure cases (less redundancy).

Some of these vision problems have uncertainty over how long it will take to reach human-level accuracy. Whereas if you go with lidar, I don't think there's any doubt the costs will come down. IMO, a $5k sensor cost is also nothing if your car is actually fully self-driving.


Theoretically you only need one camera - there are plenty of people who've lost sight in one eye who learn how to drive and get their driver's licence.


The more I think about it, the more I am convinced that we humans are _un_suited for driving safely, given the equipment we have. Because we have only "two stereoscopic cameras on a swivel," we have to rely on a complicated system of mirrors -- often at least four (rear-view, two side-views, and a small blind spot checker) -- to get a full view of our surroundings, and we have to constantly swivel our eyes and neck to flip between front view, mirrored views, and instrument panels. We are unable to monitor all of those things at once.

Moreover, we have only an extremely crude way of communicating with other drivers to coordinate our actions. We have two turn signals, a brake signal, hazard lights, and a horn, none of which is guaranteed to be used appropriately. A lot of traffic jams and crashes could be avoided if our vehicles could broadcast their intended actions to each other in greater detail and with more reliability.
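To make the "broadcast intended actions" idea concrete, here is a purely hypothetical sketch of what such an intent message could look like. The fields and names are invented for illustration and don't follow any real V2V standard (real ones, like the DSRC/C-V2X message sets, are far more involved):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class IntentMessage:
    """Hypothetical vehicle-to-vehicle intent broadcast."""
    vehicle_id: str
    action: str        # e.g. "lane_change_left", "hard_brake"
    starts_in_ms: int  # how soon the maneuver begins
    speed_mps: float   # current speed, meters per second

msg = IntentMessage("car-42", "lane_change_left", 1500, 27.0)
wire = json.dumps(asdict(msg))  # what would actually go over the air
print(wire)
```

Compared to a blinking turn signal, even this toy message carries timing and speed, which is the kind of detail the comment above is pointing at.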

Crash data certainly bears out the fact that humans are not well suited for driving. While you say "We already know that you can be successful at driving a car in various weather conditions," I would instead say that in the aggregate, we do better than one would think, given our constraints, at avoiding crashes -- but we humans crash a lot. Millions of times a year in the U.S., killing tens of thousands, and resulting in some sort of injury in about half of cases.

So, whatever technology is being considered for self-driving cars, comparing it with the "technology" of human drivers is a natural baseline, but I would think that the greatest gains would come from breaking away from the equipment limitations humans have.


> comparing it with the "technology" of human drivers is a natural baseline,

I don't think that should be the baseline. You can have driver assistance like collision avoidance with human drivers, and lane correction too. Humans can be helped to drive more safely without giving complete control away to machines. That should be the baseline.


why does everyone here think I was suggesting that we should limit self driving cars to human capabilities? My comment was simply a comment on lidar v cameras. Obviously the cameras should cover 360 degrees around the car and have whatever additional sensors necessary for safety. Which would all be cheaper than lidar.


Limited sonar? I'm not sure what sonar has to do with driving. Between luxury cars with soundproofed cabins and the constant road/wind noise in a convertible, I don't think sound plays any real role in driving.


I didn't read the comment you're responding to, but auditory input is an important part of a driver's toolset. I often hear an ambulance before I see it, and sound helps me diagnose whether I have a flat tire or when I need to change gears. Heck, car horns communicate information only through sound.


> I didn't read the comment you're responding to

Go and read it, it's just about 30 cm above


Meh, no


It can play a role, especially when cars "come out of nowhere."


That's kind of Elon Musk's take as well:

>“Once you solve cameras for vision, autonomy is solved; if you don’t solve vision, it’s not solved … You can absolutely be superhuman with just cameras.”

though I can see the argument for using a bunch of other sensors as well especially given that current self driving AI is a lot less smart than humans.


No. The bar for new technology to meet is to improve upon safety outcomes, not maintain.


I wasn't suggesting limiting it to two cameras, but cameras should be more than enough to solve self-driving. You don't need lidar, which costs thousands of dollars.


You are right. Lidar is the right answer _right now_, because the alternative requires too much computation. SLAM, Visual Odometry, Structure from Motion, Multiple View Stereo, PMVS, CMVS, or even raw NNs will replace Lidar when computation gets cheap and compact enough.
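For a sense of where the computation goes: the geometry at the core of camera-based depth recovery is cheap triangulation; the expensive part is finding per-pixel correspondences between views. A minimal sketch of the stereo depth formula (the parameter values are illustrative, not from any real rig):

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth of a point seen by two rectified cameras.
    Similar triangles give: depth = focal length * baseline / disparity."""
    return focal_px * baseline_m / disparity_px

# e.g. 1000 px focal length, cameras 0.5 m apart, 10 px disparity
print(stereo_depth(1000, 0.5, 10))  # 50.0 m to the point
```

Note the trade-off this exposes: at long range the disparity shrinks toward zero, so small matching errors blow up the depth estimate - one reason matching has to be done so carefully (and expensively), and one argument for lidar's direct ranging.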


Even that isn't sure. Certification is a huge barrier, and simple systems win there. And once lidar has proven itself and gained enough miles, it would be hard to change, just for cost savings.


Some of the things you mentioned, like SLAM, are not tied to cameras and are used with LIDAR as well.



