While Uber followed Google’s cars closely, it was Tesla and Elon Musk that the duo discussed most frequently.
9/14/2016 Levandowski: Tesla crash in January … implies Elon is lying about millions of miles without incident. We should have LDP on Tesla just to catch all the crashes that are going on.
9/22/2016: We’ve got to start calling Elon on his shit. I'm not on social media but let's start "faketesla" and start give physics lessons about stupid shit Elon says like [saying his cars don’t need lidar]"
Does anyone know what they're referencing here? I don't take Elon to be a person who lies; his character seems too strong for that - he understands public perception and seems to care deeply about it.
Multiple images can be used to compute a 3D point cloud. This is computer vision technique that has been around for many years. The challenge is that it's a passive sensor: the cameras count on ambient light illuminating the scene. So at night, in bright light (which can blow the images out), in shadow, etc., you can have voids in the reconstruction. If a person is in one of those voids, bad things can happen.
But cameras now cost under $1 each in volume (thanks, smartphones!), so they're dirt cheap. The main components of an image-based point-cloud extraction system are therefore cheap. Add a GPU-enabled system for processing (it's quite compute heavy) and you are set. OpenCV has the algorithms needed.
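For a sense of what that looks like in practice, here's a minimal OpenCV sketch that turns a rectified stereo pair into a point cloud. The image files, focal length, and baseline are placeholder assumptions, not real calibration values:

```python
import cv2
import numpy as np

# Hypothetical rectified left/right images from a calibrated camera pair.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching finds the per-pixel disparity between the views.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM outputs fixed-point values

# Assumed calibration: focal length in pixels, baseline in metres, principal point at image centre.
f, baseline = 700.0, 0.12
cx, cy = left.shape[1] / 2.0, left.shape[0] / 2.0

# Reprojection matrix Q maps (x, y, disparity) into 3D camera coordinates.
Q = np.float32([[1, 0, 0, -cx],
                [0, 1, 0, -cy],
                [0, 0, 0,  f],
                [0, 0, 1.0 / baseline, 0]])
points = cv2.reprojectImageTo3D(disparity, Q)

# Pixels with no usable match (textureless or occluded regions) have invalid
# disparities -- these are the "voids" described above.
valid = disparity > 0
cloud = points[valid]          # (N, 3) array of 3D points
print(cloud.shape)
```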
LiDAR is an active sensor: the laser "illuminates" the target area. This adds cost, but that cost is coming down quickly. Also, since the sensor delivers 3D points directly (not images), the computational cost of extracting depth from images is saved, so less CPU/GPU is required.
Levandowski is a LiDAR guy. It's what he believes is the best solution for the problem.
Some feel that LiDAR is not a fit either, as it doesn't work well in rain/fog/sleet/snow. There was a YouTube video showing a self-driving car running a test course in clear weather and again in the rain. You would not want to be a pedestrian during the rain test.
In reality this is all engineering dick waving. Prices will come down and the sensor payload will converge.
For full autonomy it is likely that cameras, LiDAR, Radar, and sonar all will be used. They all bring some advantage to the problem that addresses a weakness of one of the other sensor techs.
Oh yeah, and Levandowski is a complete prick. Someone should teach him about IP theft and give him a prison life lesson. He's going to need it.
Incidentally, Musk's take: "The whole road system is meant to be navigated with passive optical or cameras and so once you solve camera vision then autonomy is solved if you don't solve vision it's not solved so that that's why our focus is so heavily on having a vision neural net that's very effective for road conditions." https://www.youtube.com/watch?v=gv7qL1mcxcw&feature=youtu.be...
Why would you want to limit yourself to passive cameras and make your life harder? This is like limiting yourself to flapping bird wings to make airplanes.
No, it's like limiting yourself to using skis to move down a ski slope. He's right: the roads are designed to be navigated using vision. Signage, regulations, paint, curbs, etc. There's no proof that you could safely navigate the roads with LIDAR, but we prove every time we drive that you can do it with vision.
And sure, there might be a better way to get down a ski slope, but skis would be a pretty good starting point. And they guarantee you don't end up in an impossible situation because you're doing things a fundamentally different way than the system expects.
They're designed to be navigated using human vision, which has very different characteristics than machine vision in terms of dynamic range, resolution, processing pipeline, inferring details about the scene from past experience, etc.
Because not everyone can afford to spend $20k on extra sensors that make the car 1% safer. And holding back autonomous cars until they're perfect can kill more people than near-perfect autonomous cars. It's an economic tradeoff like any other.
His thesis is that relying on cameras makes it easier, since the entire preexisting road network is literally designed around optical navigability.
Adding other sensors isn't free. Every minute you spend on developing techniques to process inputs from other sensors, not to mention integrating their conclusions with that of other sensors, is time, money, and energy you could have used to improve your optical system.
I'm not saying I necessarily agree (though I find his position intuitively compelling), but he clearly thinks that it's easier, faster, and cheaper to bring an optical-only system to a point of reliability than it is to bring a mixed-sensor system to the same point.
It's interesting if you consider that we have two eyes [cameras] and we drive under all conditions; under bad driving conditions, if you're sane, you slow down or even stop completely and pull over with your four-ways on. When I've been in very heavy downpours on the highway, it feels like I'm only able to keep driving because I can follow the flow of lights ahead of me as a set of guidance points. Autonomous vehicles could likely do a much safer job of this.
So just from reading the pulled quote: is he saying current road users, i.e. humans, navigate using a passive optical system (our eyes take in photons; they don't emit lasers)? But our eyes are also components of a general intelligence, so does "solving vision" entail developing a general intelligence?
> Multiple images can be used to compute a 3D point cloud. [...] Add a GPU enabled system to process (it's quite compute heavy) and you are set.
It requires enough information ("features") in the images to compute correspondences across a stereo pair. Flat patches of color have no features and as such, cannot be correlated between cameras. In such a case, you have a spot with no depth information.
This is exactly why you want to have other sensors, and saying that "oh humans have two eyes and they do fine" doesn't really cut it. Humans can say "hey this flat patch of color is a sign and signs are not dangerous" or "hey this flat patch of color is a really clean semi truck, and crashing into trucks is bad" but computers aren't that smart.
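To make the "no features, no depth" point concrete, here's a tiny OpenCV sketch with synthetic images (so the numbers are only illustrative): a feature detector finds plenty of keypoints on a textured image and none at all on a flat patch of color, which is why there is nothing to correlate between the two cameras in that region.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
textured = rng.integers(0, 256, size=(480, 640), dtype=np.uint8)  # synthetic "textured" image (noise)
flat = np.full((480, 640), 200, dtype=np.uint8)                   # flat patch of uniform grey

orb = cv2.ORB_create(nfeatures=2000)
kp_textured = orb.detect(textured, None)
kp_flat = orb.detect(flat, None)

print(len(kp_textured))  # lots of keypoints: plenty to correlate between two cameras
print(len(kp_flat))      # 0 keypoints: nothing to match, so no depth for that region
```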
> Flat patches of color have no features and as such, cannot be correlated between cameras.
Flat patches of color are also flat, which makes it possible to fill in the missing depth information.
I took a course in computer vision where one of the projects [1] involved monocular vision, and assuming that flat patches are flat, vertical lines are vertical, all others are horizontal and the background is flat, it was possible to get a pretty good reconstruction.
Everything has texture, even things that appear solid; that's how optical mice can work on glass. Throw in an NIR/UV camera, or maybe thermal for good measure, slap an LSTM on top, and you're covered for everything a human would spot.
There are a lot of techniques to avoid it. I remember listening to an interview with Greg Charvat who said that coding a polarised pulse train with some random phase distribution is a possibility for avoiding interference.
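Roughly, the idea is matched filtering against your own code. Here's a toy numpy sketch of just the coding/correlation part (it ignores polarisation, and the pulse lengths, amplitudes, and delays are made up): an echo carrying your own pseudorandom code produces a sharp correlation peak, while another unit's pulse train doesn't correlate.

```python
import numpy as np

rng = np.random.default_rng(42)

code_len, n_samples = 128, 2048
own_code = rng.choice([-1.0, 1.0], size=code_len)     # our pseudorandom pulse coding
other_code = rng.choice([-1.0, 1.0], size=code_len)   # an interfering unit's coding

# Build a received signal: our (weaker) echo at an assumed delay, a stronger
# pulse train from another sensor, and some receiver noise.
true_delay = 700
signal = np.zeros(n_samples)
signal[true_delay:true_delay + code_len] += 0.5 * own_code
signal[300:300 + code_len] += 1.0 * other_code
signal += 0.2 * rng.standard_normal(n_samples)

# Matched filter: correlate the received signal against our own code.
correlation = np.correlate(signal, own_code, mode="valid")
estimated_delay = int(np.argmax(correlation))
print(estimated_delay)  # ~700: the echo carrying our code wins, not the interferer at 300
```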
If it's deliberate, I would imagine that there are plenty of techniques that would work with regular cameras, as well. Ultimately, you can always target the algorithm that works on the data, regardless of how the data is collected.
Some people believe this to be impossible for the time being. LIDAR gives you a 3D point cloud (every obstacle, with its distance from you measured), which is amazing (but currently expensive, bulky, fragile, etc.), while with cameras it's way harder, and RADAR can't see some materials at all.
Tesla argues that their suite is enough (humans manage with just 2 "cameras", some even 1), but the algorithms to let you safely entrust your life to such a sensor suite are harder (i.e. more far off in the future).
Kalanick + Levandowski disagree with Musk, and were planning on calling him out on it.
I know that you were just doing a tl;dr and not necessarily trying to start a discussion about it, but wow, that seems kinda stupid to be coming out of someone who is ostensibly fairly intelligent.
A 3D point cloud is great, but with the raw computing power we have these days, image processing should be pretty reliable, at least until LIDAR becomes more... well, reliable and affordable.
Doesn't the HoloLens just use an IR camera to map its environment in real time?
> I thought that that was (also?) a problem with the radar, where it bounced under the truck and thought the road was clear.
Ultimately this was driver error, since he wasn't paying attention and didn't brake.
As for why Autopilot did not engage the brakes, you could blame any of the forward-facing sensors, since they all failed to see the truck.
If the vehicle had had windshield-height radar, or a better vision system, perhaps it would have worked.
The problem with the radar was it couldn't see the object at that height.
The vision system confused the trailer for an overhead road sign.
After the incident, I remember some people reporting more braking occurring on highways underneath overhead signage. That made me think the fix they put in place was to raise the threshold for what is considered an overhead sign. That is, they chose to err on the side of assuming the sign is an object ahead.
That may have been a temporary fix. I'm only speculating based on driver reports that appeared in /r/teslamotors a month or two after the crash was reported.
I don't know whether Tesla ever gave an official statement about how they fixed that issue. Tesla wasn't found to be at fault so they probably didn't have to. Also, the whole system is constantly under development.
> This is where fleet learning comes in handy. Initially, the vehicle fleet will take no action except to note the position of road signs, bridges and other stationary objects, mapping the world according to radar. The car computer will then silently compare when it would have braked to the driver action and upload that to the Tesla database. If several cars drive safely past a given radar object, whether Autopilot is turned on or off, then that object is added to the geocoded whitelist.
Relying on whitelists seems like a hack. Then again, I'm not building it =)
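For what it's worth, here's a toy sketch of how the geocoded whitelist described in that quote might look. The grid-rounding "geohash", threshold, and names are my guesses, not anything Tesla has published:

```python
from collections import defaultdict

GRID = 4  # decimal places of lat/lon, roughly 10 m cells (crude stand-in for a real geohash)

def cell(lat: float, lon: float) -> tuple:
    return (round(lat, GRID), round(lon, GRID))

safe_passes = defaultdict(int)   # radar-object cell -> number of safe pass-bys observed
WHITELIST_THRESHOLD = 5          # assumed meaning of "several cars drive safely past"

def report_safe_pass(lat: float, lon: float) -> None:
    """Fleet upload: a car drove past a stationary radar return here without incident."""
    safe_passes[cell(lat, lon)] += 1

def is_whitelisted(lat: float, lon: float) -> bool:
    """Should a stationary radar return at this location be ignored for braking?"""
    return safe_passes.get(cell(lat, lon), 0) >= WHITELIST_THRESHOLD

# Several cars report safe passes under an overhead sign...
for _ in range(6):
    report_safe_pass(37.79310, -122.39412)

# ...so a later stationary radar return at that spot is not treated as an obstacle.
print(is_whitelisted(37.79310, -122.39412))  # True
print(is_whitelisted(37.80000, -122.40000))  # False -> would still feed the braking logic
```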
> Relying on whitelists seems like a hack. Then again, I'm not building it =)
Agree that it feels hacky at first, but once the whitelist dataset gets huge, it becomes unique training data for Tesla (and hopefully their machine learning will be able to generalize from it).
> Agree that it feels hacky at first, but once the whitelist dataset gets huge, it becomes unique training data for Tesla (and hopefully their machine learning will be able to generalize from it).
I guess. But then you still have to deal with rollout in other countries. And maintenance of sign locations which can change over time, get removed, etc. Still hacky IMHO.
How would machine learning make use of white-listed data? I doubt they could use that data to predict the GPS location of unknown signs.
If you mean image recognition, I assume if machine learning could properly identify the signs with accuracy, then they wouldn't need the whitelist. Then again, maybe they truly haven't collected a full overhead-sign dataset yet. I'd be shocked though if they don't by now. Anyway, you could be right. It would be fun to learn more about these setups.
The reason I bring it up is that the HoloLens is pretty limited. Given what I've seen it do with its limited power and the kind of computing power they're putting in Teslas, not to mention the extra sensors (RADAR), it seems more than adequate to perform real-time image processing, even in the short term. That's not even considering what's coming down the pipeline, like fleet learning, the new Nvidia GPUs, etc.
Also, when the HoloLens throws a fit, it's maybe a 5% margin of error in terms of how the image moves. You'd think an autonomous vehicle wouldn't crash because of that much inaccuracy, considering human drivers normally don't crash even with less accuracy than that.
The HoloLens uses an IR projection and camera in addition to 4 "environmental cameras", which is why it doesn't work well in very low or very bright lighting. It also uses the inertial sensors to help correct the map.
There are some things about SLAM that are best done optically.
Musk has said on a few occasions that he can achieve full autonomy on a Tesla with no LIDAR setup. I think instead they use a front-facing radar and a bunch of cameras. Maybe IR? Levandowski strongly disagrees and sees LIDAR as critical to reliably mapping the environment. This all came up when that guy drove into the side of a trailer in a Tesla.
Elon has publicly claimed that he will be able to provide an L5 autonomous car without using lidar in the next few years. AL, another expert in the field, believes that claim is so ridiculous that he must be knowingly lying, not just overconfident.