Except that they have remote humans monitoring every vehicle, so the whole thing is an illusion and we don't know how safe truly autonomous vehicles really are (since they don't exist).
1:1 remote human monitoring would not scale from a unit-economics perspective, and even if they did that, the remote operators can't drive the car, only offer limited guidance. So the car really is driving itself.
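To put rough numbers on the unit-economics point, here's a toy back-of-envelope; the wage and ratios are made-up illustrative assumptions, not anything the companies have published:

```python
# Toy back-of-envelope with made-up numbers (purely illustrative assumptions,
# not published figures from Waymo or anyone else).

remote_monitor_wage = 30.0      # hypothetical fully loaded $/hour for one operator
cars_per_operator_1to1 = 1      # the "1:1 monitoring" claim
cars_per_operator_pooled = 50   # hypothetical pooled fleet-response ratio

cost_per_car_hour_1to1 = remote_monitor_wage / cars_per_operator_1to1
cost_per_car_hour_pooled = remote_monitor_wage / cars_per_operator_pooled

print(cost_per_car_hour_1to1)    # 30.0 -> you have effectively re-hired the driver
print(cost_per_car_hour_pooled)  # 0.6  -> only pooling many cars per operator scales
```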
The safety story is an interesting one. Companies like Cruise and Waymo are not forthcoming with their incident data. They share infrequently and through spreadsheets that do not capture every incident. It's pretty ass, and I'd be wary of trusting their self-reported data. I imagine their insurance companies have slightly better data than the gov't, but even then maybe not.
Spreadsheets are the best and most open way to share data because they let other people analyze it. But they put out plenty of their own research if you want to read that.
I'm really not sure what you mean by "infrequently". They release new raw data every 3 or 6 months or something, and every single accident is trumpeted in the news. What other industry has more publicly accessible safety data?
Every software release is a totally new driver. If they release software at a cadence of say, 6 or 8 weeks, would you feel comfortable riding? You don't know how safe the car was during that time -- any part of the driver can regress. You're basically trusting that they know how to do simulation, that they bore the cost of running that simulation, and that their simulator is realistic enough to yield trustworthy statistics.
Sadly, our regulatory agencies are currently set up for very delayed decision-making. Since any company can release software at any time, you could imagine a regulatory platform that tracks software releases and their corresponding safety statistics in real time.
I’m quite confident in it and personally ride it monthly. Happy to bet it will turn out to be no more than twice as dangerous as the median driver today, i.e., as dangerous as driving in the 1980s. In the long term it will be vastly safer, obviously.
I'm glad you feel that way (and I mostly feel the same / ride Waymos frequently), but it doesn't change the reality that you're putting all your faith in a self-regulating company that does not cooperate with regulators more than it absolutely needs to, and where a change in leadership / safety culture / process can cause a catastrophe overnight. And this isn't hyperbole; one software push is all that separates a car that crashes in scenario X from one that was found not to crash in that same scenario. Do you trust the code to be bug-free? Do you trust the bugs to be discovered before they maybe kill you?
I have a vested interest in the success of AVs (worked in the industry for the better part of a decade) but the companies building them are far from perfect in terms of how they conduct themselves wrt safety. There needs to be some development in terms of the supporting tech.
The way I think about this is: planes came before the FAA and before the key technologies we now use to manage flights, such as radar and radio used in conjunction. AVs are in their "just learned to fly" era.
In the next couple years Waymo will have accumulated more than 100M miles driven, which is the number a human drives statistically before causing 1 death on average. Are you still going to be worried when they drive 1B miles and have 2 deaths? What about 100B miles with 30 deaths? At what point will you accept the statistical evidence?
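Taking the figures in that comment at face value (a human baseline of roughly 1 death per 100M miles), the hypotheticals work out like this:

```python
# Implied fatality rates for the hypotheticals above, against the stated human
# baseline of ~1 death per 100M miles (figures taken from the comment, not official data).

baseline_per_100m = 1.0

scenarios = {
    "1B miles, 2 deaths": (1_000_000_000, 2),
    "100B miles, 30 deaths": (100_000_000_000, 30),
}

for name, (miles, deaths) in scenarios.items():
    rate = deaths / (miles / 100_000_000)   # deaths per 100M miles
    print(f"{name}: {rate:.2f} per 100M miles, ~{baseline_per_100m / rate:.0f}x safer than baseline")
# 1B miles, 2 deaths: 0.20 per 100M miles, ~5x safer than baseline
# 100B miles, 30 deaths: 0.03 per 100M miles, ~33x safer than baseline
```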
A new software release has driven zero miles. You do not know what kind of performance it has -- regressions both happen and are expected with every launch.
The number of miles a release ends up driving before launch is mostly in simulation plus some human-supervised on-road testing, and may be low or influenced by the availability of resources ($$$). There is little to no regulation or third-party checking of the process used to clear releases.
They do not simulate every release on all X miles driven. They can't; it's super expensive, and that's to say nothing of challenges like simulation realism or compatibility of the simulator with multiple generations of hardware that isn't backwards compatible.
They could have no deaths for years, then suddenly a whole lot after a bad software push. Recently they issued a recall because of a bug where the car could hit a telephone pole. Imagine if the software had a bug when their cars were ubiquitous on highways, slamming cars into concrete medians? Imagine if a fault could bork critical technologies like the lidar at a high rate across the fleet?
What happens to Waymo if they kill even 1 child due to a bug or lapse in model quality? I imagine many would wake up to the reality of their safety story and demand a way to audit software releases. Or would you continue to let your kid ride Waymo? This is also to say nothing of sensor failures and other ways the car can mess up.
The probability of your dying in a Waymo is not # incidents / # miles, because past performance does not guarantee future performance when it comes to new releases. It's an unknown quantity: L (the likelihood of encountering a dangerous bug per unit time) * T (time spent in the car). Without more data about L (which changes based on many factors, like the company's process for certifying releases), L*T could go from 0 to something unacceptably high really fast.
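A minimal sketch of what I mean, with placeholder L values since nobody outside the company knows the real ones:

```python
import math

# Toy version of the L*T argument. The L values below are placeholders chosen
# only to show the sensitivity; they are not real measurements.

def ride_risk(L_per_hour: float, hours_in_car: float) -> float:
    # Probability of at least one dangerous-bug encounter during the ride,
    # modeling encounters as a Poisson process with rate L.
    return 1 - math.exp(-L_per_hour * hours_in_car)

hours = 0.5  # a typical half-hour ride
for label, L in [("well-qualified release", 1e-7), ("quietly regressed release", 1e-3)]:
    print(f"{label}: risk per ride ~ {ride_risk(L, hours):.1e}")
# Same rider, same route: the risk moves by orders of magnitude with L,
# and L can change with every software push.
```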
You trust the car you drive today because the parts are quality-controlled and regulated. Software drivers are a very new part added to this chain of things to trust, and their quality-control process is not yet publicly regulated. The reality is there needs to be an advancement in technology to ensure software quality and safety, ideally in the form of an unbiased auditor that can catch faults and ground the fleet ASAP, then restore public faith by ensuring gaps in the qualification process are fixed. No such mechanism exists today.
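To make that concrete, here's a purely hypothetical sketch of what such an auditor hook could look like; none of these names, thresholds, or APIs exist today:

```python
from dataclasses import dataclass

# Entirely hypothetical sketch of the audit mechanism described above -- the
# names and thresholds are invented only to make the idea concrete.

@dataclass
class ReleaseEvidence:
    release_id: str
    sim_miles: float              # simulation miles claimed for this release
    supervised_road_miles: float  # human-supervised on-road miles
    regressions_found: int        # safety regressions surfaced during qualification

class IndependentAuditor:
    def approve(self, ev: ReleaseEvidence) -> bool:
        # Placeholder criteria; a real auditor would publish and enforce its own.
        return (ev.sim_miles >= 1e8
                and ev.supervised_road_miles >= 1e5
                and ev.regressions_found == 0)

    def ground_fleet(self, release_id: str) -> None:
        # In this scheme, a failed audit pulls the release from every vehicle
        # until the gap in the qualification process is shown to be fixed.
        print(f"grounding all vehicles running {release_id}")
```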
> They could have no deaths for years, then suddenly a whole lot after a bad software push.
The software release process itself becomes the thing that is tested statistically. If they release 100 versions, and each is safer than the last, it’s silly to think that one can’t be confident in the safety of the 101st without some gov’t software approval/validation process.
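Even a crude toy model along those lines gives you a number, granted under assumptions (detectable, independent regressions and a stable process) that are doing all the work:

```python
# Toy version of "trust the release process": if 100 consecutive releases shipped
# with no safety regression, Laplace's rule of succession puts the chance that
# release 101 regresses at roughly 1 / (100 + 2). Illustrative only -- it assumes
# regressions are reliably detected, independent, and that the process itself
# doesn't change.

prior_releases = 100
regressions_observed = 0
p_next_regresses = (regressions_observed + 1) / (prior_releases + 2)
print(f"~{p_next_regresses:.1%} chance the next release regresses")  # ~1.0%
```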
> What happens to Waymo if they kill even 1 child due to a bug or lapse in model quality? …would you continue to let your kid ride Waymo?
If they had driven billions of death-less miles and then a software update killed a child, I would obviously let my kid ride Waymo. It wouldn’t be close.
> If they release 100 versions, and each is safer than the last, it’s silly to think that one can’t be confident in the safety of the 101st without some gov’t software approval/validation process.
That's just the thing. Each release is not always safer than the last. Even ascertaining that with any reasonable level of certainty requires a lot of technology. In a company's hands alone, this "trust me bro" amounts to a ticking time bomb since companies have a tendency to trade safety for profit.
> If they had driven billions of death-less miles and then a software update killed a child, I would obviously let my kid ride Waymo. It wouldn’t be close.
You don't think the outcry at Boeing applies to Waymo?
I feel like this concern is mostly a thing having to do with safe rollouts. Just as with any other software, I'm sure there's a gradual rollout to ensure bugs don't just kill a ton of people at once. Google and other large cos have largely solved the reliability problem with web services. While cars are ofc different, I think the same learnings apply.
> While cars are ofc different, I think the same learnings apply
They are very different. Unit tests and stress tests can be conducted on pure software. You can monitor them and if there are failures you can retry.
Robot software needs all the stuff you use for regular software, but then you have this huge space of counterfactuals that can only be verified using large scale simulation. How do you know the simulator is realistic? That it has a sufficient level of scenario coverage? That the simulation was even run? That the car didn't catastrophically regress in a way that will lead to a crash after 1M miles of driving (which in a scaled service will happen quite frequently)?
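To make the difference concrete, here's a rough, invented-for-illustration sketch of the kind of extra gate a robot driver needs on top of ordinary CI; the scenario names and thresholds are placeholders, not anything a real AV company has published:

```python
# Invented-for-illustration sketch of a pre-release simulation gate for a robot
# driver. Scenario names, counts, and thresholds are placeholders.

REQUIRED_SCENARIOS = {
    "unprotected_left_turn", "pedestrian_darting", "highway_cut_in",
    "emergency_vehicle", "construction_zone",
}

def simulation_gate(results: dict) -> bool:
    # results maps scenario name -> {"collision_rate": ..., "baseline_collision_rate": ...}
    missing = REQUIRED_SCENARIOS - results.keys()
    if missing:
        print("missing scenario coverage:", missing)
        return False
    for scenario, r in results.items():
        # Block the release if any scenario got worse than the fielded baseline.
        if r["collision_rate"] > r["baseline_collision_rate"]:
            print("regression in", scenario)
            return False
    return True
# And even a gate like this says nothing about the harder questions above:
# whether the simulator is realistic, whether coverage is sufficient, or whether
# the gate was actually run on every release.
```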
Oh and a fault can result in a death for Waymo, but not for Google Search. So that's kind of a major difference.
We don't know how often the humans intervene. That's my whole point. It's an illusion that the car is operating alone. When I drive a Tesla on FSD, I only need to intervene periodically, but it's enough that the car cannot be called autonomous IMHO. How many remote human supervisors are needed for Waymo vehicles? How often do they intervene? Without that data it is absurd to call Waymo autonomous.
Waymo operators do not ever drive the vehicle. My understanding is that Waymo operators can specify things like "take this path" (e.g., by drawing on a map or something) or "yes that's safe" but this doesn't correspond to the actual driving inputs.
I'd appreciate it if they were open and honest about the reality of how they operate. But since they keep it very secret, I have no choice but to assume the worst. If it were impressive rather than detrimental to their valuation, they'd be open about it.
"the autonomous driver can reach out to a human fleet response agent for additional information to contextualize its environment. The Waymo Driver does not rely solely on the inputs it receives from the fleet response agent and it is in control of the vehicle at all times. As the Waymo Driver waits for input from fleet response, and even after receiving it, the Waymo Driver continues using available information to inform its decisions. This is important because, given the dynamic conditions on the road, the environment around the car can change, which either remedies the situation or influences how the Waymo Driver should proceed. In fact, the vast majority of such situations are resolved, without assistance, by the Waymo Driver."
But let's assume they're lying and you're right: what kind of throughput and latency do you expect to be required between the car and the remote driver for this to work safely? How does the car know if the remote driver is actually looking at the data and acting on that? What happens if the connection stutters? Does the car take control again? How would it decide when to alternate between remote and local control? What if the two sides disagree on what to do?
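For what it's worth, an advisory-only design like the one described in that quote sidesteps most of these questions. A rough sketch of the idea (not Waymo's actual implementation; the names and the freshness window are my own):

```python
import time

# Rough sketch (not Waymo's actual design): remote guidance is treated as a hint
# with a freshness window, and the onboard planner never stops issuing the actual
# driving commands, so there is no handoff of control to negotiate.

REMOTE_HINT_TTL_S = 2.0  # hypothetical freshness window for remote guidance

class RemoteGuidance:
    def __init__(self) -> None:
        self.hint = None
        self.received_at = None

    def receive(self, hint: str) -> None:
        self.hint = hint
        self.received_at = time.monotonic()

    def current_hint(self):
        # A stuttering or dead link just means stale hints, which are ignored.
        if self.received_at is None:
            return None
        if time.monotonic() - self.received_at > REMOTE_HINT_TTL_S:
            return None
        return self.hint
```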
One blog post lacking in any detail is not being open about how it works, and I think you know that. I was very clear in the kinds of detail I think they should legally be required to provide to operate on public roads.
I don't know that. You asked for numbers on how many supervisors per car are needed (I bet it's around 1:100, probably less) and how often they intervene (a few times a day at most? Otherwise you'd read articles about cars being stuck all the time). If you knew the actual numbers, you still wouldn't know HOW it works. The post does shed light on the actual workflow and gives you important details that you wouldn't glean from the numbers alone, e.g. that the car is always the one in control.