
> It relies on a complex internal model of the world to both augment the object recognition and to sometimes discard the visual data as invalid.

That in particular is what makes the hiring fascinating. This problem is Andrej Karpathy's expertise[0]. His CNN/RNN designs have achieved strong results, in particular showcasing the ability to identify elements of a source image and the relationships between its different parts.
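To make that concrete, here is a minimal sketch of the CNN-encoder / RNN-decoder captioning pattern behind that work: a CNN embeds the image, and an RNN conditioned on that embedding emits words one at a time. The weights, the toy vocabulary, and the stand-in feature vector here are all random/invented; this illustrates the dataflow, not the actual trained models.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = ["<start>", "a", "dog", "on", "grass", "<end>"]  # toy vocabulary
V, D, H = len(vocab), 8, 16  # vocab, embedding, hidden sizes

# Stand-in for a CNN feature vector (normally top-layer activations).
image_feat = rng.normal(size=D)

Wx = rng.normal(0, 0.3, (D, H))   # input-to-hidden weights
Wh = rng.normal(0, 0.3, (H, H))   # hidden-to-hidden weights
Wo = rng.normal(0, 0.3, (H, V))   # hidden-to-vocab logits
E  = rng.normal(0, 0.3, (V, D))   # word embeddings

def caption(max_len=5):
    h = np.tanh(image_feat @ Wx)   # seed the hidden state from the image
    word = vocab.index("<start>")
    out = []
    for _ in range(max_len):
        h = np.tanh(E[word] @ Wx + h @ Wh)   # vanilla RNN step
        word = int(np.argmax(h @ Wo))        # greedy decoding
        if vocab[word] == "<end>":
            break
        out.append(vocab[word])
    return out

words = caption()
print(words)
```

In practice the encoder is a deep pretrained CNN and the decoder is an LSTM trained on paired image/caption data, but the conditioning structure is the same.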

The speed at which those techniques improve is also stunning. I didn't expect CNN-based systems to crack Go and image captioning so fast, but here we are!

I think the principles are already there; a few tweaks and a careful design is all it takes to beat the average driver.

[0]: http://cs.stanford.edu/people/karpathy/main.pdf




I think LIDAR will have a place, if they can get the per-sensor cost down to something reasonable (under $500 per sensor, for a 3D, 180-degree unit with at least 32 beams of vertical resolution, or the equivalent).

But I think first we'll see cars utilizing tech as described in this paper:

https://arxiv.org/abs/1604.07316

...and variations of it to handle other modeling and vision tasks.
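The network in that paper maps raw camera pixels directly to a steering command. Below is a hedged, numpy-only sketch of that layout (normalization, five conv layers, a small fully connected head down to one output). Layer shapes follow the paper, but the weights are random and this is an illustration of the dataflow, not a faithful reimplementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w, stride):
    """Naive valid-mode convolution: x is (H, W, Cin), w is (k, k, Cin, Cout)."""
    k = w.shape[0]
    h = (x.shape[0] - k) // stride + 1
    wd = (x.shape[1] - k) // stride + 1
    out = np.empty((h, wd, w.shape[3]))
    for i in range(h):
        for j in range(wd):
            patch = x[i*stride:i*stride+k, j*stride:j*stride+k, :]
            out[i, j] = np.tensordot(patch, w, axes=([0, 1, 2], [0, 1, 2]))
    return np.maximum(out, 0.0)  # ReLU

def steering_angle(frame):
    """frame: a (66, 200, 3) image, the input size used in the paper."""
    x = frame / 127.5 - 1.0  # normalization layer
    # Five conv layers: 5x5 stride 2 (x3), then 3x3 stride 1 (x2).
    for k, stride, cout in [(5, 2, 24), (5, 2, 36), (5, 2, 48),
                            (3, 1, 64), (3, 1, 64)]:
        w = rng.normal(0, 0.05, (k, k, x.shape[2], cout))
        x = conv2d(x, w, stride)
    x = x.ravel()  # flatten (1152 features at this point)
    # Fully connected head down to a single steering value.
    for n in (100, 50, 10):
        x = np.maximum(x @ rng.normal(0, 0.05, (x.size, n)), 0.0)
    return float(x @ rng.normal(0, 0.05, (x.size, 1)))

angle = steering_angle(rng.random((66, 200, 3)) * 255.0)
print(angle)
```

The real system trains these weights end to end on recorded human driving, with the steering angle as the supervision signal; nothing in the pipeline hand-codes lane markings or obstacles.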

Self-driving vehicle systems are amazingly complex; ultimately it won't be any single system, sensor, piece of software, or algorithm that solves the problem - it's going to be a complex mesh of all of them working in concert.

And even then, there will be mistakes, injuries, and deaths unfortunately.


Image captioning is not solved (yet), even if a lot of progress has been made in recent years.



