
Without taking away anything from the substance or achievement of this release, I always find phrases like "openpilot is an operating system for robotics" quite fishy.

No, it's not an OS for robotics. You can't do actual robotics stuff with it, like drive actuators to control limbs or grippers, do motion control or SLAM or perception or any of the usual robotics stack.

Their website correctly says openpilot is an open source advanced driver assistance system that works on 275+ car models from Toyota, Hyundai, Honda, and many other brands. Should've stuck to that.

Thinking about it some more, it's probably just another engagement baiting strategy to get attention and I'm their gullible puppet. Well played.


George Hotz says: "we developed a proper successor to ROS. openpilot has serialization (with capnp) and IPC (with zmq + a custom zero copy msgq). It uses a constellation of processes to coordinate to drive a car."[1] And Comma sells a robot that runs Openpilot: https://comma.ai/shop/body
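
Not openpilot's actual code, but a minimal pyzmq sketch of the multi-process pub/sub pattern being described there (the socket address and message contents are invented for illustration):

    # Minimal illustration of the ZMQ pub/sub IPC pattern described above.
    # This is not openpilot code; address and payload are made up.
    import time
    import zmq

    def publisher():
        ctx = zmq.Context()
        sock = ctx.socket(zmq.PUB)
        sock.bind("tcp://127.0.0.1:5555")
        while True:
            # In openpilot the payload would be a capnp-serialized message;
            # here we just send a plain string.
            sock.send_string("carState steering_angle=1.5")
            time.sleep(0.01)

    def subscriber():
        ctx = zmq.Context()
        sock = ctx.socket(zmq.SUB)
        sock.connect("tcp://127.0.0.1:5555")
        sock.setsockopt_string(zmq.SUBSCRIBE, "")  # subscribe to everything
        while True:
            print(sock.recv_string())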

> You can't do actual robotics stuff with it, like drive actuators to control limbs or grippers, do motion control or SLAM or perception or any of the usual robotics stack.

A lot of the "usual robotics stack" is not going to be relevant for the new wave of consumer robotics that is coming soon. It will be enabled by end-to-end machine learning and stuff like traditional SLAM methods will not be a part of that. The Bitter Lesson[2] is coming for robotics.

[1] https://x.com/__tinygrad__/status/1834792473081815259

[2] For those unfamiliar: http://www.incompleteideas.net/IncIdeas/BitterLesson.html


In the robotics community, the stuff coming out of George Hotz has always been considered a kludgy mess, and unsuitable for serious work. Dude is a talented hacker, but the idea that this will replace ROS is kind of a joke.

To be fair, I would consider even ROS a hobbyist tool.

With due respect, this has to be one of the most ignorant takes on robotics I have read in a while. Yes, you can always slap serialization and ZMQ on your framework. That doesn't make it an OS.

And no, the usual robotics stack is not going away anytime soon. Maybe develop some actual useful robots before posting like an expert on robotics topics.


I enjoy Hotz as a hacker, but I'm really allergic to this kind of oversold language. "[W]e developed a proper successor to ROS" is a past tense statement, as if they've already done this thing. In reality, at best they have presented a roadmap for a thing that could approximate ROS one day.

The point of the bitter lesson is "leverage compute as best you can" not "use DNNs everywhere just because". Oftentimes your available compute is still a crappy ARM machine with no real parallel compute where the best DNN you can run is still not large nor fast enough to even be viable, much less a good fit.

And well, some classical algorithms like A* are provably optimal: you literally cannot train a more efficient DNN if your problem boils down to grid search. It will just waste more compute for the same result.

Besides, the nav stack is not really the point of ROS. It's the standardization. Standard IPC, types, messages, package building, deployment, etc. Interoperability where you can grab literally any sensor or actuator known to man and a driver will already exist and output/require the data in the exact format you need/have, standard visualizers and controllers to plug into the mix and debug. This is something we'll need as long as new hardware keeps getting built even if the rest of your process is end to end. It doesn't have to be the best, it just needs to work and it needs to be widely used for the concept to make sense.
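
To make the standardization point concrete, here's a hypothetical ROS 1 driver node (names and values invented) publishing the standard sensor_msgs/LaserScan type; anything downstream, rviz, costmaps, SLAM packages, already speaks it:

    # Hypothetical ROS 1 driver node: any lidar vendor publishing this standard
    # message type is instantly usable by rviz, costmaps, SLAM packages, etc.
    import rospy
    from sensor_msgs.msg import LaserScan

    def main():
        rospy.init_node("my_lidar_driver")            # node name is made up
        pub = rospy.Publisher("scan", LaserScan, queue_size=10)
        rate = rospy.Rate(10)                         # publish at 10 Hz
        while not rospy.is_shutdown():
            msg = LaserScan()
            msg.header.stamp = rospy.Time.now()
            msg.header.frame_id = "laser"
            msg.angle_min, msg.angle_max = -1.57, 1.57
            msg.angle_increment = 0.01
            msg.range_min, msg.range_max = 0.1, 30.0
            msg.ranges = [1.0] * 315                  # dummy data for illustration
            pub.publish(msg)
            rate.sleep()

    if __name__ == "__main__":
        main()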


The future of consumer robotics will not be built on "a crappy ARM machine with no real parallel compute". Traditional robotics has failed to produce machines that can operate in the real world outside of strictly controlled environments, and more of the same isn't going to change that. Fast hardware for running DNNs will be a hard requirement for useful general purpose robots.

I agree that it'll be needed, but hardware that can provide enough compute at an acceptable wattage has yet to materialize. Only once that changes will the equation change. Today you'd be surprised how many production UGVs run off an actual Pi 4 or something in a comparable compute ballpark.

I believe the idea is that openpilot replaces the usual robotics stack with an end to end neural net.

While I agree "operating system" is usually a marketing term, it does feel correct in this case, as it is the operating system for the Comma Three, which can operate cars but also this thing: https://www.comma.ai/shop/body


I definitely thought it was a ROS clone based on that first line.

ROS doesn't need a clone, it needs a successor.

Took the bait as well.


ROS2? I'll see myself out...

Could someone explain the joke? I've been dabbling with learning robotics and I've been confused by how ROS and ROS2 both appear to be actively developed/used. Is ROS2 a slow-moving successor version (like Python 3 was) or a complete fork?

Slow-moving successor, which the community isn't exactly going wild over. It offers modest improvements in exchange for a painful upgrade process, with many of the original issues with ROS1 remaining unsolved.

The other half of the joke is that ROS was never an operating system either.


Well, there is one thing ROS 2 does better: you can declare params directly inside nodes and reconfigure them all without building extra config files. And it doesn't stop working if your local IP changes.
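
For reference, a minimal rclpy sketch of what that looks like (node and parameter names are made up):

    import rclpy
    from rclpy.node import Node

    class ControllerNode(Node):  # hypothetical node, just to show the API
        def __init__(self):
            super().__init__("controller")
            # Declared with a default directly on the node; it can be overridden
            # from the command line or a YAML file, and reconfigured at runtime
            # via `ros2 param set`, with no central parameter server.
            self.declare_parameter("update_rate_hz", 20.0)
            rate = self.get_parameter("update_rate_hz").value
            self.get_logger().info(f"running at {rate} Hz")

    def main():
        rclpy.init()
        rclpy.spin(ControllerNode())
        rclpy.shutdown()

    if __name__ == "__main__":
        main()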

But the rest are firmly downgrades all around. It's slower (rclpy is catastrophically bad), more demanding (CPU usage is through the roof to do DDS packet conversions), less reliable (the RMWs are a mess), less compatible (armhf is kill). The QoS might count as an improvement for edge cases where you need UDP for point clouds, but what it mostly does on a day-to-day basis is create a shit ton of failure cases where there's QoS incompatibility between topics and things just refuse to connect. It's a lot more hassle for no real gain.


Config generally feels more complex though, since there isn't a central parameter server anymore. The colcon build system also just feels more complex now, which I thought was already impressively complex with catkin.

Yep, it takes super long to get parameters from all nodes because you need to query each one instead of the DDS caching them or something.

And yeah, I forgot: there's the added annoying bit where you can't build custom messages/services with Python packages; only ament_cmake can do it, so you often need metapackages for no practical reason. And the whole deal with the default build mode being "copy all", so you need to rebuild every single time if you don't symlink, and even that often doesn't work. The defaults are all around impressively terrible, adding extra pitfalls in places where there were none in ROS 1.


It does a lot of things better but it also does a lot of things worse and also doesn't fix a lot of the real problems with ROS as a system.

ROS2 has been pushed as the successor to ROS for like a decade, and people still prefer ROS for various reasons. So yeah, like Python 2/3 kinda.

No, it's much worse: Python 3 was all-round better, it just took a while to get all your dependencies ported, which made the transition hard. Judging by the comments, it doesn't seem like people agree that ROS2 is even all-round better than ROS.

It's funny this topic came up today because I have a group of students working on a ROS2 project and at our meeting this afternoon they had a laundry list of problems they've been having related to ROS2. I'm thinking our best option is to use ROS1...

You're right, ROS2 isn't all-round better than ROS, so the transition will never fully happen.

FWIW I'm working on an actual replacement for ROS, I'll post it to ShowHN one day soonish :P


Isn't the software for training the end-to-end NN meant to be used in automation generally? The car is just the first application, and they have been using it for their own robot.

So the claim still stands?


The docs (https://docs.comma.ai/) begin with a more honest - and useful - description:

openpilot is an open source driver assistance system.


Yeah, came to say the same; I thought a new big player was entering the market. It looks great nonetheless.

I launched gotit.pub [1] last year. It's very much the same thing.

[1] https://gotit.pub


Wow, how is this not getting more attention when it is almost the same thing?!

Because so are PubPeer and SciRate, which have existed for much longer. (And neither of those is getting much attention either.)

> This is equivalent to applying a box filter to the polygon, which is the simplest form of filtering.

Am I the only one who has trouble understanding what is meant by this? What is the exact operation that's referred to here?

I know box filters in the context of 2D image filtering and they're straightforward but the concept of applying them to shapes just doesn't make any sense to me.

Can someone clarify?


The operation (filtering an ideal, mathematically perfect image) can be described in two equivalent ways:

- You take a square a single pixel spacing wide by its center and attach it to a sampling point (“center of a pixel”). The value of that pixel is then your mathematically perfect image (of a polygon) integrated over that square (and normalized). This is perhaps the more intuitive definition.

- You take a box kernel (the indicator function of that square, centered, normalized), take the convolution[1] of it with the original perfect image, then sample the result at the final points (“pixel centers”). This is the standard definition, which yields exactly the same result as long as your kernel is symmetric (which the box kernel is).

The connection with the pixel-image filtering case is that you take the perfect image to be composed of delta functions at the original pixel centers and multiplied by the original pixel values. That is, in the first definition above, “integrate” means to sum the original pixel values multiplied by the filter’s value at the original pixel centers (for a box filter, zero if outside the box—i.e. throw away the addend—and a normalization constant if inside it). Alternatively, in the second definition above, “take the convolution” means to attach a copy of the filter (still sized according to the new pixel spacing) multiplied by the original pixel value to the original pixel center and sum up any overlaps. Try proving both of these give the answer you’re already accustomed to.

This is the most honest signal-processing answer, and it might be a bit challenging to work through but my hope is that it’ll be ultimately doable. I’m sure there’ll be neighboring answers in more elementary terms, but this is ultimately a (two-dimensional) signal processing task and there’s value in knowing exactly what those signal processing people are talking about.

[1] (f∗g)(x) = (g∗f)(x) = ∫f(y)g(x-y)dy is the definition you’re most likely to encounter. Equivalently, (f∗g)(x) is f(y)g(z) integrated over the line (plane, etc.) x=y+z, which sounds a bit more vague but exposes the underlying symmetry more directly. Convolving an image with a box filter gives you, at each point, the average of the original over the box centered around that point.
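
If it helps, here is a crude numerical sketch of the first definition (plain Python, with a made-up triangle): the pixel value is the polygon's indicator function averaged over a pixel-sized square, approximated with a dense grid of sub-samples, which is the same thing as sampling the box-kernel convolution at the pixel center.

    # Crude numerical illustration of box filtering a polygon (not from the
    # article): average the polygon's indicator function over a 1x1 pixel
    # square, approximating the integral with a dense grid of sub-samples.

    def point_in_polygon(x, y, poly):
        """Even-odd rule point-in-polygon test; poly is a list of (x, y) vertices."""
        inside = False
        n = len(poly)
        for i in range(n):
            x0, y0 = poly[i]
            x1, y1 = poly[(i + 1) % n]
            if (y0 > y) != (y1 > y):
                # x-coordinate where the edge crosses the horizontal line at y
                x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
                if x < x_cross:
                    inside = not inside
        return inside

    def box_filtered_pixel(cx, cy, poly, n=64):
        """Approximate coverage of the unit pixel centered at (cx, cy)."""
        hits = 0
        for i in range(n):
            for j in range(n):
                sx = cx - 0.5 + (i + 0.5) / n
                sy = cy - 0.5 + (j + 0.5) / n
                hits += point_in_polygon(sx, sy, poly)
        return hits / (n * n)

    triangle = [(0.2, 0.1), (1.8, 0.4), (0.6, 1.7)]   # made-up polygon
    print(box_filtered_pixel(0.5, 0.5, triangle))      # partial coverage in [0, 1]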


There’s a picture of the exact operation in the article. Under “Filters”, the first row of 3 pictures has the caption “Box Filter”. The one on the right (with internal caption “Contribution (product of both)”) demonstrates the analytic box filter. The analytic box filter is computed by taking the intersection of the pixel boundary with all visible polygons that touch the pixel, and then summing the resulting colors weighted by their area. Note the polygon fragments also have to be non-overlapping, so if there are overlapping polygons, the hidden parts need to be first trimmed away using boolean clipping operations. This can all be fairly expensive to compute, depending on how many overlapping polygons touch the pixel.
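
For illustration (not the article's code), the analytic version of that computation for a single polygon and a unit pixel square can be sketched with Sutherland-Hodgman clipping plus the shoelace formula:

    # Illustrative analytic box filter for one pixel: clip the polygon against
    # the pixel square (Sutherland-Hodgman), then take the area of the clipped
    # piece (shoelace formula). Not the article's implementation.

    def clip_against_halfplane(poly, inside, intersect):
        """Clip a polygon against one edge of a convex clip region."""
        out = []
        n = len(poly)
        for i in range(n):
            cur, nxt = poly[i], poly[(i + 1) % n]
            cur_in, nxt_in = inside(cur), inside(nxt)
            if cur_in:
                out.append(cur)
            if cur_in != nxt_in:
                out.append(intersect(cur, nxt))
        return out

    def clip_to_pixel(poly, x0, y0):
        """Clip against the unit square [x0, x0+1] x [y0, y0+1]."""
        def cross_x(a, b, x):   # intersection with vertical line x = const
            t = (x - a[0]) / (b[0] - a[0])
            return (x, a[1] + t * (b[1] - a[1]))
        def cross_y(a, b, y):   # intersection with horizontal line y = const
            t = (y - a[1]) / (b[1] - a[1])
            return (a[0] + t * (b[0] - a[0]), y)
        edges = [
            (lambda p: p[0] >= x0,     lambda a, b: cross_x(a, b, x0)),
            (lambda p: p[0] <= x0 + 1, lambda a, b: cross_x(a, b, x0 + 1)),
            (lambda p: p[1] >= y0,     lambda a, b: cross_y(a, b, y0)),
            (lambda p: p[1] <= y0 + 1, lambda a, b: cross_y(a, b, y0 + 1)),
        ]
        for inside, intersect in edges:
            if not poly:
                break
            poly = clip_against_halfplane(poly, inside, intersect)
        return poly

    def shoelace_area(poly):
        return abs(sum(poly[i][0] * poly[(i + 1) % len(poly)][1]
                       - poly[(i + 1) % len(poly)][0] * poly[i][1]
                       for i in range(len(poly)))) / 2.0

    triangle = [(0.2, 0.1), (1.8, 0.4), (0.6, 1.7)]          # made-up polygon
    coverage = shoelace_area(clip_to_pixel(triangle, 0, 0))  # pixel [0,1]x[0,1]
    print(coverage)  # fraction of the pixel covered, since the pixel has area 1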


OK, so reading a bit further this boils down to clipping the polygon to the pixel and then using the shoelace formula for finding the area? Why call it "box filter" then?


It’s very useful to point out that it’s a Box Filter because the article moves on to using other filters, and larger clipping regions than a single pixel. This is framing the operation in known signal processing terminology, because that’s what you need to do in order to fully understand very high quality rendering.

Dig a little further into the “bilinear filter” and “bicubic filter” that follow the box filter discussion. They are more interesting than the box filter because the contribution of a clipped polygon is not constant across the polygon fragment, unlike the box filter which is constant across each fragment. Integrating non-constant contribution is where Green’s Theorem comes in.

It's also conceptually useful to understand the equivalence between box filtering with analytic computation and box filtering with multi-sample point sampling. It is the same mathematical convolution in both cases, but it is expressed very differently depending on how you sample & integrate.


OK, I get it now. Thanks for the explanation. I would just never in a lifetime call it a "filter". That's extremely poor naming.

If they called it a choice of basis or influence function, it would've been so much clearer.


Oh interesting, I hadn't thought about it, but why does it seem like poor naming? I believe "filter" is totally standard in signal processing and has been for a long time, and that term does make sense to me in this case because what we're trying to do is low-pass filter the signal, really. The filtering is achieved through convolution, and to your point, I think there are cases in which the filter function does get referred to as a basis function.

I would think that, conceptually, a basis function is different from a filter function, because a basis function is usually about transforming a point in one space to some different space, and basis functions come in a set whose size is the dimensionality of the target space. Filters, even if you can think of the function as a sort of basis, aren't meant for changing spaces or encoding & decoding against a different basis than the signal. Filters transform the signal but keep it in the same space it started from, and the filter is singular and might lose data.

Maybe it's better if I just link to what others say about filters rather than me blabbering on: https://en.wikipedia.org/wiki/Filter_(signal_processing)


It is more like the convolution of the shape with the filter (you can take the product of the filter, at various offsets, with the polygon).

Essentially if you have a polygon function p(x,y) => { 1 if inside the polygon, otherwise 0 }, and a filter function f(x,y) centered at the origin, then you can evaluate the filter at any point x_0,y_0 with the double-integral / total sum of f(x-x_0,y-y_0)*p(x,y).


This kind of makes sense from a mathematical point of view, but how would this look implementation-wise, in a scenario where you need to render a polygon scene? The article states that box filters are "the simplest form of filtering", but it sounds quite non-trivial for that use case.


If it essentially calculates the area of the polygon inside the pixel box and then assigns a colour to the pixel based on the area portion, how would any spatial aliasing artifacts appear? Shouldn't it be equivalent to super-sampling with infinite sample points?


It literally means that you take a box-shaped piece of the polygon, i.e. the intersection of the polygon and a box (a square, in this case the size of one pixel), and do this for each pixel as it's processed by the rasterizer. If you think of a polygon as a function from R^2 to {0, 1}, where every point inside the polygon maps to 1, then it's just a signal that you can apply filters to.


But as I understand it, the article is about rasterization, so if we filter after rasterization, the sampling has already happened, no? In other words: Isn't this about using the intersection of polygon x square instead of single sample per pixel rasterization?


This is about taking an analytic sample of the scene with an expression that includes and accounts for the choice of filter, instead of integrating some number of point samples of the scene within a pixel.

In this case, the filtering and the sampling of the scene are both wrapped into the operation of intersection of the square with polygons. The filtering and the sampling are happening during rasterization, not before or after.

Keep in mind a pixel is an image sample, which is different from taking one or many point-samples of the scene in order to compute the pixel color.


It is applying the filter before rasterization, and then taking a single sample of the filtered signal per pixel.


The problem is determining the coverage, the contribution of the polygon to a pixel's final color, weighted by a filter. This is relevant at polygon edges, where a pixel straddles one or more edges, and some sort of anti-aliasing is required to prevent jaggies[1] and similar aliasing artifacts, such as moiré, which would result from naive discretization (where each pixel is either 100% or 0% covered by a polygon, typically based on whether the polygon covers the pixel center).

[1] https://en.wikipedia.org/wiki/Jaggies


Counterargument: Atlassian offers a very similar product and is worth $50B. Of course they've got JIRA but everybody hates it.

GitLab (or whoever ends up buying it) could acquire Linear.app for example and eat Atlassian's lunch.


This looks pretty awesome but, speaking only for myself here, the thing I actually want is just Webflow but without the BS and predatory pricing.

A visual editor that produces plain old HTML, CSS, JS and that anyone in our company can use to make changes to pages or create new ones. That's it.

I don't think it exists (if so, pointers would be very welcome!), so here's my comment to incentivize someone to build it.


I agree. What I personally would love is a WYSIWYG front end to a static site generator that uses eex or erb. If the tool is sufficiently open source and works well with some hand tweaking of generated HTML, then eex/erb isn't strictly necessary.

I'm optimistic about this, though, because my suspicion is that since this tool just exports React, you could relatively easily achieve this using Next.js SSG builds. As long as you aren't doing any build-time or runtime dynamic data loading, adding one more step lets you use this for that, with the bonus that if complexity goes up to the point where you would want to componentize, your tool is ready for it.


Pinegrow Web Editor and Bootstrap Studio could fit the bill. No subscription, no cloud, one-time purchase. Exported HTML is fully readable and editable outside the app.


> because my suspicion is that since this tool just exports React, you could relatively easily achieve this using Next.js SSG building

At the core, we generate pure HTML and CSS, then serialize those into React and Tailwind. It would be one less step to expose the HTML and CSS instead. I wanted a narrow scope to this so that's the focus but I imagine there's a plugin setup we can do to swap in what framework (or non-framework) you would need.


"the thing I actually want is just Webflow but without the BS and predatory pricing" - checkout Webstudio, it's free and open source - https://webstudio.is/


You can try auto-cms [1]; it works on plain HTML with light JS-enabled interactions.

[1] https://github.com/beenotung/auto-cms


I'm not super deep into web development, and since it doesn't have a demo or any other visual preview, I unfortunately don't understand what this does or how it could be useful, let alone how to "install" it for "my website".


but this is a tough problem with constant maintenance involved and you want it solved for free and handed to you on a silver platter

have you considered paying developers or supporting open source software? I doubt it.


I get where you're coming from. FWIW, at many points I considered just paying for Webflow, but their pricing is just nuts. I'm simply not going to pay ~$1k per year for them to host my static site with a couple thousand visitors per month.

If Webflow had a competitor with the same functionality and reasonable pricing, I'd gladly pay for that.


you are missing the point: if you don't have a business case for spending $1k/year then your problem is not worth solving


you have no idea what you're talking about, do you?


There's really not much maintenance involved in hosting simple static assets.

Once you build the pipeline, it's not that hard to maintain CDN releases.

GP was just voicing a wish that a certain piece of software or service as described without predatory pricing exists.

They never said they wanted it for free, you did.


Here's a feature request in case people from Anthropic are listening:

I'd like to dump our product knowledge base (emails, tickets, support articles, etc.) into a Project and then create an embeddable chat-bot from it that can handle customer support right from our website.

Most support requests we get are similar or equal to previous requests. It could help our support staff save a lot of time to focus on the requests that actually matter.


Leveraging thermodynamics to more efficiently compute second-order updates is certainly cool and worth exploring, however specifically in the context of deep learning I remain skeptical of its usefulness.

We already have very efficient second-order methods running on classical hardware [1] but they are basically not being used at all in practice, as they are outperformed by ADAM and other 1st-order methods. This is because optimizing highly nonlinear loss functions, such as the ones in deep learning models, only really works with very low learning rates, regardless of whether a 1st or a 2nd order method is used. So, comparatively speaking, a 2nd order method might give you a slightly better parameter update per step but at a more-than-slightly-higher cost, so most of the time it's simply not worth doing.

[1] https://andrew.gibiansky.com/blog/machine-learning/hessian-f...
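
For readers curious what these methods look like mechanically, here is a minimal numpy sketch of the Hessian-vector product trick at the core of Hessian-free optimization; the quadratic toy loss is made up, and real implementations use autodiff or the R-operator rather than finite differences:

    # Toy illustration of the Hessian-vector product (HVP) used by Hessian-free
    # methods: H @ v is approximated by a directional finite difference of the
    # gradient, so the full Hessian is never formed. The loss below is a
    # made-up quadratic purely for demonstration.
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    H_true = A.T @ A + np.eye(5)          # known Hessian, used only for checking
    b = rng.standard_normal(5)

    def loss(theta):
        return 0.5 * theta @ H_true @ theta - b @ theta

    def grad(theta):                      # analytic gradient of loss()
        return H_true @ theta - b

    def hvp(grad_fn, theta, v, eps=1e-6):
        # (grad(theta + eps*v) - grad(theta)) / eps  ~  H(theta) @ v
        return (grad_fn(theta + eps * v) - grad_fn(theta)) / eps

    theta = rng.standard_normal(5)
    v = rng.standard_normal(5)
    print(np.allclose(hvp(grad, theta, v), H_true @ v, atol=1e-4))  # True

    # A Hessian-free method then solves H d = -g for the update direction d
    # with conjugate gradient, using only hvp() calls.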


Agreed that it's very cool, and also about how hard it is to make second order methods worthwhile. We're just using such huge datasets that sometimes it's hard to even get a decent estimate of the gradient for a minibatch. Getting a useful estimate of second order information over the dataset is even harder, especially when the whole point of using minibatches is computational feasibility.


Those are valid points! Hessian-free (HF) optimization is a really nice method, but as you say it remains costly, so people don't use it. The key idea in this paper is that if you are able to solve linear systems faster by using an analog device, the cost of an HF-like method is brought down, so the method can become competitive.

About the noise, it is true that the second-order information will be noisier than the gradient for a given batch size (and a lot of results out there for HF optimization are with impractically large batch sizes). In the paper we use relatively small batch sizes (e.g. 32 for the fine-tuning example) and show that you can still get an advantage from second-order information. Of course it would be interesting to study in more detail how noisy 2nd order information can be, and on more datasets.


At the risk of missing something completely obvious: where's the "emulator" part in it? What is it emulating? It looks like a terminal (no emulator) to me. Not trolling, just trying to learn.



Thank you. I think it's a bit misleading then to call it a "terminal emulator". It's not really emulating one of those old boxes, is it? It's providing a text interface to a computer's functions. That's great but it's not emulating anything.


It emulates them in the sense that it implements a virtual display and keyboard for the same wire protocols those old boxes used. Those protocols are mostly plain text, but also escape sequences for cursors, colors, layout, etc. Check out the VT100 sequences for example.
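
For example, a few of the classic VT100/ANSI sequences (runnable in any modern terminal emulator; Python is used here just to emit the bytes):

    # A handful of the VT100/ANSI escape sequences a terminal emulator
    # interprets: the "emulation" is turning these byte sequences into cursor
    # movement, colors, and screen clearing, as the hardware terminals did.
    ESC = "\x1b"

    print(f"{ESC}[2J{ESC}[H", end="")                 # clear screen, cursor home
    print(f"{ESC}[1;31mbold red{ESC}[0m and normal")  # set color, then reset
    print(f"{ESC}[5;10Hprinted at row 5, column 10")  # absolute cursor positioning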


Like all similar software, it's emulating a physical terminal: https://en.m.wikipedia.org/wiki/Computer_terminal.


If you can get through this, I highly recommend it.

https://www.linusakesson.net/programming/tty


"Terminal" used to refer to a dumb computer that would send user input to a remote mainframe and receive instructions for drawing to the screen.


If you don't want to work hard (10h+ per day) at a startup, then don't. Nobody's forcing you. Working at a startup and finding it's not for you? Just quit! You can just walk out the door. Nobody's forcing you to stay.

Germany is really great at treating its citizens like infants, and it shows.


Don't play naive. The laws and regulations are not put in place for the fraction of business owners/managers who would not abuse their employees out of principle even when they could; they're for the other ones, who do.


There are actually employees a company is allowed to abuse more than their "regular" workers, namely managers.

If there was a legal category for "startup employee", with significantly fewer protections, but who can only be employed with a minimum wage of e.g. 3x the average income, would you object to that?


TBH that sounds like a nightmare to debug.

