Hacker News | bcantrill's comments

It took all of my self-control to not put "plan" in quotes in the submission: this is a sloppy, sorry excuse for a plan that involves somehow magically extracting a $40B bailout from Intel's customers, which Barrett confusingly names as "Nvidia [sic], Apple, Google." What do these three companies buy from Intel, exactly? (Ask Google about their affection for Intel after how Mt. Evans was treated!) Or is it that Barrett knows these aren't Intel customers, but that as TSMC customers they should be conscripted into bailing out Intel?

But wait, there's more: not only are "customers" shaken down to the tune of $5B apiece (you only need 8 of them!), Barrett's "plan" also calls for 50% (!!) tariffs on semiconductors. This plan is a joke, and an example of the toxic Intel exceptionalism that got them so deep into the mess they find themselves in.


So (foolishly?) I had never bothered to check Glassdoor, because we have been so upfront about our compensation.[0] But apparently that was a mistake! If it needs to be said, the Glassdoor numbers are comically wrong -- in fact, the high number in the "Engineer" range is quite a bit less than what we paid everyone six years ago!

[0] https://oxide.computer/blog/oxides-compensation-model-how-is...


That’s fair.

First, I'm going to make an argument and then immediately refute it, before someone else makes it: that $235K is still lower than what mid-level developers make at any of the BigTech companies.

Yes, that's true. But they are all toxic hellholes where everyone is jockeying for position, making sure they show "impact" that looks good on promo docs, and they all have RTO mandates, even for positions that were formerly "field by design".

$235K and the ability to work remotely is something I would definitely consider fair (and a little more than I make now that I'm outside of BigTech, working remotely), as long as you give cost-of-living increases; it's also more than most developers will ever make, inflation adjusted.

The other point you make is that performance evaluation is just a form of stack ranking, and even hard work is usually rewarded with a raise only 1-2% higher than what someone else gets. Why not separate it from comp?

I also like sales having variable compensation based on performance. I work as something of a post-sales architect, and I have more appreciation for the sales side than most engineers do.


Big tech companies have mostly figured out some kind of infinite money glitch. I don't think comparing compensation at them to startups makes a huge amount of sense. (It makes some sense from the perspective of how much money's coming into your bank account—just not a huge amount.)

Being a prominent face at a startup can also set you up for greater success in your career than being one of a hundred thousand at a bigco.

And as you touched on already, the environment at Oxide is a million times better than the toxicity and empire building that happens at bigcos.


Well, good news: we have one of those too![0]

[0] https://oxide.computer/blog/hubris-and-humility


Oh I am well aware. But I am hoping to run dynamic workloads, including virtual Linux machines, on a PC. It's a bit of a different world.

Latest one still in my to-read pile: https://lwn.net/Articles/1022920/


You want to run dynamic workloads on a PC? As in a desktop PC? That is clearly a completely different market than Oxide serves.

Or do you mean PC as in rack-mounted servers? If that's what you meant, PC is a very poor word for it. That's kind of the point Oxide made from the beginning. Why are you running server workloads on a PC with a funny shape? Why do you need 84 power supplies (2/shelf) in your rack? Why do you need any keyboard or graphics controllers? Why not design a purpose-built, rack-sized server?

Or did you mean exactly what you wrote: "a PC"? You only need one server, not a whole rack's worth? Again, that is not the market Oxide is targeting.

Or you need to be able to run "dynamic workloads" that could require 40-4000 CPUs? You need hypervisors and orchestration, etc.? And you don't want them to be Solaris, or to run on Solaris? And you know all about Hubris and you don't want that either? But you think it would be nice if they weren't Linux? Maybe if they were modern microkernels written in something like Rust? But not the Hubris microkernel written in Rust?

I'm going to have to take you at your word. Your needs are "a bit of a different world" than Oxide fits.

But it's pretty cool that you still got some friendly personal attention from two big-name Oxide employees who seem willing to try to help you if they can. If you ever do find yourself in a world that aligns with theirs, it appears that they are willing to try to accommodate you.


We're talking about healthy competition for Linux, Rusty microkernels, and I'm saying Hubris is not what I'm looking for, for the stated reasons: its workloads are defined at build time, and it does not target x86.

When I say PC, I mean the large ecosystem of compatible, performant hardware that exists out there, as opposed to e.g. RISC-V at this stage.


This "job" that you speak of, that you are so good at. Are you... at it now?


> This "job" that you speak of, that you are so good at. Are you... at it now?

Hello, Bryan! No, I'm on my own dime, no one else's. But it's decent of you to show such concern over my situation, in these uncertain times.


Check out Oxide and Friends[0]! We've been doing it for several years now, and it's a much more flexible format that allows the team to be heard in its own voice -- and allows us to weigh in on whatever's on our collective mind.

[0] https://oxide-and-friends.transistor.fm/


Because VCs need to have a way to sell their shares within the (limited) lifetime of their fund.


We didn't use anything from OCP. When we first started the company, we thought we might use the enclosure (and considered ourselves "OCP inspired"), but there ended up being little value in doing so (and there was a clear cost). And on the stuff that we really cared about (e.g., getting rid of the traditional BMC), we were completely at odds with OCP (where ASPEED BMCs abound!).

So in the end, even the mechanical and power didn't come from OCP. We clearly build on other components (we didn't build our own rectifiers!), but we absolutely built the machine from first principles.


My read on the situation is that you copied OCP only inasmuch as OCP made some observations about physics and you made the same observations.

The most obvious change is that you guys use half-width, full-length enclosures.

But I'm realizing now that I haven't looked at the OCP specs in a long, long time. I recall the common DC rail from some early Facebook papers on this topic, and I thought they were pretty similar to how you plug into power.

I just looked at the OCP power connectors, and I would have lost any bet anyone was willing to make about what they looked like. That's not at all where I thought we were. I think I understand now why you guys went to such pains getting the keyed connectors to work exactly right. Their power connectors look like something from a sci-fi movie, and not in a good way.


How does the Oxide 48V architecture compare to what is known about the Google/OCP architecture? Does Oxide use single stage conversion, intermediate 12V buses, or ??


I'm not someone who works on this part of the product, but we talk a little about this stuff here:

* https://oxide.computer/blog/how-oxide-cuts-data-center-power...

* https://docs.oxide.computer/guides/introduction

I feel like we had a good Oxide and Friends on this too... https://oxide-and-friends.transistor.fm/episodes/bringing-up... has some info about our power setup.

Anyway, I barely know anything about this topic, but I think this answers your immediate question: we convert AC -> DC once at the rack level, and then use a bus bar to distribute that to each sled. Each sled then has a converter that brings that 54V down to 12V for its own internal bus.


Probably the sweet spot, since you can get to market fast with known 12V designs, but still enjoy the possibility of later announcing you've made the sled even more efficient by getting rid of the intermediate voltage!


At a certain point in EE power design you don't really want to go from 54V -> point of load for every rail (1.8V, 1.1V, 0.9V, SVI3 rails, etc.), so sticking with an intermediate voltage often makes sense even from an efficiency perspective. Voltages such as 54V carry different creepage and clearance requirements, so saddling every point-of-load regulator (of which we have many, many!) with those requirements is often detrimental to an already complex board layout. Picking something like 12V or 24V as an intermediate voltage helps balance those requirements against the amount of copper you need for power delivery, since the parts use low voltages but are extremely power hungry, so the current at the point-of-load rail is substantial. This also means that your point-of-load regulators have to be distributed around the board near their loads; otherwise the copper losses and noise would become problematic.
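
To put rough numbers on why those low-voltage rails are so current-hungry, here's a back-of-envelope sketch; the 400W rail power below is an illustrative assumption, not an actual Gimlet figure:

    # Illustrative only: current drawn by one hungry rail at different voltages.
    P_RAIL_W = 400.0          # assumed power for a single big SoC rail (not an Oxide number)

    for v in (0.9, 1.8, 12.0, 54.0):
        i = P_RAIL_W / v      # I = P / V
        print(f"{v:5.1f} V -> {i:7.1f} A")

At 0.9V that's roughly 444A, versus about 33A at 12V and around 7A at 54V, which is why the point-of-load regulators have to sit right next to their loads while the intermediate bus keeps the board copper manageable.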


Thermal dissipation is a function of the current squared, and the temperature the conductor reaches is a function of the conductor's size and the surface area available to dissipate that heat. So the high-current common-rail systems you can sometimes see in YouTube videos about power distribution have great honking bars of copper in them. And in most of the videos I've seen, the video is about someone screwing one of these up, damaging the bar, and now the electrician has to wait for a new one to arrive, because they are shipped from far away, they are expensive per pound, and the dumb things weigh many pounds.

So you don't actually want the power to be at 12V for very long in a power-dense rack. Their spec sheet says that each rack can pull 15 kW, wired for 208V or 3-phase power. That's 10 hair dryers of power per rack, so yeah, maybe you shouldn't step it down until the last responsible moment.
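
To make the I^2 * R point concrete, a back-of-envelope sketch: the 15 kW figure is from the spec sheet mentioned above, but the busbar resistance is a made-up round number purely to show the ratio.

    # Illustrative only: conduction loss in a distribution bus scales with I^2.
    P_RACK_W = 15_000.0
    R_BUS_OHM = 0.001                 # assumed 1 milliohm of distribution resistance

    for v in (12.0, 54.0):
        i = P_RACK_W / v              # total bus current at this voltage
        loss = i * i * R_BUS_OHM      # P_loss = I^2 * R
        print(f"{v:4.1f} V: {i:6.1f} A, {loss:6.1f} W lost in the bus")

Same 15 kW, but roughly 20x less conduction loss at 54V ((54/12)^2 is about 20), which is the argument for keeping the busbar voltage high and stepping down as late as possible.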

Do any parts of the rack run at the full 54V? That would make for some very nice cooling fans.


In Oxide's design, we do have a 54V DC busbar: that's what the rectifiers put out, and it runs vertically up the back of the rack. The power connection into each of the cubbies for the sleds, and the power into the sidecar switches, connect to this bus bar at 54V. Each of these assemblies has an intermediate bus converter (IBC) that does the 54V -> 12V conversion on board, and 12V and other, lower rails are used for the various supplies required.

We do run the 54V to our fans (in both sleds and switches) without additional DC-DC conversion as those can be fairly power hungry and we can buy reliable fans that are rated for this voltage.

Our sleds actually don't connect directly to the bus bar, to help mitigate some of the "oops" factor, as they're potentially going to be mated and unmated by customers as they reconfigure, upgrade, or support things. The sled cubbies are wired to the bus bar and support hot insertion of the sleds. And yes, while possible, an in-field replacement of the bus bar wouldn't be fun, but in our design it's a big copper bar hidden away, so the risk of damage or of dropping stuff into it is minimal.

So our 12V IBC design gets us into a more normal range for commodity point-of-load supplies, and balances the losses from higher current at 12V against the complexity of dealing with 54V all over the boards. For the AMD parts, we also have to have supplies that deal with SVI2 or SVI3, where the part itself can adjust its voltage at run time for efficiency. These are pretty complicated devices (like the RAA228218) that we're happy to not have to design ourselves, and they have expected operating envelopes for their supply-side rails that don't work at 54V.
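
To sketch the tradeoff being described here, the per-stage efficiencies below are assumptions in a plausible range, not measured Oxide figures:

    # Illustrative only: overall efficiency is the product of the conversion stages.
    eta_rectifier = 0.975     # AC -> 54V DC at the rack (assumed)
    eta_ibc       = 0.97      # 54V -> 12V intermediate bus converter (assumed)
    eta_pol       = 0.90      # 12V -> point-of-load, e.g. ~1V core rail (assumed)

    overall = eta_rectifier * eta_ibc * eta_pol
    print(f"end-to-end efficiency ~ {overall:.1%}")   # ~85%

Dropping the IBC only pays off if a direct 54V -> point-of-load stage can beat eta_ibc * eta_pol (roughly 87% with these assumed numbers) across every rail, which is the question the following comments get into.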


One of the things I recall from automotive discussions of the switch to 48V systems as a route to hybrid vehicles is that the coils in motors and generators can be smaller when the voltage is higher. An example they gave is that the alternator could drop about a pound of copper out of its windings. Do the 54V fans have smaller hubs?


Does Oxide require 3-phase power? Do datacenters typically provide 3-phase power?


Their spec sheet shows 208V and 3-phase as options. 3-phase means smaller wiring, and at 15 kW per rack I could see how that would quickly become a problem.
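
Roughly, using the 15 kW figure above (the power factor is an assumed value):

    # Illustrative only: feed current for a 15 kW rack at 208V, single- vs three-phase.
    import math

    P_W = 15_000.0
    PF = 0.95                                        # assumed power factor

    i_single = P_W / (208.0 * PF)                    # single-phase 208V feed
    i_three  = P_W / (math.sqrt(3) * 208.0 * PF)     # 3-phase 208V line current

    print(f"single-phase: {i_single:5.1f} A per conductor")   # ~76 A
    print(f"three-phase:  {i_three:5.1f} A per phase")        # ~44 A

Spreading the load across three phases keeps each conductor (and breaker) at a much more manageable current, hence the smaller wiring.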


It's certainly the current mainstream style to have an intermediate voltage rail of 12V or more. But this OCP talk from a few years ago was interesting, showing a prototype direct 48V-to-1V conversion with high efficiency.

https://www.youtube.com/watch?v=JQHiKIfrwI0


Yes, it certainly can be done, but there's a cost and design complexity to doing that too. I did a quick count of Gimlet's (our server sled's) power rails and got to over 26 different power domains, and I probably missed a few in my quick scan! It's unclear to me whether the efficiency gains from re-doing these with something more exotic to go from 54V would make enough of a difference to justify doing so, and we'd still end up with some stuff like the SVI2/3 controllers needing an intermediate rail (or we'd have to go design those ourselves too), and some analog rails needing LDOs for noise-rejection reasons, etc. As mentioned before, creepage and clearance at higher voltages cascades into layout complexity and pain if you have to run it everywhere on the board; but for the same reasons we're talking about this, we can't very well do a 54V:1V conversion in the back of the sled and run it all the way to the front: losses, noise, etc.

As with all things in engineering, it's a series of tradeoffs, and right now an IBC from 54V to 12V has been a reasonable design point for us.


Ah OK, thanks for clarifying! I'm glad to see these kinds of machines being built, especially with so much open source software


Just because it's hopelessly on-brand for us to offer up a podcast episode for everything, you may also be interested in our Oxide and Friends episode on RFDs with our colleagues Robert Mustacchi, David Crespo, Ben Leonard, and Augustus Mayo.[0]

[0] https://oxide-and-friends.transistor.fm/episodes/rfds-the-ba...


I need to dive back in. Lots of distractions recently.


Bryan, you absolute legend. You give the best technical seminars I've ever watched (and rewatched countless times). Thank you for inspiring a generation of engineers. Best of luck with everything at Oxide!


What was it?


This is hot! I -- like maybe everyone at Sun in the late 1990s and early 2000s? -- had a soft spot for SunRay. The original SunRay demo from Duane Northcutt to the Solaris Kernel Group in February 1999 (when it was a Sun Labs project code-named Corona) was just... jaw-dropping. Later, it was a point of personal pride that one of the first, concrete, production use-cases for DTrace came on a SunRay server (an experience that we outlined in §9 of our USENIX paper[0]). I'll always be sentimental about SunRay -- and Sun's misexecution with respect to SunRay was a lingering disappointment for many of us.

[0] https://www.usenix.org/legacy/publications/library/proceedin...


For an extended, colorful telling of the experience outlined in that paper, check out Bryan's earliest recorded talk on DTrace, particularly the section starting at the timestamp in this link: https://www.youtube.com/watch?v=TgmA48fILq8&t=35m6s


Oracle killed it and we all moved to Windows PCs.


Long before Oracle killed it, Sun fumbled it, sadly. The failure of SunRay to live up to its potential -- and it clearly had tremendous potential -- was Sun at its most frustrating: the company tended to become disinterested in things at exactly the moment that really called for focus.

As a concrete example, the failure to add USB printing support killed SunRay at airline kiosks in the early 2000s. American Airlines was the first airline to adopt kiosk-based check-in; they were very hot on SunRay, but needed USB printing. When American found out that Sun had just gutted the team (including everyone responsible for USB support!), they (reluctantly!) used Windows-based PCs instead. Sun tried to put the group back together, but it was too late -- and every airline followed American's lead.

Could/would SunRay have been used for airline kiosks? There are reasons to believe that it would have -- and it was certainly a better technical fit than an entire Windows PC.

There were examples like this all over the place, not just with SunRay but at Sun more broadly; despite the terrific building blocks, Sun often lacked the patience and focus to add the polish needed for a real product. (Our frustration with Sun in this regard led us to start Fishworks in 2006.[0])

RIP SunRay -- and what could have been!

[0] https://bcantrill.dtrace.org/2008/11/10/fishworks-now-it-can...


What would USB printing on a SunRay look like? Even in general, how do thin clients work with accessories and the like? It does feel like there's some tension between "this USB device is plugged directly into a computer" and "the computer is not 'the' computer"?

Seems like a tricky problem, but clearly at least some of it was solved, given that USB ports were on the machine.


What's tricky about it? You can use USB over IP on Linux in the real world today, so I can't see any reason why you couldn't pass devices from the thin client to the server in the same way. The only slight oddity is that you would pretend that it was a USB hub that got plugged and unplugged whenever the session connected and disconnected.


Well, GP mentioned that there _wasn't_ USB printing support, right? If USB over IP "just worked" and the SunRay had USB-over-IP support, then there's USB printing support, right?

But I just realized the "USB printing support" stuff was maybe less about USB printers themselves and more about being able to have, say, 30 thin clients with 30 different printers hooked up, with the application knowing which printer was the right one (instead of showing 30 printers available, for example).


> Long before Oracle killed it, Sun fumbled it,

Oracle did try to monetize SunRay but for whatever reason it didn't meet their profit threshold. It was fantastic technology and I'm almost certain I still have the dual monitor variation in my basement somewhere.


The point behind SunRay was to give organizations 100% control (not 99.9%, 100%) over everything any of their computers were used for, like a mainframe. Of course, this made them 99.9% unusable: they sucked at what they were meant for, not because of Sun, but because cheapass management under-provisioned the mainframes, and the terminals just couldn't do anything else. In practice, this meant they were unable-to-scroll-one-page-down-in-a-list-in-a-minute levels of unusable.

Occasionally you'd find one where the security was about as well executed as the function they were meant for, and there was some fun to be had, but not much.

I find it hard to have much sympathy for SunRay. Their advantages were supposed to be price (but they were never cheap) and security (but that required hiring engineers who understood mainframe Unix security, which management just didn't do).


Thanks for the kind words (I think?), and you're right that I don't use LLMs to write. (I do use them to read near-final drafts -- the most substantive result in this case was the deletion of a paragraph.) As for the substance abuse allegations: sorry to be boring, but just coffee and Diet Coke, I'm afraid.


Sorry about my comment, it didn't add anything to the conversation.

