
Sucks that there's no ECC-RAM model. A phone-sized x86 slab, as opposed to those impractical mini-PC/Mini-Mac boxes, that one could carry around and connect to a powerbank of similar size, and/or various types of screens (including a smartphone itself), would make for a great ultramobile setup.


Odroid H4 family (H4, H4 Plus, H4 Ultra) supports in-band ECC, which supports one-bit error correction and two-bit error detection. And the 8-core model is just $220 (+case, +heatsink/fan, +shipping, but oh well)


Is the kernel support for those still awful or has it gotten better? It's been a long time since I had an Odroid... C1 I think


The Odroid H4 is an amd64 board like the N150 NUC. So the kernel should be standard amd64.


It has UEFI and runs mainline Linux, it's an Intel reference design for Alder Lake-N with a couple additions (SATA controller is 3rd party for example)


If you want a relatively small low-power box with ECC, check out the Asustor AS6804T. It is nominally a NAS but really you can use it for anything you want, it is just an x86-64 server with some disk bays. You also get nice 2x10GbE, which is rare with these mini PCs


But the price of that is $1200, which is about 5 times the price of the average N150 mini PC.


Minisforum N5 Pro AI NAS isn't substantially cheaper but the performance matches the price premium over a potato PC. Tho DDR5 ECC SO-DIMMs are obscenely expensive right now.


If it had a few more cores, something like this would make for a great node in a distributed system like k8s or ceph for a homelab. At the asking price, however, one could also cross-shop an HP MicroServer Gen11.


Odroid H4 Ultra? It has 8 Gracemont cores that can stay boosted for quite a long time, and supports in-band ECC. 4x SATA too for those who care.


I like to pretend options without ECC simply do not exist. (i.e. as it should be)

It shortens the list of options, making choices much easier.


Bring back the Intel Compute Stick? https://liliputing.com/this-cheap-intel-n150-mini-pc-is-smal...

Arm RK3399 SoC is blob free and some (Pinephone Pro, N4S, Chrome tablet) devices are small enough for sidecar usage.


How many times do you think ECC RAM has caught an error? Online anecdotes I've found indicate almost no one experiences regularly corrected errors that weren't due to imminently failing hardware.


I've managed a couple thousand servers with ECC. The vast majority had zero reported errors the whole life. Of those that reported errors, there were a few categories:

Some reported a couple errors a day for months (maybe years?) but worked fine.

Some ramped up error counts over hours or days.

Some went from zero to lots in one step.

A few managed to hit uncorrectable errors; sometimes just once.

A small number of correctable errors (< 10/day) needed no action, and neither did a single uncorrectable, though that kind of failure is what drives people without ECC crazy; some of the machines that hit an uncorrectable only did it once and were fine. For the others, we'd replace the RAM. Machines with a small number of daily errors or a single uncorrectable were less common than the ones that got their RAM swapped. I don't know for sure whether uncorrectables correlated with many correctable errors, because correctable errors were only reported hourly: after a step change to bad RAM, the machine would likely halt before the next reporting interval, so no report. Unless the correctables were several a second, the impact of corrections isn't obvious.
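
The rule of thumb in this comment can be written down as a small sketch. This is one reading of the comment, not a tool anyone in the thread describes: the function name `triage` and its two inputs are hypothetical, the 10/day threshold is taken from the comment, and how to treat a machine with a single uncorrectable plus many correctables is a guess.

```python
def triage(correctable_per_day: float, uncorrectable_total: int) -> str:
    """Return 'keep' or 'replace' for a machine's RAM.

    Thresholds follow the comment above: fewer than 10 corrected
    errors a day, or a single uncorrectable error, needed no action.
    """
    if uncorrectable_total > 1:
        return "replace"   # repeated uncorrectable errors
    if correctable_per_day >= 10:
        return "replace"   # error rate is high or ramping up
    return "keep"          # stable enough to leave in the rack


print(triage(correctable_per_day=2, uncorrectable_total=0))
print(triage(correctable_per_day=500, uncorrectable_total=0))
```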


> For a small number of correctable errors (< 10/day), there was no action needed,

Those should've been replaced, so in other words ECC is just a crutch. All the RAM problems I've had were found by Memtest86.


Why replace when the system is stable? I guess there may be an increased chance of multibit errors. But sometimes new RAM is flaky, or disturbing the rack causes other problems.

Is ECC a crutch? Sure. But it's hard to walk with a bum leg/bad ram, so why not have it? (Cause it's expensive is a fine reason, but if it were closer to 25% more than 100% more, it'd be easier to say yes)

Memtest86 is great, but systems change and most people aren't running memtest frequently. On my non-ECC systems, I run it during setup to make sure things are good, and only later if things get crashy... but if things get crashy because of bad RAM, my data may already be corrupted.


Fun fact: DDR6 contains built-in ECC by default. RAM sizes are getting so large it's causing issues in the field and also issues with yields.

So, the industry thinks it's a problem.


DDR5 has built in ECC too. Unfortunately, AFAIK there's no error reporting mechanism, so while it should reduce error rates, it likely increases error severity. Assuming no bitflips between the ram module and the cpu, ECC on the ram corrects any single bitflips, but multiple flips are uncorrectable and must pass through, so any incorrect value the cpu gets has multiple bitflips.
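
The "fewer but worse errors" argument can be shown with a toy model. The sketch below uses a plain Hamming(7,4) code as a stand-in for a correct-only ECC with no reporting; real on-die ECC uses much wider words, and the names `encode74` and `read_back` are invented for this example.

```python
from itertools import combinations


def encode74(nibble: int) -> list[int]:
    """Hamming(7,4) codeword; index 0 is unused, positions 1..7 are used."""
    bits = [0] * 8
    bits[3], bits[5], bits[6], bits[7] = [(nibble >> i) & 1 for i in range(4)]
    bits[1] = bits[3] ^ bits[5] ^ bits[7]
    bits[2] = bits[3] ^ bits[6] ^ bits[7]
    bits[4] = bits[5] ^ bits[6] ^ bits[7]
    return bits


def read_back(stored: list[int]) -> int:
    """Correct-only read: silently flip whatever bit the syndrome names.

    There is no status output, mirroring a memory that fixes what it
    can and tells nobody.
    """
    bits = list(stored)
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos]:
            syndrome ^= pos
    if syndrome:
        bits[syndrome] ^= 1
    return bits[3] | (bits[5] << 1) | (bits[6] << 2) | (bits[7] << 3)


value = 0b0110
single_ok = all(
    read_back([b ^ (i == p) for i, b in enumerate(encode74(value))]) == value
    for p in range(1, 8)
)
double_bad = all(
    read_back([b ^ (i in pair) for i, b in enumerate(encode74(value))]) != value
    for pair in combinations(range(1, 8), 2)
)
print("every single flip repaired:", single_ok)
print("every double flip returns wrong data:", double_bad)
```

In this toy code every single flip is hidden from the reader and every double flip comes back as a wrong value with no warning, which is the scenario the comment is worried about.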


In other words, the industry has gone to shit as usual, starting with rowhammer.

But my question still stands.


> imminently failing hardware

Are you under the impression that ECC is for catching software issues? This is precisely what I want ECC for: to let me know a stick of RAM is failing on me before I let it silently corrupt my fucking data for months on end until it completely dies.


I feel like userbinator is expecting that a failing stick will go from working to failing so hard you'd notice, with or without ECC; so the corruption would be time limited. My experience with ECC suggests that many, maybe most of the failing sticks probably would fit that, but some of the failing devices only threw a few errors a day for months and we continued to use them until retirement; because replacement is intrusive and a few corrected errors a day didn't hurt anything... had a non-ECC stick failed in the same way, chances are you wouldn't notice in a timely fashion.

That said, I don't run ECC in my home. I'm not willing to spend the premium in dollars, performance, or time to do it. My storage servers are all ex-desktops and I try to chase performance in a budget, ECC ram usually doesn't run at high speed and it often costs at least twice as much... that doesn't make sense for a desktop, so my servers suffer too.



