I ran a cluster of ~30k blade-based computers booting entirely off iPXE. They didn't have any onboard SSD/disk storage or ECC memory. Every day, a few of them would randomly lock up; they'd reboot with a fresh network image and keep on humming.
Indeed. Sometimes, though, the machine wouldn't fully crash. It was as if the disk were corrupted, but apps were still running, which makes me suspect it was the lack of ECC.