The source of the e1000e corruption bug (2008) (lwn.net)
114 points by davegauer on July 24, 2020 | 31 comments



> But the other one was just as important: the e1000e driver should never have left its hardware configured in a mode where a single stray write could turn it into a brick.

... because, like, it's totally okay for the hardware to be bricked by a single stray write?

Make like Microsoft and blame the driver. :)

It should be a basic manufacturing test to repeatedly write pseudo-random garbage to the entire I/O register space, and check that the hardware never gets into a state that is not recoverable by either a simple power cycle, or, failing that, a factory reset.
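A rough sketch of what such a test could look like, run against a toy software model rather than real hardware. Everything here is hypothetical: `ToyDevice`, its register layout, and the "register 7 commits to persistent config" behaviour are made up for illustration and do not describe the e1000e.

```python
import random


class ToyDevice:
    """Hypothetical device: 16 volatile registers plus one persistent config byte."""

    NUM_REGS = 16
    NVM_WRITE_REG = 7    # made-up register whose value is committed to persistent storage
    GOOD_CONFIG = 0xA5   # made-up "factory" value the device needs in order to work

    def __init__(self, nvm_locked):
        self.nvm_locked = nvm_locked
        self.nvm = self.GOOD_CONFIG  # survives power cycles
        self.power_cycle()

    def power_cycle(self):
        # Volatile registers reset; the persistent config byte does not.
        self.regs = [0] * self.NUM_REGS

    def write(self, reg, value):
        self.regs[reg] = value
        if reg == self.NVM_WRITE_REG and not self.nvm_locked:
            self.nvm = value  # a stray write lands in persistent config

    def is_alive(self):
        return self.nvm == self.GOOD_CONFIG


def first_unrecoverable_write(device, max_writes, seed):
    """Fuzz the register space; return the count of writes after which a
    power cycle no longer recovers the device, or None if it always recovers."""
    rng = random.Random(seed)
    for n in range(1, max_writes + 1):
        device.write(rng.randrange(device.NUM_REGS), rng.randrange(256))
        device.power_cycle()
        if not device.is_alive():
            return n
    return None
```

With the persistent storage locked, `first_unrecoverable_write(ToyDevice(nvm_locked=True), 10000, seed=1)` returns `None`; with it unlocked, the fuzzer finds a bricking write almost immediately.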


A long time ago in college I was in a silly play (as an actor) and also helping build a robot for the performance: I did the software, and a friend of mine did the hardware (and notably, he was like "a god of hardware"). One day, the robot caught on fire... like smoke seriously started coming out of it and we had to quickly turn it off and the logic board was charred.

My friend, who of course knew his hardware was perfect, immediately looked at me and was all "what did you do?!" and I was super confused as all I had done was write some code to let us control it remotely with a game controller. We go through my code, and he finds a place where I set two variables to true at the same time, and he explains to me that by doing that I shorted the robot's power transformer and made it catch on fire... :/.

What I am mostly known for these days is working on "jailbreaking" of iOS and Android devices. I can tell you that many early Android devices were easily bricked by just flashing a bad restore image from restore mode after having flashed a bad normal image. iOS is super resilient, and we have only very seldom managed to truly brick a device... one story, though, is quite topical: pod2g was demonstrating a new fuzzer he had written to do just what you describe to one of the NAND controllers to look for exploitable bugs... but on stage during the demo it managed to find just the right sequence of register changes to brick his demo device ;P.

Honestly, given the mental model of most hardware developers and how much trust is put in drivers, I'd just be glad that the e1000e merely bricked itself instead of burning your entire house down ;P.


Driving all FETs in an H-bridge with a microcontroller to save parts costs.

pop smoke

Didn't save part costs.


You jest, but there's truth here: it's disingenuous (as I suspect the GP knows) for GP's friend to claim that the hardware is perfect when in fact it's missing a simple protection against software bugs causing it to catch fire.
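For what it's worth, the protection in question amounts to an interlock: never allow both switches on one leg of the bridge to be on at once. Real designs usually do this in hardware (gate logic or a driver IC with dead time); the sketch below only shows the same rule as a software guard. `HBridgeLeg` and its fields are hypothetical names, not anything from the robot in the story.

```python
class HBridgeLeg:
    """One leg of an H-bridge: a high-side and a low-side switch."""

    def __init__(self):
        self.high_on = False
        self.low_on = False

    def set(self, high, low):
        # Both switches on at once shorts the supply rail to ground
        # ("shoot-through"), so refuse the request and keep the old state.
        if high and low:
            raise ValueError("shoot-through: both switches of one leg requested on")
        self.high_on, self.low_on = high, low
```

A buggy caller that sets both "variables" to true now gets an exception instead of smoke.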


If I had a dollar for every time a silicon or analog designer said "well don't do that" when I identify a combination of register writes that will damage the hardware...


Oh no, I'm not joking at all. I did just this a few times: I also wrote the firmware which blew the fragile hardware up.


> What I am mostly known for these days is working on "jailbreaking" of iOS and Android devices.

*Looks at username* Oh hey, thanks for what you've done!


Androids based on the MTK platform are preferred by many in the modding scene because they are relatively easy to recover -- they have no boot loader lock by default, and plugging into USB while holding down a button will enter a recovery mode whose code is in actual ROM in the SoC.


While the rest of the world hates MTK devices because I've yet to find a single one that can go a week without having to have airplane mode toggled to resolve some weird radio/WiFi/Bluetooth bug...


It’s the same kind of people who “blame the user”. Only here other developers are the “user”.

It’s what made many many people afraid of computers and software.

It’s the same logic as having a TV catch fire when switching channels too fast.

The creator should apologize, thank the person who found the issue and fix it.

Things should not fatally break when interacted/operated with the publicly available controls. Worst case, just shut down.


Unfortunately people are often proud creatures and we often have emotional attachments to the stuff we build. So it can be hard for some people to take constructive feedback without getting personally offended.


also, the paranoia of writing regulator drivers :)


> write pseudo-random garbage to the entire I/O register space, and check that the hardware never gets into a state that is not recoverable by either a simple power cycle, or, failing that, a factory reset.

Anything with fuses or a PROM is inherently vulnerable to this problem. If the EEPROM is for user configuration, I agree, it should always be recoverable from total corruption. Many buggy UEFI firmware implementations are anti-patterns - the BIOS can be bricked by writing or deleting UEFI variables - they should not exist.

> It's totally okay for the hardware to be bricked by a single stray write?

Yes, it is.

Not for user-visible configuration, of course, but unfortunately it's always true for low-level hardware configuration. You overlooked an important issue here - many devices have EEPROM for OEM configuration, not user configuration. The EEPROM stores low-level, critical configuration, such as I/O voltage, clock frequency, device serial numbers, etc. and by definition, a misconfiguration is often unrecoverable.

Toggle the wrong fuse bit, and you're dead. Any embedded programmer can recall the moment when they wrote the wrong data to the One-Time Programmable ROM, or accidentally disabled the main crystal oscillator, or, in a somewhat amusing case (saw it on a mailing list), turned on the chip's Secure Boot feature without writing a public key.

Given an infinite number of random inputs, all OEM EEPROM settings will be corrupted and brick your hardware, inevitably. These OEM EEPROMs are exposed to the external world, just like normal registers. Ultimately, it's the responsibility of the device driver authors to protect them from modifications. Certainly, a hardware manufacturer should also take some responsibility and make such corruptions a bit harder in the field - for example, many hardware provides an explicit lock/unlock feature for protecting low-level configurations and registers. But it's still the responsibility of the device driver to lock them down.

In this case, Intel indeed provides an option to lock down the content of the EEPROM according to the article.

> The good news is that both bugs have been fixed. The e1000e hardware was locked down before 2.6.27 was released

But in previous versions of the Linux kernel, the lock was unused and the EEPROM was left open. In this case, I'd say the hardware manufacturer didn't do anything wrong, it's the negligence of driver developers that allowed it to occur. Just like the article's conclusion,

> the e1000e driver should never have left its hardware configured in a mode where a single stray write could turn it into a brick.

If the hardware only supports one possible mode, where a single stray write could turn it into a brick, then it's the fault of the hardware. But it's not the case here.

The only potential misstep of the hardware I can think of is leaving the EEPROM in a writable state as its power-on default, instead of requiring an explicit unlock sequence. But I haven't checked the code or datasheets in question, so I can't tell whether it's the case. But still, RW-by-default, RO-by-request is common in many hardware devices (also software), and it's difficult to say it's a fault.

----

Update 1: I examined the e1000e driver from 2008 [0] in question. The situation is a bit different than I previously thought. According to the fix [1], the actual reason behind EEPROM corruption was not simply writing to the I/O memory in itself, but the race condition it creates.

> The EEPROM corruption is triggered by concurrent access of the EEPROM read/write. Putting a lock around it solve the problem.

In other words, if the commit description was correct, in the majority of cases the EEPROM was corrupted not because of the write itself, but because of an inadvertent write in the middle of an EEPROM read, and the solution was to take a spinlock before performing an EEPROM read/write.

If this is the failure mechanism, I'd say both Intel and the device driver are innocent, it's just a side-effect of an unforeseeable accident.
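To make the race concrete, here is a toy model, not the e1000e register protocol. It assumes an EEPROM that is accessed in two steps (select an address, then transfer data) through one shared address latch; `ToyEeprom` and its method names are hypothetical. The lock plays the role of the spinlock in the fix: it keeps the two steps of one access from interleaving with another access.

```python
import threading


class ToyEeprom:
    """Hypothetical EEPROM with a single shared address latch."""

    def __init__(self, size=8):
        self.cells = [0xFF] * size
        self.addr = 0
        self.lock = threading.Lock()

    # Raw two-step primitives. Interleaving these between two callers is the bug.
    def _select(self, addr):
        self.addr = addr

    def _store(self, value):
        self.cells[self.addr] = value

    def _load(self):
        return self.cells[self.addr]

    # Locked accessors: select+transfer happens as one unit.
    def read(self, addr):
        with self.lock:
            self._select(addr)
            return self._load()

    def write(self, addr, value):
        with self.lock:
            self._select(addr)
            self._store(value)


# Hand-interleaved version of the race, using the raw primitives:
racy = ToyEeprom()
racy._select(5)      # writer picks address 5 ...
racy._select(0)      # ... a reader sneaks in and picks address 0 ...
racy._store(0x42)    # ... and the writer's data lands in cell 0 instead of 5
```

After the interleaved sequence, cell 0 holds `0x42` and cell 5 is untouched, which is the kind of silent corruption a lock around the whole access prevents.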

----

Update 2: Further analysis of the EEPROM read/write sequence in e1000e.

Upon device initialization, all registers and the EEPROM are mapped into the I/O memory by ioremap(). To write data into the EEPROM, four different methods are used depending on the hardware type: e1000_write_eeprom_eewr(), e1000_write_eeprom_ich8(), e1000_write_eeprom_microwire(), and e1000_write_eeprom_spi() [2].

Before writing to the ich8, microwire or spi variants, e1000_acquire_eeprom() must be called to explicitly unlock EEPROM access. And interestingly, in ich8, the data is first written to a shadow RAM before it's checksummed and committed to EEPROM - a good practice and robust design. Also, after the write is completed, access is revoked by the device driver.
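The shadow-RAM idea can be sketched roughly like this. It is an illustration under assumptions, not the driver's code: `ShadowedEeprom` is a hypothetical class, the checksum is a plain 16-bit sum stored in the last word, and the target constant 0xBABA is what I recall the e1000 NVM checksum convention to be, so treat it as an assumption. Refusing to commit on a bad checksum is a choice made for this sketch.

```python
CHECKSUM_TARGET = 0xBABA  # assumed target: all 16-bit words must sum to this


class ShadowedEeprom:
    """Writes go to a RAM copy; the EEPROM only changes on a valid commit."""

    def __init__(self, words):
        self.eeprom = list(words)  # the persistent contents
        self.shadow = list(words)  # the RAM copy that writes actually touch

    def write_shadow(self, index, value):
        self.shadow[index] = value & 0xFFFF

    def update_checksum(self):
        # The last word is chosen so that the whole image sums to the target.
        body = sum(self.shadow[:-1]) & 0xFFFF
        self.shadow[-1] = (CHECKSUM_TARGET - body) & 0xFFFF

    def commit(self):
        # A stray shadow write without a matching checksum never reaches EEPROM.
        if sum(self.shadow) & 0xFFFF != CHECKSUM_TARGET:
            return False
        self.eeprom = list(self.shadow)
        return True
```

The point of the design is that a single stray write only dirties the RAM copy; reaching the persistent image takes a second, deliberate step.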

In other words, both Intel and the device driver are careful enough; most of the time, there's nothing wrong with the device driver.

Unfortunately, for Intel 82573, the only way to write the EEPROM is by e1000_write_eeprom_eewr(), which writes straight into the EEPROM from the EEWR register, without any initialization sequence.

Verdict:

* The e1000 driver in Linux (2008) was well-written and contained no mistakes. My initial assumption was incorrect.

* On other e1000e devices, Intel offered reasonably robust protection against inadvertent writes. I'm not sure if these devices were also the victims, but if so, and if the commit message was correct (it's the read/inadvertent write race condition that ultimately corrupts the EEPROM), then neither the device driver nor Intel was at fault, and bricking was purely an unforeseeable incident.

* On Intel 82573, Intel's design flaw of allowing an EEPROM write without any initialization sequences is directly responsible for bricking the hardware.

[0] https://github.com/torvalds/linux/blob/78566fecbb12a7616ae9a...

[1] https://github.com/torvalds/linux/commit/78566fecbb12a7616ae...

[2] https://github.com/torvalds/linux/blob/78566fecbb12a7616ae9a...


Update 3: There are two EEPROM corruption bugs in the e1000/e1000e driver, both fixed in 2.6.27, and I was chasing the wrong one. The correct fix commit should be [3].

Still, the problem here is that the EEPROM of Intel 82571/82573 was writable directly via EEWR [4] without an initialization sequence. But according to the fix [3], there's indeed a hardware option to lock the EEPROM from unwanted writes.

Corrected verdict:

* 30% responsibility - Device driver not locking the EEPROM.

* 70% responsibility - Intel's EEWR that writes directly into EEPROM.

[3] https://github.com/torvalds/linux/commit/4a7703582836f55a1cb...

[4] https://github.com/torvalds/linux/blob/4a7703582836f55a1cbad...


> These OEM EEPROMs are exposed to the external world, just like normal registers.

> many hardware provides an explicit lock/unlock feature for protecting low-level configurations and registers

Is this really enough? It seems that devices which will eventually be connected to another system outside the purview of the manufacturer should have such registers isolated to as great an extent as possible; if these are only ever touched once during manufacture, they should only be accessible via e.g. JTAG; if the manufacturer sometimes pushes out updates which touch them, they should be on a different PCI(e) BAR which is only mapped into memory for the purpose of an update.

In this case it looks like the driver developers do share some of the blame, but it seems that such systems should be resilient enough to accept a lot more interference (I'm reminded of the FCC regulation, although it doesn't quite fit here since we don't want "undesired behaviour") on interfaces over which they have little to no control; what happens when a bit-flip results in the wrong MMIO address being written to?


I remember a (Raymond Chen?) blog post about network adapters that passed QA testing but would immediately crash when faced with the horrors of the real world network traffic from Microsoft's corporate LAN. :)


Another problem is the lack of a viable way to do a power cycle for PCIe/USB devices. This clears up a lot of errors.

The functionality used to exist for USB devices, but MS started removing the ability to do it in the API and now more manufacturers have dropped it.


It's called "turning it off and on again", and for USB devices, can be as simple as unplugging and replugging.

I've personally done it to fix a malfunctioning USB device many times.


The complaint is that you can't turn it off and on again via software any more.


You'd think so... but modern hardware rarely cares about that. Especially once you get into power delivery subsystems, things that can change voltages. Flip the wrong bit and your chip is toast. This is especially prominent on embedded systems like smartphones.


Whew, kernel bugs due to complexities of live patching code! (https://nickdesaulniers.github.io/blog/2020/04/06/off-by-two...) high fives...anyone...anyone...?


Nick, just dropping by to say that I have enjoyed your blog posts on all topics kernel development tremendously. Thanks!


I enjoyed this when it was posted previously. Keep up the good work, Nick.


Oh my, that last bit from Linus there. Been there done that, on a modern system!

So I was making Linux run on a PlayStation3 booting from an exploit, so that we could have full access to the hardware/GPU (not the locked down PS3 Linux support that it came with, and had since been retroactively removed in an update). This is running on top of a built-in hypervisor, on a PowerPC CPU.

I had to debug early bring-up code of the Linux bootloader I was writing from scratch (AsbestOS), and at that point you don't have any real hardware access. There is an internal serial port but it's not accessible to VMs, graphics is way too hard to bring up, USB is also a pain. No usable LEDs either. So what is there? Panic. The hypervisor panic hypercall had an argument with two modes: you could either reboot, or shut down with a beep. So that was my boolean output primitive, which at one point I used to "print" out an address, bit by bit, while debugging the assembler bringup code.

https://github.com/marcan/asbestos/blob/master/stage2/start....
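The one-bit-per-run trick can be sketched as follows. This is an illustration, not the AsbestOS code: the function names are hypothetical, and which panic mode stands for 1 is an arbitrary choice here. Each run of the machine leaks one bit of the value through the panic mode; the human at the other end notes what happened and reassembles the number.

```python
def bit_to_signal(value, bit_index):
    """What a single run 'outputs' for one bit: beep-shutdown for 1, reboot for 0."""
    return "beep" if (value >> bit_index) & 1 else "reboot"


def decode(signals):
    """Rebuild the value from the observed signals, least significant bit first."""
    value = 0
    for i, signal in enumerate(signals):
        if signal == "beep":
            value |= 1 << i
    return value


# Leak a 32-bit address one run at a time, then put it back together.
address = 0x80001234
observed = [bit_to_signal(address, i) for i in range(32)]
recovered = decode(observed)
```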

Once I got into C and I could start writing more interesting code, I eventually graduated from panic/reboot to using Ethernet. The Ethernet device was virtualized to some extent. You still had to write DMA descriptors to get packets out, but you didn't have to worry about low-level hardware init. So I stuffed some hand-crafted broadcast UDP packets in there, and the descriptors to blast them out, and called the "start transmit" Ethernet device hypercalls to get printf-over-UDP working. Fun thing is the PS3 has a built-in Ethernet switch with VLANs, so you needed to stuff VLAN tags into the packets if you wanted them to make it out of the Ethernet port. Until the PS3 slim, when they got rid of that switch, and then you no longer needed tags. Fun.

https://github.com/marcan/asbestos/blob/master/stage2/debug....
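For anyone curious what "hand-crafted broadcast UDP with a VLAN tag" means at the byte level, here is a small sketch using standard Ethernet, 802.1Q, IPv4 and UDP layouts. It is not the AsbestOS code: the function names, the source MAC, the ports, the VLAN ID and the 0.0.0.0 source address are all placeholder assumptions. The UDP checksum is left at zero, which IPv4 permits.

```python
import struct


def ip_checksum(header):
    """Standard ones'-complement sum over 16-bit big-endian words."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_frame(payload, src_mac, vlan_id=None, src_port=18194, dst_port=18194):
    # UDP header: source port, destination port, length, checksum (0 = unused).
    udp = struct.pack("!HHHH", src_port, dst_port, 8 + len(payload), 0) + payload
    # IPv4 header: version/IHL, TOS, total length, ID, flags/fragment, TTL,
    # protocol 17 (UDP), checksum placeholder, source 0.0.0.0, dest broadcast.
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(udp), 0, 0, 64, 17, 0,
                     bytes(4), b"\xff" * 4)
    ip = ip[:10] + struct.pack("!H", ip_checksum(ip)) + ip[12:]
    # Ethernet header: broadcast destination, then source MAC.
    eth = b"\xff" * 6 + src_mac
    if vlan_id is not None:
        # 802.1Q tag: TPID 0x8100, then priority 0 and the 12-bit VLAN ID.
        eth += struct.pack("!HH", 0x8100, vlan_id & 0x0FFF)
    eth += struct.pack("!H", 0x0800)  # EtherType: IPv4
    return eth + ip + udp


tagged = build_frame(b"hi", b"\x02\x00\x00\x00\x00\x01", vlan_id=1)
untagged = build_frame(b"hi", b"\x02\x00\x00\x00\x00\x01")
```

The only difference between the two frames is the four-byte tag inserted after the source MAC, which matches the "slim no longer needed tags" detail above.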

I used this same UDP-over-LV1-hv-ethernet debugging tool to debug early kernel startup too, after the bootloader hands off, since we had to revamp chunks of the Linux on PS3 early memory startup code (as the memory config while booting in "Game" mode was very different from the formerly officially supported "Other OS" mode) way before graphics was up. This code was eventually upstreamed:

https://github.com/torvalds/linux/blob/master/arch/powerpc/p...

By comparison, when I was doing (unofficial) Linux on PS4 we had the luxury of running on bare metal, and there were testpoints for a serial port on the motherboard, so we could just poke data out of there with some trivial UART code, and use the existing Linux earlycon support for the same after handoff (though the baud rate multiplier was wrong... and also, bizarrely enough, it's an "XScale" ARM variant 8250 port because the PS4's southbridge is a repurposed Marvell ARM SoC with a PCIe bridge to the x86!).


Wow! That's a great read, you should post a copy to your blog.

I remember running yellow dog Linux on my PS3 (as a kid) but didn't know the GPU was inaccessible.

For the hypervisor, did you have to do any kind of RAM training?

Are folks running Linux on PS4s these days? I haven't been keeping up too much with the system exploit scene (had exploited Xbox 360, PSP) other than the excellent YouTube channel "modern vintage gamer."


This sounds familiar. Around 2002 I bricked a Pentium motherboard while probing the SPI address space looking for the fan and temperature sensors.


I bricked an expensive high end motherboard by ignoring the "Win98 or later" requirement listed on the box and trying to install Win95.

Actually, two of them...when the first one bricked I assumed that it had just been defective, so tried again on the second one we had.

It turned out that something in the Win95 device scan happened to trigger a BIOS flashing, which flashed garbage into the BIOS.

I really hate old buses where there is no mandatory and standard form of device ID, so the only ways a driver can find its device are (1) ask the user, or (2) read and write registers at likely addresses looking for ones that respond the same way its device would, hoping that at addresses where there is something that is not your device your commands don't accidentally make what is there do something terrible.


I really respect Jonathan Corbet's writing skills. The article was clear, informative, and engaging.


Oh the memories...this was something similar, but just an innocent RESET ATA command.

https://www.zdnet.com/article/mandrake-linux-9-2-kills-some-...


Love the understatement: "As a general rule, bricking the hardware is a level of overhead which goes well beyond the acceptable parameters."


> As it happens, doing nothing is a highly optimized operation



