> I suppose that if the memory contains code, the process is killed (if ECC correction failed).
Generally, it would make the most sense to kill the process if the corrupted page is data, but if it's code, then maybe re-load that page from the executable file on non-volatile storage. (You might also be able to rescue some data pages from swap space this way.)
If you go that route, you should be able to avoid the code/data distinction entirely, as data pages can also be completely backed by files. I believe the kernel already keeps track of which pages are clean copies of data from the filesystem, so I would think it would be a simple matter of essentially paging out the corrupted page, so that it is read back in from the file on the next access.
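As a rough userspace analogy (not the kernel's actual memory-failure path), the sketch below maps a file privately, scribbles on the in-memory copy to stand in for corruption, then drops the page with madvise(MADV_DONTNEED) and reads it again. It assumes Linux semantics for MADV_DONTNEED on a private file mapping, where the discarded page is refilled from the file on the next access; the function name and the temporary file path are made up for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the page content came back from the file,
 * -1 on any error or if the content did not match. Linux-specific. */
int reload_from_file_demo(void)
{
    const char original[] = "clean copy on disk";
    char path[] = "/tmp/reload_demo_XXXXXX";
    int result = -1;

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && sizeof original <= (size_t)page);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file lives on until fd is closed */

    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return -1;
    }

    /* MAP_PRIVATE: our write below lands in a private copy-on-write page,
     * so the file itself keeps the clean data. */
    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = 'X'; /* stand-in for a corrupted in-memory page */

    /* Throw the in-memory page away; the next access faults it back in
     * from the file-backed copy. */
    if (madvise(p, (size_t)page, MADV_DONTNEED) == 0 &&
        memcmp(p, original, sizeof original) == 0)
        result = 0;

    munmap(p, (size_t)page);
    close(fd);
    return result;
}
```

Calling reload_from_file_demo() from a small main() and checking for a 0 return is enough to try it. The point is only that a page with a clean backing copy can be discarded and re-read without the program noticing.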
What would be interesting is if userspace could mark a region of memory as recomputable. If the kernel is notified of memory corruption there, it triggers a handler in the userspace process to rebuild the data. Granted, given the current state of hardware, I can't imagine that it is anywhere near worth the effort to implement.
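To make the idea concrete, here is a purely userspace sketch of what a "recomputable" region might look like from the application's side. Everything in it is hypothetical: there is no kernel involvement, the names (struct recomputable, rc_init, rc_validate, fill_squares) are invented, and a checksum comparison stands in for the kernel notification described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Callback that can regenerate the whole buffer from scratch. */
typedef void (*rebuild_fn)(uint32_t *buf, size_t n);

struct recomputable {
    uint32_t  *buf;      /* the derived data */
    size_t     n;        /* number of elements */
    uint32_t   checksum; /* checksum taken right after the last rebuild */
    rebuild_fn rebuild;  /* how to recompute buf */
};

/* 32-bit FNV-1a over the raw bytes of the buffer. */
uint32_t rc_checksum(const uint32_t *buf, size_t n)
{
    const unsigned char *p = (const unsigned char *)buf;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n * sizeof *buf; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Build the data once and remember what it should look like. */
void rc_init(struct recomputable *r, uint32_t *buf, size_t n, rebuild_fn fn)
{
    assert(r && buf && fn);
    r->buf = buf;
    r->n = n;
    r->rebuild = fn;
    fn(buf, n);
    r->checksum = rc_checksum(buf, n);
}

/* Stand-in for the corruption notification: if the checksum no longer
 * matches, run the rebuild handler. Returns 1 if a rebuild happened. */
int rc_validate(struct recomputable *r)
{
    if (rc_checksum(r->buf, r->n) == r->checksum)
        return 0;
    r->rebuild(r->buf, r->n);
    r->checksum = rc_checksum(r->buf, r->n);
    return 1;
}

/* Example rebuild handler: a table of squares. */
void fill_squares(uint32_t *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = (uint32_t)(i * i);
}
```

A caller would rc_init() a buffer with fill_squares and call rc_validate() before relying on the data. A real mechanism would presumably have the kernel invoke the handler rather than the application polling a checksum, which is exactly the part that does not exist here.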
> What would be interesting is if userspace could mark a region of memory as recomputable.
I believe there's already some support for things like this, but intended as a mechanism to gracefully handle memory pressure rather than corruption. Apple has a Purgeable Memory mechanism, but it is handled through higher-level interfaces rather than something like madvise().