It works this way by design. Most companies will retain logs for exactly as much time as legally required (and/or operationally necessary), then purge them so they don't show up in discovery for some lawsuit years down the line.
It has nothing to do with discovery or legal liability and everything to do with COGS. Log size at cloud provider scale is genuinely something you have to see to believe; recall that these are logs for a company with multiple services that see 9-figure daily active users.
This is the real answer. The amount of logs generated at cloud provider scale today is massive compared to what it was just a few years ago. The last time I was involved in these sorts of systems, circa 2014, logging was one of the core functions at a cloud provider that was /most/ demanding of physical hardware, everything from compute, memory, and storage, all the way to networking. A typical server in that provider's environment in 2014 would have 2x10GigE connections set up for redundancy; log servers needed a minimum of 2x40GigE connections /for throughput/.
These days I wouldn't be surprised if they are running 100GigE or 400GigE networks just for managing logs throughput at aggregation points.
We’re talking about an intrusion into the corp network, not the prod one (the keys were obtained from the crash dump).
I assume that’s a way smaller scale. However, the document doesn’t go into detail about exactly which kind of logs they were missing, so maybe these were network logs.
For each piece of PI/PII data, generate a table entry mapping that piece to a secure random number, store the random number in place of the personal data, and use that in the log.
Then, if deletion is required, simply erase the row that holds the mapping.
And finally, be sure to not store that mapping table in the same place as your backups or your logs.
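A minimal sketch of that scheme, using an in-memory dict as a stand-in for the mapping table (class and method names here are hypothetical, not from any particular library):

```python
import secrets

class PiiVault:
    """Maps PII values to secure random tokens; logs store only the tokens."""

    def __init__(self):
        self._token_by_value = {}  # PII value -> token (reuse on repeat values)
        self._value_by_token = {}  # token -> PII value (the "mapping table")

    def tokenize(self, value: str) -> str:
        """Return the token for a PII value, generating one if needed."""
        tok = self._token_by_value.get(value)
        if tok is None:
            tok = secrets.token_hex(16)  # the secure random number
            self._token_by_value[value] = tok
            self._value_by_token[tok] = value
        return tok

    def forget(self, value: str) -> None:
        """Deletion request: erase the mapping row. Tokens already written
        to logs remain, but can no longer be resolved back to the person."""
        tok = self._token_by_value.pop(value, None)
        if tok is not None:
            del self._value_by_token[tok]

vault = PiiVault()
tok = vault.tokenize("alice@example.com")
print(f"login user={tok}")       # log line carries only the token
vault.forget("alice@example.com")
# tok still appears in old logs, but the mapping back to the email is gone
```

In a real deployment the vault would be a separate datastore, kept out of the same backup set as the logs, since a backed-up copy of the mapping table defeats the deletion.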