I helped found a large, Dropbox-like white-label product. We used AWS, and S3 in particular, for storage.
After many, many years of experience with systems, I made sure we had as many ways to recover user data as we could. The initial solution was a large Postgres database for all the metadata/indices and S3 for the actual storage.
Despite much pushback, we built in little things like an individual meta file on the file system for each file we stored. That way, if we lost the Postgres DB for any reason, we could write a script to rebuild the DB and restore access, avoiding massive counts of orphaned files. A simple and probably stupid solution, but...
Well, guess what: the DB got corrupted, and after some ado we restored all access and none of our customers lost anything.
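For anyone curious, the meta-file idea looks roughly like this. It is a minimal sketch under loud assumptions, not our actual code: a local directory stands in for S3, SQLite stands in for Postgres, and the names `store_file`, `rebuild_index`, and the `.meta.json` suffix are made up for illustration.

```python
import json
import sqlite3
from pathlib import Path

META_SUFFIX = ".meta.json"  # hypothetical naming convention for the sidecar files


def store_file(root: Path, file_id: str, data: bytes, owner: str, name: str) -> Path:
    """Write the blob plus a sidecar meta file that describes it."""
    blob_path = root / file_id
    blob_path.write_bytes(data)
    meta = {"file_id": file_id, "owner": owner, "name": name, "size": len(data)}
    # The sidecar sits next to the blob, so it survives losing the database.
    (root / (file_id + META_SUFFIX)).write_text(json.dumps(meta))
    return blob_path


def rebuild_index(root: Path, db: sqlite3.Connection) -> int:
    """Recreate the metadata table using only the sidecar files."""
    db.execute("DROP TABLE IF EXISTS files")
    db.execute(
        "CREATE TABLE files ("
        "file_id TEXT PRIMARY KEY, owner TEXT, name TEXT, size INTEGER)"
    )
    count = 0
    for meta_path in sorted(root.glob("*" + META_SUFFIX)):
        meta = json.loads(meta_path.read_text())
        db.execute(
            "INSERT INTO files VALUES (?, ?, ?, ?)",
            (meta["file_id"], meta["owner"], meta["name"], meta["size"]),
        )
        count += 1
    db.commit()
    return count
```

The point of the design is that the index becomes derived data: as long as every stored file carries its own description, the database can be thrown away and regenerated by walking the storage.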
No, it’s not full backups, but...