How a bug in Dropbox permanently deleted my 8000 photos (medium.com/jan.curn)
206 points by jancurn on July 29, 2014 | 141 comments



I think this bug should be considered completely separately from how unwise it is to use a cloud service as the sole storage of important files.

Regardless of the circumstances, losing user files against the wishes of the user is the absolute worst thing a cloud backup provider can do.

Even for files that are deleted intentionally and unambiguously by the user, I'm astonished that Dropbox actually deletes the files at the end of the 30-day restore window. I would expect them to keep the files for some multiple of the publicly-stated restore deadline where the multiple >= 2, if for no other reason than as a goodwill generator. There is no more evangelical advocate for your company than the customer you email to say "Yes, you intentionally deleted this file six weeks ago. The 30 day deletion deadline has passed, but I have managed to restore the most recent version of your dissertation. Thank you for using Dropbox."

For files that aren't intentionally deleted by the user but are "de-synced", it is disgraceful and appalling that there is no contingency system in place. Keeping user files when the user assumes or wishes them to be kept safe is the core competency of a service like Dropbox.

"The user should have kept multiple redundant copies" is not an excuse for a poorly managed online backup service. "Keep multiple backups of everything important" is good advice for a user, but "Keep user files safe when the user thinks they are safe" is the most essential advice imaginable for the CEO of an online backup service.


> I would expect them to keep the files for some multiple of the publicly-stated restore deadline where the multiple >= 2, if for no other reason than as a goodwill generator.

On the other hand, keeping copies of the user's data after you say you've deleted it doesn't sound too ethical to me. I don't expect immediate purging, but I also don't expect potentially sensitive files to linger for months after they were supposed to be deleted.

I think a better system would be to warn about big deletion events (e.g. send an email) to allow the user to revert it within the 30-day period.


> On the other hand, keeping copies of the user's data after you say you've deleted it doesn't sound too ethical to me. I don't expect immediate purging, but I also don't expect potentially sensitive files to linger for months after they were supposed to be deleted.

This is a fair concern, but I think the issue of recovering files is much more important than the issue of retention of deleted files to the huge majority of the Dropbox user base. I imagine most privacy-conscious users would use an online sync service that offers encryption by default, etc.


Also because Dropbox's EULA almost certainly has a provision that deleting files doesn't necessarily mean Dropbox erases the data right away, just that they won't make it available for recovery and it might be deleted at some future point.


They say "there might be some latency in deleting this information from our servers and back-up storage". But as a user, I expect this to be a reasonable period of a couple of weeks, not months.


> The 30 day deletion deadline has passed, but I have managed to restore the most recent version of your dissertation.

Then you have the opposite problem; privacy-centered users complaining "Dropbox keeps your data even when they say it's permanently deleted! Here's proof!!"


Why would privacy centered users use Dropbox?


Privacy isn't an on/off switch. I might be OK with the danger of my files leaking out as long as I know they'll be deleted when (or not long after) I tell them to.

Like any other security measure, it's a question of tradeoffs.


People being sensitive about their data does not necessarily imply they are sensible with their data!

That is their problem rather than anyone else's of course, but that isn't what they'll shout if it bites them in the arse later - they'll concentrate on shouting about the involvement of Service X rather than pontificating on whether involving Service X for that particular data was a good idea on their part in the first place.

If they are encrypting their data client-side before the sync service touches it, then they may have thought they were safe until a key leaked and they re-encrypted everything with new keys, only to find that archived copies somewhere out there were still openable with the previous (now leaked) key(s).


People complain about Facebook not actually deleting content that they wanted deleted. Privacy conscious users don't always have to be privacy conscious when they start using the service.


DropBox is not a cloud backup service.

Edit to add: Dropbox's sole purpose is to sync files between different physical machines. Which, now that I think about it, seems sort of anachronistic.

In cloud storage, the authoritative version of the file lives in the cloud and each device just accesses it. But in Dropbox, the authoritative version is on every machine. By default, every copy is authoritative, and actions are synced.

This sort of "many originals" architecture seems to confuse people. The article author here is clearly thinking of Dropbox like cloud storage--he checked the web interface, saw his files, and thought it was all good.

But in Dropbox the web interface is not authoritative, the local copy is.


Is Dropbox a backup service? Nope. Dropbox is folder synch, a network version of a pendrive. Dropbox makes no guarantees about the durability of the data on their servers, which are merely a browsable cache.

Use proper backup services, instead, like Memopal.


Hardly.

"Dropbox is a home for all your photos, docs, videos, and files". -- https://www.dropbox.com/tour/1


Storage isn't backup. If you have one copy of your files ("home") then you have zero copies of your data. You're only one hiccup away from losing it all, as this person found out.


Not to mention EVERY time I put an SD card in my machine dropbox does the pop-up and offers to store my photos in their cloud.

It's actually more than a little annoying.


You can very easily turn that feature and pop-up off.


"I think this bug should be considered completely separately from how unwise it is to use a cloud service as the sole storage of important files."

It is unwise to use anything as the sole storage of important files. Personally, I think cloud backup is safer than backing up files to a DVD or tape, but I just won't keep personal files on any cloud service.

Never just have one point of failure!


Totally agree with not having one point of failure. But backups should be read only as a requirement, and accessible as a feature.


Key assumption to your argument: Dropbox is a backup service.

I don't think that they are. More like the modern incarnation of a file server. Like a fileserver, they have resiliency against service failure or individual storage system failure (via S3). Cool stuff, but it ain't backup.


> I think this bug should be considered completely separately from how unwise it is to use a cloud service as the sole storage of important files.

YES: from DropBox's point of view - this is a serious bug even ignoring what other actions the users have/not taken to protect their data.

And NO: this sort of thing is one of the key reasons not to use such a service as a sole backup. When trying to educate users on the importance of thinking about data safety the two points are linked, so we can't fully separate them: each one is important when hammering the other home...

> Even for files that are deleted intentionally and unambiguously by the user, I'm astonished that Dropbox actually deletes the files at the end of the 30-day restore window. I would expect them to keep the files for some multiple of the publicly-stated restore deadline

That would cause significant consternation in some areas. If I explicitly delete a file I might explicitly want it gone within the stated window and no later, and would therefore be unhappy if I found the data still available (even if I had properly encrypted it before it touched the cloud service so no one else could read it).

> For files that aren't intentionally deleted by the user but are "de-synced",

I definitely agree there - it sounds like that process needs some transactional workflow wrapped around it so that:

1. Client and server don't do anything until they've both agreed what to do (then they can both go away and do it even if the connection drops, safe in the knowledge that the action is correct).

2. The client and server record what has been agreed as part of that transaction protected process, so in the case of an unclean stop during the process it can be resumed/retried.

3. Some process may need to be in place for one side rolling back if the other detects a failure applying the agreed actions.
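Points 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not how Dropbox is implemented: `FakeServer`, `agree_unsync`, the journal file and its JSON layout are all hypothetical names invented for the example, and the rollback in point 3 is left out. The idea is that the client gets the server's agreement first, durably records the agreed action, and only then touches local files, so a crash in between can be resumed instead of being misread as a deletion.

```python
import json
import os
import shutil
import tempfile
from pathlib import Path


class FakeServer:
    """Hypothetical stand-in for the sync server; it only tracks agreements."""

    def __init__(self):
        self.unsynced = set()

    def agree_unsync(self, folder):
        # After this call the server knows the folder's local absence is
        # intentional and must not be treated as a deletion.
        self.unsynced.add(folder)


def write_journal(journal, entry):
    """Durably record the agreed action before acting on it."""
    tmp = Path(str(journal) + ".tmp")
    with open(tmp, "w") as fh:
        json.dump(entry, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, journal)  # atomic rename on the same filesystem


def apply_journal(local_root, journal):
    """Apply (or re-apply) whatever the journal says was agreed, then clear it."""
    if not journal.exists():
        return False
    entry = json.loads(journal.read_text())
    if entry["action"] == "remove_local":
        shutil.rmtree(local_root / entry["folder"], ignore_errors=True)
    journal.unlink()
    return True


def unsync_folder(local_root, folder, server, journal, crash_after_journal=False):
    server.agree_unsync(folder)                                           # point 1
    write_journal(journal, {"action": "remove_local", "folder": folder})  # point 2
    if crash_after_journal:
        raise RuntimeError("simulated crash")
    apply_journal(local_root, journal)


def demo():
    """Crash after the journal is written, then resume on restart."""
    with tempfile.TemporaryDirectory() as d:
        root = Path(d)
        (root / "photos").mkdir()
        (root / "photos" / "a.jpg").write_text("x")
        journal = root / "sync.journal"
        server = FakeServer()
        try:
            unsync_folder(root, "photos", server, journal, crash_after_journal=True)
        except RuntimeError:
            pass
        after_crash = ("photos" in server.unsynced,
                       (root / "photos").exists(),
                       journal.exists())
        resumed = apply_journal(root, journal)
        return after_crash, resumed, (root / "photos").exists()
```

In `demo()`, the crash leaves the local files intact and the server already informed, and the restart finishes the agreed removal from the journal rather than guessing from what is missing on disk.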

Of course this is a fair bit of work for something that won't happen often, particularly when you consider that there might be more than one active connection to a set of files at any given time so it might not be a simple two-way merge operation, so DB may have other priorities - but the bad PR from it happening and not being cleanly dealt with is something that they need to consider if making that sort of decision.

> "The user should have kept multiple redundant copies" is not an excuse for a poorly managed online backup service

Agreed. But "the single backup service failed" is similarly not something that is going to encourage me to have sympathy for the user!

Sync services are excellent secondary backups, and they satisfy the "off site" requirement that a great many home and small-office users neglect (I know people who worked for a small company which died because the backups were in the same building as the active data so one fire took out the lot), but no one should trust them as their only backup. I even recommend against multiple sync services as your only backups as that in itself can cause problems (when bugs or other peculiarities from one interact badly with similar behaviours in another).

Using just a sync service means you have no off-line (or even semi-offline) backup, which is as bad as not having any off-site copies.

Of course getting this involves educating users as to the risks they are taking, which is a point notoriously difficult to get across. So while I don't expect a service like DB to be perfectly bug free (and therefore wouldn't trust it as an only backup), I think such services need to help the education process by being a little less disingenuous about the safety issue and making sure their users know that an accidental delete will be propagated everywhere and be unrecoverable after a time. This will never happen though: "please use some other backup option too in case ours goes wrong" is not a sentiment marketing departments or investors will want to see publicly stated!


> If I explicitly delete a file I might explicitly want it gone within the stated window and no later and would therefore be unhappy if I found the data still available

Very few (if any) users who explicitly delete a file want that file recoverable within 30 days but not after that. They either want the file recoverable for a much longer period of time or they want the deleted file to be irrecoverable immediately. For the latter option, Dropbox offers the ability to "permanently delete" files. It requires at least four clicks to accomplish, displays a warning, and is driven entirely via the website rather than through the Dropbox client.


> Of course this is a fair bit of work for something that won't happen often

If you think about it, that sums up almost any robust back-up system you go to the effort of setting up, properly testing, and maintaining over time. For that matter, it applies to almost any insurance policy in any context.

I expect most of us would still agree that potential catastrophic damage if you're the unlucky one can justify spending the time/money/resources to set up proper back-ups and insurance, though, and surely this is the case if you're running a service where robust file handling is your major value proposition.


> If you think about it, that sums up almost any robust back-up system you go to the effort of setting up

Exactly, which is why people should think harder than they do about backups. You should make sure that you are not exposed to risk you don't want to be exposed to because someone else's development priorities differ from your point of view.

True data protection is unfortunately a complicated business and most people see the flashy presentations on services' sites and trust all is well without giving it a second thought.

While I agree it should be the service's responsibility to cover these eventualities, it is not good practice to assume that they do. You might misunderstand what they do and not see risk where it is present, they might have honest coding mistakes that cause problems down the line, there could be ineptitude anywhere down the line, and so forth. If you are paying for their service then you might legitimately expect to be covered by some form of insurance for loss in the event of it being caused by a problem in the code/infrastructure, but never assume that until being explicitly told it is present and you have read the small print, and don't ever assume this at all for a free/cheap service.


" from how unwise it is to use a cloud service as the sole storage of important files."

I disagree (a little bit)

There are some issues here:

- Using a free service

- Using a service that "thinks" when you want to add or remove stuff. Yes, it's easier to use, but it's not explicit, and it usually fails spectacularly.

- Thinking that this "magical save thing" is equivalent to a backup.


He was a paid user. Therefore, not a free service at this point.


You have knowledge that most people don't. If that's important to safely use Dropbox, it's their responsibility to inform the users.


Oh I agree with you

Maybe the whole issue is: a local delete turns into a remote delete, and that's pretty much non-reversible (unless you realize what you did, later)

Maybe they should add something like "You just deleted a lot of files from your DropBox, are you sure, please review and confirm" (if this was done through sync)
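A prompt like that could hinge on a simple threshold check before a sync-originated deletion is propagated. The sketch below is only an illustration; the function name and both thresholds (`min_files`, `min_fraction`) are made-up assumptions, not anything Dropbox is known to use.

```python
def needs_confirmation(deleted_count, total_count, min_files=50, min_fraction=0.2):
    """Return True when a deletion batch looks unusually large.

    A batch is flagged only if it is big in absolute terms AND removes a
    sizeable share of the account, so routine clean-ups are not interrupted.
    """
    if total_count <= 0 or deleted_count <= 0:
        return False
    fraction = deleted_count / total_count
    return deleted_count >= min_files and fraction >= min_fraction


# 8000 photos vanishing from a 20000-file account would be held for review;
# deleting 3 files out of 10 would not.
print(needs_confirmation(8000, 20000))
print(needs_confirmation(3, 10))
```

A flagged batch could then be held back until the user confirms it, or at least trigger the warning email suggested elsewhere in this thread.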


Sync is not backup.

Sync is not backup.

Sync is not backup.

My strategy: a big external drive used for Time Machine, and a subscription to Backblaze. Both of these are all about retaining multiple versions, recovering from accidental deletion, and continuously backing up in the background. Dropbox is about syncing stuff between computers.


Dropbox keeps deleted files and previous versions of files for 30 days. With Packrat, a free feature for all paid accounts, Dropbox keeps deleted files for the lifetime of the account.

How is Dropbox not a backup provider? It may not be a very good backup provider, depending on your point of view, but it clearly markets itself as a backup provider ("your stuff is always safe in Dropbox and can be restored in a snap"), is used by its users as a backup provider, and has the features of a backup provider.


Backups should be read only.

If I can delete parts of a backup, it's just as susceptible to the same human mistakes that make them necessary in the first place.


The "Packrat" feature the parent comment mentioned keeps every version of a file you upload: https://www.dropbox.com/help/113


You can still permanently delete things. It's just harder.


By that definition, most corporate backup systems aren't backups! It's pretty common to overwrite old tape backups with newer backups, and yes, that does mean you need to be organized enough to make sure you don't accidentally overwrite the only backup of something.


Maybe a better way to put it is that backups are isolated from current "production" data operations.

Backups are not invulnerable, so yes, it's still possible to overwrite the backups, but that involves engaging the backup system specifically. Just doing file operations or settings changes on the production system won't do it.

Dropbox is not isolated from production data operations. In fact the whole point is to synchronize production data operations as fast as possible. That makes it dangerous.

(This does not excuse the bug in this story; sync setting changes should obviously be atomic transactions that are confirmed on the server.)


Yeah; that's a better way of putting it. rm -rf'ing my production box's hard drive shouldn't nuke the backups.


Not at all. Any backup system defines an RTO (Recovery Time Objective) and RPO (Recovery Point Objective).

Usually, backups are scoped to protect against system failure, so fast time to recover and limited historical durations are usually design goals. Being able to restore a server to its state 20 weeks ago isn't very useful.

Dropbox is like shadow copies in Windows. You can go back in time, for the purposes of journaling changes to an existing object. It's a productivity enhancer for users, not really a backup feature.

Keeping things forever is usually referred to as archiving. For example, my employer keeps certain records for 100 years due to statutory requirements. When you archive, you want to tightly scope what you archive and limit it as much as possible. Things like short time to restore are usually not primary design goals.


I'm paying for packrat on top of pro. Are you getting it for free?


No, I just misread the help page. Thanks.


To me it sounds like Packrat is a backup feature, but this person chose not to take advantage of it.


I didn't realize they were now marketing themselves as a backup provider. Tsk.


But the OP claimed that they are a paid user but don't have PackRat


I too am a Backblaze customer but beware: backups of deleted files are deleted after 30 days.


That's kind of surprising given how their hardware model is set up, with pods of HDDs that are filled up and then essentially archived away. I guess the amount of space freed up is enough that it makes sense to open those pods back up for writing.


Huh, I didn't know that. My first line of 'oh shit I deleted a thing' is Time Machine, since it's right there next to my external monitor and won't require waiting for a download.


One of the reasons I switched to Crashplan. Their Java client is pretty terrible performance-wise, but it's still the best cloud backup I've tried.


Your definition of backup...


Which is why I don't solely use Backblaze (fast recovery, medium level of trust). I also back up the same data to Glacier-class S3 (slow recovery, high level of trust) and Time Machine (v. quick recovery, same-site risk).


I noticed there hasn't been any mention of attempting to recover your deleted files from your local disk. If you haven't tried it yet it is worth a shot. There are a few software solutions out there that will scan your disk for files that have been deleted. On Windows I have used Piriform's Recuva[1] and on OS X I have used PhotoRec[2]; both have worked rather well. It is worth noting that the longer you wait to try, the more likely the sectors are to be overwritten with new data. (And you should back up any files currently on the disk.) I know it is a long shot at this point but some chance is better than none. Good luck.

[1] http://www.piriform.com/recuva [2] http://www.cgsecurity.org/wiki/PhotoRec


Or, I wouldn't be surprised if tomorrow he posts that he found them all in his trash.... Because nobody ever thinks to clear it. Sometimes the simplest solution is right under our nose.


He said that he de-synced them to free up space. I'm assuming that he would have noticed if the operation didn't free up space.


The bug in question:

> From all this information it seems that Dropbox client first deletes files locally before it informs the server about the new selective sync settings. Consequently, if the client crashes or is killed before the server is contacted, the files remain deleted without any trace. After the client restarts again, it only sees there are some files missing and syncs this new state with the server.

It concerns me that the top several comments are some variation of "you shouldn't trust the cloud with your data." How did software services become so totally insulated from the expectations we have of every other service that holds on to our valuable property?
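The ordering problem quoted in this comment can be reproduced with a toy model. This is a sketch of the article author's hypothesis (he only says "it seems"), not Dropbox's real protocol: the sets, the `excluded` folder list and the `restart_sync` rule are all invented for illustration. It compares "delete locally, then tell the server" with "tell the server, then delete locally" when the client dies between the two steps.

```python
def restart_sync(local, server_files, excluded):
    """On restart, a file missing locally is treated as a deletion
    unless its top-level folder is excluded from sync on this client."""
    for path in sorted(server_files - local):
        folder = path.split("/", 1)[0]
        if folder not in excluded:
            server_files.discard(path)


def simulate(server_first):
    local = {"photos/a.jpg", "photos/b.jpg", "docs/cv.txt"}
    server_files = set(local)
    excluded = set()

    # The user unticks "photos" in selective sync; the client dies after
    # completing only the first of its two steps.
    if server_first:
        excluded.add("photos")  # step 1: server learns the new setting
        # crash: local files were not removed yet, which is harmless
    else:
        local = {p for p in local if not p.startswith("photos/")}  # step 1
        # crash: the server never hears about the setting change

    restart_sync(local, server_files, excluded)
    return sorted(server_files)


print(simulate(server_first=False))  # local-first ordering loses the photos
print(simulate(server_first=True))   # server-first ordering keeps them
```

Under these assumptions the only difference between the two runs is which step comes first, which is why several commenters argue the settings change should be confirmed by the server before anything is removed locally.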


> It concerns me that the top several comments are some variation of "you shouldn't trust the cloud with your data."

No, it's "you shouldn't trust one single thing with your data". Hardware fails. OSes have bugs (OS X's "space in your drive name and we delete your user folder.") Services have unexpected gotchas. This is why you never store anything in only one place.

A sync service by definition is not a separate place (unless he had found a bug in the Packrat feature)


My point is that, for some reason, we have been conditioned to believe that bugs and data loss are just something that happen as opposed to something that service providers, who we pay money to, are charged with avoiding. We hold software to a much lower standard of quality than we hold other products and services.


> we have been conditioned to believe that bugs and data loss are just something that happen

They're something that happen to unregulated services that cost $8/month. If I paid $8/month for electricity, water or Internet access, yeah I'd expect yearly blackouts. To get an SLA you have to pay more.


Would this bug not have happened with Dropbox for Business?

And didn't Microsoft make an absolute fortune releasing buggy as shit software throughout the entire 1990s?


> No, it's "you shouldn't trust one single thing with your data"

Which is why he has local copies on several systems and the copy in the cloud with Dropbox.

The nasty thing here is that Dropbox effectively deleted the "local" backups.


Dropbox deleting the local copies is effectively what the author wanted to happen. The author wanted Dropbox to:

1) Remove the local copies.

2) Keep the remote copies.

3) Stop syncing those folders.

The author explicitly states that the goal was to use the Dropbox copy as the only copy (i.e. the backup).


Oops, you're right. But the way I read the report, even having other systems with local copies would've gotten data loss, because it would have removed those local copies as well (which was not the idea when enabling selective sync only on the laptop).


This is actually the point that makes no sense to me - after removing a folder from selective sync, the default (in my mind) should be to leave the folder alone. At the very least the user should be prompted or an option setting should be available to delete or leave the folder. 'Security' and 'disk space' are only two of the many reasons to not sync a folder.


I don't trust anything called "sync".

I trust file operations "move", "copy" and "remove".

But "sync" could mean anything: which side is syncing to which side? If something is not present on one side, will it delete it on the other?

I wish the term didn't exist, and apps/clients/... would copy or even just load or show things, and if you want to "sync", then rsync with clear source and destination. Same with mail for example: I don't want to "sync" email messages, I just want to see them, the most recent ones first.


Here "sync" means what Dropbox says it means. The answers to your questions are well defined in their software, even if to some other software, they might be different.

> I wish the term didn't exist, and apps/clients/... would copy or even just load or show things, and if you want to "sync", then rsync with clear source and destination.

The source and destination is always clear. It's to the server and from the server. Dropbox uses a star topology, not P2P.

You'd get the same result if you used rsync with source and destination, if you passed --delete, which is equivalent to what Dropbox does.


"The source and destination is always clear. It's to the server and from the server."

That's exactly what I mean by not clear: to the server and from the server are the exact opposites. So if something is not present on one side, it can do two things to make both sides equal: it can delete it from the one side, or it can copy it to the other side. If all it has is a "sync" button, not a "sync to server" or "sync from server" button, it's not clear what it will do.

Something that deletes files shouldn't be called sync, but should clearly be marked as a delete operation. And deleting something from one location should not have the surprise effect of deleting it on other locations too, unless it's called "remove everywhere".


> if something is not present on one side, it can do two things to make both sides equal: it can delete it from the one side, or it can copy it to the other side. If all it has is a "sync" button, not a "sync to server" or "sync from server" button, it's not clear what it will do.

It's only not clear because you're using "sync" in an abstract sense. In Dropbox, it is clear: if the side had the file and it disappeared, it's considered a deletion. If it never had it, it's assumed it should receive a copy.

There's no ambiguity in the context of Dropbox, even if there is in the term "sync" generally.

> Something that deletes files shouldn't be called sync, but should clearly be marked as a delete operation.

Why? A deletion is just another action comparable to adding a file, moving it, etc. Sure it may lose data, but if you edit a file, replace the contents and copy it to the other side with overwrite, it can lose data too. That's why Dropbox provides a 30-day backup feature.
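The rule described in this comment (a file one side used to have and now lacks is a deletion; a file it never had is a copy to fetch) needs a record of what was present at the last successful sync. A minimal sketch, with an invented function name and action labels, ignoring content changes and conflicts:

```python
def plan_actions(previously_synced, local_now, server_now):
    """Decide what to do per path, given the set of paths that both sides
    held after the last sync and the paths each side holds now."""
    actions = {}
    for path in sorted(previously_synced | local_now | server_now):
        had = path in previously_synced
        on_local = path in local_now
        on_server = path in server_now
        if on_local == on_server:
            continue  # present on both sides or gone from both: nothing to do
        if on_local:
            # Only the local side has it: either the server deleted it,
            # or it is a brand-new local file.
            actions[path] = "delete_local" if had else "upload"
        else:
            # Only the server has it: either it was deleted locally,
            # or it is a new remote file.
            actions[path] = "delete_remote" if had else "download"
    return actions


plan = plan_actions(
    previously_synced={"a.txt", "b.txt"},
    local_now={"a.txt", "c.txt"},
    server_now={"a.txt", "b.txt", "d.txt"},
)
print(plan)
```

Here `b.txt` is planned as `delete_remote` purely because it is missing locally. The rule cannot tell whether the user removed it on purpose or a crashed operation did, which is the hazard at the heart of this thread.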


File sync with multiple devices is fraught with danger. Remember the old 'briefcase' on the Windows desktop? There's just no way to distinguish "I don't want this file on this device" from "I don't want this file on any device".


In Dropbox, it's very simple: the former is simply not supported at all. Your file should be in all devices or none (mobile excepted).


No, they have Selective Sync.


That's at the folder level, not file level.


True, but it does put lie to:

>> Your file should be in all devices or none


I think "recency" is a better mental model than star topology for how Dropbox works.

The authoritative source of the data is whichever endpoint has been modified most recently. If you delete a file on one machine, it's then deleted on the server, and then on other machines. The file is gone everywhere.

If you then go to one of those machines and restore the file from a local backup, the file is re-uploaded to the server, and then each machine will re-download it. The file is back everywhere.

If you log into the web interface and delete it there, each of the machines will delete it as well. The file is gone everywhere.

At any point the controlling view of the data is the one that was most recently updated.
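This "recency" mental model is essentially last-writer-wins with deletions recorded as tombstones. The sketch below illustrates that model only; the `Version` type, integer timestamps and `merge` function are assumptions for the example and say nothing about how Dropbox actually resolves conflicts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    timestamp: int           # when this endpoint last changed the path
    content: Optional[str]   # None marks a deletion (tombstone)


def merge(*replicas):
    """Per path, keep whichever endpoint's version is the most recent."""
    merged = {}
    for replica in replicas:
        for path, version in replica.items():
            current = merged.get(path)
            if current is None or version.timestamp > current.timestamp:
                merged[path] = version
    return merged


def visible_files(state):
    """Paths that still exist after tombstones are taken into account."""
    return {p: v.content for p, v in state.items() if v.content is not None}


desktop = {"a.jpg": Version(1, "pixels")}
laptop = {"a.jpg": Version(2, None)}        # deleted on the laptop later
print(visible_files(merge(desktop, laptop)))   # the deletion wins everywhere

desktop = {"a.jpg": Version(3, "pixels")}   # restored from a local backup
print(visible_files(merge(desktop, laptop)))   # the restore wins everywhere
```

The same comparison explains the web-interface case: a deletion made there simply carries the newest timestamp.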


In today's digital age, I believe lots and lots of people will lose a lot of valuable information like photos because they don't have a good understanding of what backing up means.

I've warned my friend who is computer illiterate that he is risking losing his children's complete photo collection for their entire lives because he is only keeping a single copy of his data on his hard drive. This is probably the situation with millions of people in the US alone, and when that hard drive failure event happens, which will happen, then they will lose everything.


This is becoming even more important with SSDs. I used to do a lot of data recovery, and photos were of course a very common type of data that customers wanted to recover.

If the drive was spinning, I could usually recover most of the data. Even if the drive was completely dead, there was still some chance at recovery if I had the correct logic board, and for a hefty price I could send it out to a company with more resources to get the platters spinning again.

SSDs can be much harder to recover from, since you need information stored in the controller to map out the data, and many consumer-level drives brick themselves on failure to prevent silent data corruption.


Did that work?

Would it have been more effective to just provide him with a list of stuff - a choice of two drives; a choice of two softwares (with really easy URLs to buy and download) and instructions for using and testing.

That could be a gift for birthday or Xmas?


I bought him a new hard drive, copied the photos over and gave him instructions on how to do it. He hasn't bothered to update his backup once, and it's been about 5 years and his photo collection has only grown.


I'm thinking of buying a CrashPlan Business account (my corp uses the Enterprise version) for my family for this reason. I set them up with Time Machine, but even plugging in a USB drive has proven to be too much to remember.


I use Dropbox for sharing pictures, but would never even dream to use a cloud based service for backup purposes.

Granted, Jan's case is a bit more complex and I'm really sorry for his loss.

Stories like that should really be a lesson to everybody never to completely trust a cloud based service as your main backup.

On a side note: I agree that archiving of digital files is a hard problem. The smartest librarians of the world have been thinking about how to achieve this for, literally, decades and I'm not sure they even have a good solution to the problem.

My personal strategy is redundancy: I buy new hard disks every couple of years and copy all important files, twice. One hard disk is kept off site.

It's not perfect, but it's the best I can come up with. Reading horror stories, like Jan's, indicates that it's the better solution. Despite the messiness.


> I use Dropbox for sharing pictures, but would never even dream to use a cloud based service for backup purposes.

Why not? They complement local hard drives well. It shouldn't be the only copy, of course, but then again, nothing should.


Why not? I find it pretty great in terms of backups (though that's not my purpose for doing it.)

I have copies of my files on my work machine, my laptop, my wife's laptop, my gaming machine, and of course Dropbox's servers themselves.

I could lose any one of those without losing the bunch thanks to dropbox synching things.

Sure, it doesn't help in the case of deletion, but it's great for the more common case of "my machine suddenly won't boot".


> I could lose any one of those without losing the bunch thanks to dropbox synching things.

If it's due to a hardware failure, sure.

If it's due to a bug as in this case, or an accidental deletion that went unnoticed for some reason, or corruption of a key file by the application that created it, or malware encrypting your entire filesystem until you pay a ransom, not so much.

Please don't consider a bunch of sync'd copies to be a complete backup solution, whether the mechanism is a RAID setup on your local machine, or an automatic sync offsite, or a Dropbox-style cloud service, or anything else. These tools are guarding against one specific and relatively common failure mode, which is useful, but they do not guard against a lot of other things that can and sometimes do go wrong.


>I have copies of my files on my work machine, my laptop, my wife's laptop, my gaming machine, and ofcourse dropbox's servers itself. I could lose any one of those without losing the bunch thanks to dropbox synching things.

Read the story again. Jan had the same peace of mind you do, until a crash in the Dropbox client deleted ALL of his copies of the affected files. He only unsynced on one device, yet the Dropbox-stored copy, as well as copies he had on other devices, were gone forever, due to a client crash. Don't assume your files are safe just because they are synced to multiple devices; obviously that sync relationship can be disastrous when an unexpected bug rears its ugly head.


I bought myself a Blu-ray recorder half a year ago and it's as easy to use as Dropbox with OS X...


Are you using dvdisaster to secure against data-loss?


I would also add that when using optical discs for backup, M-Discs are probably preferable over conventional dye-based discs.


Wow, do they work with conventional recorders?


Only if they say "M-Disc" on it. Many consumer recorders, especially if made by LG, support writing to these discs.


No, but they work with conventional readers. You need a special laser to do the physical etching that they're doing.


Thanks for the hint. No, but I'll do that in the future. In the past I actually had the problem of erroneous blocks one or two times, though those were no-name multisession CDs that had been left in the sun too long. (The media actually turned yellowish.) I never had this problem again, but I only use brand-name media, which I put into proper covers that don't let in too much light.

But I'll use it in the future... (Not sure if that's of interest anymore, but I used TrueCrypt for business backups.)


When your main business is storing files (and you also just raised yet another shitload of cash), not offering better safeguards against file loss is inadmissible. I left Dropbox for Google Drive when they announced the Condoleezza Rice move, and I haven't looked back since.


If you are using Dropbox as a sole backup of your files, think again.

Simplify to "If you only have a sole backup, think again." Points 4 and 5 should be: always have more than one backup of important material, and keep at least one backup on physical media that you own.


Also, in the case of removing it from your computer and only storing it on the "backup", it is not really a backup if it only exists in one location.


This this this. The parable I've heard is that even two locations isn't a backup.


>> If you are using Dropbox as a sole backup of your files, think again.

> Simplify to If you only have a sole backup, think again.

Yep. 3-2-1: "At least three copies, in two different formats, with one of those copies off-site."
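For what it's worth, the rule is simple enough to write down as a check. This is only a sketch under my own assumptions: I read "formats" as storage media types, and the `Copy` class and `meets_3_2_1` function are hypothetical names, not part of any backup tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    label: str     # e.g. "laptop"
    medium: str    # e.g. "ssd", "hdd", "optical", "cloud" (assumed categories)
    offsite: bool  # True if stored away from the primary location


def meets_3_2_1(copies: list[Copy]) -> bool:
    """At least three copies, on two media types, with one off-site."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)


synced_only = [
    Copy("laptop", "ssd", False),
    Copy("Dropbox", "cloud", True),
]
with_external = synced_only + [Copy("USB drive in a drawer", "hdd", False)]

print(meets_3_2_1(synced_only))    # False
print(meets_3_2_1(with_external))  # True
```

Note that the check only counts copies. It cannot tell whether they are independent, and copies kept in lockstep by a sync client share the failure mode described in the article.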


I like keeping backups on a hard disk in the office. For a guy who grew up in the 90's there's something comforting about being able to physically look at one of your backup methods!


That's a great backup method, which is cheap and easy to recover. It's not very robust though. In the event of a natural disaster, your office is just as likely to be destroyed as your home. Ideally you would have an offsite backup (or two) in a completely different region.


The 'different region' requirement might just depend on different things:

- How far away are your office and home?

- What is your locality?

As an example, I don't live in a place prone to earthquakes or hurricanes, and my office is a 20 ~ 30 min drive from my home. What is the risk that both my office and home are destroyed?


It's all just a matter of risk management:

- Simply copying your data to a second hard drive in the same machine protects against hard drive failure.

- Second disconnected hard drive protects you from ransom-viruses and accidental erasures.

- Second disconnected hard drive on the other side of the house gives you some protection from a fire (caught early enough).

- Second disconnected hard drive in another building some distance away protects against small disasters (eg. randomly directed tornado).

- Second disconnected hard drive in another region protects against large disasters (eg. single nuclear strike).

- Second disconnected hard drive in another country protects against national disasters (eg. invasion, multiple nuclear strikes).

- Second disconnected hard drive on another continent protects against major disasters (eg. moderate asteroid impact event)

...and so on. But all of these can be protected against by simply having 3 backups.

Except bit rot. That shit's inevitable.
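You can't prevent bit rot, but you can at least notice it. A minimal sketch, assuming you record a SHA-256 digest per file when the backup is written and re-check it later; the function names are made up for this example, and it only detects damage, it does not repair anything.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damage(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return paths that are missing or whose content no longer matches."""
    damaged = []
    for rel, digest in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != digest:
            damaged.append(rel)
    return damaged


# Demo on a throwaway directory standing in for a backup drive.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.jpg").write_bytes(b"original bytes")
    (root / "b.jpg").write_bytes(b"other bytes")
    manifest = build_manifest(root)

    (root / "a.jpg").write_bytes(b"original bytez")  # simulate silent corruption
    damaged = find_damage(root, manifest)

print(damaged)  # ['a.jpg']
```

Store the manifest somewhere other than the drive it describes, and when a file shows up as damaged, restore it from one of the other copies.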


> - Second disconnected hard drive in another region protects against large disasters (eg. single nuclear strike).

> - Second disconnected hard drive in another country protects against national disasters (eg. invasion, multiple nuclear strikes).

> - Second disconnected hard drive on another continent protects against major disasters (eg. moderate asteroid impact event)

I'm not really sure if worrying about my hard drive will be something high on the priority list in many of these events.


If Toronto is hit by a devastating flood or winter storm, I think it's entirely likely that both your office and home could be impacted. That isn't nearly enough geographic separation to be disaster-ready. You really want to be on different coasts or continents.



The linked article is about someone who managed to lock himself out of his Google account. No word about Dropbox.


This is clearly a problem for the reputation of Dropbox.

On their website they market the service as:

"Even if your phone goes for a swim, your stuff is always safe in Dropbox and can be restored in a snap."

With this message in mind, and a valuation of roughly $10 BILLION, I think you can have very, very high expectations regarding the data retention of their service.

Also, the whole point of using a service like Dropbox is to remove the friction and time investment of the standard backup/sync tasks.

If you suggest handling multiple hard disk backups locally and offsite, that is fine, except that not everyone wants the time investment and cost associated with a do-it-yourself backup service; some would rather depend on a service provider that they can trust.


As I said to someone the other day after dealing with yet another failure by Dropbox to do the safe thing or even the reasonably-recoverable thing with our data: "One reason I'm not worth a billion dollars is I would never have let Dropbox ship like this!"

Doing sync on top of an existing filesystem, making it work with existing apps, maintaining solid transaction semantics, and having a simple, understandable user model -- you just can't do all four. The Dropbox experiment has now shown that "correctness" doesn't matter, commercially speaking.


We live in a convenient age, but also a fickle one. I laugh when my wife prints 100 hundreds of digital photos although her disaster recovery process is more stable than mine.


My wife does the same; she has a Snapfish account. She will have them print and mail her the most important pictures from a set, while trusting the ones she doesn't care as much about to our home server and Facebook (the former for long-term storage, the latter for convenient access). Since she has to order a certain number of prints every year to keep the account open, and it comes with unlimited free storage, it works out well for her.

The only exception was our wedding photos. We have a set of prints directly from the studio hired to take the pictures, as well as a digital copy on DVD from them. We also have those digital files backed up on our home server, a computer at my parents' house, Dropbox, Snapfish, and both of our Facebook accounts. The photography company also has long-term storage that we would have access to if all else failed. That's about as redundant as we can get without spending a ton of money, but it's more than most people seem to have.


I laughed at this when I first read it, but it's seriously true. For people that aren't computer literate, physical copies can sometimes be the best form of a backup for digital data.


100 hundreds is a lot of printing ;)


Dropbox has one of the worst websites on the Internet. Seriously!

E.g. people have mentioned that "packrat" would have helped in this situation. But try finding that service from the pricing page. Hint: NOT THERE. Someone here linked to some explanation in their "help" section. Seriously? Such an important feature is only mentioned under "help"?

Even if you go to sign up for the paid version, Dropbox Pro, packrat isn't mentioned [1]. Their horrid website alone is enough for me to not want to do business with them.

Someone else here has succinctly summarized the situation [2]:

   One reason I'm not worth a billion dollars
   is I would never have let Dropbox ship like this!
[1] https://www.dropbox.com/upgrade
[2] https://news.ycombinator.com/item?id=8105548


As many others have already mentioned, "sync is not backup". If you really want a true backup on a Mac, I suggest a setup like: 1- a local backup for quick recovery with Time Machine; 2- a remote backup with a true backup solution like CrashPlan (or similar).

I personally have such a setup, with Time Machine as the local backup and CrashPlan running all the time. It saved me a couple of times when I realized, many weeks later, that I had deleted the wrong file or folder... I got everything back without a problem.

If you don't have a Time Capsule, then you can always consider an alternative that covers a smaller subset of what you want to archive by using a JetDrive by Transcend. I recently got a 128GB JetDrive and it is the local Time Machine destination for my MBP... Yes, 128GB is not enough, but knowing that CrashPlan is constantly running makes me comfortable enough with that setup.


I use my Time Capsule's USB port to attach an external drive that is shared so that CrashPlan can back up to a folder on it. That's in addition to using CrashPlan Central for cloud backups and Time Machine.


> "you should backup"

What about in 10, 20 years? Photo libraries will keep inflating. Local storage will not. As of now I back up from an SSD Mac. What happens when I don't have a computer anymore?

Interestingly, people don't value "bits" or information. We value moments and emotions and work and art. There's no successful current consumer business model for people to store and backup photos (Backblaze is mainly prosumer).

And so aren't social networks the real backups by now? The redundancy of publishing on multiple services means some photos will stick and the rest will fade, somehow like former printed pictures I guess. Publish it or lose it?


> Local storage will not.

External backup drives continue to increase in size at a falling cost per GB.

It is cheaper and lower latency to store/retrieve a large amount of data locally than remotely, unless you believe that the current glut of "near-free cloud storage" will continue indefinitely. Market cycles suggest otherwise.

> Publish it or lose it

Why do social networks store your photos "for free"?


> And so aren't social networks the real backups by now?

I hope not. I cringe when I see people who treat them like they are. What do you do when your account password gets cracked (it happens to people on Facebook a lot), or the site itself goes away (MySpace)? Also, you're not able to store the images in their original form; I'm pretty sure sites like these will have limitations on what you can upload. I'm not a pro photographer by any means, but there's no way I could upload a RAW file to Facebook.


What happens when I don't have a computer anymore?

Wifi-enabled drives are already common; no computer needed. They are even targeted at mobile device users ("for iPad/iPhone").


I had an issue similar to this (but fortunately not nearly as bad) about two and a half or three years ago, and that was what convinced me to pay for the Packrat service add-on.

I agree with the OP that that should be a built-in option for all paying customers -- or at least make it more visible as an add-on. I've had instances in the past where it was months later that I realized something was either deleted or didn't sync and I had to look through Events and use Packrat to restore. It can be scary - even if you do make backups of your backups.


Once you deleted your own local copy, the folder of photos on Dropbox was no longer a "backup." You just moved your one copy elsewhere. You essentially had NO backup. And everyone should know, you should always have at least two backups. (Again, you had zero.) I wouldn't even consider Dropbox (without Packrat) a valid backup destination either, because any changes get synced.

Sorry this happened to you, but better backup practices could have avoided it easily.


My buddy Tom says "If it's not on your current computer, it's lost". Imagine how much will be lost when Dropbox closes one day (as every company ultimately does).


Since all the files are sync'ed to the computer, nothing would be lost unless selective sync was enabled.


A comparable bug would be that when the server goes away (Dropbox closes), the client starts deleting all local files.


It really sucks that we all have to keep learning this lesson over and over. Everyone I've ever spoken to that has a good backup strategy in place has it because they have lost irreplaceable files.

Have real backups. Syncing is not a backup strategy, raid is not a backup strategy, etc. 3-2-1: at least three copies, in 2 different (storage) formats, with at least one copy offsite. It sounds like overkill, but you have to decide how much your data is really worth.


"raid is not a backup strategy"

Though I have seen it used (to my horror) as a software release strategy.... :-)


i think people are still struggling because creating real backups still seems like kind of a pain in the ass compared to cloud syncing.

what's currently the most convenient and simple way to store multiple local backups?


It IS a pain and I don't know that there is a way to fix that. As I said, though, you have to decide what your data is worth to you.

I personally use http://duplicity.nongnu.org/ and a semi-complicated system of cron and bash scripts to back up my files to my local server, some flash drives and S3.


Getting a Time Capsule is pretty convenient if you're a Mac user. Being able to add USB drives to the Time Capsule also lets you store extra discrete snapshots using other backup software like CrashPlan.


I worry about stuff like that happening to me with my photos. The best idea I've come up with is setting up a Synology that I export my photos to. It has a basic web app for showing off photos. I then use its AWS S3 service to back those photos up every night. The Synology also has something like RAID1 on it for some redundancy at the local level. The hard part is remembering to export the files from the computer to the Synology.


I think the lesson here is not that the cloud is bad; it's that a sync service is not a backup. I use Dropbox to share and Amazon S3 to back up.


I use both Backblaze and Google Drive. I don't have anything in particular that I'd be completely upset about losing forever, which is why I don't do local backups, but I've definitely been looking at using CrashPlan as a secondary backup. I just can't be bothered to hook my laptop up to an external drive when I hardly sit next to it anyway.

What does HN think? Safe if I went with 2 cloud backup providers?


The solution to this kind of problem is: make read-only "disaster recovery" backups. Zip it, burn it to a DVD set or Blu-ray, and put it off-site. Make incrementals or differentials of those discs every few months and a full data set every 3 months.

It takes a few hours; I like to burn all my data to Blu-ray (50GB per disc) while watching a movie.

Store the Blu-ray data sets at a friend's house and never worry again.
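If the difference between incrementals and differentials is unclear: a differential holds everything changed since the last full backup, while an incremental holds only what changed since the previous backup of any kind. A rough sketch, assuming each backup run records a manifest mapping file paths to content digests; the manifests and digest strings below are invented for illustration.

```python
from __future__ import annotations


def changed_since(baseline: dict[str, str], current: dict[str, str]) -> set[str]:
    """Paths that are new or whose digest differs from the baseline."""
    return {path for path, digest in current.items() if baseline.get(path) != digest}


# Hypothetical manifests: path -> content digest at the time of each run.
full = {"a.jpg": "h1", "b.jpg": "h2"}                          # last full backup
last_incremental = {"a.jpg": "h1", "b.jpg": "h2x", "c.jpg": "h3"}
today = {"a.jpg": "h1", "b.jpg": "h2x", "c.jpg": "h3", "d.jpg": "h4"}

differential = changed_since(full, today)             # everything since the full
incremental = changed_since(last_incremental, today)  # only since the previous run

print(sorted(differential))  # ['b.jpg', 'c.jpg', 'd.jpg']
print(sorted(incremental))   # ['d.jpg']
```

Differentials need more disc space each time but restore from just two sets (the full plus the latest differential); incrementals are smaller but a restore needs every set in the chain. This sketch ignores deleted files, which a real tool would also have to record.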


Dropbox is NOT a backup service. It's a convenient file syncing and sharing service. And even then it's a good idea to actually check the docs or help before proceeding to do something which might result in a loss of data.


Just curious why more people don't learn about and invest in home NASes? Something like a Synology would be pretty good at storing 8000 important photos. It is still a single point of failure, but it's one I control.


It's a hassle to learn and administer, and it's more likely to eat your data. Many people do not have real internet connections at home that would let them assign it an IP and access it remotely. Plus it's hard to buy a good one unless you go for generous overkill. And after 2 years you have yet another vendor-abandoned Linux device with remote vulnerabilities on your home net.

(I still prefer it to cloud storage though)


Thanks for the point of view. I personally love the Synology at home and have a spare disk for cases where I may have a failure. I would say, barring something catastrophic, that I am set for another 3-5 years, which is fine. I would most likely buy a new model Synology 3 years from now and continue expansion. They also offer another disk shelf you can add on.


> If you are using Dropbox as a sole backup of your files, think again.

Good advice.

> Without making a mistake, you might lose your files.

If Dropbox, or any other single service or method, constitutes your entire backup strategy, then the mistake is already made.


I agree that a single service or method should not be your entire backup strategy. I back up pictures and important files to two separate external discs, besides the desktop's storage.

Sorry to hear, but sometimes it happens. Look at the positive side: you will have more precious moments to look forward to, and those lost pictures you will have to recollect the old-fashioned way, the "Do you remember the day..." kind.


I would never consider using Dropbox (or any other cloud storage) as my sole backup system for critical files unless I was willing to pay for the "Packrat", i.e. infinite history, feature.


The moment you only have one copy of your files anywhere you don't have a backup.

I've heard it said that if a file doesn't exist in at least three different places it might as well not exist.


Exactly. (I can't believe I had to scroll down this far to find this.)


Try going into the folders and manually restoring. You can select more than one file at a time.... I was able to restore for more than 30 days... Worth a shot.


Many thanks for your advice, this was the first thing I tried, unfortunately without any success...


Thanks for the tip on the Packrat feature. I didn't know about that. Added.


If it's not saved in three places, it's not saved.


Well, sorry to say this, but if all of your photos were stored only online...


It depends on how you look at it. They were stored online (Dropbox servers) as well as locally (Macbook). The problem is that a bug in Dropbox's client turned a de-sync operation into a delete operation. So one way to see it is that even having local storage was not safe because the sync was flawed. Another possible way to look at it is if your only local copy is synced with the cloud, then effectively your only storage is online.



