encryption is not gravy (benlog.com)
61 points by vgnet on April 30, 2012 | hide | past | favorite | 43 comments



There are many other issues with adding encryption to an existing service. Take Dropbox as an example. If Dropbox allowed each user to manage their own private/public key pair, then Dropbox wouldn't be able to see any of your data, and thus couldn't use de-duplication. This means they would need to actually allocate 2GB for every free user; no longer could they count on multiple users uploading the same files. Dropbox would have no way of knowing. Additionally, every minor change to a file would result in the entire file needing to be re-uploaded. The "previous versions" of each file would need to be re-worked, etc.

There are lots of technical issues with just slapping some crypto onto an existing service. User management of passwords/keys seems trivial compared to these problems.


As far as I know, you could do delta uploads with encrypted files. It should not be any different if the file itself is encrypted. For the same reason you can e.g. use rsync delta syncs with encrypted files...


There should not be any similarities between an encrypted file and the same file encrypted again after some changes are made to the file. Even if you just change one character in a text file.

I'm sure the rsync delta transfer algorithm still works fine with encrypted files, but the changes for encrypted files are going to be calculated as 100%, so you're not going to save anything.


I took this to mean that since the user side has full access to the plain text, the user client could calculate a delta, then encrypt it and upload just the encrypted delta.

You'd still need to pull it all down and apply the deltas each time you put it somewhere new, but it would work. Possibly with a 'full version' on some longer schedule. As a bonus, you'd get automatic history, which Dropbox stores anyway.

It's still got issues, no doubt, but it could be done, no?


What you describe is nearly exactly how an encrypted backup service called tarsnap works. Basically, it provides de-duplicated, compressed backups to the Amazon S3 cloud using local key files. The difference here is that it deals with data in blocks, not explicit deltas. This means you have no concept of 'history', and separate archives on the service have no strict dependencies on each other (they merely reference certain blocks that have been uploaded). It uses a CLI-only interface inspired by tar, and is very handy for cron backups of important data.

There is also a project called ddar, which is designed to provide merely the de-duplication of tarsnap so you can set up your own repository (tarsnap only works with its service).
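To make the "blocks, not deltas" idea concrete, here is a minimal sketch of block-level de-duplication. It is not how tarsnap or ddar are actually implemented: the fixed `BLOCK_SIZE`, the in-memory `block_store` dict, and the keyed block IDs are all assumptions made for illustration, and the encryption step itself is left as a comment.

```python
import hashlib
import hmac
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical and tiny, so the demo is easy to follow


def block_id(id_key: bytes, block: bytes) -> str:
    # Keyed hash, so the server cannot test guesses about block contents.
    return hmac.new(id_key, block, hashlib.sha256).hexdigest()


def store_archive(data: bytes, id_key: bytes, block_store: Dict[str, bytes]) -> List[str]:
    """Split data into blocks, upload only unseen blocks, return block references."""
    refs = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        ref = block_id(id_key, block)
        if ref not in block_store:
            # A real client would encrypt the block before uploading it.
            block_store[ref] = block
        refs.append(ref)
    return refs


def restore_archive(refs: List[str], block_store: Dict[str, bytes]) -> bytes:
    """Rebuild an archive from its list of block references."""
    return b"".join(block_store[ref] for ref in refs)
```

Two archives that share most of their blocks only cost the storage of the blocks that differ, and each archive is just a list of references, so deleting one does not break the other.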


You might be thinking of a hash. Hashes have a cascading effect where a single bit change in the source file has a "waterfall" effect during the hashing where the output is drastically different from the hash of the unmodified input.

I don't think this is necessarily the case with all encryption mechanisms.
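A small sketch of the difference, under loud assumptions: the "cipher" below is a toy XOR keystream built from SHA-256, used only because it fits in a few lines, and it reuses the same keystream for both versions of the file. Reusing a keystream like this is unsafe in practice, which is exactly why real designs use a fresh nonce per encryption and why their ciphertexts do end up looking unrelated.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 over a counter); a stand-in, not a vetted cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_encrypt(key: bytes, data: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def differing_bits(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def differing_bytes(a: bytes, b: bytes) -> int:
    return sum(x != y for x, y in zip(a, b))


old = b"The quick brown fox jumps over the lazy dog"
new = b"The quick brown fox jumps over the lazy cog"  # one character changed
key = b"hypothetical demo key"

# Hash: a one-character change flips roughly half of the 256 output bits.
hash_diff = differing_bits(hashlib.sha256(old).digest(), hashlib.sha256(new).digest())

# Keystream XOR with the same key: only the changed position differs.
cipher_diff = differing_bytes(xor_encrypt(key, old), xor_encrypt(key, new))

print(hash_diff, "hash bits differ;", cipher_diff, "ciphertext byte(s) differ")
```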


You could do the delta on the client side before you do the encryption. That's what Arq does.


This is nonsense. Block ciphers do exist.


Yes they do, though I don't think that was what he was referring to.


Bitcasa says they have a patented algorithm with both client-side encryption and de-duplication. Here's a related Quora thread: http://www.quora.com/Bitcasa/How-can-Bitcasa-possibly-achiev...


That kind of model, of necessity, leaks information about the files you are uploading, which is a major negative. It is now possible to tell if you've uploaded the same document as somebody else.

I'd love to know how they are doing it, though. I'm not sure I believe there is a way to do it that doesn't allow Bitcasa to read the file (at least not one using well-researched encryption technologies). It seems like they are likely using your key as a key encryption key for the actual key to the file, but for that to work, they have to be able to tell you the real encryption key, which means they have access to the contents.

I'd love to be proven wrong, though, because we could use it. :)


Probably something along the lines of E(data, H(data)), I'm guessing.

I'd be very surprised if this were patentable. The same basic technique was mentioned offhand in literature as early as 2003 (http://static.usenix.org/events/hotos03/tech/full_papers/mis...), and I suspect the idea's even older.
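For anyone wondering what E(data, H(data)) looks like in practice, here is a minimal sketch. Whether Bitcasa does anything like this is a guess, as stated above. The cipher is a toy XOR keystream built from SHA-256 so that the example needs only the standard library; a real system would use a vetted cipher, and all function names here are made up for illustration.

```python
import hashlib
from typing import Tuple


def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 over a counter); a stand-in, not a vetted cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def convergent_encrypt(data: bytes) -> Tuple[bytes, bytes]:
    """Return (key, ciphertext), where the key is the hash of the plaintext."""
    key = hashlib.sha256(data).digest()
    ciphertext = bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))
    return key, ciphertext


def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(ciphertext, keystream(key, len(ciphertext))))
```

Identical files produce identical ciphertexts, so the server can de-duplicate without holding any key. That is also the leak mentioned upthread: anyone who already has a file can compute its ciphertext and check whether you stored it.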


We don't know how much space Dropbox saves by having the dedupe feature. For all we know it could be something as small as 5%, or even less.


I would pay Dropbox a premium for this feature.


You and about 5,000 other people. Not a big market, unfortunately.


It may not be a big market that the masses are clamoring for, but it will soon be similar to services encrypting passwords in the database. The outside world shouldn't be expected to know about the existence of rainbow tables and the futility of md5-hashes. The educated, however, know what to demand, will expect their services to offer it, and will leave lacking-services for support-services - and in their exodus the masses inevitably follow (e.g. Hotmail -> GMail)


I think you are right. The only thing holding back cryptography from being widely adopted is the lack of a service that makes strong cryptography super easy to set up and use. I know this isn't a trivial task, but it's not impossible to implement strong crypto in a user-friendly way.


> The educated, however, know what to demand, will expect their services to offer it, and will leave lacking-services for support-services - and in their exodus the masses inevitably follow (e.g. Hotmail -> GMail)

That's not how it works.

People didn't leave Hotmail because the "educated" left, they left it because Google gave several GBs for free, the interface was simpler, the search better and MS was out of fashion.

If you're not a large market you don't get a service, or you only get niche vendors to cater to you. You can't bypass this by setting trends for the "uneducated" (whatever that means), or else we would all be using Lisp Machines or Smalltalk environments.


Educated/uneducated may not be the right terms, but it is possible to have foresight just by understanding the landscape.

Privacy is becoming a large problem on the internet and encryption will likely be part of the solution. Without encryption, ownership of the data lies with the service provider instead of the person.

Privacy is a feature just like free storage. One day, privacy can be available to the masses just like storage is today. (Also think back to how many people actually wanted or needed multiple GB of free storage for their emails until it was provided by a service like Gmail.)


This is wrong. SOX-compliant Fortune 500 companies can't legally use Dropbox due to confidentiality requirements. Adding encryption would fix the confidentiality issues.

The BIGGEST deepest pockets would pay handsomely for this.


FYI, SpiderOak has always supported zero knowledge encryption, and recently launched an enterprise product. https://www.computerworld.com/s/article/9226176/SpiderOak_la...


JungleDisk has too


How would you verify that they never got your encryption key? You installed their software on your computer. How can you be sure it never sent your key to their server?


The same way you can be sure that any of the hundreds or thousands of other programs installed on your computer are not keylogging and sending your passphrases and plans for world domination to the NSA, Bilderberg Group, and the Russian mob.

That is to say, you can't be sure. However, Dropbox is a company in good legal standing, and they have a lot to lose if they offered client side encryption and then leaked the passphrase.

SpiderOak (definitely) and Backblaze (I think?) already have client software which offers client-side encryption. Whether you trust them is up to you.


I'm launching a product shortly that offers exactly this feature to Dropbox (or Google Drive, SkyDrive, ...). It's a native OSX and Windows app that keeps the keys on the user's computer and transparently encrypts and decrypts files before they're sent to your cloud storage provider.

The product is in beta right now but I'd love some more people to try it out. If you sign up at http://safeboxapp.com, you'll get a download link to try it out.


Just signed up for it.

How does it compare to BoxCryptor where you can also access your files using EncFS?


Unlike BoxCryptor, it's not based on an encrypted volume. Instead, files and directories in a designated folder are individually encrypted (much like Dropbox).


Wow, I had the exact same idea around a month or two ago, but didn't have time to execute.

Please tell me you are sending out invites today.


I would too. I would pay maybe $100+ a year for this, even for only 4GB or whatever I currently have on Dropbox. But maybe this still wouldn't be profitable to develop if only a small group of paranoid users would buy it.


$10 a month is not a lot of money. But why not just use Dropbox on locally encrypted files?


Because then the web interface is worthless and you have to decrypt the files on every machine you sync with Dropbox.


Check SpiderOak - sounds like exactly what you want. https://spideroak.com/


For me, the fact that my data is inaccessible on the server is the single most interesting fact about that sync product.

All alternatives would be reasons to politely decline taking part. So IF there will be compromises in the future for the scenario where users cannot back up their own key, I do hope that the current behavior will always remain a viable option as well. I'd rather risk losing my data through my own stupidity (been there, often enough) than push my browsing history (potentially sensitive) or even passwords (..) to a random service on the net.


While I agree with most of this article, Adida says that a "full-strength, randomly generated, user-managed key" implies that "Enabling a new device requires coordination with an existing device". This is typically the case with current systems, but it is not necessarily the case. It is eminently practical for a human being to memorize an xkcd password with enough entropy to resist brute-force attacks into the foreseeable future.

http://lists.canonical.org/pipermail/kragen-hacks/2012-April... demonstrates encoding an 80-bit random number (which is plenty secure with a reasonable key derivation function) as each of "point pleased intense de maybe fairly arms", "bejuso jejigi nububi bidoda gahano", "ADD DOTE BID HILT LAUD MAIN CALF CITY", and "仴薦肨縨猯鹽", any of which is eminently practical to memorize. I use this program to generate my login passwords these days.
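As a rough illustration of the idea (not the linked program, whose word lists and encodings are not reproduced here), this sketch turns an 80-bit random number into pronounceable consonant-vowel syllables and back. The alphabet, the 6-bits-per-syllable choice, and the three-syllable grouping are assumptions made up for the example.

```python
import secrets

CONSONANTS = "bdfghjklmnprstvz"  # 16 consonants (hypothetical choice)
VOWELS = "aiou"                  # 4 vowels (hypothetical choice)
SYLLABLES = [c + v for c in CONSONANTS for v in VOWELS]  # 64 syllables, 6 bits each


def encode(number: int, bits: int) -> str:
    """Encode a non-negative integer of up to `bits` bits as syllable words."""
    count = -(-bits // 6)  # ceiling division: syllables needed to cover `bits`
    syllables = []
    for _ in range(count):
        number, index = divmod(number, 64)
        syllables.append(SYLLABLES[index])
    # Group three syllables per word for readability.
    return " ".join("".join(syllables[i:i + 3]) for i in range(0, count, 3))


def decode(phrase: str) -> int:
    """Invert encode(): least significant syllable comes first."""
    text = phrase.replace(" ", "")
    syllables = [text[i:i + 2] for i in range(0, len(text), 2)]
    number = 0
    for syllable in reversed(syllables):
        number = number * 64 + SYLLABLES.index(syllable)
    return number


secret = secrets.randbits(80)  # the random part must come from a CSPRNG
phrase = encode(secret, 80)
print(phrase)
```

The important property is that the number is generated first and only then rendered as text, so the phrase carries the full 80 bits no matter how it looks.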

(80 bits is not enough for a key for something like AES, because you can try a lot of different keys per second. It's plenty if you have a decent key derivation function to add a 25–35-bit work factor.)
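A sketch of what "adding a work factor" means, using PBKDF2 from Python's standard library as one possible key derivation function. The salt handling and the default iteration count are assumptions for illustration; a memory-hard function such as scrypt would be another option.

```python
import hashlib
import math


def derive_key(passphrase: str, salt: bytes, iterations: int = 2**20) -> bytes:
    """Stretch a passphrase into a 256-bit key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


def effective_bits(passphrase_bits: float, iterations: int) -> float:
    """Rough attacker cost in bits: each guess costs `iterations` hash operations."""
    return passphrase_bits + math.log2(iterations)


# An 80-bit passphrase with a 2**25 work factor costs about 2**105 operations to search.
print(effective_bits(80, 2**25))
```

The attacker has to repeat the whole derivation for every guess, so the iteration count multiplies the cost of brute force while costing the legitimate user only one derivation per login.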

This is different from a user-chosen password because users are often highly predictable in their password choice.


Right, that's what I labeled the password-derived encryption use case. Great to do the XKCD mechanism, and if you follow the link in that article you'll see that we're working on maximizing the key stretching we can do based on passwords:

https://wiki.mozilla.org/Identity/CryptoIdeas/01-PBKDF-scryp...

But unless you have a crazy long passphrase, you're not going to get 128 bits, let alone 256.


Yeah, I did read Warner's proposal. It's full of awesome, as his ideas usually are. Were you around that time that he showed Memento at his house, but with the scenes in chronological order?

As I explained in a late edit to my comment, this is distinct from your "password-derived key" case because it eliminates the major drawback of that case: "This is not as secure as the previous setting, since most user passwords are not nearly as strong as full-strength crypto keys." If you generate high-entropy passphrases, that problem goes away.

128 bits is overkill for defense against brute force. You can do maybe 2³⁰ crypto operations per second in a custom crypto-cracking processor, pack maybe 2²⁰ of them onto a custom chip, use maybe 2²⁰ custom chips in your Crypto Cracking Center in your evil genius volcano base, and let it run for maybe a year, 2²⁵ seconds, before you get bored. That's still only enough to search 2⁹⁵ keys, so you should be pretty safe with keys that need 2¹⁰⁰ operations to crack, at least for a few years. Or, if you don't have a supervillain or major intelligence agency devoting their full computational resources to reading your browser bookmarks, 2⁸⁰.
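The arithmetic above is just adding exponents. A quick check, with every figure taken from the estimate in this comment rather than from any measurement:

```python
import math

OPS_PER_SECOND_LOG2 = 30   # crypto operations per second per cracking core
CORES_PER_CHIP_LOG2 = 20   # cores packed onto one custom chip
CHIPS_LOG2 = 20            # chips in the volcano base
SECONDS_LOG2 = 25          # roughly one year of running time

total_log2 = OPS_PER_SECOND_LOG2 + CORES_PER_CHIP_LOG2 + CHIPS_LOG2 + SECONDS_LOG2
print("keys searched: 2**%d" % total_log2)

# Sanity check on "a year is about 2**25 seconds".
seconds_per_year = 365 * 24 * 3600
print("log2(seconds per year) = %.2f" % math.log2(seconds_per_year))

# Years needed to exhaust a 2**100 keyspace at that rate.
years_for_100_bits = 2 ** (100 - total_log2)
print("years for 2**100:", years_for_100_bits)
```

So a 2¹⁰⁰ keyspace leaves a factor of 32 over the one-year attack in this estimate, which is where "at least for a few years" comes from.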

I do think it's actually feasible for someone to memorize an 11-word phrase encoding a 128-bit key, but it would take several minutes and careful practice over the next few weeks to be sure they didn't forget it, and using a decent PBKDF with a 7- or 8-word passphrase is probably a better option.
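The word counts follow from how many bits each word carries. A small sketch, assuming randomly chosen words from lists of 2048 or 4096 entries (the list sizes are assumptions, not something stated above):

```python
import math


def words_needed(key_bits: int, wordlist_size: int) -> int:
    """Smallest number of uniformly random words that carries at least key_bits."""
    bits_per_word = math.log2(wordlist_size)
    return math.ceil(key_bits / bits_per_word)


for bits in (80, 128):
    for size in (2048, 4096):
        print(bits, "bits,", size, "words in list:", words_needed(bits, size), "words")
```

With a 4096-word list (12 bits per word) this gives 7 words for 80 bits and 11 words for 128 bits; with a 2048-word list (11 bits per word) it gives 8 and 12.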


If you install Ubuntu with home folder encryption, it gives you the "unravelled" encryption key for you to write down somewhere.

Seems like a good idea: If you forget your passphrase, you can recover your data with this.


The product he's talking about at the end does the same. If you join Sync (at least that's how I remember it), you'll create an account and see a long generated key. You're asked to store it somewhere safe or to print and file it.


If you're using separate passphrases for everything, surely that would be no more secure than writing the passphrase down?


> The most expensive cars have unlocking fallbacks.

This is only the case because the car company is sitting on a database of everyone's keys. It amounts to server-side security. If a security professional were designing a "secure" car, they would demand one which is truly bricked if you lose the keys.

I expect that part of the issue with cryptography is explaining to users why their data needs to be more secure than their car.


I wonder if you could set up a "key server". It would be like an online safety deposit box for your key. That way, no matter what computer you're on, you can access the keyserver and authenticate yourself, and it would recover your key.


Sure, but then all of your data is only as safe as that key server. And if you lose your key/password to it, you are just as hosed.

It really comes down to two options: take care of your key by yourself and have the best security and the highest risk of loss, or share your key with somebody else, reduce your security, and have a fallback against key loss.

My company is working on easy-to-use security. One of the things we are looking at is a middle-ground using key-splitting algorithms to give you very nearly as much security as the first with the fallback of the second.
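For readers unfamiliar with key splitting, here is the simplest possible form as a sketch: an all-shares-required XOR split. This is not necessarily what any particular product uses. It only shows that individual shares reveal nothing about the key; tolerating a lost share would need a threshold scheme such as Shamir's secret sharing, which is not shown here.

```python
import secrets
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, shares: int) -> List[bytes]:
    """Split key into `shares` parts; ALL parts are needed to rebuild it."""
    random_parts = [secrets.token_bytes(len(key)) for _ in range(shares - 1)]
    # The last part is the key XORed with every random part.
    last_part = reduce(xor_bytes, random_parts, key)
    return random_parts + [last_part]


def combine(parts: List[bytes]) -> bytes:
    """XOR all parts together to recover the key."""
    return reduce(xor_bytes, parts)
```

Each part on its own is indistinguishable from random bytes, so a server holding one part learns nothing until the other holders cooperate.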


"Every feature is someone else's bug."



