
For context: Amazon Cloud Drive is a Google Drive/Dropbox analog, except that it provides "unlimited" storage space and bandwidth for $5/month. The built-in user interface requires you to manually upload/download files through a web browser, but the service also supports an API for programmatic access. So tools like rclone and acd_cli were developed to let you do bulk transfers and/or mount your storage as a network filesystem, optionally with encryption (which defeats any deduplication Amazon might attempt).

Now Amazon is suffering the obvious consequence of offering unlimited storage: people are using it to store tens of terabytes of media and/or backups at very low cost. In an attempt to kill off heavy users, they shut down registration of new API keys several months ago, and now they're systematically revoking the API keys used by popular open-source tools.




Yeah, this is a hard one. There are users on /r/DataHoarder/ who claim to have uploaded literally hundreds of (encrypted) TBs to Cloud Drive, which is plainly unsustainable and abusive.

On the other hand, they've also killed the product for a lot of more legitimate users as well. The Amazon web interface and apps for Cloud Drive are obnoxiously terrible, and Rclone really is just a better way to use it. I've been using it to sync tens of GBs of photos between all my different computers, but with Rclone unavailable, I'll have to fall back to Google Drive, S3, or some other option (the unlimitedness of Cloud Drive was good peace of mind).

I'll be keeping an eye on it over the next few weeks to see whether shipping a binary with OAuth secrets was actually the reason for the ban, or just a pretext for getting the Rclone users off the service (personally, I suspect the latter).


I just use it to upload daily encrypted backups of my mail-server (< 500MB per month)... so I wouldn't mind if they set some reasonable limit to encrypted uploads (say 1-10TB).

I feel like anyone who's uploading genuinely personal content, rather than media files that are amenable to deduplication, would be comfortable with some threshold as well.


> so I wouldn't mind if they set some reasonable limit to encrypted uploads (say 1-10TB).

How is Amazon supposed to distinguish between encrypted and non-encrypted data that you upload?


A heuristic. No common magic header and poor compressibility? Likely encrypted.
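As a rough sketch of what such a heuristic could look like (the magic-number list, thresholds, and overall approach are my own illustrative assumptions, not anything Amazon is known to use):

```python
import math
import os
import zlib

# A tiny, illustrative subset of well-known magic headers.
MAGIC_HEADERS = [b"\x89PNG", b"\xff\xd8\xff", b"PK\x03\x04", b"%PDF"]

def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte; uniformly random data approaches 8.0."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def looks_encrypted(data: bytes) -> bool:
    if any(data.startswith(m) for m in MAGIC_HEADERS):
        return False
    # Encrypted data resembles random noise: near-maximal entropy,
    # and zlib can't shrink it at all.
    ratio = len(zlib.compress(data)) / max(len(data), 1)
    return shannon_entropy(data) > 7.9 and ratio > 0.99

print(looks_encrypted(os.urandom(1 << 16)))     # True: random bytes look encrypted
print(looks_encrypted(b"hello world " * 5000))  # False: compressible text
```

One obvious failure mode: already-compressed formats with headers the heuristic doesn't recognize would be misclassified as encrypted, which is why this can only ever be a guess.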


Since encrypted data is indistinguishable from random noise, I think poor compressibility is actually zero compressibility, isn't it?

There are tools to search for TrueCrypt / other encrypted partitions on disks, so it's a solved problem to detect encrypted data.

It would be unfortunate if services banned uploading encrypted data, though. On the other hand, that'd be good for Tarsnap. I wonder how much it'd cost to store 10TB on it?


The first billion digits of Pi might look pretty random, but there is a short program which generates them, which can be considered a compressed form. In general, it is impossible to decide how well a given string can be compressed. https://en.wikipedia.org/wiki/Kolmogorov_complexity
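To make that concrete, here is Gibbons' unbounded spigot algorithm, a short Python program that emits the decimal digits of Pi one at a time. The Kolmogorov complexity of even a billion digits is bounded by roughly the size of this program, even though a general-purpose compressor sees the digit stream as close to random:

```python
def pi_digits():
    """Gibbons' unbounded spigot: yields decimal digits of Pi forever."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

gen = pi_digits()
first20 = "".join(str(next(gen)) for _ in range(20))
print(first20)  # 31415926535897932384
```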


Might be a short program that generates them? I'm going to go ahead and file that in the understatement-of-the-century folder.


> Since encrypted data is indistinguishable from random noise, I think poor compressibility is actually zero compressibility, isn't it?

I don't think so. Random noise can take any form, even that of a string composed entirely of zeros, which would be trivially compressible. It's just very unlikely that it'll actually be compressible.


Random noise can't really take any form when you apply the implicit restriction of your search fitting within finite time and space.


I don't get what you're saying. Searching? For what?


The definition of a random number is in the process of generation. You have to actually generate random numbers if you want to have a random number that meets some criteria. And even with impossibly vast resources, you will never find a random megabyte that compresses well.


Well, sure, that's what I wrote: "it's very unlikely that it'll actually be compressible." But it's incorrect to claim that random numbers are by definition non-compressible.


It was in the context of encryption. Even a 64-byte random number is very unlikely to be compressible, and 64 bytes is about as small as an encrypted partition will ever be.


It'd take longer than the universe has left to find a string of N zeroes by generating random numbers, for sufficiently large N. And N is surprisingly small.
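Some back-of-envelope arithmetic shows just how small N is (the draw rate and timescale below are my own illustrative assumptions):

```python
# Suppose a machine draws 10**18 random 16-byte strings per second
# for 10 billion years. How many all-zero strings should it expect?
N = 16                        # bytes of zeroes we're hoping for
p = 256.0 ** -N               # chance a single draw is all zeroes (2**-128)
draws = 1e18 * 3.15e7 * 1e10  # draws/sec * seconds/year * years
expected_hits = p * draws
print(expected_hits < 1)  # True: even N = 16 is effectively out of reach
```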


Sure, but not zero.


Yes zero. As zero as zero can possibly be, measuring with the most precise instruments possible. There's an infinitely better chance of both of us being struck by lightning than of you finding such a number randomly.

It wouldn't be zero in certain math worlds. It is zero in the real universe.


So why not just take apart the official apps for Google Cloud Drive and use whatever API and authority it has to build your alternative client?


The nice thing about Rclone is that it already supports Google Cloud Drive (and a host of other providers) so the technology isn't the problem.

I'm lamenting Amazon Cloud Drive in particular because it was the best deal. $60 a year for unlimited storage and with no caveats (or so it seemed before last week).


I am sorry, "Google" was clearly a typo: I meant "Amazon". You do not need Amazon's permission to write a client that targets their API.


If a service claims to provide UNLIMITED storage, how is using it abuse?


HOPSFIELD No, these are entries for McDonald's Sweepstakes. No purchase necessary. Enter as often as you want. So, I am.

CHRIS Really?

HOPSFIELD This box makes it one million, six hundred thousand. I should win thirty two point six percent of the prizes, including the car.

CHRIS Kind of takes the fun out of it, doesn't it?

HOPSFIELD I suppose so. But they set up the rules, and lately, I have come to realize that I have certain materialistic needs.


For those who don't recognize it, that's from the movie "Real Genius". For those who don't quite remember it that way, that looks like it is the version from an early version of the script. By the time the movie was actually made it was changed to Frito-Lay from McDonald's, Hopsfield was changed to Hollyfeld, and the above dialog was altered slightly.

That scene is based on a real life incident which in fact involved a McDonald's sweepstakes and Caltech students entering over a million times: [1]

Since everyone knew that the school in the movie, Pacific Tech, was meant to be a thinly disguised Caltech (it only became Pacific Tech when Caltech objected), and McDonald's had not been happy with the Caltech sweepstakes prank and probably would not want it brought up, my guess is that one or both of McDonalds and Caltech asked for the change.

Changing it to Frito-Lay is an interesting choice, because six years before the McDonald's sweepstakes, a group of Caltech students tried mass entry on a Frito-Lay sweepstakes, but apparently were not as successful.

[1] http://hoaxes.org/archive/permalink/the_caltech_sweepstakes_...


Crap. I knew something seemed off about it. Thank you!

====

Lazlo: No. These are entries into the Frito-Lay Sweepstakes. "No purchase necessary, enter as often as you want" - so I am.

Chris: That's great! How many times?

Lazlo: Well, this batch makes it one million six hundred and fifty thousand. I should win thirty-two point six percent of the prizes, including the car.

Chris: That kind of takes the fun out of it, doesn't it?

Lazlo: They set up the rules, and lately I've come to realize that I have certain materialistic needs.


abuse: use (something) to bad effect or for a bad purpose; misuse.

Abuse doesn't mean breaking the rules. It's like going to an all-you-can-eat buffet and staying for a week. Or taking a job with unlimited vacation time and coming to work once a month.


They should declare limits, then.


I bet in the ToS there are limits spelled out, or barring that, a clause that grants Amazon the right to shut you down if they unilaterally decide you're "abusing" the system.


I have looked at the TOS in Spain and you won the bet :)

They can shut down your account for the reason you stated.


Yeah, never make a bet that a company doesn't have a vaguely worded way to kick off whoever they want at any time.

But it would be nicer for everyone if they could just honestly state a number of TB.


There are: you can only use approved third-party apps to connect to the service. This one is no longer approved.


Then we agree to disagree.


What do we disagree about?


I disagree with you that it is misuse.

What Amazon should do, as Microsoft should have done in the case of the unlimited OneDrive offer, is add some reasonable-use clause.

When a company offers an unlimited resource, some customers will use it to upload/download/etc. a lot.


Technically it's not. Hoarding for hoarding's sake is a waste of an obviously limited resource, though.


Being waste does not imply being abuse.


Wait, the resource is UNLIMITED, isn't it? At least it is sold as unlimited.


$5/month for unlimited access via API was really a steal. This would buy you only 217 GB via standard S3 storage, and then you'd have to pay an exorbitant $0.090 per GB for egress data transfer; you'd burn through your $5 after downloading 55 GB.
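Spelling out the arithmetic (the S3 prices are the ones the comment appears to assume, roughly $0.023/GB-month for storage and $0.09/GB for egress):

```python
budget = 5.00           # the Cloud Drive monthly price, in dollars
storage_price = 0.023   # S3 Standard, $ per GB-month (assumed)
egress_price = 0.090    # S3 egress, $ per GB (assumed)

print(int(budget / storage_price))  # 217 GB stored for a month
print(int(budget / egress_price))   # 55 GB downloaded
```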


I'm a bit skeptical about how much deduplication saves them. I have a couple hundred gigs on ACD, unencrypted, most being family photos/videos. I don't think they're going to find any portion of those that's remotely common to other users - at least, I hope not!


If it's file-level deduplication, sure. But if it's chunk-level, I think you may find that you actually have many chunks identical to many other users' chunks.
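For illustration, here's a toy version of chunk-level dedup (my sketch, not anything Amazon has documented): split each file into fixed-size chunks, key every chunk by its SHA-256, and store each unique chunk only once:

```python
import hashlib

CHUNK = 4096
store = {}  # chunk hash -> chunk bytes; each unique chunk stored once

def put(data: bytes) -> list:
    """Store data, returning the list of chunk hashes that reconstruct it."""
    recipe = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        h = hashlib.sha256(chunk).hexdigest()
        store.setdefault(h, chunk)  # skip storage if the chunk already exists
        recipe.append(h)
    return recipe

recipe = put(b"\x00" * (3 * CHUNK))  # three identical chunks of zeroes
print(len(recipe), len(store))       # 3 1: three references, one stored copy
```

Real systems tend to use content-defined (rolling-hash) chunk boundaries instead of fixed offsets, so an insertion near the start of a file doesn't shift, and thus invalidate, every later chunk.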


Unlikely given any non-negligible chunk size and accounting for the randomization effect of compression (image and video).

I think deduplication buys them most when people upload common assets - videos downloaded from the net, ebooks, ISO images, etc.


Doesn't compression generally reduce entropy?
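A quick experiment suggests the opposite, at least per byte: compression squeezes out redundancy, so what remains looks closer to random noise (this is my own sketch, not a claim from the thread):

```python
import math
import zlib
from collections import Counter

def entropy_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

low = b"ab" * 32768          # 64 KB, exactly 1 bit of entropy per byte
packed = zlib.compress(low)  # shrinks to a few dozen bytes
print(entropy_per_byte(low))                             # 1.0
print(entropy_per_byte(packed) > entropy_per_byte(low))  # True
```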


> In an attempt to kill off heavy users, they shut down registration of new API keys several months ago, and now they're systematically revoking the API keys used by popular open-source tools.

Actually this seems to have been more triggered by a serious issue with acd_cli's authentication server that resulted in its users seeing other people's files:

- https://web.archive.org/web/20170514020241/https://github.co...

- https://www.reddit.com/r/DataHoarder/comments/6bi5p5/amazons...

Amazon has started paying a lot more attention to open source tools as a result.


> now they're systematically revoking the API keys used by popular open-source tools.

Would this also impact Arq users? I was hoping Arq would support Backblaze...


I think Dropbox will shut down their API in July of this year. I got an email warning me about six months ago - since then I moved our internal system to FTP...


They just updated it to a new/improved version of the API.



