Does this pave the way for a “lite” version of the Dropbox client that _only_ syncs files and has none of the “added value” bloat that has crept in of late?
When are you going to offer a cheaper plan with less storage for people that only need <50GB?
I lucked out and have 2 free plans that have bonus storage from various promotions. I get about 25 GB per account. I haven't maxed either one.
I absolutely love the product. My wife scans a file, I can grab it right away. I'm at work and need some document (e.g., my driver's license photo), I hop on the website and download it.
I pay $5 for Backblaze to back up 5TB. I don't want to spend $10 a month for storage I'll never use (I couldn't even keep that much synced on most of my devices) but I'd gladly pay $3-5 a month for 50-100GB.
With Dropbox Family, each member of the plan has their own Dropbox account. A single person, the Family manager, will manage the billing and memberships for the entire Family plan.
Out of curiosity, how much does bandwidth usage contribute to your overall operational efficiency (as compared to for example the cost of running the actual servers)? Would totally understand if you can't answer this :)
Alexey from Traffic Team is here. Traffic is definitely a non-negligible part of the budget. We try to reduce it as much as possible for both lower operational expenses and better user experience. Main drivers for that improvement (besides owning our own Edge infrastructure) on the client side are:
1) Brotli (Broccoli) compression.
2) Differential updates through librsync.
3) "LANSync", a P2P sync within a broadcast domain (secured through server-issued, short-lived TLS certs).
That said, Desktop Client is only 1/3 of the overall Dropbox traffic -- the rest 2/3 are split between Web and API.
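For anyone curious what item 2 looks like in principle, here is a minimal sketch of a block-based differential update. It is not librsync's API or Dropbox's implementation: the function names and the tiny block size are made up for illustration, and it rehashes every window with SHA-256 where rsync-style tools would use a cheap rolling checksum first.

```python
import hashlib

BLOCK = 4  # unrealistically small, just to keep the demo readable


def signature(old: bytes, block: int = BLOCK) -> dict:
    """Map the hash of each fixed-size block of `old` to its block index."""
    sig = {}
    for i in range(0, len(old), block):
        sig.setdefault(hashlib.sha256(old[i:i + block]).digest(), i // block)
    return sig


def delta(new: bytes, sig: dict, block: int = BLOCK) -> list:
    """Encode `new` as ("copy", block_index) and ("data", literal_bytes) ops."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + block]
        idx = None
        if len(window) == block:
            idx = sig.get(hashlib.sha256(window).digest())
        if idx is not None:
            if literal:  # flush pending literal bytes before the copy
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", idx))
            pos += block
        else:
            literal.append(new[pos])
            pos += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old: bytes, ops: list, block: int = BLOCK) -> bytes:
    """Rebuild the new file from the old file plus the delta ops."""
    out = bytearray()
    for kind, arg in ops:
        if kind == "copy":
            out += old[arg * block:(arg + 1) * block]
        else:
            out += arg
    return bytes(out)
```

Only the "data" ops carry new bytes over the wire; the "copy" ops are just references to blocks the receiver already has, which is where the traffic saving comes from.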
Does this ratio include the Dropbox official mobile apps?
Have LANsync peers been considered as a source of blocks for mobile clients?
Like most, I’m observing (and participating in) multidimensional access to data. For now, accessing files on my local desktop is still much faster than direct downloads from the Dropbox cloud. It’s a bummer to source files that are on my LAN from the cloud. This may become more problematic as bandwidth billing models move toward pay-per-bit.
> Desktop Client is only 1/3 of the overall Dropbox traffic -- the rest 2/3 are split between Web and API
Interesting! I assume the desktop client is still Dropbox's main product so that's surprising to hear. Is it because the desktop has everything cached and rarely has to download whereas web and mobile have to download a fresh copy each time files are viewed?
This is why I continue to use Dropbox for daily work and constantly changing files. The syncing is unmatched. It’s surprising how bad the others like OneDrive and google drive are in comparison.
OneDrive completed its rollout of differential sync in April 2020[1], after beginning in Sep 2019. This should improve OneDrive’s sync speed substantially.
They already had this for Office files, it's just finally extended to all file types after several years. It's still nowhere near as fast as Dropbox, especially for complex directories, and the fact that it took until 2020 to finish this feature shows how far behind they are.
I recently switched from Dropbox because of the added device limitations for the free tier and because I don't really want to pay 10 euro a month for 2 TB of space when I only need 10 GB. Got myself a Nextcloud instance for a third of the cost and I have to say that the syncing absolutely sucks. It's so bad that I'm going to migrate away from it as well.
Not going back to Dropbox yet though. I'd rather try out Google Drive since I consider it to have much better consumer plans.
I stopped paying for Dropbox precisely because there was no sensibly-sized plan (150GB working set, was paying for 2TB and an extremely bloated desktop client). Decided to move everything to a combination of OneDrive (which I had been resisting for years) and SyncThing (which is OK but crufty):
I had the same issue, especially w/ syncing a large number of small files, and switched from Nextcloud to Seafile, which works way better on the same hardware.
I'm more of a security-focused engineer so I'm most interested in the "specially crafted low-privilege jail". What protocol gets data in and out, not shared memory I'm sure? Do the jail processes also have to implement an RPC server (protobuf/gRPC/HTTP?) or is there another mechanism for giving them work and receiving results?
And yes, much of the overhead stems from the RPC server that needs to be implemented. For Lepton we used a raw TCP server (a simple fork/exec server) to answer compression requests: we would establish a connection, send a raw file on the socket, and await the compressed file on the same socket. A strict SECCOMP filter was used for Lepton. It was nice to avoid this for Broccoli since it was implemented in the safe subset of Rust.
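To make the "send a raw file on the socket and await the compressed file on the same socket" pattern concrete, here is a rough sketch. Everything in it is an assumption for illustration: zlib stands in for Lepton, a local socket pair and a thread stand in for the TCP connection and the fork/exec'd worker, the half-close used to mark the end of the request is my guess at the framing, and no SECCOMP sandboxing is shown.

```python
import socket
import threading
import zlib


def read_all(conn: socket.socket) -> bytes:
    """Read from the socket until the peer closes its sending side."""
    chunks = []
    while True:
        buf = conn.recv(65536)
        if not buf:
            return b"".join(chunks)
        chunks.append(buf)


def serve_one(conn: socket.socket) -> None:
    """Worker side: read one raw file, reply with the compressed bytes."""
    with conn:
        raw = read_all(conn)
        conn.sendall(zlib.compress(raw))  # zlib is a stand-in codec


def compress_via_socket(conn: socket.socket, raw: bytes) -> bytes:
    """Client side: send the raw file, then wait for the compressed reply."""
    with conn:
        conn.sendall(raw)
        conn.shutdown(socket.SHUT_WR)  # half-close = end of request (assumed framing)
        return read_all(conn)


def round_trip(raw: bytes) -> bytes:
    """Demo helper: run the worker in a thread over a connected socket pair."""
    client, worker = socket.socketpair()
    t = threading.Thread(target=serve_one, args=(worker,))
    t.start()
    try:
        return compress_via_socket(client, raw)
    finally:
        t.join()
```

The appeal of this shape is that the worker only ever needs to read and write one already-open socket, which is what makes a very strict syscall filter practical.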
In my opinion broccoli does not go so well with bread (brötli = bread roll in swiss german), so some more matching name suggestions are: gipfeli (Croissant), weggli, pfünderli (500g bread), bürli, zöpfli
Savory with a touch of sweetness, Broccoli Bread cooks up like cornbread but offers fiber and calcium. The original name was Brot-cat-li (since files could be concatenated and compressed in parallel), but when we said it fast it sounded like "Broccoli" and the name stuck.
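The "concatenated and compressed in parallel" idea can be sketched with Python's standard library. This is an analogy only: gzip stands in for Brotli here because gzip members can be concatenated byte-for-byte without any special handling, whereas Broccoli has to do extra work to make Brotli streams concatenable. The function name, chunk size, and worker count are all made up.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def compress_concatenated(data: bytes, chunk_size: int = 1 << 20,
                          workers: int = 4) -> bytes:
    """Compress fixed-size chunks independently and join the results."""
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk becomes its own complete gzip member.
        members = list(pool.map(gzip.compress, chunks))
    # Concatenated gzip members form a valid multi-member gzip stream.
    return b"".join(members)
```

`gzip.decompress` reads multi-member streams, so the receiver does not need to know how the sender split the file. The cost is a slightly worse ratio, since each chunk starts with no history from the previous one.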
We heavily investigated zstd and met with the brilliant inventor, Yann, who provided amazing insights into the design and rationale behind zstd and why it is so fast and such an amazing technology. I also recompiled zstd into rust using https://github.com/immunant/c2rust and tried using various webasm mechanisms to run it (I didn't get webasm quite fast enough, and teaching c2rust to make it safe would be quite a slog).
But the main reason we settled on Brotli was the second order context modeling, which makes a substantial difference in the final size of files stored on Dropbox (several percent on average as I recall, with some files getting much, much smaller).
And for the storage of files, especially cold files, every percent improvement imparts a cost savings.
Also, widespread in-browser support of Brotli makes it possible for us to serve the dropbox files directly to browsers in the future (especially since they are concatenatable). Zstd browser support isn't at the same level today.
> the main reason we settled on Brotli was the second order context modeling
This advanced feature is only relevant on reaching compression levels 10 or 11, which are extremely slow. Below that, it's barely used by the encoder, due to memory and cpu taxes.
Given your application has reached speed concerns, and ends up using brotli at compression level 1 in production, you would be surprised to notice that in this speed range, zstd compresses both faster and stronger, by a quite substantial margin.
For long term storage of blocks, we compress at much higher compression levels like you mention. These densely compressed blocks are, in turn, served directly to customers when they download their own files.
For uploads you're right: we'd be theoretically better off with high performing zstd, but there are maintenance costs with maintaining 2 separate compression pipelines that are similar, but different, for upload and downloads.
Plus there is no safe rust zstd compressor and the safe rust zstd decompressor linked in this thread is only recently available and is also several times slower than the safe rust brotli decompressor.
> Pre-coding: Since most of the data residing in our persistent store, Magic Pocket, has already been Brotli compressed using Broccoli, we can avoid recompression on the download path of the block download protocol. These pre-coded Brotli files have a latency advantage, since they can be delivered directly to clients, and a size advantage, since Magic Pocket contains Brotli codings optimized with a higher compression quality level.
It looks like they did, but having an implementation in a memory-safe language was one of their requirements. Learning that was for me the most fascinating part of the article.
I'm sure they could implement it technically speaking, but if a compression protocol is not widespread enough to have others doing such a thing, they can probably consider that a sign of how supported it is.
> Maintaining a static list of the most common incompressible types within Dropbox and doing constant time checks against it in order to decide if we want to compress blocks
There is also a format-agnostic and adaptable heuristic: stop compressing if the initial part of the file (say, the first 1MB) seems incompressible. I'm not sure whether this is widespread, but I've seen at least one piece of software doing that and it worked well. This can be combined with other kinds of heuristics like entropy estimation.
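A minimal sketch of that heuristic, assuming zlib at its fastest level as the trial compressor and a 5% minimum saving as the cutoff (both are arbitrary choices of mine, not anything from the article), plus a byte-level Shannon entropy estimate as the second signal:

```python
import math
import zlib
from collections import Counter


def shannon_entropy(sample: bytes) -> float:
    """Estimate entropy in bits per byte from byte frequencies (0.0 to 8.0)."""
    if not sample:
        return 0.0
    n = len(sample)
    return -sum(c / n * math.log2(c / n) for c in Counter(sample).values())


def looks_incompressible(data: bytes, probe_size: int = 1 << 20,
                         min_saving: float = 0.05) -> bool:
    """Trial-compress only the first `probe_size` bytes with a fast setting.

    Returns True when the probe shrinks by less than `min_saving`,
    in which case compressing the rest is probably not worth the CPU.
    """
    probe = data[:probe_size]
    if not probe:
        return False
    compressed = zlib.compress(probe, 1)
    return len(compressed) > len(probe) * (1 - min_saving)
```

Byte-frequency entropy is cheap but only catches order-0 randomness. Already-compressed or encrypted data scores close to 8 bits per byte, while repetitive data with a flat byte histogram can also score high and still compress well, which is why the trial compression is the more reliable of the two.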
This is a really interesting write up of their use of Brotli! Makes me wonder if there might be a novel way I could leverage it beyond HTTP Responses.
I never realized the advantages of brotli over zlib could be so extensive. In particular, it appears they're getting a huge speed boost (I think also in part because it's written in Rust).
> we were able to compress a file at 3x the rate of vanilla Google Brotli using multiple cores to compress the file and then concatenating each chunk.
Side note: I admit, at first I thought they were talking about the Broccoli build system[0]
The tradeoff between client CPU time and upload speed is interesting. If they need to be able to output compressed text at 100 Mbps, that gives a budget of ~100 ns/byte, or pretty much what they would have been spending with zlib in the first place. But on my fiber connection I only have a budget of 10 ns/byte. Does that mean you'd use the equivalent of `brotli -q 1` for me? If so, doesn't the march of progress continually erode the advantages of compression in this use case?
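The per-byte budget is just the time the link needs to move one byte. A quick sketch of the arithmetic, where the function name is made up and the budget is measured against compressed output at line rate:

```python
def ns_per_byte(link_mbps: float) -> float:
    """Nanoseconds available per output byte to keep a link saturated."""
    bytes_per_second = link_mbps * 1_000_000 / 8
    return 1e9 / bytes_per_second
```

At 100 Mbps this gives 80 ns/byte, the same ballpark as the ~100 ns figure above. A budget of 10 ns/byte corresponds to a link of about 800 Mbps, so roughly gigabit fiber (the comment does not state the actual link speed, so that part is inferred).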
They aren't on the same level of abstraction. Rsync currently uses zlib for block compression on the wire. Brotli/broccoli would be an alternative option.
Various compression enhancements, including the addition of zstd and lz4 compression algorithms and a negotiation heuristic that picks the best compression option supported by both sides.
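The "negotiation heuristic" in that changelog entry can be pictured roughly like this. This is a hypothetical sketch, not rsync's actual negotiation code or wire format: the function name, the preference order, and the "none" fallback are all assumptions.

```python
def negotiate(local_preference: list, remote_supported: set) -> str:
    """Pick the first locally preferred codec that the peer also supports."""
    for name in local_preference:
        if name in remote_supported:
            return name
    return "none"  # assumed fallback: send uncompressed
```

For example, `negotiate(["zstd", "lz4", "zlib"], {"zlib", "lz4"})` picks `"lz4"`, since it is the highest-ranked option both sides share.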
Is there a pun between Broccoli and Brotli I'm not aware of? There's another Brotli compression tool called Broccoli (written in Go), just a coincidence?
Great question! We developed and deployed Lepton to losslessly encode JPEG image files. Lepton continues to deliver substantial storage and cost savings every year. You can read more about it here https://dropbox.tech/infrastructure/lepton-image-compression...