FWIW, I've been using syncthing [0] for some years now [1] and am very pleased. Even though my data isn't available in the cloud or from any untrusted computer (e.g. my corporate laptop), it's synced across my "fleet".
I'm not sure that PrivateStorage actually adds anything to the equation?
EDIT> The Tahoe LAFS [2] model is more that you spread your data over multiple providers. NAS at home, several VPS providers, or what have you. It feels like RAID in the network, and it allows very precise setting of redundancy policies.
So syncthing actually only runs on trusted machines, whereas PrivateStorage will be able to run on both trusted (tightly managed) and untrusted machines (like a VPS in the USA).
How much data do you sync? I'm syncing 60 GB with NextCloud and it annoys me frequently: every time I log in it spends 5 minutes scanning my data, pegging at least one core of CPU and using up a lot of my I/O capacity. And of course at a pretty annoying time, since I almost always want to be actually using my machine during the first 5 minutes after logging in. And I'd really like to be syncing more data. Anyway, wondering if syncthing does better in this respect.
I'm using Syncthing as well, and my Syncthing directory on my laptop currently holds 52 GB of data.
That includes a Syncthing share for my automatic backups and my "dropbox" replacement (a simple directory for syncing between my phone and computers).
It works great. I haven't had any issues with it. The current release of Syncthing is very stable. Earlier versions were a bit error-prone, but they seem to have ironed out most, if not all, of the bugs I encountered in earlier versions.
I have about 10 GB in multiple folders synced in various combinations between:
- ec2 instance (light burstable, always on)
- desktop (debian)
- laptop (manjaro)
- desktop (windows)
- phone (one-way sync to get photos off the phone)
- tablet (kindle fire)
It has been great. Solid and trustworthy. It picks up changes made before the service was running, handles deletes and renames just fine, and updates are simple. The web-based UI is good.
I like that calmh and the team are not adding lots of features. There are lots of things they could add to make syncthing "better", but they want to make sure syncthing does one thing well.
One nit - the UI shows the "latest change" for each folder. The common understanding of this phrase would be "the file in that folder that most recently changed" but what syncthing actually shows here is "the most recent change that syncthing made to this folder". That means that if I change a file on the current device and syncthing picks it up and replicates it out to the other devices, that change will not be shown as the "latest change". If some other device changes the file and syncthing replicates that change back to the current device, then it will be shown as the "latest change". This is confusing. "latest change" should just show the file that most recently changed for any reason.
I'm syncing around 90 GB between my server, laptop, and LineageOS Pixel phone. I use it to sync my documents, music, passwords, and archived pictures. I also use it to sync photos taken by my phone camera as they are taken.
Setup:
* Camera: 1.8 GB, 243 files
* Documents: 10.8 GB, 4604 files
* Music: 61.5 GB, 25077 files
* Passwords: 660 KB, 726 files
* Pictures: 16.5 GB, 6450 files
The passwords are managed by 'pass' [0], which is viewable on my phone using Password Store [1]. Cold-launching Syncthing takes ~10 seconds on my phone, but it does it automatically on boot and thereafter runs in the background. Battery impact seems to be negligible.
So far I've been disappointed with sync issues with Spideroak, OneDrive, and Nextcloud.
Now I use Tresorit (which I only became aware of because of... an online ad!?). It doesn't seem to have sync issues for me. Dropbox didn't have issues either, but it wasn't as secure.
Given how rock-solid Syncthing has been, I wonder how hard it would be to bolt encryption onto it so anything that some specific nodes receive is always encrypted.
Unfortunately that's harder than just always leaving a Raspberry Pi on at home, especially given that I want to be able to sync files to my phone, where EncFS probably doesn't work at all (or easily).
I'm unfamiliar with syncthing, but could you run two daemons: one that does encrypted sync to e.g. Dropbox, and one that does plain sync to your phone and such? Or would the two instances stomp on each other or get into an infinite loop?
Some syncthing nodes could host only the encrypted data, without the keys to decrypt it. This adds the benefit of having some nodes host the data without being able to access it. Think: a VPS, etc., which has a very good availability track record but where there's some doubt about whether your hosting company can spy or might be coerced into spying.
Exactly. If I could be sure that the VPS couldn't read or mess with my files without me knowing, I'd definitely add a SyncThing node on my VPS and have increased availability along with security without any hassle.
Tahoe-LAFS by itself is probably not (you do have to configure and keep some Python-based daemon software running), but PrivateStorage is a managed service.
I trust S3, B2, or Google's blobstore more than some rando's machine running Filecoin. Tahoe-LAFS gives you the assurance that the actual backend storage only sees encrypted data. The big clouds have the advantage that they are probably more reliable and faster, with lower latency, better uptime, and lower prices.
The Tahoe LAFS model is more that you spread your data over multiple providers. NAS at home, several VPS providers, or what have you. It feels like RAID in the network, and it allows very precise setting of redundancy policies.
In the PrivateStorage case the machines aren't "trusted" but they are all run by the service you're paying for -- so the incentive to keep them running properly is indeed there.
For other kinds of Tahoe deployments, no, there's nothing built in to incentivize storage-server operators. That part is up to whoever is organizing and running the Grid (what Tahoe calls a group of storage-servers). For example, friends could agree to host storage-servers for each other and create redundancy + trust that way.
The difference between Tahoe and things like Storj / FileCoin is that those services intend to be "a single, global service" whereas Tahoe is software that can be deployed in several different ways -- one of which is a professionally managed Grid such as PrivateStorage.
If you are interested in these topics I'd encourage you to join #tahoe-lafs on Freenode or one of the Tahoe development meetings. These are definitely things I've seen discussed but I think Tahoe-LAFS is far more likely to introduce a concept of "federated Grids" rather than "a single global Tahoe service".
You can configure Tahoe-LAFS to store data wherever you want, but I guess PrivateStorage will have its own settings and you won't be able to select a NAS at home. Just a guess though.
An expert could figure out how to get the PrivateStorage Tahoe client to use other storage servers, but yes in general it is "a managed service" and I don't think using your own storage-servers will be "a supported configuration".
Making user-owned storage user-friendly for people without sysadmin skills is a challenge, but it's something we are working towards.
Because our Gaia hubs are associated with users' IDs, we are also working on automating SSL as much as possible for individual users. This is another technical challenge that makes it difficult for the average person to set up their own trusted environment where they control their own data.
Sure, yes you could do that -- I mean, PrivateStorage is just shipping you a "real actual Tahoe client". The main feature you're getting is the managed storage-servers.
So if you happened to "not completely trust" the availability of those you could also configure one of your own and configure your client(s) to use that and the PrivateStorage servers. That is, hedging against PrivateStorage going away so suddenly you can't retrieve your data.
But, I agree: if you're doing that you're likely able to run your own Tahoe grid on VPSes or similar.
Looks like it's not actually ready yet? PIA has a great track record, though, so this seems promising.
I also like this when you give them your email:
>> This is not a mailing list, and your email will be permanently removed after we send a one‑time notification when PrivateStorage is available to the general public.
There are a number of alternative paths in this space if you're truly focused and willing to invest a bit, but if you care about privacy enough to seek a service like this out and just want to minimize mental overhead, this seems like a good choice.
Tahoe-LAFS makes some impressive claims, like maintaining confidentiality while running on untrusted machines. I think a lot of folks now would assert that any machine running x86 should in fact be considered untrusted, due to Intel ME and the AMD equivalent.
I'm not in a position to criticize though, this is just from a cursory glance at the summary page, and frankly I used PIA as my own VPN provider for a number of years and had only positive experiences.
(author of Tahoe here, although I'm not much involved these days)
> Tahoe-LAFS makes some impressive claims, like maintaining confidentiality while running on untrusted machines. I think a lot of folks now would assert that any machine running x86 should in fact be considered untrusted, due to Intel ME and the AMD equivalent.
To be precise, our claim is that you can use untrusted servers, since the client encrypts the data before it leaves your machine. You are, of course, entirely reliant on your own client being trustworthy. Nothing can save you if your client is compromised, whether via ME, a BIOS infection, an OS rootkit, or a boring old userspace compromise.
The Tahoe-LAFS client runs pretty well on ARM and Raspberry Pis, in case that feels better.
> There are a number of alternative paths in this space if you're truly focused and willing to invest a bit, but if you care about privacy enough to seek a service like this out and just want to minimize mental overhead, this seems like a good choice.
It feels to me like 'borg'[1] is becoming the de facto standard for this use-case. There were a number of similar tools (like duplicity) for years but borg seems to have buttoned up all of the issues.
So on one hand, you just need some good open source software for that; there's enough cloud, and there's no reason you wouldn't choose the cheapest one if everything is client-side encrypted and you can add more redundancy. On the other hand...
Should anything truly private be stored in the cloud? I have never seen a solution that doesn't boil down to trusting someone. The claim is that the code is open source, but I don't know how I would verify that that's the actual code they are running on their servers.

I also don't understand the payoff. For information that's not truly private (like your music collection) but that could possibly be data-mined, the very basic level of privacy you get from something like Dropbox should be enough, right? What does this service offer that other cloud storage providers don't? And for information that's truly private, why would I risk it eventually becoming available to hackers by putting it somewhere in the cloud? What am I missing?
The data is encrypted on your client before it leaves your computer. You're relying upon the servers to hold onto your ciphertext (i.e. availability), but not to keep it secret (confidentiality). And the client can detect changes to the ciphertext, so you aren't relying upon the servers for integrity either.
You have to trust the client code, for sure, but that's something that you're at least nominally in a position to inspect and verify. https://github.com/tahoe-lafs/tahoe-lafs
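A minimal sketch of the integrity side of that claim, with made-up function names (not Tahoe's actual capability format): the client records a digest of the ciphertext at upload time and re-checks it on download, so a server can't silently alter what it stores.

```python
import hashlib

def make_capability(ciphertext: bytes) -> str:
    """Record a digest of the ciphertext at upload time.
    In real Tahoe this is baked into the file's capability string;
    here it's just a bare SHA-256 hex digest."""
    return hashlib.sha256(ciphertext).hexdigest()

def verify(ciphertext: bytes, capability: str) -> bool:
    """On download, recompute the digest: a server that tampered with
    (or corrupted) the ciphertext cannot produce a matching value."""
    return hashlib.sha256(ciphertext).hexdigest() == capability

cap = make_capability(b"opaque encrypted bytes")
assert verify(b"opaque encrypted bytes", cap)
assert not verify(b"tampered bytes", cap)
```

Confidentiality comes from encrypting before upload; this digest check is what lets the client catch a misbehaving server without trusting it.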
I'm a programmer, and I still don't think I'm in a position to verify whether something is cryptographically secure. It's quite possible that a client was built with an extremely subtle backdoor already in mind, one that crypto experts won't find for years.
Yes, but it's like when you're at a cafe and need to go to the bathroom so you ask the random guy next to you to watch your laptop. Sure he could steal it, but you reduced the attack vector to just him.
It's a reasonable analogy. To extend it: I'm suggesting you don't trust anyone with your laptop and bring it to the bathroom with you. If something is truly private and/or valuable, don't put it in the cloud. I'm not alone in that thinking: when it comes to storing people's digital currency, you hear about things like cold storage, for very good reason.
There are IPFS driver requests, and now requests for drivers supporting PrivateStorage by Least Authority as well, if you also want to replicate your data temporarily across some network of nodes.
While Gaia fundamentally does not require using the comprehensive Blockstack API, we are working on tutorials covering the use of Gaia alone, without Blockstack. They are designed to function independently of each other: in the same way people can use Blockstack authentication without Gaia, the reverse can be true: https://docs.blockstack.org/storage/overview.html
Currently, I want it to be even easier than bootstrapping Gaia with docker-compose for users to host on their own machine, Raspberry Pi, or what have you. We are working on that, as well as on cloud-hosted solutions.
I would like people to be able to launch a VM from a preinstalled image locally on their own machine, not just on Google Cloud, Amazon, DigitalOcean, etc. The groundwork for a secure and minimal VM is mostly in place. We need to write more instructions for this, but feel free to launch the docker-compose and give it a whirl in your environment of choice if you don't want any of the cloud AMIs we currently offer.
I'm curious what the pricing will be when this is opened up to the public. Some years ago when I compared encrypted online storage, I found Least Authority to be quite expensive. It still seems to be ($25 a month). [1]
Tahoe-LAFS does "erasure coding" on the chunks of data. This increases the size of the data (adding redundancy) so that you can recover a file without recovering every single chunk. These parameters are decided client-side. In the smallest possible case (i.e. every chunk required) there is some slight overhead from the zfec and Tahoe headers.
If you are using redundancy of any kind, it will inflate the size of the ciphertext versus the plaintext thus affecting sync speed.
Tahoe-LAFS does split everything up into fixed-size chunks, though, so the total size of the file doesn't really matter -- it will still be uploaded in 128 KiB (default) chunks to the storage servers.
So, it's not the encryption that has an impact but the erasure-coding (which gives the "RAID-like" features) and you can configure it to have zero redundancy and thus only some slight increase in the total amount of data to send.
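As a rough illustration of the k-of-n idea: real Tahoe uses zfec with configurable parameters (3-of-10 by default), but a toy 2-of-3 scheme built on XOR parity shows the shape of it. Everything here (segmenting, then erasure-coding each segment so any 2 of 3 shares suffice) is a made-up sketch, not Tahoe's actual code.

```python
SEGMENT_SIZE = 128 * 1024  # Tahoe's default segment size

def segments(data: bytes):
    # each fixed-size segment is erasure-coded independently,
    # so the total file size only affects the number of segments
    for i in range(0, len(data), SEGMENT_SIZE):
        yield data[i:i + SEGMENT_SIZE]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode_2_of_3(segment: bytes) -> dict:
    # split the segment in half and add one XOR parity share:
    # any 2 of the 3 shares can rebuild the segment
    if len(segment) % 2:
        segment += b"\x00"  # toy-only padding to an even length
    half = len(segment) // 2
    s0, s1 = segment[:half], segment[half:]
    return {0: s0, 1: s1, 2: xor(s0, s1)}

def decode_2_of_3(shares: dict) -> bytes:
    # rebuild from whichever 2 shares survived
    if 0 in shares and 1 in shares:
        return shares[0] + shares[1]
    if 0 in shares:
        return shares[0] + xor(shares[0], shares[2])
    return xor(shares[1], shares[2]) + shares[1]

data = b"hello erasure coding"  # even length, so no padding needed
shares = encode_2_of_3(data)
for lost in (0, 1, 2):  # lose any single share...
    surviving = {i: s for i, s in shares.items() if i != lost}
    assert decode_2_of_3(surviving) == data  # ...and still recover
```

With this toy 2-of-3 scheme the stored shares total 1.5x the original size; Tahoe's 3-of-10 default stores roughly 3.3x, which is the inflation being discussed here.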
Hadn't even thought about a difference in size; I was thinking the CPU overhead. If I save a 1GB file, how much processor time will it take to re-encrypt the whole thing so it can be sent off? Or does the chunking apply here too; i.e. only the chunk of the file that's changed has to be re-encrypted?
I don't know the exact answer to that, but "not much" in comparison to the time to send the bytes over the network. The actual contents are encrypted using AES which often has built-in instructions on modern processors and is thus very fast. The vast majority of the time is uploading time here.
Tahoe does use "convergent encryption" (basically, the key is based on the contents) so that the same file encrypted by the same client results in the same ciphertext (and thus, doesn't need to be re-uploaded).
I believe that only happens at the "capability" (i.e. file) level, though, not each chunk. So, if you had a directory of 10 files each 100MB and changed one, you'd only have to upload the new directory-descriptor and the one changed file -- but if you change a few bytes of a 1GB file, you'd have to upload all the ciphertext for that file again.
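A toy sketch of the convergent-encryption idea (hypothetical names; real Tahoe's construction differs, and its key derivation also mixes in a per-client convergence secret to resist confirmation attacks): deriving the key, and hence the server-visible storage index, from the content itself means identical files map to the same index and need not be re-uploaded.

```python
import hashlib

def convergent_key(plaintext: bytes) -> bytes:
    # key derived from the content itself, so the same plaintext
    # always produces the same key (and the same ciphertext)
    return hashlib.sha256(b"convergence:" + plaintext).digest()

def storage_index(key: bytes) -> str:
    # servers see only this index, derived one-way from the key;
    # they never learn the key or the plaintext
    return hashlib.sha256(b"index:" + key).hexdigest()[:32]

# the same file, "uploaded" twice, maps to the same storage index,
# so a client can notice it's already stored and skip re-uploading
k1 = convergent_key(b"same file contents")
k2 = convergent_key(b"same file contents")
assert storage_index(k1) == storage_index(k2)
assert storage_index(convergent_key(b"different")) != storage_index(k1)
```

The trade-off mentioned above follows from this: since the key covers the whole file, changing a few bytes changes the key, the ciphertext, and the index, forcing a full re-upload of that file.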
https://restoreprivacy.com/5-eyes-9-eyes-14-eyes/ Mostly, but also by some of the news I see over here about parliaments trying, with more or less success, to pass laws to take down websites or force them to comply for questionable reasons.
I understand that there could be reasonable arguments behind, but I feel very uneasy about it.
Tarsnap's target audience is sysadmins and other UNIXy gurus. My grandma and my dad, who would benefit most from a secure sync mechanism, would probably be unable to use it.
Also, I'm not sure how you could use tarsnap at a reasonable cost for p2p sync.
Tarsnap is for backup; I don't think it can really be used for sync in the general sense. It's also hard to predict how much it's going to cost you if you don't know exactly how much data you're going to upload (and it's IMO prohibitively expensive for even moderately-sized backups).
I love the tech behind Tarsnap, I love that the client is open source, I love the whole philosophy of it, but I really struggle with the pricing.
[0] https://syncthing.net
[1] https://try.popho.be/byeunison.html
[2] https://tahoe-lafs.org/trac/tahoe-lafs