And it looks like two of the last really big tipping stones are being addressed:
1) Advanced NAT traversal via UDP, which is less efficient than TCP, but orders of magnitude better than relying on relays. More about this here: https://kastelo.net/2017/03/08/syncthing-kcp.html
2) Internal filesystem notification watch facility for near realtime sync (currently possible via an external service -> syncthing-inotify). For more see pull request https://github.com/syncthing/syncthing/pull/3986
I have one suggestion for the authors. Make it clear on the front page exactly what Syncthing does. There's lots of information about Syncthing's qualities but if you don't already know what the software does, it's not obvious.
But the same suggestion could be made about almost any product with a web-site these days. Indeed Syncthing is doing much better than most; they have a box at the top of the page with a one-sentence explanation/plug.
Note that usage reporting is disabled by default, and has to be enabled manually. The Android app alone has around 30.000 active installs according to Google Play statistics. Usage reporting only shows about 2.000 of those. Based on this data, I would estimate that there are between 100.000 and 300.000 active installations of Syncthing.
Source: I'm the maintainer of the Syncthing Android app.
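One way to read the estimate above, as a rough sketch only: take the Android figures quoted in the comment, derive an opt-in rate, and see how many reporting devices the 100.000 to 300.000 range would imply. The assumption that the Android opt-in rate holds on every platform is mine and is not stated in the comment; the function names are made up for illustration.

```python
# Figures quoted in the comment above (written there as 30.000 and 2.000).
ANDROID_ACTIVE_INSTALLS = 30_000
ANDROID_REPORTING = 2_000


def opt_in_rate(reporting: int, installs: int) -> float:
    """Fraction of installs that have usage reporting enabled."""
    return reporting / installs


def extrapolate_installs(reporting_devices: float, rate: float) -> float:
    """Scale a count of reporting devices up to an estimated install base."""
    return reporting_devices / rate


rate = opt_in_rate(ANDROID_REPORTING, ANDROID_ACTIVE_INSTALLS)
print(f"Android opt-in rate: {rate:.1%}")

# Assumption (not from the comment): the same rate applies on all platforms.
for estimate in (100_000, 300_000):
    implied = estimate * rate
    print(f"{estimate} installs would imply roughly {implied:.0f} reporting devices")
```

If the real opt-in rate differs on desktop platforms, the extrapolation shifts accordingly, so this is only a sanity check on the order of magnitude.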
Is there any way to get additional information about why a phone and a desktop client on the same LAN can't see each other? (or take 10+ minutes to sync?)
It amazes me that Syncthing handles rapidly-changing files better for me than Dropbox. Where Dropbox has left different versions of files on different computers even after they have changed multiple times, Syncthing happily synchronizes them without issues. And the fact that Syncthing is open source and allows me to keep files on my own machines is a big plus. However, I would love to see a paid option for easy backup of selected files to a machine in the cloud. Some people want their files backed up and accessible through a web service and I wish I could recommend Syncthing to them as well.
Just a heads up: that almost certainly means SyncThing is failing to recognize some classes of conflicts and concurrency races, and is destroying some changes. Doing this right is a hard problem. I know they can be a pain to deal with, but we don't spin out conflicts for no reason!
Do you have any specific bug or design reference to the Syncthing codebase that illustrates this, or do you simply believe that it's impossible to do it faster than Dropbox does it?
FWIW, I frequently get notifications from Syncthing that it's found a conflict it can't resolve, asking me to resolve it manually.
Sorry, not super familiar with Syncthing. It's definitely not impossible to do better than Dropbox, but to some degree, as in all things, it's a function of engineering resources, telemetry, and usage.
We've likely put 25-100x more resources into solving this problem over the last 10 years, and we just have a lot more data b/c we have 100s of millions of users on lots of platforms using Dropbox with every application you can imagine. So we're able to tease out the "long tail" of weird file system and application behavior in a way that's very difficult for smaller projects. Truly durable conflict management in the face of arbitrary mutations by user applications on the filesystem ends up being a really, really hard problem to cover exhaustively. The Dropbox client handles literally hundreds of special cases.
So, yeah, I believe it is (in general) safe to assume that Dropbox is probably doing A-more-correct-thing for a complicated (and admittedly confusing) reason when it comes to sync behavior. But we're not perfect--we do still find surprises from time to time, so feel free to contact support if you see something that looks wrong!
That was my initial thought - however after rereading his post he seems to refer to changing files quickly on a single system and seeing the changes replicated to multiple additional systems. This reading suggests Dropbox fails to keep up and eventually decides some of its own yet-unfinished syncs in additional systems are conflicts.
Are you saying that you solve conflicts specifically for different types of applications (e.g., Word, Photoshop, Excel)? That's impressive. However, now I'm wondering how I get my own apps supported.
I think that Dropbox typically handles conflicts quite well, and the issues I had are more likely bugs outside the conflict resolution implementation. I was a bit brief in my comment above, so let me elaborate in case you or someone else is interested:
The issues I had didn't result in conflicted files. Rather, after making a big change (i.e. switching git branches) some files were never updated or synced. Dropbox stopped picking up changes in the folder and eventually removed new changes once restarted.
The order of events was something along the lines of:
1) Did work on computer A that caused massive file changes (i.e. moving between git branches).
2) Moved to computer B to continue work.
3) Noticed files were old or missing on B.
4) Syncing files in some other folders worked, but nothing happened in the folder with missing files.
5) Restarted Dropbox on both machines in hope that this would trigger a fresh sync.
6) Observed files being reverted to old versions or deleted on machine A.
The end result was that Dropbox threw away the changes I had made on A and left me with the original state of B. I was able to recover the changes from a backup, so it was no big deal in the end (although it left me a bit scared I could have lost those files without noticing).
I was in contact with Dropbox support about the issue and explained in detail what I had done and what happened. I was offered help to recover the files, but since I had already done so, I just told them I didn't need any more support on the issue. I thought it might be because /proc/sys/fs/inotify/max_user_watches had a low value on one machine, so I wrote back that they might want to add back the old warning about this. However, the same problem with deleted files happened again after I had verified that this value was high enough on all machines.
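For anyone wanting to run the same check, here is a minimal sketch that reads the limit from the path mentioned above and counts the directories under a synced folder. It assumes the sync client needs roughly one inotify watch per directory (inotify watches are not recursive); the function names are hypothetical, and the check only makes sense on Linux.

```python
import os
from typing import Optional

# Linux-specific path named in the comment above.
MAX_WATCHES_PATH = "/proc/sys/fs/inotify/max_user_watches"


def read_max_user_watches(path: str = MAX_WATCHES_PATH) -> Optional[int]:
    """Return the per-user inotify watch limit, or None if it cannot be read."""
    try:
        with open(path, encoding="ascii") as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def count_directories(root: str) -> int:
    """Count root plus every directory below it (symlinks are not followed)."""
    return sum(1 for _ in os.walk(root))


def watches_look_sufficient(root: str, limit: Optional[int]) -> Optional[bool]:
    """Compare the directory count with the limit; None means the limit is unknown."""
    if limit is None:
        return None
    # Assumption: about one watch per directory; other programs use watches too.
    return count_directories(root) < limit
```

Calling `watches_look_sufficient(folder, read_max_user_watches())` on the synced folder gives a quick yes/no, keeping in mind that other applications draw from the same per-user limit.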
I have also seen how a script run by a colleague managed to confuse Dropbox. The script was running a test which repeatedly created and deleted the same file before checking its correctness. Running the script in the Dropbox folder left him with some old version of this file and a failed test. Running the script in a folder outside Dropbox left him with the correct final version of the file. He was only working on one machine.
And yes, I know it's "bad" to run scripts like this or switch git branches on top of sync software, but it happens, and it is interesting to see how different software handles these cases.
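I don't have the original script, but a stress test of the kind described could look roughly like the sketch below: create and delete the same file many times, write a final version, and check that the file holds what was written last. The file name, round count, and function name are invented for illustration.

```python
import os


def churn_and_check(directory: str, rounds: int = 100) -> bool:
    """Repeatedly create and delete one file, then verify the final content."""
    path = os.path.join(directory, "churn_test.txt")  # hypothetical file name
    for i in range(rounds):
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(f"version {i}\n")
        os.remove(path)

    # Write the version that is supposed to survive.
    expected = f"version {rounds}\n"
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(expected)

    # On a plain folder this should always match; a sync client that
    # restores an older copy in between would make the check fail.
    with open(path, encoding="utf-8") as fh:
        return fh.read() == expected
```

Running it once inside the synced folder and once outside it gives a simple comparison. A check made immediately after the last write may still pass even if the sync client overwrites the file a few seconds later, so re-reading the file after a delay is worth adding.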
It should be noted that Dropbox usually handles these massive file changes well, so moving to Syncthing has for me been more about it being open source and the possibility to keep files on my own machines. I was just glad to see that Syncthing also handles heavy use cases gracefully.
Last I recall it didn't, which kind of soured me on it - in an ideal world every sync operation would be prefaced with "send a backup to one of my backup endpoints first".
Backing up your filesystem can and really should be part of an entirely different process. Let syncthing focus on syncing. Get CrashPlan, rsync, or some other process going on your filesystem for backups. Unless you are hoping for backing up every single version of your documents.
One of the common complaints about Syncthing is the lack of native UI (you use your web browser), and the difficulty in setting it up for non-technical users (copying an executable into a directory, running it directly, setting it to automatically start on login aren't something you'd ask your parents / grandparents to do).
Syncthing is sorely missing iOS support. I know they have their reasons not to implement it[0], but I feel that feature might be worth the hassle.
Photo sync on iOS without iCloud is a huge pain atm.
My backup strategy involves a NAS folder that stays up to date (i.e. also deletes photos that I delete on my phone), and which I occasionally sort into final location. After that I can free the space on my phone.
All other options (including Resilio Sync) are an add-only "sync"[1], which makes it annoying to sift through all the photos including those I already weeded out a week ago.
Syncthing is entirely made by volunteers, and so far no one has had the time or interest to work on an iOS app. If you (or someone else) want to work on an iOS app, that would definitely be appreciated.
To avoid lock-in with Apple, I tried Amazon Drive for photos. It has worked perfectly for backing up media to their cloud storage. So now I keep my photos in two places.
I actually use OneDrive and it automatically syncs my camera roll from iOS. I use both iCloud and OneDrive and they've both worked basically the same since the day I started using it.
As a heavy user of Syncthing for the Tron project ( https://reddit.com/r/TronScript ) I'm excited to see how well Syncthing is doing, and really appreciate the hard work from the devs. We spun up a Syncthing node to distribute the project after Resilio Sync (formerly "BT Sync") put a hard cap on the number of nodes that were allowed to connect.
I created a Docker syncthing relay image[0][1] to enable quick deployment on some of my servers with extra bandwidth and memory (a lot (>500) of connections quickly exhausts a cheap 512MB instance doing other things, and the OOM killer runs).
What I find most shocking is that somehow my instance (in SF via DigitalOcean, surprise) is the only relay[2] on the West coast the majority of the time. How is that? It also is one of the busiest relays by connection count showing that plenty of people in the Bay Area are selecting it likely due to latency.
What's the drive or motivation behind people running syncthing discovery & relay servers?
Do they generally then run their own clients to only use their own servers?
With Tor I'm convinced over the "speech should be free, I'm making the world a better place" type argument.
With syncthing I'm incredibly impressed that this optional (but incredibly useful) infrastructure is being run by people for free since it doesn't quite make the world a better place in the same way? It just saves a bunch of other people setting up their own relay/discovery server on a cheap linode/aws instance?
I think people just want to help others, and do something useful with their spare resources. I run a relay on a server on my home internet connection, so it doesn't cost me anything.
> I think people just want to help others, and do something useful with their spare resources. I run a relay on a server on my home internet connection, so it doesn't cost me anything.
Same. The devs contribute their time to build syncthing; the least I can do to help their (our?) cause is to spend 10-60 minutes to set up a relay and spend 5 minutes to babysit it every other month.
You don't really need anything except a fast enough internet connection. The relay page [1] has some stats about that. And there are various configuration options, see the documentation [2].
There are tools like gocryptfs or cryfs that solve that problem to a degree. cryfs even makes the directory structure and file names opaque by chunking the encrypted data into small blocks, but this quickly turns into a benchmark of your sync software (e.g. don't use it with Nextcloud; the client can't cope well or quickly with lots of small files, at least for the initial sync). Not sure if syncthing is better.
I didn't get around to testing gocryptfs, but if you don't need the chunking it looks like the better alternative.
This is the blocker for me too. Support for a zero knowledge server where even if someone gets access to the filesystem on the server, they can only see it encrypted.
Syncthing is awesome. It is more difficult to set up than something like Dropbox (though scanning QR codes is pretty easy IMO), but that has gotten exponentially better over the last year or so via the community and it seems to only be getting better.
Syncthing honestly is one of the best pieces of software I use. It manages a fairly often changing music library, a lossy copy of it, and my phone's photos really handily, all on a Raspberry Pi 1. One of my favourite programs I use daily.
Since no one is complaining: I find Syncthing too complicated to set up. You must create a folder, then get a number for a computer (which you end up writing in a pastepad or something like that because the two computers don't communicate), then go to the other computer and place the number in, then do the same for the second computer to the first.
Repeat it for 5 or 6 machines.
If any machine is switched what do you do? Start over again establishing the link with all other machines.
This is the price you have to pay for a (secure) decentralised system, but syncthing tries to help where it can. E.g. you can set a syncthing instance as an 'introducer', so you only have to establish a connection to this device to get offered connections to all other devices. You still need to authorise these introductions on the devices, but you do not need to type in the crypto verification strings.
> This is the price you have to pay for a (secure) decentralised system, but syncthing tries to help where it can.
In general, Resilio Sync's 'key per folder' model seems to work better. I can give a key to other people and they can join the swarm without extra work on either sides. Also, it has a logical extension to encrypted-only peers: they get a derived key that can be used to sync in the swarm, but cannot decrypt the data.
If there was a stable open source program using Resilio Sync's sharing model, I would switch in a heartbeat. (There is Librevault, but it still seems to be mostly in development.)
Yeah, the only problem is you can never "un-give" a key, which means you either have to rebuild your cluster under a different key (having to re-give key to everyone else apart from that one guy you want to kick out), or deal with the person having access to the data forever.
If the data is encrypted using a session key and then the session key encrypted with a master key and stored with the data (concat them together) then it can. You can't revoke access to the data they've already got, of course, but as changes/additions are made they lose the ability to decrypt the new data because you've used a new session and master key.
Indeed, with the standard folders you cannot do this. With the identity-based sharing, you can remove someone from a folder. The identity-based sharing model has some downsides (no support for encrypted read-only peers), but does support user management.
That's how I do it. My use case is a large number of nodes (>200) need read-only access to data I provide from my node. So my node is set as "introducer" and gives the host list to everyone joining the swarm.
Once you established a connection to the 'introducer', it will introduce your device to all other devices it knows and vice versa. You still need to authorise these introductions on the devices, but no typing/copying of crypto strings.
The mobile app can read a qrcode from the screen of the laptop/desktop. My setup is laptop, phone, tablet.
Phone and tablet are master for their own pictures, screenshots, downloads, contacts. The laptop is master for everything else, keepassx included. I don't share anything over Google and Dropbox.
I came to wonder if there was market demand for some kind of "Syncthing Admin" that would centralize this process for corporate users, and provide other kinds of features (commands from "Syncthing Admin" could be transmitted through Syncthing itself!).
You can make a machine an introducer so anything that is added on that machine is added to all the others on your network. It also has QR codes, so adding an Android phone is very easy. It's going to be the same with any other P2P sync client; I think it's even the same for Resilio.
I have been a happy user of syncthing. It is a brilliant piece of software: I no longer have to run around looking for a data cable, I just open my laptop, and it syncs the complete file system!
Is there something like Syncthing that works like a RAID system? I'd like to use multiple computers in different locations and use them to backup and expand my storage size.
Basically redundant but distributed storage with N node fail tolerance. Preferably interfaceable via a "/mnt/cloud" directory so I can use it from my laptop and easily write backup scripts and store data into it.
If this existed I'd have something big to work on setting up next week.
This is an interesting idea, but I don't think it could work the way you are describing it (like RAID). Remember the limitations of RAID, and then add intermittent failures and huge latency to the communication between the "drives". It would not be able to remain coherent in real time.
Now, I think you could definitely set up a system that could handle the distributed data like this and allow you to access it, but I am skeptical that you could construct such a system that would be able to read/write/update/delete files on the fly and retain N-node failure tolerance at all times. You would probably have to allow the system to replicate data slowly.
There is also the problem of network splits. When they rejoin, how do they resolve conflicts and such?
into the filesystem and another server would be reading through everything and processing it (not in real time, but as a batch system).
It's ok for things to be left over for collection at the next cycle.
Also netsplits aren't much of an issue if you're largely just doing writes and reads on separate files. It only becomes a problem for short-term and currently processed files (that are being read and written to in real time).
This is more of a long-term cold storage system and not nodes talking to each other.
TL;DR
"It's not a bug it's a feature"
Patient: "Doctor, how do I stop the pain I get from doing this"
Doctor: "You stop doing that"
You set up one introducer, where storage nodes and clients meet and exchange each other's address. After that your clients can use the storage as one big single, end-to-end encrypted, resilient space where N nodes out of M can fail without any effect on your data.
Is it possible to mount Tahoe-LAFS as a file system? If it is, then it's exactly what I'm looking for.
Edit: Also, can you say "this group of nodes counts as a single location" for failure protection? So if I have two locations where I store servers and then 10 other one-off data collection locations, can I say "I want you to treat these datacenters as one node since they are very likely to fail together if they fail"?
I read the FAQ and this is apparently an asked question! I'm not surprised because I think many people are thinking of doing the type of thing I want to do.
Here it is directly from the Q&A...
" Q12: If I had 3 locations each with 5 storage nodes, could I configure the grid to ensure a file is written to each location so that I could handle all servers at a particular location going down? "
" A: Not directly. We have a wiki page and some tickets (linked from the wiki page) about this but it's deeper than it looks and we haven't come to a conclusion on how to build it.
The current system will try to distribute the shares as widely as possible, using a different pseudo-random permutation for each file, but it is completely unaware of server properties like "location". If you have more free servers than shares, it will only put one share on any given server, but you might wind up with more shares in one location than the others.
For example, if you have 15 servers in three locations A:1/2/3/4/5, B:6/7/8/9/10, C:11/12/13/14/15, and use the default 3-of-10 encoding, your worst case is winding up with shares on 1/2/3/4/5/6/7/8/9/10, and not use location C at all. The most likely case is that you'll wind up with 3 or 4 shares in each location, but there's nothing in the system to enforce that: it's just shuffling all the servers into a ring, starting at 0, and assigning shares to servers around and around the ring until all the shares have a home.
The possible distributions of shares into locations (A, B, C) are:
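As a rough illustration of the placement described in that answer (not Tahoe-LAFS code), the sketch below permutes the 15 servers per file with a hash, hands out the 10 shares around the ring, and counts how many server subsets fall into each (A, B, C) split. The hash-based permutation, the function names, and the assumption that every 10-server subset is equally likely are mine.

```python
import hashlib
from collections import Counter
from itertools import product
from math import comb

# Layout from the FAQ example: 15 servers in three locations.
LOCATIONS = {"A": range(1, 6), "B": range(6, 11), "C": range(11, 16)}
SERVER_LOCATION = {s: loc for loc, servers in LOCATIONS.items() for s in servers}
TOTAL_SHARES = 10  # the "10" in the default 3-of-10 encoding


def place_shares(file_id, servers=tuple(SERVER_LOCATION), shares=TOTAL_SHARES):
    """Permute the servers per file, then assign shares around the ring."""
    # Assumption: a SHA-256 sort key stands in for the real permutation.
    ring = sorted(
        servers,
        key=lambda s: hashlib.sha256(f"{file_id}:{s}".encode()).digest(),
    )
    return [ring[i % len(ring)] for i in range(shares)]


def shares_per_location(placement):
    """Return the number of shares that landed in locations (A, B, C)."""
    counts = Counter(SERVER_LOCATION[s] for s in placement)
    return tuple(counts.get(loc, 0) for loc in LOCATIONS)


def distribution_counts(per_location=5, shares=TOTAL_SHARES):
    """Count server subsets for each (A, B, C) split, one share per server."""
    counts = {}
    for a, b, c in product(range(per_location + 1), repeat=3):
        if a + b + c == shares:
            counts[(a, b, c)] = (
                comb(per_location, a) * comb(per_location, b) * comb(per_location, c)
            )
    return counts


if __name__ == "__main__":
    counts = distribution_counts()
    total = sum(counts.values())
    for split, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(split, f"{n / total:.3f}")
```

Under those assumptions there are 3003 possible server subsets, and the splits with 3 or 4 shares in every location account for 1500 of them, which fits the "most likely case" in the answer. The worst case, with one location unused, covers only 3 subsets.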
It's not exactly a filesystem, but git-annex allows you to configure multiple nodes, and to define the minimum number of copies each file should have (globally or per file type). Using the daemon (assistant), the nodes will automatically copy files from other nodes until the number is reached.
That seems to be focused on throughput. I'm focused on how much data I can store in the system and whether it's redundant so long as N nodes don't die (and if they do I'd like it to attempt to reorganize the network so the system keeps working with N node failure).
I don't think so. My experience with it is non-existent, but I read up on it before and now again, and self-healing and replication were pretty high on the feature list. Anyway, what you describe is a distributed/clustered (fault-tolerant) file system. Tahoe-LAFS as mentioned by jabl is one of those, https://en.wikipedia.org/wiki/List_of_file_systems#Distribut... has a list.
Can anyone share their experiences with Syncthing compared to Resilio Sync (aka Bit Torrent Sync)?
I am using Resilio Sync since its beta version, but they have added a lot of restrictions since then and made some functions only available for paying users.
I am concerned about ease of use (setup and everyday usage) on Windows and Android.
We used Resilio Sync (aka Bittorrent Sync aka BT Sync) for a while for the Tron project ( https://reddit.com/r/TronScript ). It worked great for our use-case: distributing a large number of files to a large number of nodes who required read-only access. Anytime I changed a file on the master node it would blast out to everyone else while simultaneously preventing any other node in the swarm from propagating changes.
Unfortunately with all the restrictions they introduced, it stopped working reliably. Internally they introduced a hard-coded peer cap of something like 32, while our swarm was already over 500.
We switched to Syncthing and it's been working well. There is still one major issue and that's that anyone can mark their copy of the folder as "read-only" (formerly called "Master"), and then they will attempt to propagate changes out to the rest of the swarm. So, there's that. But strictly on a distributed file syncing level, it works great.
Do you have a reference to this 32 node cap for Resilio? I just tried doing a quick search and did not see any specific results showing this limit. Thanks.
I basically post a warning that says "be aware that someone might attempt this." There's unfortunately nothing I can do about it with the way Syncthing is currently architected. We haven't had any malicious users yet but I'm sure we will eventually.
Fallback is just to provide a static .exe on the mirrors and leave the Resilio Sync node running alongside it.
There is no concept of a single "Master". The folder types are "Send & Receive" and "Send only" and the latter will not accept changes from any other client. I still agree it's not the best solution for a network you don't control and that might have a malicious actor in it.
I used syncthing when it first came out on Windows and it was very easy to set up, so I can only assume it's improved since then, and the Android app is quite good. It has a large number of settings and you can make it only run when you're charging so you don't use all your battery.
In the meantime, I have checked Syncthing forums and I have found this topic: https://forum.syncthing.net/t/why-im-moving-back-to-btsync-r... It is exactly complaining about the ease of use on Windows and Android, saying there is no native GUI, etc. From this description it seems that Syncthing is not there yet.
There isn't a native GUI, in that it's in the browser, but I wouldn't have said that is really a problem. It allows the developers to focus on a good cross-platform backend and have a cross-platform frontend without the issues that come with having a native one.
There are a number of projects which make life easier for Windows users (tray icon, autostart, make it look and feel a bit more like a Windows application, bundle filesystem notifications, etc). Two of the big ones are SyncTrayzor (https://github.com/canton7/SyncTrayzor) and Syncthing-GTK (https://github.com/syncthing/syncthing-gtk).
I tried to switch to SyncThing BUT firewalls at various locations just tripped me up. Resilio Sync breaks through the firewalls with no problem. That's my only issue.
Has anyone been able to sync large folders with syncthing? Last time I tried when syncing 500GB you'd end up using >500MB of memory which made it unusable for home NAS setups. Hopefully it has since improved?
See the bug report at the time. It was closed without being fixed and it was 100% reproducible. Maybe it's fixed now but no one was too worried about it then and I saw similar reports on the forums for a while before I lost interest. I should either try syncthing again or finally test infinit completely. This is the last piece of the puzzle I'm missing to have a fully workable personal data management solution that allows me to bring up a completely new laptop with my full config and data with just a clean Ubuntu install and a few commands.
This goes against one of the points of Syncthing (but I'm more interested in its other features): are there any good 3rd-party hosts so I don't have the hassle of maintaining one myself?
The Android version targets Linux (due to some issues with Golang), which means it doesn't show up in the Android section. Actual usage reporting data for Android is ~2800 devices or ~11.7%.
I've run both the mobile and desktop version on my phone and it just drains the battery way too fast. I ended up just using an rsync cron job to sync my phone over ssh while everything else uses syncthing (and yes, I know this means I can't delete anything but I very rarely do anyway.)
You can set it to sync only on wifi and/or only when charging, which has always done the job for me. Depends on your use case, but things like password DBs need to be update-able from both directions, which rsync doesn't really do. On the other hand, I can rely on the fact that I don't need my home to be sync'd up unless I'm actually at home, on wifi.
I loved the concept and hated the battery consumption on Android devices. Well, it works totally differently from normal cloud storage, so there is nobody really to blame, but I cannot afford that much battery on my small 5" phone. I also did not like that dates are not correct on Android (but again this is not Syncthing's fault). Why can't we have date correction on external storage (without root) on Android, like on desktop file systems? :(
I would love to use Syncthing but Bit Torrent Sync, now known as Resilio Sync, works behind firewalls so much better. Hope to one day come back to syncthing.
Last year. I am in different Public School networks and SyncThing still got caught. I worked for hours to figure it out and just couldn't :( Did they beef it up since last year?
Well I am in about 10 different school districts and they all have different policies. I just won't do anything that won't work as a default so I can claim ignorance.
This looks pretty cool, and I'll definitely look at it as a Dropbox replacement.
On the negative side, your stats page doesn't have a link back to your main page, so you won't even know how many folk had a look based on this HN post. It's a good idea to ensure that all your pages have a link back to your main page.
Yeah that page which was linked to isn't really part of the website: it's a separate site entirely, which is a frontend to the data-collection service.
People aren't really supposed to link to it directly as a means of introducing new people to Syncthing...
Indeed, but you never know what page someone might submit to a site like HN, so it's usually a good idea to have a link back to your main site on all pages.
For example, they could link to your Forum page (which doesn't have a link to your main page). Or to your Github page (which has a link to the Forum, but not to the main page). Or your Docs page (which...well, you get the idea).
Sites that have witty and inventive 404 pages sometimes get those publicised. You really never know what someone will choose to submit.
Nicely, nicely crafted software, and getting better by each bump.
I have been running it like forever between various Android devices, day to day PC, backup server, and whichever relevant devices have been passing through the household. Just works, can't recommend it enough. The GTK gui is a handy extra.
Hopefully some day an implementation in something other than Go will emerge. The current Go implementation is mind-numbingly slow and uses huge amounts of memory.
IIRC it was due to the new version having a protocol incompatibility with the old version, which means that users needed to upgrade SyncThing on all of their devices simultaneously to keep using it. I can certainly see how that could be a hassle especially when SyncThing is installed via a package manager or in the case of the Android app, where different sources provide different versions.
The future looks bright!
Addendum: Have a look at the community forum https://forum.syncthing.net