I have built a tool that allows Linux and macOS users to mount their Google Drive account locally as a virtual file system. The file system supports most typical operations (creating/deleting/moving/renaming files and directories, reading/writing to them).
It can also export special Drive files as OpenOffice documents and has an in-memory cache which improves the speed of navigation and file access. Changes performed on other clients (e.g. on the web or mobile interface) are usually detected shortly and applied locally as well.
I also wrote a paper[0] on it, as I am using the project for my bachelor thesis.
It is still rough around the edges and lacks some functionality, but for the moment it is good enough for my personal use. I am looking forward to hearing your comments and feedback on how to improve it.
Thank you! I have covered this question in a different comment:
> In short, GCSF tends to be faster in several cases (listing files recursively, reading large files from Drive). The caching strategy it uses also leads to very fast reads (x4-7 improvement compared to google-drive-ocamlfuse) for files that have been cached, at the cost of using more RAM.
> On the other hand, I arrived at the conclusion that google-drive-ocamlfuse provides a better overall experience, as it already has an active community behind it. My goal with GCSF is to more or less close the gap between the two projects and reach the same level of functionality.
It was required for us: one semester of “independent study” where you must select a research adviser (professor) and produce a final report. I did mine on “TorCoin” and it was an awesome experience, the only time I legitimately enjoyed school.
I'm curious, how does it compare to google's own similar product, Backup and Sync[0]?
It seems like the primary difference (other than being FUSE-based) is that, for data where Google has a collaborative interface (docs, sheets, etc.), Backup and Sync will place into the folder a "file" which is simply a link to open up the docs/sheets/slides app. Your app, on the other hand, seems to translate this data into OpenOffice format when changes are detected (i.e. .odt for docs).
At first glance this strategy seems less good for real-time collaboration and less performant, but there may be advantages to it as well.
Do you find that strategy to be practical in most cases? Are there other features that distinguish this from Backup and Sync?
EDIT: Aha, looking into it more, it seems like mounting a Drive folder with this project doesn't trigger the downloading of any data, and instead lazily loads the data when the file system requests it, which would be a very significant difference. Is that right?
Thank you for your response! You are correct in saying that GCSF doesn't download any data upfront. It constructs the file tree at mount time using only file metadata and downloads the actual file content only when it encounters a `read` call. This is an advantage if you're running low on local space (the file system essentially adds 15 GB of "free" additional storage).
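As a rough sketch of the lazy-loading idea (not GCSF's actual code, which is written in Rust), the snippet below builds a tree from metadata only and downloads content on the first `read`. The names `LazyTree`, `Node` and `fake_fetch` are hypothetical, and `fake_fetch` merely stands in for a real Drive download.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One entry in the file tree; holds metadata only until first read."""
    name: str
    size: int
    content: Optional[bytes] = None  # filled in lazily


class LazyTree:
    def __init__(self, metadata: Dict[str, int], fetch: Callable[[str], bytes]):
        # "Mount time": build the tree from metadata alone, no downloads.
        self.nodes = {name: Node(name, size) for name, size in metadata.items()}
        self.fetch = fetch  # hypothetical downloader standing in for the Drive API

    def listdir(self) -> List[str]:
        # Listing needs metadata only, so it never triggers a download.
        return sorted(self.nodes)

    def read(self, name: str) -> bytes:
        node = self.nodes[name]
        if node.content is None:
            # First read: download now and keep the bytes in memory.
            node.content = self.fetch(name)
        return node.content


downloads = []  # records every simulated download


def fake_fetch(name: str) -> bytes:
    downloads.append(name)
    return f"contents of {name}".encode()


tree = LazyTree({"a.txt": 17, "b.txt": 17}, fake_fetch)
tree.listdir()       # metadata only, nothing downloaded
tree.read("a.txt")   # triggers one download
tree.read("a.txt")   # served from memory
```

After these calls, `downloads` contains a single entry for `a.txt`, and `b.txt` has never been fetched.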
Real time collaboration is indeed a shortcoming. I would still use the online interface of docs/sheets/slides for this purpose.
I haven't personally used Backup and Sync, as there is no Linux version of it. From what I gather, it seems that it uploads local files to a new category on Drive instead of the 'My Drive' directory. This can be useful for automatic backup. You simply set it up once and forget about it.
However, GCSF might be a better choice for the additional control it provides. Whereas with Backup and Sync you have to inspect a file manually in order to check whether it was synced or not, GCSF ensures that a pending write operation will only return once the file transfer is effectively complete. For instance, when copying a file to Drive, the execution of the command will take as long as the upload process itself. Once finished, an exit status of 0 will indicate precisely that the upload was successful and the file is certainly on Drive.
I imagine this sort of strategy is a better fit for use cases which require high confidence and predictable behavior.
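To illustrate how a script could rely on that exit status, here is a minimal sketch. It assumes a POSIX `cp` is available; `copy_and_confirm` is a hypothetical helper, and the demo copies between ordinary temporary files rather than onto a real GCSF mount.

```python
import subprocess
import tempfile
from pathlib import Path


def copy_and_confirm(src: Path, dst: Path) -> bool:
    """Run `cp` and report success only if it exited with status 0."""
    # subprocess.run blocks until cp exits. On a mount with the blocking
    # write behaviour described above, that would be after the upload ends.
    result = subprocess.run(
        ["cp", str(src), str(dst)],
        stderr=subprocess.DEVNULL,  # keep the demo output quiet on failure
    )
    return result.returncode == 0


# Demo against an ordinary temporary directory (no Drive mount involved).
workdir = Path(tempfile.mkdtemp())
source = workdir / "report.txt"
source.write_text("quarterly numbers\n")

ok = copy_and_confirm(source, workdir / "copy.txt")
failed = copy_and_confirm(workdir / "missing.txt", workdir / "copy2.txt")
```

Here `ok` is true and `failed` is false, since the second source file does not exist. With the destination pointing into the mounted directory, the same check would tell a backup script whether to continue or retry.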
> For instance, when copying a file to Drive, the execution of the command will take as long as the upload process itself. Once finished, an exit status of 0 will indicate precisely that the upload was successful and the file is certainly on Drive.
This is neat! Can I use this for multiple accounts? I've been looking for a MacOS and Linux solution for this exact problem but I'd like to use it for 3-4 Google accounts.
Another question related to implementation. How easy is it to use a language like Rust for some web connection stuff like what is being used here? I've never used the language, but I've always been interested in it.
> This is neat! Can I use this for multiple accounts? I've been looking for a MacOS and Linux solution for this exact problem but I'd like to use it for 3-4 Google accounts.
The current release allows you to mount a single account for each local user. For the moment, you could work around this limitation by creating another user on your machine and running a separate instance of GCSF as the new user. I created an issue [0] and will work on adding support for multiple accounts in a future release.
> How easy is it to use a language like Rust for some web connection stuff like what is being used here? I've never used the language, but I've always been interested in it.
You can make HTTP requests relatively easily with the help of hyper [1]. In the case of this project, I was lucky to have some useful libraries readily available: yup-oauth2 [2], google-drive3 [3]. I would place Rust somewhere below Python in terms of existing tools and support (for instance, Google doesn't provide any official client libraries for Rust, but it does for Python), but making relatively simple applications is completely achievable (and fun as well) in Rust.
In addition to FUSE, which is a wrapper that must then be given calls just like this, there are, I suspect, tens if not hundreds of non-FUSE implementations - the Drive API is practically written for people to do this.
I say this not out of malice, I've done it myself before when integrating an application written in Python which was expecting a filesystem.
Props for writing it in rust though - I've been looking for an application in rust to cut my teeth on, perhaps I'll use yours as an intro. :)
Guessing you're basically building a FUSE filesystem with a Google Drive backend? I know KeyBase.IO did something similar across platforms. Not sure if that's how you did it but it might interest you to check out FUSE too.
I’m not sure if it’s what you’re looking for, but Backup and Sync is a similar Drive sync utility made by google for personal accounts. The only feature that it doesn’t have is file-level sync settings (afaik B&S only has folder level sync options)
Backup and Sync lacks the coolest feature of File Stream. File Stream allows you to download files when you access them instead of keeping them all on your PC. For someone like me with lots of photos, a smallish SSD, and a fast internet connection it's very convenient.
Interesting, I use it at work and have nothing but problems with it. What sort of file count do you have on Drive? I suspect ours is beyond the testing scope of DriveFS and as such performance is hideous and it often freezes up computers for minutes at a time.
I ship a similar product that has included Google Drive integration (and many other back ends) since 2013 - https://www.expandrive.com - happy to answer any questions too!
I remember testing Expandrive out a few years ago. Great to hear you're still going strong and expanding functionality to cloud storage services too.
Haven't revisited the program since we originally tested it and found a major problem for our application, but can Expandrive now handle symlinks over sshfs correctly or does it still silently mangle them into regular files?
Very cool, I think I will try using this as my standard google drive solution on linux.
One thing I haven't found in your paper is how the software handles conflicts. Suppose I have two or more machines hooked up to the same account, all simultaneously modifying the same file in different ways. What's going to happen? I guess this would be mostly up to the server side and out of your control, but maybe you can point me to a specification on how such issues are handled?
Some unwanted behavior might occur in scenarios like the one you describe. Most probably, the change performed by one client will silently overwrite the other. If there is however a small gap between the operations, the earlier one will have a better chance of being picked up by Drive and detected by the other file system instance. In this lucky case there might be no data loss.
I would set the `sync_interval` configuration parameter to a low value to improve the chances of detecting changes as soon as they appear, but I would also try to make sure that only one client works on a certain file/directory at one time.
This case looks like a good area for future improvement. Thank you for addressing the issue!
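The overwrite scenario can be sketched with a toy simulation. Everything here is hypothetical (`FakeDrive`, `Client`, `run`) and models only the general "last upload wins" behaviour described above, not what Drive or GCSF actually do internally. The explicit `sync()` call plays the role that a short `sync_interval` would play.

```python
class FakeDrive:
    """Stands in for the remote store: keeps only the latest upload per file."""

    def __init__(self):
        self.files = {}

    def upload(self, name, data):
        self.files[name] = data  # no merge, no conflict detection


class Client:
    def __init__(self, drive):
        self.drive = drive
        self.cache = {}

    def sync(self):
        # Hypothetical periodic refresh of the local view from the remote.
        self.cache = dict(self.drive.files)

    def append(self, name, text):
        # Edit the locally cached copy, then upload the whole file.
        self.cache[name] = self.cache.get(name, "") + text
        self.drive.upload(name, self.cache[name])


def run(sync_between_writes):
    drive = FakeDrive()
    drive.upload("notes.txt", "base\n")
    a, b = Client(drive), Client(drive)
    a.sync()
    b.sync()
    a.append("notes.txt", "from A\n")
    if sync_between_writes:
        b.sync()  # B picks up A's change before editing
    b.append("notes.txt", "from B\n")
    return drive.files["notes.txt"]


lost = run(sync_between_writes=False)
kept = run(sync_between_writes=True)
```

Without the intermediate sync, `lost` contains only B's edit and A's line is silently gone. With it, `kept` contains both edits.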
Thank you! I think google-drive-ocamlfuse is an excellent product. It is clearly more mature and has more features than GCSF.
I made a comparison between the two projects in sections 4.2 and 4.3 of my thesis [0]. In short, GCSF tends to be faster in several cases (listing files recursively, reading large files from Drive). The caching strategy it uses also leads to very fast reads (x4-7 improvement compared to google-drive-ocamlfuse) for files that have been cached, at the cost of using more RAM.
On the other hand, I arrived at the conclusion that google-drive-ocamlfuse provides a better overall experience, as it already has an active community behind it. My goal with GCSF is to more or less close the gap between the two projects and reach the same level of functionality.
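The speed-versus-RAM trade-off of an in-memory content cache can be illustrated with a small byte-budgeted cache. This is only a sketch: the thread does not say how GCSF evicts entries, so least-recently-used eviction is swapped in here as an assumption, and `ContentCache` is a hypothetical name.

```python
from collections import OrderedDict


class ContentCache:
    """Keeps file contents in memory, up to a fixed number of bytes."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.entries = OrderedDict()  # name -> bytes, least recently used first

    def get(self, name):
        data = self.entries.get(name)
        if data is not None:
            self.entries.move_to_end(name)  # mark as most recently used
        return data

    def put(self, name, data):
        if name in self.entries:
            # Replacing an entry: release its old size first.
            self.used -= len(self.entries.pop(name))
        if len(data) > self.max_bytes:
            return  # too large to cache at all
        self.entries[name] = data
        self.used += len(data)
        while self.used > self.max_bytes:
            # Over budget: drop the least recently used entry.
            _, evicted = self.entries.popitem(last=False)
            self.used -= len(evicted)


cache = ContentCache(max_bytes=10)
cache.put("a", b"aaaa")
cache.put("b", b"bbbb")
cache.get("a")           # "a" becomes the most recently used entry
cache.put("c", b"cccc")  # exceeds the budget, so "b" is evicted
```

A larger `max_bytes` means more reads are served from memory, at the cost of a bigger resident footprint.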
Ditto. I'm using rclone for this same function. I'm using it for casual file access (not lots of intensive access every day all day) but I'd like to know if there is any benefit to this over rclone, since as mentioned rclone has a pretty big user base.
Fuse has been around for a while; in fact, MacFuse was an implementation that was open sourced by Google, although it is no longer workable given the advances of macOS. There is now OSXFuse, which is used by a few commercial applications including Transmit and the Storage Made Easy Cloud Service, which uses it to support multiple backends including Google Drive and Google Storage.
On Windows there are Fuse implementations but they are not as rock solid as OSXFuse on Mac. The best commercial Windows FUSE implementation I'm aware of is CallbackFS.
I'm getting an error when compiling (`cargo build`), both on stable and nightly Rust:
error: failed to run custom build command for `fuse v0.3.1`
process didn't exit successfully: `/home/bromskloss/code/gcsf/target/debug/build/fuse-46607682b28e6d4c/build-script-build` (exit code: 101)
--- stderr
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Failure { command: "\"pkg-config\" \"--libs\" \"--cflags\" \"fuse >= 2.6.0\"", output: Output { status: ExitStatus(ExitStatus(256)), stdout: "", stderr: "Package fuse was not found in the pkg-config search path.\nPerhaps you should add the directory containing `fuse.pc\'\nto the PKG_CONFIG_PATH environment variable\nNo package \'fuse\' found\n" } }', libcore/result.rs:945:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Hey, thanks for the awesome work! I always wanted something like "OneDrive on demand"[0] but for Linux. Maybe in the future you could add an option to keep some files offline. By doing that, Linux would have a real alternative to OneDrive.
[0]https://support.office.com/en-us/article/learn-about-onedriv...
Not sure what you are referring to as implemented natively in GNOME.
One crucial difference is the fact that Backup and Sync picks up local files (which exist physically on the user's machine) and uploads them to a special Drive directory in the background. GCSF does not store anything locally unless you tell it to. It simply creates a virtual directory and reports its content and file tree structure so that it matches whatever exists on Drive.
I think GP is referring to the Gnome Virtual File System (GVfs) [1], which has a number of pluggable backends, one of which is Google Drive. I've used it lightly and it seemed to "just work."
The question is not stupid at all, but the answer is :). GCSF stands for "Google Conduce Sistem de Fișiere" -- a (bad) word-by-word Romanian translation of "Google Drive File System".
Although this is fairly nice, I would recommend Syncthing over this. It has the benefit of not relying on any third party to store your data, it's all exclusively on your devices, along with some very solid security.
From my understanding, the primary purpose of this is for backup and syncing your Google Drive files between multiple devices, which is very similar in nature to Syncthing. Is there something else that I am missing?
Seconded, there are widely deployed alternatives, specifically rclone (mount) and plexdrive. The former is, for backup purposes, better used via its CLI, mostly due to how it relays IO errors up and the fact that many applications are too dumb to retry filesystem errors, or even to avoid crashing completely.
[0] https://sergiu.ml/~sergiu/thesis.pdf