I've written something similar in Python: a script that renames documents based on their contents using Ollama. It's very easy to write things like that, and I recommend that anyone interested in local LLMs try a fun project like it.
My first question was why would this be a file system rather than an app or a script, but I see it's actually an Electron app and python scripts, which I think is the right approach.
I think that something that would have a UI for automatically tagging, renaming, and moving files on request but not constantly would be very handy. Also, if you could somehow steer files into binning directories ("if X, put it in bin ~/X, if Y, put the file in bin ~/Y", "if it's an invoice or deals with payment, put it in ~/Documents/Finance"), that would be cool. Finally, Windows support would be amazing.
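A rough sketch of how such binning rules could be wired up, assuming the model is reached through some `ask(prompt) -> str` callable (hypothetical, for example a thin wrapper around a local Ollama call). The labels, descriptions, and directories in `RULES` are made up for illustration; this is not how LlamaFS does it.

```python
from pathlib import Path
from typing import Callable, Optional

# Hypothetical rules: label -> (plain-language condition, target bin).
RULES = {
    "finance": ("an invoice or anything that deals with payment", Path("~/Documents/Finance")),
    "manuals": ("a product manual or datasheet", Path("~/Documents/Manuals")),
}

def build_prompt(text: str, rules=RULES) -> str:
    """Turn the rules into a single classification prompt."""
    options = "\n".join(f"- {label}: {desc}" for label, (desc, _) in rules.items())
    return (
        "Pick the one label that fits the document, or answer 'none'.\n"
        f"{options}\n\nDocument:\n{text}\n\nLabel:"
    )

def pick_bin(text: str, ask: Callable[[str], str], rules=RULES) -> Optional[Path]:
    """Return the bin directory the model chose, or None if no rule matched."""
    answer = ask(build_prompt(text, rules)).strip().lower()
    for label, (_, directory) in rules.items():
        if answer.startswith(label):
            return directory.expanduser()
    return None  # leave the file where it is
```

Returning `None` for anything the model does not clearly label keeps unmatched files in place instead of guessing.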
I feel naming this is rather personal as we all think differently, and organize files mirroring our way of thinking. I find it odd to have an LLM do that. How often do you use it and how does it fit within your workflow?
I sometimes scan over 300 documents at a time. They are OCR-ed, but this does not make them easy to search when they are unindexed (on cloud drives, or on disk/tape for archival purposes). So it is important to name them and tag them with relevant keywords.
Usually I just want the LLM to extract the document title and unique information (document numbers, companies mentioned on the document, people mentioned in the document, dates, and similar).
The entire workflow is this:
1. Prepare a batch of documents with Patch-T codes and load them into an automatic feed scanner.
2. Scan documents with NAPS2 and OCR, producing PDF files as they are scanned.
3. My Python script monitors the output directory, picks up batches of files at a time, and processes them sequentially. The first page of each document is loaded as text and fed into llama2 (the last model I used) through Ollama with a few-shot prompt, roughly in-context trained for my sort of documents, and the model outputs a file name.
4. The file is renamed and moved to a processed directory.
5. Optionally, I will also run a second step to generate tags.
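For anyone who wants to try steps 3 and 4, here is a minimal sketch, not the actual script described above. It assumes a local Ollama server on its default port (`http://localhost:11434`) and that the first page's text has already been extracted. The few-shot example, `safe_name`, and `rename_document` are hypothetical names made up for illustration.

```python
import json
import re
import shutil
import urllib.request
from pathlib import Path
from typing import Callable

# Hypothetical few-shot prompt; real examples would come from your own documents.
FEW_SHOT = """Extract a short file name from the first page of a document.

Text: INVOICE No. 4471 Acme Corp. Date: 2021-03-05
Name: 2021-03-05 Acme Corp Invoice 4471

Text: {text}
Name:"""

def ask_ollama(prompt: str, model: str = "llama2") -> str:
    """Send one prompt to a local Ollama server and return the completion."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

def safe_name(raw: str, max_len: int = 80) -> str:
    """Keep the first line of the model output and drop characters unsafe in file names."""
    lines = raw.strip().splitlines()
    first = lines[0] if lines else ""
    cleaned = re.sub(r"[^\w\- .]", "", first).strip(" .")
    return cleaned[:max_len] or "untitled"

def rename_document(pdf: Path, first_page_text: str, processed: Path,
                    ask: Callable[[str], str] = ask_ollama) -> Path:
    """Name the file from its first page and move it into the processed directory."""
    name = safe_name(ask(FEW_SHOT.format(text=first_page_text)))
    processed.mkdir(parents=True, exist_ok=True)
    target = processed / f"{name}{pdf.suffix}"
    n = 1
    while target.exists():  # never overwrite an earlier document
        target = processed / f"{name} ({n}){pdf.suffix}"
        n += 1
    shutil.move(str(pdf), str(target))
    return target
```

Passing the model call in as `ask` makes it easy to swap models or to dry-run the renaming with a stub before pointing it at real scans.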
I do not organize my everyday files this way. It is specifically for digitizing and making redundant archival copies of old documents, which I collect.
This is a bespoke pipeline that fits very well for my workflow. The renaming is very accurate after I have perfected the context that I pass to Ollama, and the final files are easily searchable on slow-read media. I have not yet added any kind of computer vision because I scan so few photographs that I can label them myself. I agree with you that naming is personal, but I think that we can train the LLMs to name things the way we expect them to be named. And at some point, the human cost of labelling large amounts of data may make certain workloads simply impossible.
Interestingly, my pipeline is very slow. It runs on an old GPU and takes about 30 seconds to produce just the few tokens to name a file, and about the same for the tags. I will probably move it to an even slower pipeline on a Mac mini M1 when I have the time, just to save on electricity. Because ultimately, it finishes in a few hours and it doesn't take up any of my time. What would be a full-time labelling job for a human is now a full-time job for a machine, which makes the archival hobby feasible and cheap.
I’m a lawyer and do the same thing. YYMMDD_ENTITY_TOPIC is a template I use a lot when dealing with lots of documents in large cases. Regex and Keyboard Maestro were my main tools before LLMs became a thing.
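A template like YYMMDD_ENTITY_TOPIC is easy to enforce in code once the date, entity, and topic have been extracted. A small sketch, where `case_filename` and its slug rule (runs of non-alphanumerics become a single hyphen) are assumptions for illustration:

```python
import re
from datetime import date

def case_filename(day: date, entity: str, topic: str, ext: str = ".pdf") -> str:
    """Build a YYMMDD_ENTITY_TOPIC file name from already-extracted fields."""
    def slug(s: str) -> str:
        # Collapse anything that is not a letter or digit into single hyphens.
        return re.sub(r"[^A-Za-z0-9]+", "-", s).strip("-")
    return f"{day:%y%m%d}_{slug(entity)}_{slug(topic)}{ext}"

print(case_filename(date(2024, 5, 26), "Acme Corp.", "Supply agreement"))
# prints 240526_Acme-Corp_Supply-agreement.pdf
```

Keeping underscores as the only field separator means the name can later be split back into its three parts.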
Will this actually move stuff around? I'd prefer that it mounted in another directory, giving me an organized view of my files but not actually moving them.
No, you have to explicitly tick a checkbox for the change to happen. We saw the famous OpenInterpreter tweet where the agent deleted every file on someone’s computer. We wanted to avoid that.
Hmm or just make it produce symbolic links by default, and then if you want, allow you to "commit" changes, which would actually move these files. Any downsides to this?
Could also have integrity checks that total number of files and their attributes didn't change after the commit
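The symlink preview, commit, and integrity check could be sketched roughly as below. This is an illustration of the idea, not LlamaFS behaviour: `plan` is a hypothetical mapping from each existing file to its proposed relative path, and "attributes" is approximated by size and modification time. One downside worth noting is that creating symlinks on Windows may need extra privileges.

```python
import shutil
from collections import Counter
from pathlib import Path

def fingerprint(root: Path) -> Counter:
    """Count regular files by (size, mtime), ignoring names and symlinks."""
    return Counter(
        (p.stat().st_size, p.stat().st_mtime_ns)
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    )

def preview(plan: dict, view: Path) -> None:
    """Build a symlink-only view of the proposed layout; real files stay put."""
    for src, rel in plan.items():
        link = view / rel
        link.parent.mkdir(parents=True, exist_ok=True)
        link.symlink_to(Path(src).resolve())

def commit(plan: dict, root: Path) -> None:
    """Actually move the files, then verify nothing was lost or altered."""
    clashes = [rel for rel in plan.values() if (root / rel).exists()]
    if clashes:
        raise FileExistsError(f"would overwrite: {clashes}")
    before = fingerprint(root)
    for src, rel in plan.items():
        dest = root / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
    if fingerprint(root) != before:
        raise RuntimeError("file count or attributes changed during commit")
```

After a commit the old symlink view points at paths that no longer exist, so it should be deleted or rebuilt.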
This is pretty neat. Can confirm my Downloads folder could use some help, there's usually at least one or two nested "Old Downloads" or "Sort me" folders.
I think one thing to improve the readme or landing page for this project would be a before & after for a sample ~/Downloads directory, maybe in `tree` format.
A lot of people are worried about Llama screwing up, and that's a valid concern. But this is also an Electron app + a few nontrivial Python scripts for watching changes to a filesystem, yet there are zero actual tests. Just some highly unrepresentative "sample data."
I am a grumpy AI hater. But Llama is not the security/data risk here. I don't think anyone should use this unless they are interested in contributing.
Oh come now, no need to be grumpy. We need to just accept that this is somewhere between managing your files using an algorithm that integrates a roulette wheel and a system that instead has Russian roulette built in. In either case it's going to get messy.
Apple will probably add this feature and more to the Stacks feature on macOS (a multimodal model would be very useful there). Even better: I expect Apple to use ML and local models to scan file contents and have them show up in search (e.g., on Spotlight or Raycast, search for the picture of my latest receipt that I saved __somewhere__ I don't remember).
A few months ago, while using search in Finder, I noticed that it would return images with the search term in the image. They seem to be doing something with ML already.
I built a tool with a similar goal a year ago: https://github.com/jjuliano/aifiles -- A CLI that manages files using AI. It helps me to organize my files and backups.
In my case, I built a local OpenAI-emulated proxy API to run against a local LLM, and used a modified OpenAI library to connect with it. This was the solution a year ago. Now, it's easier to deploy a local LLM.
No demo/example? I'm not going to run this on the basis that the developer thinks he built something that gives me some pretty much unspecified benefit. I don't really understand the "incognito" mode either. What's the benefit of having it off? Why isn't it on by default?
The high-level strategy would be to get the browser tab contents and then ask a local LLM to organize the tabs based on their full content. If you run Selenium or some developer versions of certain browsers you might be able to source the full contents directly, including any state that may not be obvious from only the URL. If the URL of the tabs is enough (for most cases it should be), then there are many options and relatively easy implementations possible. Emacs has tools to communicate with browsers (though they depend on the local OS and some are limited to only certain browsers or certain OSs), so if you are happy with controlling the tabs from Emacs, you could simply reorganize/regroup the tabs within an Emacs buffer with the help of an LLM that gets the URL and may open up connections to see what the trivially accessible contents are. I would use this and might test this idea when I am not AFK. If Emacs is not an option, perhaps find an OS or extension-dependent way to reorganize the tabs.
Yes they can. But the ergonomics of that aren't great. Also Ollama takes up a significant amount of memory. I'm trying this with a cheap model like phi-3 hosted on my server.
I really like Arc, but I hate when I go through the file download context to choose a path and file name, then Arc renames the file. For the direct downloads, I’m all for it, but I wish it recognized that I chose a file name myself, so don’t rename.
Chrome recently rolled out something that groups browser tabs together. I thought they said they used AI. But basically a bunch of YouTube tabs get consolidated into a YouTube button tab that toggles the group to expand.
Makes sense. Wouldn't grouping based on the context of what i'm browsing be more desirable? If i'm searching a bug fix i'll probably be doing it across multiple domains, perhaps a tab group based on that.
Alex here, one of the team members behind the project.
We built this for the Llama3 Cerebral Valley Hackathon. The idea was this: My ~/Downloads folder is extremely messy, and I wanted an agent to fix it for me. So we built one.
LlamaFS passes file contents and metadata to Ollama, together with LlamaIndex, Moondream, and Whisper, to understand what each file is about. It renames the files according to a specified pattern and organizes similar files into directories. Also, LlamaFS doesn’t overwrite anything until you explicitly ask it to (no risk of deleting anything precious).
One cool thing we benchmarked with AgentOps was the speed. It goes pretty fast, ~500ms per file.
In watch mode, LlamaFS starts a daemon that watches your directory. It intercepts all filesystem operations, updates i and uses your most recent edits in context to proactively learn and how, so you don't learns predict how you rename file. e.g. if you create a folder for 2023 tax documents, and start moving 1-3 file in it, LlamaFS will automatically creates, and move the right!”
Come again? Was AI used to generate the documentation?
I think I would like a tool that intelligently suggested renames but doesn't automatically do them.
Arc Browser has an AI rename feature (for downloaded files). I tried it out but had to turn it off. I love Arc Browser, by the way, and their AI hover summary is useful. I found the poorly chosen file names disruptive, and it's a lot better if I am more involved in renaming the file, since that helps me remember it.
That's the approach I took on a similar toy project[IT] I've been working on the past week (images instead of text.) It first creates a `metadata.csv` file with suggested clean filenames and a Boolean flag indicating if it thinks it needs to be changed at all. You can manually view and edit the `metadata.csv` file and only once you're happy with it do you pull the trigger by running `autorename()`. I definitely feel like you need a human in the loop for this kind of thing.
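A rough sketch of that suggest-then-confirm flow, for text or images alike. This is not the code from the project mentioned above: the column names, `write_suggestions`, and `apply_renames` are hypothetical, and `suggest` stands in for whatever model call produces a clean file name.

```python
import csv
from pathlib import Path
from typing import Callable

FIELDS = ["original", "suggested", "needs_change"]  # hypothetical column names

def write_suggestions(folder: Path, suggest: Callable[[Path], str], csv_path: Path) -> None:
    """Write one row per file with the suggested name; nothing is renamed yet."""
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for p in sorted(folder.iterdir()):
            if p.is_file() and p != csv_path:
                new = suggest(p)
                writer.writerow({"original": p.name, "suggested": new,
                                 "needs_change": str(new != p.name)})

def apply_renames(folder: Path, csv_path: Path) -> list:
    """Apply only the rows still flagged True after the human has edited the CSV."""
    renamed = []
    with csv_path.open(newline="") as f:
        for row in csv.DictReader(f):
            if row["needs_change"].strip().lower() != "true":
                continue
            src, dest = folder / row["original"], folder / row["suggested"]
            if src.exists() and not dest.exists():  # never clobber an existing file
                src.rename(dest)
                renamed.append((row["original"], row["suggested"]))
    return renamed
```

The CSV is the human-in-the-loop step: edit a suggestion or flip its flag to False in any spreadsheet or text editor, and only then run `apply_renames`.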
Working on something related, github.com/idncsk/canvas, canvas-server, canvas-ui-browser, always use the dev or for browser the tarbor branch, waiting on my burger in Budapest sry for the lack of details
Interesting. Literally just this week i planned some time in to experiment with LLMs as a full FS. This seems not truly fair to be named "FS". But cool approach. Though this has been done in bash scripts online before. So i only dislike naming it FS. It's not an FS. Just like your productivity app is not an OS.
Will check it out regardless. Congratulations anyway!
This is something where I can see infinite context windows really working. My local LLM would know where the tens of thousands of files on my computer are, what they do, which ones can be moved and which can't, and how to organize them. Just cleaning up my absolutely messy space.
This is the type of thing I plan to implement. If there's one way to save things. Of course, it's definitely going to hide some files eventually, so hopefully you got a log file going documenting the modifications.
Sending all your personal files to an external provider seems like a recipe for a privacy disaster. Luckily, they provide an option to use your local Llama instance.
This would be cool if it weren’t so heavy-handed—a similar application to add tags to files or some metadata that could be indexed by Alfred or Rofi would be cool.
I think that would be a mental degradation. Organising the data is at least as important as knowing what the data is. If you don't have a mental map of what you have, you will not know what you can search. I don't really mean file contents or dates here, that sort of auto-organisation is long possible already.
I'm more excited for the day we can have a robot do that for the analog stuff instead. I'd rather sit and enjoy organizing files on my computer than loading the washing machine, the dishwasher, folding clothes, packing for vacations... well, you get the point.
Would be cool if I can build CI pipelines for daily stuff, just describe everything in YAML* and not have to do repetitive tasks all the time.
* or, hopefully something better that isn't a pain to write
So, this terrifies me, in that it is going to rename files and move them. I have an inordinate amount of fear about that happening in the hands of an LLM. But, I love the idea of a virtual filesystem / directory that lets me see things based on LLM-led naming. Just, leave my main files alone, oh please god, don't touch them.
With virtual systems, I think it could be really interesting, you could have a few different types, from conceptual, to project, to research area. That would be amazingly cool.
When I write programs like this, I have my program write out shell commands for moving/deleting/renaming files to stdout. Then I review the changes and if I like them, I pipe to `sh` or `bash`. Assuming that the file renames are idempotent, this works out really well.
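A minimal sketch of that pattern in Python, assuming the program has already produced a `plan` mapping old paths to new ones (the mapping and `mv_commands` are hypothetical names). `shlex.quote` keeps paths with spaces or odd characters safe for the shell, and `mv -n` refuses to overwrite an existing target.

```python
import shlex
from pathlib import Path

def mv_commands(plan: dict) -> list:
    """Turn {old_path: new_path} into shell lines; nothing is executed here."""
    lines = []
    for src, dest in plan.items():
        parent = shlex.quote(str(Path(dest).parent))
        lines.append(f"mkdir -p -- {parent}")
        lines.append(f"mv -n -- {shlex.quote(str(src))} {shlex.quote(str(dest))}")
    return lines

if __name__ == "__main__":
    plan = {"scan 001.pdf": "Finance/2023 tax return.pdf"}
    print("\n".join(mv_commands(plan)))
```

Saved as, say, `plan.py`, the output can be read with `python plan.py | less` and then applied with `python plan.py | sh`.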