llama-fs: A self-organizing file system with llama 3 (github.com/iyaja)
221 points by archb 8 months ago | 62 comments



I've written something similar in Python: a script that renames documents based on their contents using Ollama. It's very easy to write things like that, and I recommend that anyone interested in local LLMs try a fun project like it.

My first question was why this would be a file system rather than an app or a script, but I see it's actually an Electron app and Python scripts, which I think is the right approach.

I think that something that would have a UI for automatically tagging, renaming, and moving files on request but not constantly would be very handy. Also, if you could somehow steer files into binning directories ("if X, put it in bin ~/X, if Y, put the file in bin ~/Y", "if it's an invoice or deals with payment, put it in ~/Documents/Finance"), that would be cool. Finally, Windows support would be amazing.
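The binning rules described above could be written down as a small ordered rule table. This is only a minimal sketch: plain keyword matching stands in for whatever classification an LLM would do, and the `RULES` table, the `pick_bin` helper, the second rule, and the fallback folder are all hypothetical.

```python
from pathlib import Path

# Hypothetical rules: (keywords that trigger the rule, destination bin).
# The first matching rule wins; an LLM classifier could replace the keyword check.
RULES = [
    (("invoice", "payment", "receipt"), Path("~/Documents/Finance")),
    (("contract", "agreement"), Path("~/Documents/Legal")),
]
FALLBACK = Path("~/Documents/Unsorted")  # assumed default bin


def pick_bin(text: str) -> Path:
    """Return the destination directory for a document's text."""
    lowered = text.lower()
    for keywords, destination in RULES:
        if any(word in lowered for word in keywords):
            return destination
    return FALLBACK
```

Keeping the rules as data rather than code means they can be shown in a UI and edited without touching the script.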


I feel naming is rather personal, as we all think differently and organize files to mirror our way of thinking. I find it odd to have an LLM do that. How often do you use it and how does it fit within your workflow?


I sometimes scan over 300 documents at a time. They are OCR-ed, but this does not make them easy to search when they are unindexed (on cloud drives, or on disk/tape for archival purposes). So it is important to name them and tag them with relevant keywords.

Usually I just want the LLM to extract the document title and unique information (document numbers, companies mentioned on the document, people mentioned in the document, dates, and similar).

The entire workflow is this:

1. Prepare a batch of documents with Patch-T codes and load them into an automatic feed scanner.

2. Scan documents with NAPS2 and OCR, producing PDF files as they are scanned.

3. My Python script monitors the output directory, picks up batches of files at a time, and processes them sequentially. The first page of each document is loaded as text and fed into llama2 (the last one I used) through Ollama with a few-shot prompt, roughly in-context trained for my sort of documents, and the model outputs a file name.

4. The file is renamed and moved to a processed directory.

5. Optionally, I will also run a second step to generate tags.

I do not organize my every day files this way, it is specifically for digitizing and making redundant archival copies of old documents, which I collect.

This is a bespoke pipeline that fits very well for my workflow. The renaming is very accurate after I have perfected the context that I pass to Ollama, and the final files are easily searchable on slow-read media. I have not yet added any kind of computer vision because I scan so few photographs that I can label them myself. I agree with you that naming is personal, but I think that we can train the LLMs to name things the way we expect them to be named. And at some point, the human cost of labelling large amounts of data may make certain workloads simply impossible.

Interestingly, my pipeline is very slow. It runs on an old GPU and takes about 30 seconds to produce just the few tokens to name a file, and about the same for the tags. I will probably move it to an even slower pipeline on a Mac mini M1 when I have the time, just to save on electricity. Because ultimately, it finishes in a few hours and it doesn't take up any of my time. What would be a full-time labelling job for a human is now a full-time job for a machine, which makes the archival hobby feasible and cheap.
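The renaming step of a workflow like this can be sketched in a few lines. The few-shot examples, the function names, and the character limits below are assumptions for illustration, not the commenter's actual script. The model call is passed in as a plain function so the sketch runs without a model installed.

```python
import re
from typing import Callable

# Hypothetical few-shot examples; real ones would come from your own documents.
FEW_SHOT = [
    ("ACME Corp. Invoice No. 1042, dated 3 March 1998 ...",
     "1998-03-03 ACME Corp Invoice 1042"),
    ("Minutes of the board meeting, Foo Society, 12 May 1975 ...",
     "1975-05-12 Foo Society Board Minutes"),
]


def build_prompt(first_page_text: str, max_chars: int = 2000) -> str:
    """Assemble a few-shot prompt asking the model for a single file name."""
    parts = ["Suggest a short, descriptive file name for each document."]
    for text, name in FEW_SHOT:
        parts.append(f"Document:\n{text}\nFile name: {name}")
    parts.append(f"Document:\n{first_page_text[:max_chars]}\nFile name:")
    return "\n\n".join(parts)


def sanitize(name: str, max_len: int = 120) -> str:
    """Keep the first line of the reply and strip characters unsafe in file names."""
    lines = name.strip().splitlines()
    first_line = lines[0] if lines else ""
    cleaned = re.sub(r'[\\/:*?"<>|]', " ", first_line)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned[:max_len] or "untitled"


def suggest_name(first_page_text: str, generate: Callable[[str], str]) -> str:
    """`generate` is any function that sends a prompt to a model and returns its reply."""
    return sanitize(generate(build_prompt(first_page_text)))
```

With Ollama, `generate` could be a thin wrapper around its local HTTP API or its Python client. Sanitizing the reply matters because a model may add extra lines or characters that are not valid in file names.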


I’m a lawyer and do the same thing. YYMMDD_ENTITY_TOPIC is a template I use a lot when dealing with lots of documents in large cases. Regex and Keyboard Maestro were my main tools before LLMs became a thing.
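A template like YYMMDD_ENTITY_TOPIC is easy to generate and parse mechanically. A minimal sketch follows; the `slug` and `case_filename` helpers, the upper-casing, and the hyphen-for-space convention are assumptions for illustration, not the commenter's actual setup.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Upper-case a label and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-").upper()


def case_filename(when: date, entity: str, topic: str, ext: str = "pdf") -> str:
    """Build a YYMMDD_ENTITY_TOPIC name; underscores are reserved as field separators."""
    return f"{when:%y%m%d}_{slug(entity)}_{slug(topic)}.{ext}"


# Pattern for parsing such names back into their fields.
PATTERN = re.compile(
    r"^(?P<date>\d{6})_(?P<entity>[A-Z0-9-]+)_(?P<topic>[A-Z0-9-]+)\.\w+$"
)
```

Reserving the underscore for field boundaries keeps the names sortable by date and trivially splittable later.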


> automatically tagging, renaming, and moving files on request but not constantly

Key point here -- I might be old, but I like my files to stay where/how I left them.


Well, I see very good reasons for writing it as an actual file system... it allows the rest of the ecosystem to interact with it on an equal footing.


Will this actually move stuff around? I'd prefer that it mounted in another directory, giving me an organized view of my files but not actually moving them.


Exactly this. If it starts moving files around, who knows where they will end up.


No, you have to explicitly checkmark it for the change to happen. We saw the famous OpenInterpreter tweet where the agent deleted every file on someone’s computer. We wanted to avoid that.


A `dry run` mode producing a symbolic link based reorganized copy of the folder would be nice.


Hmm or just make it produce symbolic links by default, and then if you want, allow you to "commit" changes, which would actually move these files. Any downsides to this?

Could also have integrity checks that the total number of files and their attributes didn't change after the commit.

Cool project OP
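The symlink-first idea with an explicit commit step could look roughly like this. It is a sketch under stated assumptions: `build_preview` and `commit` are hypothetical names, the reorganization plan is assumed to come from elsewhere, and the integrity check only compares file counts and sizes.

```python
import os
import shutil
from pathlib import Path


def build_preview(plan: dict[Path, Path], preview_root: Path) -> None:
    """Create symlinks under preview_root that mirror the proposed layout.

    `plan` maps each existing file to its proposed path relative to the new root.
    The originals are not touched.
    """
    for source, relative_target in plan.items():
        link = preview_root / relative_target
        link.parent.mkdir(parents=True, exist_ok=True)
        os.symlink(source.resolve(), link)


def commit(preview_root: Path) -> int:
    """Replace every symlink in the preview with the real file it points to."""
    links = [p for p in preview_root.rglob("*") if p.is_symlink()]
    expected = {link: Path(os.readlink(link)).stat().st_size for link in links}
    for link in links:
        target = Path(os.readlink(link))
        link.unlink()
        shutil.move(str(target), str(link))
    # Integrity check: every planned file exists as a real file with the same size.
    for path, size in expected.items():
        if not path.is_file() or path.is_symlink() or path.stat().st_size != size:
            raise RuntimeError(f"integrity check failed for {path}")
    return len(expected)
```

One possible downside: creating symlinks on Windows typically needs elevated privileges or Developer Mode, so the preview step may not work everywhere.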


Hopefully not. LLMs hallucinate. It will move some critical files somewhere random n levels deep in some node folder you will never be able to find


It's perfect for organizing people's /tmp folder


/dev/null seems like a nice destination


Yeah, the thought of some LLM based hallucinating piece of software moving my files around gives me anxiety.


This is pretty neat. Can confirm my Downloads folder could use some help, there's usually at least one or two nested "Old Downloads" or "Sort me" folders.

I think one thing to improve the readme or landing page for this project would be a before & after for a sample ~/Downloads directory, maybe in `tree` format.


A lot of people are worried about Llama screwing up, and that's a valid concern. But this is also an Electron app + a few nontrivial Python scripts for watching changes to a filesystem, yet there are zero actual tests. Just some highly unrepresentative "sample data."

I am a grumpy AI hater. But Llama is not the security/data risk here. I don't think anyone should use this unless they are interested in contributing.


Oh come now, no need to be grumpy. We need to just accept that this is somewhere between managing your files using an algorithm that integrates a roulette wheel and a system that instead has Russian roulette built in. In either case it's going to get messy.


Apple will probably add this feature and more to the stacks feature on macOS (a multimodal model would be very useful there). Even better: I expect Apple to use ML and local models to scan file contents and have them show up in search (e.g., on spotlight or Raycast, search for the picture of my latest receipt that I saved __somewhere__ I don't remember).


They have Stacks already, which kind of does a similar thing: https://support.apple.com/en-ca/guide/mac-help/mh35846/14.0/...


A few months ago, while using search in finder, I noticed that it would return images with the search term in the image. They seem to be doing something ML already


It’s been like that for years. It does OCR of text, object names (like cat) and groups by person / pet, etc.

Most stuff like that doesn’t need an LLM, and adding one would probably decrease utility.


I built a tool with a similar goal a year ago: https://github.com/jjuliano/aifiles -- a CLI that manages files using AI. It helps me organize my files and backups.

In my case, I built a local OpenAI-emulated proxy API to run against a local LLM, and used a modified OpenAI library to connect with it. This was the solution a year ago. Now, it's easier to deploy a local LLM.


No demo/example? I'm not going to run this on the basis that the developer thinks he built something that gives me some pretty much unspecified benefit. I don't really understand the "incognito" mode either. What's the benefit of having it off? Why isn't it on by default?


I would like to see something similar for my browser tabs which are always a mess. Unsure what UX considerations are needed. Thoughts?


The high-level strategy would be to get the browser tab contents and then ask a local LLM to organize the tabs based on their full content. If you run Selenium or some developer versions of certain browsers you might be able to source the full contents directly, including any state that may not be obvious from only the URL. If the URL of the tabs is enough (for most cases it should be), then there are many options and relatively easy implementations possible. Emacs has tools to communicate with browsers (though they depend on the local OS and some are limited to only certain browsers or certain OSs), so if you are happy with controlling the tabs from Emacs, you could simply reorganize/regroup the tabs within an Emacs buffer with the help of an LLM that gets the URL and may open up connections to see what the trivially accessible contents are. I would use this and might test this idea when I am not AFK. If Emacs is not an option, perhaps find an OS or extension-dependent way to reorganize the tabs.
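For the case where the URL alone is enough, the simplest baseline is to group tabs by hostname before involving any model. The sketch below is only that baseline, with a hypothetical `group_by_host` helper; an LLM could later merge or split these groups by topic.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(urls: list[str]) -> dict[str, list[str]]:
    """Group tab URLs by hostname, ignoring a leading 'www.'."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url in urls:
        # URLs without a network location (e.g. about:blank) go into "other".
        host = urlsplit(url).hostname or "other"
        groups[host.removeprefix("www.")].append(url)
    return dict(groups)
```

Grouping by hostname will not capture a task that spans several sites, which is where content-based grouping would have to take over.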


An extension seems the most natural way to do this, but that would entail hosting a model, which isn't cheap. Will give it a go.


Can extensions connect to localhost? I.e., to a local Ollama, for example.


Yes, they can. But the ergonomics of that aren't great. Also, Ollama takes up a significant amount of memory. I'm trying this with a cheap model like phi-3 hosted on my server.


Also with the possibility to look at the current tab and open a new window populated with tabs from related pages you've visited earlier.


Arc Browser does that already, also for downloaded files. Looks neat.


I really like Arc, but I hate when I go through the file download context to choose a path and file name, then Arc renames the file. For the direct downloads, I’m all for it, but I wish it recognized that I chose a file name myself, so don’t rename.

Haven’t played with its tab organization, though.


+1 for Arc


Chrome recently rolled out something that groups browser tabs together. I thought they said they used AI. But basically a bunch of YouTube tabs get consolidated into a YouTube button tab that toggles the group to expand.


Makes sense. Wouldn't grouping based on the context of what I'm browsing be more desirable? If I'm searching for a bug fix I'll probably be doing it across multiple domains, so perhaps a tab group based on that.


Alex here, one of the team members behind the project.

We built this for the Llama3 Cerebral Valley Hackathon. The idea was this: My ~/Downloads folder is extremely messy, and I wanted an agent to fix it for me. So we built one.

LlamaFS feeds file contents and metadata to Ollama, with LlamaIndex, Moondream, and Whisper, to understand what each file is about. It renames the files according to a specified pattern and organizes similar files into directories. Also, LlamaFS doesn’t overwrite anything until you explicitly ask it to (no risk of deleting anything precious).

One cool thing we benchmarked with AgentOps was the speed. It goes pretty fast, ~500ms per file.


“In batch mode, you

In watch mode, LlamaFS starts a daemon that watches your directory. It intercepts all filesystem operations, updates i and uses your most recent edits in context to proactively learn and how, so you don't learns predict how you rename file. e.g. if you create a folder for 2023 tax documents, and start moving 1-3 file in it, LlamaFS will automatically creates, and move the right!”

Come again? Was AI used to generate the documentation?


I think I would like a tool that intelligently suggested renames but doesn't automatically do them.

Arc Browser has an AI rename feature (for downloaded files). I tried it out but I had to turn it off. I love Arc Browser BTW and their AI hover summary is useful. I found poor naming of files to be disruptive, and it's a lot better if I am more involved in renaming the file, since that will help me remember it.


That's the approach I took on a similar toy project[IT] I've been working on the past week (images instead of text.) It first creates a `metadata.csv` file with suggested clean filenames and a Boolean flag indicating if it thinks it needs to be changed at all. You can manually view and edit the `metadata.csv` file and only once you're happy with it do you pull the trigger by running `autorename()`. I definitely feel like you need a human in the loop for this kind of thing.
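A review-then-apply flow of this kind can be sketched with the standard `csv` module. The column names and the `write_suggestions` and `apply_renames` helpers below are hypothetical and are not the linked project's actual API.

```python
import csv
from pathlib import Path

FIELDS = ["original", "suggested", "rename"]  # assumed column names


def write_suggestions(rows: list[dict], csv_path: Path) -> None:
    """Write suggested names to a CSV that a human can review and edit."""
    with csv_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def apply_renames(csv_path: Path, directory: Path) -> list[str]:
    """Rename only the rows still flagged after review; never overwrite a file."""
    renamed = []
    with csv_path.open(newline="") as handle:
        for row in csv.DictReader(handle):
            if row["rename"].strip().lower() != "true":
                continue
            source = directory / row["original"]
            target = directory / row["suggested"]
            if source.exists() and not target.exists():
                source.rename(target)
                renamed.append(row["suggested"])
    return renamed
```

Because nothing is renamed until the second function runs, the CSV doubles as a record of what was proposed and what was accepted.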

[IT]: https://github.com/olooney/image_tagger


Can I run this in a sandbox or dry ... Not that I'm not trusting my AI Overlord ;)


Working on something related, github.com/idncsk/canvas, canvas-server, canvas-ui-browser, always use the dev or for browser the tarbor branch, waiting on my burger in Budapest sry for the lack of details


That's an interesting project and concept.


Interesting. Literally just this week I planned some time to experiment with LLMs as a full FS. It doesn't seem truly fair to name this an "FS". But cool approach. Though this has been done in bash scripts online before. So I only dislike naming it FS. It's not an FS. Just like your productivity app is not an OS. Will check it out regardless. Congratulations anyway!


I created something very similar, and much simpler, to this in Go not too long ago (without Llama) called Switchboard.

It's not "self organising" in this sense, but it's an easy way (imo) to organise files across your desktop.

https://github.com/Cian911/switchboard/


Neat, I built a tool to quickly do this manually via key bindings[0].

As a data archivist, I would definitely recommend a setting to turn off rename of files as that can often be a database id, timestamp, etc.

[0] https://github.com/VisualFileSorter/VisualFileSorter


This is something where I can see infinite context windows really working. My local llm knowing where tens of thousands of files located on my computer are, what they do, which ones can be moved, which ones can't etc, how to organize them. Just cleaning up my absolutely messy space.


This is the type of thing I plan to implement. If there's one way to save things. Of course, it's definitely going to hide some files eventually, so hopefully you've got a log file going documenting the modifications.


Sending all your personal files to an external provider seems like a recipe for a privacy disaster. Luckily, they provide an option to use your local Llama instance.


This would be cool if it weren’t so heavy-handed—a similar application to add tags to files or some metadata that could be indexed by Alfred or Rofi would be cool.


This would be super cool with Obsidian (just Markdown files). Dump notes and let it organize itself. Who is up for a hacking session?


Finally hope for our digital mountains. In a few years you'll point the AI at it and it will organize it for you.


I think that would be a mental degradation. Organising the data is at least as important as knowing what the data is. If you don't have a mental map of what you have, you will not know what you can search. I don't really mean file contents or dates here, that sort of auto-organisation is long possible already.


I'm more excited for the day we can have a robot do that for the analog stuff instead. I'd rather sit and enjoy organizing files on my computer than loading the washing machine, the dishwasher, folding clothes, packing for vacations... well, you get the point.

Would be cool if I can build CI pipelines for daily stuff, just describe everything in YAML* and not have to do repetitive tasks all the time.

* or, hopefully something better that isn't a pain to write


In a few years you'll point the AI at your life and it will live it for you.



Great! That will free me up to view advertisements 24/7! Win-win!



I wonder how well this will pair with a search first UX since finding stuff may be hard otherwise.


So, this terrifies me, in that it is going to rename files and move them. I have an inordinate amount of fear about that happening in the hands of an LLM. But, I love the idea of a virtual filesystem / directory that lets me see things based on LLM-led naming. Just, leave my main files alone, oh please god, don't touch them.

With virtual systems, I think it could be really interesting, you could have a few different types, from conceptual, to project, to research area. That would be amazingly cool.


When I write programs like this, I have my program write out shell commands for moving/deleting/renaming files to stdout. Then I review the changes and, if I like them, I pipe the output to `sh` or `bash`. Assuming that the file renames are idempotent, this works out really well.
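A program following this pattern only has to print correctly quoted commands. Here is a minimal sketch, assuming the rename plan is a simple source-to-destination mapping; the `mv_commands` helper and the sample plan are hypothetical.

```python
import shlex
import sys


def mv_commands(renames: dict[str, str]) -> list[str]:
    """Turn a {source: destination} plan into safely quoted `mv` commands."""
    return [
        # -n refuses to overwrite an existing destination; -- ends option parsing.
        f"mv -n -- {shlex.quote(src)} {shlex.quote(dst)}"
        for src, dst in renames.items()
        if src != dst
    ]


if __name__ == "__main__":
    plan = {"scan 001.pdf": "2024-05-27 invoice.pdf"}  # hypothetical plan
    sys.stdout.write("\n".join(mv_commands(plan)) + "\n")
```

Saved as, say, `plan.py`, the output can be read first with `python plan.py | less` and then applied with `python plan.py | sh`. Quoting with `shlex.quote` matters here because file names with spaces or quotes would otherwise be split or misread by the shell.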


> I love the idea of a virtual filesystem / directory that lets me see things based on LLM-led naming. Just, leave my main files alone

I like this idea.


glue + tomatoes = recipe directory



