You could do this with Plandex (or Aider... or ChatGPT) by having it output a shell script then `chmod +x` it and run it. I experimented early on with doing script execution like this in Plandex, but decided to just focus on writing and updating files, as it seemed questionable whether execution could be made reliable enough to be worthwhile without significant model advances. That said, I'd like to revisit it eventually, and some more constrained tasks like copying and moving files around are likely doable without full-on shell script execution, though some scary failure cases are possible here if the model gets the paths wrong in a really bad way.
I feel like what I am saying should be natively supported.
If you're worried about the model getting changes wrong, just show a prompt with all the batched changes.
me > build my jar, move it to the last folder I copied it to, and run it.
LLM >
built jar xyz.jar
moving jar to x/y/z
me > yes.
me > redo last command.
Provide rollback/log for these features if need be.
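The rollback/log part of this doesn't need to be heavy. A minimal sketch in Python (the class name and structure are my own invention, just to show the shape of the idea):

```python
import shutil

class UndoLog:
    """Record each file move so the last batch of changes can be rolled back."""

    def __init__(self):
        self.ops = []  # stack of (undo_function, description) pairs

    def move(self, src, dst):
        shutil.move(src, dst)
        # Capture src/dst in default args so the move can be reversed later.
        self.ops.append((lambda s=src, d=dst: shutil.move(d, s),
                         f"mv {src} {dst}"))

    def rollback(self):
        """Undo every recorded operation, most recent first."""
        while self.ops:
            undo, _desc = self.ops.pop()
            undo()
```

The confirmation prompt would just print the descriptions before anything runs; "redo last command" is the same log replayed forward.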
I really don't think you even need an LLM for this. I feel like I could do it with a simple classifier. It just needs to be hooked into the OS so that it can scan what you were doing and replicate it.
For example, if I keep opening up folder x and dropping a file called build.jar into folder y, a program should be able to easily understand "copy the new jar over".
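Once the watcher has learned the source and destination paths, the "copy the new jar over" step itself is tiny. A minimal sketch, assuming the paths were learned elsewhere (the function name is hypothetical):

```python
import os
import shutil

def copy_if_newer(src, dst):
    """Replay the user's habitual copy: copy src over dst only when src is newer."""
    if not os.path.exists(src):
        return False
    if os.path.exists(dst) and os.path.getmtime(dst) >= os.path.getmtime(src):
        return False  # destination is already up to date
    shutil.copy2(src, dst)  # copy2 preserves the source's timestamp
    return True
```

Run it on every rebuild (or on a filesystem watch event) and it stays idempotent: nothing happens unless there's actually a new jar.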
I imagine at some point this is going to be done at the OS level.
It's a great concept and I agree it will definitely exist at some point, but working a lot with GPT-4 has made me viscerally aware of how many different ways something like "build my jar, move it to the last folder I copied it to, and run it" can be spectacularly misinterpreted, and how much context is needed for that command to have any hope of being understood. The other big issue is that there is no rollback for a `rm` or `mv` command that screws up your system.
I had similar ideas when I started on Plandex. I wanted it to be able to install dependencies when needed, move files around, etc., but I quickly realized that there's just so much the model needs to know about the system and its state to even have a chance of getting it right. That's not to say it's impossible. It's just a really hard problem and I'd guess the first projects/products to nail it will either come from the OS vendors themselves, or else from people focusing very specifically on that challenge.
You're right, there is a lot of ambiguity there. I think being able to scan user actions helps a ton with this though, because you know exactly the steps the user took. Most of the time I want this, it's when I literally have to repeat the same set of actions 5+ times and writing a script to do it isn't worth it. I want to be able to just save/train the model and have it do what I want. Today I literally built a jar 50 times, each time having to open up two folders and copy files between the same two directories. Massively annoying.
There is still some ambiguity there because cases might slightly differ, you're right.
For rm/mv: mv is easily reversible, no? You just need to store some context. Same with rm: just copy it to a temp directory first. But again, with a confirmation prompt it's a non-issue either way.
Also, maybe we need a slightly different kind of LLM: one that, instead of just assuming its top predictions are correct, gives you options at critical steps and asks how to proceed.
build a jar.
> I can build a jar with x,y,z, which do you want?
OpenInterpreter is another project you could check out that is more focused on code/script execution: https://github.com/OpenInterpreter/open-interpreter