Show HN: Nous – Open-Source Agent Framework with Autonomous, SWE Agents, WebUI (github.com/trafficguard)
155 points by campers 5 months ago | 37 comments
Hello HN! The day has finally come to stop adding features and start sharing what I've been building the last 5-6 months.

It's a bit of CrewAI, OpenDevin and LangFuse/Cloud all in one, giving devs who prefer TypeScript an integrated framework that provides a lot out of the box to start experimenting and building agents with.

It started after peeking at the LangChain docs a few times and never liking the example code. I began experimenting with automating a simple Jira request from the engineering team to add an index to one of our Google Spanner databases (for context I'm the DevOps/SRE lead for an AdTech company).

It includes the tooling we're building out to automate processes from a DevOps/SRE perspective, which initially includes a configurable GitLab merge request AI reviewer.

The initial layer above Aider (https://aider.chat/) grew into a coding agent and an autonomous agent with LLM-independent function calling and auto-generated function schemas.
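
As a rough sketch of the idea (illustrative only, not the exact API; the real schemas are auto-generated from the source rather than written out by hand like here):

  // A decorator registers a method with a schema, and the registry can then
  // be rendered into whichever format the current LLM needs: native tool
  // calling, or plain prompt text for models without it.
  interface FunctionSchema {
    name: string;
    description: string;
    parameters: Record<string, { type: string; description: string }>;
  }

  const functionRegistry = new Map<string, FunctionSchema>();

  function func(schema: FunctionSchema): MethodDecorator {
    return (_target, _key, descriptor) => {
      functionRegistry.set(schema.name, schema);
      return descriptor;
    };
  }

  class Jira {
    @func({
      name: 'Jira_getIssue',
      description: 'Fetch the details of a Jira issue',
      parameters: { issueKey: { type: 'string', description: 'e.g. PROJ-123' } },
    })
    async getIssue(issueKey: string): Promise<string> {
      return `details for ${issueKey}`; // would call the Jira REST API
    }
  }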

And as testing via the CLI became unwieldy, it soon grew database persistence, tracing, a web UI and human-in-the-loop functionality.

One of the more interesting additions is the new autonomous agent which generates Python code that can call the available functions. Using the Pyodide library, the tool objects are proxied into the Python scope and executed in a WebAssembly sandbox.

As it's able to perform multiple calls and validation logic in a single control loop, it can reduce cost and latency, getting the most out of frontier LLM calls with better reasoning.
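
Roughly, the mechanism looks like this (a minimal sketch, not the actual project code):

  import { loadPyodide } from 'pyodide';

  const pyodide = await loadPyodide();

  // Proxy a tool object into the Python global scope; async JS functions
  // become awaitable from the generated Python.
  pyodide.globals.set('web', {
    getPage: async (url: string) => (await fetch(url)).text(),
  });

  // Run the LLM-generated script in the WebAssembly sandbox. Multiple tool
  // calls plus validation logic execute in this single control-loop step.
  const generatedCode = [
    "page = await web.getPage('https://news.ycombinator.com')",
    "'ok' if len(page) > 0 else 'empty'",
  ].join('\n');

  const result = await pyodide.runPythonAsync(generatedCode);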

Benchmark runners for the autonomous agent and the coding agents are in the works to get some numbers on the capabilities so far. I'm looking forward to getting back to implementing all the ideas around improving the code and autonomous agents from a metacognitive perspective, after spending time recently on docs, refactoring and tidying up.

Check it out at https://github.com/trafficguard/nous




This looks fantastic! I've been using aider and had my own scripts to automate some things with it, but this looks next level and beyond.

I wanted to try this out (specifically the web UI), so I configured the env file, adjusted the docker compose file, ran `docker compose up` and it "just works".

It would be great if there was a basic agent example or two pre-configured, so you can set this up and instantly get a better sense of how everything works from a more hands-on perspective.


Updating the Dockerfile and docker-compose.yml was the last change I made so glad to hear that worked for you! What change did you make to the docker compose file?

The CLI scripts under src/cli would currently be the best examples to look at for running an autonomous agent, along with the fixed workflows (e.g. code.ts)


The environment key cannot be an empty object, which it currently is by default. And I commented out the Google Cloud line (thanks for that code comment).


If this isn't by Nous Research, may want to consider renaming (https://x.com/NousResearch, https://nousresearch.com/)


I can confirm this is not a project related to Nous Research in any way, just an unfortunate naming collision


Nous is the French word for "us". Haven't heard of Nous Research


But they explain it's from the Greek "nous", which fits better for AI


I was surprised when I learned about the Greek derivation. In the UK it's slang for common sense ("use your nous, mate"). I have to wonder how a bit of Ancient Greek ended up as UK slang.


Where in the UK is this a thing?


Probably less common today but honestly I'd be surprised if someone didn't know it at all. Try asking someone over fifty in England what "use your nous" means.

A quick dip into Google Books throws back this from the works of George Garrett, a Liverpool-born working-class writer who wrote about being a merchant seaman during World War I: "An older fireman stopped them. 'Use your nous,' he sang out. 'You can't pile into another room and waken all hands for the sake of an individual.'"

Or this from The Spectator in the 80s: "Use your nous, you silly cow"

https://www.quora.com/What-does-the-British-slang-word-nous-...


Eton probably.


Actually, when I was bouncing ideas off Claude, it suggested the alternative spelling "noos". Then I can keep the concept and only have one letter to change


And there is https://nous.technology/, known for their smart plugs.


And if it is by Nous Research, there definitely needs to be clearer branding, as this is very confusing.

If OP is not Nous Research (which I suspect to be the case) then a name change is a must as they're already a fairly well established company in the LLM space (surprised OP isn't aware of the name collision already). It's a bit similar to creating a new library with the "Smiling Face with Open Hands emoji"[0] as your logo

0. https://emojipedia.org/hugging-face


When I first picked the name after a chat with Claude, I hadn't come across Nous Research, and they didn't show up when Googling for just "nous".

I see a bit of reuse of words across various LLM-related projects:

Langchain/langfuse/langflow

Llama/ollama/llamaindex

so I hadn't been too worried about it when I became aware of them.

That's what Show HN is for, getting feedback, and a name change now would be easy before I post it around more.


Never heard of you


I'm not affiliated with Nous Research in any way, but I do work in the LLM space, and at least in this community it's a fairly well-known org. Since this project is also in that space, I was just adding support for the parent's observation.


Yeah, but in the LLM/AI space everyone who isn't a total newbie knows Nous Research.

And this framework kinda does fall within that space.


I'm not entirely sure what this does. The initial paragraph goes into history and what other platforms do, but it doesn't say what problem this will solve for me. Then it continues with some features and screenshots, but I still don't know how to use this or why.


This looks too good. I have a B2B AI product, the features that exist in Nous easily outclass anything I could make in a reasonable timeline.

Maybe I should rewrite my app using Nous...


Thanks! I've spent a lot more time on the computer than I would like over the last few months building it.

If you think you might want to, feel free to get in touch


I'm having a hard time figuring out how much logic lives in Nous and how much in Aider for code changes - could you say some more about it?

Playing with the code agents so far, I've found Aider makes many silly mistakes and reverts its own changes in the next commit of the same task. On the other hand, Plandex is more consistent but can get into a loop of splitting the task into way too small pieces and burning money. I'm interested to see other approaches coming up.


I have a few steps so far in the code editing at https://github.com/TrafficGuard/nous/blob/main/src/swe/codeE... There is a first pass I initially created because, when I was re-running a partially completed task, it would sometimes duplicate what had already been done. This helps Aider focus on what to do.

  <files>${fileContents}</files>
  <requirements>${requirements}</requirements>
  You are a senior software engineer. Your task is to review the provided user requirements against the code provided and produce an implementation design specification to give to a developer to implement the changes in the files.
  Do not provide any details of verification commands etc as the CI/CD build will run integration tests. Only detail the changes required in the files for the pull request.
  Check if any of the requirements have already been correctly implemented in the code as to not duplicate work.
  Look at the existing style of the code when producing the requirements.
Then there is a compile/lint/test loop which feeds back the error messages and, in the case of compile errors, the diff since the last compiling commit. Aider added some similar functionality recently.
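
The loop is shaped roughly like this (a sketch; requestCodeChanges is an illustrative stand-in for the call out to Aider, not a real API):

  import { execSync } from 'node:child_process';

  declare function requestCodeChanges(instructions: string): Promise<void>;

  function run(cmd: string): { ok: boolean; output: string } {
    try {
      return { ok: true, output: execSync(cmd, { encoding: 'utf8' }) };
    } catch (e: any) {
      return { ok: false, output: `${e.stdout ?? ''}${e.stderr ?? ''}` };
    }
  }

  async function editLoop(requirements: string, maxIterations = 5) {
    let instructions = requirements;
    outer: for (let i = 0; i < maxIterations; i++) {
      await requestCodeChanges(instructions); // e.g. shell out to Aider
      for (const cmd of ['npm run build', 'npm run lint', 'npm test']) {
        const { ok, output } = run(cmd);
        if (!ok) {
          // Feed the errors back in. For compile errors the diff since the
          // last compiling commit would be appended here too.
          instructions = `The requirements were:\n${requirements}\n` +
            `Running ${cmd} failed with:\n${output}\nFix these errors.`;
          continue outer;
        }
      }
      return; // compiles, lints and tests all pass
    }
    throw new Error('Did not converge within the iteration budget');
  }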

Then finally there's a review step which asks:

  Do the changes in the diff satisfy the requirements, and explain why? Are there any redundant changes in the diff? Was any code removed in the changes which should not have been? Review the style of the code changes in the diff carefully against the original code.  Do the changes follow all the style conventions of the original code?
This helps catch issues that Aider inadvertently introduced, or missed.

I have some ideas around implementing workflows that mimic what we do. For example, if you have a tricky bug: add a .only to the relevant describe/it tests (or create tests if they don't exist), add lots of logging and assertions to pinpoint the fix required, then undo the .only and the extra logging (sketched below). That's what's going to enable higher overall success rates; you can see the progress on the SWE-bench Lite leaderboard, where simple RAG implementations had up to ~4% success rate with Opus, while the agentic solutions are reaching a 43% pass rate on the full suite.
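
e.g. with Mocha/Jest (hypothetical test names):

  // Narrow the run to the suspect test while debugging...
  describe.only('PriceCalculator', () => {
    it('applies the discount once, not twice', () => {
      // ...add logging and extra assertions here to pinpoint the fix,
      // then remove the .only and the extra logging before committing.
    });
  });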


Cool project!

Just FYI, your chosen name collides with Nous Research, which has been a prominent player in open-weights AI over the past year.


Thanks! I posted a reply to another comment about the name clash. I thought I could add another word to differentiate, but Nous Agents doesn't really roll off the tongue. New name ideas welcome!


Pick a proper noun and you will soar above the plethora of startups reusing common nouns for no good reason (e.g. "plane" having nothing to do with aeronautics).


Noosphere might be cool


That's promising! Congratulations on the launch. Consider adding it to the specialized directory of AI agents and frameworks for building them. Let me know if you need help.

https://aiagentsdirectory.com/


Which definition of "agent" are you using for this project?


Good question. At first I only called the fully autonomous agents "agents", as to me that's what having agency is. I didn't like it when other projects said "multi-agent" when it's just a bunch of LLM calls.

Initially the coding and software dev agents were called workflows, but to make it more agenty I was OK with them being called agents if the result of an LLM call affected the control flow


So an agent here is the combination of a system prompt and a configured set of tools, kind of like an OpenAI "GPT"?


No, a chat bot using tools (e.g. GPTs) is an "assistant."

An LLM agent, unlike an assistant, is not a chat bot. It is a primarily or fully autonomous LLM-driven application which "chats" primarily with itself and/or other agents.

In other words, assistants primarily interact with humans while agents primarily interact with themselves and other agents.


I'm going to make the availability of the requestFeedback function a boolean flag, so it can be disabled when running benchmark suites etc. Whether it's an assistant or an agent by that definition is then really just a parameter value.
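
Something like this (illustrative names only, not the final API):

  // Hypothetical shape of the flag; startAgent and its config are made up here
  declare function startAgent(config: {
    initialPrompt: string;
    functions: unknown[];
    requestFeedbackEnabled: boolean;
  }): Promise<void>;

  await startAgent({
    initialPrompt: 'Add an index to the orders table',
    functions: [],
    requestFeedbackEnabled: false, // disabled for benchmark runs
  });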


That trace UI is nice


I can't take credit for that particular screen; it's the Trace UI in Google Cloud. I did look at LangSmith for tracing, but for now I wanted to stick with standard OpenTelemetry tracing, so you can export the spans to Honeycomb etc.
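
e.g. the standard OTel wiring to send those spans to Honeycomb looks like this (service name is illustrative):

  import { NodeSDK } from '@opentelemetry/sdk-node';
  import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-proto';

  // Any OTLP-compatible backend works; Honeycomb shown here.
  const sdk = new NodeSDK({
    serviceName: 'nous-agents',
    traceExporter: new OTLPTraceExporter({
      url: 'https://api.honeycomb.io/v1/traces',
      headers: { 'x-honeycomb-team': process.env.HONEYCOMB_API_KEY ?? '' },
    }),
  });
  sdk.start();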


How much does it cost to run?


Having it deployed costs nothing with the Cloud Run and Firestore free tiers.

As for LLM costs, that really depends on what you're trying to do with it. Fortunately that cost is always coming down. When I was first building it with Claude Opus the costs did add up, but 100 days later we have 3.5 Sonnet at a fraction of the cost.

The Aider benchmarks are good for seeing how different LLMs perform at coding/patch generation. Sonnet 3.5 is best if it's in the budget; DeepSeek Coder V2 gives the best bang for buck. https://aider.chat/2024/07/25/new-models.html



