ReactAgent: LLM Agent for React Coding (reactagent.io)
130 points by omarfarooq on Oct 25, 2023 | 44 comments



The critical test for these systems is whether they get you to production-worthy code faster than you would get there without them. You can't just stop a demo at the point where you don't have working code. Anyone with a ChatGPT Plus subscription can ask it to generate React components from a mockup, but it doesn't do a great job. You have to show your tool doing a good job, not just flipping through component files.


For me, a critical part of "doing a good job" is how the implementation fits into a larger system. ChatGPT is not very good at that.

I tried asking it to build a basic state management component for a TypeScript/React app. It offered a class-based component. I asked it to use a closure instead. It offered a closure but skipped some of my other requirements that it had previously satisfied (like types). I asked it to add in-memory caching. It added caching but removed something else. I asked it to create a context provider based on this component; it created one but skipped parts of the state management implementation.

Basically, it sort of works if you can hand-hold it and pay attention to every detail. But that barely saves me any time. And the code it generates is definitely not production-ready and requires refactoring in order to integrate it into an existing code base.


A good trick here is to use the API rather than the web interface and give it access to functions that let it edit files directly. Then you don't have to do the iterations where you ask it for something new and it forgets something old; it just updates the code in place, leaving all the old stuff and changing/adding the new stuff.
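For anyone wondering what the plumbing looks like, here's a rough sketch using the OpenAI Node SDK's function calling (untested; write_file is a made-up tool name for illustration, not anything built into the API):

    // Sketch: let the model edit a file in place via function calling.
    // "write_file" is a hypothetical tool we define ourselves.
    import OpenAI from "openai";
    import { promises as fs } from "fs";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the env

    const functions = [
      {
        name: "write_file",
        description: "Overwrite a project file with new contents",
        parameters: {
          type: "object",
          properties: {
            path: { type: "string", description: "Path relative to the project root" },
            contents: { type: "string", description: "Complete new file contents" },
          },
          required: ["path", "contents"],
        },
      },
    ];

    async function applyEdit(instruction: string, path: string) {
      const source = await fs.readFile(path, "utf8");
      const res = await client.chat.completions.create({
        model: "gpt-4",
        messages: [
          { role: "system", content: "Edit the file in place. Keep everything you were not asked to change." },
          { role: "user", content: `${instruction}\n\n// ${path}\n${source}` },
        ],
        functions,
        function_call: { name: "write_file" }, // force one in-place edit
      });
      const call = res.choices[0].message.function_call;
      if (call?.arguments) {
        const args = JSON.parse(call.arguments);
        await fs.writeFile(args.path, args.contents, "utf8");
      }
    }

Because the whole current file goes in and the whole edited file comes back, the model never gets a chance to "forget" the parts you didn't mention.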


Interesting. Is there an IDE (ideally, VSCode) extension you could recommend for that?

What do you mean by giving access? Is it possible to limit its access only to certain functions/files?


Not the GP, but I've been working on an open platform [0] for integrating OpenAI APIs with other tools (VSCode, JupyterLab, bash, Chrome, etc.) that you might find interesting; the VSCode integration supports editing specific files/sections.

Also worth taking a look at GitHub Copilot Chat[1]; it's a bit limited, but in certain cases it works well for editing specific parts of files.

[0] https://github.com/lightrail-ai/lightrail

[1] https://marketplace.visualstudio.com/items?itemName=GitHub.c...


This is exactly the use case that Promptr is intended for: https://github.com/ferrislucas/promptr

* full disclosure: I’m the author of Promptr


Looks promising, thank you! I will try it out this week.


Yes. Cursor at cursor.sh. I'm happily paying for it, and it works great, giving answers based on your codebase. It generates great inline code but doesn't have file and multi-file generation (yet?).


Not built in; you have to use the API and then build functions for it to invoke that fetch the files. It's called function calling (https://openai.com/blog/function-calling-and-other-api-updat...), but it's not as easy as you might expect.


cursor.sh. Does exactly that.


The way to tackle that is with RAG and local embeddings so that you can set examples for the code conventions you prefer. ChatGPT isn't going to do it without a lot of manual copy/pasting, and most tools I've seen are not that great. I just use a bunch of hacked-together scripts and a local embedding DB with my code as an assist, and it's worked pretty well.
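Roughly, the retrieval half of that looks like this (a sketch: the in-memory array stands in for a real embedding DB, and the snippets would come from your own codebase):

    // Sketch of RAG for code conventions: embed snippets from your own
    // codebase, then prepend the closest matches to each prompt.
    import OpenAI from "openai";

    const client = new OpenAI();

    type Snippet = { text: string; embedding: number[] };
    const store: Snippet[] = []; // stand-in for a local embedding DB

    async function embed(text: string): Promise<number[]> {
      const res = await client.embeddings.create({
        model: "text-embedding-ada-002",
        input: text,
      });
      return res.data[0].embedding;
    }

    async function indexSnippet(text: string) {
      store.push({ text, embedding: await embed(text) });
    }

    function cosine(a: number[], b: number[]): number {
      let dot = 0, na = 0, nb = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
      }
      return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Prepend the k nearest snippets as convention examples.
    async function buildPrompt(task: string, k = 3): Promise<string> {
      const q = await embed(task);
      const examples = store
        .map((s) => ({ s, score: cosine(q, s.embedding) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, k)
        .map(({ s }) => s.text);
      return `Follow the conventions in these examples:\n\n${examples.join("\n\n")}\n\nTask: ${task}`;
    }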


In my experience, follow-up questions are when things break down. Edit your original prompt, tweak as needed, and re-submit.

If you want to follow up, you can also edit that follow-up however many times and see all the answers it gives.

Sometimes regenerating without any edits can be useful too.


> ChatGPT is not very good at that.

Mmm... that's an awfully big generalization.

I'm going to go out here on a limb and say... maybe you're doing it wrong.

ChatGPT is very, very good at what you're describing (integration between well-defined interfaces and systems in code), especially GPT-4.

If you get one bad result, does that mean it sucks? Or... does it mean you don't understand the tool you're using?

The power of AI is in automation.

What you need to do is take your requirements (however you get them) and generate hundreds of solutions to the problem, then automatically pick the best solutions by, say, checking if the code compiles, etc.
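Concretely, that loop could be as dumb as this (a sketch; n, the temperature, and the tsc compile check are all just one possible choice of filter):

    // Sketch of best-of-n sampling: ask for many completions in one
    // call, keep only the candidates that type-check.
    import OpenAI from "openai";
    import { execFileSync } from "child_process";
    import { writeFileSync } from "fs";

    const client = new OpenAI();

    function compiles(code: string): boolean {
      writeFileSync("/tmp/candidate.tsx", code);
      try {
        execFileSync("npx", ["tsc", "--noEmit", "--jsx", "react", "/tmp/candidate.tsx"]);
        return true;
      } catch {
        return false; // tsc exited non-zero: discard this candidate
      }
    }

    async function bestOfN(prompt: string, n = 20): Promise<string[]> {
      const res = await client.chat.completions.create({
        model: "gpt-4",
        messages: [{ role: "user", content: prompt }],
        n,                // sample n completions from one request
        temperature: 0.8, // keep some diversity between samples
      });
      return res.choices
        .map((c) => c.message.content ?? "")
        .filter(compiles);
    }

Compiling is the cheapest filter; running unit tests against each survivor is the obvious next one.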

...

This is a probabilistic model.

You can't look at a single output and say 'this sucks'; you can only (confidently) say that if you look at a set of results and characterize the solution space the LLM is exploring as being incorrect.

...and in that case, the highest probability is that your prompt wasn't good enough to find the solution space you were looking for.

Like, I know this blows people's minds for some reason, but remember:

prompt + params + LLM != answer

prompt + params + LLM = (seed) => answer

You're evaluating the wrong thing if you only look at the answer. What you should be evaluating is the answer-generator function, which can generate various results.

A good answering function generates many good solutions; but even a bad answering function can occasionally generate good solutions.

If you only sample once, you have no idea.

If you are generating code using ChatGPT and taking the first response it gives, you are not using anything remotely like the power offered to you by their model.

...

If you can't be bothered using the API (which is probably the most meaningful way of doing this), use that little 'Regenerate' button in the bottom right of the chat window and try a few times.

That's the power here: unlimited numbers of variations to a solution at your fingertips.

(and yes, the best way to explore this is via the API, and yes, you're absolutely correct that 'ChatGPT' is rubbish at this, because it only offers a stupid chat interface that does its best to hide that functionality away from you; but the model, GPT-4... hot damn! Do not fool yourself. It can do what you want. You just have to ask in a way that is meaningful)


I would say where it is beneficial to use the ChatGPT UI is with the Data Analysis mode, where it has access to a code environment. You upload a few files, ask it for an implementation, and it will happily crunch through the process, validate itself via the REPL, and offer you the finished files as downloads. It's pretty neat.


Yeah, it really didn't seem like there was a lot of "there" there. Maybe I'm missing something in the codebase, but they're just piping specs to the GPT-4 API?


> doing a good job

Doing a good job is the subjective part, no? Especially with opinionated things like programming in general, and React in particular.


There are:

- Style guidelines

- Unit tests

- Functional requirements

- System requirements

If the tool creates commits that pass all of them, then it is objectively doing a good job.


What is “a good job” according to your definition? If it can save even 10% of your time, then the tool is worth the money. And by the way, they never promised anything:

> ReactAgent is an experimental autonomous agent


I don’t know how it is even going to achieve that…

It’s going to give you a code base that is filled with subtle bugs, no tests, and no documentation. You won’t have a good understanding of your own product, and you are almost certainly going to end up wasting more time over even a modest time window compared to the feeling of productivity you had at the start.


It will only give you that codebase if that's what you ask it for. If you ask it for tests it will give you tests. If you ask it for documentation it will give you documentation.


I feel like you could build reliable scaffolding tools with zero AI. Code generation tools are underused a lot of the time.
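For example, a deterministic component scaffolder is a few dozen lines of plain TypeScript, no model required (all names here are purely illustrative):

    // Zero-AI scaffolding: a template gives the same output every time.
    import { mkdirSync, writeFileSync } from "fs";
    import { join } from "path";

    function scaffoldComponent(name: string, dir = "src/components") {
      const folder = join(dir, name);
      mkdirSync(folder, { recursive: true });

      // Component skeleton with a typed props interface.
      writeFileSync(
        join(folder, `${name}.tsx`),
        `import React from "react";

    export interface ${name}Props {}

    export function ${name}(props: ${name}Props) {
      return <div />;
    }
    `
      );

      // Matching test file, so tests exist from day one.
      writeFileSync(
        join(folder, `${name}.test.tsx`),
        `import { render } from "@testing-library/react";
    import { ${name} } from "./${name}";

    test("renders", () => {
      render(<${name} />);
    });
    `
      );
    }

    scaffoldComponent("UserCard");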


Correct, and from there you can still build the tests and documentation to go with it and actually understand how everything works.

This is literally hoping that some linear algebra process is going to magically put all of the right things together in all the right ways while maintaining all of the correct syntax and the underlying logic will make sense. Sometimes I think people forget that it’s just a glorified guessing game of what letter most likely comes next.


Aren't all of these thin branded clients around ChatGPT extremely high-risk ventures for both the company and the customers? OpenAI can change their prices or their terms whenever they please, and the crunch _will_ come in the future when they have to make back the money they are currently throwing away to grow.


Generative AI as a category isn't going anywhere and the model interfaces are mostly interchangeable so there isn't much lock-in. If OpenAI were to jump the shark, it's not that hard to switch an app to alternative models.

Plus, OpenAI is making plenty of revenue[1]. Yes, they're operating at a loss to grow faster, but it sounds like their unit economics are positive, meaning they can likely become profitable in the future without price hikes or user-hostile changes.

1- https://www.maginative.com/article/openais-revenue-skyrocket...


There's competition for alternatives. https://www.promptingguide.ai/models/collection

Life will find a way.


In my opinion, they're not, because OpenAI makes generalized tools that lack specificity (they still need prompting, splitting, etc.).

I’d see it as AI as a Platform. The AI space will get hyper-competitive now, with companies like Anthropic and open-source projects like Llama. LLMs will become a commodity.


I tend to subscribe to this viewpoint. Given the size of the opportunity and the level of competition, and barring some exponential compounding dynamic (possible in this space of all industries), I don’t see foundational model providers like OpenAI being able to compete at so many levels of abstraction at once (as in both platform and vertical solutions).

What we’re more likely to see is some sort of consolidation and collapse of layers of what used to be a viable business. Companies who are not actively working on differentiation and adding real value will simply start withering away, at the same time giving space to nimbler teams that operate what used to take hundreds of people to manage.

tl;dr: No direct competition but consolidation and disruption of current operators.


They clearly didn't cost much to make. So in terms of business risk it's one of those "high probability, low consequence" type risks. There's upside to be made for as long as it remains viable. Just don't bet the farm on it.


Couldn't you say the same thing about anything you build on top of? What if AWS raises its prices?

If you're building something people want to use and are willing to pay for, you would raise prices.

Could it crash and burn? Sure. But you can't remove all risk as a startup.


The namespace is overloaded. This 2022 paper describes a method for looping LLM generations in a specific way to generate more reasonable decisions and outcomes: "ReAct: Synergizing Reasoning and Acting in Language Models" https://react-lm.github.io/

This is the type of agent that AutoGPT uses.
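The loop itself is easy to sketch (the Thought/Action/Observation line format follows the paper's prompting style; the stub search tool and the regexes here are illustrative, not the paper's exact prompt):

    // Bare-bones ReAct loop: the model reasons, names an action, we run
    // it, and feed the observation back in until it gives a final answer.
    import OpenAI from "openai";

    const client = new OpenAI();

    const tools: Record<string, (input: string) => Promise<string>> = {
      search: async (q) => `(stub search results for "${q}")`,
    };

    async function react(question: string, maxSteps = 5): Promise<string> {
      let transcript = `Question: ${question}\n`;
      for (let i = 0; i < maxSteps; i++) {
        const res = await client.chat.completions.create({
          model: "gpt-4",
          messages: [
            {
              role: "system",
              content:
                "Alternate 'Thought:', 'Action: tool[input]' and 'Observation:' lines. End with 'Final Answer:'.",
            },
            { role: "user", content: transcript },
          ],
          stop: ["Observation:"], // we supply observations ourselves
        });
        const step = res.choices[0].message.content ?? "";
        transcript += step;
        const final = step.match(/Final Answer:([\s\S]*)/);
        if (final) return final[1].trim();
        const action = step.match(/Action: (\w+)\[(.*)\]/);
        if (!action) break;
        const tool = tools[action[1]];
        if (!tool) break;
        transcript += `\nObservation: ${await tool(action[2])}\n`;
      }
      return "(no answer within the step budget)";
    }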


I had an excellent pun in my blog post about LangChain's problems:

> Wars about software complexity and popularity despite its complexity are an eternal recurrence. In the 2010’s, it was with React; in 2023, it’s with ReAct.


Can you share your blog post?


The Hacker News discussion is here: https://news.ycombinator.com/item?id=36725982


Millions of frameworks and hype tools stacked together, now regurgitated by a chatbot that can't reason. All to bring you more generic sites that can't run smoothly even on processors with billions of clock cycles per second. Web "developers" should be embarrassed.


Hey everyone, happy to be featured here and to see your thoughts! I'm the original author. I was curious to find a way to generate production code with an LLM, and I did find a path to it: solving a complex problem by splitting it into multiple smaller ones that an LLM / LLM agent can take on. IMO it is doable, but it requires a lot of work and engineering, and potentially funding, so I left this project to do something fundable in this space :) I wish I could keep working on it full time. If anyone wants to help with that, or ask anything, reach out!


Is the demo video intentionally set to double speed or is that a bug/typo?


IDK, but it's a Loom video:

    <iframe width="100%" height="315" src="https://www.loom.com/embed/591fd03b54d04a74a15995815de47c76" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>

Maybe they checked our gyroscopes and decided, based on that, that we wanted to watch it at double speed.

Edit: weird, if I click that link, it plays at normal speed.


You could easily set up the same thing with https://formula8.ai - even more flexibly. Just write a React Component Formula.


Very cool to see. I can't wait to use something like this, especially once someone automates the loop of write -> run code -> check errors -> fix errors for an LLM.
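A minimal version of that loop might look like this (a sketch; runTsc and the prompts are assumptions, not any existing tool's API):

    // Sketch of write -> run -> check errors -> fix errors, using the
    // TypeScript compiler as the "run code / check errors" step.
    import OpenAI from "openai";
    import { execFileSync } from "child_process";
    import { writeFileSync } from "fs";

    const client = new OpenAI();

    // Returns null on success, or the compiler's error output.
    function runTsc(path: string): string | null {
      try {
        execFileSync("npx", ["tsc", "--noEmit", path], { encoding: "utf8" });
        return null;
      } catch (e: any) {
        return e.stdout?.toString() ?? String(e);
      }
    }

    async function writeUntilItCompiles(spec: string, path: string, maxTries = 4) {
      const messages: { role: "user" | "assistant"; content: string }[] = [
        { role: "user", content: `Write a TypeScript module that ${spec}. Reply with code only.` },
      ];
      for (let i = 0; i < maxTries; i++) {
        const res = await client.chat.completions.create({ model: "gpt-4", messages });
        const code = res.choices[0].message.content ?? "";
        writeFileSync(path, code);
        const errors = runTsc(path);
        if (!errors) return code; // done: the code type-checks
        messages.push(
          { role: "assistant", content: code },
          { role: "user", content: `That failed to compile:\n${errors}\nFix it. Code only.` }
        );
      }
      throw new Error("still failing after retries");
    }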


We’ve done experiments with generative UI code, but we keep coming back to the LLM making crappy-looking UI, and if the engineer still needs to tweak the code afterward, this is just a glorified Copilot. Without more specialized models, I believe the next step is an intermediate design language…something an LLM can’t make look bad. Lots of room to innovate here, though. Really glad to see people working on this problem.


Just a heads up: the video playback was at 2x by default, so I couldn’t understand anything until I looked at the settings. I don’t think it’s a leftover setting from me, because I’ve never used Loom before.


Is there something like this for Swift & SwiftUI development? My day job is React, so I need a lot less help there than with my personal iOS apps.


Just ask ChatGPT 4. It is pretty good, and now even better with vision support. Upload a wireframe, download code.


AI-powered no-code.



