Artifacts are now generally available (anthropic.com)
82 points by doener 16 days ago | 33 comments



I've known about Artifacts for a couple of weeks now, but hadn't had the time to really test them; they had shown up in a few of my prompts, though mostly for documents and the like.

For the sake of it, to test the waters as it were, I decided to prompt for a small React app I've had in mind for a couple of days now [1] (link below the app description). I know this is far from a polished, market-ready app, but I do think junior devs are all out of a job (or going to be, within 6 to 18 months). Prepare for the shift; this particular paradigm is taking no prisoners.

---

For the record, this is the app I got out of it, fully working:

My GF owns a cafeteria and she's running a promo on her whole menu, _except_ items that are already on another promo. The promotion is a 20% discount on all products, with the discounted price rounded to the nearest 100 pesos (Argentina).

The small app I requested is a single page with a large number (the total price) displayed on top. Below that is a three-tab panel, with the tabs being PROMO, REGULAR and SELECTED. Each tab has a list of products, with buttons [ - ] [ # ] [ + ] <name> $<price>. When [ + ] is clicked, the count increases and the product appears in the SELECTED tab if it wasn't there already. [ - ] decreases the count in both listings, and when it hits zero the product disappears from SELECTED. The price on top always updates to reflect the total. The SELECTED tab shows all the selected products with their prices (you can change the count for each product there as well), plus a per-category subtotal, since the discount only applies to REGULAR products. A discounted subtotal appears as ~$#.##~ $#.##, where the first number is the price without the discount and the second is the price after it.
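
For reference, the pricing arithmetic at the core of it is tiny. A rough sketch of what I mean (assuming the 20% off and the rounding to the nearest 100 pesos apply per product rather than to the subtotal):

    // Sketch only: assumes the 20% discount and the rounding to the nearest
    // 100 pesos are applied per product; PROMO items keep their listed price.
    type Category = "PROMO" | "REGULAR";
    type Item = { name: string; price: number; qty: number; category: Category };

    const discounted = (price: number): number => Math.round((price * 0.8) / 100) * 100;

    function totals(selected: Item[]) {
      const promo = selected
        .filter(i => i.category === "PROMO")
        .reduce((sum, i) => sum + i.price * i.qty, 0);
      const regularFull = selected
        .filter(i => i.category === "REGULAR")
        .reduce((sum, i) => sum + i.price * i.qty, 0);
      const regularDiscounted = selected
        .filter(i => i.category === "REGULAR")
        .reduce((sum, i) => sum + discounted(i.price) * i.qty, 0);
      // regularFull is the struck-through subtotal; regularDiscounted is what gets charged.
      return { promo, regularFull, regularDiscounted, total: promo + regularDiscounted };
    }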

My knowledge of React is negligible, and even if I knew it well enough to do this, it would have taken me at least a couple of hours to code. I could probably deploy it as an app to my GF's phone in less than an hour, compiling and all.

It also has the option of an API call, so I could put all the products in a page and update them as needed.

[1] https://claude.site/artifacts/2fe3c97d-88f6-4e2f-9ed2-96452e...


This does not take a couple of hours. I can assure you of that.

The app description sounds like it was more work to come up with than the app code itself.

Also, this is an extremely common, bland and simplistic example. You'll see it flail and falter on anything slightly less generic and boilerplate-heavy, and there is no known tech/theory that'll get us there. We just hope more "scale" will magically solve all its issues, but IMO that's not going to happen. The paradigm just doesn't allow it. Time will tell who is right, of course.

I don’t want to downplay the recent LLM achievements because they are amazing, but this is not taking any (programming) jobs.

I'm actually not too worried about the current wave of AI. Copilot is an annoyance and GPT-4 is lying so much it's getting old. LLMs have deep issues that I'm not seeing much improvement on. The next wave of AI, however... I'm not sure. It's a matter of time.


That's pretty neat.

I'd like to see how it handles implementing a Sudoku solver.

I've tried, as a benchmark, asking multiple different AIs to implement a Sudoku solver in Rust. It always turns into an uncompilable mess at worst, or at best a solver that fills the grid but doesn't actually follow the rules of Sudoku.

Maybe I'm just bad at "prompting". But with all of the material online about implementing your own Sudoku solver, I don't understand why every model I've asked produces incorrect solutions. There should be enough info in their training material to know how to do this. I've tried a bunch of different approaches, like asking a short simple question without detail ("Implement a Sudoku solver in Rust"), or having it discuss general strategies for the implementation first, along with things like telling it that it's an expert software developer or a professional computer scientist, etc. The textual descriptions it gives sound like they make sense, but when it comes to the actual code, I've so far never seen it succeed.
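
For reference, the backtracking core I keep expecting them to reproduce is only about this much code; a minimal sketch (in TypeScript here purely to illustrate the algorithm, not the Rust CLI I'm actually asking for):

    // Minimal backtracking Sudoku solver (0 = empty cell). Fills `grid` in place
    // and returns true if a solution exists.
    function isValid(grid: number[][], row: number, col: number, v: number): boolean {
      for (let i = 0; i < 9; i++) {
        if (grid[row][i] === v || grid[i][col] === v) return false;
      }
      const br = Math.floor(row / 3) * 3;
      const bc = Math.floor(col / 3) * 3;
      for (let r = br; r < br + 3; r++) {
        for (let c = bc; c < bc + 3; c++) {
          if (grid[r][c] === v) return false;
        }
      }
      return true;
    }

    function solve(grid: number[][]): boolean {
      for (let row = 0; row < 9; row++) {
        for (let col = 0; col < 9; col++) {
          if (grid[row][col] !== 0) continue;
          for (let v = 1; v <= 9; v++) {
            if (isValid(grid, row, col, v)) {
              grid[row][col] = v;
              if (solve(grid)) return true;
              grid[row][col] = 0; // backtrack
            }
          }
          return false; // no digit fits this empty cell
        }
      }
      return true; // no empty cells left: solved
    }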

Feeding compiler errors back into the different LLMs I've tried often results in the AI fixing a compile error in one place and then introducing new compile errors in other parts of the code.

If this Artifacts thing could enable me to do the following, then I'm probably going to cancel my OpenAI ChatGPT subscription and start using Claude with Artifacts on the Claude Pro plan instead:

1) Implement a Sudoku solver in Rust. The program will be a command line tool that takes as input a grid of numbers for a Sudoku to solve, and prints the solution to it. It also needs to properly identify invalid input, and input that has no solution. (This alone would be impressive to me because I've never gotten any of the LLMs I've tried to properly do even this.)

2) Make use of serde and hyper to accept a Sudoku puzzle to solve as JSON input over HTTP. Respond with a JSON response that gives the solution to the puzzle.

3) Create a graphical, run-of-the-mill web frontend for the Sudoku solver.

The criterion for success is that the person (me) talking with the LLM never provides any of the actual thinking about the algorithms and data structures used. If I wanted to do that, I'd write it from scratch myself. I want the LLM to pick the algorithms and data structures, and the most I'll do is paste back any errors and have it actually fix those.


I just asked ChatGPT 4o to generate a Sudoku solver in Rust, and the code compiles and works fine.

Edit: tried with Llama 3.1 70B as well, and funnily enough it outputs the exact same Sudoku puzzle as an example to test the code with.


All of these "make me a stereotypical app you've seen thousands of examples of online" demos that people use to prove how great AI is at programming are really just a shittier version of copy-paste, where your clipboard randomly changes arbitrary parts of the text before pasting it.


Btw, open-webui will soon merge a PR adding artifact-like interaction:

https://github.com/open-webui/open-webui/pull/4548


What I'm missing in Claude is a Python sandbox, which should at least be able to use matplotlib, PIL, numpy and pandas, and be able to display results the way ChatGPT does.

Claude appears to be really good at Javascript-based conversations.

Also, search functionality across existing conversations would be at the top of my wish list. AFAIK only Mistral offers this. Gemini is the worst at giving you access to older conversations; you have to use the "Activity" feature.


Just ask it to use pyodide as a script tag: https://claude.site/artifacts/4aa202ca-b620-4636-8364-3c7bf6...
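
Roughly what the generated page boils down to (a sketch; it assumes pyodide.js has already been loaded from the CDN via a script tag, which exposes a global loadPyodide()):

    // Sketch: assumes pyodide.js is already loaded from the CDN via a <script>
    // tag, which provides the global loadPyodide().
    declare const loadPyodide: () => Promise<any>;

    async function runNumpyDemo(): Promise<void> {
      const pyodide = await loadPyodide();
      await pyodide.loadPackage("numpy"); // matplotlib/pandas load the same way
      const mean = await pyodide.runPythonAsync(`
    import numpy as np
    float(np.mean([1, 2, 3]))
    `);
      console.log(mean); // 2
    }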


Claude has support for "tools" in its APIs, so it should be possible in an alternative frontend to have Claude use a Python sandbox.
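
The Messages API takes a tools list, so the frontend could expose its own sandbox as a tool along these lines (a sketch: the run_python tool and its schema are made up here, and the sandbox itself is something the frontend has to provide):

    // Sketch: the "run_python" tool and its schema are invented for illustration;
    // the sandbox that actually executes the code lives in the frontend.
    const response = await fetch("https://api.anthropic.com/v1/messages", {
      method: "POST",
      headers: {
        "x-api-key": process.env.ANTHROPIC_API_KEY ?? "",
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
      },
      body: JSON.stringify({
        model: "claude-3-5-sonnet-20240620",
        max_tokens: 1024,
        tools: [{
          name: "run_python",
          description: "Execute Python code in a sandbox and return its stdout.",
          input_schema: {
            type: "object",
            properties: { code: { type: "string" } },
            required: ["code"],
          },
        }],
        messages: [{ role: "user", content: "Plot y = sin(x) and describe the result." }],
      }),
    });
    const message = await response.json();
    // If message.stop_reason === "tool_use", run the code from the tool_use block
    // in your own sandbox, then send it back in a tool_result content block.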


Webwright, like some other similar tools, will build and run code in a terminal or container using OpenAI or Anthropic tool calls. Just ask it to do a Mandelbrot set. We're currently working on adding ChromaDB to it for history and abstracting the LLM class to support Ollama.

https://github.com/MittaAI/webwright


Claude was able to read pasted code from one framework, generate a preview, jazz up the design and preview it via artifacts, then translate it back to the original framework!


Has anyone had success applying these yet?

I've found that for anything above hello-world complexity, the snippets produced by these tools take more time and are lower quality than if you just learned the tool yourself (which you can use AI for) and did it yourself.

I did manage to get interesting results when I turned it into a few-shot prompt with a lot more RAG. So maybe more specialized versions of these tools can produce better results than generalist versions.


I managed to build a non-trivial desktop app (a game trainer which hooks into a process and reads/manipulates memory in real time) using the Tauri framework (which is based on Rust), almost entirely with Claude 3.5 Sonnet. And I had zero Rust experience prior to that. There were some issues along the way, but with the help of Claude I was able to solve them.


I'll say I've had success with it, to the extent of what I was trying to solve: https://news.ycombinator.com/item?id=41410867


This is nice, but it's what I would call hello-world complexity. If you do a basic React tutorial or have the AI teach it to you (2 hours), you would be able to make this in under 30 minutes, and most likely it would teach you Tailwind as well.

The advantage is that this lets you see what something would look like in a few minutes without needing to learn the thing (saving you 2.5 hours). The disadvantage is that when you try to add new features or build a new site, you haven't learned anything. The 2.5 hours is an investment in the skill.

Another interesting thing to point out is how the UI pattern is off in your example. Notice how the "selected" category sits in the same tab group as the actual menu items, whereas the normal UI pattern would be to have a cart or something similar on the page. This is the issue I keep running into with these generative UI approaches: this kind of thing happens constantly, and in very strange ways.


> the disadvantage is when you try to add new features or build a new site you haven’t learned anything

I don't have an example to prove you wrong here, but here's my 2 cents:

1. You could probably decompose the app into different parts and then put everything together. Even though my experience with React is nil, I do (think I do) have the expertise and know-how to glue everything together (it is yet to be proven that Claude cannot do this).

2. I obviously don't think it can replace a Sr. Dev (yet), but for somebody with my expertise, this is a very good starting point. It saves you 2.5 hours... Every. Single. Time.

3. This is ground zero (sort of). In the words of the famous Tank from The Matrix... this is loco.


So I have used it in the way you're describing for coding a backend service in Rust. I had never used Rust before and I was just trying to see how easy it would be.

The good part is that you can get a minimum viable version of every feature up every time. The bad part is that when you run into bugs, you can't debug them yourself, because you haven't built up a mental model of how the system works since you're just copying and pasting. The more unique your problem gets (once you're out of the hello-world zone of problems), the worse the AI is at helping you. The first few times you hit a simple bug and the AI figures it out for you, it's pretty cool; but the frequent times the AI gets stuck in a loop, and you spend hours feeding it debug output after debug output while it stays broken, make you hate this workflow.

So yes, it does help you rapidly build features, but at the same time there's an immense amount of additional work needed to maintain and debug the system. Also, the code is generally very poorly structured and architected.

So yeah, impressive stuff at a surface level, and great for speeding up your ability to get started on a task. But it's really bad to rely on (at this point) for anything of greater complexity than the basic quick-start tutorial of your library of choice (in my experience).


> Users on Free and Pro plans can publish and remix Artifacts with the broader community, allowing you to build and iterate on the materials published by other users around the world.

Sounds interesting. How does this work? Is it any good?


One thing I like about Artifacts is that they're code underneath. So, if I wanted to, I could copy and paste the code into my editor and make it portable.


Is the artifact produced always a single file?

I've been enjoying Claude, but as long as the output is a single file instead of a codebase, it's pretty limited and toy-ish.


I've had Claude generate 5 files/artifacts in a single response. I found it pretty impressive.


Claude can generate multiple files in a single response. For example, I asked the model to generate all the files needed to run a website locally for a specific task, and Claude generated the app.py, HTML, JavaScript and whatever else it needed, and it worked on the first try.


No, it can produce plenty of files, and anything on their whitelist of JavaScript libraries can render inline, so you can get fully working React applications rendering in the Artifacts preview.


How do you all feel about GPT-4o vs Claude currently?


I use both Claude and GPT. The ONLY reason I still use (and pay for) GPT is that it has a stand-alone app. Claude's Projects and context size both make GPT a child's toy (IMHO).


What is it about the stand-alone app that you find valuable?

> Both Claude's projects and context size make GPT a child's toy (IMHO)

I've found GPT still seems to give me better responses for some types of things, but it's hard to quantify. I may just be hallucinating the quality difference.


For me it's that the ChatGPT mobile app can hook into the system microphone. I will turn it on, lock my device, then go for a walk/drive and talk to ChatGPT about whatever I'm curious about that day. Its knowledge is encyclopedic, not contemporary, but it's still pretty cool.


I use the mobile app for sure; I've just never personally had a reason to use the native desktop app instead of the web app.


That it's stand-alone. I know you can have a window with just that page and such, but you also get a system-wide shortcut to ask things (that, combined with MacWhisper, is magic), plus the conversational capabilities.


Why not the ChatBox or Msty apps, which are standalone and model-independent?


Never heard of them, will take a look. Thanks!


If Claude had an integrated image editor it'd be easy to drop GPT for me. Love the speed of responses especially, and I'm excited about artifacts after some early playing around.


Ehhh, I'd rather hear that they stopped supporting SB 1047.



