Co-designing a sparse music codec with ChatGPT o3 (akuz.me)
52 points by avaku 15 days ago | 22 comments



I've had similar breakthroughs in the pet-project department lately too.

Normally I'd never write small projects in languages I'm unfamiliar with, or learn how to do Android app development just to fix a minor issue with an open-source app, but pair coding with Claude Code has made these things much more achievable. Vibe coding quickly goes off the rails towards garbage outputs, of course, but I've found that actually reviewing and guiding the quality of the work piece by piece can work with the right effort.

There was a very cool small hardware project posted here some time ago, I forgot what it was, where the creator admitted that he had almost no prior experience doing hardware design/build, but pushed through it solo with LLM assistance far enough to get some pretty impressive outputs.


Yeah. I love the Zed AI assistant for the way it manages context (inline edits in the context of the chat) and how "raw" it is. However, I mostly use GoLand for coding, so I was getting annoyed at having to switch constantly.

So, in a couple of evenings (with a working PoC by the end of evening one) I managed to basically replicate a Zed AI-like integration as a GoLand plugin (much less polished and feature-rich, but covering exactly what I need).

I've never written Kotlin nor a JetBrains plugin before, and since what I wanted is quite complex, it would've easily taken me 1-4 weeks of full-time work otherwise - which is to say I would've never done it. It required a ton of grepping around the intellij-community repo too, to see what is available and what the existing patterns are (all done by the AI, of course).

In this case I vibe coded it (I deem that fine for side projects, not for production work, of course) with Claude Code, and it cost me ~$100 all in all, while I was mostly chilling / gaming. Later stages definitely required me to read the code and understand it, since, as you say, vibe coding falls apart pretty drastically at a certain point. But with a working skeleton (and code I can modify instead of authoring from scratch), I could easily polish it to sufficient stability.

All in all, a pretty magical experience, and I now use my new plugin all day. Not only is it amazing bang for the buck, it also enables me to do side projects that I otherwise wouldn't have had the time or the motivation to put the effort into.


Very similar experiences coding with AI. Getting so much more done and enjoying it more.

But people will still read this and say, nah AI is just hype...


I don't know if it's a skill issue but there seems to be a ceiling to the problem complexity it can handle. It's alright for web but I haven't gotten great results from anything more. It also falls over when you need it done your way, not the model's preferred way (aka whatever is most likely). When you need a UI library besides shadcn. When you're using Svelte over React. Etc.

It would be really constructive if people with other experiences could share their chats so we could see what they're doing differently.


Right now I treat AI as a tool to speed up boilerplate code, or to help me when writing in an unfamiliar language.

In areas of high complexity, I want to write the code myself — even if AI was capable of doing it. Often I don’t just want code, I want the understanding which comes from thoughtfully considering the problem and carefully solving it.

Perhaps one day, I could task an AI with writing an API, and it would be able to not just write the API, but also write a bunch of clients in other languages, and automatically integrate lessons learned writing clients into revisions of the API. Then it could task a bunch of other AI models with writing exploits, and patch the API appropriately. Then integrate any lessons learned when revising the API so that the code is more maintainable.

But what’s the fun in that?


I can't upvote ya enough. There's some kind of Clever Hans thing going on where people aren't honest about how much fudging they have to do, or how brittle the results are.


I made this blog post because I did very little fudging and was shocked by how good the results are. However, this only happened with the latest model (o3). That's why I wanted to describe my approach: first, give the AI the model / architecture description and make sure it understands; then proceed to the code.


Yep, I've heard many people have much stronger results with o3 and am glad it's working for them! I think it would help if you could dig up and publish at least a few of your chats through ideation/architecture, initial implementation, refinement, testing, etc. It might help those of us luddites who can't seem to make even o3 work the way we want it to.

I think until we solve the knowledge supplementation problem (i.e., libraries providing extensive examples and LLMs being capable of ingesting, referencing, and incorporating them quickly) we'll be headed for a tech monoculture: one framework or library for every solution. There's been some talk about this already, with some developers preferring to use older libraries or sticking within the confines of what their LLM of choice was trained on.

Actually, I had a hardware project where I found myself gravitating toward the microcontrollers and sensors ChatGPT was already familiar with. In this case, I cared more about my project working than about actually learning/mastering my understanding of the hardware. There’s still time for that but I’ve been able to get something working quickly rather than letting my notebook of ideas fill with even more pages of guilt and unfinished dreams…


I don't think this is it either. Svelte actually has specially-compiled documentation for LLM ingestion, which I've tried supplying. It's moderately less prone to directly writing React, but it still writes React-isms pretty often. E.g. instead of $effect I'll get useEffect(), or it'll start writing Svelte 4 and the old $: syntax instead of runes.

With the Python backend tests I did, it just barfed out absolute garbage. FastAPI and SQLAlchemy are so common that there's not a great excuse here. I'd tell it which routes I needed written for a given table / set of pre-written models, and I even tested with a very basic users example just to see. No dice; the results were always syntactically valid, but the business logic was egregiously wrong.
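
To make that kind of task concrete, here is a rough sketch of the sort of "basic users example" being described, assuming a pre-written SQLAlchemy 2.0 model and Pydantic v2 schemas (my own hypothetical reconstruction of the setup, not the commenter's actual code or the model's output):

    # Hypothetical sketch of the "routes for a pre-written users model" task
    # described above (SQLAlchemy 2.0 + Pydantic v2 style); not code from the thread.
    from fastapi import Depends, FastAPI, HTTPException
    from pydantic import BaseModel
    from sqlalchemy import String, create_engine
    from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column, sessionmaker

    class Base(DeclarativeBase):
        pass

    class User(Base):                      # the "pre-written model" handed to the LLM
        __tablename__ = "users"
        id: Mapped[int] = mapped_column(primary_key=True)
        email: Mapped[str] = mapped_column(String, unique=True)

    class UserIn(BaseModel):               # request schema
        email: str

    class UserOut(BaseModel):              # response schema
        id: int
        email: str
        model_config = {"from_attributes": True}

    engine = create_engine("sqlite:///./users.db")
    Base.metadata.create_all(engine)
    SessionLocal = sessionmaker(bind=engine)
    app = FastAPI()

    def get_db():
        db = SessionLocal()
        try:
            yield db
        finally:
            db.close()

    @app.post("/users", response_model=UserOut)
    def create_user(payload: UserIn, db: Session = Depends(get_db)):
        user = User(email=payload.email)
        db.add(user)
        db.commit()
        db.refresh(user)
        return user

    @app.get("/users/{user_id}", response_model=UserOut)
    def get_user(user_id: int, db: Session = Depends(get_db)):
        user = db.get(User, user_id)
        if user is None:
            raise HTTPException(status_code=404, detail="User not found")
        return user

Everything here is standard, well-documented boilerplate, which is what makes egregiously wrong business logic on top of it so disappointing.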


There is definitely a ceiling problem. However, I found that with smarter models (such as o3) you can first discuss the mathematical model (or system architecture), so that it knows the context. After that you can ask it to write code, and then it can follow your overall idea very well, even write a working ML algorithm with fitting. That's why I made the blog post: I was able to achieve the next level of collaboration with o3, and it almost shocks me how good this is. Since the blog post, I've made significant advances in other work-related areas (ML algos for finance).


I've been messing with Claude pretty heavily the last week or so and found if I guide it in the right direction it (mostly) does the right thing.

I've been having it reproduce the Lua PEG parser thingie (plus UTF-8 support) [0] from the paper they wrote about it, while also using the 'musttail' interpreter pattern [1], just because that sounded like a good idea. Once I got over the fact that Claude is really bad at debugging, and that DeepSeek, while much better at it, isn't very reliable due to 'server busy' timeouts, things have been going swimmingly. I don't do very much debugging, but once in a while they do get stuck trying the same things over and over, so I have to step in, and every so often one of them goes crazy and comes up with some over-complicated solution where I have to ask, "that's nice and all, but wouldn't it have been easier to just update the index variable instead of rewriting the whole thing?" Claude also seems to like duplicating code instead of generating a helper function, but I haven't had that argument with it yet.

So far we have a VM which passes all the tests (which was a battle), a compiler based on Destination-Driven Code Generation (with additions and subtractions) implementing all the PEG-specific optimizations from the paper (which I haven't even started debugging yet), and the start of a Python C-API module to tie it all together. Admittedly, the Python module is all my doing, because it's easier to use pybindgen than to fight with the robots.
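
For readers who haven't seen the LPeg paper, here is a toy Python sketch of the parsing-machine idea being reimplemented: a small instruction set driven by a backtrack stack. It illustrates the general technique only, with a made-up mini instruction set, and is not the commenter's VM:

    # Toy LPeg-style parsing machine: Char / Choice / Commit / End plus a
    # backtrack stack. Illustrative only; the real machine has more instructions.
    def run(program, subject):
        pc, sp, stack = 0, 0, []          # program counter, subject position, backtrack points
        while True:
            op, arg = program[pc]
            if op == "Char":              # match one literal character or fail
                if sp < len(subject) and subject[sp] == arg:
                    pc, sp = pc + 1, sp + 1
                else:
                    op = "Fail"
            if op == "Choice":            # remember where to resume if this branch fails
                stack.append((pc + arg, sp))
                pc += 1
            elif op == "Commit":          # branch succeeded: drop backtrack point, jump ahead
                stack.pop()
                pc += arg
            elif op == "End":             # whole pattern matched
                return sp
            elif op == "Fail":            # backtrack, or report overall failure
                if not stack:
                    return None
                pc, sp = stack.pop()

    # The pattern ("ab" / "ac") as a program: try "ab", fall back to "ac" on failure.
    prog = [
        ("Choice", 4),   # on failure, jump to the "ac" branch
        ("Char", "a"),
        ("Char", "b"),
        ("Commit", 3),   # skip over the "ac" branch
        ("Char", "a"),
        ("Char", "c"),
        ("End", None),
    ]
    print(run(prog, "ac"))  # -> 2 (matched two characters)
    print(run(prog, "xy"))  # -> None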

So, yeah, I'm just having fun getting the robots to write code I'm too lazy to write myself, on a subject which has interested me for at least a decade. Once I get this working I plan on seeing how well they do with Copy-and-Patch Compilation [2], but I don't really have high hopes for that.

[0] https://www.inf.puc-rio.br/~roberto/lpeg/ [1] https://blog.reverberate.org/2021/04/21/musttail-efficient-i... [2] https://arxiv.org/abs/2011.13127


That's interesting. Are you primarily a web guy? I'm not and can't get it to do shit with web based stuff because I have no idea what I'm doing and have no idea when I've gone down the wrong rabbit hole. I can get a ton done with Python type desktop apps though, because I already know what I want and how it should generally be designed.


I am not, no, but have been working on some web stuff recently. It's not that good at web either, it's just less bad than other stuff. If you want a shadcn reactslop landing page that looks vaguely like all the other landing pages on the web, I guess it does that.

It's okay at backend Python if I'm very careful about typing, pydantic, etc. But my hope was that the "cutting-edge" models would be able to, e.g., implement a repository and route given the models and schemata; no such luck. They just aren't great yet, at least not in my testing, hence the hope that someone can share actual usage showing where and how they are, rather than just saying so repeatedly.


For sure programming with AI is a game-changer, and o3 in particular is really quite good at maths.

I've tried getting ChatGPT to write blog posts for me as well, but it seems to struggle with knowing which things are important and which things aren't. You need to apply a lot of editorial control.

For example:

> We swapped 3 × 3 windows for 5 × 5, removed global gains then re-introduced per-occurrence magnitudes, and replaced hard clamping with bilinear interpolation so gradients would flow

What does this stuff mean? Why mention these very specific details in the middle of a very high-level overview?

Obviously ChatGPT said this because there was a part in your conversation where you decided to make these changes, but without any of the context behind that it is meaningless to readers.


> Obviously ChatGPT said this because there was a part in your conversation where you decided to make these changes, but without any of the context behind that it is meaningless to readers.

Have you considered you might not be the target audience the author wrote this for?


Do AI commentators only expect to be read by AI fans?


I am an AI fan, I just thought the random specific irrelevant details sprinkled in the middle were out of place in such a high level overview.

One technique I use is to paste a draft into a new conversation and let ChatGPT work on it some more, without the context from the earlier chat. You can do this several times.


You're right, your comment is valid. I will make better blog posts in the future :)


I thought that the full details might not be interesting for people, since the algorithm is just a mock-up of the idea, but one which already works! If I wrote the whole post with all the details, it would be very long. I can see how it's a bit out of context, because other parts of the algorithm are not described. But at least I uploaded it to GitHub :)


Mystifying work


I am sorry for not making a long blog post with all the details; I just wanted to highlight how the cooperation with AI went to the next level (model design, then implementation). At least I posted the code :) It's just a mock-up, and the model will evolve, so its architecture is not the important part.




