Hacker News

I haven’t had any luck prompting LLMs to “have taste.” They seem to over-fixate on instructions (e.g. golfing when asked for concise code) or require specifying so many details and qualifications that the results no longer generalize well to other problems.

Do you have any examples or resources that worked well for you?



Yeah, prompting doesn't work for this problem, because the entire point of an LLM is that you give it the what and it outputs the how. The more "how" you have to condition it with in the prompt, the less profitable the interaction will be. A few hints are OK, but doing all the work for the LLM tends to lead to negative productivity.

Writing prompts and writing code take about the same amount of time for the same amount of text, plus there's the extra time the LLM takes to accomplish the task, and review time afterwards. So you might as well just write the code yourself if you have to specify every tiny implementation detail in the prompt.


Makes me think of this commitstrip comic: https://i.xkqr.org/itscalledcode.jpg (mirrored from the original due to TLS issues with the original domain.)

A guy with a mug comes up to a person standing with their laptop on a small table. The mug guy says, "Some day we won't even need coders any more. We'll be able to just write the specification and the program will write itself."

Guy with laptop looks up. "Oh, wow, you're right! We'll be able to write a comprehensive and precise spec and bam, we won't need programmers any more!"

Guy with mug takes a sip. "Exactly!"

Guy with laptop says, "And do you know the industry term for a project specification that is comprehensive and precise enough to generate a program?"

"Uh... no..."

"Code. It's called code."


You know, this makes me wonder... is anybody actually prompting LLMs with pseudocode rather than an English specification? Could doing so result in code that's more true to the original pseudocode?


You can give the macro-structure using stubs then ask the LLM to fill in the blanks.

The problem is that it doesn't work too well for the meso-structure.

Models tend to be quite good at the micro-structure because they've seen a lot of it already, and the macro-structure can easily be prompted, but the levels in between are what distinguish a good vs. bad model (or human!).
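As a concrete illustration of the stub approach (the function names and task here are hypothetical, not from the thread): you hand the model a skeleton where the macro-structure is fixed and only the bodies are left to fill in.

```python
# Hypothetical skeleton handed to an LLM: the macro-structure (functions,
# signatures, orchestration) is pinned down; the bodies are the blanks.

def load_records(path: str) -> list[dict]:
    """Parse one JSON object per line from the file at `path`."""
    raise NotImplementedError  # blank for the LLM to fill in

def dedupe(records: list[dict], key: str = "id") -> list[dict]:
    """Keep the first record seen for each value of `key`."""
    raise NotImplementedError  # blank for the LLM to fill in

def main(path: str) -> None:
    # Orchestration only: the pipeline shape is the part you dictate.
    records = load_records(path)
    print(f"{len(dedupe(records))} unique records")
```

The model then has little room to reinvent the macro-structure, but everything in between the signature and the body, the meso-structure, is still up to it.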


I’m not sure if it went anywhere but I remember there was this attempt at one point called Sudolang:

https://medium.com/javascript-scene/sudolang-a-powerful-pseu...


Goodhart's Law of Specification: When a spec reaches a state where it's comprehensive and precise enough to generate code, it has fallen out of alignment with the original intent.

Of course there are some systems where correctness is vital, and for those I'd like a precise spec and proof of correctness. But I think there's a huge bulk of code where formal specification impedes what should be a process of learning and adapting.


My dream antiprogram is a specification compiler that interprets any natural language and compiles it to a strict specification. But on any possible ambiguity it gives an error.

    ?
This terse error was found to be necessary so as not to overwhelm the user with pages and pages of decision trees enumerating the ambiguities.


Openspec does this. But instead of "?" it has a separate Open Questions section in the design document. In codex cli, if you first go in plan mode it will ask you open questions before it proceeds with the rest.

The UX is there, and for small things it does work for me, but LLMs still fall short of truly capturing the major issues.


Bless our interesting times.


the goal would be to write a reusable prompt. this is what AGENT.md is for.


> the entire point of an LLM is you give it the what and it outputs the how

I'm still struggling to move past the magic trick of guessing which characters come next, to ascribing to it an actual understanding of the "how".


> Do you have any examples or resources that worked well for you?

Using this particular example, if you simply paste the exact code into the prompt, the model should be able to reproduce it. Now you can start removing bits and see how much you can remove from the prompt, e.g. simplify it to pseudocode, etc. Then you can push it further and try to switch from the pseudocode to the architecture, etc.

That way, you'll start from something that's working and work backwards rather than trying to get there in the absence of a clear path.


That’s an interesting approach, but what do you learn from it that is applicable to the next task? Do you find that this eventually boils down to heuristics that generalize to any task? It sounds like it would only work because you already put a lot of effort into understanding the constraints of the specific problem in detail.


What worked for me was Gemini 3 Pro (I guess 3.1 should work even better now) with the prompt "This code is unnecessarily complicated. Simplify it, but no code golf". This decreased code size by about 60%. It still did a bit of code-golfing, but it was manageable.

It is important to start a new chat so the model is not stuck in its previous mindset, and it is beneficial to have tests to verify that the simplified code still works as it did before.

Telling the model to generate concise code did not work for me, because LLMs do not know beforehand what they are going to write, so they are rarely able to refactor existing code to break out common functionality into reusable functions. We might get there eventually. Thinking models are a bit better at it. But we are not quite there yet.


I wonder if it helps at all to first tell the agent to write the APIs/function signatures, then second tell the agent to implement them.


I have a stupid solution for this which is working wonders. It does not help to tell the LLM "don't do this pattern". I literally make it write a regex based test which looks for that pattern and fails the test.

For example, I am developing a game using GDScript, and LLMs (including Codex and Claude) keep making scripts with no classnames and then loading them with @preload. I hate this, and it's explicitly mentioned in my godot-development skill. What agents can't stand is a failing test. It feels a bit like enforcing rules automatically.

This is a stupid idea but it works wonders on giving taste to my LLM. I wonder if I should open source that test suite for other agentic developers.


I really should spend some time analyzing what I do to get the good output I get..

One thing that is fairly low effort that you could try is find code you really like and ask the model to list the adjectives and attributes that that code exhibits. Then try them in a prompt.

With LLMs generally you want to adjust the behavior at the macro level by setting things like beliefs and values, vs at the micro level by making "rules".

By understanding how the model maps the aspects that you like about the code to language, that should give you some shorthand phrases that give you a lot of behavioral leverage.

Edit: Better yet.. give a fresh context window the "before" and "after" and have it provide you with contrasting values, adjectives, etc.


Concise isn't specific enough: I've primed mine on basic architecture I want: imperative shell/functional core, don't mix abstraction levels in one function, each function should be simple to read top-to-bottom with higher level code doing only orchestration with no control flow. Names should express business intent. Prefer functions over methods where possible. Use types to make illegal states unrepresentable. RAII. etc.
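One of those principles, "use types to make illegal states unrepresentable", can be sketched in a few lines of Python (the order/shipping domain here is a made-up example): instead of one record with optional fields, where a cancelled order with a tracking number is representable, each state gets its own type.

```python
from dataclasses import dataclass
from typing import Union

# Each state carries only the fields that are valid in that state,
# so "cancelled order with a tracking number" simply cannot be built.

@dataclass(frozen=True)
class Pending:
    order_id: int

@dataclass(frozen=True)
class Shipped:
    order_id: int
    tracking_number: str  # only shipped orders have one

@dataclass(frozen=True)
class Cancelled:
    order_id: int
    reason: str

Order = Union[Pending, Shipped, Cancelled]

def describe(order: Order) -> str:
    if isinstance(order, Shipped):
        return f"order {order.order_id} shipped: {order.tracking_number}"
    if isinstance(order, Cancelled):
        return f"order {order.order_id} cancelled: {order.reason}"
    return f"order {order.order_id} pending"
```

Spelling the principle out as code like this, rather than as an adjective like "concise", gives the model something checkable to imitate.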

You need to think about what "good taste" is to you (or find others who have already written about software architecture and take their ideas that you like). People disagree on what that even means (e.g. some people love Rails; to me a lot of it seems like the exact opposite of "good taste").


I spend much more time refactoring than creating features (though it is getting better with each model). My go-to approach is to use Claude Code Opus 4.6 for writing and Gemini 3.1 Pro for cleaning up. I feel that doing it in just one stage is rarely enough.

A lot of prompts about finding the right level of abstraction, DRY, etc.

An earlier example (Opus 4.5 + Gemini 3 Pro) is here: https://github.com/stared/sc2-balance-timeline

I also tried just using Gemini 3 Pro (maybe it's the model, maybe the harness); it was not nearly as good at writing, but way better at refining.


I actually don’t think golfing is such a bad thing. Granted, it will first handle the low-hanging fruit like variable names, but if you push it hard enough it will be forced to think of a simpler approach. Then you can take a step back and tell it to fix the variable names, formatting, etc. With the caveat that a smaller AST doesn’t necessarily mean simpler code, but it’s a decent heuristic.


Have you tried meta-prompts e.g. "Rewrite the prompt to improve the perceived taste and expertise of the author"



