Hacker News

Tried it! I really like the idea, but I think the clue generation could use some work. Every clue ended in "in games", and honestly most of them were not really game related to start with. For example the clue "Place in games where characters go to rest and replenish health or mana" had the solution "bar"... which I wouldn't describe as right. Similarly "The name of a popular character who may need rescuing in some games" was "Emily".

I think it might be worth working on prompting to make sure the answer is a unique solution to the hint (or at least closer to unique). What model are you using here?
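One way to approximate the uniqueness check this comment suggests is a round-trip test: ask a model to solve its own clue and keep the clue only if the intended answer comes back consistently. A minimal sketch, assuming a pluggable `ask_model` callable (a stand-in for any LLM call, not part of the actual app):

```python
def clue_is_unambiguous(clue, answer, ask_model, attempts=3):
    """Round-trip check: a clue passes only if the model, given just
    the clue and the answer length, reproduces the intended answer
    on every attempt."""
    prompt = (
        f"Crossword clue: {clue}\n"
        f"The answer has {len(answer)} letters. "
        "Reply with the single most likely answer, nothing else."
    )
    for _ in range(attempts):
        guess = ask_model(prompt).strip().lower()
        if guess != answer.lower():
            return False
    return True

# Usage with a stub model that always answers "Inn":
stub = lambda prompt: "Inn"
print(clue_is_unambiguous("Place where adventurers rest", "inn", stub))  # True
print(clue_is_unambiguous("Place where adventurers rest", "bar", stub))  # False
```

This would have flagged the "bar" clue above, since most solvers (and most models) would answer "inn" instead.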




Haha, for those two you mentioned I assumed it was 'Inn' and 'Zelda'. I don't even know who Emily is.

And googling 'Emily video game character' didn't bring up any noticeably popular video game characters.


I put "hub" and "Peach"... There seem to be a lot of possible answers.


Emily is the name of characters who are rescued in BioShock Infinite as well as the first Dishonored game. Both games are around a decade old but were popular at the time.


Isn't the character in Bioshock Infinite 'Elizabeth'? I'd also assert that by design, Elizabeth was meant to be a character that arguably didn't really need to be rescued, "she can take care of herself".

The only name I could think of in 5 letters that fit here was actually "Peach".


Thank you for trying. You are absolutely right. In my defense, I released the very first 20 puzzles without proofreading them. I just wanted to see what the AI could deliver as a starting point and get an idea of whether crossword players would like or hate the general concept. I've just started seriously playing it myself, and some clues are indeed strange. The clues were generated by gpt-4o. You can try later puzzles; I made some prompt adjustments after I noticed the forced "in games" around puzzle 10 or so, and it gets a bit better from there. Thank you for the feedback.


Really goes to show how bad top-tier LLMs are at rather basic tasks like creating a clue from popular media. This should be among the core competencies of major models, given the tons of available training data and the simplicity of the summarization request.


LLM hallucinations are real. Admittedly, for the prompt I provided just the word and some basic rules. It should be possible to increase the quality if, e.g., I provide the LLM with the sentence in which the word was used. Nevertheless, hallucinations will always be a problem; I think there is no way around a human quality gate in the process?
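The context idea mentioned here could be sketched as a simple prompt builder that grounds the model in the sentence the word was drawn from rather than the bare word. This is a hypothetical sketch, not the app's actual prompt:

```python
def build_clue_prompt(word, context_sentence):
    """Build a clue-generation prompt that includes the sentence the
    word was drawn from, so the model clues the intended sense of the
    word instead of hallucinating a loose association."""
    return (
        "Write a crossword clue for the word below.\n"
        "Rules: do not use the word itself, keep the clue under 12 words, "
        "and make sure it points to this word rather than a synonym.\n\n"
        f"Word: {word}\n"
        f"Sentence it was used in: {context_sentence}"
    )

prompt = build_clue_prompt(
    "inn", "The party stopped at the inn to rest and restock supplies."
)
```

Grounding in the source sentence narrows the model's options (e.g. "inn" as a resting place, not a pub), though it does not remove the need for the human review step mentioned above.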


No, it shows how important good prompting techniques are.



