I'm certain models like o3-mini are capable of writing Prolog of this quality for puzzles they haven't seen before - it feels like a very straightforward conversion operation for them.
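For a sense of what that conversion looks like, here's a minimal sketch in SWI-Prolog; the three-clue puzzle and the names in it are invented for illustration:

    % Who owns which pet? Generate-and-test over permutations.
    solve(Pairs) :-
        Pairs = [alice-A, bob-B, carol-C],
        permutation([cat, dog, fish], [A, B, C]),
        A \= dog,   % clue 1: Alice does not own the dog
        C = fish.   % clue 2: Carol owns the fish

    ?- solve(P).
    P = [alice-cat, bob-dog, carol-fish].

The model's only job is the natural-language-to-clause translation; the exhaustive search is done by the Prolog engine.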
My comment got eaten by HN, but I think LLMs should be used as the glue between logic systems like Prolog, with inductive, deductive, and abductive reasoning handed off to a tool. LLMs are great at pattern matching, but forcing them to reason seems like an out-of-envelope use.
Prolog is how I would solve puzzles like that as well. Criticizing an LLM for using it is like calling someone weak for using a spreadsheet or a calculator.
Coincidentally, I tried this just yesterday on variants of the "surgeon can't operate on the boy" puzzle. It didn't help; LLMs still can't reliably solve it.
(All current commercial LLMs are badly overfit on this puzzle, so if you change parts of it they get stuck and try to give the original answer in ways that don't make sense.)