If it can frame the question for the tool, then it has the logic (whether that comes from static recall or deduction).
LLMs struggle with simple maths by the nature of their architecture, not due to a lack of logic. Yes, they struggle with logic questions too, but that's not directly relevant here.
Most of the failures on these simple logic questions come from an inability to simply copy data accurately.
Logic is too abstract to be measured directly, but this single benchmark shows something getting in its way.
I have another benchmark showing that LLMs make basic mistakes that could easily be avoided with minimal logic and observation.
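To make the tokenization point concrete, here's a minimal sketch (assuming OpenAI's tiktoken library and its cl100k_base vocabulary; the exact splits depend on the tokenizer):

```python
# Sketch: a BPE tokenizer hands the model multi-character chunks,
# not individual digits or letters, so "copying" a number means
# reproducing an arbitrary chunking rather than going digit by digit.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
for text in ["1234567", "strawberry"]:
    pieces = [enc.decode_single_token_bytes(t).decode() for t in enc.encode(text)]
    print(f"{text!r} -> {pieces}")

# Typical output (splits vary by vocabulary):
# '1234567'    -> ['123', '456', '7']
# 'strawberry' -> ['str', 'aw', 'berry']
```

The model never sees "1234567" as seven digits; it sees a few opaque chunks, which is the claimed architectural hurdle.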
> LLMs struggle with simple maths by the nature of their architecture, not due to a lack of logic.
No. If it were good at logic it would have overcome that tiny architectural hurdle; converting tokens to numbers is such a trivial process that it's ridiculous to suggest that's the reason it fails at math.
The reason it fails at math is that it fails at logic, and math is the most direct system of logic we have. It doesn't fail at converting between formats: it can convert "strawberry" to the correct Base64 encoding, meaning it knows exactly which letters are there; it just lacks the logic to actually understand what "count the letters" means.
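For what it's worth, the two operations being compared here are mechanically equivalent once you have the characters. A minimal sketch in plain Python (nothing LLM-specific, just illustrating the claim):

```python
# Sketch: Base64-encoding a word and counting a letter in it both
# require exactly one capability: access to the individual characters.
import base64

word = "strawberry"
print(base64.b64encode(word.encode()).decode())  # c3RyYXdiZXJyeQ==
print(word.count("r"))                           # 3
```

So if the model can produce the first line correctly, the character-level information is available to it, and only the counting step is left.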