
I am puzzled by the fact that modern LLMs don't do multiplication the same way humans do it, i.e. digit by digit. Surely they can write an algorithm for that, but why can't they perform it?
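To be concrete, here is a quick Python sketch of the digit-by-digit schoolbook algorithm I have in mind (my own toy code, not anything from a model):

    def long_multiply(a: str, b: str) -> str:
        # Schoolbook long multiplication on decimal strings, digit by digit.
        da = [int(d) for d in reversed(a)]
        db = [int(d) for d in reversed(b)]
        result = [0] * (len(da) + len(db))
        # Multiply every pair of digits and accumulate in the right column.
        for i, x in enumerate(da):
            for j, y in enumerate(db):
                result[i + j] += x * y
        # Carry pass, exactly like carrying by hand.
        carry = 0
        for k in range(len(result)):
            carry, result[k] = divmod(result[k] + carry, 10)
        return "".join(map(str, reversed(result))).lstrip("0") or "0"

    assert long_multiply("1234", "5678") == str(1234 * 5678)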



Two reasons:

1. LLMs "think" in terms of tokens, which are usually around 4 characters each. While humans only have to memorize a 10x10 multiplication table to multiply large numbers, LLMs would have to memorize something like a 10000x10000 table, which is much more difficult. (See the tokenizer sketch after this list.)

2. LLMs can't "think in their head", so you have to make them spell out each step of the multiplication, just like (most) humans can't multiply huge numbers without intermediate steps.
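To make reason 1 concrete, here is a quick sketch, assuming the tiktoken library and its cl100k_base encoding (exact token boundaries vary by tokenizer):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    for tok in enc.encode("123456789 * 987654321"):
        print(tok, enc.decode_single_token_bytes(tok))
    # Long digit strings typically come out as multi-digit chunks rather than
    # single digits, so the model never directly sees the digits that a
    # schoolbook algorithm would operate on.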

A simple way to demonstrate this is to ask an LLM for the birth year of a celebrity and then whether that number is even or odd. The answer will be correct almost every time. But if you ask whether the birth year of a celebrity is even or odd and forbid spelling out the year, the accuracy will be barely above 50%.
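Here is a rough sketch of that probe, assuming the openai Python client (the model name and prompts are just placeholders):

    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Variant 1: the model is free to state the year before judging parity.
    print(ask("What year was Albert Einstein born? Is that year even or odd?"))

    # Variant 2: parity has to be judged without writing the year down.
    print(ask("Without stating the year, answer with one word: is Albert "
              "Einstein's birth year even or odd?"))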


Can't we always tokenize numbers as single digits and give the LLM a <thinking> scratchpad invisible to the user?
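Something like this toy preprocessing step would do the first part (my own illustration; real tokenizers achieve the same effect in their pre-tokenization rules, and some already split digits individually):

    import re

    def split_digit_runs(text: str) -> str:
        # "123456" -> "1 2 3 4 5 6", so a subword tokenizer ends up with one
        # digit per token.
        return re.sub(r"\d+", lambda m: " ".join(m.group(0)), text)

    print(split_digit_runs("compute 123456 * 789"))
    # compute 1 2 3 4 5 6 * 7 8 9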


Yes, but we could also give the LLM access to a Python interpreter and solve a much larger class of problems with correctness guarantees and around a billion times less compute.
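A minimal sketch of that route (names are illustrative, not any particular framework's API): expose a calculator tool and let the model hand the arithmetic off to the interpreter.

    def multiply_tool(a: int, b: int) -> int:
        # Exact integer arithmetic, delegated to Python.
        return a * b

    # In a tool-calling setup the model emits something like
    #   {"tool": "multiply_tool", "args": {"a": 123456789, "b": 987654321}}
    # and the runtime returns the exact product:
    print(multiply_tool(123456789, 987654321))  # 121932631112635269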


Are there many written-out examples of people talking through that algorithm step by step? I’d guess not.



