
>Does the n-gram model really need all those parameters to mimic GPT-4? Yes, it does.

I don't understand what this argument is supposed to demonstrate. Obviously you can compress the 8000-gram model that GPT-4 represents - GPT-4's weights are proof!
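To put rough numbers on the scale gap (these figures are assumed for illustration: a ~100k-token vocabulary and an 8000-token context):

    import math

    vocab_size = 100_000   # assumed vocabulary size, for illustration
    context_len = 8_000    # the "8000-gram" context window

    # An explicit n-gram table needs one row per possible context,
    # i.e. vocab_size ** context_len rows. That number won't fit in
    # a float, so compute its log10 instead:
    log10_rows = context_len * math.log10(vocab_size)
    print(f"explicit table: ~10^{log10_rows:.0f} rows")   # ~10^40000

    # versus a parametric model with on the order of 10^12 weights.

The gap is tens of thousands of orders of magnitude, which is exactly why the existence of GPT-4's weights demonstrates the table is compressible.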



That's right, but once you've done that compression, it isn't an n-gram anymore. What I'm trying to get across is that you could model GPT-4 as an equivalent 8000-gram in an abstract, input-output sense, but that's a poor mental picture of how it actually works. Internally, GPT-4 is no more an 8000-gram than Stockfish is a giant lookup table of chess positions. GPT-4 is learning something like RASP programs (RASP is a programming language whose primitives map onto transformer operations), not raw statistical text correlations.
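Here's a toy way to see the lookup-table-vs-program distinction, written in plain Python rather than RASP. It's a minimal sketch on a simple "predict what followed this token last time" task, not a claim about GPT-4's actual internals:

    def as_lookup_table(corpus, n):
        # n-gram style: memorize every observed context -> next token.
        # The table grows with every distinct context in the data.
        table = {}
        for i in range(len(corpus) - n):
            table[tuple(corpus[i:i + n])] = corpus[i + n]
        return table

    def as_program(context):
        # Program style, roughly what an "induction head" computes:
        # find the previous occurrence of the current token and
        # predict whatever followed it. Constant description size,
        # works for any context length.
        last = context[-1]
        for i in range(len(context) - 2, -1, -1):
            if context[i] == last:
                return context[i + 1]
        return None

On repetitive text the two agree on their predictions, but the program's size doesn't depend on how much data it has seen. That's the sense in which "learned program" and "n-gram" are different mental models even when the input-output behavior matches.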


Does ChatGPT really represent an 8000-gram model? I thought the claim was that it just predicts the next word!



