How does this square up with literally what Terence Tao (TFA) writes about O1? I...

sebzim4500 · 2024-09-14T23:15:26 1726355726

o1-preview is still quite a specialized model, and you can come up with very easy questions that it fails embarrassingly, despite its success on seemingly much more difficult tests like olympiad programming/maths questions.

You certainly shouldn't think of it like having access to a graduate student whenever you want, although hopefully that's coming.