Nothing prevents it, no. Transformers are certainly capable of learning mathematical tasks; consider [1] as an example, which uses big but regular token lengths.

Alternatively, you could just scale until the problem solves itself.

[1] https://arxiv.org/abs/2201.04600


