
COBOL is the perfect language for LLMs because it looks just like the English text they were trained on to begin with.


It's quite the contrary: the less interpretive the language, the better. And no, LLMs were not trained on English to begin with, and they don't perform best in English.


Please expand on the idea that LLMs are not trained on English to begin with. Not sure what you mean by this, since many LLMs are clearly trained on data that contains a lot of English. GPT-1, for instance, seems to have been trained on a purely English corpus.


Interesting, where do you think LLMs perform the best?


SVGs that represent birds on foot-powered conveyance devices.


According to some studies, Polish is the top-performing language, while English isn't near the top.


That’s not how it works. Being trained on a ton of human text doesn’t mean you can complete the next token for a program that needs to be logically coherent.

Imagine all your training data is Reddit threads and I ask you what follows “goto”. How would Reddit help you?

The opposite is likely true: there isn’t a ton of publicly available COBOL code compared to, e.g., React, so an LLM will degrade.


The required context window grows, though.



