
I would expect them to use small sizes for almost all the testing.



Yes. There _is_ a need to train LLMs more than once, and training is prohibitively expensive, so you need workarounds such as training on a small subset of the data or training a smaller version of the model. We're not yet at the point where a CS student on consumer hardware could afford to do this kind of research.
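
For concreteness, here's a minimal sketch of the "smaller version of the model" workaround: a toy transformer trained on a toy data subset, so the whole loop runs in minutes on one consumer GPU or even a CPU. Everything here (the TinyLM class, the dimensions, the synthetic data) is an illustrative assumption, not from any particular codebase.

    import torch
    import torch.nn as nn

    class TinyLM(nn.Module):
        def __init__(self, vocab_size=1000, d_model=64, n_layers=2, n_heads=4):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                               dim_feedforward=4 * d_model,
                                               batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
            self.head = nn.Linear(d_model, vocab_size)

        def forward(self, x):
            # causal mask so each position only attends to earlier tokens
            n = x.size(1)
            mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
            return self.head(self.blocks(self.embed(x), mask=mask))

    # stand-in for "a small subset of data": a few thousand random sequences
    data = torch.randint(0, 1000, (2048, 33))

    model = TinyLM()
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss()

    for step in range(200):
        batch = data[torch.randint(0, len(data), (32,))]
        inputs, targets = batch[:, :-1], batch[:, 1:]   # next-token prediction
        logits = model(inputs)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

The point isn't the architecture; it's that you can debug data pipelines, loss curves, and training code at this scale before committing to an expensive full-size run.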


> We're not yet at the point where a CS student on consumer hardware could afford to do this kind of research.

Okay. But I was saying someone with millions of dollars to spend could do it. And then another poster was arguing that millions of dollars was not enough to be viable because you need lots of repeated runs.

Nobody was saying a student could train one of these models from scratch. The cool potential is for a student to run one, and maybe fine-tune it.
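
As a rough sketch of what that student-scale fine-tune can look like: load a small pretrained checkpoint, freeze most of its weights, and train only the final block on a handful of examples. The choice of gpt2 and of which layers to unfreeze are just assumptions for illustration.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    tok.pad_token = tok.eos_token          # gpt2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # freeze everything, then unfreeze only the final transformer block
    for p in model.parameters():
        p.requires_grad = False
    for p in model.transformer.h[-1].parameters():
        p.requires_grad = True

    opt = torch.optim.AdamW(
        [p for p in model.parameters() if p.requires_grad], lr=5e-5)

    # a stand-in fine-tuning "dataset" of two sentences
    texts = ["an example sentence in the target style.",
             "another example sentence in the target style."]
    batch = tok(texts, return_tensors="pt", padding=True)

    model.train()
    for step in range(10):
        out = model(input_ids=batch["input_ids"],
                    attention_mask=batch["attention_mask"],
                    labels=batch["input_ids"])   # causal-LM loss computed internally
        out.loss.backward()
        opt.step()
        opt.zero_grad()

Something in this spirit fits comfortably on a single consumer GPU; it's the from-scratch pretraining runs, repeated many times, that stay out of reach.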


Here is the upthread comment I was responding to:

> Why would you want to retrain it from scratch every day?

I was explaining why someone might want to retrain it more than once (although not literally every day).



