Hacker News

dartos on March 11, 2024 | on: Show HN: LlamaGym – fine-tune LLM agents with onli...

Very expensive.

AFAIK the model can’t be quantized during backprop, so the weights have to stay at higher precision, and right there you’d need a ton of RAM.

Backprop is faster because it can be parallelized, but IIRC you need to hold an entire copy of the model for each backprop process.
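
To put rough numbers on that, here is a back-of-the-envelope sketch, not a measurement. It assumes mixed-precision training with Adam at about 16 bytes per parameter (2 for fp16 weights, 2 for fp16 gradients, 12 for the fp32 master weights and the two Adam moments), ignores activations entirely, and assumes plain data parallelism where every process keeps a full replica. The function names and the 7B parameter count are made up for illustration.

```python
def training_memory_gib(n_params, weight_bytes=2, grad_bytes=2,
                        optimizer_bytes=12, replicas=1):
    """Rough memory for weights, gradients, and optimizer state.

    Assumed defaults: fp16 weights and gradients, plus fp32 master weights
    and two fp32 Adam moments (4 + 4 + 4 = 12 bytes). Activations are ignored.
    """
    per_replica_bytes = n_params * (weight_bytes + grad_bytes + optimizer_bytes)
    # Plain data parallelism: each process holds its own full copy.
    return replicas * per_replica_bytes / 2**30


def quantized_inference_memory_gib(n_params, bits=4):
    """Weights-only footprint of a quantized model used for inference."""
    return n_params * bits / 8 / 2**30


if __name__ == "__main__":
    n = 7_000_000_000  # hypothetical 7B-parameter model
    print(f"4-bit inference:      {quantized_inference_memory_gib(n):.1f} GiB")
    print(f"training, 1 replica:  {training_memory_gib(n):.1f} GiB")
    print(f"training, 4 replicas: {training_memory_gib(n, replicas=4):.1f} GiB")
```

Under those assumptions the gap between quantized inference and full training is well over an order of magnitude, before activations are counted.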

scribu on March 11, 2024

Actually, there have been attempts to do quantized backprop, but I'm not sure how successful they have been.
