
That's awesome! Are people thinking about training it for more than just 1 epoch? I believe Galactica showed that training for even 4 epochs is ok. Also, how amazing would it be if the next gen of open-source LLMs increased the context window, like adding 8k more tokens? That's probably expensive, but totally doable.


The issue with longer contexts is that they drive up inference memory usage.
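
For a rough sense of scale, here's a back-of-the-envelope sketch of how the attention KV cache grows with context length. It assumes a LLaMA-7B-like layout (32 layers, 32 heads, head dim 128) stored in fp16 with no cache quantization or attention tricks; those numbers are illustrative, not tied to any specific deployment.

    # Rough KV-cache size per sequence as a function of context length.
    # Assumed dims: 32 layers, 32 heads, head dim 128, fp16 (2 bytes/elem).
    def kv_cache_bytes(context_len: int,
                       n_layers: int = 32,
                       n_heads: int = 32,
                       head_dim: int = 128,
                       bytes_per_elem: int = 2) -> int:
        # 2x for the key and value tensors kept at every layer.
        return 2 * n_layers * n_heads * head_dim * bytes_per_elem * context_len

    for ctx in (2_048, 8_192, 32_768, 1_000_000):
        gib = kv_cache_bytes(ctx) / 2**30
        print(f"{ctx:>9} tokens -> ~{gib:.1f} GiB of KV cache")

Under those assumptions you get roughly 1 GiB at 2k tokens, 16 GiB at 32k, and hundreds of GiB at a million tokens, which is why naive long contexts blow up memory so quickly.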


Once this barrier is broken down we'll see a lot of cool things. 32k on GPT-4 is already pretty cool, but once we get into hundreds of thousands or millions of tokens of context we'll be able to easily do things that are currently achievable only with fine-tuning and "memory" tricks: assistants that remember everything you've ever told them, detailed questions over large datasets, even complex systems bootstrapped entirely from the context.


It includes Common Crawl data 4 or 5 times; does that count?



