It's inaccurate to attribute all of these use cases to "LLMs" in general when currently only 3 or 4 of the best models can do all of them well, especially the use cases that involve writing code or highly technical instructions. It's OpenAI's models plus maybe one from another group, and even that one just barely.
Are there any models aside from OpenAI’s that can handle large prompts with task breakdowns? I haven’t tried the Anthropic stuff, but every flavor of LLaMA and the other open source models I've tried does not seem capable of this.
That’s true, I’ve only run models up to 30B. My understanding is that they’re limited to a 2048-token context window by how they were trained, and tools like llama.cpp allow an even smaller input context. You can quickly run over that limit if you’re doing things like appending a result set to a complex prompt. But if others have working examples of using LLaMA models with large prompts, I’d be interested to see them.
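For a rough sense of how fast that happens, here's a back-of-the-envelope sketch. The 4-characters-per-token ratio is just a common rule of thumb for English text, and the prompt and result set contents are made up for illustration; a real count would need the model's actual tokenizer:

```python
# Rough estimate of how quickly an appended result set eats a
# 2048-token context window. NOT a real tokenizer count -- the
# 4-chars-per-token ratio is only a rule-of-thumb approximation.
CONTEXT_LIMIT = 2048
CHARS_PER_TOKEN = 4  # rough heuristic for English text

def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

# Hypothetical complex prompt with a task breakdown (~20 instruction lines).
prompt = "You are a helpful assistant. Break the task below into steps...\n" * 20

# Hypothetical 500-row result set appended to the prompt.
result_set = "\n".join(
    f"row {i}: col_a=..., col_b=..., col_c=..." for i in range(500)
)

used = estimate_tokens(prompt + result_set)
print(f"~{used} tokens of a {CONTEXT_LIMIT}-token window")
# Even this modest result set pushes well past the 2048-token budget.
```

The point being: a few hundred rows of tabular output alone can exceed the whole window before the model has generated a single token of response.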