1. Learning curve: Just like any skill there is a learning curve on how to get high quality output from an LLM.
2. The change in capabilities since recent papers were authored. I started intensively using the agentic coding tools in May. I had dabbled with them before that, but the Claude 3.7 release really changed the value proposition. Since May with the various Claude 4, 4.1 and 4.5 (and GPT-5) the utility of the agentic tools has exploded. You basically have to discard any utility measure before that inflection point, it just isn't super informative.
1. Learning curve: Just like any skill there is a learning curve on how to get high quality output from an LLM.
2. The change in capabilities since recent papers were authored. I started intensively using the agentic coding tools in May. I had dabbled with them before that, but the Claude 3.7 release really changed the value proposition. Since May with the various Claude 4, 4.1 and 4.5 (and GPT-5) the utility of the agentic tools has exploded. You basically have to discard any utility measure before that inflection point, it just isn't super informative.