If anyone trains a model on https://simonwillison.net/tags/pelican-riding-a-bicy...

suddenlybananas · 2026-02-12T19:20:56 1770924056

Why would they train on that? Why not just hire someone to make a few examples.

simonw · 2026-02-12T19:36:01 1770924961

I look forward to them trying. I'll know when the pelican riding a bicycle is good but the ocelot riding a skateboard sucks.

suddenlybananas · 2026-02-12T19:59:13 1770926353

But they could just train on an assortment of animals and vehicles. It's the kind of relatively narrow domain where NNs could reasonably interpolate.

simonw · 2026-02-12T20:04:17 1770926657

The idea that an AI lab would pay a small army of human artists to create training data for $animal on $transport just to cheat on my stupid benchmark delights me.

suddenlybananas · 2026-02-12T20:07:15 1770926835

When you're spending trillions on capex, paying a couple of people to make some doodles in SVGs would not be a big expense.

simonw · 2026-02-12T20:41:19 1770928879

The embarrassment of getting caught doing that would be expensive.

jononor · 2026-02-14T19:32:16 1771097536

They were caught using all the data on the internet without asking for permission or compensating anyone. And it has cost them nothing and earned them billions so far.

red75prime · 2026-02-13T03:39:22 1770953962

Vetting them for the potential for whistleblowing might be a bit more involved. But conspiracy theories have an advantage because the lack of evidence is evidence for the theory.

toraway · 2026-02-13T06:27:54 1770964074

Huh? AI labs are routinely spending millions to billions to various 3rd party contractors specializing in creating/labeling/verifying specialized content for pre/post-training.

This would just be one more checkbox buried in hundreds of pages of requests, and compared to plenty of other ethical grey areas like copyright laundering with actual legal implications, leaking that someone was asked to create a few dozen pelican images seems like it would be at the very bottom of the list of reputational risks.

red75prime · 2026-02-13T07:08:22 1770966502

How do you think who's in on that? Not only pelicans, I mean, the whole thing. CEOs, top researchers, select mathematicians, congressmen? Does China participate in maintaining the bubble?

I, myself, prefer the universal approximation theorem and empirical finding that stochastic gradient descent is good enough (and "no 'magic' in the brain", of course).

usefulposter · 2026-02-13T12:15:23 1770984923

Well, since we're all talking about sourcing training material to "benchmaxx" for social proof, and not litigating the whole "AI bubble" debate, just the entire cottage industry of data curation firms:

https://scale.com/data-engine

https://www.appen.com/llm-training-data

https://www.cogitotech.com/generative-ai/

https://www.telusdigital.com/solutions/data-for-ai-training/...

https://www.nexdata.ai/industries/generative-ai

---

P.S. Google Comms would have been consulted re putting a pelican in the I/O keynote :-)

https://x.com/simonw/status/1924909405906338033

red75prime · 2026-02-13T13:04:59 1770987899

Cool. At least they are working across the board and benchmaxing random things like the theory of mind.

WarmWash · 2026-02-13T15:16:03 1770995763

I think no matter what happens with AI in the future, there will always be a subset of people with elaborate conspiracies about how it's all fake/a hoax.

suddenlybananas · 2026-02-13T18:58:09 1771009089

I'm not saying it's a hoax. If it gets better because of that data, tant mieux, but we have to be clear eyed about what these models are actually doing. Especially when companies don't explain what they've done.

dontwannahearit · 2026-02-13T10:08:28 1770977308

Would it not be better to have 100 such tests "Pelican on bicycle", "Tiger on stilts"..., and generate them all for every new model but only release a new one each time. That way you could show progression across all models, attempts at benchmaxxing would be more obvious.

Given the crazy money and vying for supremacy among AI companies right now it does seem naive to belive that no attempt at better pelicans on bicycles is being made. You can argue "but I will know because of the quality of ocelots on skateboards" but without a back catalog of ocelots on skateboards to publish its one datapoint and leaves the AI companies with too much plausible deniability.

The pelicans-on-bicycles is a bit of fun for you (and us!) but it has become a measure of the quality of models so its serious business for them.

There is an assymetry of incentives and high risk you are being their useful idiot. Sorry to be blunt.

Applejinx · 2026-02-13T14:43:53 1770993833

Or indeed do the Markov chain conceptual slip. Pelican on bicycle, badger on stool, tiger on acid. Pelican on bicycle is definitely cooked, though: people know it and it's talked about in language.