AI coding assistants ruined the global leaderboard experience. AoC might as well...

fuglede_ · on Dec 1, 2024

It's not that bad. I'm sure there are more LLM'ers in there than the one, but you can tell that the majority of the day 1 leaderboard is made up of people who have historically performed well, even before LLMs were a thing.

Compare https://adventofcode.com/2024/leaderboard/day/1 to e.g. https://fuglede.github.io/aoc-full-leaderboard/

There was also at least one instance of people working together where you would have 15 people from the same company submit solutions at the same time, which can be a bit frustrating but again, not a huge issue.

fuglede_ · on Dec 5, 2024

Okay, I think I have to go ahead and retract my own comment. Day 5 appears to have been sufficiently tricky for humans to do quickly while still easy enough for the LLMs that it is clear that there is a very large amount of cheating going on.

gorgoiler · on Dec 1, 2024

I have a rule in life: no summary statistics without showing the distribution.

Usually this goes for any median which might be in a sneaky bimodal distribution of, say, AI models vs humans. I guess it applies to leaderboards too though.

exitb · on Dec 1, 2024

Potentially the challenge just doesn’t make as much sense anymore? There apparently are "mental calculation" competitions, and I’m sure their participants have fun. Yet I can hardly imagine that doing arithmetic in one’s head is any fun for the average mathematician. The challenge just shifted elsewhere over time.

WithinReason · on Dec 1, 2024

They should check that LLMs can't solve the problems in 9 seconds and come up with appropriate problems. Or just allow AI assistants: they are now as much a part of the programmer's toolkit as syntax highlighting, autocomplete, or Stack Overflow, and pretending otherwise is not useful.

martin-t · on Dec 1, 2024

Not gonna happen. AoC always starts with beginner-level problems. That's why it's so commonly used for learning the basics of new languages.

A problem that wouldn't be immediately solvable by LLMs would either be too advanced or simply too large to be fun.

This is probably where programming as a whole is going. Many of the things that make programming fun for me, like deeply understanding a small but non-trivial problem and finding a good solution, are gonna be performed much faster by LLMs. After all, most of what we do has been done before, just in a slightly different context or a different language.

Either LLMs will plateau at the current level, often useful but very error-prone and not quite there, or they'll get better and we'll just be checking their output and designing the general architecture.

Retr0id · on Dec 1, 2024

The first few days are supposed to be beginner-accessible; it's practically impossible to have something beginner-accessible but GPT-inaccessible.

matsemann · on Dec 1, 2024

That's like going out for a run and taking an electric scooter around the park instead. The point isn't finishing, the point is doing the activity.

WithinReason · on Dec 1, 2024

Then why have a leaderboard?

matsemann · on Dec 1, 2024

Because some people like to compete? There are 5k races as well, which people enjoy doing even though vehicles exist. And people would rightfully be upset if they got beaten by someone not running themselves.

zwirbl · on Dec 1, 2024

And then allow aimbots for counterstrike, Stockfish at chess tournaments and EPO on the Tour de France. The leaderboard is intended for people to compete against each other. One could make a separate leaderboard for LLMs, kind of similar to the chess AI leaderboards.

falcor84 · on Dec 1, 2024

> allow aimbots for counterstrike

I haven't played counterstrike in over a decade, so you got me wondering - are there matches where everyone uses aimbots? What does the game look like then? I suppose there's a new mix of strategies evolving, with a higher focus on the macro movement planning?

worthless-trash · on Dec 1, 2024

> are there matches where everyone uses aimbots?

Yes

> What does the game look like then?

I have only observed the games; they require a lot of hiding.

Most of the time the winning method is to act at the very last second and hope the other player is distracted.

WithinReason · on Dec 1, 2024

False equivalence. The sole reason for counterstrike and chess to exist is competition. Programming is about solving a problem. If you want to turn programming into a competition you shouldn't take away tools from the programmer.

jbjbjbjb · on Dec 1, 2024

You’re saying programming isn’t equivalent to chess here because programming isn’t a competition, but the Advent of Code leaderboard very much is a competition.