Protohackers: Server Programming Challenge (protohackers.com)
197 points by jstanley on Sept 8, 2022 | 32 comments



Fun! One of my college networking classes had us write a file transfer program, except the server was buggy (on purpose!) and would corrupt bits here and there. There was a public leaderboard of who could transfer the most data over this very lossy link.

I was originally planning to implement some sort of forward error correction, but then realized I could just use UDP and hammer the server with duplicate packets. Not very elegant, but it got me close to the top of the fastest transfer speeds over a lossy link.
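
Roughly the kind of thing I mean, as a toy sketch (the server address, port, and framing are all made up, not the actual class assignment):

  import socket
  import struct

  REPEATS = 5                      # send every chunk this many times
  CHUNK_SIZE = 1024
  SERVER = ("198.51.100.7", 9000)  # hypothetical lossy server

  def blast_file(path):
      """Send a file over UDP, duplicating every packet to survive loss."""
      sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      with open(path, "rb") as f:
          seq = 0
          while chunk := f.read(CHUNK_SIZE):
              # a 4-byte sequence number lets the receiver drop duplicates
              packet = struct.pack("!I", seq) + chunk
              for _ in range(REPEATS):
                  sock.sendto(packet, SERVER)
              seq += 1

  blast_file("payload.bin")        # hypothetical file name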


I don't think a leaderboard based on "time to solve" is a very good criterion. It simply rewards people who see the challenge earlier and have time to complete it. I think throughput and latency would be better criteria. What TPS do you support?

Granted, this might be somewhat "pay to win" but I still think it would be better.


I'm thinking real-world markets tend to reward first movers who deliver adequate solutions with the cheese, but surely there's leaderboard room for a complementary wine of some objectively substantial performance vintage that takes a bit more time to refine into a balanced pairing. That is, so long as you're not already pre-game drunk at the wheel when the RFP opportunity window closes, since misaligned engineering trade-offs tend to have real consequences too.


Wow, interesting! I love the testing approach: it's still automated, unlike something like AoC, but it doesn't require the owners to run a whole program-testing infrastructure.


Do you have any plans to add IPv6 support? Also, it's probably a good idea to add an FAQ entry stating that you currently don't support IPv6.


It's possible if enough people want it. The only reason not to is that it potentially doubles my testing burden for almost no benefit.


One use case is residential internet using DS-Lite, where you can port-forward IPv6 but not IPv4.


In case it helps anyone who finds their way back here: I tried ngrok and telebit, and neither could pass the smoke test.

I opened up a port on my router so traffic could go directly, and sure enough everything went through.

Not sure what those tools are doing to muck things up along the way, but heads up!


The idea is interesting, but how about having Protohackers listen on a port and accept connections from our programs?

That way anyone without a public IP or a credit card could participate.


First I wrote the smoke test in Java, but it failed due to the timeout. Then I tried Go and it passed.

I'd like to try the other problems in Java, but for that the timeout would need to be increased.

Has anyone else tried Java? How was your experience?


That probably means your server wasn't closing the socket when it reached EOF, so the checker was still waiting.

There's no reason any programming language would be too slow to complete the Smoke Test.
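
For reference, a minimal blocking-socket sketch of the behaviour the checker expects: echo until recv() returns empty bytes (EOF), then close the connection.

  import socket

  def handle(conn: socket.socket) -> None:
      """Echo bytes back; an empty recv() means the client has sent EOF."""
      with conn:                   # leaving the block closes the socket,
          while True:              # which is what the checker waits for
              data = conn.recv(4096)
              if not data:
                  break
              conn.sendall(data)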


Where is the global leaderboard though?


There isn't one yet. I'm still deliberating on how it should work. In particular I'd like it to include a rank for everyone who has ever solved any problem, and still behave sensibly in cases where someone has not completed every single problem.

I think the current idea is that if N people have ever solved any problem, then you get assigned a rank of N+1 for every problem you haven't solved yet. Your "rank score" is the sum of your individual problem ranks, and the overall leaderboard is ranked by "rank score" (lowest at top).

This means you always "lose score" (potentially gain position) for solving a problem, even if you solve it very late, but you still get a somewhat meaningful rank even if you haven't solved every problem.
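
In code, the current idea is roughly this (just a sketch; the data shapes are made up):

  def overall_leaderboard(problem_ranks, all_users):
      """problem_ranks: {problem_id: {user: rank}}, where rank 1 is best.
      all_users: the N people who have ever solved any problem.
      Returns users sorted by rank score, lowest (best) first."""
      n = len(all_users)
      scores = {}
      for user in all_users:
          scores[user] = sum(
              ranks.get(user, n + 1)   # unsolved problems count as rank N+1
              for ranks in problem_ranks.values()
          )
      return sorted(scores.items(), key=lambda kv: kv[1])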


You could consider calculating a "difficulty" score for problems and use that for ranking. The number of solved problems could be used as a tiebreaker if two people share a score. A simple difficulty ranking would be to take:

  difficulty = #registered/#solved
This approaches 1 (lowest difficulty) as more people solve a particular problem. You can also account for the number of people who have tried to solve it but not yet succeeded:

  difficulty = (#registered + #unique failed attempts)/#solved
"Unique failed attempts" means that if I try it 10 times, it still counts as 1 failed attempt. You can either keep the failed attempts in the score, or reduce that count as people move from failure to success. In the former case, the score for hard problems (where people consistently submit failed solutions on their first try) will have a difficulty approaching 2 in the limit, and in the latter it will again approach 1.

With numbers, the three schemes:

  #registered = 100
  #unique failed = 50
  #solved = 25
  #solved with an earlier failure = 20

  1) difficulty = 100/25 = 4
  2) difficulty = (100+50)/25 = 6
  3) difficulty = (100+50-20)/25 = 5.2
A person's score is the sum of the difficulty rankings of every problem they've solved. Since the challenges are released at a particular time, you could give a small bump to something like the first 5, 10, or however many participants, but be careful not to make it too big; otherwise late joiners will always be in the bottom ranks.
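
Or, as a sketch in Python (scheme numbers matching the three above):

  def difficulty(registered, unique_failed, solved, scheme=1,
                 solved_with_earlier_failure=0):
      """Three ways to score a problem's difficulty; lower means easier."""
      if scheme == 1:    # ignore failed attempts entirely
          return registered / solved
      if scheme == 2:    # keep failed attempts in the score
          return (registered + unique_failed) / solved
      # scheme 3: drop failures once the person eventually solves it
      return (registered + unique_failed - solved_with_earlier_failure) / solved

  print(difficulty(100, 50, 25, scheme=1))                                  # 4.0
  print(difficulty(100, 50, 25, scheme=2))                                  # 6.0
  print(difficulty(100, 50, 25, scheme=3, solved_with_earlier_failure=20))  # 5.2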


I like the idea of weighting the problems by a difficulty score that is calculated from the number of users solving it.

I think the idea as presented ("A person's score is the sum of the difficulty ranking of every problem they've solved") would not work great on its own because there will be a large number of people who have solved every single problem.

But we could certainly add up people's per-problem rankings with a weighting based on problem difficulty! I will experiment with this.

The thing I don't like is that it can result in the leaderboard changing, even for people who have solved all problems, simply because more new people attempt a particular problem, which can alter the relative weightings:

Imagine Alice has 1st place on problem 1 and 2nd place on problem 2. Bob has 2nd place on problem 1 and 1st place on problem 2. Problem 1 is considered twice as difficult as problem 2, so Alice is ahead of Bob overall.

Bob doesn't like this. He wants to be ahead of Alice. He signs up for thousands of accounts to attempt (but not solve) problem 2. This makes problem 2 look very difficult, so now problem 2 is worth more than problem 1 and Bob is ranked ahead of Alice.
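
To put numbers on that flip (the weights are illustrative only; lower scores are better):

  def score(ranks, weights):
      """Sum of per-problem rank times problem weight."""
      return sum(ranks[p] * weights[p] for p in weights)

  alice = {"p1": 1, "p2": 2}
  bob   = {"p1": 2, "p2": 1}

  before = {"p1": 2.0, "p2": 1.0}   # problem 1 twice as hard as problem 2
  after  = {"p1": 2.0, "p2": 3.0}   # problem 2 inflated by Bob's fake attempts

  print(score(alice, before), score(bob, before))  # 4.0 5.0 -> Alice ahead
  print(score(alice, after),  score(bob, after))   # 8.0 7.0 -> Bob ahead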


The first challenge requires supporting 5 simultaneous connections. Could Python be used for this?


Yes, that shouldn't be a problem at all. Several people have done it in Python. If you click the hyperlinked names on the problem leaderboard[0], most of them go to git repositories where you can look at people's solutions.

Here's a Python solution for the Smoke Test (SPOILER!): https://github.com/edoannunziata/protohackers/blob/master/00...

[0] https://protohackers.com/leaderboard/0
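
For the concurrency part specifically, here's a rough asyncio sketch (not the linked solution; the port number is arbitrary) that serves many clients at once:

  import asyncio

  async def echo(reader, writer):
      # each connection runs in its own coroutine, so clients are served concurrently
      while data := await reader.read(4096):
          writer.write(data)
          await writer.drain()
      writer.close()               # close on EOF so the checker isn't left waiting
      await writer.wait_closed()

  async def main(port=7000):       # arbitrary port
      server = await asyncio.start_server(echo, "0.0.0.0", port)
      async with server:
          await server.serve_forever()

  asyncio.run(main())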


I really like the idea, but the effort required to boot up a server is holding me back.


It takes about 5 minutes (including logging in and mistyping my MFA code twice) to spin up and SSH into a fresh EC2 instance, another ~2 minutes past that to have it accept git pushes, and depending on the language another 5-10 minutes to set up automatic builds.

Consider it part of the challenge?


Agreed. There are several ways to spin up a server, and it offers a great learning experience for those who don't know how.


Exactly, I'm using it as an excuse to finally branch out into Terraform and non-AWS cloud providers.


If you're comfortable with Docker, fly.io makes it really easy to spin up a web server, and you can run three servers 24/7 under their free tier.

(not affiliated, just a happy customer)


fly.io works well for this. Just tried it out on the smoke test problem.


I'm trying to set up fly.io for Protohackers, but for some reason I keep getting a 502 error, no matter what I do.

Here's my main.py and Dockerfile[0], if you feel like taking a look. The monitoring tab of fly.io didn't show any errors, or anything like that. Maybe I'm missing something obvious, but I can't really see what the issue could be.

[0] https://pastebin.com/gSr3405x


https://ngrok.com/ or an equivalent should make that easy; it will tunnel and expose a local TCP port to the internet.

Besides that, I think Protohackers is an excellent idea.


I'm not sure ngrok would work. My understanding was that they only offer public domain names, which point to a limited number of their IP addresses, and which client to route to is determined by the HTTP Host header (or possibly SNI). That wouldn't be usable in this case, since you're dealing with raw TCP.


FWIW, my echo server failed the smoke test when I tried ngrok using the IP I got from dig. The strange thing is that the server still received the test payloads over ngrok, and my echo server passes when hosted on EC2. Not sure if it's something with ngrok or some other incompatibility (I'm a Windows user :/).


I did not look at the implementation, but since they offer pure TCP exposure it should work just fine by resolving the DNS; when you use a TCP service there is no Host header in the protocol.


Good tip, going to use this as an excuse to play with ngrok!


I can see that the cloud hosting would be a sticking point if you aren’t familiar with it.

Luckily, for Protohackers you can use a really basic setup. I used Hetzner’s standard Ubuntu image, opened the nano text editor on the server, copy-pasted my Python code, then ran the program using the OS’s version of Python. Didn’t even run it in the background; just started the program and closed it when Protohackers reported the tests were done.


I've done azspcs and Advent of Code in the past, and looking at this one... I can't think of a more annoying challenge to take on.


What makes it annoying? I’m curious.



