Protohackers: Server Programming Challenge (protohackers.com)
197 points by jstanley on Sept 8, 2022 | 32 comments



Fun! One of my college networking classes had us write a file transfer program, except the server was buggy (on purpose!) and would corrupt bits here and there. There was a public leaderboard of who could transfer the most data over this very lossy link.

I was originally planning to implement some sort of forward error correction, but then realized I could just use UDP and hammer the server with duplicate packets. Not very elegant, but it got me close to the top of the fastest transfer speeds over a lossy link.
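
Roughly the kind of thing I mean, as a toy sketch (the server address, port, and framing are all made up, not the actual class assignment):

  import socket
  import struct

  REPEATS = 5                      # send every chunk this many times
  CHUNK_SIZE = 1024
  SERVER = ("198.51.100.7", 9000)  # hypothetical lossy server

  def blast_file(path):
      """Send a file over UDP, duplicating every packet to survive loss."""
      sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      with open(path, "rb") as f:
          seq = 0
          while chunk := f.read(CHUNK_SIZE):
              # a 4-byte sequence number lets the receiver drop duplicates
              packet = struct.pack("!I", seq) + chunk
              for _ in range(REPEATS):
                  sock.sendto(packet, SERVER)
              seq += 1

  blast_file("payload.bin")        # hypothetical file name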


I don't think a leaderboard based on "time to solve" is a very good criterion. It simply rewards people who see the challenge earlier and have time to complete it. I think throughput and latency would be better criteria. What TPS do you support?

Granted, this might be somewhat "pay to win" but I still think it would be better.


I'm thinking real-world markets tend to reward first movers who deliver adequate solutions with the cheese, but surely there's leaderboard room for a complementary wine of some objectively substantial performance vintage that takes a bit more time to refine into a balanced pairing. That is, so long as you're not already pre-game drunk at the wheel when the RFP opportunity window closes, since misaligned engineering trade-offs tend to have real consequences too.


Wow, interesting! I love the testing approach: it's still automated, unlike something like AoC, but it doesn't require the owners to run a whole program-testing infrastructure.


Do you have any plans to add IPv6 support? Also, it's probably a good idea to add an FAQ entry stating that you currently don't support IPv6.


It's possible if enough people want it. The only reason not to is that it potentially doubles my testing burden for almost no benefit.


One use case is residential internet using DS-Lite, where you can port-forward IPv6 but not IPv4.


In case it helps anyone who finds their way back here: I tried ngrok and telebit, and neither could pass the smoke test.

I opened up a port on my router so traffic could go directly, and sure enough everything went through.

Not sure what those tools are doing to muck things up along the way, but heads up!


The idea is interesting, but how about having Protohackers listen on a port and accept connections from our programs?

That way anyone without a public IP or a credit card could participate.


First I wrote the smoke test in Java, but it failed due to the timeout. Then I tried Go and it passed.

I'd like to try the other problems in Java, but for that the timeout would need to be increased.

Has anyone else tried Java? How was your experience?


That probably means your server wasn't closing the socket when it reached EOF, so the checker was still waiting.

There's no reason any programming language would be too slow to complete the Smoke Test.
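
For reference, a minimal blocking-socket sketch of the behaviour the checker expects: echo until recv() returns empty bytes (EOF), then close the connection.

  import socket

  def handle(conn: socket.socket) -> None:
      """Echo bytes back; an empty recv() means the client has sent EOF."""
      with conn:                   # leaving the block closes the socket,
          while True:              # which is what the checker waits for
              data = conn.recv(4096)
              if not data:
                  break
              conn.sendall(data)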


Where is the global leaderboard though?


There isn't one yet. I'm still deliberating on how it should work. In particular I'd like it to include a rank for everyone who has ever solved any problem, and still behave sensibly in cases where someone has not completed every single problem.

I think the current idea is that if N people have ever solved any problem, then you get assigned a rank of N+1 for every problem you haven't solved yet. Your "rank score" is the sum of your individual problem ranks, and the overall leaderboard is ranked by "rank score" (lowest at top).

This means you always "lose score" (potentially gain position) for solving a problem, even if you solve it very late, but you still get a somewhat meaningful rank even if you haven't solved every problem.
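
In code, the current idea is roughly this (just a sketch; the data shapes are made up):

  def overall_leaderboard(problem_ranks, all_users):
      """problem_ranks: {problem_id: {user: rank}}, where rank 1 is best.
      all_users: the N people who have ever solved any problem.
      Returns users sorted by rank score, lowest (best) first."""
      n = len(all_users)
      scores = {}
      for user in all_users:
          scores[user] = sum(
              ranks.get(user, n + 1)   # unsolved problems count as rank N+1
              for ranks in problem_ranks.values()
          )
      return sorted(scores.items(), key=lambda kv: kv[1])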


You could consider calculating a "difficulty" score for problems and use that for ranking. The number of solved problems could be used as a tiebreaker if two people share a score. A simple difficulty ranking would be to take:

  difficulty = #registered/#solved
This approaches 1 (lowest difficulty) as more people solve a particular problem. You can also account for the number of people who have tried to solve it but not yet succeeded:

  difficulty = (#registered + #unique failed attempts)/#solved
"Unique failed attempts" means that if I try it 10 times, it still counts as 1 failed attempt. You can either keep the failed attempts in the score, or reduce that count as people move from failure to success. In the former case, the score for hard problems (where people consistently submit failed solutions on their first try) will have a difficulty approaching 2 in the limit, and in the latter it will again approach 1.

With numbers, the three schemes:

  #registered = 100
  #unique failed = 50
  #solved = 25
  #solved with an earlier failure = 20

  1) difficulty = 100/25 = 4
  2) difficulty = (100+50)/25 = 6
  3) difficulty = (100+50-20)/25 = 5.2
A person's score is the sum of the difficulty rankings of every problem they've solved. Since the challenges are released at a particular time, you could give a small bump to something like the first 5, 10, or however many participants, but be careful not to make it too big; otherwise late joiners will always be in the bottom ranks.
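
Or, as a sketch in Python (scheme numbers matching the three above):

  def difficulty(registered, unique_failed, solved, scheme=1,
                 solved_with_earlier_failure=0):
      """Three ways to score a problem's difficulty; lower means easier."""
      if scheme == 1:    # ignore failed attempts entirely
          return registered / solved
      if scheme == 2:    # keep failed attempts in the score
          return (registered + unique_failed) / solved
      # scheme 3: drop failures once the person eventually solves it
      return (registered + unique_failed - solved_with_earlier_failure) / solved

  print(difficulty(100, 50, 25, scheme=1))                                  # 4.0
  print(difficulty(100, 50, 25, scheme=2))                                  # 6.0
  print(difficulty(100, 50, 25, scheme=3, solved_with_earlier_failure=20))  # 5.2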


I like the idea of weighting the problems by a difficulty score that is calculated from the number of users solving it.

I think the idea as presented ("A person's score is the sum of the difficulty ranking of every problem they've solved") would not work great on its own because there will be a large number of people who have solved every single problem.

But we could certainly add up people's per-problem rankings with a weighting based on problem difficulty! I will experiment with this.

The thing I don't like is that it can result in the leaderboard changing, even for people who have solved all problems, simply because more new people attempt a particular problem, which can alter the relative weightings:

Imagine Alice has 1st place on problem 1 and 2nd place on problem 2. Bob has 2nd place on problem 1 and 1st place on problem 2. Problem 1 is considered twice as difficult as problem 2, so Alice is ahead of Bob overall.

Bob doesn't like this. He wants to be ahead of Alice. He signs up for thousands of accounts to attempt (but not solve) problem 2. This makes problem 2 look very difficult, so now problem 2 is worth more than problem 1 and Bob is ranked ahead of Alice.
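
To put numbers on that flip (the weights are illustrative only; lower scores are better):

  def score(ranks, weights):
      """Sum of per-problem rank times problem weight."""
      return sum(ranks[p] * weights[p] for p in weights)

  alice = {"p1": 1, "p2": 2}
  bob   = {"p1": 2, "p2": 1}

  before = {"p1": 2.0, "p2": 1.0}   # problem 1 twice as hard as problem 2
  after  = {"p1": 2.0, "p2": 3.0}   # problem 2 inflated by Bob's fake attempts

  print(score(alice, before), score(bob, before))  # 4.0 5.0 -> Alice ahead
  print(score(alice, after),  score(bob, after))   # 8.0 7.0 -> Bob ahead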


The first challenge requires supporting 5 simultaneous connections. Could Python be used for this?


Yes, that shouldn't be a problem at all. Several people have done it in Python. If you click the hyperlinked names on the problem leaderboard[0], most of them go to git repositories where you can look at people's solutions.

Here's a Python solution for the Smoke Test (SPOILER!): https://github.com/edoannunziata/protohackers/blob/master/00...

[0] https://protohackers.com/leaderboard/0
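
For the concurrency part specifically, here's a rough asyncio sketch (not the linked solution; the port number is arbitrary) that serves many clients at once:

  import asyncio

  async def echo(reader, writer):
      # each connection runs in its own coroutine, so clients are served concurrently
      while data := await reader.read(4096):
          writer.write(data)
          await writer.drain()
      writer.close()               # close on EOF so the checker isn't left waiting
      await writer.wait_closed()

  async def main(port=7000):       # arbitrary port
      server = await asyncio.start_server(echo, "0.0.0.0", port)
      async with server:
          await server.serve_forever()

  asyncio.run(main())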


I really like the idea, but the effort required to boot up a server is holding me back.


It takes about 5 minutes (including logging in and mistyping my MFA code twice) to spin up and SSH into a fresh EC2 instance, another ~2 minutes past that to have it accept git pushes, and depending on the language another 5-10 minutes to set up automatic builds.

Consider it part of the challenge?


Agreed. There are several ways to spin up a server, and it offers a great learning experience for those who don't know how.


Exactly, I'm using it as an excuse to finally branch out into Terraform and non-AWS cloud providers.


If you're comfortable with Docker, fly.io makes it really easy to spin up a web server, and you can run three servers 24/7 under their free tier.

(not affiliated, just a happy customer)


fly.io works well for this. Just tried it out on the smoke test problem.


I'm trying to set up fly.io for Protohackers, but for some reason I keep getting a 502 error, no matter what I do.

Here's my main.py and Dockerfile[0], if you feel like taking a look. The monitoring tab of fly.io didn't show any errors, or anything like that. Maybe I'm missing something obvious, but I can't really see what the issue could be.

[0] https://pastebin.com/gSr3405x


https://ngrok.com/ or an equivalent should make that easy; it will tunnel and expose a local TCP port to the internet.

Besides that, I think Protohackers is an excellent idea.


I'm not sure ngrok would work. My understanding was that they only offer public domain names, which point to a limited number of their IP addresses, and which client to route to is determined by the HTTP Host header (or possibly SNI). That wouldn't be usable in this case, since you're dealing with raw TCP.


FWIW, my echo server failed the smoke test when I tried ngrok using the IP I got from dig. The strange thing is that the server still received the test payloads over ngrok, and my echo server passes when hosted on EC2. Not sure if it's something with ngrok or some other incompatibility (I'm a Windows user :/).


I did not look at the implementation, but since they offer pure TCP exposure it should work just fine by resolving the DNS; when you use a TCP service there is no Host header in the protocol.


Good tip, going to use this as an excuse to play with ngrok!


I can see that the cloud hosting would be a sticking point if you aren’t familiar with it.

Luckily, for Protohackers you can use a really basic setup. I used Hetzner’s standard Ubuntu image, opened the nano text editor on the server, copy-pasted my Python code, then ran the program using the OS’s version of Python. Didn’t even run it in the background; just started the program and closed it when Protohackers reported the tests were done.


I've done azspcs and Advent of Code in the past, and looking at this one... I can't think of a more annoying challenge to take on.


What makes it annoying? I’m curious.



