*It's really hard to keep moving all the way up and down the stack.* The main ad...

kasey_junk · on Feb 13, 2014

I don't want to put words into your mouth, but I want to clarify something. Those 90s style architectures that had database servers, application servers, caching servers etc (N-tier) were not about scaling the number of connections you could support on a given set of hardware or maximizing message throughput through a system or minimizing message latency. Those architectures came about as a response to a common problem of the time which was lots of different legacy systems needing to talk to each other.

Think accounting systems on a mainframe talking to inventory systems written in C talking to HR systems provided by the consultantware flavor of the month. In those kinds of environments it made a ton of sense to break up the various functional tiers so that you could attach to the appropriate one with minimal translation layers.

Having worked on systems like that in the past, you'd never go in thinking that it is the lowest latency way to do it. Latency wasn't typically a major design driver.

That said, I work on low latency message systems now and the architecture your describing is very typical in that space.

erichocean · on Feb 14, 2014

the architecture your describing is very typical in that space

Totally, and whatever innovation I actually produce is mainly putting together cool stuff other people have done in a novel way (which at times, hardly seems innovative). I'm hugely indebted to all of the research and pioneers in production in our industry.

One of the things I love about CS is how easy it is to take cool shit from one subfield and apply it to another. There's just so many smart people figuring stuff out that to a guy like me, that's like Christmas every week. :)

lukego · on Feb 14, 2014

Fantastic :-).

This is also my way of thinking and the way we are doing things in Snabb Switch. I'm influenced by my brief time working with Mitch Bradley on firmware hacking at OLPC. Linux would never get the same performance out of hardware as his seemingly toy little Forth drivers.

luke@snabb.co if you fancy a chat some time :-)

P.S. Thanks for the generous link-plug that landed on highscalability.com. I look forward to hearing more about your systems in the future.

ArbitraryLimits · on Feb 13, 2014

> one of the startup's I CTO for

Did I understand that correctly? You're the CTO of multiple startups? How's that working out?

erichocean · on Feb 14, 2014

I'm doing two right now. It's not as exciting as it seems.

The first I worked on for a year straight starting fall 2012, it launched, we got our first customer which exceeded our year-one goal in revenue by 2-3x, and was too big for us to handle anyone else without raising funds. We're replacing a Lotus Notes system for a $100 million/year company with 800+ employees.

My dev partner in Australia has been handling that account ever since (it's an Australian company, and I'm in the US), and as we've been working together for almost 8 years, there's not a lot of back and forth needed at this point. Mostly just weekly meetings and the occasional Skype call. My plan is to pick development back up again on that startup mid-Summer in preparation for expanding beyond that first company.

The other startup I began immediately after the first launched, and I've been designing and coding basically non-stop for 4 months. So there's hasn't been much overlap for me, and so far, it's worked out okay.

In my experience, I'm most useful launching projects. I can do the architecture, code up the initial design, and then at that point, other (better) programmers can see what it is I'm trying to do, and buckle in for the long haul.

(BTW I'm contemplating CTOing a third company, if anyone has one they'd like to pitch me (erich.ocean@me.com). I've got 3-4 months of time available to work on something new. If you project can be launched in that amount of time, I might be able to help.)

JoachimSchipper · on Feb 13, 2014

Although that sounds really cool, are you really sure you're not the person this article is aimed at? How much do your users care about near-zero latency compared to, say, having more of their friends on the network?

(Of course, I don't presume to know your situation.)

erichocean · on Feb 14, 2014

Yeah, getting users is much more important. Fortunately, that part has already had two and half years development and a successful (especially, given the quality—it made Friendster feel fast) app, so that part of the business was well taken care of.

That's actually my favorite part about this particular company. Normally, it's a bunch of tech and business guys trying to "get traction". This business already had traction, and just needed a better app and infrastructure.

So I took four months and solved that problem for good. :)

pjungwir · on Feb 13, 2014

> UDP + NaCl-based crypto.

I would love to hear more about this. I was looking for a way to send UDP that is encrypted and authenticated, with private contents and no replay attacks. The best option I could find is DTLS, but that has spotty support on Windows and no API in common scripting languages. Can you say more about your approach? If it is roll-your-own, what makes you confident it is secure?

spc476 · on Feb 14, 2014

I can't speak for erichocean but I've played around with both UDP and NaCL (http://nacl.cr.yp.to/). NaCl is very easy to use, and quite fast. It's not easy to install, but once it's there, it's nice to use.

And because of my playing around with NaCL, I've come to the conclusion that encryption isn't the hard problem---key management is the hard problem.

erichocean · on Feb 14, 2014

I was looking for a way to send UDP that is encrypted and authenticated, with private contents and no replay attacks.

I use crypto_box/unbox as is, although we do do the crypto_beforenm()/crypto_afternm() as (a) our clients are known, and (b) it's ridiculously faster. The docs on NaCl mention it, sort of, but it's much, much, faster. If you can do it, you always should do it.

As for replay attacks, we use what I'll call "hash chaining" (if someone knows the real name, I'd like to know). We have two kinds of messages, ordered and unordered. Unordered messages can be replayed at any time, as they are read-only, or some other kind of status messages, such as a heartbeat.

For ordered messages, we hash the previous message and use it as input for the next message when it is hashed. Both hashes are sent with each message (we use Sip-Hash 2-4, highly recommended BTW). (One analogy would be how git works, where the previous commit's hash is hashed as part of the new commit's hash.)

The chain of messages from the device to the server is called the "device stream", and we have a second chain going back to the device, the "update stream". The system supports multiple device streams for a single users (e.g. an iPhone and an iPad), but there's only one update stream for all devices, so they stay in sync.

Both ends of the connection test the hashes before applying, which trivially prevents replay attacks, since every ordered message can only be "played" once. We store the current device stream hash when the database is updated, in the same transaction, on both the client and the server. This provides resiliency in the face of crashes.

We also use the hash chain to detect out-of-order messages. In our application, they're pretty rare, so I simply drop out-of-order messages and request the correct one from the client.

This could be bad, because playing back old messages would illicit a response. For that reason, and also because I don't want to store update streams in the database permanently, the messages include their offset in the stream. The in-memory, wire format, and on-disk format of each messages is identical, so we can lay them end-to-end on disk. When a messages comes in, we can trivially determine if it is old, and drop it that way, too. If we need to write an update stream to disk, because the client hasn't connected in a while and we want to free up space, it's a single sendfile() + known offset call, conceptually, to send the update stream since they last signed back on. That way, we don't need a separate index for streams on disk.

Hope that helps!

pjungwir · on Feb 14, 2014

Wow, thank you for such a detailed writeup! I'll need to digest this a bit and see if it makes sense for our application.

daurnimator · on Feb 20, 2014

Cause it took me a few minutes to find them, the docs for all the functions mentioned can be found at http://nacl.cr.yp.to/box.html

Also for anyone that got this far without realising, this NaCl is not NaCl the google chrome "native client".

brianpgordon · on Feb 14, 2014

Not sure if serious... this reads like one of those over-the-top joke comments bragging tongue-in-cheek about how 10x the author is.

I'm going to go out on a limb and say that:

1. Ditching TCP for a social network app is insane.

2. Bypassing standard POSIX sockets for a social network app is insane.

3. Writing a custom database server for a social network app is insane.

4. There's no way you did that in four months, especially if you were CTO for multiple startups.

Even if you're the greatest developer alive on Earth today and you really did all that, think of how many amazing features you could have developed in that time to differentiate your product, instead of obsessing over insignificant nanosecond latency gains.