
The protocol has to be designed in a way that facilitates (or, at least, does not actively prevent) manual input. As for your performance concern, I have to suggest you profile the code. If your protocol handler is limited by the performance of its message parser, you must be doing something wrong.

As for complexity... No. If your parser is non-trivial when compared to the rest of your stack, again, you are doing something very wrong.

I am not saying to implement full line editing or anything close to that. I'd just like to point out that being able to manually interact with a server now and then is very convenient.




Line folding doesn't really help you manually interact with a server or craft messages. Nor do comments. There's no real value in being able to do:

  Foo:  ValueBegin (this is a comment)
    ValueEnds
Versus requiring:

  Foo: ValueBegin ValueEnds
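To make the comparison concrete, here is a minimal sketch of the extra work a parser has to do before the two forms above compare equal. The function names (`unfold`, `strip_comments`, `normalize`) are hypothetical and not taken from any real parser, and the sketch deliberately ignores quoted strings and backslash escapes, which a real implementation would also have to track.

```python
def unfold(raw: str) -> list[str]:
    """Join continuation lines (starting with space or tab) onto the previous line."""
    lines: list[str] = []
    for line in raw.splitlines():
        if line[:1] in (" ", "\t") and lines:
            # Folded line: treat it as part of the previous header.
            lines[-1] = lines[-1].rstrip() + " " + line.strip()
        else:
            lines.append(line)
    return lines


def strip_comments(value: str) -> str:
    """Drop parenthesized comments (nesting allowed) and collapse whitespace.

    Simplification: quoted strings and escapes are not handled here.
    """
    kept: list[str] = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    return " ".join("".join(kept).split())


def normalize(raw: str) -> list[str]:
    """Return each header as 'Name: value' with folding and comments removed."""
    result = []
    for line in unfold(raw):
        name, _, value = line.partition(":")
        result.append(f"{name.strip()}: {strip_comments(value)}")
    return result


folded = "Foo:  ValueBegin (this is a comment)\r\n    ValueEnds\r\n"
plain = "Foo: ValueBegin ValueEnds\r\n"
print(normalize(folded) == normalize(plain))
```

Without folding and comments, the whole thing reduces to a single split on the first colon.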

As far as performance goes, I must say you're incorrect. Review the nginx HTTP parser and you'll see all sorts of bitwise hacks in order to improve performance. This aligns with my own experience writing a packet capture system for a similar protocol.

Open syntax, of which comments and line folding are a part, incredibly complicates the parser. Other moronic things are the completely arbitrary handling of header fields. Some header fields allow their value to be split over multiple lines, which must then be treated as if they were one line. Others use multiple headers to provide some multi-line value. There's no simple parsing; it all must be sensitive to the context. This is just stupid, yet free-text protocol authors revel in it. SIP even publishes a "torture test" RFC where they're just oh-so-pleased with the edge cases their moronic spec allows. They even suggest a parser should guess as to the message sender's intent.
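As a rough illustration of that context sensitivity, consider repeated headers in HTTP: most can be merged into one comma-separated value, but Set-Cookie cannot, so the merging step has to know which field it is looking at. The sketch below is an assumption-laden toy, not how any particular server does it; `combine_fields` and `NOT_COMBINABLE` are made-up names, and the exception list is deliberately incomplete.

```python
# Field names whose repeated occurrences must stay separate.
# Illustrative only: Set-Cookie is the well-known case in HTTP.
NOT_COMBINABLE = {"set-cookie"}


def combine_fields(fields: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Merge repeated headers, except those that must be kept apart."""
    combined: dict[str, list[str]] = {}
    for name, value in fields:
        key = name.lower()  # header names are case-insensitive
        value = value.strip()
        if key in combined and key not in NOT_COMBINABLE:
            # Same field seen again: append to the existing value.
            combined[key][0] += ", " + value
        else:
            combined.setdefault(key, []).append(value)
    return combined


headers = [
    ("Accept", "text/html"),
    ("Set-Cookie", "a=1"),
    ("accept", "application/json"),
    ("Set-Cookie", "b=2"),
]
print(combine_fields(headers))
```

The parser cannot treat a header as an opaque name/value pair; it needs per-field knowledge just to decide how many values it has.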

I'd also note that in some cases the parser is most of the stack (simple proxy scenarios, frontend security). Regardless, the fact that the rest of the stack may be complicated is not in any way an excuse for making the parser worse. This is not a programming language.


> you'll see all sorts of bitwise hacks in order to improve performance.

How much time does nginx spend parsing HTTP requests compared to waiting for network or disk IO?

> SIP even publishes a "torture test" RFC

This is actually very clever. I wish other protocols had something like it to help weed out partial and buggy implementations.

> They even suggest a parser should guess as to the message sender's intent.

This may be a little bit too much.

> This is not a programming language.

A lot of incredibly powerful uses for technology come precisely from the unintended scenarios - the clever ways to abuse technology and force it to do something it was never intended to. Remember HTTP itself was conceived to do a tiny subset of what it does now.


>How much time does nginx spend parsing HTTP requests compared to waiting for network or disk IO?

Enough that they choose to use much more obtuse code? Not to mention that network and I/O capacity scale independently of CPU, so it's not a relevant comparison. I wrote a packet capture system that spent about 70% of its capture/index CPU budget on parsing.

A torture test is only clever if it's not absurd because of a thousand edge cases. Then it just reflects the stupidity in the protocol design.

Can you name any possible use for terrible parsing rules that encourage security holes? HTTP is commonly used outside browsers because it's a simple wrapper for a TCP-RPC model. Request something, get a response. And none of the features depend on the stupid parsing rules. Absolutely no one and nothing benefits from them. Except perhaps contractors billing hourly.


> Enough that they choose to use much more obtuse code?

They certainly had it profiled before they optimized it. There must be some numbers somewhere. "Enough" is hardly an acceptable answer.



