
""" We write a lot of Python here at Safari, in fact it’s our most widely-used language. Because it’s a very high-level language, it was likely it would satisfy my second requirement. But my experience with writing log parsers in Python before–even when using tricks like lazy evaluation–led me to suspect that it would not do so well in satisfying the first requirement. """

How can one say that one parser is high-performance and another isn't if no comparison is made? For example, the author could benchmark against pylogsparser:

https://pypi.python.org/pypi/pylogsparser/0.8

Then we'd get an idea of whether the project met its goal of being quicker than Python.

In any event, I'd be interested to see how luajit-lpeg compares to their Haskell implementation. It even has a nice online test tool:

http://lpeg.trink.com/share/syslog




Author here. Thanks for pointing me to pylogsparser, I'll definitely take a look at that. Your point is well taken: without building a parallel implementation in Python, we don't have a way of knowing for sure if Haskell is faster. The only data points I have are that we've built a couple of log parsers for custom formats in Python before this, and the number of lines parsed per second was far lower than with the attoparsec-based parser.[1] It's not apples-to-apples, since the formats differ a bit, but I think it still has some predictive value. So in the second part of this post, which I'm working on now, I'm hoping to be able to provide a fully-functional NCSA combined log format parser in Haskell alongside the blog post. I think that would be fairly easy to benchmark since it's a common-enough log format.

[1] that's just measuring the time to parse log files into some sort of structured data, not necessarily to do anything with it
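
For anyone who wants a rough Python baseline for that kind of lines-per-second measurement, here is a minimal sketch. It is not the author's parser and not pylogsparser: the regex for the NCSA combined log format and the names `parse_line` and `lines_per_second` are assumptions made up for illustration, and it only times parsing into a dict, not doing anything with the result.

```python
import re
import time

# Assumed regex for the NCSA combined log format (illustrative, not exhaustive).
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Parse one log line into a dict of fields, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None


def lines_per_second(lines):
    """Return (number of lines parsed successfully, parsed lines per second)."""
    start = time.perf_counter()
    parsed = sum(1 for line in lines if parse_line(line) is not None)
    elapsed = time.perf_counter() - start
    rate = parsed / elapsed if elapsed > 0 else float("inf")
    return parsed, rate


# Sample line in combined log format, used only as synthetic input.
SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

if __name__ == "__main__":
    parsed, rate = lines_per_second([SAMPLE] * 100_000)
    print(f"parsed {parsed} lines at {rate:,.0f} lines/second")
```

Repeating one sample line flatters any parser, so a real comparison would want to feed the same actual log file to both implementations.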


Thanks for the article. I think you're missing some code: it seems that every time you have a `do` block in your code samples, most of the code is cut off (I assume).


Thanks for pointing that out; I think the syntax highlighter may have clobbered some code. It's now fixed.



