Sequence: A High Performance Sequential Semantic Log Parser at 175,000 MPS (zhen.org)
21 points by zhenjl on Feb 1, 2015 | 8 comments



Another alternative to regular-expression-based message parsing, with native support in syslog-ng: patterndb (http://www.balabit.com/network-security/syslog-ng/opensource...)

Very fast and a bit complex to set up, but well documented and well tooled. Mature. It could do with some more community love, tbh.


Thanks for the link. Do you have any info on the performance of this parser?


I'm sorry to say I do not. I've only very recently got a stable monitoring configuration in place with this as a key piece, parsing messages and sending them to downstream programs.

I welcome the move away from regular expressions though - they are just not necessary in this particular domain. We'll see if PatternDB's coarse-grained approach comes back to bite me.

I'm happy to help as I can if you decide to use PatternDB - you can find me at l.skibinski at elifesciences dot org. I have some notes for getting started quickly I really should publish ...


As this appears to have been submitted by the author: the site is very difficult to read on an iPad. The font size toggles between small and large every few seconds. Easily reproduced in both Chrome and Safari.


Thanks for letting me know. I hadn't realized that. Will have to figure out why.


What's the key differentiator between this and logstash? Obviously logstash has this beat on the number of patterns simply because it's been around longer. If this is truly different from (superior to and/or faster than) logstash's grok parser, I wonder if it could be implemented as a sort of meta-parser in logstash, possibly useful in cases where someone would otherwise have resorted to building a grok definition.


I don't have any first-hand experience, but it seems like grok [might not be that performant](http://ghost.frodux.in/logstash-grok-speeds/)?


Sequence seems very fast. What format do you output to? Have you fed the resulting data into anything like a database?



