Oh maaaaan the old graylog2 site had probably my /favorite/ branding in all of software. The heading was "Manage your logs in the dark and have lasers going and make it look like you're from space."
"Users who tested the beta and release candidate versions of v1.0 reported huge improvements in performance and stability. Some of them were unable to crash the system no matter how hard they tried."
This statement from the page makes me conclude that others were indeed able to crash the system, perhaps without even trying too hard.
My guess is that the statement should be reworded.
I was able to crash the system without even trying.
After waiting for a roughly 500 MB Docker image to download, running the container halted the host OS X machine while it looped with errors (failed API connections, failure to load SIGAR, ...).
The screenshots looked great, but the first-steps experience was a deal breaker.
We've tested the Docker image on Mac OS X with boot2docker but of course there are always things that can go wrong.
It would be awesome if you could help us improve the Docker image by creating an issue at https://github.com/Graylog2/graylog2-images/issues with the error messages you've encountered when starting the container.
Oh, we got both kinds. We got country and western. ;-)
Seriously, the Docker image and the VM images are there to help people get started and quickly try out Graylog. We also offer regular OS packages in DEB and RPM formats (supporting Debian 7, Ubuntu 12.04 and 14.04, and CentOS 6) and a quick setup application (for a simple demo setup of Graylog and its dependencies). So choose your poison and be happy!
That is a very good point. We are in the process of changing that. :)
The intended meaning is that with v1.0 it is finally very hard to crash the system by overloading it. Previous versions were easier to crash, for example by filling memory during load spikes.
Hurrah - good to see that the Elasticsearch dependency is no longer locked to a rather old version (that bit me a few times when setting up a cluster last year). I'm still a bit wary about having to spin up MongoDB for it as well, though…
The MongoDB requirement gives me pause, too. I've wanted to give ToroDB (https://github.com/torodb/torodb) a try for a while now, and I think I'm going to see if Graylog will talk to it for grins.
So is this in the vein of Splunk, say? My company also uses logentries, and has been quite happy with it, but a hosted open-source tool would be great.
We don't want to use ES too much as Graylog should still work when ES has some problems. It's currently the default output but that may change in the future for special use cases.
That being said, we're aware that many people don't like the dependency on MongoDB and we'll work on that.
Is Graylog effectively a single product that does the whole thing?
My dismay with the ELK stack is you are effectively juggling 3 separate products with different release cycles.
Yes! We have put a lot of effort into making this one thing.
The graylog-web-interface connects to the graylog-server REST APIs and that is it. You can manage and monitor the whole system from the graylog-web-interface.
So there's Graylog, then there's Graylog2, then there's the new-and-improved Graylog.... If someone forks it will there be a new-and-improved Graylog2?
As Men in Black 3 wasn't that great and Star Wars: Episode 3 also dragged in places (don't get me started about Lethal Weapon 3!), we tried to avoid that. Let's see how Ghostbusters 3 works out and maybe (but just maybe) we'll think about releasing Graylog3. ;-)
Their search interface and field extraction let you perform complex queries on your logs. We used the following query to identify users that were having a bad time on the site, by counting slow queries by user and adding them up:
"slow query" source="/var/log/application.log" | rex field=_raw "in (?<time>\\d+)ms" | search time>2000 | rex field=_raw "User: (?<login>[^\\s]+) " | top login
You can do much more complex queries--it's better to think of Splunk as a temporal database than a log aggregator.
Does anyone know if Graylog or other logging solutions can do the same thing? Splunk is amazing, but it's annoying to manage the infrastructure and it's crazy expensive.
I've tried and used SumoLogic. It may not be appropriate for some companies because everything is sent to and analyzed on their cloud infrastructure, but it is a very good product.
A log management tool: You send all your local log messages/files to a central place (Graylog) and have unified searching, filtering, monitoring, alerting, forwarding, ...
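To make "send all your local log messages to a central place" concrete: Graylog's native format is GELF, a small JSON payload typically sent over UDP. This is an illustrative sketch, not the official client library; the hostname, port, and custom field names are placeholders (Graylog's GELF UDP input listens on 12201 by default).

```python
import json
import socket
import zlib

def build_gelf(short_message, **extra):
    # GELF 1.1 payload: "version", "host", and "short_message" are
    # required by the spec; custom fields must be prefixed with "_".
    payload = {
        "version": "1.1",
        "host": socket.gethostname(),
        "short_message": short_message,
    }
    payload.update({"_" + key: value for key, value in extra.items()})
    # Graylog's GELF UDP input accepts zlib-compressed datagrams.
    return zlib.compress(json.dumps(payload).encode("utf-8"))

def send_gelf(short_message, host="graylog.example.com", port=12201, **extra):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(build_gelf(short_message, **extra), (host, port))
    finally:
        sock.close()
```

Once messages arrive with structured fields like `_user_id`, the unified searching and filtering happens on those fields in the web interface.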
Call me stupid, but I've always learned better from examples than descriptions. So it's a log management tool. I can have my app write logs or files into Graylog, then I can handle the logs/files within Graylog much more easily than, say, if I were to write my own code to make some kind of sense of the log data. Am I understanding this correctly?
So I can maybe tell easily where users had problems with my app, or I can even use it as some kind of analytics program for my app?
I've visited the site and read what I could before, but without knowing what it does, it was pretty cumbersome to read through the entire website to figure out what exactly it does.
I was initially drawn to the fact that it took five and a half years to get to 1.0. God knows how hard it must've been to work on it and perfect it all these years; heck, spending five months on an app is tough enough. I would've given up reading about it if it hadn't been for the five-years thing. I just think that, despite all it's capable of, if the descriptions were more in layman's terms it could reach a wider audience.
It actually turns out that this is exactly what I was looking for since I'm about to have my app out, and if I hadn't asked the question, I would've just ignored it altogether.
While I'm very thankful for the additional explanations, I still think that a nice illustration or a simple video showing how it can solve real-world problems would help people understand it better.
source:your-app AND http_status_code:>=500
This would give you a list of all HTTP 500s of "your-app".
Or:
user_id:12346 AND http_status_code:>=500
... to see all errors a user caused after customer care called you, reporting an error the user got. Stacktrace is there immediately without having to find the right log file on the right web server.
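Since the web interface talks to the graylog-server REST API, the same kind of search can be issued programmatically. A rough sketch of building such a request; the `/search/universal/relative` endpoint path is an assumption based on the 1.x API and may differ between versions:

```python
from urllib.parse import urlencode

def build_search_url(base_url, query, range_seconds=3600):
    # Relative-time search: "range" is how many seconds back to look.
    # Endpoint path assumed from the Graylog 1.x REST API.
    params = urlencode({"query": query, "range": range_seconds})
    return f"{base_url}/search/universal/relative?{params}"

url = build_search_url("http://graylog.example.com:12900",
                       "user_id:12346 AND http_status_code:>=500")
```

You would then fetch that URL with your Graylog credentials and get the matching messages back as JSON.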
I've used SumoLogic to do log management at work, and it's been helpful in specific ways:
- We have a server that runs dozens of websites. When the load spikes, we can quickly get a count of recent log entries for all the sites. The site with the anomalously high number of entries is where we start troubleshooting. This could also be automated as "anomaly detection" that sends us alerts, but we haven't configured that yet since spikes happen rarely.
- One of our servers got hacked. Running log searches helped us pinpoint when it happened, which site was "patient zero," and how the bad guys got in.
- We launched a new site and forgot the Google Analytics code, which we didn't catch right away. We were able to run a report from the server logs to approximate the traffic data that GA missed.
Having all the logs feed into a centralized service made it easier and faster to find the information we needed across a bunch of websites, as opposed to working directly with Apache log files.
We looked at using ELK (Elasticsearch, Logstash, Kibana) to do the same thing "for free," but decided we did not want to manage a complex software stack to help manage a complex software stack. :-) We'll take a look at Graylog, I'm sure, but there is something to be said for paying for this as a service--one less thing to worry about.
Sentry is for monitoring exceptions and errors on your platform. While Graylog can do that, too (without the aggregation though) it is also capable of monitoring any log messages out there and not only errors.
I looked around a little bit on the web page but the answer to my two questions was not immediately apparent to me. (Note: I might just be blind)
1. Can I set up a second Graylog Server as a failover with automated recovery and log syncing when both nodes come back up?
2. Can I run it on cheap embedded platforms like a Raspberry Pi or BeagleBoard?
Yes, you can set up as many Graylog servers as you want and put them behind a load balancer or integrate any kind of failover you want.
We do not recommend running it on extremely small platforms like a Raspberry Pi, but a very small VM (we have OVAs and other virtual appliances ready) can already process a lot of messages.
Only if your load balancer supports UDP, which most don't. You'll most likely need to use DNS load balancing in this case unless you're sending GELF messages with TCP.
I've used it and have colleagues that have used it at a few companies. To get the most out of it, I would suggest setting up some sort of tagged logging. In Ruby, there's log4r + log4r-gelf to add additional context (e.g., the Rails request GUID), so you can easily and quickly link related messages. If you lack that flexibility, Graylog's extractors work rather nicely too; it's just a lot harder to manage, IMHO.
It's really helpful being able to pull up all requests for a single user across the cluster. Or track the status of all components of a Sidekiq job. Or see which API requests are most popular. Etc.
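The setup above is Ruby-specific (log4r + log4r-gelf), but the underlying idea is language-neutral: attach a shared request ID to every record so related messages can be linked in Graylog. A minimal sketch of the same pattern using Python's stdlib logging module; the filter class and field names are illustrative:

```python
import logging

class RequestContextFilter(logging.Filter):
    """Attach a per-request ID to every log record so that related
    messages can be linked together later in Graylog."""

    def __init__(self, request_id):
        super().__init__()
        self.request_id = request_id

    def filter(self, record):
        # Stamp the record; a GELF handler would ship this as a
        # custom field alongside the message.
        record.request_id = self.request_id
        return True

logger = logging.getLogger("app")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("request_id=%(request_id)s %(message)s"))
logger.addHandler(handler)
logger.addFilter(RequestContextFilter("req-42"))
logger.setLevel(logging.INFO)

logger.info("slow query took 2300ms")
```

Searching for `request_id:req-42` then pulls up every message from that request, across all servers that shipped logs to Graylog.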
Yes. We've been using it in production for about 1.5 years to gather all of our syslog and our application logs. No problems at all, and we're up to about 20 shards of data now.
Honestly, I have no idea what we did before we started using it. It's an absolutely essential tool.
Yep! We do about 5k events/sec and it holds up pretty well as long as you throw enough memory at it (it's a Java app). Make sure you have a cluster, though, if you're going to be running at those kinds of levels. Feel free to respond if you have any other questions. I'll be testing out the 1.0 release myself today.
The solution seems neat. It'd be nice to know who is using it in production, how big is their environment, etc. I couldn't find much information online.
We're receiving roughly 200 msgs/s in our environment. We have it set up in an active/active environment across two data centers. It's been rock solid for us. It takes a little bit of time to get certain applications to send messages properly, but once you do, it's pure gold.
We are using NXLog on our Windows servers to send event log messages in GELF format. This allows us to truly delve in and search event logs so much easier than what we normally would be able to in Windows.
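For anyone curious, a sketch of what that NXLog setup can look like, based on the NXLog community edition's GELF extension; the host and port are placeholders, and directive details may vary by NXLog version:

```
# Load the GELF output writer
<Extension gelf>
    Module      xm_gelf
</Extension>

# Read the Windows event log
<Input eventlog>
    Module      im_msvistalog
</Input>

# Ship events to Graylog's GELF UDP input
<Output graylog>
    Module      om_udp
    Host        graylog.example.com
    Port        12201
    OutputType  GELF
</Output>

<Route eventlog_to_graylog>
    Path        eventlog => graylog
</Route>
```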
Can I just have one thing that does event logs and SNMP and uptime and alerting for internal and external devices, in a secure way, all in one package? I'm so tired of having 30 different pieces of software installed just to accomplish 3 things.
https://web.archive.org/web/20130302051347/http://graylog2.o...
Sad to see it's gone all "professional" now :(
The web service had some hilarious bits too. Does it still say "enraging gorillas... | mounting party hats" etc when you log in?