I went through a few iterations for logging, but I've now settled on the built-in GKE logging. Stdout logs from my containers are picked up by Kubernetes and forwarded to Stackdriver. Since it's just stdout, it doesn't create much lock-in. I use the Stackdriver dashboard for investigating recent logs and the BigQuery exporter for complex analysis. My stdout logs are JSON, so I can export extra metadata without relying on regexps for analysis - I use https://pypi.python.org/pypi/python-json-logger.
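For anyone curious, the setup is roughly this (a minimal sketch - the logger name and the extra fields like order_id are just placeholders for illustration):

    import logging
    import sys

    from pythonjsonlogger import jsonlogger

    # Log JSON to stdout; Kubernetes picks up stdout and ships it to Stackdriver.
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(jsonlogger.JsonFormatter())

    logger = logging.getLogger("myapp")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    # Anything passed via `extra` shows up as top-level JSON fields,
    # so you can query it later in BigQuery without regexps.
    logger.info("order created", extra={"order_id": 1234, "user_id": "u-42"})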
Just as a hint in case you're not aware of this: If you log errors in the format[1] expected by Stackdriver Error Reporting, your errors will automatically be picked up and grouped in that service as well.
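Roughly what that looks like with the json logger mentioned above - this is from memory, so double-check the exact field names against the docs at [1]; "myapp" and "1.0.0" are placeholders:

    import logging
    import sys
    import traceback

    from pythonjsonlogger import jsonlogger

    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(jsonlogger.JsonFormatter())
    logger = logging.getLogger("myapp")
    logger.addHandler(handler)

    def report_error(exc):
        # Error Reporting groups entries whose payload carries a serviceContext
        # and a message containing the stack trace.
        logger.error(
            "".join(traceback.format_exception(type(exc), exc, exc.__traceback__)),
            extra={"serviceContext": {"service": "myapp", "version": "1.0.0"}},
        )

    try:
        1 / 0
    except ZeroDivisionError as exc:
        report_error(exc)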
Since we are not on GKE, I want to be able to use k8s to forward logs to journald on my host machine, and that's broken. I think Google is doing a lot of handholding to get it working with Stackdriver.
This is the blocker for me. I can't switch to GKE yet because I use AWS PostgreSQL. But I want to use k8s :(
The K8S project I was involved in last year used AWS PostgreSQL too. At that time, figuring out how to have persistent data was too much. Further, the AWS EBS driver for K8S storage wasn't there, and PetSets had not come out yet. With PetSets out, I think figuring out how to run a datastore on K8S or GKE will be easier. (I had mentioned to my CTO that I didn't know how to do a persistent store on GKE; he pointed out the gcloud datastore; I told him that was not what I meant ;-)
The owner of the company ran out of money before I could add logging, but my plan was to ship logs out to something like Papertrail.
I was on AWS. I sidestepped the issue by using AWS RDS (PostgreSQL).
I had tried to get the nascent EBS stuff working, but when I realized I'd have to write a script to check whether an EBS volume was formatted with a filesystem before mounting it in K8S, I stopped. This might have been improved by now.
I probably wrote that support (or at least maintain it), and you shouldn't ever have needed to add a script to format your disk: you declare the filesystem you want and the volume comes up formatted. If you didn't open an issue before, please do so and I'll make double-sure it is now fixed (or point me to the issue if you already opened it!)
On the logging front, kube-up comes up with automatic logging via fluentd to an ElasticSearch cluster hosted in k8s itself. You can relatively easily replace that ES cluster with an AWS ES cluster (using a proxy to do the AWS authentication), or you can reconfigure fluentd to send to AWS ES directly. Or you can pretty easily set up something yourself using DaemonSets if you'd rather use something like Splunk, but I don't know if anyone has shared a config for that!
A big shortcoming of the current fluentd/ES setup is that it also predates PetSets, and so it still doesn't use persistent storage in kube-up. I'm trying to fix this in time for 1.4 though!
If you don't know about it, the sig-aws channel on the kubernetes slack is where the AWS folk tend to hang out and work through these snafus together - come join us :-)
@justinsb - based on the bug link I posted above, what do you think is the direction that k8s logging is going to take?
From what you wrote, it seems that lots of people consider logging in k8s to be a solved issue. I'm wondering, then, why there is a detailed spec for all the journald stuff, etc.
From my perspective - it would be amazing if k8s could manage and aggregate logs on the host machine. It's also a way of reducing complexity to get started: people starting with 1-2 node setups start with local logs before tackling the complexity of fluentd, etc.
I'm not particularly familiar with that GitHub issue. A lot of people in k8s are building some amazing things, but that doesn't mean that the base functionality isn't there today.
If you want logs to go into ElasticSearch, k8s does that today - you just write to stdout / stderr and it works. I don't love the way multi-line logs are not combined (the stack trace problem), but it works fine, and that's more an ElasticSearch/fluentd issue really. You'll likely want to replace the default ES configuration with either one backed by a PersistentVolume or an AWS ES cluster.
Could it be more efficient and more flexible? Very much so! Maybe in the future you'll be able to log to journald, or more probably be able to log to local files. I can't see a world in which you _won't_ be able to log to stdout/stderr. Maybe those streams are redirected to a local file in the "logs" area, but it should still just work.
If anything I'd say this issue has suffered from being too general, though some very specific plans are coming out of it. If writing to stdout/stderr and having it go to ElasticSearch via fluentd doesn't meet your requirements today, then you should open a more specific issue I think - it'll likely help the "big picture" issue along!
The only reason we didn't go ahead is the broken logging in k8s - https://github.com/kubernetes/kubernetes/issues/24677