This. Working for a company with a large audience means that your problem is often no longer getting people to pay attention. Your problem is getting it right, at scale, in multiple languages and locales. Internal tests, invite-only alphas, bucket testing, and "labs" features can alleviate this somewhat.
I always try to launch services with very detailed server monitoring: I want to know how much memory is being used and by what, how much I/O there is, and how much time the CPUs spend doing non-application work. I want to monitor response times, queue and dataset sizes, and anything else that helps me say whether we will need more servers, different servers, or which parts of the application we should port to amd64 assembly.
Munin and some custom plug-ins. It does not give me all the data I would like, but we found a bug the other day by looking at some graphs and noticing how one of them related to the rest.
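For anyone who has not written one, a custom Munin plug-in is just an executable that prints graph settings when called with `config` and prints current values otherwise. Here is a minimal sketch in Python, not one of the plug-ins mentioned above. The `queue_size()` function, the `queue` field name, and the graph title are all made up for illustration; you would swap in whatever your application actually exposes.

```python
#!/usr/bin/env python3
"""Minimal Munin plug-in sketch: reports a hypothetical job queue size."""
import sys


def queue_size():
    # Hypothetical data source (assumption): replace with a real lookup,
    # e.g. a database count or a message broker statistic.
    return 0


def config_lines():
    # Munin runs the plug-in with the argument "config" to learn
    # how the graph and its fields should be labelled.
    return [
        "graph_title Job queue size",
        "graph_vlabel jobs",
        "queue.label queued jobs",
    ]


def value_lines(size):
    # Without arguments, Munin expects one "<field>.value <number>"
    # line for every field declared in the config output.
    return ["queue.value %d" % size]


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        lines = config_lines()
    else:
        lines = value_lines(queue_size())
    print("\n".join(lines))


if __name__ == "__main__":
    main(sys.argv)
```

Make the file executable and link it into the directory your munin-node reads plug-ins from; the exact path depends on how Munin was installed. Keeping one small plug-in per metric makes it easier to line graphs up next to each other, which is how the bug above got spotted.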