Unfortunately, I've had similar experiences: incredibly manual processes and frighteningly long written procedures instead of scripted solutions.

My opinion: script it. Always. It doesn't matter if it's Ansible, bash, Puppet, Python, whatever; just make sure it's not an ad-hoc command. Test the script on a server that can be sacrificed. Keep testing until there isn't a single glitch left. Only then run it in production.

The point is to eliminate typos and to have a "log" of what was actually done.
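
To make the "log" part concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the name `run_logged`, the log format, and the `dry_run` flag are mine, not anything from a real tool. It runs a fixed list of commands in order, appends what ran and its exit code to a file, and stops at the first failure.

```python
import datetime
import shlex
import subprocess


def run_logged(commands, log_path, dry_run=False):
    """Run each command in order and append what ran, and how it ended, to a log.

    `commands` is a list of argument lists, e.g. [["ls", "-l"]].
    Returns the exit codes of the commands that ran (None for dry-run entries).
    """
    results = []
    with open(log_path, "a", encoding="utf-8") as log:
        for cmd in commands:
            stamp = datetime.datetime.now().isoformat(timespec="seconds")
            printable = shlex.join(cmd)
            if dry_run:
                # Record what would have run without touching the machine.
                log.write(f"{stamp} DRY-RUN {printable}\n")
                results.append(None)
                continue
            proc = subprocess.run(cmd, capture_output=True, text=True)
            log.write(f"{stamp} rc={proc.returncode} {printable}\n")
            results.append(proc.returncode)
            if proc.returncode != 0:
                log.write(f"{stamp} STDERR {proc.stderr.strip()}\n")
                break  # stop at the first failure instead of ploughing on
    return results
```

Running it with `dry_run=True` first, then for real on the sacrificial server, gives you two logs to compare before production ever sees the script.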




Oh, absolutely. Where something can be scripted, script it. Why? Because scripting is process development. You write something, validate it, and in doing so remove the human-error element.

For things that you can't script, you write abstracted processes that force the executor to write down the things that could cause Bad Things to happen, and use that writing down stage to verify that it's not going to cause a Bad Thing. That forces people to pause and consider what they're doing, which is 80% of the effort towards preventing these issues.

eg: Forcing YP to write down which database they were scorching would've triggered an 'oh fuck' moment. Having a process that dodged naming databases as 'db1' and 'db2' would've prevented it. etc. etc. etc.
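
A rough sketch of that "write it down first" step, in Python. The function names and the `db-primary` / `db-secondary` names in the usage note are hypothetical. The idea is that the prompt never shows the target's name, so the operator has to state from memory which database they think they are about to wipe, and the script compares that with what the session is actually pointed at.

```python
def confirm_target(actual_target, ask=input):
    """Ask the operator to name the target from memory and compare it to the real one."""
    typed = ask("Type the name of the database you intend to wipe: ").strip()
    return typed == actual_target


def guarded_wipe(actual_target, wipe, ask=input):
    """Run wipe(actual_target) only if the operator named that exact target."""
    if not confirm_target(actual_target, ask):
        # The mismatch is the "oh no" moment, delivered before any damage is done.
        print(f"Refusing: this session is connected to {actual_target!r}.")
        return False
    wipe(actual_target)
    return True
```

If the operator types `db-primary` while the session is really on `db-secondary`, the wipe never runs. Descriptive names make the comparison meaningful; with `db1` and `db2` a one-character slip still looks plausible.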


Which is what we did, obviously, and nobody is allowed to run anything manually while SSHed into production.

But there was tremendous organic resistance to that from the very same "culture of excellence in engineering". "How can we be sure it works if it's automated?" "It's safer to manually review the log." "How can you automate something like email tests or web tests?" "It's not worth automating this procedure; we only release this app once a year, and it only takes 5 hours." Expect to hear these kinds of claims when engineers have had the equation "menial work == diligence == excellence" pummeled into them for generations.


Also, script disaster recovery too. Script it when creating your backup procedure (not at the time of disaster), use the script to test your procedure, and do it often.

This way, when your script fails, you can recover quickly.
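
As a toy illustration of testing the procedure and not just writing it, here is a Python sketch of a restore drill for a single file. The names `backup`, `restore_drill`, and `sha256_of` are made up for this example, and a real database backup would go through the database's own dump and restore tooling instead of a file copy. The shape is the point: back up, restore somewhere disposable, and verify the restored copy matches.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large backups do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup(source, backup_dir):
    """Copy `source` into `backup_dir` and return the path of the backup copy."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / Path(source).name
    shutil.copy2(source, dest)
    return dest


def restore_drill(source, backup_dir):
    """Back up, restore into a scratch directory, and check the restored copy matches."""
    saved = backup(source, backup_dir)
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / saved.name
        shutil.copy2(saved, restored)  # stand-in for the real restore step
        return sha256_of(restored) == sha256_of(source)
```

Run something like this on a schedule and alert when it returns `False`; a backup that has never been restored is only a hope.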



