Hacker News

Is SRE at Google just a maintenance team? What do they do, then?


Effectively yes. The main things SRE provides are (1) oncall support, (2) production-focused design consulting and (3) integration with other infrastructure. In practice, an engagement almost always provides (1), and the rest depend on how mature the SRE team is.

In a typical split, SWEs do the dev work for features and large reliability/scalability changes (which SRE helps prioritise appropriately), whereas the SRE team maintains the software around the project (config pipelines, monitoring, etc.) and might occasionally write some smaller reliability/scalability modifications.

But there can be a lot of variance. It's atypical, but some infrastructure-focused SRE teams maintain non-trivial software and sit within SRE because of their other responsibilities.


Google wrote a book about it. It's free to read. https://sre.google/books/


SRE is the first-responder team. They are on-call 24/7 (the team, not each person), perform systems and service monitoring, triage failures and mitigate outages.

That doesn't mean it's all handwork. I'm sure SREs at Google employ a boatload of automated event handling and custom response scripts. But "keeping the service up" requires different skills than "building a service", and Google chose to separate their Dev and Ops this way. As others said in this thread, if a service isn't up to SRE standards (in terms of monitoring, logging, or robustness), the SRE team won't accept it and the devs have to do their own Ops.
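
To make that acceptance rule concrete, here's a rough Python sketch. It is purely illustrative and not how Google actually does it: the ServiceReadiness class, its three boolean fields and the ops_owner function are names I made up, and real onboarding reviews are surely much more involved than three flags.

```python
from dataclasses import dataclass


@dataclass
class ServiceReadiness:
    """Hypothetical readiness record; the fields mirror the three criteria above."""
    name: str
    has_monitoring: bool
    has_logging: bool
    is_robust: bool


def missing_requirements(svc: ServiceReadiness) -> list[str]:
    """Return the criteria the service does not yet meet, in a fixed order."""
    checks = {
        "monitoring": svc.has_monitoring,
        "logging": svc.has_logging,
        "robustness": svc.is_robust,
    }
    return [criterion for criterion, ok in checks.items() if not ok]


def ops_owner(svc: ServiceReadiness) -> str:
    """SRE takes the service only if nothing is missing; otherwise Dev keeps Ops."""
    return "SRE" if not missing_requirements(svc) else "Dev"


if __name__ == "__main__":
    candidate = ServiceReadiness("checkout", has_monitoring=True,
                                 has_logging=False, is_robust=True)
    print(ops_owner(candidate), missing_requirements(candidate))
```

The point is just that the handoff is a gate: any unmet criterion keeps operations with the dev team until it's fixed.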



