
I've worked for several "mid-sized" web companies (75-200 employees) and this has always been an issue. Here's why:

1) The development team interacts directly with a single business unit, but the administrators serve the whole company. So say you're a developer who works on the SEM team. All the technical products and processes that manage the company's SEM campaigns, that's what you work on. You directly communicate with the product team and project management. If there's any bottleneck or issue on your projects as you work on them, the entire team is easily made aware of it. You tell your project manager, "Dan in analytics was supposed to get me the data and if I don't get it tomorrow then I'll be late," and your project manager talks to Dan or Dan's boss and you get your data... or you don't, but it's on Dan anyway.

However, the administrators serve the whole company, and they have entirely different sets of priorities and responsibilities that you have no visibility into. So you're ready to go to production and you submit your ticket with your release notes, which is pretty much just, "push these files from SVN and run this SQL on this DB." And it just sits there. A day passes. Then another day. Dan asks why your project's not available yet given that it's "done." You ask your project manager to find out what's going on; he says he's trying, but the sysadmin manager hasn't been around. You try dropping by the admins yourself (usually sequestered in some remote location in your building, if they're even on site at all!) and ask about your project and they snarl and say, "load on the consumer site has been up 12% all week, we have bigger problems" and mumble a bunch of other things about permissions and server racks and subnets and all you know is that whatever to-do list they're working off of, you're all the way at the bottom.

And then...

2) The production and development environments just aren't in sync and this never gets addressed. The sysadmins finally get to your ticket a week later. Finally, you think, this will go live. Then two hours later you get an e-mail from an admin named Stan that says, "Release failed, please fix the permissions on the directories your application creates and re-submit the ticket." And your ticket's closed. What the hell? Your application didn't even create any directories.

If you're lucky, Stan included a copy-paste of his shell session with the commands he used to export your code and whatever error it barfed. If not, you have to hunt Stan down and ask him what exactly the error was. Stan sighs, because they're just sooo swamped and sooo busy and the site load has been up 12% since last week, but he grudgingly does what you ask. Oh, yes, your application uses a directory which is owned by 'application1' in dev but owned by root in production. So you tell him to just chown the directory to application1, and he says submit a ticket. You blow up and say, "You're right there! Just type in the freaking chown command!" and he says he can't, you have to submit a ticket, and then submit another ticket for the release again. Then one of the other admins says, "Stan, foosball?" and Stan gets up to play some goddamn fucking foosball, while your project is now going on its second week of being late.
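A dumb pre-release check would catch exactly this kind of drift before anyone opens a ticket. Here's a minimal sketch in Python; the manifest, the paths, and the 'application1' user are made up for illustration, not taken from any real setup:

    #!/usr/bin/env python3
    # Sketch: verify that application directories have the expected owner
    # before a release is attempted. Everything below is hypothetical.
    import os
    import pwd
    import sys

    # Hypothetical manifest: directory -> user that should own it in every environment.
    EXPECTED_OWNERS = {
        "/var/app/application1/uploads": "application1",
        "/var/app/application1/cache": "application1",
    }

    def check_ownership(manifest):
        problems = []
        for path, expected_user in manifest.items():
            try:
                actual_uid = os.stat(path).st_uid
            except FileNotFoundError:
                problems.append(path + ": missing")
                continue
            actual_user = pwd.getpwuid(actual_uid).pw_name
            if actual_user != expected_user:
                problems.append("%s: owned by %s, expected %s"
                                % (path, actual_user, expected_user))
        return problems

    if __name__ == "__main__":
        issues = check_ownership(EXPECTED_OWNERS)
        for issue in issues:
            print(issue)
        sys.exit(1 if issues else 0)

Run the same check on dev and on production and the mismatch shows up as a failing check, rather than as a closed ticket a week later.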

The next time you meet with your boss, you mention how much easier things would be if you had access to production, and your boss sighs and says it's just not happening. So then you talk to him about how you need better integration with the sysadmin team, and how it's critical to ensure the dev and production environments are identical, and your boss agrees to talk to Stan's boss, and ultimately nothing gets done and you just resent the lack of control over your own projects.




Being the admin, I'm pretty sensitive to:

1) People trying to bump the priority of their own tickets by showing up in my cube. This is inevitably at the expense of other, more polite, users of the system. Prioritizing requests is one of the toughest parts of an operations job - if you think I'm doing it all wrong, you can talk to my manager.

2) People who think that "open a ticket" means "go to hell". It actually means "We need to document all changes made to production and we need to prioritize requests. Please help us do our job by using the system designed for that. The request will be done within 5 minutes if it's urgent/important/really simple. But we still need a ticket to document it and follow up if it causes issues in the future."

On the other hand, if your boss is competent, he can raise the issue of admin availability and priorities in all kinds of management meetings, hopefully resulting in better priority for developer requests and maybe even hiring a few more admins.


I don't work in such an environment, so what is the proper response to a ticket that took a week to process and is closed with no useful detail? Open another ticket to ask for more detail on why the previous ticket was closed?


Make friends with some sysadmins. When the gears aren't spinning, ping them on IM to find out why. Usually there's a very good reason for that.

Sysadmins are your friends if you treat them well, like human beings, and you respect their work. If you treat them like a ticket processing machine, or worse, like an obstacle, you'll struggle.

I learned this from 4 years of deploying apps at UBS (one of the world's largest banks), which had exactly this kind of ticketing system (called GCMS, iirc). I was IRC pals with 3 different sysadmins on 3 different continents (yes, there was an internal IRC system), and so whenever things got stuck in the pipes, I could ask them to have a look in the system and see what was going on. As for the other sysadmins, I always treated them courteously, used the ticket system, and got the necessary approvals whenever I could.

The one downside of this is I ended up being stuck managing lots of deployments, because I was good at it.


I think this goes both ways. I understand that sysadmins are human and get busy and/or make mistakes. Most of the time, though, nagging developers could be placated by sysadmins being a bit more proactive in communicating with people.

After all, when a ticket has been open for a week with no response, most people will start to get a bit frustrated and take it out on the sysadmin. A simple "Hey, know you've been waiting on this a while, but I have X, Y, and Z to take care of before I can get to it" will do wonders for sysadmin/developer relations.


It goes a bit deeper than that - a ticketing system by itself is just a tool, and isn't a process or a solution.

If the sysadmins just threw up a ticketing system and said "put your stuff in there and we'll get to it" - then they can't expect things to get much better than email. It's a start, but only a small one.

They need to put proper processes, SLAs (even if they're approximate), and review procedures around the system to make sure it's meeting the needs of the rest of the organisation.


At my last company we had exactly these problems with a central IT service. In the end we got so frustrated with the poor service that we moved to using a 3rd party for hosting and admin. We never looked back. We didn't have a ticketing system; we spoke to them directly in IRC. They also took the attitude that any request that might be repeated should be automated, so that in the future we either wouldn't have to bother asking or could easily do it ourselves. Generally this automation was put in place at the time of the request. It should be noted that we did have full access to the production environments and that we did our own QA. We found that by using a strict process we were able to keep quality very high; in the two years that we had this setup I can't recall a single occasion where we experienced a service failure. Of course there were bugs etc., but we were extremely happy with their low incidence rate.

On another note, I'm now in one of those startups without any sysadmins. We use Heroku with a handful of addons. This setup has quite frankly blown me away. You can have the best of everything: monitoring, backups, cron jobs, error tracking, memcached servers, etc., with no admin required beyond switching the services on. Having used such a setup, I see the need for sysadmins as far diminished.


Your post is both depressing and familiar. I think the reasonable long-term solution is for each dev team to have its own admin as well. That way you have "your guy" who knows what's going on with the code being developed/deployed.

I'm not sure there would be enough work for a full-time admin per team, so maybe one admin could be shared among two or more dev teams.


That's how it starts. The scenario goes something like this: Central IT is not responding in a "timely" manner. Bosses meet, but no common ground is reached. <INSERT BAD EVENT HERE> Developers complain to Boss that they are not to blame. Boss goes to his Division Boss. Higher level boss disagreement. Division Boss gets his own admin and servers. Central IT is no longer the only IT as Divisions now have IT. < TIME PASSES > New plan to centralize all IT. < CYCLE REPEATS >


One of the problems with mid-sized companies is that they're big enough to need separation of responsibilities and processes, but not quite big enough to have the time, resources, or manpower to put those processes, and the tools that enforce them, in place. And that's if they even recognize the issue in the first place.

By the way, the answer in the situation you've detailed is to include the administrators within the project and assign the installation task to them within the project plan. Even if it just seems like a "bookkeeping" change (and it is), it serves the purpose of highlighting who is responsible.

Also, if an install is critical enough, don't be afraid to escalate the issue as high as it needs to be up the management chain. If the sysadmin manager isn't available/responsive, then go to their manager (you might need to go up in parallel on your own management chain first). If you're hunting down the sysadmins on your own, it basically means that management isn't doing their jobs and/or the project really isn't a priority to them.


All of these issues could be fixed with better process, better people (that grumbling about 12% extra load sounds like bullshit; it's not like they're serving the pages themselves), and better alignment of the dev/ops teams.

These problems are not intractable. What you've described is a giant management failure.


...better process

more processes? Sure. But better? hardly

better people

I've never seen a company I've worked for admit a problem is because they've employed a bunch of twats; problems are always due to an insufficient number of processes. Solution? More processes


more processes? Sure. But better? hardly

I'm not sure how accurate or exaggerated the OP's examples are, but here is how I'd fix a number of them:

they snarl and say, "load on the consumer site has been up 12% all week, we have bigger problems" and mumble a bunch of other things about permissions and server racks and subnets and all you know is that whatever to-do list they're working off of, you're all the way at the bottom.

Have a regularly scheduled release train so everyone in the company knows when new releases can go out (e.g. every Friday night) and can plan ahead accordingly. If a release misses the train, it waits for the next one to leave the station. This lets all groups plan ahead, make sure they have resources available, etc.

The production and development environments just aren't in sync and this never gets addressed.

This is pretty inexcusable; it's a matter of laziness or unwillingness to spend the time to make the environment better. You can clearly see the effects of this bad practice when releases fail in production and have to be retried several times. If the production environment is so complex that it can't easily be mirrored in development, then at the very least the release should be tested in a staging environment (not QA) which does mirror production - so you can test the release process itself.
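To make "testing the release process itself" concrete, here's a minimal sketch that replays the same scripted steps against staging before production; the host names and the helper scripts (export_from_svn.sh and friends) are placeholders, not anyone's real tooling:

    #!/usr/bin/env python3
    # Sketch: rehearse the release on staging, and only touch production
    # if the rehearsal succeeds. All step scripts and hosts are hypothetical.
    import subprocess
    import sys

    # Hypothetical step list; the same steps are replayed per environment.
    RELEASE_STEPS = [
        ["./export_from_svn.sh"],
        ["./run_migrations.sh"],
        ["./restart_app.sh"],
    ]

    def release(host):
        """Run every release step against one host; stop at the first failure."""
        for step in RELEASE_STEPS:
            cmd = step + [host]
            result = subprocess.run(cmd)
            if result.returncode != 0:
                print("step %r failed on %s" % (cmd, host))
                return False
        return True

    if __name__ == "__main__":
        # Rehearse on staging first; never touch production if staging breaks.
        if not release("staging.example.com"):
            sys.exit("release failed on staging - production untouched")
        if not release("prod.example.com"):
            sys.exit("release failed on production")

If staging has the same ownership or permission surprise, the release fails there and production is never touched, which is the whole point of a staging environment that actually mirrors production.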

The fundamental problem that nhashem seems to be describing is admins who don't seem to care much whether new software releases reach production in a timely fashion. While I understand that the admins have a whole lot of other areas of responsibility, the entire reason someone wants to release the software in the first place is that there is some business value in the release/new version/feature. Not allowing that release to reach customers as fast as possible reduces its value. If the company is not focused on getting value in front of the customer as quickly as possible, then why is it even spending the time developing the software/changes in the first place?

This is why it's a management failure - a failure to plan, to get teams to work together, and to cut things down to the bottom line of delivering value to the customer as rapidly as possible while still maintaining stability and other responsibilities. Companies that can't do these things will have their lunch eaten by competitors who can.



