
Without a doubt, a future generation of executives will revisit and reverse the decision to rent all information infrastructure, but that will likely be many, many years down the road. In the meantime, the current generation of executives who made this decision will look very smart for saving the company lots of money for a good number of years. And they stand to benefit personally from it. They're doing the rational thing!


I used to work with a very smart man who I'm sure was some kind of secret genius. He was that sort of tech gofer: hardware, software, didn't matter, if there was a problem he'd solve it. The sort of guy you'd see carrying a thick-ass SQL book around because he 'needed to learn it' to solve just one little problem. He built entire solutions for the company I worked at in his spare time, one of which the company once tried to sell for 500k, and at a previous company I heard he figured out a way for the pain mixing machines to save on paint or recycle it or something, saving them 1.3 Mil a year. When Raspberry Pis first came out he was one of the first people I saw tinkering with them, and he was in his 50s doing it just for fun; I think he ended up using one to open and close his garage door from work or something, just to scare his wife.

That sort of guy. Well, he once told me something about executives and upper managers working for corporations that I have never forgotten. He said to me, and of course I am paraphrasing:

"Change gives the illusion of progress". I asked him what he meant and he responded with something to the effect of "They have the habit of changing big things every 5-10 years on purpose to make it look like they are productive, and to justify their own roles, one guy will come in and 'cut costs', the new guy after him will 'invest'".


"A new CEO was hired to take over a struggling company. The CEO who was stepping down met with him privately and presented him with three numbered envelopes. “Open these if you run into serious trouble,” he said.

Well, six months later sales and profits were still way down and the new CEO was catching a lot of heat. He began to panic but then he remembered the envelopes. He went to his drawer and took out the first envelope. The message read, “Blame your predecessor.” The new CEO called a press conference and explained that the previous CEO had left him with a real mess and it was taking a bit longer to clean it up than expected, but everything was on the right track. Satisfied with his comments, the press – and Wall Street – responded positively.

Another year went by and the company continued to struggle. Having learned from his previous experience, the CEO quickly opened the second envelope. The message read, “Reorganize.” So he fired key people, consolidated divisions and cut costs everywhere he could. This he did and Wall Street, and the press, applauded his efforts.

Another year passed and the company was still short on sales and profits. The CEO would have to figure out how to get through another tough earnings call. The CEO went to his office, closed the door and opened the third envelope. The message began, “Prepare three envelopes...” "

https://duckduckgo.com/?t=ffab&q=the+three+envelopes&ia=web


Is this a quine where the processor is a CEO?


I think it's more like malware. A virus that spreads from CEO to CEO.


CEO is the virus that spreads from one organisation to another


This is how governments run too, but they add a few more envelopes as they have more avenues to pursue. War, plague, etc.


I'm 53, and I've worked for 3 Fortune 250s. Can confirm. I've seen this happen over and over again. Senior management makes some broad pronouncements, the mid-level lieutenants have meetings with consultants, and then they implement new, expensive projects that "will surely fix 'it' this time." Five to ten years later, after the dust settles, we figure out that we've strapped YET ANOTHER LAYER of technical debt on top of everything else, and things are worse than ever. But at the height of the project, when everything is still rosy, the managers in charge update their resumes and hit the bricks.

IN PARTICULAR, at one Fortune 250 (which no longer exists), we implemented OneWorld to replace the mainframe. After 7 years, we still had the mainframe, AND a badly-implemented version of OneWorld. Then the company got bought by another Fortune 250, moves were made to "commonize" the IT systems, and then the parent company sold everything. But I'm positive that everyone involved in the project to retire the mainframe made the project look very successful on their resumes.


8-10 years is as long as senior executives last; if they don't make a big change then there is no way to take credit for their vision. Even if the company had a "perfect" org chart (as if such a thing is possible), they need to make changes, otherwise someone will say that the old org chart is the cause of success and they as a leader were not worth anything.

I don't know if the above fear would actually play out; nobody is willing to not make changes to find out.


I think we are giving these people too much credit. Being the head of the organization is like inheriting someone else's filing system. The only way for them to actually understand wtf is going on at the company is to reorganize some things in a way that makes sense to them.

High-level management is fundamentally hard to stay on top of over time. Thinking it's easy is like thinking chess is easy because you can theoretically know every move your opponent might make. There are so many moving parts to an organization that having visibility into them isn't enough to perform well. They have to influence most of the business pretty indirectly. If changing things gives you the confidence necessary to keep things running... that's what you're going to do. Everything is a gamble, so doing nothing is kind of unacceptable leadership behavior unless they are actively taking up the mantle left by previous management and understand it very well already.


I would compare this to asking a dev team to support a big code base without embarking on a major redesign or re-engineering.

It's hard to keep a good (confident, ambitious) team from re-engineering. All the same dynamics apply: In your mind the disadvantages of change are small because you don't know them, but the advantages are large because you planned them. Making change gives you more control over your fate because you are executing your own plan as opposed to staying the course. Finally, how do you keep people motivated to show up every morning if you don't have a vision for change in the future?

I don't think it's that different for managers and engineers. There's a lot pushing people to try something, even if the objective odds of success aren't great.


That's not the only way, but "come in and change things to establish dominance" is a commonly taught business school chant.

Management are contemporary clergy, spewing high minded ephemera, only to go home unable to point at anything net new left behind by their effort.

I grew up in farm land; we had no middle managers. Somehow food still got grown, harvested, and sold; somehow a Linux kernel and other wildly popular open source exists without them.

Post-WW industrialism needs to wind down. Militant-minded people came home and forced their PTSD on workers. We spend a lot of resources equipping people to output nothing in deference to traditional economic memes. The America of past decades necessarily built itself into a production powerhouse to resupply a destroyed world. Such memes are outdated given automation and unsustainable given real material costs.


Yes, but they ultimately live materially better lives because of their position, so you can't hand-wave away criticism of the value, or lack thereof, of their actions, especially when those actions can have a negative effect on those below them and even to the external environment.


Criticism is fair game. It just usually doesn't account for the reality of what they are doing, and blames them for not directly controlling things that in reality were outside their control. Taking responsibility for the failings of the company is part of the job description, so by all means hold them responsible. I am just saying all of that is tangential to the root of the problem - which is that no matter how much someone gets paid, they are still human.

It's an area where it is pretty tough to make meaningful criticism. It's kind of a show-don't-tell type situation, imo. If upper management seems under-qualified and overpaid to you, then maybe you have a calling to go perform better and get paid more.

Company management is in underdeveloped game-theory territory. Sure, we can isolate one part of their job and describe how they are failing at it, but we don't know the trade-offs being made on a daily basis with their time and focus, a lot of which is going to be company secrets, if it even leaves their personal thoughts. Any criticism that comes down to saying "they should have done more" is likely out of touch, for example, unless you can prove they were actually being lazy... which is usually not the case, since they are often workaholics (ime). But we usually can't tell if something is a good move or not until it plays out on the market. So making criticisms based on hindsight is weak, as is making criticism that lacks the full picture of the organization's goals and the time / energy they actually have on hand to accomplish them.


What parts of these generalized arguments have anything to do with CEOs? As a mental experiment, put them into the mouth of someone making excuses for why a crew of expensive painters did a horrible job painting your apartment.


So true. Reading their overgeneralized screed left me thinking that if CEOs are really that useless, anyone could be a CEO and their position isn't special, which sort of torpedoes the entire point.


My point is that properly criticizing a CEO is exhausting so it is rarely done properly.

CEO is a very general job role. I don't understand your point with the painters. That would be an operations issue, so I actually wouldn't criticize the CEO of the painting company at all for it. Thanks to him/her, I was able to contact a painting company; they showed up, they painted the apartment, and they left. I would think the operations team (the painters) deserve to be fired and held accountable for claiming they know how to paint a room when they clearly didn't. Nothing about their job is general or consists of trade-offs. If I said "you only have 10 minutes to paint the room and then leave" then yeah, it might come out like shit and the "excuses" would be valid - which is the kind of time pressure CEOs are often under with respect to things they are actually doing on a day-to-day basis.

I would hold the CEO responsible with respect to resolving the issue and refunding me, etc., since the CEO role is to be responsible for the outcomes of the company... but he's not the one painting the room. Just like at any company, the CEO does not literally run the company. If service is poor, it is usually because people are finding their way past hiring filters to get jobs they aren't qualified for. Let's not forget that people everywhere are often advised to lie their way into employment: fake it till you make it, baffle them with bullshit, reword your resume to sound more impressive, etc. These people line the mechanisms that CEOs use to accomplish anything at any company.

Of course, the CEO position is no exception to this, and I am not saying it is literally impossible to build a case against a CEO. I am saying the case needs to fully encompass the position, or else you're likely assigning criticism to the CEO for some culmination of lower-level operational incompetence that they simply failed to overcome. If a director over-promises to the CEO and the CEO signs off on the basis of trust in the director, then when the bar is not met later, of course the CEO will be held responsible, but the reality of fault sits with the director, or maybe a subordinate of the director who convinced the director that the over-promise was doable. You then have to get into the weeds of whether there were signs the CEO should have seen not to trust the director, or whether they had reason to overrule the director's approach, etc. You then have to do similar things across all areas of the company to derive a valid criticism that the CEO is the common denominator in it all.

Leadership is significantly harder to criticize appropriately than operations. Personally, I would like to stop reading meaningless criticisms from people who want to complain and be heard but don't want to do the work necessary to make a valid complaint.


I think you misunderstood the previous commenter a bit. They were not saying to look at the painting example from the standpoint of the CEO; they were saying to put the excuses in the mouths of some painters. Also, there's quite a bit of depth and generalization in painting. There are many different types of paint that are better for certain tasks, e.g. eggshell vs satin finish. Painting walls is different than painting ceilings; then you add in moving furniture, some walls have edging, some walls will be multiple colors, plaster vs drywall vs wood, etc. That's just painting. Let's say the company they work for has a CEO/manager that is demanding more jobs completed; now they have to deal with someone telling them "do it in one day" and all the compromises that must be made to do so. Almost all fields have a lot of depth and generalizations.

So if you had a horribly painted room that you just paid extremely well to have completed, and the painters came up to you and gave you a laundry list of reasons that they failed... would you hold off on criticizing them until you had a complete understanding of what it takes to be a painter?


Yes, if I claim that the painters did a bad job and they gave me a laundry list of professional reasons for it, I would consider those reasons before criticizing. Would you not?

Trying to use painting as an analogous situation like that isn't transferable to the point I am making, though. Putting the excuses in their mouths doesn't even make sense. We are presupposing that the painters did a horrible job... while discussing how to decide whether or not a CEO did a horrible job. The only reason you know the painters did a bad job is because we are saying they did. The only reason we can use painting as an example is because most people can imagine a terrible paint job, i.e. we do have a full-scope understanding of what it takes to be a painter. I am saying it is much harder to imagine the role of a CEO and what good results would look like than it is for a painter.

Maybe my wording was fuzzy, but I am not saying you need a complete understanding of the CEO's role - it does, however, need to be of full scope. I see those read as near synonymous, so in other words: it may be infeasible to account for the total depth of their role, but at the very least the entire breadth of the role should be looked at. If you default to "I gave you a lot of money to make it happen so it should be perfect" type logic, you're just being a "karen". The cost of something has nothing to do with the results, directly. Money needs to be converted into something that helps the work, and in that process we are all still limited by reality: diminishing returns, supply chains, quality of communication, availability of resources, etc. A CEO is at the focal point of all of this, and is human. Whether they get paid nothing or everything doesn't change how effective they can reasonably be.

But they do deserve criticism. It just needs a lot of work to do it right. You have to provide some sort of evidence that across all scopes of work the trade-offs do not make sense. Maybe the CEO sacrifices on every front in order to provide the fastest service in the business and is successful in that. If you leave speed of delivery out of your criticism, it becomes a meaningless criticism. "They charge a lot for poor quality." "These painters did a terrible job... (even though I called them this morning, and they were done by lunch, which allowed me to do a walk-through with a potential tenant)."

All I'm really trying to emphasize is that we absolutely can criticize a CEO, but if you don't do it properly it is very easily washed away by the many unknowns of the position. However, if it is done right, it would be very damning, as they can't default to company policy or directives from above as a scapegoat, since they are the ones creating such things.


> they ultimately live materially better lives because of their position

This is a bullshit reason based on jealousy, not reason.

> when those actions can have a negative effect on those below them and even to the external environment

This is the real reason it’s fair to be very critical of their maneuvers.


>This is a bullshit reason based on jealousy, not reason.

Reality is not zero sum, but neither are resources infinite. Is it really bullshit to critique more thoroughly that to which more of the finite resources are dedicated?


>Is it really bullshit to critique more thoroughly that to which more of the finite resources are dedicated?

This kind of aligns with my point, tbh. We give $10 critiques to million-dollar positions as if they hold weight.


> This is a bullshit reason based on jealousy, not reason.

If this is truly BS, then allow lead developers to write checks for their whole team using the company's bank account.

They hold power in the organisation; they can increase their own remuneration in a way that rank-and-file staff cannot. Executive compensation has skyrocketed in the past 20 years.

If their management is ineffective, then they don't deserve top compensation.


I would expect it is just as much a hedge in case things go wrong. When your job is to steer the course of a company, it won't look good if you crash and your hands weren't even on the wheel.


counter-examples: all of FANG/MAGA, even when you exclude founders

Satya: 1992, CEO in 2014
Tim: 1998, CEO in ~2009
Sundar: 2004, CEO in 2015
Andy: 1997, CEO in 2021
Ted: 2000, CEO in 2020

They were all senior executives well before assuming their CEO roles.


Oh man... Dell is terrible about this. Not sure what the policy is anymore (esp. since they went private), but to get promoted you had to implement a significant cost-savings project, which ironically led to multiple implementations and reversals of policies... they all showed a cost savings, but it depended on your perspective.


I don't know anything about Dell's promotion criteria, but their consumer ordering process has to be one of the most hostile ever — I'm guessing anywhere between 50-75% of their orders get auto-cancelled by some overzealous anti-fraud and anti-reseller algorithm.


In SOFTWAR, Ellison called it a sort of fashion cycle.


> "Change gives the illusion of progress". I asked him what he meant and he responded with something to the effect of "They have the habit of changing big things every 5-10 years on purpose to make it look like they are productive, and to justify their own roles, one guy will come in and 'cut costs', the new guy after him will 'invest'".

That's a very succinct way to put it. I think that observation also applies to consumer technology (e.g. regularly re-doing UIs to "improve" them when the changes are either in fact regressions or just different but not better). We've had it drilled in our heads throughout the modern era that new == better (e.g. the ubiquitous "new and improved!" marketing language), but that's not actually always true. Change for change's sake justifies itself through that misunderstanding.

In the recent past, we had a really high rate of genuine technological progress. But at some point we'll have picked most of the low-hanging fruit and will enter a period of slower progress, where faking progress will become more and more tempting for producers than the real thing.


Right now, they can't retain the people or afford the people to do this work. When you can go work elsewhere for more money, you move. And running a mainframe isn't just having it; it's keeping the people running it, plus paying for the electricity and space.

Depending on where you are, real-estate and energy prices are nuts in most places. And engineers are expensive right now, and the services are cheaper.

It's not about giving it up, or change for the sake of change; it's about seeing the writing on the wall. These managers see the larger trends: rising energy costs, rising maintenance costs, difficult hiring, and impossible retention of existing engineers. Once you see those trends and a department is underwater and it's getting worse, you have to move. At another point in the market, when you find engineers are less expensive and easier to hire, and technologies and space are less costly and easy to deploy, you move back.


> These managers see the larger trends: rising energy costs, rising maintenance costs, difficult hiring, and impossible retention of existing engineers.

But how will the cloud providers avoid these trends? They won't. They will have to do the same thing as anyone else: pay more. And therefore charge more. There are economies of scale, but those savings are logarithmic and a company like FedEx is already pretty far out on the X axis.


> a company like FedEx is already pretty far out on the X axis

AWS and Azure probably run hundreds of Fedex-sized clouds and it's their core business. I don't think Fedex is very far out compared to them.


> But how will the cloud providers avoid these trends?

Innovations are more likely to happen if it is someone's priority to fix a certain thing. I think they hope that savings from innovation and better methods at whatever company they hire out to are passed on to them. It is naive, though, to expect those savings if they are unable to change clouds, and as long as one relies on vendor-specific features one is in that position.


They aren't playing the same game. Facebook designed its own servers: they were chassis-less; they didn't have a mains power input, so no switch-mode power supplies, instead a 12V DC feed; they had no rack-wide large UPS, instead each server had a small battery in it; and they were not built for massive redundancy like a Dell server with dual PSUs and redundant networking, because they were disposable nodes in a larger software cluster, e.g. [1] [2]

Things that aren't Fedex's core competency.

Or, Microsoft's roofless datacenters[3], or locating datacenters in remote and colder climates: things the big players can do with economies of scale beyond buying things cheaply, since they can customise the entire datacenter. Microsoft has experimented with underwater datacenters[4] and modular containerised datacenter extensions[5], which could be datacenters no human needs to be near to work on, or which could be dropped off somewhere with cheap land and power and internet, picked up three years later and retired from use, etc. Ideas which are not FedEx's core competency and which need large scale and software clustering on top.

While FedEx would be hiring ordinary IT employees to work in a standard datacenter in a cheap business park - not very enticing - Amazon could be hiring datacenter workers to work with Amazon's undersea cabling connecting their worldwide datacenters; more enticing work for skilled employees.

Google has been known/rumoured to migrate heavy batch-processing workloads around the planet, following the day/night cycle to always take advantage of cheaper regional night-rate electricity. Something which reduces their energy costs but which FedEx may not be big enough to do.

[1] https://engineering.fb.com/2016/03/09/data-center-engineerin...

[2] https://engineering.fb.com/2019/03/14/data-center-engineerin...

[3] http://www.500eco.com/exhibits/microsoft-roofless-datacenter...

[4] https://news.microsoft.com/innovation-stories/project-natick...

[5] https://www.datacenterdynamics.com/en/news/microsoft-develop...


When The Facebook started designing their own servers (which, by the way, have lots of switch-mode power supplies in them, and always have) the game they were playing was "be a better MySpace". They were running a bunch of PHP pages. At the time you could have made the same argument about The Facebook vs. Rackspace: "While Facebook would be hiring ordinary IT employees to work in a standard datacenter in a cheap business park, Rackspace could be hiring datacenter workers to work with Rackspace's BGP peering connecting their worldwide datacenters."

But The Facebook decided to make informatics their core competency, to the point of building their own servers with 12 volts running to the rack, same as Google before them.

There were surely industrial companies in 01922 who decided that management wasn't their core competency (though they used different words), and if they needed help with management they'd contract out to management specialists like Taylor or Gilbreth. They met the same fate that will meet companies today that decide that informatics isn't their core competency.


it's called economies of scale


"retention of existing engineers impossible"

So where are all these mainframe engineers going to go work? Our company has 3 admins for our two mainframes, and these guys, while expert-level admins for a mainframe, have trouble with Linux and Windows. Same with the developers writing code for the systems. When all you've worked on is a mainframe, everything looks like a batch cycle...


Retiring?


In this sense, Oscar Wilde was the executive's executive: "I have spent most of the day putting in a comma and the rest of the day taking it out."


""Change gives the illusion of progress". I asked him what he meant and he responded with something to the effect of "They have the habit of changing big things every 5-10 years on purpose to make it look like they are productive, and to justify their own roles, one guy will come in and 'cut costs', the new guy after him will 'invest'". "

That's deep and 100% makes sense, I can see how the cloud is both of these "things."


Meet the new boss, same as the old boss.


I work in a bank. This is literally what upper management does every 4-5 years. Always some new "initiative" that was preached to them at a conference somewhere, and will now presumably change everyone's lives.


That’s definitely how the IT department at my company works (I bet a lot of other roles too). Every few years a new exec comes in, sets a new strategy, claims tons of savings, creates “excellence” initiatives (everything that has “excellence” in its name triggers my BS detector). This lasts for a few years until the next guy comes in and goes through the same process but different direction.


Never confuse movement for action :-)


As everybody who saw people, and orgs, rotating in place at incredible speeds can confirm.


This isn't always easy to determine, especially if you are part of the movement. Often, we move by dead reckoning and only after some amount of time can we determine if what we did was "movement" or "progress".


Good anecdote!

In exchange I offer two relevant quotes from my quote file:

> The empire long united must divide, long divided must unite; this is how it has always been. (Luo Guanzhong)

> When cuffs disappeared from men’s trousers, fashion designers gave interviews explaining that the cuff was archaic and ill-suited to contemporary living. It collected dust, contributed nothing. When the trouser cuff returned, did it collect less dust and begin at last to make a contribution? Probably no fashion designer would argue the point; but the question never came up. Designers got rid of the cuff because there aren’t many options for making trousers different. They restored it for the same reason. (Ralph Caplan)


Other formulations of this I've heard are "movement doesn't mean progress" and "fire and motion", the latter offered by Joel Spolsky in one of his better blog entries.


A better analogy is a ship sailing against the wind: "To reach its target, sailors that intend to travel windward to a point in line with the exact wind direction will need to zig-zag in order to reach its destination. This technique is tacking." https://www.lifeofsailing.com/post/how-to-sail-against-the-w...


> the pain mixing machines

That sounds ... dystopian.


The pain is mixed, then applied in the pain booth.


Obviously he's been working with javascript recently.


definitely sounds like webpack to me


This is what grabbed me to keep on reading his story.


So what's the moral of the story?

Do you think it's "don't change"? 'Cause that gives up on progress right away. So good luck getting large shareholders to give you the reins with that message. They're looking for ROI, not the status quo - if you don't evolve, your competitor will, barring monopolies and such. And even there... the Microsoft of today is much less dominant than the MS of 25 years ago, and arguably could be worth a lot more had they made better moves. If that's not a compelling example, maybe check out Sears...

Maybe most people in these positions know that change doesn't guarantee improvement, but they know that sitting still is the same as just waiting to be defeated. So maybe there's something less-than-stupid about these "short-sighted" "illusory" changes.

But maybe if you want to be a wise executive, the key is to recognize that change might not just fail to improve your position - it might actively harm it. So the good executive is the one who chooses to try to change things according to reasonable calculations about both potential upside benefit and downside risk...


>"Change gives the illusion of progress".

This is one of the biggest reasons many jobs are so miserable. You want to be in the job at the beginning of the "invest" decision. Unfortunately most people get little say in when they join, because this information is kept internal to the company.


> "Change gives the illusion of progress".

That itself is based on the belief that (historical) progress is an improvement.

Mostly true over the last few centuries - aspirin is nice - except for those occasions where it was mass murder.


First "pain mixing machines," and now "asp(i)rin is nice."


So? People can't spell, and typos are a common phenomenon.


Just noting humorously related typos, that's all. I'll add smilies here so you can comprehend. :-) :)


There used to be an allegory of how to identify a bad new CIO... if your current system was massive networked printers shared amongst floors, the new CIO would come in and give everyone individual printers... or vice versa... because 'change'.


It sounds a bit like the bodybuilding method of bulking and cutting.


Bodybuilders make progress though :)


Isn't there a ceiling? Or if we lived as long as those turtles, would you go to the gym and see spheres of muscle?


Some of the extreme pictures I've seen look like people that can't move, let alone lift, but I guess it works.


The returns are diminishing each cycle.


That seems to be a common take on HN - cloud is too expensive.

I'm curious whether the folks claiming that have any data center ops experience.

Because, personally, I'd rather retire than deal with Dell, HP, Cisco, fibers, cooling issues, physical security, hardware failing... And that's just the hardware. Then you still need to pay VMware for a decent virtualization platform, monitoring tools, etc. Seriously, no amount of money would make me work in a DC again.

I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though.


ML workloads definitely cost a lot of money. Even for a preemptible VM, A100 GPUs cost $0.88/hr/GPU. That's $624 a month for a single GPU, and that's only the 40GB model. Want a dedicated 8-GPU machine in the cloud to do training with? That'll run you around 16 grand a month. Do that for 2 years and you may as well have bought the device. Want to do 16/24/40-GPU training? Good luck getting dedicated cloud machines with networking fast enough between them so that MPI works correctly, and be prepared to give up your wallet.

Also, that's just compute. What about data? Sure, cloud accepts your data cheaply, but they also charge you for egress of that data. Yes, you should have your data in more than one location, but if you depend on just cloud then you need it in a different AZ, which costs even more money to keep in sync and available for training runs.

I think for simple workloads and renting compute for a startup, cloud definitely makes sense. But the moment you try to do some serious compute for ML workloads, good luck and hope you have deep pockets.
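
To put rough numbers on that rent-vs-buy trade-off, here's a back-of-the-envelope sketch in Python. The ~16 grand/month cloud figure comes from the comment above; the purchase price and monthly running costs of an owned box are illustrative assumptions, not quotes, so substitute your own:

    # Rent-vs-buy sketch for a dedicated 8x A100 machine.
    # CLOUD_MONTHLY is the figure cited above; the other two
    # numbers are assumptions, not vendor quotes.
    CLOUD_MONTHLY = 16_000   # $/month, dedicated 8-GPU cloud machine
    HW_PURCHASE = 150_000    # $, assumed price of a comparable 8x A100 server
    HW_MONTHLY = 2_500       # $/month, assumed power, space and support

    for months in (6, 12, 24, 36):
        cloud = CLOUD_MONTHLY * months
        owned = HW_PURCHASE + HW_MONTHLY * months
        print(f"{months:>2} months: cloud ${cloud:,} vs owned ${owned:,}")

    # Break-even: HW_PURCHASE / (CLOUD_MONTHLY - HW_MONTHLY) ~ 11 months

On those assumed numbers the owned box pays for itself in under a year, which lines up with the "do that for 2 years and you may as well have bought the device" intuition above.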


The other thing is Nvidia tries to sell GPUs with similar performance at two very different prices: one price for data centres and a quite different price to kids. If you do the job yourself you can often get away with using the much cheaper gamer-grade cards for AI work (unless you need a lot of VRAM), whereas the likes of AWS can't do that and are required by Nvidia to use the considerably more expensive cards. If your workload will fit on a gamer-grade card there's no contest on price between an on-prem system and the cloud.


That is a really good point, and the 3090s have a surprising amount of VRAM on them. For many smaller models this is sufficient. However, where I work (without going into a lot of specifics), because of the size of the models the amount of VRAM is crucial, as well as the infrastructure of the PCI lanes connected to it, the speed of the local storage, and the networking between cards on the same node as well as between nodes.

The moment the model gets bigger than any one GPU's VRAM, the difficulty of training it goes up by orders of magnitude.
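
To see roughly where that cliff sits, here is a quick memory-arithmetic sketch, assuming fp32 weights plus gradients and two Adam moment buffers (about 16 bytes per parameter) and ignoring activations, which only make things worse:

    # Rough check: when does training stop fitting on a single GPU?
    BYTES_PER_PARAM = 4 + 4 + 8   # fp32 weights + gradients + Adam m and v
    VRAM_GB = 40                  # one 40 GB A100, as in the comment upthread

    for n_params in (1e9, 2.5e9, 7e9, 13e9):
        gb = n_params * BYTES_PER_PARAM / 1e9
        verdict = "fits" if gb <= VRAM_GB else "needs sharding across GPUs"
        print(f"{n_params / 1e9:4.1f}B params -> ~{gb:5.0f} GB: {verdict}")

Under those assumptions, around 2.5B parameters you are already at the edge of a 40 GB card, and beyond that you are into model/optimizer sharding and fast interconnects - the orders-of-magnitude jump in difficulty described above.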


A lot of that is just good old fashioned marketing.

Here's the list of ingredients for Excedrin Migraine:

Active ingredients: Acetaminophen 250 mg (pain reliever), Aspirin (NSAID) 250 mg (pain reliever), Caffeine 65 mg (pain reliever aid)

Inactive ingredients: benzoic acid, carnauba wax, FD&C blue #1, hydroxypropylcellulose, hypromellose, light mineral oil, microcrystalline cellulose, polysorbate 20, povidone, propylene glycol, simethicone emulsion, sorbitan monolaurate, stearic acid, titanium dioxide

This is the list of symptoms that Excedrin Migraine claims to treat:

- migraines

And now here's the ingredients for Excedrin Extra Strength:

Active ingredients: Acetaminophen 250 mg, Aspirin (NSAID) 250 mg, Caffeine 65 mg

Inactive ingredients: benzoic acid, carnauba wax, FD&C blue #1, hydroxypropylcellulose, hypromellose, light mineral oil, microcrystalline cellulose, polysorbate 20, povidone, propylene glycol, simethicone emulsion, sorbitan monolaurate, stearic acid, titanium dioxide

This is the list of symptoms that Excedrin Extra Strength claims to treat:

- headache
- toothache
- a cold
- arthritis
- premenstrual & menstrual cramps
- muscular aches

And while SOME places have normalized the prices between the two, they can often be found on shelves at two different price points.


Re data, I think egress rates are going to start disappearing over the next few years.

The part that's always missing with these rent-vs-buy analyses on HN, for some reason, is that they totally ignore the opex cost of operating your own hardware, which is going to be non-zero. Sure, it won't be quite as expensive (no profit margin), but it's not an order of magnitude. Additionally, most companies don't run the HW 24/7 and, if they do, it's not at a level of people they want to hire to support said operations. Not just running it, but you have to invest in and grow something that's not a core competency to get the economies of multiple teams loading up the HW.

If the next revolution in cloud comes along to cause companies to onsite the HW again, it'll look like making it super easy to take spare compute and spare storage from existing companies and resell it on an open market. Even still, I think the operational challenges of keeping all that up and running and utilized at as close to 100% as possible, while not focusing on your core business problem, will be difficult, because you won't be able to compete with engineering companies that have a core competency in that space.


> The part that's always missing with these rent-vs-buy analyses on HN, for some reason, is that they totally ignore the opex cost of operating your own hardware, which is going to be non-zero.

Effectively hiring, retaining, evaluating and rewarding competent staff is hard. Even at a big company the datacenter can be a really small world, which makes it hard for your best employees to grow. Things are especially hard when you don't have a tech brand to rely on for your recruiting, and the staff's expertise is far outside the company's core business, making it harder to evaluate who's good at anything.


> Re data, I think egress rates are going to start disappearing over the next few years.

I'm not sure why you think that. AWS hasn't budged on their egress pricing for a decade (except the recent free tier expansion), despite the underlying costs dropping dramatically. GCP and Azure have similar prices.

Fact is, egress pricing is a moat. Cloud providers want to incentivize bringing data in (ingress is always free) and incentivize using it (intra-DC networking is free), but disincentivize bringing it out. If your data is stuck in AWS, that means your computation is stuck in AWS too.
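
For a sense of the numbers behind that moat, here is a small sketch of what pulling a dataset out costs under tiered egress pricing. The tiers below are modeled loosely on AWS-style list rates (roughly $0.09/GB tapering to $0.05/GB at volume); treat them as assumptions and check the current rate card:

    # Cost to move data out of a cloud under assumed tiered egress rates.
    TIERS = [(10_000, 0.090),   # first 10 TB, $/GB (illustrative)
             (40_000, 0.085),   # next 40 TB
             (100_000, 0.070)]  # next 100 TB
    OVERFLOW_RATE = 0.05        # everything past 150 TB

    def egress_cost(gb):
        cost = 0.0
        for tier_gb, rate in TIERS:
            step = min(gb, tier_gb)
            cost += step * rate
            gb -= step
        return cost + max(gb, 0.0) * OVERFLOW_RATE

    for tb in (1, 10, 100, 1000):
        print(f"{tb:>4} TB out: ~${egress_cost(tb * 1000):,.0f}")

A petabyte out runs to tens of thousands of dollars every time you move it, which is exactly the data gravity being described.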


Disclosure: I work at Cloudflare on R2, so I'm a bit biased on this.

I think we're going to put real pressure on traditional object storage rates to come down, since Cloudflare's entire MO is running the network with zero egress. As we expand our cloud platform it seems inevitable that you will at least have a strong zero-egress choice, and if we do a good job Amazon et al. will inevitably be forced to get rid of egress. Matthew Prince laid out a strong case for why either scenario is good for us in a recent investor day presentation (either we cannibalize S3's business and R2 becomes a massive profit machine for us because they refuse to budge on egress, or Amazon drops egress, which is an even larger opportunity for us).

Products like Cache Reserve help you migrate your data out of AWS transparently from any service (not just S3) - you just pay the egress penalty once per file.

Anyway. I’m not saying it’s going to disappear tomorrow but I find it hard to believe it’ll last another ten years.


> they totally ignore the opex cost of operating your own hardware, which is going to be non-zero

Early in my career I worked at a company where we had a DC onsite. I remember the months-long project to spec, purchase and migrate to a new, more powerful DB server. How much that cost in people-hours, I have no idea. I upgraded to a better DB a couple months ago by clicking a button...

Don't even get me started on ordering more SANs when we ran out of storage, or the time a hurricane was coming and we had to prepare to fail over to another DC.


>> I think egress rates are going to start disappearing over the next few years.

Compute costs generally drop over time. Do you have any data points to confirm egress will soon go to zero?


Cloudflare Bandwidth Alliance and R2. S3 felt some pressure just because of our pre launch announcement. It’ll be interesting to see how they adjust over the next couple of years.


It's probably worth remembering that a company the size of FedEx isn't going to be paying the listed prices.


They'll actually probably be paying more than the average listed prices, as mainframe users are often dependent on very high reliability, high uptime and low latency (a mainframe is basically an on-premise PaaS, where IBM rents you a high-performance, distributed and redundant hardware cluster on a pay-as-you-use basis).


The ability to scale up experiments is really nice in the cloud. In my experience you need to be quite large before you're using your own GPUs at a utilization percentage that saves money while still having capacity for large one-off experiments.


There are a few different ways to run a data center, a subset of which are much less expensive than the cloud but require a level of competency that some organizations will never have. It can also be relatively pain-free when done well. Some workloads are inherently inefficient in the cloud because of the architecture.

Data center ops is ultimately a supply chain management problem, but most people don't treat it as such. That was my primary learning from doing data center ops at a few different companies. If you get the supply chain management right, and are technically competent, there can be a lot to recommend running your own data centers.


As a company, you need to pay those costs no matter what.

If the AC breaks at 3am it needs to be fixed. It doesn't matter if you have your own HVAC people on site 24x7, your own people on call to service it, a local HVAC service to come in, or you outsource the entire operation and so have no idea how it is handled. In the end the important part of this story is that whatever you are doing with the AC continues to work. Different operations demand different levels of service (I doubt that anyone keeps HVAC techs on staff 24x7, but if the AC is that critical it is mandatory). The only case where the CEO is up at 3am is if the CEO is the owner of the local HVAC service company, not the CEO of the building with the problem.

Once you realize that to management the cost is outsourced no matter what, the only question is: do it with your own people and HR, or outsource it. There are pros and cons to both approaches, but for most companies it isn't their business, and so the only reason to do it in house is they can't trust any company they hire.


The thing is, the cost of 24x7x365 HVAC support for a datacenter will be roughly the same for a given location... but it makes a difference whether you are paying the whole bill (=you're self-hosting in your own datacenter), splitting the bill with a bunch of other customers indirectly (=you're self-hosting in a colo DC), or splitting the bill with a shitload of other customers (=you're using some service on one of the big public cloud providers).

The downside of saving the costs is that you lose control with every step taken away: as soon as you go into a datacenter of any kind, you simply cannot call up an HVAC company and offer them 100k in hard cash if they show up in the next 60 minutes and fix the issue. With a colo DC you can usually show up to see if the HVAC, UPS and other systems are appropriate to your needs, but with one of the big cloud providers you have to trust their word that they are doing stuff correctly.


> so the only reason to do it in house is they can't trust any company they hire

Now I'm deeply confused. Any company hired either has a profit margin (plus enough to fund an "oh shit" fund in case times turn bad) or will not stick around longer than a few years. In which case, why not just hire people directly and cut out the other company's profit margin? Assuming you hire similar people at the same rate, using your own existing and already-paid-for HR, how is that not cheaper?


You need to deal with overhead. Nobody does their own HVAC in house, because you rarely need them and would have to pay to train people on it despite them not using it.

In some cases you can even get a discount. Utilities are a big customer of tree trimming; the companies doing that work can give a great deal because the utility doesn't care that they take a week off after a storm for high-profit-margin consumer trimming.


Lots of places have their own HVAC techs in house, if they have enough HVAC work to justify it. Even if it's not their core line of business. They will do whatever costs less, +/- some amount of subjective "hassle factor."


Especially when it's "line critical" to their business, or if the person can do other things as well.

Larger hotels often have dedicated staff for things like HVAC, etc., because the importance of getting things fixed quickly, if possible, is worth the cost of having someone onsite/available.

And you see similar things with colleges, etc.; they often have a maintenance department that can be pretty large (though no doubt they've spun it off and brought it back in-house for the same "change is progress" reasons).


I have dealt with a large number of retail colo providers, wholesale data center providers and corporate owned data centers across the US over the last 20 years and all of them used contractors for HVAC and electrical. I'm not saying dedicated staff never happens but it is definitely not the norm.


Doesn't this logic apply to pretty much everything? Why hire external anything then? Why not do your own deliveries, hire your own trucks to transport goods etc?

There is a cost to taking on things that aren't part of your core business too.


All I can come up with is "Because economies of scale". I work for a transportation company, but we employ plumbers, carpenters, electricians, elevator repairmen, and many more that I'm not aware of, because we have enough locations / work to justify them. The pizza place has enough work to justify hiring a fleet of drivers, Amazon ships enough crap to justify having their own trucks (when they can't sucker another company into taking the unprofitable routes).

Similarly, Google doesn't ship enough stuff worldwide to justify drivers, insurance, trucks, jets, etc. - Fedex has the size and scale to make every package a couple cents cheaper, so it's just not worth it for Google.

The only other argument I can think of is the challenge of keeping every plate spinning, in good times and in bad. This is where your point of having a cost to take on something outside your core business comes in, but we seem to be in an era of mega-corporations - I'd expect lots of companies to snake tendrils into whatever will save them a fraction of a cent every time they have to do something.


Not really. Time to service depends on SLA and redundancy. If you have no redundancy, your time to service must be less than or equal to your SLA. If you have redundancy, it can be longer.


I've got some experience with a big academic data center - >1 acre floor space, >10MW, ~$100M construction cost. I've also worked for commercial companies of various sizes.

If your compute installation is big enough that payroll is a small fraction of the operating cost, then it's way cheaper than cloud. (that payroll has to include people who actually know how to build and run a huge compute installation)

The problem is that people come in integer units, you need a bunch of them to cover a bunch of different areas of expertise, and the particular ones you need are expensive. If you've got $1M worth of computers, you're almost certainly better off scrapping them and going to cloud, although the folks you're currently paying to run them might disagree. If you have $100M+ worth of machines it's a whole different ballgame; I'm not sure where the exact crossover is.

Note - that's assuming a single data center, and that you're big enough to build your own data center instead of renting colo space. If you need your machines to be geographically dispersed, you'll need to be even bigger before it's cheaper than cloud, and I'm not sure whether you'll ever hit crossover if you're renting colo space.
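
The crossover can be sketched with crude arithmetic: the ops team is a fixed cost that only amortizes once the fleet is big enough. Every number below is an assumption for illustration (team cost, yearly run rate per dollar of hardware, and a cloud premium over that run rate):

    # Crude own-vs-cloud crossover model. All numbers assumed.
    TEAM_COST = 2_000_000   # $/yr, team that can build and run a big DC
    RUN_RATE = 0.15         # $/yr of power/space/refresh per $ of hardware
    CLOUD_PREMIUM = 2.0     # assume cloud ~2x the raw run rate

    for hw in (1e6, 10e6, 100e6):
        own = TEAM_COST + RUN_RATE * hw
        cloud = CLOUD_PREMIUM * RUN_RATE * hw
        winner = "own DC" if own < cloud else "cloud"
        print(f"${hw / 1e6:5.0f}M fleet: own ${own / 1e6:.1f}M/yr "
              f"vs cloud ${cloud / 1e6:.1f}M/yr -> {winner}")

With these made-up numbers the crossover lands around $13M of hardware (TEAM_COST / (RUN_RATE * (CLOUD_PREMIUM - 1))), consistent with the feel above that $1M of computers means cloud and $100M+ is a different ballgame.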


1000% this. HN loves to talk about Dropbox. I spent most of my (short, praise God) career at Dropbox diagnosing a fleet of dodgy database servers we bought from HPE. Turns out they were full of little flecks of metal inside - thousands of servers, full of iron filings. You think that kind of thing happens when you are an AWS customer?

If you are sophisticated enough to engage an ODM, build your own facilities, and put HVAC and electricians on 24-hour payroll, go on-prem. Otherwise, cloud all the way.


That's not quite where I would draw the line, I don't think. I used to work for an ISP and we were kind of split between AWS and on-prem. Obviously, things like terminating our customers' fiber feeds had to be on-prem, so there was no way to not have a data center (fortunately in the same building as our office). Moving our website to some server in there wouldn't have been much of a stretch to me, at the end of the day, it's just a backend for cloudflare anyway.

Like most startups, our management of the data center was pretty scrappy. Our CEO liked that kind of stuff, and we had a couple of network engineers that could be on call to fix overnight issues. It definitely wasn't a burden at the 50 employees size of company (and that includes field techs that actually installed fiber, dragged cable under the street, etc.)

We actually had some Linux servers in the datacenter. I don't know why, to be completely honest.

So overall my thought is that maybe use the cloud for your 1 person startup, but sometimes you need a datacenter only and it's not really rocket science. You're going to have downtime while someone drives to the datacenter. You're going to have downtime when us-east-1 explodes, too. To me, it's a wash.


I mean, you did want to manage bare metal servers, right?

AWS almost certainly gets batches of bad hardware too. And if your services are running on the bad hardware, you can't have a peek inside and find the iron filings. For servers this is probably not too bad; there have long been articles about dealing with less enthusiastic EC2 VMs, and if you experience that, you'd find a way. AWS has enough capacity that you can probably get VMs running on a different batch of hardware somehow. With owned hardware, if it was your first order of important database servers and they're all dodgy, that's a pickle; HPE probably has quick support, once you realize it's their hardware.

If your cloud provider's network is dodgy though, you get to diagnose that as a blackbox which is lots of fun. Would have loved to have access to router statistics.

There's a lot of stuff in between AWS and an on-prem/owned datacenter, too.


> If you are sophisticated enough to engage an ODM, build your own facilities, and put HVAC and electricians on 24-hour payroll, go on-prem. Otherwise, cloud all the way.

I imagine the entire sentiment of the comments is because FedEx is one that really should be sophisticated enough.


Not really a meaningful dichotomy.

There is a smooth curve between cloud and dedicated DCs, which has various levels of managed servers, co-location, and managed DCs. (A managed DC can be a secure room in a DC "complex" that shares all the heavy infrastructure of DCs.)

Primarily, the FedEx managers are committing the company long-term to Oracle/Microsoft platforms. Probably mostly to benefit their own careers.

Outsourcing hosting and management of DCs would have been something different, and probably healthier for FedEx and the industry.


> You think that kind of thing happens when you are an AWS customer?

You bet it does! But as the AWS customer you'd never notice because some poor ops dude in AWS-land gets to call up the vendor and bitch at them instead of you. It ain't your problem!


Why do you buy servers with metal flakes in them? No quality control on your side?


Are you saying that part of the expected savings from going on-prem is that you will have to disassemble equipment bought from major OEMs and examine it for microscopic metal dust?

That doesn't sound like it will save much money, honestly.


They're saying it's a surprise to hear that Dropbox doesn't know what QC and order acceptance mean. And it is; I agree. That you spent the time investigating it, implying those servers were in production, is a shibboleth to those of us who know what we're doing when designing hardware usage that Dropbox doesn't. It is, however, your self-sourced report, and we don't have an idea of scale, so maybe they do and you're just unlucky.

And no, operators don’t disassemble to perform QC. And no, I could hire an entire division of people buying servers at Best Buy, and disassembling them, and stress testing them, and all of that overhead including the fuel to drive to the store would still clock in under cloud’s profit margin depending on what you’re doing.

You’re of course entitled to develop your cloud opinion from that experience. That’s like finding a stain in a new car and swearing off internal combustion as a useful technology, though, without any awareness of how often new cars are defective.


Many hardware problems do not surface at burn-in. Even at Google, the infamous "Platform A" from the paper "DRAM Errors in the Wild" was in large-scale production before they realized it was garbage.


Filings from the chassis stamper, which yours certainly were given the combination of circumstances and vendor, are present when the machine is installed. If you’re buying racks, your integrator inspects them. If you’re buying U, you do. It’s a five minute job to catch your thank-God-my-career-was-short story before the machine is even energized, which I know because I’ve caught the same thing from the same vendor twice. (It’s common; notice several comments point to it.) Why do you think QC benches have magnifiers and loupes? It’s a capital expenditure and an asset, so of course it’s rigorously inspected before the company accepts it, right? That’s not strange, is it?

You can point at Google and speak in abstracts but it doesn’t address the point being made, nor that your rationale for your extreme position on cloud isn’t as firm as you thought it was. Is Dropbox the only time you’ve worked with hardware? I’m genuinely asking because manufacturing defects can top 5% of incoming kit depending on who you’re dealing with. Google knew that when they built Platform A. The lie of cloud is that dismissing those problems is worth the margin (it ain’t; you send it back, make them refire the omelette, and eat the toast you baked into your capacity plan while you wait).


Are you saying you just buy some servers, unpack them and throw them into production... oh man, the lost art of the sysadmin. If your system is not stable (in testing) you for sure disassemble it, or send it back. How much money have you lost playing around with your unstable database? Was it more than testing your servers for some weeks? Do you buy/build software and throw it into production without testing?

You can test your stuff and still be profitable; Hetzner, AWS, etc. would make no money otherwise... you know they test their servers much more (sometimes weeks/months).


Did they pass typical memory/reliability tests and so on?


Maybe in the first days they survive it, but the flakes are 99% from the fans/bearings; that's why you test servers at max load for at least 1 week and HDs for 2-4 weeks.

But I don't think they even ran an initial load/stress test.

Unpack it, throw it into the rack, no checking of internal plugs, just nothing... pretty sure about that.


Metal chips are squarely in the long tail of failure modes that you can't really anticipate (but that are of course really easy to be smug about in hindsight). It is also extremely unlikely to be the bearings; most likely these are from chassis frame assemblies that weren't cleaned up properly.


I had some metal dust once and it was from bearings, but OP said flakes and then microscopic particles. Particles = bearings; flakes = chassis or even stickers. But anyway, if only because of transport, you don't throw a server into production without testing and inspection.

I am being smug about not testing your hardware the way you do with software... shitty testing is shitty testing; that counts for software, hardware, firmware and everything in between. Even for your diesel generator ;-)


I heard tale of a banking centre that had a diesel generator installed by a local company.

Load and simulated power failure tests all passed.

Then some time later there was a total power cut and that's when they realised the generator had an electric start wired to the mains supply.


And there is also the true story where "someone" forgot to fill the tank after 5 years of regular monthly tests, and then the real thing happened.

> had an electric start wired to the mains supply.

But that's a good one, humans being humans...but it worked every time before today ;))


Wait, are you saying that an org needs expertise to QC all of the hardware they procure? How expensive is that? How easy is it to hire that type of QC?

Do you see how these costs all start to add up?


Well, are you saying that an org needs expertise to inspect faulty cars, like, by calling a mechanic?

Is that like too much these days for companies that own fleets of cars? Is opening a server harder than checking what's wrong with a car? Like, a cable comes loose and that's game over?


If I procure a fleet of cars I expect none of them to be faulty...how about you?


>I expect none of them to be faulty

So you don't even test the cars, you just expect that the tire pressure is correct and the tank is full?

Expecting that something "just" works is exactly why pilots have checklists.

Expectations are the main source of disappointments; you would never do that with software, right?


The point, which you seem so dedicated to avoiding, is that "in the cloud" these steps are not my problem. Inspecting a literal shipload of computers for subtle defects is a pain in the ass. Amazon does it for me. When I get on an airplane I do not personally have to run the checklists. The airline does it for me.


>The point, which you seem so dedicated to avoiding

Not true; the point was you pay for it (cloud), or you do it yourself (but then do it right, and not like an amateur building his first "gaming PC").

And if you do it yourself you can still be very competitive vs. cloud.


> (but then do it right, and not like an amateur building his first "gaming PC").

Again, still avoiding the point, but oddly enough proving the point. You assume everyone isn't an amateur and knows how to build and maintain server hardware. Furthermore, because the market doesn't have enough talent to support all of the companies that exist, consolidating this to a few vendors who do have the expertise is what makes sense (economies of scale) and is what the market already decided.


>Again, still avoiding the point, but oddly enough proving the point.

Please read; this was my comment:

>>Not true; the point was you pay for it (cloud), or you do it yourself

>You assume everyone isn't an amateur and knows how to build and maintain server hardware.

Yes, that I assume, correct. Otherwise I would not call it "maintaining". Is an amateur maintaining your car? Your software? If you have only amateurs handling your hardware, it's probably better to pay a cloud provider or an integrator to do that.


> you would never do that with software, right?

Hilarious you used this as an analogy since software development shops are notorious for cutting corners when it comes to QA.


And that's why you have to test the software before production, right? ...Hilarious indeed.


> you would never do that with software, right?

You facetiously implied that every company fully tests software before it gets to production. Oh boy, do I have news for you...

Note the word "fully", as the variations of what gets tested are so broad I don't even know where to start to explain this to you.


I never wrote "fully", but you do test your software (I hope). You're just trying to justify a bad work ethic.

>Oh boy, do I have news for you...

Nah, it's ok; I'm just happy that I have colleagues with a much better mindset and understanding of risk management.

And I'll stop here, since you keep trying to change what I actually wrote.


>"I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though."

This is what I do. I rent bare metal from Hetzner and OVH. I also have some hosting hardware right at my place. It saves me a ton of money, and no, I do not spend any meaningful time administering it. It's all done by a couple of shell scripts. I can re-create a fresh service from the backup on a clean rented machine in no time.

As for cloud - if I need to run some simulation once a month on some bazillion-core computer, then sure, cloud makes much sense in that particular case. I am sure there are other cases where it can be cost effective. But for the average business, I believe cloud is a waste of resources and money.
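
For what it's worth, the rebuild logic really can be that small. A sketch of what such a script might look like, assuming rsync-able backups and root ssh access to the fresh box; the hostnames, paths, and package list below are entirely made up:

    import subprocess

    # Hypothetical rebuild of a clean rented machine from backup.
    # Hostnames, paths, and packages are placeholders.
    HOST = "root@new-box.example.com"
    BACKUP = "backup.example.com:/srv/backups/latest/"

    def ssh(cmd):
        subprocess.run(["ssh", HOST, cmd], check=True)

    ssh("apt-get update && apt-get install -y nginx postgresql")   # runtime
    subprocess.run(["rsync", "-az", BACKUP, HOST + ":/srv/app/"],  # state
                   check=True)
    ssh("systemctl restart postgresql nginx")                      # go live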


If you don't enjoy it then what were you doing working at a datacenter?

I enjoyed server admin, "back in the day", when your servers were pets and not cattle. But of course we have to make tech just as expendable as our workers, business school demands it! What if your pet server gets hit by a digital bus?!


Pet server classes are a much nicer concept anyway. I never liked the instance-based personalization. Creating a machine, defining its class, and seeing it become a machine of its class is magical.

Of course, the newest idea is creating and destroying machines automatically... outside of the cloud that is quite pointless, but people want it anyway. I imagine seeing all that orchestration working must be even nicer than a machine autoconfiguring, but I have yet to see a place where it just works.


One argument for "cattle" servers on bare-metal is security. Being able to reset the machine to a clean, known-good state would clear any leftovers including potential malware. Having machines provisioned from images that include everything they need to run also means you don't even need to grant anyone root access (which you'd otherwise need to be able to audit so they don't leave anything malicious in there).


I, too, enjoy pet servers over cattle. But when important parts of modern life depend on servers, I can definitely see the rationale for cattle.


Hetzner is about 10x cheaper than AWS, give or take.


I love Hetzner and send them money every month, but you do get what you pay for. I don't think I'd like to run FedEx off of Hetzner.


I was actually commenting on the margins that "cloud providers" charge.

If I was a manager at FedEx I'd definitely spend some resources to DIY and send some of those millions in my direction instead.


> Because, personally, I'd rather retire than deal with Dell, HP, Cisco, fibers, cooling issues, physical security, hardware failing...

This isn’t really a meaningful analysis though. It’s just “when you do things in house there are things you have to do”.

It’s like saying, “I would rather retire than clean the toilets, restock the toilet paper, etc” in a discussion about whether to outsource your bathroom maintenance. Doesn’t tell you what’s cost effective.


I'll be really curious how much change Oxide will bring to the status quo.

The promise is to be able to pay up front for a rack that will function as a highly capable VM, storage, and/or compute host, without any of the overhead that Dell, HP, and IBM bring. Just plug it in and start giving it workloads to do. All config can be done through the web-based management console or via the API, just like AWS.


> fibers, cooling issues, physical security

All of that can be handled by your colocation facility. In most cases you won't ever reach the scale where building your own DC makes sense.

> Then you still need to pay VMWare for a decent virtualization platform

Should still be cheaper than paying the AWS premium including for bandwidth, not to mention that you don't always need virtualization. If all you need the bare-metal for is a handful of machines to do a very specific task that's too expensive on AWS then running directly on the metal is an option (and leave on AWS the stuff that does require the convenience of virtualization).

> I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though.

Agreed. Most companies shouldn't ever deal with hardware directly - just rent it from a provider and let them do the maintenance.


I'm more or less the sole decider for all tech decisions in my org (I don't have full budget authority, but I tell the budget holders what things cost). I'm 100% on board with cloud and even going further up the value chain to PaaS and SaaS. Cloud is expensive, but predictable. DevOps is very expensive and unpredictable. I can't even keep staff retained these days. Having a fixed dollar cost, even if it's high, saves not only the operations cost, but also the accounting cost and recruiting cost. And not just cost, but risk! Managed services are generally lower risk, and even if they aren't you can buy some indemnity that they'll cover some of the cost of failures.


That part can be outsourced to, e.g., Hetzner.


You are right about this.

One of the visible signs of an unrecognized (and therefore, unresolved) dilemma is oscillation between two poles.

I got this from the late Eli Goldratt and the problem-solving tools he created. One of those tools is the "Evaporating Cloud," which is a way to visualize a dilemma of some kind.

I'd suggest that there is an unresolved dilemma here: Should we rent or own data centers?

Over a period of years or decades, you can watch these things flip-flop between two extremes.

It's interesting to me that this dilemma is kind of like a specialization (in OO terminology) of a more generic dilemma, which might be something like: "In general, do we want to own or rent the things we need to run our business?"

If you turn your head a bit, close your eyes a bit and squint, you can see how a dilemma like this can apply to things like "What should be our policy regarding employees vs. contractors? Should we try to hire and retain over a long period of time, or should we rent the people we need to get the job done and release them as soon as we are done with them?"

The overall point is that whenever you see flip-flopping between two poles, think "Unrecognized dilemma."

Sorry if this is vague.


Often it’s simply the principle of the excluded middle.

If not for the rent seeking behavior of AWS and its cohort, the answer to ”own or rent” would be clear. The answer is “yes”.

Running a data center means you have your own sheep, which means you need shepherds. Shepherds are very useful to have when you have a question about sheep, especially when those questions are about how to manage or use sheep to best effect.

An approachable shepherd can save the rest of the company a lot of money on missteps and bad assumptions, but if you start getting rid of all the sheep, the shepherd will leave too.

We should be using cloud providers for DR, and for regional load balancing. But the company should be maintaining at least one data center of their own, in the same time zones as most of their developers.

I mostly blame Dell and IBM for this. IBM experimented with making server rooms easier to maintain 15-20 years ago and didn’t make it stick. Others ran with some of those ideas. Dell… I don’t know what Dell has done but I know nobody has been writing about it, so from a visibility standpoint they have done nothing.

If/when someone makes it easier (reduced labor) to manage your own servers, the pendulum will swing back.


You're right, I mean, this was the vision for Multics. They would rent computing as a utility to all the businesses, big and small, backed by their computers. At the time it didn't pan out but it was definitely a desire.


I think you are absolutely right, however when they do reverse the decision they will publish a great article about how bringing their datacentres in-house is going to save them $800m a year and a bunch of execs will grant themselves a bigger bonus for saving the company so much money. It's a win-win!


It's quite possible that at today's prices, they will save $400M/yr, and in the future, as cloud vendors raise prices, bringing it in-house will save them $800M/yr.


Not only that: it's possible that cloud vendors don't charge an extra penny, but that their software grows to consume the virtually limitless computing resources available to it, costing significantly more than expected.

It wouldn't take the most creative exec in the world to make a plausible looking case for changing in either direction even based on the same numbers. A bit of wishful thinking about which costs are included or excluded is probably more than enough.


> rent all information infrastructure

Most of them were renting most of the infra anyway:

- Co-location (physical space rented)

- Managed service to keep the lights on (power, back-up generators)

- Leased hardware that gets replaced every 3 years

- Managed service for switch, firewall, etc. monitoring

- ISP, back-up generators, etc.

> but that will likely be many, many years down the road

They're going to explain to their future bosses that buying land, building a huge building, buying servers, figuring out cooling, setting up redundant ISPs, etc. etc. is somehow going to be smarter? Explain that one to me?


You're paying for all of that whether it's the cloud or on-prem. The only real difference if you're a large company is whether you're paying another company's profit margin as well.

If you are big enough to have your own datacenter, you are paying Amazon enough to buy that much physical space, power, bandwidth, IT staff, etc. plus funding Bezos' trips to space.

There's a justifiable niche where you're too small to justify running your own server/below the need of one full-time IT staff where the cloud makes sense, and startups temporarily benefit from the ability to rapidly scale on cloud platforms. But any Fortune 500 transitioning to the cloud is literally just taking their money and burning it, because the alleged cost savings of efficiencies of scale are being completely absorbed by the cloud provider's profit margins. And they're still going to end up paying their own IT staff to handle the cloud management in addition to (by proxy) paying the cloud provider's IT staff to manage the hardware.


> You're paying for all of that whether it's the cloud or on-prem. The only real difference if you're a large company is whether you're paying another company's profit margin as well.

So then why even draw the distinction of rent vs. own, if that's the case?

> you are paying Amazon enough to buy that much physical space, power, bandwidth, IT staff, etc.

And you're also sharing the costs with other people. Clearly FedEx is saving $400M/year while also funding Bezos' trips to space, no?

> But any Fortune 500 transitioning to the cloud is literally just taking their money and burning it, because the alleged cost savings of efficiencies of scale are being completely absorbed by the cloud provider's profit margins.

This is literally not knowable without the details, so stop speculating. Every F500 should be (and probably is) doing a rent-vs-own calculation, determining the TCO, and then making the decision from there. It's not unilaterally the case; it has to be assessed.


> Clearly FedEx is saving $400M/year while also funding Bezos' trips to space, no?

I wonder how much they'd save if they rebuilt their infrastructure with own DCs. There are huge savings in decommissioning mainframes, switching to free software and automating more. At the scale of Fedex, I don't think they'll spend much more on hardware and ops than AWS.


The profits by cloud providers are still limited by a competitive environment. Multiple cloud providers have a strong incentive to compete and keep prices low. It's much more difficult to switch from in-house data infra so they aren't as focused on efficiencies.


Today I learned that economies of scale and specialization are not things.


Please, whenever you invoke any model, have a sense of the scale or region of applicability of the model; no model is true in all cases unless you're religious. The point the GP is making is that at the scale of an F500 it starts to make less sense, especially if you already have the infrastructure, the expertise, the experience, and so on. AWS is going to manage your VM better than you at your 3-person startup, but the economy-of-scale and specialization arguments work less well when you already have an experienced workforce and a developed system in place and you're one of the world's largest logistics companies.


You're ignoring the massive ecosystem of vendors, managed service providers, colocation facilities, consulting firms, etc. whose profit margins are paid to manage most on prem deployments. Very few entities are doing it all with in-house staff.


There are a couple comments pointing this out, but that's not unique to on-prem: Cloud providers are also buying hardware from vendors, employing contractors, buying or renting facilities, hiring consultants, etc. All of that you are going to pay for either way.

The cloud just offers an additional middleman that also gets a profit margin.


Yeah, but you pay one invoice instead of 100 invoices; you have one sales representative (OK, maybe a couple) to deal with instead of hundreds.

You get your support in one place instead of a hodgepodge of... yeah, we rent the server from X, but the software that runs on it is from Y, and in reality provider Z does the support, so now C has to agree to physical access to the server, and you have to align the stars to get all four working together instead of them shifting blame around to avoid fixing anything.


Sure, you pay an Amazon employee to manage that for you instead of paying an employee of your own. Either way you're paying for that, and with the cloud you also pay for Bezos' spacecraft fantasies.

And you hope the Amazon employee makes decisions that are favorable to your business, even though they do not care about it.


I feel like everyone in this discussion is missing the forest for the trees, especially given the way most public company executives think and operate, which is quarterly.

The problem with building out a new state-of-the-art datacenter - or several of them - is the enormous capital expenditure you've just put on your company's books, not to mention the operating expenditure of all the people that will be required to run it. Yes, it's true that as time goes on, you can claim some tax advantages in the form of depreciation, etc. on some components of this new huge outlay, but at the end of the day, when the company misses quarterly or yearly expectations from Wall Street and the stock price takes a 10% hit, the "blame" gets squarely laid at your feet - you are, after all, "the problem". You spent a shitload of money provisioning future resources for the company's (expected) growth.

Meanwhile, in Cloud Cuckoo-Cuckoo Land, your rival has migrated all or nearly all existing infrastructure to the cloud, thus incurring a significant op-ex, but not nearly as large as the enormous cap-ex you've just incurred. They're hailed as a hero. A goddamned visionary! Look at all that money they're going to save the company! Nevermind the fact that they explicitly instructed the head of the IT department to provision only the necessary resources for current operations; after all, the "promise" of the cloud is that you can just spin up whatever you need in a few minutes, anyway. And besides, the cloud deployment of all the company's servers and the necessary expansion won't turn heads until long after our visionary executive has jumped ship to another company for more pay, a corner office, and a better stock compensation package. Best of all, he can say that he saved the company $XX millions of dollars over your plan, and it can be said legitimately, even though it is, of course, inaccurate.
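
To put toy numbers on that cap-ex-vs-op-ex optics problem (every figure below is made up for illustration):

    # Made-up numbers: how owning vs. renting shows up in the books.
    dc_capex, dc_opex, life = 100e6, 10e6, 10   # build a DC, run it, depreciate
    cloud_opex = 30e6                           # rival's all-in yearly cloud bill

    own_reported = dc_capex / life + dc_opex    # 20M/yr expense via depreciation
    print(own_reported, cloud_opex)             # 20M/yr vs 30M/yr on paper...

    # ...but year one's cash outlay for the DC is the full 110M, and that is
    # the number Wall Street reacts to -- hence the optics problem above.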

If you're huge, it's never cheaper to farm out the administration of your critical infrastructure, even to qualified experts. But because of the dominance of the Quarterly Report Cycle in modern business, this gets swept by the wayside as "outdated thinking".

I say this, by the way, as an IT professional building out cloud solutions for companies. For a lot of small-to-medium sized businesses, and startups especially, the cloud makes sense. If someone at Federal Express thinks the cloud makes sense, they're looking out for themselves, not the company.


> benefit from the ability to rapidly scale on cloud platforms

Big companies need this too. Some team wants to spin up some new service to try out some idea? In cloud-land they push a button. In "rack & stack" land they have to wait for Ops to purchase the hardware and provision it.

Cloud makes it cheap and easy to try out new stuff.


Assuming you're doing dedicated machines for services. My company runs on-prem with a big cluster scheduler & maintains headroom in it; small deployments of new services and modest scale-ups of existing services don't require explicit capacity requests. Only if you're going to provision a huge number of instances do you need to wait for infra to buy machines. Which also requires advance planning with the "elastic" cloud anyway.


As with any cloud, you should always have spare capacity on-prem. Spinning up a VM on-prem doesn't have to be slower than in the cloud. At this scale, you're just your own cloud provider.


“Paying for another company’s profit margin as well”

All transactions, whether in the cloud or on-prem, go toward some other company's profit margin.


I don't think it will be a future generation of executives; I think it's the current generation of executives at companies and agencies like Google, Amazon, Microsoft, and the NSA who will not rent all their information infrastructure. I agree that the executives who thus turn their companies into sharecroppers on land owned by Microsoft et al. will be richly rewarded despite the misfortune that befalls their stockholders; like Stephen Elop, if things go badly, they have an impeccable excuse.


Thanks. I think it will have to be a future generation of executives whose egos won't be tied up in old debates and who therefore will be more willing to sacrifice old decisions. If things go badly, that future generation of executives will take over sooner.


Some might return to on-premise data centers, but most will not.

It's really just like the utilities these days: 98% of people will not dig their own well, install their own sewage system, and buy a generator, but big companies can still opt to have these installed on-premise, even though some of them are just backup systems.

There is no going back from the cloud for 98% of us.


You generally cannot go from city water/sewage back to on-prem for legal reasons. However, many are going on-prem with solar. I am going almost completely off-grid for cost and SLA reasons.


What's interesting is people seem really excited about installing rooftop solar, but I see fewer companies building collectors in places with cheap land and ample sun. That tells me part of the selling point of rooftop solar is it feels good, and the economics might not actually work.


> but I see fewer companies building collectors in places with cheap land and ample sun

You mean there are fewer companies making large investments than smaller ones?


> part of the selling point of rooftop solar is it feels good

As someone who's been building an off-grid solar system this is largely the case. I've spent thousands on panels and batteries, my electricity bill is only a few hundred a month but we have frequent outages during storm season. It's definitely more of a feel good and control thing. If I consider the cost of running the electricity to our remote location, I'd probably barely break even in 10 years if I'm lucky.


If you spend US$6000 on panels and batteries, your electricity bill is US$300/month, and the outages and power lines cost you nothing, then your payback time for the off-grid solar system would seem to be 20 months, under two years, not 10 years. The outages and power lines would seem to shorten the payback time further, not lengthen it. Without disparaging the feel-good control thing, which I think is reasonable and important, I feel like I must be misunderstanding something about your explanation.
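
Spelling that out with the same hypothetical numbers (a toy calculation that ignores financing, battery replacement, and maintenance):

    capex = 6000        # US$ on panels and batteries (hypothetical, from above)
    monthly_bill = 300  # US$/month grid bill the system replaces (hypothetical)
    print(capex / monthly_bill)  # 20.0 -> payback in 20 months, under two years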


The economics favors distributed storage rather than generation.

I worked in a building during rolling blackouts in California. The Tesla Powerwall on the building was "screaming" (ultrasonic harmonics from piezo components) 24/7. The security guy actually came to get me to ask if the thing was going to explode.

Clearly, the building owner was load shifting and making quite a bit of money from it.


> The economics favors distributed storage rather than generation.

They shouldn't, though. If powerwalls are that great, it should be more cost effective to install giant banks of them in cheap warehouses outside major metros since you'll have economies of scale on installation and maintenance. Utilities are sorta getting into this game, but it's more common at the individual level, despite being more expensive.


> If powerwalls are that great, it should be more cost effective to install giant banks of them in cheap warehouses outside major metros since you'll have economies of scale on installation and maintenance.

This works only if the electrical grid has the ability to consume back your stored power. Most electrical grids in the US would have problems doing that.

For example, UCSD has their own generating facility; however, SDG&E was sufficiently backward that UCSD was unwilling to go through the grief necessary to put energy back into the grid. Therefore, UCSD only does load shifting or disconnects the campus from the grid during rolling blackouts.

Because of the poor energy grid transport, storage batteries make the most sense when you can consume the stored energy locally to minimize your grid consumption during high energy prices -- i.e., you have a high-rise office with lots of air conditioning, or a manufacturing facility.


There are definitely countries and US States where the economics of solar don't work.

Luckily, it does work for most of the US, especially desert/arid areas.

> but I see fewer companies building collectors in places with cheap land and ample sun.

They are plentiful in Turkey and some southern EU countries like Greece/Croatia, I believe. Not sure about deployment in the US.


Captured Solar (electricity in general) doesn't travel well over long distances.

Generating solar for local use saves the cost of transport.


> electricity in general doesn't travel well over long distances.

High-voltage line losses can be as low as 3% per 1,000 km. That should be easily offset by the better location and better angle/tracking.
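
As a rough sanity check on that claim (the capacity factors below are illustrative assumptions, not measurements):

    loss_per_1000km = 0.03                     # figure from the comment above
    delivered = (1 - loss_per_1000km) ** 2     # ~0.94 left after 2,000 km
    cf_rooftop, cf_desert = 0.15, 0.25         # assumed capacity factors
    print(cf_desert * delivered / cf_rooftop)  # ~1.57x energy per watt of panel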


It's amazing there are no other voices in the room pushing back against this. It really makes you wonder how companies even function at all, much less make money. Being so short-sighted never ends well.


Actually, I don't think the decision is that shortsighted in this case. It could work out pretty well for a decade or longer. IMHO FedEx is likely to have a good run before the costs and risks of this decision start outweighing its benefits.


However, when the situation starts to become unbearable, how long will it take to rebuild the infrastructure and source the talent? Another decade? Adding the startup costs of any possible rebuild, you're in for a substantial net loss. So you're probably going to shy away from such a reintegration while losses accumulate further, until it becomes unsustainable, putting the entire corporation at risk.

(The wise thing to do to mitigate financial impact may actually be starting the reintegration process right now.)


Will it? If they do this right, it won't. There is a large cost to running a data center. So long as there is competition, they are better off with experts doing computers while they focus on logistics, which is what they do well.

Of course they should make some effort to ensure there is competition. That means being careful to ensure more than one provider exists, even if it means taking a slightly higher-priced option. Also, contracts should end early if the provider is bought out (that is, if AWS buys Google Cloud - see above about ensuring there are competitors in the market).


> Of course they should make some effort to ensure there is competition.

My guess is, we'll rather see some consolidation, like in every other segment. Which also means considerable expenses for the remaining players, who will feel an increasing need to recoup some of them through pricing. As these players are sharing roughly the same boat, this will quite naturally show some characteristics of an oligopoly. At the same time, self-managed infrastructure will have become a rarity and the costs of reintegration prohibitive, which should allow for some elasticity in market prices. Also, any investments toward reintegration won't show results during the tenure of the current management, minimizing the chances of it happening, yet again. (This is not a level playing field anymore.)


$400m saved today in exchange for $500m 10 years from now sounds like a good deal financially!

Sometimes tech debt can be used instead of financial debt!


Mind that if you ever wanted to integrate again, this means buying land, planning infrastructure, building it, planning, obtaining and setting up hardware and software, hiring, training, defining and testing procedures, etc., as you're starting from zero again – and all this while the costs that forced you to consider the move keep piling up. Odds are you'll never do it, and cloud providers will know this. The comparative costs are now not those of running your own infrastructure, but those of setting it up (again).


FedEx had very few on-site people managing the hardware; it seemed like most hardware work was done by the suppliers. There were also very few servers there actually running; I think the software powering the enterprise took up a lot less than they expected.

Either way, I think you're spot on with the "training" part. FedEx was one of the first companies to raise significant capital and pursue a business dependent on technology. It also treated its people very well, so relatively few left after the '70s. I think they just didn't train the next generation enough, and they realized that those skills are disappearing rapidly because no college grads want to learn old & boring stuff, and everyone who does know it costs a lot to keep (FedEx still relied heavily on COBOL when I left ~5 yrs ago).


I deeply appreciate you sharing this context. What do you think about my wild predictions of disaster?


> how long will it take to rebuild the infrastructure and source talent?

How costly is it to maintain that level of talent? Do you think top server admin talent is screaming to work at FedEx or Amazon?


There are always going to be short sighted decisions. But a good design that would abstract away the fact that you are running on AWS or on-prem would pay dividends.

With a good design, there is always that implicit threat that they could move back to on-prem with little effort.


Convincing people future risks outweigh the short term gains is very hard. That goes for all things.


There are very likely voices, but they didn’t win this battle.


They optimize for the near term, and after a while other companies optimizing for their own near term, startups that stumble upon an awesome new product-market fit, upend the previous companies.

There is no grand plan, there's just evolution.


Some are starting to question the almighty Cloud™: https://www.economist.com/business/2021/07/03/do-the-costs-o...

One of the things that bothers me about AWS is that I still need to manage their risk of a data center going down, by using multiple availability zones and taking on the complexity of the extra resilience myself. There is a conflict of interest: AWS has a perverse incentive not to invest in keeping a single AZ robust, even though, barring a natural calamity, an AZ should not go down.

I thought one of the major reasons for going to the cloud was that you need not manage this risk, offloading it from on-prem.


I disagree. This isn't FedEx's core competency or a differentiator for them. They're also not at a scale where it could make sense to colocate or build their own DCs. Fiddling with low-level tech infra would just be a distraction.

Now, they might find that software is a differentiator for them, but that's different.


For a logistics company, the IT system is the differentiator. Reliably routing and tracking packages to maximize utilization of trucks and planes allows them to be cheaper and faster. That all depends on both hardware and software. So I'd definitely think that running their own systems should be a given.


They are doing the rational thing and there is no going back.

HNers need to get their heads around this: the economics of the cloud represent a fundamental, secular shift.

Imagine if Fedex designed and made their own delivery vehicles, and then 'outsourced' that to Ford/GM. You might say 'look at Ford's amazing margins, look at the money being left on the table' - but in reality, it's still more cost effective to 'buy vehicles' than to 'DIY' them.

In the 'long run' those Azure/AWS margins will erode - and/or - the long tail will creep up on them: basically, data centers offering less, for less, and companies realizing they don't need the complexity of AWS to do basic things.

What that will look like is hard to say.

With hardware it meant 'pivot to Asia'.

Will this happen again? Chinese companies creating large data centres in the US with minimal staffing, monitored and operated out of China?

If we were not in a giant geostrategic kerfuffle with them, then yes, I would say that is the future. But given security issues etc., it's probably not.

Can India do that? Maybe.


Ford and GM and Freightliner are not in a position to use price discrimination to confiscate FedEx's entire profits; if they try to sell trucks to FedEx at an inflated price, FedEx can just buy the same trucks from dealers, or buy them used, or buy GM trucks instead of Freightliner trucks.

By contrast, generally speaking, cloud providers are in an excellent position to use price discrimination to extract the entire surplus value of every transaction. And the flexibility and reliability provided by FedEx will to a significant extent simply be that of their computers.

(As chrisseaton pointed out, Boeing is in a cloudier position.)


So 100% yes, AWS has much more power in the value chain than Ford. But not really.

Especially to the extent that special services are not leveraged, there are cheaper alternatives. A lot of IT is just 'instances', not 'fancy cloud queues'.

AWS pricing is very transparent and people can make the decisions they want. Generally speaking, AWS does not price discriminate on the basis of 'business model' and their margins are meaningless with respect to the overall business efforts of a Big Co.

I mean - unless you are doing big AI crunching or 'free content hosting' - frankly, the AWS bill is not going to be a huge line item relative to overall operating costs.


Fedex isn't building their own vehicles, same as they wouldn't build their own servers. But going cloud is more like outsourcing all transport activities to other logistics companies. No one says they have to do every part of DC ops themselves, but outsourcing everything and using proprietary products makes them very vulnerable.


Sounds a lot like farming out NOC-Ops to Wipro. Somebody had to think it was a good idea & managed to sell it to the policy makers for a nice promotion. So, when reality hits, does the same person get demoted further down than their original ladder rung?


They are gone by that point, with a padded resume to boot :-)


We have our own datacentres. We were lucky enough to have got them up and running in the pre-cloud era. If we were any later we would be full cloud, but the reality is we outcompete everyone on price in our space because of it.


Tragedy of the executives


> that will likely be many, many years down the road

And this is where people in my age range (20-30) will get to swoop in if we invest the next 10-20 years learning about and tinkering with server hardware.

I highly recommend it for everyone.


This is one of the things bad about Wall Street: Short-term decision-making. Everybody cares what happens to stock prices in the near-term, rather than 10 years out.


Great concrete example of the principal-agent problem here.


At least in Oil and Gas, servers moved back into the companies' own towers for the first time in a decade. Latency issues.


AWS Outposts?


Ah, I did not know about AWS Outposts. They use Azure, so it's a hybrid setup: they moved the latency-sensitive servers back into the tower, and everything else lives on Azure.


And they will gently depart under their beautiful golden parachutes. What a racket.


reminds me a bit of:

https://www.reddit.com/r/AskHistorians/comments/vqr30e/jack_...

"Jack Welch extracted record profits from GE for 20 years, but left it a hollowed-out "pile of shit," according to his successor. What exactly did Welch do that was so damaging, and how did he get away with it for so long?"

from the post reply, in case it's not available from the URL:

alecsliu:

Welch took over a GE that was at the time, a major company. At the time, he viewed GE as bloated and needing to change. While he might've been right about that, the approach he took was perhaps less ideal.

One of the worst things he helped make commonplace among American companies is the concept of stack ranking. The way it worked was like this: people were divided into three groups: A, B and C. A's were the top performers who needed to be rewarded generously, B were adequate performers who should be allowed to stay, and then C were those who needed to be fired.

So far so good right? Well, not exactly. For Welch, these three buckets could be separated into the top 20%, the middle 70%, and the bottom 10% (20/70/10). Based on the above points, the bottom 10% would thus need to be fired annually.

In the short term, this helped in making the company lean and look more productive, increasing the bottom-line and portraying an image of success. However, the long term consequences of such a change was cultural degradation and the introduction of new bloat and waste (running all of those performance reviews and firing and rehiring so many people takes a lot of money.) Consider the case of a perfectly adequate team: the entire team has achieved their targets and has contributed to the company as per their job description. The issue? In stack ranking, 10% of this team would still have to be fired even though everyone did their job. Unsurprisingly, the introduction of this competitive atmosphere where it isn't enough to succeed, one must be better than their peers, results in backstabbing, competition, and a host of issues which eventually weaken a company's competitive edge.

This was only a part of Welch's general treatment of workers as numbers rather than humans. Aggressive cost-cutting, offshoring, etc. were the norm under Welch's regime and he would destroy entire divisions. This again, was great in the short-term but bad in the long term. Welch clearly had a dim view of company culture and believed it to be unimportant.

The other major issue of Welch's was the acquisition of hundreds and hundreds of businesses, as part of Welch's goal of acquiring his way to the top. On the surface, this is what Welch did; on paper, he didn't destroy the main profit makers of GE so much as create new ones, primarily in the form of its financial arm.

The result was that this allowed GE to play with the numbers in a way that allowed him to make sure that GE was always meeting targets set by Wall Street; in order to generate the right earnings, simply buy or sell certain assets, write-off others, etc. and when your company is an acquisition machine, it's not too hard to find the right numbers. This process helped expand GE into a mega-conglomerate but again, ultimately left the company in a weaker than expected state. In particular, on paper the core business of GE became its financial arm, as that was where all the funny business with the numbers was happening (nothing explicitly illegal though). Ignoring the damage the Welch did to GE's profit centers through his horrible business practices, this was a huge part of why GE declined so rapidly in the years following. GE's valuation was based on an inaccurate picture of its profits and value, so when the truth started coming out (especially with the Great Recession collapsing financial services profits), GE was quick to follow.

As for why Welch was able to get away with it all? Well, the answer was because he was delivering. He hit the earnings targets, he made the board and shareholders very happy; by all accounts GE was the paragon of success and everything was going right. What Welch did was unprecedented, and it's hard to really overstate that. For all of his faults, he had America tricked into believing that what he could do was unique and that he could avoid the realities of the economy and cyclical markets, that no matter what was going on, GE was special. Welch died a rich, rich man and many, many people profited greatly off of GE during its 20 year bull-run.

Edit: My off-the-cuff writing is always horrible wrt to grammar and structure so I'll probably edit it later haha



