I previously led a significant chunk of developer evangelism at IBM. Mainframes are unbelievably powerful, feature-full, and cost-effective. You are almost certainly better off buying a low-end mainframe than hiring a team of engineers to build yet another not-great data replication/recovery/HA system. Imagine being able to hire engineers to work on your core value add instead of basic computer management problems IBM solved in the 60s.
The reason you don’t is that it is basically impossible to get access to a mainframe you can play with and learn on. And IBM’s internal incentives and financial metrics ensure that can never happen.
* My views are my own and not those of my former employer.
Sorry to say that your previous "evangelism" shows. It would be very interesting to hear what "unbelievably powerful" means (as in facts and numbers) compared to what a hyperscaler can provide. E.g. SPEC for CPU, GFLOPs, Tensor-ops for inference, latency and so on. Last I checked the consolidation story that claimed factors of savings took underutilized ancient Intel boxes as baseline.
"feature-full" is also quite a stretch when it comes to software when most modern software needs to be ported or run in a Linux LPAR.
In addition, I'd question that lack of easy access is the main reason for the lack of adoption. It might be part of the problem, but the other problem is that the thing is a niche product, and IBM doesn't seem to be terribly successful at explaining either what the niche is or how to extend it.
Being able to spare a couple of cores for a new partition, then move a workload onto it to replace the memory and even the processor of a faulty partition with zero downtime, is great.
Consider that partitions are completely isolated from each other. Not some pesky soft isolation either, it's all done in hardware. In practice, every partition in a mainframe is a different logical yet isolated mainframe.
Mainframes are built for different kinds of workloads. They are not cloud machines. They are batch machines with 6 or more nines of uptime.
In my current job, a mainframe would be useless. However, for mission-critical core services which need predictable latencies (bank transaction engines, big central databases, etc.) I'll take a mainframe all day, every day.
> Consider that partitions are completely isolated from each other. Not some pesky soft isolation either, it's all done in hardware. In practice, every partition in a mainframe is a different logical yet isolated mainframe.
"Completely isolated in hardware" as in isolated by a software hypervisor (PR/SM) that doesn't have a whole lot of hardware support?
> Mainframes are built for different kinds of workloads. They are not cloud machines. They are batch machines with 6 or more nines of uptime.
That's kind of the point. What exactly is the niche, i.e. which new customer with exactly what software and latency requirements would switch from their current system to a z? IBM won't tell you apart from buzzword bingo.
Wow. She nails his guff at the end, with one single word.
I'm nominating this video for the next Oscars, Grammies, or whatever the heck the movie awards are called (I don't keep track of such stuff). Bet it'll win gold in the under-1-minute category ;)
They’ve spent the last 60 years developing and refining everything for extreme levels of uptime and high availability. It’s truly unique to these mainframes because IBM controls every aspect of it. The closest equivalent of that kind of vertical integration is your MacBook Pro.
I highly doubt there are any brand new mainframe customers these days. But many of the biggest companies you know and use every single day have tons of workloads that will never move off of it.
I don't think I even need to read it. The uptime and several 9s of high availability are attainable with z/OS and a parallel sysplex; no doubt systems like these are used by banks and others in the wild.
But this doesn't say anything about "unbelievably powerful" or "feature-full" as in the OP?
If the niche customer claim is "they will never move off of it because nobody ever wants to touch millions of lines of COBOL", then that's fine. It's likely sane from a business perspective to continue using them as long as the maintenance burden is manageable. Luckily, managers in those more conservative companies consider full rewrites dangerous, rightly so.
But in order to claim otherwise (i.e. unbelievably powerful and thus for everybody) we need to see numbers.
I’d say 15+ years ago, they were very powerful relative to other solutions on the market during those times. But I agree that’s no longer the case when compared to modern server racks.
I pointed to that Redbook about HA features since it's a major differentiator of mainframes even today, but there are also these that document other features:
You can find 1 or more books that go into a deep dive of every chapter.
IBM has taken input from the biggest companies in the world over many decades to run as many of their workloads as possible. They’ve honed everything in their software and hardware to do so, down to developing specific cpu instructions to support specific use cases. If all of this doesn’t convey an extremely feature rich system, I’d like to hear why you think otherwise.
It is extremely optimistic, to put it mildly. You can definitely get inter-LPAR noise. I've seen it first hand, personally, on systems that I've worked on.
Moreover, the norm for zLinux systems is to isolate guests with zVM, not LPARs. LPARs are more likely to be boundaries for chunkier workload definitions - that advice may have changed since I last cracked open a redbook. zVM offers similar isolation to what you'd see with KVM, VMware, or other hardware-assisted hypervisors.
An RTOS defines a stricter latency envelope than a mainframe, almost into "deterministic latency" range.
For example, you can say "this operation will take at most 3ms" in an RTOS, and it'll never exceed that number. If you're running on a fixed-frequency system, you can even say you expect an answer in 2.8ms, all the time, every time.
In a mainframe this is a bit relaxed. You can say that latency is <3ms 99% of the time, <3.1ms in 99.999% of the time and <3.5ms in 100% of the time.
IOW, you never think/say "somebody is running a heavy load on the host, and I also slowed down because of them".
What I tried to say is, a mainframe might answer late, but rarely, and only very slightly. An RTOS doesn't deviate from that number in either direction. It's neither early nor late.
Some RTOSes don't even consider late answers acceptable.
That's true, but that jitter envelope gets bigger as the load increases. For a mainframe this jitter envelope is much narrower, and in a competent RTOS, that's basically 0.
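To make the distinction concrete, here is a minimal sketch in Python with made-up numbers (nothing measured on real hardware): a "mainframe-style" envelope is a set of percentile bounds on observed latencies, while an RTOS-style guarantee is a single hard bound that every sample must meet.

    import random

    random.seed(42)

    # Hypothetical service times in ms: a tight cluster around 2.9ms.
    samples = sorted(random.gauss(2.9, 0.05) for _ in range(1_000_000))

    def percentile(sorted_xs, p):
        # value below which p percent of the samples fall
        return sorted_xs[min(len(sorted_xs) - 1, int(len(sorted_xs) * p / 100))]

    # "Mainframe-style" envelope: bounds at several percentiles.
    for p in (99.0, 99.999, 100.0):
        print(f"p{p}: {percentile(samples, p):.3f} ms")

    # RTOS-style guarantee: one hard deadline every sample must meet;
    # a single miss counts as a system failure, not a tail event.
    deadline_ms = 3.5
    assert max(samples) <= deadline_ms, "deadline miss = failure"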
Telum is attached to the L2/L3 so it's that bandwidth as long as the model fits into it. Afterwards you go to memory and you really don't want to compare DDR4 modules with RAIM against HBM3 or the likes that a current compute GPU uses. Latency might be closer but you mentioned bandwidth.
IIRC, with 32 MB per core (8 cores per die, and a Telum package has two dies), one socket has 512MB of cache for 16 cores. L1 and L2 are in-core, but L3 is on-die unused L2 from other cores and L4 is unused L3 on the other die.
I am not sure if that continues off-package, but if it does, a drawer with 4 chips of 16 cores each will have 2 GB of off-chip cache and a full-sized Z with 5 drawers would have 10 GB of off-drawer cache (at this level it's probably not that much faster than same-drawer memory).
As for the RAIM, I think it's safe to assume it has a very wide path (n-1 modules) to the sockets and that aggregate bandwidth will not leave the 4 sockets starved (even if latencies suffer because of the redundancy). And you can replace a defective memory module on a running computer without it having to pause.
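For what it's worth, the back-of-envelope cache math works out like this (a sketch in Python using the figures above as assumptions, not vendor-confirmed numbers):

    MB_PER_CORE = 32          # virtual L2/L3 per core, per the recollection above
    CORES_PER_CHIP = 8
    CHIPS_PER_MODULE = 2      # Telum dual-chip module
    MODULES_PER_DRAWER = 4

    chip_mb = MB_PER_CORE * CORES_PER_CHIP          # 256 MB per chip
    module_mb = chip_mb * CHIPS_PER_MODULE          # 512 MB per dual-chip module
    drawer_mb = module_mb * MODULES_PER_DRAWER      # 2048 MB = 2 GB per drawer

    for drawers in (1, 4, 5):
        print(f"{drawers} drawer(s): {drawer_mb * drawers / 1024:.0f} GB of cache")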
Some more details for people who know about these things, emphasis mine:
2 chips make a dual-chip module with 512MB cache, 4 dual-chip modules make a drawer with 2GB cache, and a 4-drawer system with 32 chips makes 8GB cache.
"The L3 and L4 virtual design across the fabric provides 50% more cache per core, with improved latencies. The idea is that software and microcode still see two separate caches. These caches are shared and distributed across all eight cores via a 320 GB/s ring and across the integrated fabric. Horizontal cache persistence should further reduce cache misses as well. Specifically, when a cache line is ejected, the system looks for available cache capacity on other caches, first on the chip and then even across the 32-chip fabric."[0]
"The accelerator itself delivers an aggregate of over 6 TFLOPS of 16-bit floating-point throughput per chip to scale up to roughly 200 TFLOPS per system. 1024 processor tiles in a systolic array make up the matrix array, and 256 fp16/32 tiles make up the accelerator for computing activations and include built-in functions for RELU, tanH, and log. The platform also provides enterprise-class availability and security, as one should expect in a Z, with virtualization, error checking/recovery, and memory protection mechanisms. While 6 TFLOPS does not sound impressive, keep in mind that this accelerator is optimized for transaction processing. Most data are in floating-point and are highly structured, unlike in voice or image processing. Consequently, we believe this accelerator will deliver adequate performance and is undoubtedly much faster than offloading to another GPU-equipped server or running the inference on a Z core. The latency of off-platform inference can cause transactions to time out, and inference does not complete"[0]
"Intelligent Prefetcher and Write-Back
– 120+ GB/s read bandwidth to internal scratchpad
– 80+ GB/s store bandwidth
– Multi-zone scratchpad for concurrent data load, execution and write-back
Intelligent Data Mover and Formatter
– 600+ GB/s bandwidth
– Format and prepare data on the fly for compute and write-back"[1]
TBH, I don't really think they compete in anything like similar markets.
You buy a DGX A100, or a cluster of them, for training and running large deep learning models (or for doing "traditional" HPC).
IBM's solution is more of a small inference engine that is part of the CPU, so you don't need to move your data off-chip when doing a little bit of inferencing as part of some other workflow. I don't work with mainframes so I could be talking out of my behind, but maybe something like DL-assisted fraud detection as part of processing bank transactions?
They are tailored to the traditional mainframe workloads (they do a lot of hardware/software co-design in their mainframe lineup), so I wouldn't expect a mainframe designed for the generic cloud hyperscale workloads.
In any case, I have played with their LinuxONE Community Cloud service (running on the previous-gen z15) and it's very fast. The impression I get is that it doesn't need to wait for IO. There is a ton of very clever engineering on those machines and the z16 is a technological wonder.
> Mainframes are unbelievably powerful, feature-full, and cost-effective.
Maybe things have changed in the last 10 or 15 years, which is the last time I had a legit mainframe at a day job, but back then none of those things were true if you looked at it for more than 20 minutes.
I seem to remember nothing being included in the base mainframe… when you started to add things like DR and data duplication and virtualization, it became extremely expensive. Like on the DB side, effing Oracle was much cheaper.
Software licensing is expensive on them, but the people to keep your 99.999% uptime on your Kubernetes cluster aren't cheap either.
And software licenses are one of the reasons why LinuxONE machines exist - they don't run z/OS, so you don't pay those licenses. You can even start a dozen VMs under an LPAR and run your Kubernetes cluster as if it were running on more common hardware that just never, ever fails. IIRC, you can run a special version of z/VM to manage your Linux VMs if you don't want to run Linux on the LPAR and use KVM for your VMs.
When I moved from a tech company to an aero company who used mainframes, there were literally hordes of IBM contractors who helped maintain the mainframe environments. Getting anything done in those environments was a multi month project, even for basic stuff like patching the OS. There were probably 2 or more sysadmins per rack, employed to just keep the damn lights on those boxes.
For context, there was about 1 admin for every 400 servers at the tech company, and that was for the entire tech stack (LAMP).
Mainframes require a lot of care and feeding. In my experience, having worked at 3 different companies who relied on them (education, aero and finance), their capex is higher, their opex is higher and their uptime / reliability is entirely dependent on the facilities / staffing.
You don't think it requires some expensive mainframe admins, even with the fancy software? Anywhere I worked with a mainframe, the mainframe admin was highly distinguished within the org and extremely knowledgeable.
I would argue Kubernetes talent is cheaper than mainframe talent these days because it's ubiquitous.
Yeah, I am that super knowledgeable K8s guy, but I’d still say true mainframe admins command a higher rate.
But in regards to K8s talent, there are lots of people who think they know Kubernetes in production but have never had to actually upgrade the cluster, go through the process of updating manifests, deployments, and CSIs, or deal with API removals.
For many generations now, you can't run Linux on bare metal anymore, it has to be inside an LPAR. I think this is true for z/VM and the other operating systems as well.
It may be implemented in the system firmware, but it's still a hypervisor performing context switches and enforcing access to PCI devices. Even if you've never looked at the processor architecture manuals, you can tell this is what's happening when you can assign 0.1 cores to an LPAR. Different implementation details, but the same functionality as SPARC LDOMs and Intel's VT-x & VT-d.
It took many years for Spectre and Meltdown to be discovered, and that was for CPUs affordable for individuals.
How many security researchers are even familiar enough with the concept of a mainframe to consider looking for an LPAR breakout, let alone have access to the necessary hardware?
Also consider: anyone with access to a mainframe will never ever get approval to try to hack it because companies that have mainframes will never want the risk of accidentally breaking the host in some way.
"... some mainframes have models or versions that are configured to operate slower than the potential speed of their CPs. This is widely known as kneecapping, although IBM prefers the term capacity setting, or something similar. It is done by using microcode to insert null cycles into the processor instruction stream. The purpose, again, is to control software costs by having the minimum mainframe model or version that meets the application requirements."
This whole time I just thought mainframe was an older style word for a large rack based server or server room. Like 'cloud' storage for someone's computer running miniserve.
What is the real difference between a mainframe and, e.g., a rack full of H100s, or a rack full of 100Gbps networking gear, or some nice stack of 12x blades with 8x 256-core CPUs?
Why or how does a "mainframe" have more power than that?
Mainframes deliver reliable uptime. They so far are the only ones that can do it reliably, for decades.
Essentially the bunch-of-boxes model (k8s being the new kid on the block) has been trying (and mostly failing) to provide what mainframes have been providing for 60+ years.
Which is being able to treat your workloads as just a random virtual job you can push wherever and let it run while also giving you ridiculous uptime.
Mainframes are basically hardware-infused uptime-delivery machines. They can and will offer 5 9's without any trouble. AWS, Azure, Google's cloud, none of them can deliver that amount of uptime; they ALL have failed repeatedly, so much so that they purposely try to obfuscate their downtime records. Many don't make any historical data available.
k8s and the like have been trying and failing at reliable uptimes. Sure, we've arguably been making some progress, but your average self-hosted k8s shop has full-time dedicated teams of people that do nothing but babysit k8s. How many staff does your average mainframe org dedicate to keeping the mainframe alive? Usually 1 person, maybe two. Of course the price you pay IBM or whoever you choose as a mainframe provider will help offset the staff savings from your k8s team :)
It's not about raw compute power. It's about keeping a workload alive for as long as you can deliver power to the mainframe. i.e. the mainframe promises to deliver uptime for as long as you can keep power to the machine. As parts fail, seriously any part: memory, CPUs, disks, backplanes, it doesn't matter. Mainframes can route around the failed part and you can replace it without turning anything off or affecting your workload. This means your mainframe is sized larger than your workload, of course. It's not like the fundamentals of compute change in that regard.
The question then becomes, is the juice worth the squeeze? If your entire business model requires uptime, then you best really, really care about uptime. There is a reason the Visa and Mastercard networks have basically never, ever been down. It's because they know their business only exists as long as their network works. When you want uptime at all costs, you don't run k8s(or whatever the latest craze is tomorrow), you run a mainframe.
Most of us get more uptime than we need with insert favourite cloud provider here. Uptime isn't something they actually sell when you read the contracts you sign. Uptime is just marketing spam.
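For anyone who wants to sanity-check the redundancy argument, here's the standard availability arithmetic as a minimal Python sketch (made-up per-box numbers; the big caveat is the independence assumption, which real clusters rarely satisfy because failures correlate through networks, control planes, and rollouts):

    SECONDS_PER_YEAR = 365.25 * 24 * 3600

    def downtime_s(availability):
        return SECONDS_PER_YEAR * (1 - availability)

    box = 0.999                             # assume one commodity box: 3 nines
    for n in (1, 2, 3):
        # n fully independent replicas; real-world correlated failures
        # mean the actual figure is worse than this ideal.
        cluster = 1 - (1 - box) ** n
        print(f"{n} replica(s): {cluster:.9f} -> {downtime_s(cluster):10.3f} s/year")

    print(f"5 nines claim : {downtime_s(0.99999):10.1f} s/year")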
Well that was their last mainframe model. Yes, VMS still exists, but only on x86 now. VAX has been dead and buried for a long time now. They had a whole different compute architecture or two between then and now.
VAX -> Alpha -> Itanium -> x86.
I worked on VMS at the tail end of the VAX days and then through the Alpha days. Overall I thought VMS was pretty neat. Built-in file versioning, and the cluster stuff was awesome, arguably better than what k8s and the like can even provide now. They had uptimes measured in decades; that's not something k8s or any of the other new clustering stuff can even dream of achieving right now.
Yeah, I got an account with VMS Software a few months ago.
However, their website and staff members are so brain damaged that I just couldn't bring myself to waste time with it.
Specifically, their website can't even take apostrophes (or many other non-alphanumeric symbols) in passwords.
Reported that to them, and their staff not only didn't give a shit, they were adamant that "$#@!%*&" are all the non-alphanumeric characters anyone might ever need (in passwords). :(
Thus my complete lack of interest in VMS from that point onwards. ;)
You oversell mainframes. Not sure why you're mentioning k8s, but Google, which brings in 10x more money than Visa and Mastercard combined, does not use mainframes at all, just k8s or its equivalent, and yet they don't have downtimes either.
With k8s on a major cloud provider you get 99.99% for what, $150/month? And there is no maintenance whatsoever.
A full team managing k8s is a lie, even on-prem. What exactly is there to manage on a daily basis?
You are talking about hosted k8s, which is not what I'm talking about. If you have never worked on a on-prem k8s team, you are totally missing out! :)
k8s is brittle, all the other competing tools are also fairly brittle, so it's not like k8s is really alone here. This is why we keep replacing the newest mess of virtualization every once in a while, someone eventually gets fed up with whatever the current crap is and writes a new one. It becomes popular, breaks in new and unique ways, rinse and repeat.
k8s is not even a decade old at this point. Mainframes have been around with 5+ 9's of reliability for 6+ decades.
I'm not overselling what mainframes provide. I think you just have selective memory.
99.99% is not what mainframes provide; they provide many more 9's than that, think 7+.
Most of us don't even need 99.99% uptime, so most of us probably shouldn't be buying mainframes. If you DO need severe uptimes, then you either have a huge oversubscription and dedicated teams of people around the clock babysitting things or you buy into mainframes. You absolutely don't buy AWS or Google Cloud or Azure and say, that's good enough, because their uptimes are just marketing speak, not reality.
Even if what you're saying were true, the z16, for example, targets different needs. Can you scale up to seven or eight nines, and even if you could, could you for the same price? That's 3 seconds or 300 milliseconds of downtime per year, reliably. Along with predictable high performance (vertical), hot-swapping anything including memory, resilience via hardware under-provisioning, etc. That's what these beasts are for.
> With k8s on a major cloud provider you get 99.99% for what, $150/month? And there is no maintenance whatsoever.
Gross understatement of costs, don't you think? Last time I checked it was a base fee of $150/month for managed k8s + whatever computing resources you end up using.
I think a lot of people might say only IBM sells mainframes - the “z” system.
It’s designed to work as a single high-availability machine with a focus on I/O speeds, rather than a PC cluster that implements high availability at the application level and communicates over IP. You can hot swap faulty components without turning it off. I haven’t ever used one, but I believe you write application code without worrying about such details - the hardware and the libraries you link against will just “make it work”.
Many companies also still rock Fujitsu or Unisys mainframes - IBM isn't the only player in this town. But at least Fujitsu's mainframe EOL is in 2035, so that won't be more than historical trivia soon.
Well you can run Hercules, the emulator. But you will then discover that the mainframe 3270 terminal user interface experience is terrible. A funny thing is this: there is a mainframe plug-in for Visual Studio Code. The idea is you do your COBOL development in Code, and only the plugin actually talks to the mainframe.
A decade ago there was an Eclipse version of this...
> Watch a 10-year veteran insurance employee use a green screen.
Yes! I knew an Insurance customer service op who self-described as having no computer skills, but when I saw her using Reflection it was like watching a speedcuber savant.
The spit take for me was doing process documentation, screen by screen.
I would explain to them to stop after each screen or keystroke, so I could document it.
Despite being smart people and understanding the ask, almost every SME I worked with failed at some point, blazing through multiple screens without even realizing it.
> Sadly, the number of people who remember when GUIs could be driven from memory by keyboard-only has dwindled.
This is so very true, and it's tragic.
Watch an experienced keyboard user, such as many blind users, navigating Windows and it's a wonder of efficiency and speed. It makes the Vim prowess of many Linux folks look very sad, because Vim is just a single editor, while with Windows' keyboard UI, the entire OS and all its apps are operable at this speed. Desktop, file manager, all bundled apps, all 3rd party apps... everything has one global unified keyboard UI.
There was a big appliance/TV/stuff retail chain in my home country that created a fancy new web app for their sales personnel.
If you were smart you'd always look for the older salesman, who would ignore the new web app shell and use the old mainframe app from a terminal emulator. It made the sales experience like 3x faster.
This kind of thing goes back five decades, actually. One of the earliest versions of Unix was PWB/Unix, for "Programmers' Workbench". It was designed as a system for AT&T engineers to author programs for the mainframe in a more sensible (for the time) environment and included software to submit jobs to the mainframe once the programs were written.
The idea of "Unix as your IDE" is a deep, deep cut into Unix lore.
> But you will then discover that the mainframe 3270 terminal user interface experience is terrible.
I was an intern at IBM UK in the late 80s and I had a 3279 on my desk connected to a 3090 mainframe. My recollection is that it was fine for editing and running code: sharp, bright text and snappy response. The clicky keyboard was nice too - probably the best I've ever used in the decades since - although the height of the enclosure would probably hurt my wrists now. The key caps had APL symbols on them too, which was interesting and prompted me to learn about that language a bit. Mostly I was writing APL and REXX code for image processing.
Some of the f/t employees I worked with had worked on the development of the 3279 at Hursley Park. They had some interesting stories.
There's also an IBM-built emulator that has at some point shipped with Eclipse. I used to work on it. The 3270 was fine, just took a bit of getting used to. JCL was the real horror show.
The very argument that was posited by JCL creators about why it was a bad language is one that hits YAML very, very, very hard.
The creators of JCL admitted later that the original sin of JCL was that it explicitly tried to NOT BE a programming language. Everything "programmable" evolved out of real-world needs, creating the mess that it has become.
You can see it today with "programming in YAML", whether it is Ansible, Kustomize, Kyverno (K8s manifests get a pass because they are just serialization of non-programmable thing), or any other weird "we started to add functions to JSON/YAML dumps"
> Mainframes are unbelievably powerful, feature-full, and cost-effective.
I really doubt they are any more powerful than the best x86 rack servers. Their featurefulness is super dated. They are definitely not cost effective (good luck buying it from IBM without a huge contract).
I pity the company that gets into mainframes today. Does that even exist?
I also think most cloud providers are at least vendor lock-in and maybe a proprietary nightmare. This way you've at least got your own iron, and IBM comes to fix parts when they break. (Without disabling your system.)
I do agree it's debatable whether it is better than x86; most of the time the cost savings projected when using a LinuxONE are based on licensing. In the past, Oracle licenses were per core, and we did a project where 8 IFLs replaced 140 x86 cores.
I worked at a large health insurance company a few years back. They had a DB2 server running on an IBM mainframe that they were working on replacing with a data lake and cloud-based NOSQL solutions.
It seemed rather pointless to me. The DB2 server was serving queries just fine, and when we were working on setting up a microservice that would allow viewing claims history, the storage and server costs for being able to provide queries against three years' claims were prohibitive.
But they were still persisting down that road when I left.
IBM's proprietary nightmares have, historically, often been quite cost effective. AS/400 did eventually get too long in the tooth, but it had an amazing run for decades where it was extremely competitive for TCO.
"Powerful" is irrelevant if the hw isn't up when you need it.
The current sophistry is that the complexity of multiple "redundant" examples of cheap hardware always costs less than a suitable one (or a small number) of better systems.
Open source software shouldn't be a problem. Pretty much anything that has been ported to 64-bit big-endian also works on s390x. The architecture is otherwise pretty non-challenging: always 4K pages, strong memory model. Pre-built binaries are a different matter, of course.
The problem with access is not access as such, but the fact that the community hardware that you can access for free tends to be heavily oversubscribed (or maybe connected to a mismatched storage system).
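If anyone wants to see what the big-endian part means in practice, here's a tiny Python illustration (hypothetical values): the classic porting bug is code that packs binary data in native byte order while assuming it's little-endian.

    import struct
    import sys

    value = 0x11223344

    native = struct.pack("=I", value)   # native order: differs across machines
    little = struct.pack("<I", value)   # explicit order: same everywhere
    big    = struct.pack(">I", value)

    print(sys.byteorder)                # 'big' on s390x, 'little' on x86-64
    print(native == little)             # False on s390x: latent bug if assumed
    print(native == big)                # True on s390x

Well-behaved open source already specifies byte order for on-disk and on-wire data, which is why ports to s390x are usually uneventful, as the comment above notes.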
I've never used a mainframe, but I did manage to introduce a small performance regression in gcc that someone complained about (I did fix it, yes). ;-)
Turned out that s390 couldn't fuse cmp+jmp when the operand was an 8-bit value (_Bool). (That was a while ago, maybe current-generation mainframe HW can do it?)
It should be noted that I believe you can get access to Linux s390x boxes in the cloud for not very much money.
It's the z/OS side of things that tends to be prohibitively expensive. That's almost entirely due to software licensing and not hardware though. The hardware is really not much worse than POWER or any other non-x86 specialty hardware.
So if you are making a product that could be big in the kinds of places that have a lot of mainframes, and you are offering an on-premises version, I'd say to go ahead and get a port to s390x tested (on Linux). It's very cheap to do and a potential selling point for a lot of large companies. Then from s390x to z/OS Unix it isn't all that much more work if you decide to take it all the way and offer official support, i.e. if somebody is going to sign a big contract where mainframe integration would be the differentiator between you and a competing product. Once you've got commitment for a contract of some kind I'm sure there are ways to get the access you would need to test under the expensive software.
> I believe you can get access to Linux s390x boxes in the cloud for not very much money.
Could you say where and what "not very much money" is? Like, are we talking $5 DO droplet equivalent or only a few hundred bucks a month? (but to be clear, this is a serious question; I'm a sucker for weird options and would be interested in running a s390x Linux box if it doesn't have too bad price+caveats)
Go to https://cloud.ibm.com/vpc-ext/provision/vs and choose North America > Washington DC > Washington DC 2. Click "Change image" and choose "IBM Z, LinuxONE", then pick Ubuntu. I get $0.09/hour for 1 vCPU/4GB RAM with storage extra.
For fun, then try picking z/OS instead of Linux and see what that software license costs ($/hour). Take a guess first how much you think it'll be and then see if you're right.
> Mainframes are unbelievably powerful, feature-full, and cost-effective.
Having run IBM's own software (WAS and WAS derivatives) on both x86 and IFL, and found the same applications with the same workloads having a ratio of 1.5-2 x86 cores to 1 IFL, while the IFL itself is 1-2 orders of magnitude more expensive than an x86 core, zVM is an order of magnitude more expensive than VMware's hypervisors, and memory is two orders of magnitude more expensive than x86 memory... well. I think you're long on the evangelism and short on the reality.
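Just to spell out what those ratios imply for cost per unit of work, a sketch using only the ranges stated above (normalized placeholder prices, taking the ends most favorable to the IFL):

    x86_core_price = 1.0          # normalize: one x86 core costs 1 unit
    x86_per_ifl = 2.0             # "1.5-2 x86 : 1 IFL", most favorable end
    ifl_price_multiple = 10.0     # "1-2 orders of magnitude", low end

    cost_per_work_x86 = x86_core_price / 1.0
    cost_per_work_ifl = (x86_core_price * ifl_price_multiple) / x86_per_ifl

    print(f"{cost_per_work_ifl / cost_per_work_x86:.0f}x")
    # => 5x more per unit of work at the most charitable end of the quoted
    #    ranges; worse once zVM and memory pricing are included.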
Cost-effective is a stretch. I'm going to take a dumb example, but the reliability of AWS Step Functions and S3 is good enough for 99.99% of use cases at 1/100 of the price.
Mainframes are really powerful for some specific domain which most people on HN don't work in.
Another take: the most valuable company in the world (a tech one) doesn't run its core business on mainframes.
> Mainframes are really powerful for some specific domain which most people on HN don't work in.
Yeah, but the cases where mainframes run the show are the ones where stuff really matters, where actual human lives and existences are at stake: banks, insurances, governments, airlines, logistics. If Google has a day long outage, not much of value will be lost... but a large US bank, stock exchange or MC/Visa suddenly failing? That can be enough to trigger an actual bank run with all its consequences.
(Of course, the argument can and should be made that no company should grow large enough to even be able to be such a threat, but that one will have to wait until at least 2028 to get solved)
> You are almost certainly better off buying a low-end mainframe than hiring a team of engineers to build yet another not-great data replication/recovery/HA system.
To be clear, this only applies to regular mainframes running z/OS. LinuxONE machines, as the name implies, only run Linux where you'll have all the usual devops/SRE responsibilities.
How so? What problems has IBM solved that Dell can't figure out? In my experience sysadmin/SRE work is rarely about the reliability of the hardware and much more about machine management, like predicting when you're running out of disk space etc.
A modern mainframe is a system specialized in extremely high transaction throughput at extremely low cost-per-transaction, while guaranteeing data durability and computational correctness.
Sounds like a computer. Why would getting access to one to play around encourage people to buy one? In other words, it sounds like a quantitative difference rather than a qualitative one.
My understanding is that Z mainframes have a number of unique features to support those use cases. Stuff like hot swapping CPUs, and hundreds of IO coprocessors to avoid the main cores from getting blocked. Don't think they're just rebranded x86 machines, but not an expert.
> You can open a terminal and come back in a month and it will still be there. Unlike kubernetes, where containers regularly go down.
You're mixing apples and oranges there by comparing kubernetes workloads to mainframes. Kubernetes isn't really designed to serve long-persistent workloads of that fashion. Although tbf I've had VMs that last for years and real hardware (albeit Sun) that's had over a decade of uptime, so I'm not sure what all the fluff is about.
I've not worked with a mainframe, but I've worked with an IBM storage system that worked on similar principles: we could connect our systems via dual controllers to separate bays of controllers on the storage array. You could pull whole bays of controller cards and the system would stay up. You could pull whole bays of hard drives, and the system would stay up. You could pull power supplies and it'd remain up. You could swap RAM and CPUs in the servers managing it without shutting them down, but you could also pull one of those servers, and it'd remain up. If stuff started failing, an IBM engineer would show up because the system would call home. This was around 25 years ago.
It wasn't cheap, but it made a typical "modern" high availability setup look like a crude homemade toy.
But as impressive as it was, there are just very few places where that impressiveness provides enough value to justify the cost. And having to deal with IBM.
ECC memory is widely available, you can buy it on amazon, and if you have a contract with a server manufacturer they'll be happy to sell it to you.
It's just not compatible with consumer CPUs (well, DDR5 is, but DDR4 and lower weren't) because Intel was deliberately segmenting the market to upcharge for server CPUs.
The IBM mainframe: How it runs and why it survives[0] from last fall.
From way back in 2004[1], Commercial Fault Tolerance: A Tale of Two Systems compares the reliability / philosophy of IBM mainframes with Tandem Servers.
The other definition you got isn't bad, but I think it misses the point for this discussion: it is a rack- or multi-rack-scale computer with built-in and centrally managed features for redundancy, failover, job allocation, scaling, etc.
As an example, many mainframes can be configured with a spare set of CPUs, and if one of your CPUs fails, the replacement is brought online automatically and transparently; the code you write doesn't have to know anything about the failure.
One neat thing is that the hot spares can be used during boot to shorten the time to availability. Not sure LinuxONE boxes are bought like that, but for the Z-series you can buy the machine with more capacity than you pay for and pay for it to be enabled at a later point.
> The reason you don’t is because it is basically impossible to get access to a mainframe you can play with and learn on.
Once upon a time, we made a product that was selling to Fortune 500s, and IBM loaned one of the smaller zSeries to us to make sure the product was running well in a Linux VM on it. The thing only fit in the elevator after we took it out of the wood crate...
Unbelievably powerful (within certain parameters), and definitely feature-full when compared to virtually any other highly-available platform.
However, cost-effective? Only if you're sitting on piles of cash.
IBM mainframes are unique. They have 75 years of incredible research (IBM is often the #1 patent inventor in the world, year after year) into a platform that they completely control. This has produced an absolutely bulletproof product. But you gotta pay for that.
As for your other point, that it would take a team of engineers to not even compete: yes, you're right about that.
I'd love to own one, but I'd rather own my house first.
I did a mainframe apprenticeship for three years at IBM. Trying to learn anything on my own was damn near impossible. And that’s not even getting into the fact that IBM basically has to pay itself for access to some things between divisions. Though that could have been because my division was mostly ignored and eventually spun off.
I've previously worked on mainframes (building business software, and worked quite low level with crypto libraries on them), and now do some of the same but for "cloud" based software instead.
Mainframes are such a weird subject to me. On one hand, the results we got out of the mainframes were pretty amazing, but the experience was painful. Absolutely painful. The tooling is some of the worst you've ever seen; official tools feel like something people dreamed up in WinForms. I worked mostly with PL/1, which was a great language, stuck in the 80s but great nonetheless. I actually prefer it to C.
What killed it for me was the restrictions. You had to either dynamically or statically link programs, but do it in the IBM flavor and build system, which made it extremely cumbersome. So files were 20k lines with a single entry point (main) and function calls, because creating files was a pain. Line limits of 72 characters. We even had to send in our programs to get syntax checks because the IDEs weren't capable enough (this was in 2018); now I could whip up something to emulate it in docker and neovim, but the people that taught me had no idea there was something better out there, and neither did I. We had to release once a week on Saturday mornings because the execution model used had to have downtime during the deployment.
Mainframes are cursed because of the lack of tooling and the restrictions. I think mainframes and the execution model they use could be useful in a bunch of other places, but IBM's business model just continuously tightens the screws and scares people away. It doesn't have widespread appeal either; it is "Business Machines" after all.
It should be said that I worked in IBM's version of serverless (not what it was called, but that's pretty much what it was); mainframes support Linux, Java, etc. But you give up a lot of benefits and performance if you move off of the native way of doing things.
I'd also argue that with today's tooling and hardware you can get comparable performance to a mainframe; the only reason mainframes are as performant, and were so dominant, is the restrictions they place on the developer. You have to write low-level C, PL/1 or COBOL, so by proxy your software is often just fast. Kind of like Rust or C++ is nearly always quite performant out of the box. Most business software just doesn't need that level of latency control and performance today.
The fact that it sorta forces you to a lower level of abstraction is actually a very interesting topic and something I think about a lot in the context of how mainframes have remained relevant.
If there were a way to restrict non-mainframe ecosystems in a similar way, I think it could provide a lot of value. Especially for large organizations where standardization and prevention of IT churn is a serious problem.
I just don't see how you do that in an open platform, and so I don't really think it's viable outside of the mainframe space (although it certainly explains part of why a lot of orgs still choose mainframes despite knowing all of inherent difficulties the platform presents). It's a people problem not a technical one at the end of the day.
I'll add that a lot of the tooling deficit has been resolved in the mainframe space, but often in a way I'm not comfortable with. It's normal these days for mainframe developers to not use the old school ISPF development environments, but the thing about the awful old tooling, is that it was brutally efficient. Now you are likely to be giving up a significant amount of that efficiency because you've replaced it with a bunch of JVMs running web services.
There is in some sense an unavoidable tradeoff between modern conveniences and efficiency that I don't think there are any easy ways to get around. We can't have our cake and eat it too, and you cannot have mainframe level performance without it just being a bunch of COBOL and C.
This applies just as much to Linux servers as it does mainframes. I'm not really even commenting about the architecture. There are similar tradeoffs involved in how we choose Linux distributions, although perhaps a little less extreme.
What exactly is considered when putting a DC together that tells you your project needs a mainframe instead of, say, a very large/massive cluster of whatever 1024-core servers?
I've been in cluster infra for like 15 years and couldn't tell you pretty much anything about mainframes other than a few names of them and processors.
Is it some sort of determined raw performance metric, teraflops of some processing work?
The #1 reason by far for buying a mainframe is that you already have mainframe-based applications that you need to keep running.
I’m not sure what kind of orgs (if any) are buying mainframes today as new customers, but I imagine it would have to be a use case something like… a niche scaling challenge, or an outsourcing arrangement where the relationship with IBM is not just vendor lock-in but is some kind of partnership, or you have specialist compliance requirements, or contractual service level obligations, or you need to provide absolute guarantees about the performance/availability/security of the whole stack from hardware up. Some organizations will pay $$$$$$$ to have one ass to kick rather than herd multiple vendors.
Mission criticality, insane parallelism, and a few others.
With modern mainframes, you can configure two processors (or more!) to run in perfect lockstep. If one of them disagrees, the system can immediately see why and how. Run three processors in lockstep and you have quorum: One disagrees? Kill it, clone one of the others, keep going. Or, isolate it, snapshot the disagreement, hold it hostage (but alive!) and bring in a clone of the others.
One of the guarantees of IBM Z/System is "Every read is perfect, every write works." You don't have 2-3 bits of ECC, you have nearly fully redundant RAM. Oh now it has to be encrypted in transit, at rest, and in process? Not a problem, it can do that without changes in code.
IBM's mainframes take SIMD a step further. Let's say you're parsing JSONL. With a little work, you turn a traditional serial process into a parallel one, and what would be a massive map-reduce problem is just a matter of OpenMP coordination. Or better yet, since it's just lines, you can make the system pretend it's a tape full of records that happens to be interleaved by, magically, a bunch of local cores! Amazing.
Oh, and now the boss says you need to have physical redundancy. Why stop there? Run your application, in lockstep, across multiple systems separated by miles. Don't need it in lockstep? That's fine, your application still thinks it's running inside one head when it's running inside many independent heads. Oh you need to take down us-west-2b? Hold on let me move that entire workload to us-west-1a.
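The JSONL example is, at heart, data parallelism over independent records. A minimal sketch of just that decomposition in plain Python (stand-in data, none of Z's SIMD or tape-emulation tricks):

    import json
    from multiprocessing import Pool

    def handle(line):
        # per-record work; records are independent, so order doesn't matter
        return json.loads(line).get("amount", 0)

    if __name__ == "__main__":
        lines = [json.dumps({"amount": i}) for i in range(1_000)]  # stand-in data
        with Pool() as pool:                 # map: fan records out across cores
            amounts = pool.map(handle, lines, chunksize=100)
        print(sum(amounts))                  # reduce: 499500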
“As businesses move their products and services online quickly, oftentimes, they are left with a hybrid cloud environment created by default, with siloed stacks that are not conducive to alignment across businesses or the introduction of AI.”
Yours for only $135k (starting)! Supposedly it's designed for eight nines (99.999999%, or ~30 seconds of downtime a year) and a MTBF in the decades. I'm curious if at that point, assuming it lives up to it, if other factors outside of it, like DC power and cooling availability or internet connectivity, won't also be difficult to achieve.
No, eight nines is 316 milliseconds of downtime per year. 30 seconds of downtime a year is six nines. (I spent a long time working on high availability at Sun, so the numbers immediately looked wrong.)
The $135k is just the hardware price for the base configuration. I'm sure you're required to license several software SKUs that are not cheap and that you have to pay annually for maintenance.
"5 This price reflects the base hardware configuration, and does not include additional items, maintenance, the operating system or other software. All prices are in USD. Prices shown do not include tax. Price will vary based on country and currency."
I don't think that's true for LinuxONE. One of the points is that it behaves as a large and fast (with lots of dedicated acceleration hardware) Linux machine.
It's a z-series mainframe and I don't doubt these reliability numbers.
The world's financial system runs on these mainframes. An undetected bit error at the Federal Reserve might cause IBM to appear in the news, in a bad way, like Boeing.
It will not have the fastest single-threaded performance, but that's not why you buy it.
The mainframe mindset: Of course each core is running at 100% all the time... I paid a lot for it, so I want my money's worth. z/OS is designed to make this feasible. Not sure how well that works in Linux.. but it's where the market is.
The LinuxONE line is, I believe, a series of Linux-only mainframes; z/OS is not an option on them. They're IBM's entry-level option for businesses who salivate at those mainframe reliability numbers but don't have any OS/360|OS/370|z/OS apps and don't want to pay for the support on those systems.
30,000,000 (ish) seconds in a year... let me see: 99.999999% of 100,000,000 is 99,999,999, so the downtime left over is a 10^-8 fraction of the year, which is about 1 second every 3 years, or roughly 300 milliseconds a year, not 30s a year. I think.
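Generalized, for anyone who wants to sanity-check nines claims (plain Python):

    SECONDS_PER_YEAR = 365.25 * 24 * 3600     # ~31.6 million

    for nines in range(5, 9):
        downtime = SECONDS_PER_YEAR * 10 ** -nines
        print(f"{nines} nines: {downtime:10.3f} s/year")
    # 5 nines:    315.576 s/year
    # 6 nines:     31.558 s/year
    # 7 nines:      3.156 s/year
    # 8 nines:      0.316 s/year (about 316 ms, matching the correction above)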
I would suspect that it IS multiple boxes and with z/VM using single system image. This is the first time I've ever heard of mainframe "in a rack". In the past they come in their own cabinet, but if you open the cabinet door you won't see one "box".
Note that IFL = cores that only run Linux and cannot run mainframe OSes. The Telum CPU is 5GHz and has some level of built-in inferencing according to their specs.
However the emphasis is going to be on single system image... you won't need to run/manage stuff, you just kick off VMs as if it is a single huge ESXi host, as I read it.
Would be curious to know how it compares in power consumption and heat output; licensing cost of z/VM; is PR/SM (allows you to create and configure LPARs) available? licensing cost?
I worked on multiple LinuxONE systems. You can run z/VM and PR/SM. It will also run KVM if you like.
Power consumption (of our z13-based box) is 3 kW regardless of configuration (give or take), but also consider that you need a lot fewer SAN switches and network switches to achieve the same goal as a large rack of individual servers.
Hybrid cloud with mainframe in-house colo is actually quite a safe play. You are reducing your reliance on providers like AWS, which may start increasing prices in the near future.
“Compared IBM LinuxONE 4 Express Model consisting of a CPC drawer and an I/O drawer to support network and external storage with 12 IFLs and 736 GB of memory in 1 frame, versus compared 3 x86 servers with two Xeon Sapphire Rapids Platinum 8444 processors with 32 cores each (2ch/32c) with a total of 384 cores”
I’m not a math major, but comparing 12 IFLs against 384 x86 cores seems a bit strange to me.
For those who wonder who is using mainframes nowadays: most banks, insurers, airlines, grocery stores, and public administrations still have something running on a mainframe, whether in their own datacenter or in some outsourcer's.
From a technical point of view, mainframes are really awesome, and many things people speak about today thinking they're revolutionary were already a thing in the 90s (or even earlier) on the mainframe, e.g. "serverless".
But basically no one is starting a new business on a mainframe, and all the sales IBM is still making are to existing clients.
Well, I visited the page because IBM was mentioned. Found nothing new.
It has been known for a very long time that IBM hardware is reliable and that you must buy it to run their software (to be honest, in this case I'm not sure what IBM software I could run on this hardware, but OK). And it is also known that IBM software and services are really good but expensive, so the cost of IBM hardware is not that important if you really need their services.
So what does a mainframe cost to buy, though? I can get a 40-core server from Dell for $5000 USD, so how many R7515s can I buy for the cost of a mainframe?
Have lots of on-prem hardware. Zero mainframes. Looks like just a computing device that has high uptime? Can see the value in that, but I prefer building reliability on many less-reliable machines. The incremental cost of failure is lower and unused load factor is higher.
Lol @ the marketing gymnastics. "Hybrid cloud" and of course a namedrop of AI.
What this is is a mainframe, a solution with decades of proof behind it in terms of reliability and I/O throughput. In other words, the advantages of cloud without the cruft and BS. Maybe that's what they mean by "hybrid cloud"?
I wish IBM would come out and say what they mean, but maybe they wouldn't be IBM if they did.
The Telum has a lot of its surface dedicated to an inference accelerator. Their aim is to be able to very quickly run fraud detection while processing payment operations.
"Hybrid cloud" is being able to move applications back and forth between on-prem and cloud, and scaling from on-prem into the cloud if need be, typically enabled by the proliferation of k8s on both sides of the equation.
The difference between that and your description is that it's not merely "mixing different types of servers" where each type is managed in a completely different way and running completely different applications. You're obviously correct that there would be nothing particularly special about that.
Usually, but a mainframe is an on-prem solution and can only fill that part of a hybrid cloud strategy. I get the feeling IBM thought they'd lose the CTO crowd if they didn't say "cloud" every now and then.
And good that they do, otherwise I would not get so much evergreen enjoyment from the cloud-to-butt browser extension I've had installed since forever.
I agree with the parent post in that it's mostly marketing. There was nothing preventing you from calling back and forth from cloud services on your mainframe before the recent marketing shift to "hybrid cloud". Hybrid cloud is a meaningless term on a technical level, but I guess for business people it means you can buy a new mainframe and that won't stop you from delivering "cloud" or whatever other initiative the suits have budgeted for.
So yes, it's laughable to hear that kind of marketing if you are technical. But for the kinds of large organizations that would potentially be buying a new mainframe, it's what you need to get this sort of thing budgeted.
Certainly the IBM systems group didn't come up with it. They know just as well as the people trying to upgrade their mainframes that it's an on-premises strategy primarily intended for line-of-business applications with reasonably well-understood resource utilization.