One of the stories I heard about the 50 is that if you threw it into the ocean it would sink intermittently.
Anyway, the 50 was the first computer I was paid to program on. Many were bought to run in 7070 emulator mode, as the 50 was the smallest / cheapest machine that could run the 7070 emulation.
Happily I was on the 360 side.
At another shop, we had 2 50s, one running DOS and the other OS with HASP.
One day a student was having trouble with the Test and Set instruction (since superseded by Compare and Swap) not setting the condition code properly. I wrote a short diagnostic program to exercise TS, store the resulting condition codes, and dump. The instruction was not performing correctly.
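Not the original 360 assembler, obviously, but here is a rough C sketch of the shape of such a diagnostic, using GCC/Clang's __atomic_test_and_set builtin as a stand-in for TS (the trial count and recording scheme are mine):

```c
/* Sketch of a TS-style diagnostic: exercise test-and-set repeatedly,
 * record what it reports, and dump the record so a wrong answer jumps
 * out.  On the 360, TS sets the addressed byte to all ones and sets the
 * condition code from the byte's previous leftmost bit; here the
 * boolean return value plays the role of the condition code. */
#include <stdio.h>
#include <stdbool.h>

#define TRIALS 8

int main(void)
{
    unsigned char results[TRIALS];

    for (int i = 0; i < TRIALS; i++) {
        unsigned char lock = 0;                                       /* byte starts clear   */
        bool first  = __atomic_test_and_set(&lock, __ATOMIC_SEQ_CST); /* expect false (CC 0) */
        bool second = __atomic_test_and_set(&lock, __ATOMIC_SEQ_CST); /* expect true  (CC 1) */
        results[i] = (unsigned char)((first << 1) | second);          /* expect binary 01    */
    }

    /* "Dump" the recorded outcomes: anything other than 1 means the
     * primitive misreported the previous state of the byte. */
    for (int i = 0; i < TRIALS; i++)
        printf("trial %d: %u\n", i, results[i]);
    return 0;
}
```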
I showed the dump to the non-IBM engineers, who hemmed and hawed. A couple of days later one phoned me up to say that TS wasn't setting the condition code correctly - exactly what I had been telling him.
They fixed the problem. The next day I got a phone call that HASP wasn't starting on the other 50. Took a dump and found HASP was in a wait just after TS.
The engineers had swapped microcode cards between the two machines.
My manager was not pleased with the engineers.
No. The 50 was subject to intermittent failures (whether microcode or hardware I don't know).
SYNCH is how user mode exits are called from supervisor code.
COLT (Canadian On Line Teller) often ran (runs?) in supervisor state, and at the bank where I was working at the time it would initialise a transaction buffer by setting it to all ones - very bad news when that buffer address was erroneously set to zero because some registers were not preserved across pseudo-reentrant (a particularly repulsive term) interrupt points.
This invalidated all the New PSWs, triggering an interrupt cascade such that the STOP key would not work because the instruction never completed. SYSTEM RESET (courtesy of the IBM engineer) did the job.
> At another shop, we had 2 50s, one running DOS and the other OS with HASP.
Nowadays it's worth mentioning that that "DOS" has nothing to do with MS-DOS/PC-DOS on PCs (or any other DOS on any other microcomputer where the names may coincide).
The release dates of the IBM PC and the System/360 are closer together than the IBM PC's release is to today, and the mainframe world is so secluded from mainstream computing that I wouldn't be too surprised if someone thought there were S/360s with an "A:\>" prompt on a teletype somewhere. :)
To the rest of the story, I cannot find the original quote no matter how I search (I'm sure it's out there, Google has just gotten worse lately), but I remember a story in a single paragraph of someone who contacted CDC or IBM or whatever about a seemingly misbehaving instruction. The engineers replied that this was curious, as the instruction in question was usually one of the more stable instructions. That day, the person asking learned that apparently machine code instructions can be ordered by reliability.
Thanks for the blog post @picture. I worked on emulating the 360/370 instruction set using both an 8086 processor and a bit-slice processor when I was at Big Blue. The hardware and microcode development was insanely complex. This ended up shipping as the XT/370 and AT/370. https://en.wikipedia.org/wiki/PC-based_IBM_mainframe-compati...
Sorry for making it worse, but big/little Endian is nowhere near past us: While most architectures are running little endian now, network byte order (used in IP, UDP, TCP, DNS, etc.) is still big endian. Lots of conversion ensues, so htonl(3) and similar standard library calls are very much alive.
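For anyone who hasn't bumped into it, a minimal example of that conversion (standard POSIX calls, nothing project-specific; values are just illustrative):

```c
/* Convert a port and an IPv4 address from host byte order to network
 * (big-endian) byte order before they go on the wire. */
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>

int main(void)
{
    uint16_t port = 8080;
    uint32_t addr = 0xC0A80001;  /* 192.168.0.1 */

    printf("host order:    0x%04x  0x%08x\n", (unsigned)port, (unsigned)addr);
    printf("network order: 0x%04x  0x%08x\n", (unsigned)htons(port), (unsigned)htonl(addr));
    /* On a little-endian machine the two lines differ; on a big-endian
     * machine htons/htonl are effectively no-ops. */
    return 0;
}
```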
EBCDIC you'll only encounter in the mainframe and midrange world, and it's indeed sobering to realize that its encoding makes perfect sense on punchcards, but none whatsoever on any more modern medium.
ARM, POWER, and several other architectures still exist in big-endian, little-endian, and boot-selectable-endian versions; and some otherwise wonderful ARM chips (looking at you TI Hercules!) are big-endian only.
For any that are selectable, I was careful to say running little endian. :) You are right that big endian still exists, I just wanted to express that a significant part of the world is little endian now.
How did IBM convince Motorola to modify the 68000 and Intel to modify the 8087? Did IBMers do this work, or Intel and Motorola?
When I was at IBM, we were making x86-based appliances for business software. I thought it would be cool if they used a mainframe CPU instead of x86. It would have been slower, but possibly more reliable. Definitely it could have had some marketing cachet. On the other hand, we otherwise did not want to eat our own dog food.
> How did IBM convince Motorola to modify the 68000 and Intel to modify the 8087? Did IBMers do this work, or Intel and Motorola?
IBM had Nick Tredennick, the lead designer of the 68K, do the work. His book "Microprocessor Logic Design" covers the development of the "Micro/370" in great detail, including flow-charted microcode that looks remarkably similar to the flowcharts in Ken's article (though apparently Tredennick used flowcharts for the 68K microcode prior to moving to IBM). Regrettably I haven't actually read through my copy yet, so I can't give any further details.
Ken, if you see this I highly recommend tracking down a copy to read, I think it's right up your alley.
Mainframes are expensive and busy running production workloads. Offloading development and test to franken-PCs running 370 emulation had large cost-saving potential. Plus, back in those days the IBM sales organization had a ton of power and could sell anything; the margins were always fantastic, and large numbers of sales reps earned more than the IBM CEO.
I'm not sure VM/370 is less suffering. For example, I run MVS in the typical old 3.8j version (the last public domain one, from 1981 I think) as an enthusiast, and it's just amazing how little abstraction there was. MS-DOS practically coddles you, having an actual file system!
> it's just amazing how little abstraction there was
That was deliberate and made a lot of sense. The functionality was split between two programs. The VM/370 part emulates the raw machine. CMS is the "operating system" on top of the VM "hardware".
I used to be involved in Mini sales (System 36). Selling DOS based PCs was a doddle compared to these singularly sluggish and ridiculously heavy machines.
I remember how the microcomputers of the 1980s (Apple ][, C64, TRS-80 Coco) were mainly dead ends. There were possible paths forward (6502 to the 24-bit 65C816, Z80 to the eZ80, ...) and second generation machines like the Apple 2gs, C128 and Coco 3. They were all dead ends, as were the machines based on the 68k series.
There was a strange time in the early 1980s when every computer manufacturer feared obsolescence, but machines like the Apple ][ kept selling because there wasn't a path for continuous improvement. Microcomputers at the time were all built around the video display system, so you couldn't drop in a 10% faster CPU.
Intel didn't have a clear plan for the 8086; in fact they expected the future to be the doomed i860 or iAPX 432. They stumbled into the 24-bit 80286, which was "brain damaged" in terms of its memory protection model but just plain kicked ass in performance. I bought a 12 MHz 286 machine in 1987 which was by far the best computer I ever owned, powerful enough that I could emulate the Z80 and use it as a software development workstation for CP/M. That was when the PC crushed everything else.
I could afford a 32-bit 486 and run Linux on it in 1993, and 10 years later I upgraded to a 64-bit version of the architecture.
Like the 360/370/390/z-Architecture, the Microsoft-centric PC maintained instruction set compatibility over the years. Yet that's not the only path to a sustainable computing brand. Apple started the Mac on the doomed 68k, switched to PowerPC, switched again to Intel, and switched again to ARM.
Microsoft tested the waters for such a migration multiple times but it's never amounted to much.
Sometimes I dream of a world where Apple didn't abandon the ][, released the //gs a few years earlier than it did, and eventually came out with a 32-bit extension of the 6502 architecture.
I lusted after the 68k back when I owned a TRS-80 Coco. Circa 1984 a number of machines with advanced graphics came out based on the 68k such as the Apple Macintosh, Atari ST, Commodore Amiga, and Sinclair QL, not to mention some 'workstation' class machines such as the Sun 3.
On paper the 68k was an attractive machine, with lots of registers and the ability to access more memory in a more comfortable way than the low-cost microcomputers (which added bank switching to evade the 64k limit in later iterations) and the IBM PC (which had a segmentation scheme that was in some ways annoying but could also be a lot of fun for the assembly language programmer).
In practice there must have been something wrong with the 68k, because even Motorola gave up on it and all of the computer lines based on it either went extinct or made a transition to a RISC architecture (Apple to Motorola's PowerPC, Sun to SPARC).
I'd love to hear the real story of why Motorola abandoned the 68k.
I don't think it was a case of the 68k being bad, any more than the 6809 was bad. I think they had a desire to jump on the RISC bandwagon with the 88000. They didn't seem to be too concerned about backwards compatibility when they made their jumps. The 88000 had low adoption (the price was a bit high), with Data General being the main company buying it (and, if I remember correctly, the bus living on in the early PowerPC).
I wouldn't say they abandoned the architecture; it just wasn't a PC chip. They did try a modernized version with ColdFire, but the original 68K line found a home in PDAs and embedded. DragonBall was pretty successful. The 68K ended up in the non-PC market because most of their customers jumped to RISC and they didn't execute the jump successfully enough to attract customers. Sun had SPARC and HP had PA-RISC.
For a while the 680x0 in Macs, Amiga and Atari was the most popular CPU, and I'm sure the first HP-UX workstation I used at my uni had a 68040, so it was in Unix and NeXT workstations as well.
But after the 68040 it never seemed to evolve into anything else, and probably got subsumed by the Power line of CPUs into oblivion.
I would bet it would have worked the exact same way. Compaq's success was the BIOS and Microsoft's willingness to license DOS, and those wouldn't have changed with the 68008. Given IBM's desire for the 48-pin package, Compaq might have had a more interesting machine with the full 68000.
It doesn't matter if it was about the BIOS or whatever else; had IBM won, there would be no clone market.
Microsoft could have tried to make MS-DOS portable similar to CP/M or UNIX, but it would be a big what-if regarding Amiga, Atari, Mac, Archimedes and UNIX market.
Had IBM won, most of the fair use we take for granted would have gone out the window. The modern day version of that lawsuit was Oracle vs Google. Compaq started the clone market and I see no reason that their strategy would have been altered in a Motorola vs Intel world.
In my alternative universe, the same people who decided to pick Motorola might also have had the perspicacity to put mechanisms in place to prevent such actions.
I've heard from someone who used one that the 360/91 was microcoded, just very differently both compared to its contemporaries and to today's machines: the fetch, decode, and dispatch were in pure logic, but the execution units themselves (at least the float unit, which was the only unit that was out of order) ran their own microcode programs.
You wouldn't happen to know where more documents than what's on bitsavers exist for the 360/91 (or related machines like the 95 and 195), do you? Competing anecdotes are less than satisfying. My source has been wrong before, but he bats a pretty good average for his contrarian comparch statements.
That's a great paper! I really appreciate you finding that; it was a great read and does go further in depth than the funcChar docs.
It unfortunately doesn't seem to disambiguate the specific question of whether the floating point execution units were implemented via microcode, unless I'm missing it.
I don't think the Mod 91 ever shipped. It was a combat machine designed to squash CDC (IBM was sued by them for their anti-competitive actions here) and was eventually transmogrified into the Mod 95/195 which did ship.
There were approximately 15 Model 91s produced. The Computer History Museum has the console from one, and the Living Computers Museum has another.
The Model 95 was the even more rare version with thin-film memory instead of core memory. Only two of them were produced, for NASA.
The Model 195 was a reimplementation of the Model 91 with "monolithic" integrated circuits instead of hybrid SLT modules. The System 370/195 was very similar to the System 360/195. Curiously, the System 360/195 has the black styling of System/370 control panels, rather than the beige control panel of System/360.
According to wiki, there were 15 model 91s, 4 kept by IBM and the rest went to customers including NASA. There were two 95s and both went to NASA (and really the 95 was a 91 with a different RAM technology but the same CPU).
That lines up with my source, who started his career doing simulation and analysis of rocket propellant in microgravity for the tail end of the Apollo program and some of the initial work for the Space Shuttle.
I'm trying to understand how they read out that BCROS memory. From the picture, it appears to be a single layer of copper on a mylar sheet. How did they get the bits out?
They read the data capacitively. A sheet with perpendicular drive lines was put on top of the copper sheet. Energizing a drive line capacitively coupled to the square capacitors on the BCROS sheet. Each bit had two capacitors (balanced), driving a differential amplifier (sense amplifier) to produce the output bit. Using two capacitors reduced noise since the differential amplifier would cancel out the noise.
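A toy model of that balanced scheme (purely illustrative, arbitrary units, nothing to do with the real analog circuit) shows why the differential pair is so effective against coupled noise:

```c
/* Each bit has two capacitor positions; whichever side carries the
 * plate couples the drive pulse onto its sense line.  The sense
 * amplifier only looks at the difference, so noise that hits both
 * lines equally cancels out. */
#include <stdio.h>

static int sense_bit(int cap_on_true_side, int cap_on_false_side, int common_mode_noise)
{
    int line_true  = cap_on_true_side  * 100 + common_mode_noise;
    int line_false = cap_on_false_side * 100 + common_mode_noise;
    return (line_true - line_false) > 0;   /* differential comparison */
}

int main(void)
{
    printf("stored 1 reads as %d\n", sense_bit(1, 0, 37));  /* noise cancels */
    printf("stored 0 reads as %d\n", sense_bit(0, 1, 37));
    return 0;
}
```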
There's not really much risk. Because the S/360 was built from gates (SLT) rather than an integrated circuit, they could easily change the design if needed. (Even at a customer site!) It's not like a microprocessor chip that would require an expensive re-spin if something went wrong. So if they discovered they needed another micro-operation, they could just add it to the design. There are a few micro-operations that seem rather contrived, as if they realized it would really help performance to do this random combination of things.
The System/360 I/O channels are kind of like a DMA (direct memory access) controller, so I/O can happen in the background instead of making the CPU wait. The Raspberry Pi PIO is more of a programmable state machine for low-level I/O protocols. So they are kind of similar in that they offload I/O tasks from the main CPU. But they also act at different levels: a System/360 channel does something like reading a block of data from tape, while a PIO deals with individual up and down transitions on a line.
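To make the comparison concrete, here's a rough sketch of the channel-program idea (field layout as described in the S/360 Principles of Operation; the command codes, addresses, and the C struct itself are just illustrative, since a real CCW is a packed 64-bit doubleword):

```c
#include <stdint.h>

/* One Channel Command Word: the channel fetches and executes these on
 * its own while the CPU keeps computing. */
struct ccw {
    uint8_t  command;   /* read / write / control / sense; device-dependent */
    uint32_t data_addr; /* 24-bit main-storage address of the buffer        */
    uint8_t  flags;     /* CD, CC, SLI, SKIP, PCI                           */
    uint16_t count;     /* number of bytes to transfer                      */
};

#define CCW_CC  0x40    /* command chaining: run the next CCW when done */
#define CCW_SLI 0x20    /* suppress the wrong-length indication         */

/* A two-step channel program: read two 80-byte card images back to
 * back; the CPU only hears about it via the I/O interrupt when the
 * whole chain completes. */
static const struct ccw channel_program[] = {
    { 0x02, 0x001000, CCW_CC | CCW_SLI, 80 },  /* read into storage 0x1000 */
    { 0x02, 0x001050, CCW_SLI,          80 },  /* read into storage 0x1050 */
};
```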
Is this I/O channel architecture basically the same thing that makes mainframes so reliable even today? Might you or anyone else have any recommendations on mainframe I/O architecture evolution? I've tried to look it up in the past but without much success.
They are reliable through exhaustive testing of components, plus error-checking, plus redundancy (on some models). What channels get you is high I/O throughput, especially with 80-character records.
The machines were (and maybe still are) optimized to run COBOL, and what COBOL programs are commonly used for is operations like "Read a record, use columns 50 thru 59 to add to a total, repeat until EOF. Then print the total formatted as $**9,999.99." So the faster you can read records in, the better.
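In C rather than COBOL (and purely as a sketch of the job described above, not anyone's production code), that workload is basically:

```c
/* Read fixed 80-column records from stdin, pull a numeric field out of
 * columns 50-59, and keep a running total. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char record[128];
    long long total = 0;

    /* Each input line is treated as an 80-column card image. */
    while (fgets(record, sizeof record, stdin) != NULL) {
        if (strlen(record) < 59)
            continue;                     /* short record: skip it        */
        char field[11];
        memcpy(field, &record[49], 10);   /* columns 50-59 (1-based)      */
        field[10] = '\0';
        total += atoll(field);            /* add the field to the total   */
    }

    printf("TOTAL: %lld\n", total);
    return 0;
}
```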
Channels also did buffering when reading from devices like tape drives that did not like to start/stop their motion a lot. So the channel controller would tell the tape drive to read "a few" records and hold them for when the CPU asked for them. Mechanically, the tape reel would slow to a stop and a vacuum column would hold excess tape. The inertia of the reel did not allow for sudden starts/stops and if you tried you'd snap the tape. Curious Marc has a video:
In the 360, a channel could connect to a number of control units that each connected to a number of devices. Early on, just one IO could run on a channel, so all other devices were locked out until that IO completed (several details omitted).
Early disk IO held the channel until the desired record came under the head (and that was after the arm was positioned). The 370 brought in rotational position sensing, allowing other disk IO to take place in the meantime.
Later on, tapes also accommodated record positioning. Forward Space File came with 360.
Remember though that these days tapes are often archived in a virtual tape library physically on disk, but processed with tape channel commands.
Back in the 360 days, control units appeared that could handle simultaneous IOs. They were connected to two channels.
The Apple Macintosh IIfx had IOPs, which I think were 6809s, as I/O controllers. I believe they only worked in System 6 and were not used in System 7.
Somebody mentioned that the Mac Quadra 950 had IOPs as well.
IIRC, the IBM/360 exception model is such that if an exception occurs, the instruction in question cannot have any side effects - no partial execution. That leads to a clean exception model but can cause performance pain.
Say the instruction is a string copy. The instruction itself might cross a page boundary, and the source might cross a page boundary (perhaps more than one), and the destination can also cross a page boundary. The microcode can't just begin processing the instruction, then hit a page fault, fix up the page mapping, and continue. Instead it must check that each possible read or write will not cause a page fault and only then proceed with the entire sequence.
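As a sketch of that "validate everything first" discipline (mine, in C, with hypothetical probe helpers standing in for what the microcode actually does with address translation):

```c
/* Touch every page the copy will read or write before moving any data,
 * so any page fault is taken while the operation still has no side
 * effects; only then perform the whole copy. */
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

static void probe_read(const volatile char *p)  { (void)*p; }
static void probe_write(volatile char *p)       { char c = *p; *p = c; }

static void copy_all_or_nothing(char *dst, const char *src, size_t len)
{
    if (len == 0)
        return;

    /* Phase 1: pre-validate every page of source and destination. */
    for (size_t i = 0; i < len; i += PAGE_SIZE) {
        probe_read(src + i);
        probe_write(dst + i);
    }
    probe_read(src + len - 1);     /* the last byte may sit on one more page */
    probe_write(dst + len - 1);

    /* Phase 2: the copy can now run to completion without faulting. */
    memcpy(dst, src, len);
}

int main(void)
{
    static char a[3 * PAGE_SIZE], b[3 * PAGE_SIZE];
    copy_all_or_nothing(b, a, sizeof a);
    return 0;
}
```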
Interestingly, this is somewhere x86 is cleaner. For instance, on x86 the 'memcpy' instruction rep movsb is architecturally defined as simply not incrementing the program counter until it meets its end condition. It can perform one move, decrement the remaining count, increment the addresses, check for exceptions and interrupts, then repeat.
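A sketch of those restartable semantics (hypothetical pseudo-hardware in C, not real x86 microcode): all the loop state lives in architectural registers, so the instruction can be interrupted between iterations and simply re-executed at the same address:

```c
#include <stdio.h>

/* All architectural state for the loop lives here; nothing is hidden
 * in the middle of an instruction. */
struct cpu_state {
    unsigned long rsi, rdi, rcx, rip;
};

/* One iteration of a rep movsb-style instruction.  If an interrupt or
 * fault is pending, it simply stops with rip still pointing at the
 * instruction, so re-executing it later resumes where it left off. */
static void rep_movsb_step(struct cpu_state *s, unsigned char *mem, int event_pending)
{
    if (s->rcx == 0) {
        s->rip += 2;              /* done: only now does rip move past F3 A4 */
        return;
    }
    if (event_pending)
        return;                   /* take the interrupt; no state is lost */
    mem[s->rdi++] = mem[s->rsi++];
    s->rcx--;
}

int main(void)
{
    unsigned char mem[16] = "abcd";
    struct cpu_state s = { .rsi = 0, .rdi = 8, .rcx = 4, .rip = 0x100 };

    rep_movsb_step(&s, mem, 0);
    rep_movsb_step(&s, mem, 0);
    rep_movsb_step(&s, mem, 1);                /* "interrupt": nothing moves */
    while (s.rip == 0x100)
        rep_movsb_step(&s, mem, 0);            /* resume until rip advances  */

    printf("copied \"%.4s\", rip now 0x%lx\n", (char *)mem + 8, s.rip);
    return 0;
}
```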
Cortex-M also exposes how far you are into an ldm/stm for recoverable exceptions.
I'm reading the excellent "The Soul of a New Machine" by Tracy Kidder right now. Your post has helped me visualize the types of computers mentioned in the book.
That's a great book. Keep in mind that the computer in "The Soul of a New Machine" (Data General Eclipse MV/8000) was a minicomputer, not a mainframe, and in 1980, not 1964. So there are a lot of differences as well as similarities. (I'm not trying to be pedantic, but just want to make sure you don't get the wrong impression of the Eclipse.)
Technically it was a 32-bit so-called "supermini" analogous to the VAX rather than a smaller 16-bit machine like most classic minicomputers (PDP-11, DG Nova/Eclipse).
Wonderful book. There's a snippet on conflicts between engineers and how engineers approach them - often not in the best way - that I've referenced so many times in my career.
Great book. I relentlessly and boringly quote the commandment from the CEO in that book that there was to be "no mode switch".
I have found this to be a great rule of thumb in software too. Engineers' first instinct is often to add an "advanced mode", "legacy mode" or whatever, often with no idea how much trouble having two modes of behaviour will cause downstream in training, understanding, compatibility, etc.
These articles are fantastic. When I started at IBM in the early 80s there were plenty of 360s still around, but the new hotness was 370, so all the training courses, installations, management focus were for the new machines.
The cool kids went to London, Tokyo and Poughkeepsie for training courses - the rest of us toughed it out fixing these venerable old beasts using oscilloscopes and mountains of manuals.
There was no decent helicopter-level documentation like Ken's available. Having access to articles like this back then would have been life changing.
> The cool kids went to London, Tokyo and Poughkeepsie for training courses
I grew up around the city on that list that ain't like the others. The idea that "cool kids" would be sent there, on _purpose_, and as some kind of _reward_, just made me laugh and laugh, even though I know why it's true and that it's not a joke.
With hindsight it was sure exciting but not really fun.
There was huge pressure to get the machine back up again. An outage might mean that an entire bank branch was down, or that payroll could not be run. The CEO could be looking over your shoulder. That kind of pressure saps the fun quickly.
Also the bugs could be very, very hard to find. In room-sized machines like the 3033, for example, there were many thousands of signal leads (trileads) that were many yards long. A slightly bad connection could cause electrical ringing and highly intermittent errors that were excruciatingly difficult to reproduce, let alone find and fix.
Intermittent errors were common and brutal. You could think you'd fixed it, only to have another crash a few hours later. A common debugging technique was to swap cards around and see if the bug moved. Just doing that often introduced new bugs.
In later years the role of the on-site engineer was reduced, and often you would be a lackey for some remote engineer back in the plant where the machine was built.
Thanks, yeah, it does sound like it sucked, with no halo at all... I guess the best way was either to transfer to the "cool kids" branch ASAP or to get a few years of experience and move to greener pastures? BTW, what did the cool kids do? I guess they became system programmers, admins and such?
> The IBM System/360 was a groundbreaking family of mainframe computers announced on April 7, 1964. System/360 was an extremely risky "bet-the-company" project for IBM, costing over $5 billion
Ken - Dad got to see the S/360 assembly line one time (Kemet was a supplier for the 360), and he told me they had a machine built to probe & verify all the wiring in a cabinet. Pneumatic pistons pushed the test connectors (all of them, all at once) into the cabinet, and then another computer would run a set of verification tests to ensure that each wire ran to where it was supposed to, and that it had certain electrical properties (presumably crosstalk, signal loss, etc.)
While each model of 360 was standardized, customers could request certain options for their machine and IBM would accommodate them. So the testing computer would have to know what the customer's configuration was to know what the wiring was supposed to be for that particular cabinet.
I think it was actually 1960s dollars. The project was huge for its time, and had it failed, IBM would be but a footnote in the history of computing today.
The link says "$5,000,000,000 in 1964 is equivalent in purchasing power to about $44,968,064,516.13 today", which is $45 billion.
Considering that the entire revenue for IBM in 1964 was $3.2 billion (1964 money), spending $5 billion on a project (over several years) was certainly a "bet the company" move.
$45Tn is half the GDP of the whole planet (~$80Tn).
I should know. Law enforcement were trying to indict me for $100Tn in currency fraud. I wanted to point out to them that this was higher than the GDP of Planet Earth, but I don't think they would have understood what GDP is.
I was arrested. In my wallet was a $100Tn bill. Law enforcement sent it to the state crime lab for analysis. The lab report said it was the most sophisticated piece of currency forgery they had ever seen. It checked 8 of their 9 boxes (it's been eight years since I saw the report); some I remember were: correct paper, correct ink, metal strip, hologram, dots... I don't remember the rest. I can't remember which one it failed.
They finally decided not to follow-through on the prosecution for it. I guess someone eventually realized that:
a) It was a real bank note, not a forgery
b) It wasn't US currency as they had believed, but a note from Zimbabwe:
Based on your (amazing) write-up, it seems like there was some sort of abstraction layer between a programmer writing in some kind of assembler and the actual CPU registers, is that correct?
That's what microcode is - an abstraction between machine code ("compiled" assembly) and control lines. The machine code is a CISC-like layer that indexes into the microcode ROM/PLA, which outputs either the internal RISC-like opcodes or the actual control lines.
> Microcode can be implemented in a variety of ways. Many computers use "vertical microcode", where a microcode instruction is similar to a machine instruction, just less complicated. The System/360 designs, on the other hand, used "horizontal microcode", with complex, wide instructions of up to 100 bits, depending on the model.
I'm pretty sure that horizontal microcoding is more common, at least in modern CPU design. On x86, the micro-ops are generally wider than the machine code instructions.
From what little we know of recent designs (the best public documentation being the fantastic work to reverse engineer AMD K8 and K10 microcode here https://github.com/RUB-SysSec/Microcode ), I'd describe x86 microcode as particularly wide vertical microcode, 64 bit ops in the case of k8/k10.
The bit width is more of a heuristic. With horizontal microcode you can look at each group of bits and it's clear that 'these three bits are the selection input to this mux', 'this bit is an enable for the buffer linking these two buses', etc. Vertical microcode, in contrast, is further decoded, with bit fields having different meanings based on opcode-style fields. RISC in a lot of ways was the realization: 'hey, with this new arch we can assume there's an i-cache, so why have microcode at all? Instead, load what would have been vertical microcode from RAM dynamically and execute it directly.'
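A toy illustration of that distinction (my own, not taken from any real machine; field names and widths are made up):

```c
#include <stdint.h>

/* "Horizontal": every field drives a specific control point directly;
 * nothing needs further decoding. */
struct horizontal_uword {
    unsigned alu_op       : 4;   /* straight to the ALU function select   */
    unsigned a_bus_select : 3;   /* mux select feeding ALU input A        */
    unsigned b_bus_select : 3;   /* mux select feeding ALU input B        */
    unsigned reg_write    : 1;   /* register-file write strobe            */
    unsigned mem_request  : 1;   /* kick off a storage cycle              */
    unsigned next_addr    : 12;  /* where the micro-sequencer goes next   */
};

/* "Vertical": looks like a small machine instruction; what the operand
 * bytes mean depends on the opcode, so another decode stage is needed. */
struct vertical_uop {
    uint8_t opcode;              /* e.g. micro-load, micro-add, micro-branch */
    uint8_t dst_reg;
    uint8_t src_reg;
    uint8_t immediate;
};
```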
Pretty universally, OoO superscalar cores will use vertical microcode (or vertical-microcode-looking micro-ops, even if they don't originate from microcode) because that's the right abstraction at the most expensive part of the design: the tracking of in-flight and undispatched operations in the reorder buffer, and how the results route in the bypass network. Any additional width there really starts to hit your power budget, and it's the wrong level for horizontal microcode because the execution units will make different choices about even how many control signals they want.
They're wider, but that's just because one "word" holds the whole instruction, instead of multiple bytes. In fact, reverse engineering efforts[0] (and the "RISC86" patent[1]) make clear that they're actually "vertical". Intel Goldmont (from [0]) has entries that are 176 bits each, but that's actually three (distinct) 48 bit uops and a 30 bit "sequence word".
Horizontal microcode is much simpler for in-order processors, but from my understanding of this stuff it seems like it wouldn't work well with the superscalar processors of today. Gating the hundreds of control lines seems (to me) like more effort than gating a few dozen bits of a uop.
In college we were tasked with designing a CPU. Mine was stack-oriented (it started register-based and retained the registers, but most ops were on the stack) and used a very large microcode word, one per clock cycle of the instruction being executed. In the end, I was saving bits from the control word and doing "ready" signals between the blocks so that the microcode didn't need to drive everything. In theory, it could do more than one thing in a clock cycle if the stars aligned just right and there were no dependencies. No instruction used the feature in the end, because the deadline was too close.
Wish I had the time to implement it. OTOH, I'm glad I never had to debug all the analog glitches and timing bugs that design would certainly have shown when it collided with reality.
Up to that point many ISVs were using mainframe emulators such as the Flex system, but that license was killed off and ISVs were either pushed to remoting to a mainframe or going to z/PDT.
It would be cool if we could get the 360 instruction set / architecture to run on something small like the ESP32. I'd definitely try that out. Similar things have been done with CP/M for example.
It's still executing a single instruction, so it's not ILP. Multiple functional units can do different parts of the instruction in parallel, but instructions are still sequential.
The System/360 Model 91, a more advanced model, used instruction level parallelism for higher performance.
That's sufficient to write an emulator at the instruction set level. However, I want to light the console lights accurately. This requires implementation of the internal registers and the microcode, which varies from model to model and isn't documented in the Principles of Operation.