
> In the 1980s a clear case of this was that MS-DOS 2.0 had system calls for file operations that basically worked like Unix whereas MS-DOS 1's filesystem API looked like CP/M.

I honestly don't think that's a good example. On the contrary, I think it actually obscures what I mean, and would lead the casual reader to assume that things were much less different than they actually were.

Both MS-DOS and CP/M still had the very clear and almost identical concept of a "file" in the first place. I don't know if CP/M (and in turn CMS) was inspired by UNIX in that way, or whether the "file" concept came from a different common ancestor, but it's worth repeating that MVS has more or less nothing like that "file" concept.

MVS had "datasets" and "partitioned datasets", which I often see people relating to "files" and "directories" through a lens colored by today's computing world. But if you start using it, you quickly realize that the resemblance is actually pretty minimal. (If you use the 1980s MVS 3.8j, that is.)

Both datasets and partitioned datasets require you to do things that even the simplest of filesystems (e.g. FAT12 for MS-DOS) do on their own, completely transparently to the user (or even the developer). Moreover, datasets/members are usually (not always) organized as "records", sometimes indexed, sometimes even indexed in a key-value manner. This is so fundamental to the system that it extends down into the hardware, i.e. the disk itself understands the concepts of indices, record lengths, and even keyed records with associated values. MS-DOS, CP/M, and practically all modern systems instead see "files" as a stream of bytes/words/octets or whatever.
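To make the contrast concrete, here is a hypothetical sketch (in Python, not any real MVS or DOS API) of the two access models: a byte-stream file addressed by offset, versus a keyed record set addressed by key, something the CKD disk hardware itself supported on MVS:

```python
# Hypothetical illustration of the two access models, not real APIs.

# Byte-stream model (MS-DOS, CP/M, UNIX): a file is just bytes,
# addressed by byte offset.
stream = b"hello world"
assert stream[6:11] == b"world"

# Keyed-record model (MVS-style, e.g. an indexed dataset): the "file"
# is a set of fixed-layout records looked up by key.
records = {
    b"CUST0001": b"ACME CORP      1200",
    b"CUST0002": b"GLOBEX INC     0450",
}
assert records[b"CUST0002"].startswith(b"GLOBEX")  # addressed by key
```

The point is that in the second model, record boundaries and keys are part of the storage format itself, not a convention layered on top of a byte stream by an application.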

A lot of this has been abstracted away and "pulled" into the modern and familiar "file" concept the closer you get to z/OS, but that's what MVS back then was like.

A C64 with its 1541 is closer to old-school MVS than MS-DOS and CP/M both are, because a 1541 supports both "sequential" files (byte streams) and "relative" files (indexed record sets), and because it provides a relatively high-level interface to circumvent all that and work with the disk ("volume" in MVS parlance) more directly. There's even a "user defined" file type. Overall, though, the 1541 is still closer to MS-DOS and CP/M, because usually (not always!) you leave the block allocation to the system itself. As on MS-DOS, CP/M, or any modern system, there is basically no sane way around that (at best you can slightly "tweak" it).

That's not even touching on what "batch processing", and associated job control languages and reader/printer/puncher queues, mean in practice.

It's all so alien to today's computing world.



> I don't know if CP/M (and in turn CMS) was inspired by UNIX in that way, or whether the "file" concept came from a different common ancestor, […]

CP/M drew heavily on DEC operating system designs, notably RSX-11M – it even had PIP as a file «manipulation» command, as well as the device naming and management commands (e.g. ASSIGN). Perhaps other things as well. MS-DOS 1 descended from CP/M, whereas MS-DOS 2 diverged from it and borrowed from UNIX.

> […] but it's worth repeating that MVS has more or less nothing like that "file" concept.

Ironically, the thing blamed today as the root of all evil – that [nearly] everything is a file and a stream of bytes in UNIX – was the major liberating innovation and productivity boost that UNIX offered the world. Whenever somebody mentions that stderr should be a stream of typed objects or the like, I cringe and shudder, because people do not realise how typed things couple the typed-object consumer to the typed-object producer, a major bane of the computing of old days.

The world was a different place back then, and the idea of having a personal computer of any sort was either preposterous or the stuff of science fiction set in the distant future.

So, I/O on mainframes and minicomputers was heavily skewed towards business-oriented tasks: business automation and business productivity enhancements. Databases had not entered the world yet either (they were still incubating in the research departments of IBM et al.), and record-oriented I/O was pretty much the mainstream. Conceptually, it was an overengineered Berkeley DB, so to speak, baked into the kernel and the hardware, so it was not possible to just open a file, as it was not, well, a file. In fact, I have an open PDF on my laptop titled «IAS/RSX-11M I/O Operations Reference Manual» that is 262 pages long and dedicates 45 pages of Chapter 2 to describing how to prepare the file control block required just to open a file. I will take a UNIX open(2) one-liner any time over that, thanks.
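For comparison, Python's os.open is a thin wrapper over that very open(2) call, so the whole "ceremony" really is one line (an illustrative sketch, obviously nothing like talking to RSX-11M; the file name is made up):

```python
import os
import tempfile

# A scratch path for the demo (hypothetical file name).
path = os.path.join(tempfile.gettempdir(), "open2_demo.txt")

# The entire UNIX "file control block": one call with a path,
# flags, and a permission mode. Compare with 45 manual pages of
# FCB preparation on IAS/RSX-11M.
fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
os.write(fd, b"hello")
os.close(fd)

with open(path, "rb") as f:
    assert f.read() == b"hello"

os.remove(path)  # clean up the scratch file
```

No record format, organization, or access-mode declarations anywhere: the file is just a stream of bytes, and the kernel neither knows nor cares what is in it.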


> CP/M drew heavily on DEC operating system designs, notably RSX-11M

It mostly feels like a single-user version of TOPS-10, which Kildall initially used to write CP/M. OS/8 and RT-11 also borrow heavily from it, making it basically the common ancestor of anything that feels "DOS".


I think those things are largely orthogonal, though. Opening a file can be simple without everything having to be a file.

So having the concept of files at all (and simple-to-open ones, to boot) is of course much better than MVS datasets, which barely abstracted the storage hardware for you at all. But on the other end of the spectrum, that does not mean everything has to be a file, as UNIX popularized.

To be clear, I am not defending MVS. We've come a long way from that, and that's good. I may however want to defend the AS/400, which is far from UNIX in the other direction, and so in a lot of ways the polar opposite of MVS. However, I haven't actually worked with it enough to know whether its awesome-seeming concepts actually hold up in real life. (Though I've at least frequently heard how incredibly rock-solid and dependable AS/400s are.)


> Both MS-DOS and CP/M still had the very clear and almost identical concept of a "file" in the first place.

ms-dos files (after 2.0) were sequences of bytes; cp/m's were sequences of 128-byte 'records', which is why old text files pad out to a multiple of 128 bytes with ^z characters. ms-dos (after 2.0) supported using the same 'file' system calls to read and write bytestream devices like paper tape readers and the console; cp/m had special system calls to read and write those (though pip did have special-case device names). see https://www.seasip.info/Cpm/bdos.html

that is, this

> CP/M (...) instead see[s] "files" as a stream of bytes/words/octets or whatever.

is not correct; cp/m has no system calls for reading or writing bytes or words to or from a file. nor octets, which is french for what ms-dos and cp/m call 'bytes'

admittedly filesystems with more than two types of files (ms-dos 2+ has two: directories and regular files) are more different from cp/m
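the ^z padding mentioned above can be sketched like this (a hypothetical helper in python, not real cp/m code): the os only tracked file length in 128-byte records, so text was padded to the next record boundary with ctrl-z (0x1a), which doubled as the end-of-text marker:

```python
RECLEN = 128  # cp/m's addressable unit: the 128-byte record

def cpm_pad(text: bytes) -> bytes:
    """Pad text up to the next 128-byte record boundary with ^Z."""
    pad = (-len(text)) % RECLEN
    return text + b"\x1a" * pad

padded = cpm_pad(b"hello, world\r\n")
assert len(padded) % RECLEN == 0          # always whole records
assert padded.rstrip(b"\x1a") == b"hello, world\r\n"
```

a text file whose length happened to be an exact multiple of 128 got no padding at all, which is why ^z-stripping readers had to treat the marker as optional.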


Granted, I wasn't aware. But unless you also have to tell CP/M how many cylinders and/or tracks you want to allocate for your "file" upfront, how large the (single!) extent should be if that allocation is exceeded, and additionally enter your file into a catalogue separately (instead of every file on disk implying at least one directory entry, or multiple for filesystems that support hard links), then CP/M and MS-DOS files are still very similar to each other.

Also, it sounds to me like those 128-byte records were still very much sequential. That is, 128 bytes may have been the smallest unit you could extend a file by, but beyond that it's still a consecutive stream of bytes. (Happy to be told that I'm wrong.) With MVS, the "files" can be fundamentally indexed by record number, or even by key, and they are even organized that way by the disk hardware itself.


yes, exactly right, except that it's not a consecutive stream of bytes, it's a consecutive stream of 128-byte records; all access to files in cp/m is by record number, not byte number
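a sketch of that access model (hypothetical helper, not a real cp/m api): the addressable unit is the 128-byte record, so "seeking" means multiplying a record number by 128, and there is no way to address an individual byte:

```python
RECLEN = 128  # the only unit cp/m's bdos knows how to address

# a toy 512-byte disk-file image: exactly 4 records
data = bytes(range(256)) * 2

def read_record(image: bytes, recno: int) -> bytes:
    """Return record number `recno` from a file image."""
    off = recno * RECLEN
    return image[off:off + RECLEN]

assert read_record(data, 1) == data[128:256]
assert len(read_record(data, 3)) == RECLEN
```

byte-granular i/o, where it was wanted, had to be built by the application on top of whole-record reads and writes.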


While "octet" is used in French, it is also used in many other languages, and it is preferred over the ambiguous "byte" in the English versions of many international standards, especially those concerning communication protocols.


it was ambiguous in 01967


> This goes so fundamentally with the system that it goes down into the hardware, i.e. the disk itself understands the concept of indices, record lengths and even keyed records with associated values.

Interesting, so the disk controller firmware understood records / data sets?

I believe filesystems with files that could be record-oriented in addition to byte streams were also common: e.g. VMS had RMS on Files-11, and the MPE file system was record-oriented until it got POSIX with MPE/iX. Tandem NonStop's Enscribe filesystem also has different types of record-oriented files in addition to unstructured files.

I assume it was a logical transition for businesses transferring from punch-cards or just plain paper "records" to digital ones.


> businesses transferring from punch-cards... to digital ones.

A small quibble, but punched cards are 100% digital.

A card held a line of code, and that's why terminals default to 80 columns wide: an 80-character line of code.


> Interesting, so the disk controller firmware understood records / data sets?

Yep. The disk was addressed by record in a fundamental manner: https://en.wikipedia.org/wiki/Count_key_data

An offshoot of this is that the Hercules mainframe emulator's disk image format reflects it: unlike other common disk image formats, it is not just an opaque stream of bytes.
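As a rough illustration of what the CKD article describes (a toy Python model, with made-up names and sample data, not real channel programming): each record on a track carries a count area giving its own address and lengths, an optional key area, and a data area, and a "search key equal" channel command could locate a record by key in the hardware itself:

```python
from dataclasses import dataclass

@dataclass
class CKDRecord:
    cyl: int     # count area: the record's own cylinder address
    head: int    #   ... head address
    rec: int     #   ... record number on the track
    key: bytes   # key area (optional on real disks)
    data: bytes  # data area

# A toy track holding two keyed records (fabricated sample data).
track = [
    CKDRecord(0, 0, 1, b"CUST0001", b"ACME CORP"),
    CKDRecord(0, 0, 2, b"CUST0002", b"GLOBEX INC"),
]

def search_key_equal(track, key):
    """What the disk/control unit could do on its own: scan the
    track for a record whose key area matches."""
    return next(r for r in track if r.key == key)

assert search_key_equal(track, b"CUST0002").data == b"GLOBEX INC"
```

On a byte-stream disk this lookup would have to be done entirely in host software; on CKD hardware the channel program could offload the key comparison to the storage subsystem.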

> I assume it was a logical transition for businesses transferring from punch-cards or just plain paper "records" to digital ones.

Yeah, that is a sensible assumption. In general, MVS's "not-filesystem" world looks in a lot of ways like an intermediary between paper records/tapes and actual filesystems.


> Yep. The disk was addressed by record in a fundamental manner: https://en.wikipedia.org/wiki/Count_key_data

Well, mainstream hard disks (what IBM calls "FBA") are also addressed by record in a fundamental manner. It is just that the records (sectors) are fixed-length. Hard disks often support a small selection of sector sizes (e.g. 512, 520, 524, or 528 byte sectors for older 512-byte-sector HDDs; 4096, 4112, 4160, or 4224 byte sectors for the newer 4096-byte-sector HDDs); the extended sector sizes are designed for use by RAID, or by certain obscure operating systems that require them, e.g. IBM AS/400 systems.
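The FBA contrast boils down to this (trivial sketch, made-up helper name): because every record on the device is the same size, a block address maps to a byte offset with a single multiplication, with no count or key areas involved:

```python
SECTOR = 512  # classic fixed-block sector size

def lba_to_offset(lba: int, sector_size: int = SECTOR) -> int:
    """Map a logical block address to a byte offset: the whole
    'geometry' of a fixed-block device in one multiplication."""
    return lba * sector_size

assert lba_to_offset(0) == 0
assert lba_to_offset(3) == 1536
assert lba_to_offset(2, sector_size=4096) == 8192
```

On CKD, by contrast, records on a track could have differing lengths and keys, so locating one required reading count areas rather than doing arithmetic.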

Floppies were closer to IBM mainframe hard disks than standard FBA hard disks are. Floppies can have tracks with sectors of different sizes, and even a mix of different sector sizes on a single track; IBM standard floppies (used by PCs) have two different types of sectors, normal and deleted (albeit almost nobody ever used deleted sectors); standard PC floppy controllers have commands to do searches of sectors (the SCAN commands–but little software ever used them, and by the 1990s some FDCs were even omitting support for them to reduce complexity).

And although z/OS still requires CKD (actually ECKD) hard disks, newer software (e.g. VSAM, PDSE, HFS, zFS) largely doesn't use the key field (hardware keys), instead implementing keys in software (which turns out to be faster). However, the hardware keys are still required because they are an essential part of the on-disk structure of the IBM VTOC dataset filesystem.

Actually, the Linux kernel contains support for the IBM VTOC dataset filesystem. [0] Except as far as Linux is concerned, it is not a filesystem, it is a partition table format. [1]

I think part of the point of this is that if you have a mixed z/OS and z/Linux environment, you can store your z/Linux filesystems inside a VTOC filesystem. Then, if you end up accessing one of your z/Linux volumes from z/OS, people will see it contains a Linux filesystem dataset and leave it alone, as opposed to thinking "oh, this volume is corrupt, I'd better format it!" because z/OS can't read it.

> In general, MVS's "not-filesystem" world looks in a lot of ways like an intermediary between paper records/tapes and actual filesystems.

I think the traditional MVS filesystem really is a filesystem. Sure, it is weird by contemporary mainstream standards. But by the standards of historical mainframe/minicomputer filesystems, less so.

[0] https://github.com/torvalds/linux/blob/v6.10/arch/s390/inclu...

[1] https://github.com/torvalds/linux/blob/v6.10/block/partition...


Yet even Microsoft tried to shove something like that into the PC world, from the OFS effort during the Cairo development up to WinFS, which actually appeared in a developer preview of Longhorn.

And even more recently, there've been efforts to expose a native key:value interface on SSDs, to let the drive controller handle the abstraction of the underlying flash cells.

I'm not well-enough versed in this stuff to understand how similar these things are to what you're talking about, however. Very much appreciate any clue you feel like offering.



