
> In the 1980s a clear case of this was that MS-DOS 2.0 had system calls for file operations that basically worked like Unix whereas MS-DOS 1's filesystem API looked like CP/M.

I honestly don't think that's a good example. On the contrary, I think it actually obscures what I mean, and would lead the casual reader to assume that things were much less different than they actually were.

Both MS-DOS and CP/M still had the very clear and almost identical concept of a "file" in the first place. I don't know if CP/M (and in turn CMS) was inspired by UNIX in that way, or whether the "file" concept came from a different common ancestor, but it's worth repeating that MVS has more or less nothing like that "file" concept.

MVS had "datasets" and "partitioned datasets", which I often see people relating to "files" and "directories" through a lens colored by today's computing world. But if you start using it, you quickly realize that the resemblance is actually pretty minimal. (If you use the 1980s MVS 3.8j, that is.)

Both datasets and partitioned datasets require you to do things that even the simplest of filesystems (e.g. FAT12 for MS-DOS) do on their own, completely transparently to the user (or even the developer). Moreover, datasets/members are usually (not always) organized as "records", sometimes indexed, sometimes even indexed in a key-value manner. This is so fundamental to the system that it extends down into the hardware, i.e. the disk itself understands the concepts of indices, record lengths, and even keyed records with associated values. MS-DOS, CP/M, and practically all modern systems instead see "files" as a stream of bytes/words/octets or whatever.
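To make the contrast concrete, here is a hypothetical sketch (in Python, not any real MVS or DOS API) of the two access models: a byte-stream file addressed by offset, versus a keyed record set addressed by key, something the CKD disk hardware itself supported on MVS:

```python
# Hypothetical illustration of the two access models, not real APIs.

# Byte-stream model (MS-DOS, CP/M, UNIX): a file is just bytes,
# addressed by byte offset.
stream = b"hello world"
assert stream[6:11] == b"world"

# Keyed-record model (MVS-style, e.g. an indexed dataset): the "file"
# is a set of fixed-layout records looked up by key.
records = {
    b"CUST0001": b"ACME CORP      1200",
    b"CUST0002": b"GLOBEX INC     0450",
}
assert records[b"CUST0002"].startswith(b"GLOBEX")  # addressed by key
```

The point is that in the second model, record boundaries and keys are part of the storage format itself, not a convention layered on top of a byte stream by an application.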

A lot of this has been abstracted away and "pulled" into the modern and familiar "file" concept the closer you get to z/OS, but that's what MVS back then was like.

A C64 with its 1541 is closer to old-school MVS than MS-DOS and CP/M both are, because a 1541 supports both "sequential" files (byte streams) and "relative" files (indexed record sets), and because it provides a relatively high-level interface to circumvent all that and work with the disk ("volume" in MVS parlance) more directly. There's even a "user defined" file type. Overall, though, the 1541 is still closer to MS-DOS and CP/M, because usually (not always!) you leave the block allocation to the system itself. As on MS-DOS, CP/M, or any modern system, there is basically no sane way around that (at best you can slightly "tweak" it).

That's not even touching on what "batch processing", and associated job control languages and reader/printer/puncher queues, mean in practice.

It's all so alien to today's computing world.



> I don't know if CP/M (and in turn CMS) was inspired by UNIX in that way, or whether the "file" concept came from a different common ancestor, […]

CP/M drew heavily on DEC operating system designs, notably RSX-11M – it even had PIP as a file «manipulation» command, as well as the device naming and management commands (e.g. ASSIGN). Perhaps other things as well. MS-DOS 1 descended from CP/M, whereas MS-DOS 2 diverged from it and borrowed from UNIX.

> […] but it's worth repeating that MVS has more or less nothing like that "file" concept.

Ironically, the thing blamed today as the root of all evil – that [nearly] everything is a file and a stream of bytes in UNIX – was the major liberating innovation and productivity boost that UNIX offered the world. Whenever somebody mentions that stderr should be a stream of typed objects or the like, I cringe and shudder, because people do not realise how typed things couple the typed-object consumer to the typed-object producer, a major bane of the computing of old days.

The world was a different place back then, and the idea of having a personal computer of any sort was either preposterous or the stuff of science fiction set in the distant future.

So, I/O on mainframes and minicomputers was heavily skewed towards business-oriented tasks: business automation and business productivity enhancements. Databases had not entered the world yet either (they were still incubating in the research departments of IBM et al.), and record-oriented I/O was pretty much the mainstream. Conceptually, it was an overengineered Berkeley DB, so to speak, baked into the kernel and the hardware, so it was not possible to just open a file, as it was not, well, a file. In fact, I have an open PDF on my laptop titled «IAS/RSX-11M I/O Operations Reference Manual» that is 262 pages long and dedicates 45 pages of Chapter 2 to describing how to prepare the file control block required just to open a file. I will take a UNIX open(2) one-liner any time over that, thanks.
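For comparison, Python's os.open is a thin wrapper over that very open(2) call, so the whole "ceremony" really is one line (an illustrative sketch, obviously nothing like talking to RSX-11M; the file name is made up):

```python
import os
import tempfile

# A scratch path for the demo (hypothetical file name).
path = os.path.join(tempfile.gettempdir(), "open2_demo.txt")

# The entire UNIX "file control block": one call with a path,
# flags, and a permission mode. Compare with 45 manual pages of
# FCB preparation on IAS/RSX-11M.
fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
os.write(fd, b"hello")
os.close(fd)

with open(path, "rb") as f:
    assert f.read() == b"hello"

os.remove(path)  # clean up the scratch file
```

No record format, organization, or access-mode declarations anywhere: the file is just a stream of bytes, and the kernel neither knows nor cares what is in it.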


> CP/M drew heavily on DEC operating system designs, notably RSX-11M

It mostly feels like a single-user version of TOPS-10, which Kildall initially used to write CP/M. OS/8 and RT-11 also borrow heavily from it, making it basically the common ancestor of anything that feels "DOS".


I think those things are largely orthogonal, though. Opening a file can be simple without everything having to be a file.

So having the concept of files at all (and simple-to-open ones, to boot) is of course much better than MVS datasets, which barely abstracted the storage hardware for you at all. But on the other end of the spectrum, that does not mean everything has to be a file, as UNIX popularized.

To be clear, I am not defending MVS. We've come a long way from that, and that's good. I may however want to defend the AS/400, which is far from UNIX in the other direction, and so in a lot of ways the polar opposite of MVS. However, I haven't actually worked with it enough to know whether its awesome-seeming concepts actually hold up in real life. (Though I've at least frequently heard how incredibly rock-solid and dependable AS/400s are.)


> Both MS-DOS and CP/M still had the very clear and almost identical concept of a "file" in the first place.

ms-dos files (after 2.0) were sequences of bytes; cp/m's were sequences of 128-byte 'records', which is why old text files pad out to a multiple of 128 bytes with ^z characters. ms-dos (after 2.0) supported using the same 'file' system calls to read and write bytestream devices like paper tape readers and the console; cp/m had special system calls to read and write those (though pip did have special-case device names). see https://www.seasip.info/Cpm/bdos.html

that is, this

> CP/M (...) instead see[s] "files" as a stream of bytes/words/octets or whatever.

is not correct; cp/m has no system calls for reading or writing bytes or words to or from a file. nor octets, which is french for what ms-dos and cp/m call 'bytes'

admittedly filesystems with more than two types of files (ms-dos 2+ has two: directories and regular files) are more different from cp/m
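the ^z padding mentioned above can be sketched like this (a hypothetical helper in python, not real cp/m code): the os only tracked file length in 128-byte records, so text was padded to the next record boundary with ctrl-z (0x1a), which doubled as the end-of-text marker:

```python
RECLEN = 128  # cp/m's addressable unit: the 128-byte record

def cpm_pad(text: bytes) -> bytes:
    """Pad text up to the next 128-byte record boundary with ^Z."""
    pad = (-len(text)) % RECLEN
    return text + b"\x1a" * pad

padded = cpm_pad(b"hello, world\r\n")
assert len(padded) % RECLEN == 0          # always whole records
assert padded.rstrip(b"\x1a") == b"hello, world\r\n"
```

a text file whose length happened to be an exact multiple of 128 got no padding at all, which is why ^z-stripping readers had to treat the marker as optional.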


Granted, I wasn't aware. But unless you also have to tell CP/M how many cylinders and/or tracks you want to allocate for your "file" upfront, how large the (single!) extent should be if that allocation is exceeded, and additionally enter your file into a catalogue separately (instead of every file on disk implying at least one directory entry, or multiple for filesystems that support hard links), then CP/M and MS-DOS files are still very similar to each other.

Also, it sounds to me like those 128-byte records were still very much sequential. That is, 128 bytes may have been the smallest unit you could extend a file by, but beyond that it's still a consecutive stream of bytes. (Happy to be told that I'm wrong.) With MVS, the "files" can be fundamentally indexed by record number, or even by key, and they are even organized that way by the disk hardware itself.


yes, exactly right, except that it's not a consecutive stream of bytes, it's a consecutive stream of 128-byte records; all access to files in cp/m is by record number, not byte number
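a sketch of that access model (hypothetical helper, not a real cp/m api): the addressable unit is the 128-byte record, so "seeking" means multiplying a record number by 128, and there is no way to address an individual byte:

```python
RECLEN = 128  # the only unit cp/m's bdos knows how to address

# a toy 512-byte disk-file image: exactly 4 records
data = bytes(range(256)) * 2

def read_record(image: bytes, recno: int) -> bytes:
    """Return record number `recno` from a file image."""
    off = recno * RECLEN
    return image[off:off + RECLEN]

assert read_record(data, 1) == data[128:256]
assert len(read_record(data, 3)) == RECLEN
```

byte-granular i/o, where it was wanted, had to be built by the application on top of whole-record reads and writes.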


While "octet" is used in French, it is also used in many other languages, and it is preferred over the ambiguous "byte" in the English versions of many international standards, especially those concerning communication protocols.


it was ambiguous in 01967


> This goes so fundamentally with the system that it goes down into the hardware, i.e. the disk itself understands the concept of indices, record lengths and even keyed records with associated values.

Interesting, so the disk controller firmware understood records / data sets?

I believe filesystems with files that could be record-oriented in addition to byte streams were also common: e.g. VMS had RMS on Files-11, and the MPE file system was record-oriented until it got POSIX with MPE/iX. Tandem NonStop's Enscribe filesystem also has different types of record-oriented files in addition to unstructured files.

I assume it was a logical transition for businesses transferring from punch-cards or just plain paper "records" to digital ones.


> businesses transferring from punch-cards... to digital ones.

A small quibble, but punched cards are 100% digital.

A card held a line of code, and that's why terminals default to 80 columns wide: an 80-character line of code.


> Interesting, so the disk controller firmware understood records / data sets?

Yep. The disk was addressed by record in a fundamental manner: https://en.wikipedia.org/wiki/Count_key_data

An offshoot of this is that the Hercules mainframe emulator's disk image format reflects it: unlike other common disk image formats, it is not just an opaque stream of bytes.
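As a rough illustration of what the CKD article describes (a toy Python model, with made-up names and sample data, not real channel programming): each record on a track carries a count area giving its own address and lengths, an optional key area, and a data area, and a "search key equal" channel command could locate a record by key in the hardware itself:

```python
from dataclasses import dataclass

@dataclass
class CKDRecord:
    cyl: int     # count area: the record's own cylinder address
    head: int    #   ... head address
    rec: int     #   ... record number on the track
    key: bytes   # key area (optional on real disks)
    data: bytes  # data area

# A toy track holding two keyed records (fabricated sample data).
track = [
    CKDRecord(0, 0, 1, b"CUST0001", b"ACME CORP"),
    CKDRecord(0, 0, 2, b"CUST0002", b"GLOBEX INC"),
]

def search_key_equal(track, key):
    """What the disk/control unit could do on its own: scan the
    track for a record whose key area matches."""
    return next(r for r in track if r.key == key)

assert search_key_equal(track, b"CUST0002").data == b"GLOBEX INC"
```

On a byte-stream disk this lookup would have to be done entirely in host software; on CKD hardware the channel program could offload the key comparison to the storage subsystem.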

> I assume it was a logical transition for businesses transferring from punch-cards or just plain paper "records" to digital ones.

Yeah, that is a sensible assumption. In general, MVS's "not-filesystem" world looks in a lot of ways like an intermediary between paper records/tapes and actual filesystems.


> Yep. The disk was addressed by record in a fundamental manner: https://en.wikipedia.org/wiki/Count_key_data

Well, mainstream hard disks (what IBM calls "FBA") are also addressed by record in a fundamental manner. It is just that the records (sectors) are fixed-length. Hard disks often support a small selection of sector sizes (e.g. 512, 520, 524, or 528 byte sectors for older 512-byte-sector HDDs; 4096, 4112, 4160, or 4224 byte sectors for the newer 4096-byte-sector HDDs); the extended sector sizes are designed for use by RAID, or by certain obscure operating systems that require them, e.g. IBM AS/400 systems.
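The FBA contrast boils down to this (trivial sketch, made-up helper name): because every record on the device is the same size, a block address maps to a byte offset with a single multiplication, with no count or key areas involved:

```python
SECTOR = 512  # classic fixed-block sector size

def lba_to_offset(lba: int, sector_size: int = SECTOR) -> int:
    """Map a logical block address to a byte offset: the whole
    'geometry' of a fixed-block device in one multiplication."""
    return lba * sector_size

assert lba_to_offset(0) == 0
assert lba_to_offset(3) == 1536
assert lba_to_offset(2, sector_size=4096) == 8192
```

On CKD, by contrast, records on a track could have differing lengths and keys, so locating one required reading count areas rather than doing arithmetic.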

Floppies were closer to IBM mainframe hard disks than standard FBA hard disks are. Floppies can have tracks with sectors of different sizes, and even a mix of different sector sizes on a single track; IBM standard floppies (used by PCs) have two different types of sectors, normal and deleted (albeit almost nobody ever used deleted sectors); standard PC floppy controllers have commands to do searches of sectors (the SCAN commands–but little software ever used them, and by the 1990s some FDCs were even omitting support for them to reduce complexity).

And although z/OS still requires CKD (actually ECKD) hard disks, newer software (e.g. VSAM, PDSE, HFS, zFS) largely doesn't use the key field (hardware keys), instead implementing keys in software (which turns out to be faster). However, the hardware keys are still required because they are an essential part of the on-disk structure of the IBM VTOC dataset filesystem.

Actually, the Linux kernel contains support for the IBM VTOC dataset filesystem. [0] Except as far as Linux is concerned, it is not a filesystem, it is a partition table format. [1]

I think part of the point of this is that if you have a mixed z/OS and z/Linux environment, you can store your z/Linux filesystems inside a VTOC filesystem. Then, if you end up accessing one of your z/Linux volumes from z/OS, people will see it contains a Linux filesystem dataset and leave it alone, as opposed to thinking "oh, this volume is corrupt, I'd better format it!" because z/OS can't read it.

> In general, MVS's "not-filesystem" world looks in a lot of ways like an intermediary between paper records/tapes and actual filesystems.

I think the traditional MVS filesystem really is a filesystem. Sure, it is weird by contemporary mainstream standards. But by the standards of historical mainframe/minicomputer filesystems, less so.

[0] https://github.com/torvalds/linux/blob/v6.10/arch/s390/inclu...

[1] https://github.com/torvalds/linux/blob/v6.10/block/partition...


Yet even Microsoft tried to shove something like that into the PC world, from the OFS effort during the Cairo development up to WinFS, which actually appeared in a developer preview of Longhorn.

And even more recently, there've been efforts to expose a native key:value interface on SSDs, to let the drive controller handle the abstraction of the underlying flash cells.

I'm not well-enough versed in this stuff to understand how similar these things are to what you're talking about, however. Very much appreciate any clue you feel like offering.



