Unidentified PC DOS 1.1 Boot Sector Junk Identified (os2museum.com)
184 points by kencausey on Jan 7, 2022 | 66 comments



Uninitialised memory in development tools can have very interesting results! There's a Game Boy Color game that contained HTML from a porn site in an uninitialised region: https://tcrf.net/DynaMike


Wow. That site is great. Thanks for linking.


So did that mean... the game itself could theoretically be 18+ since it contained implicit references to pornography?


While there is precedent[0] for reclassifying a game based on content included on the distribution medium but not normally accessible within the game as released, I think that case would be hard to make here, since it would be much harder to induce a GBC game to render an HTML document or JPEG image.

[0]: https://en.wikipedia.org/wiki/Hot_Coffee_(mod)


I had my own misadventure with uninitialized memory in the late 1970s when I was working at Tymshare. I was maintaining the assembler and linker for one of our machines (I think it was the PDP-10 but could have been the Sigma 7).

The linker had a bad habit of leaving unused memory uninitialized. Every time you linked a program, the binary would be different. Functionally the same, but they wouldn't compare byte for byte. So my manager asked me to make sure the linker zeroed out all unused memory.
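A minimal sketch of why uninitialised padding breaks byte-for-byte comparison, and why zero-filling fixes it. This is a toy model in Python, not the actual linker: `link_image`, the 16-byte image size, and the random bytes standing in for stale memory are all assumptions made for illustration.

```python
import os

IMAGE_SIZE = 16  # hypothetical fixed image size, in bytes


def link_image(code: bytes, zero_fill: bool) -> bytes:
    """Place `code` at the start of a fixed-size image.

    With zero_fill=False the unused tail keeps whatever was in the buffer,
    simulated here with random bytes standing in for stale memory.
    """
    if len(code) > IMAGE_SIZE:
        raise ValueError("code does not fit in the image")
    if zero_fill:
        buffer = bytearray(IMAGE_SIZE)              # all zeros
    else:
        buffer = bytearray(os.urandom(IMAGE_SIZE))  # stale "junk"
    buffer[:len(code)] = code
    return bytes(buffer)


code = b"\x01\x02\x03\x04"

# Without zero-filling, the used bytes match but the tails almost never do.
a = link_image(code, zero_fill=False)
b = link_image(code, zero_fill=False)
print(a[:4] == b[:4])  # the code itself is identical
print(a == b)          # almost certainly False, because the tails differ

# Zero-filled images are reproducible byte for byte.
c = link_image(code, zero_fill=True)
d = link_image(code, zero_fill=True)
print(c == d)
```

The two unfilled images are functionally the same program, which is exactly why the difference is so annoying: only a byte-level compare notices it.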

After linking a program, the linker printed a message something like this:

  8412 bytes used
   439 bytes free

The linker was a mess of spaghetti assembly code, and it was a real pain to find and fix all of the places where it failed to clear memory.

My manager knew what a hassle this task was, and he was fairly chill, so just for fun I added a temporary message meant for his eyes only:

  8412 bytes used
   439 bytes free and it's pretty fucking clean

I figured he would get a laugh out of this and then I'd remove it.

Unfortunately he wasn't the first to see the message. His manager was giving a demo to a customer and they saw the message. Oops.

Later we wanted to add a "weak external" feature to the assembler and linker, something like the "weak reference" that a few modern languages have. If the external was missing, then instead of failing the link step, it would leave a zero that you would check at runtime.

The regular external directive was called EXTERN, and the tradition was for these names to be very short. I thought of calling it WEXTERN but then decided to call the feature a "secondary external", so the directive was SEXTERN!
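For anyone curious what that "leave a zero and check at runtime" behaviour looks like, here is a rough sketch as a toy symbol resolver in Python. It is not the real assembler or linker: `resolve`, `LinkError`, and the symbol names `PRINT`, `OPEN` and `TRACE` are hypothetical, and only the EXTERN/SEXTERN distinction comes from the story above.

```python
class LinkError(Exception):
    """Raised when a required (strong) external cannot be resolved."""


def resolve(externals, weak_externals, defined):
    """Toy symbol resolution.

    externals:      names that must be defined (EXTERN)
    weak_externals: names that may be missing (SEXTERN); missing ones get 0
    defined:        mapping of symbol name -> address from the other modules
    """
    table = {}
    for name in externals:
        if name not in defined:
            raise LinkError(f"undefined external: {name}")
        table[name] = defined[name]
    for name in weak_externals:
        table[name] = defined.get(name, 0)  # leave a zero instead of failing
    return table


defined = {"PRINT": 0x1000, "OPEN": 0x1040}

table = resolve(["PRINT"], ["OPEN", "TRACE"], defined)

# The program checks at run time whether the optional routine was linked in.
if table["TRACE"] == 0:
    print("TRACE not linked; skipping trace output")
else:
    print(f"TRACE at {table['TRACE']:#x}")
```

The design trade-off is that a missing strong external is caught at link time, while a missing weak one becomes the program's responsibility to test for before calling.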


I love this. This is what makes the internet the beautiful, amazing place that it is - anyone can publish and you get these highly specialized, well-written, informative articles on esoteric topics.


Especially stuff you might have come across 40 years ago and it's been sitting at the back of your brain ever since.

I recently closed a 20-something year mystery to do with a weird TV advert from the UK in the late 90s. I could never figure out why a certain poster for a musician was on the wall in a room. Fast forward to 2021 and I managed to hunt down the writer of the advert who had some background info, and then put me in touch with the director and we figured it out and had a laugh about it. And I contacted the musician who was very happy to also have the mystery closed. It's nice to check off a 20-year-old item on your TO DO list.


There's a really good NPR podcast about this very kind of thing, where there's a guy who's been hunting down a particular "musical tone" for more than a decade, and it talks both about how hard it is to be the kind of person that needs those answers, the process of arriving at an answer (including many, many false starts and red herrings), and the satisfaction that comes from figuring it out.

I don't want to spoil it, but the tremendous tragedy/irony is that a certain class of people within the IT industry could have answered the guy's question (or at least put him definitively on the right track) after hearing it just one time, which I think is actually a really important lesson to fully internalize: whatever thing that's been haunting/daunting you, if you get lucky enough to just mention it in front of that one right person you'll be immediately closer to an answer than if you spent literally ten years chasing it without their insight.

I'm not a huge podcast (or any audio/visual media) advocate, but this one is worth listening to if you're stuck in the car or something:

https://www.thisamericanlife.org/516/stuck-in-the-middle/act...


If you like that one, you may enjoy this one too (which has much more of a mystery to it imho) https://gimletmedia.com/shows/reply-all/o2h8bx


I was listening to that episode while riding a bus and I was almost screaming internally the whole time. I knew exactly what that song was and by whom, and had a dusty old low quality mp3 still on some hard drive at home, possibly acquired through less than legal means at least some 15 years ago.


Amazing! That must have been very interesting as the show did a good job leading one to think it was a dead end.


This is exactly why I miss IMDB's "I need to know" board. You could post a partially correct memory snippet from decades ago and someone would chime in with the answer. I had memories of movies I'd watched on TV as a young child. I had no name or context, but simply little moments stuck in my brain, and someone was able to resolve those to a title.

The closing of the IMDB message boards was a terrible loss IMO. You could visit an obscure actor's message board, and you might run into someone who knew them. You could discuss a movie's intricacies with someone, and share theories, etc, on the specific board for that movie. It was a goldmine for movie fans, and I still miss it.


I wish there was something like that for books.

I remember reading a book in my youth about a kid who lived in a shielded city with an AI controlling it with the assignment to keep everyone ‘happy’ (= docile). He/she breaks protocol and gets ejected through a trash chute. Meets an old man. Crosses a wasteland and finds old ruins, learns something and then blows up the shielding of the city to make it free again.

In the Dutch translation either the old man or the main character was called ‘gull’ or ‘sea gull’. I’ve posted on a few subreddits and forums, with people pretty much ignoring it, and I have scoured Google without much result either.


Try https://scifi.stackexchange.com/questions/tagged/story-ident... - they solve this kind of thing all the time.




Logan's Run was the first thing I thought of as well. Maybe a Dutch adaptation of the movie that took some liberties?


The last two sound familiar. Further research is required. At any rate, I thank you!


Goodreads has the What's the Name of That Book?[0] group for identifying books based on a description.

[0] https://www.goodreads.com/group/show/185-what-s-the-name-of-...


Haha, I have a memory of a movie I saw in the 1960s where there's a dream sequence with the guy in a stylized prison cell with pink mist and giant keys dangling in the air. I've never been able to figure out what this was.



Many answers are low quality. Some participants will just Google and post their answers, something you've likely done before asking.

Now r/WhatIsThisThing, that actually feels like crowdsourcing.


There's also the NSFW version /r/tipofmyp** (I'll let you guess the last word)


And I also found /r/tipofmyjoystick for games.


Yeah, it is impressive how well those types of forums work at pinpointing a movie from a few details.

I wrote in a similar local forum about a movie I was not allowed to watch as a kid, where people on a plane disappeared and their false teeth were still there (the only thing I remember), and someone could immediately point me to the correct, kinda crappy Stephen King movie ...


The Langoliers


I've had the same with some music and even from pretty vague descriptions about tempo and content some people were able to find a couple of pieces that I thought I'd never hear again.


That sounds really interesting. Any chance you could share more details?


Sure!

So, this advert went out in, I think, 1998, according to the writer:

https://www.youtube.com/watch?v=pKS_yw4MqAA

It features a poster of Griff Pilchard on the wall.

Now, when I was 16 my crush loaned me a tape of Griff Pilchard which she treasured. It was totally bizarre, but I love it in its weird way. But, as far as the Internet was concerned in the 90s and into the 2000s, Griff Pilchard did not exist.

Even in 2022, very little exists about the man and his music. He is a ghost.

Anyway, when that advert aired in 1998 it was the only reference I had seen anywhere, outside the cassette, to the man's actual existence.

The man himself did finally turn up on Facebook, but he was as baffled as I was about the advert because:

a) No-one on Earth has heard of him b) He's never had any posters c) That isn't his face on the poster

So, to have "his" face on a poster on a major national TV advert that ran constantly for months was unexplainable.

Anyway, I recently managed to find the writer of the advert. He explained that the director had decided on the subject of the poster and had dressed up in a wig to make it. He'd mentioned Griff Pilchard to the rest of the staff, but they had no idea who he was talking about.

It was nice to get an explanation after 20 years of it bugging me. The advert would pop up from time to time in the media and each time it did I couldn't figure it out. Now I've been able to check it off. I sent the info to Griff Pilchard and he was very happy to also know what the fuck it was all about.


That's amazing! :-) Thanks for sharing!


3rded!



Very much seconded!



I will recount one mystery that I personally wondered about for decades, and that coincidentally the author of this article might have shed most light on:

The 386 was the first x86 processor to introduce four "Control Registers", CR0-CR3. The first one, CR0, was technically already introduced with the 286, but then still named "Machine Status Word" and manipulated with its own set of different, explicit instructions (that still survive to this day).

Now, if you look at Intel's documentation of the 386, or any later CPU in the line--including today's Intel CPUs in Laptops, Desktops, and Servers--you will see that the second of those registers, CR1, is entirely "reserved". No single bit in it can be accessed, neither for reading nor for writing.

This is bizarre not only because CR1 was introduced by the 386 along with CR2 and CR3 (which are defined and common), but also because the successor, the 486, introduced a new Control Register, CR4, instead of starting to use reserved bits in CR1. This is despite CR4 sharing its characteristics with CR0 (and unlike e.g. CR2): It's mostly a bit field for global processor state. So while you could have assumed that CR1 existed as planned "overflow" to add new control bits to once CR0 became full, the seemingly simple addition of CR4 in the immediate successor goes against that theory.
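To make the "bit field for global processor state" point concrete, here is a small Python sketch that decodes a CR0 value into flag names. The bit positions are the ones listed in Intel's manuals; the helper names `decode_cr0` and `machine_status_word` and the sample value are my own, purely for illustration. Note that the cache-related bits (CD, NW) live in CR0 from the 486 onwards.

```python
# CR0 flag bits as documented by Intel (bit position -> name).
CR0_FLAGS = {
    0: "PE",   # Protection Enable
    1: "MP",   # Monitor Coprocessor
    2: "EM",   # Emulation
    3: "TS",   # Task Switched
    4: "ET",   # Extension Type
    5: "NE",   # Numeric Error (486 and later)
    16: "WP",  # Write Protect (486 and later)
    18: "AM",  # Alignment Mask (486 and later)
    29: "NW",  # Not Write-through (486 and later)
    30: "CD",  # Cache Disable (486 and later)
    31: "PG",  # Paging
}


def decode_cr0(value: int) -> list[str]:
    """Return the names of the flags set in a 32-bit CR0 value, low bit first."""
    return [name for bit, name in sorted(CR0_FLAGS.items()) if (value >> bit) & 1]


def machine_status_word(cr0: int) -> int:
    """The 286-era Machine Status Word corresponds to the low 16 bits of CR0."""
    return cr0 & 0xFFFF


sample = 0x80000011  # hypothetical value: paging and protected mode enabled
print(decode_cr0(sample))                # ['PE', 'ET', 'PG']
print(hex(machine_status_word(sample)))  # 0x11
```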

Decades ago, I even wrote to the 386's chief architect to try to settle that question... he must have not understood my question, because he just replied that according to Intel documentation, that register is "reserved".

But incidentally, a while ago I talked about that mystery with the author of this article, and received the most plausible theory so far: It turns out that an early pre-release document shows that the 386 was originally planned to contain an on-chip cache. Evidently, that part of the plan must have been scrapped, because the 386 shipped without an on-chip cache, which was only added with the 486 (maybe they ran into problems with implementing a cache controller, or maybe at the time so much on-chip SRAM would have made for a prohibitive price point).

It is therefore not unlikely that CR1 was once meant to control the cache, and in prototypes maybe even did. Once that cache was removed, CR1 was just made "reserved", and not repurposed in the 486 and later CPUs out of an abundance of caution for compatibility: Accessing CR1 reliably causes an Undefined Opcode trap, and maybe some important software at the time relied on that in a bad way.


Now I wonder if inside the modern Intel processor silicon that register actually exists (and what's it actually plugged into?)


Yeah, I have wondered the same about the register in the original 386, it's part of the fun for this mystery. Unfortunately, while simpler CPUs like the 6502 and even the 8086 itself have been reverse engineered to a great extent (the 6502 pretty much fully, the 8086 at least very extensively), briefly talking to members of the community that achieved those feats (some of whom are also active here by the way, like Ken Shirriff) made it apparent that, at least for now, reverse engineering a 386 is still out of reach.

I would guess that reality is pretty boring and that MOVing from and to the register is more or less hardcoded to the Undefined Opcode trap. For anything modern, I'm certain beyond reasonable doubt that that's the case.

But for the 386 we cannot be sure, and if the cache control theory for example is correct, I wouldn't be surprised at all if there are remnants of the former design.

The best outcome would of course be that the register exists in the 386, does something amazing, and just needs to be enabled somehow. Chances for that are rather slim.

In the meantime I have an eBay search for 80386 engineering samples saved. But not even the searches for specific serial numbers of known buggy very early 386 CPUs (might be samples as well), that were prevalent enough that some publications warned against them[1], ever turned anything up at all in years.

[1] https://www.pcjs.org/blog/2015/02/23/


Yeah, reverse-engineering a 386 is way beyond what I can do. The 6502 has about 3500 transistors, the 8086 has about 29,000, and the 80386 has about 275,000 transistors. So it's almost an order of magnitude of complexity each time. Not to mention that the 80386 has two metal layers, which makes it a whole lot harder to see what's going on.
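A quick check of the "almost an order of magnitude each time" arithmetic, using only the approximate transistor counts quoted above (the variable names are just for this sketch):

```python
import math

# Approximate transistor counts quoted in the comment above.
transistors = {"6502": 3_500, "8086": 29_000, "80386": 275_000}

chips = list(transistors)
ratios = {}
for older, newer in zip(chips, chips[1:]):
    ratio = transistors[newer] / transistors[older]
    ratios[(older, newer)] = ratio
    print(f"{older} -> {newer}: x{ratio:.1f} "
          f"({math.log10(ratio):.2f} orders of magnitude)")
```

Both steps come out between 8x and 10x, so "almost an order of magnitude" holds for each jump.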


386 may be a bit too far, but there was a high resolution shot of an 80286 on the visual6502.org wiki (now sadly offline), where you could almost read the microcode ROM visually.

I spent a lot of time staring at that and got at least a good part of the opcode pattern PLA. Many bits were so blurry that I could only guess what they are, based on what other opcodes I already had. The actual microcode is 53760 bits with a very high density, where the difference between a 0/1 comes down to a fraction of a pixel. I think with a better image, it might be possible to read those bits and even automate it.

Not sure if there are any mysteries left to solve though. In the PLA there are patterns with "don't care" bits to match opcodes 64-67 and 0F 07 to 0F FF, so all of these must be invalid (they go to the same address in microcode).
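As a rough illustration of how one pattern with "don't care" bits can cover a whole opcode range, here is a Python sketch. The pattern string is my own reconstruction of what would match 64-67; it is not read from the die shot, and the real PLA encoding may differ.

```python
def matches(pattern: str, opcode: int) -> bool:
    """Match an 8-bit opcode against a pattern of '0', '1' and 'x' (don't care)."""
    if len(pattern) != 8:
        raise ValueError("pattern must have exactly 8 positions")
    bits = f"{opcode:08b}"
    return all(p in ("x", b) for p, b in zip(pattern, bits))


# Hypothetical pattern: top six bits fixed at 011001, low two bits ignored.
INVALID_64_67 = "011001xx"

covered = [op for op in range(256) if matches(INVALID_64_67, op)]
print([f"{op:02X}" for op in covered])  # ['64', '65', '66', '67']
```

One PLA row with two ignored bits handles four opcodes at once, which is why a block of invalid opcodes can all land on the same microcode address.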

Some time ago I found out what the last undocumented opcode 0F 04 - and the F1 prefix - do by experimenting[1]. One slightly odd thing about that instruction is that the CPU seems to grant bus requests for DMA/refresh, but then completely ignore it and write at the same time.

On the first machine I tested this on, this caused some crashes and spurious I/O that I didn't notice at first, but on another one it absolutely did not work unless DRAM refresh was disabled. Maybe there is something in the microcode that explains this, but more likely it has to do with other parts of the hardware?

[1] https://www.vcfed.org/forum/forum/technical-support/vintage-...


Thanks for the confirmation! Maybe instead some day the 80386 is considered so old and historic that Intel makes some design documents available, but that practice does not seem terribly common...


The 286 was an RTL model hand-converted module by module to a transistor/gate-level schematic. AFAIR the 386 was the first Intel CPU where they fully used synthesis (work out of UC Berkeley https://vcresearch.berkeley.edu/faculty/alberto-sangiovanni-...) instead of manual routing. Everything went through logic optimizers (multi-level logic synthesis) and will most likely be unrecognizable.


That’s indeed interesting. Assuming detailed pictures, it sounds then that finding the original purpose might only be feasible if there are any very significant remnants of the register’s purpose… (And then it seems much harder, but not impossible, a bit akin to reading very optimized compiler generated machine code).


I found a paper on this: 'Coping with the Complexity of Microprocessor Design at Intel – A CAD History' https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22... There was another one dedicated to the 386, but I don't remember the title :(

On the other hand, there are insane people able to RE the Playstation CPU/GTE back from a standard cell array chip http://www.psxdev.net/forum/viewtopic.php?t=551&start=60


The first person to make this discovery (as far as we can tell) was Daniel B. Sedory aka The Starman, whose illustrated PC DOS 1.1 boot sector page is much nicer than anything I could put together.

I highly recommend his site too --- there's plenty of boot-sector analyses there, explanations of the PC boot process, and other low-level information presented in fast-loading, script-free ad-free HTML from someone whose goal isn't to make a quick $$$, but rather to share information freely. In other words, a great resource of the "document web".

As for this article, it also reminds me of "junk DNA" sequences, which could have similar relevance to biological historians.


If he's still updating the site, someone tell him that the mystery of the mystery bytes has been solved…

https://thestarman.pcministry.com/asm/mbr/mystery.htm

https://retrocomputing.stackexchange.com/a/14309/


Perhaps someone could help solve my 20-year-old mystery? When I was a kid, I briefly had access to a computer. My memory might be mistaken on the details, but I just can't find on Google what kind of computer it was. It had 5-inch floppies, and I am pretty sure that games on it only used four colors (cyan, magenta, white, black). I think it had a joystick similar to Atari's. It had the game Bubble Bobble.

Edit: Photos of Commodore computers look really close to what I remember, but not exactly. Also, Latvia (where I live) had just recently regained independence from the USSR, and it was a time when the market was flooded with clones of everything. E.g. instead of the NES, kids had famiclones like Zhiliton, UFO or, in my case, I think it was Dendy.

Edit: Floppy reader was built in, not external, as far as I remember.

Edit: Sorry, more like 30-year mystery. Somewhere around beginning of 90s.

Edit: I think it also had the game Alley Cat.

Edit: Thank you everyone for chiming in! This inspired me and gave extra keywords for further investigation and narrowing down the exact model / make. Gonna continue searching on Sunday.


The colors sound like normal PC CGA graphics, but the joystick sounds like maybe a PCjr (or compatible), which, IIRC, sometimes came with a joystick, since the machine itself had built-in joystick ports. The CGA graphics could be explained by the games being written for PC CGA graphics, and not the better, but obscure, PCjr graphics.

But… 20 years? That’s like, 2002. PCs were abundant then, and CGA graphics were already ancient, and use of joysticks in games was already relatively uncommon; mouse & keyboards were the norm.


You could click through the different platforms Bubble Bobble was released for on MobyGames to see which one looks familiar. The DOS version seems to have had a CGA 4-color mode but it wasn't cyan/magenta. Amstrad CPC looks like a candidate:

https://www.mobygames.com/game/cpc/bubble-bobble/screenshots


https://www.retro-exo.com/exodos.html has "eXoDOS/eXo/eXoDOS/Bubble Bobble (1988).zip" which you could try with DOSBox. Other commenters mentioned that the DOS screenshots use the red/green/yellow CGA palette but plenty of CGA games switched video modes for a tiny extra bit of variety.


On that note, does anyone know whether the original IBM TTL monitors supported "hardware emulation" of alternative palettes?

On a compatible Amptron(?) display we had, the brightness and contrast pots had an integrated switch. If you pulled on the brightness adjuster, it switched to a green-black monochrome palette, while the contrast control switched it to the red-green-yellow palette.


I just played it and it's pretty fun. If I force CGA mode, the Taito title screen is in white/cyan/magenta but the rest of the game (first two levels at least) is in yellow/green/red.
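The two colour sets seen above match the two stock palettes of CGA's 320x200 four-colour mode, which a program selects with bit 5 of the colour-select register at port 0x3D9. A small Python sketch of that lookup follows; the register layout is standard CGA, but `palette_for` and the two sample register values are assumptions for illustration, not something traced from the game itself.

```python
# The two stock palettes of CGA's 320x200 four-colour mode.
# Colour 0 is the selectable background (black by default); the third
# foreground colour of palette 0 shows as brown or yellow depending on
# the intensity setting.
CGA_PALETTES = {
    0: ("background", "green", "red", "brown/yellow"),
    1: ("background", "cyan", "magenta", "white"),
}

PALETTE_SELECT_BIT = 0x20  # bit 5 of the colour-select register (port 0x3D9)


def palette_for(color_select: int) -> tuple[str, ...]:
    """Return the palette chosen by a colour-select register value."""
    index = 1 if color_select & PALETTE_SELECT_BIT else 0
    return CGA_PALETTES[index]


# Hypothetical register values for the two screens described above.
title_screen = palette_for(0x20)
gameplay = palette_for(0x00)
print(title_screen[1:])  # ('cyan', 'magenta', 'white')
print(gameplay[1:])      # ('green', 'red', 'brown/yellow')
```

Because the switch is a single register write, a game can change palettes between screens without redrawing anything, which fits the title screen and gameplay using different colours.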



I think we need a little more info: rough date, location?

Some computers were very regional.


Alley Cat also ran on the PCjr: https://oldcomputers.net/ibm-pcjr.html


Could it have been a BBC Micro?

They look a little like the Commodore but have 8 colours, from memory, though…

Right time period.

Some had a built-in joystick, I think, from memory, but I can’t find any photos from a quick search.


That sounds like CGA. Could have been any IBM compatible, can you describe the shape of the case and whether or not the floppy drive(s) were built in or external?


FWIW, threads lock after a few days. So if you figure this out within the next day or two you should be able to update the thread. I'm interested :3



Is PC DOS the predecessor to MS-DOS, made strictly for IBM? How many versions were there and when did Microsoft change it?


Yes. PC DOS 7.1 was the last version, i.e. it continued development in parallel with MS-DOS. The whole history can be found here [1]. All versions including their amazing (they're really good) manuals are available on WinWorld [2].

1. https://en.wikipedia.org/wiki/IBM_PC_DOS

2. https://winworldpc.com/product/pc-dos/1x


Yeah, just to confirm that for v7.1, we got a full source license of it for Symantec Ghost because Microsoft had withdrawn permission for us to distribute anything at all to do with MS-DOS, and we had a lot of customers still using DOS boot disks. Prior to that I believe we had a distribution license for PC-DOS but around that point IBM no longer wanted to maintain it and we had small things we needed to do, so we got the source license and just distributed builds we made as part of the Ghost Boot Wizard disk builder.

What was more interesting from a software archaeology point of view was when I found Symantec had acquired the assets of the then-defunct Quarterdeck; so the source for QEMM and Desqview eventually turned up in the corporate Perforce server and that was certainly fun to read, having made all kinds of things for QEMM back in the day, like a replacement overlay manager for Turbo Pascal (and Turbo C++) that would run overlay code segments directly out of the EMS page frame - this meant pretty much any program we built with the Borland compilers could be turned into a TSR with only the data segment resident.

The one thing that I never did find, though, was the source for the Whitewater Group's Actor. When it was announced that Symantec had acquired those assets from Borland I tried as hard as I could to see if the Actor source code was included but no-one I was ever able to find knew about that. Actor 3.0 was my first introduction to the Smalltalk way of building software, and I still sometimes fire it up in a VM to play enough with its VM format.


Hi!

Massively predictable comment making noises about Symantec code-dumping QEMM, DESQview and DESQview/X :)

Besides being a lot of fun to read, the latter (which incidentally got some attention on here just a month ago - https://news.ycombinator.com/item?id=29396561) is a whole DOS-native X server. IIUC only Netware's commissioned port of XFree86 comes close to that, but that runs on top of Netware's kernel infrastructure so doesn't really count.

Naturally these sorts of situations sadly represent communicative chasms between geeks and management. Case studies in "it wasn't the end of the world" can perhaps be useful. My standard such example is the HP-20b and HP-30b calculators. On the surface they look just like other calculators HP made around the same time period (2008), but there happen to be a set of pins inside the battery compartment that allow reprogramming the non-bootloader-locked microcontroller inside. An SDK (https://www.hpmuseum.org/cgi-sys/cgiwrap/hpmuseum/archv018.c...) (consisting mostly of a keypad test demo) filled in the remaining pieces (how the keyboard matrix worked, and how to get code running on the ARM CPU, basically), and the result was the WP-34S firmware that reimplemented calculator functionality completely from scratch (the code samples in the SDK provided zero such functionality).

TL;DR, this kind of thing can definitely work and be constrained to a nice little obscure niche of very happy people with low/zero external side effects.

In this case, I wouldn't even mind if the code was released under a similar license to what D used to use (https://archive.ph/20161022202138/https://github.com/dlang/d...) before Symantec relinquished control of the project (https://forum.dlang.org/thread/oc8acc$1ei9$1@digitalmars.com) and it was made fully open-source (https://news.ycombinator.com/item?id=14060846). Microsoft licensing GW-BASIC (https://github.com/microsoft/GW-BASIC) and early versions of MS-DOS (https://github.com/microsoft/MS-DOS) under MIT is... fascinating but understandable, yet entirely unnecessary. "View only with no repercussions but not open source" works for historical interest just fine too.

A very good question is whether relinquishing D set adequate precedent, such that it would be viable for, say, one or two individual campaigners to achieve something similar internally for arbitrary, effectively abandonware software that is only of historical interest.

(If this turns out to not be possible because QEMM and/or DESQview are actively being supported in some way I think the whole retrocomputing scene would be very curious to hear about that through the grapevine!)

Thanks for the interesting info about PC-DOS. That's very cool!


There's a version table on Wikipedia, probably.

The history of DOS is actually really complicated, arguably even more so than UNIX, so apologies for the mess I'm about to throw upon you. Because, before we can even talk about DOS, we need to talk about two other technologies IBM wanted in their new PC: CP/M and BASIC. CP/M was the closest thing to a standard disk operating system in the microcomputer world, and BASIC was the JavaScript of the 1980s. IBM absolutely needed both if they wanted to sell computers.

Microsoft was involved with the IBM PC because IBM needed a BASIC interpreter. The OS would have come from Digital Research. But, there was a problem: IBM wanted to use Intel's 8088 chip. CP/M only worked on the incompatible[0], older 8080, and CP/M-86 had been delayed for so long that a company called Seattle Computer Products (SCP[1]) started writing their own 8086 DOS instead. They called it the Quick and Dirty Operating System (QDOS). Bill Gates got wind of IBM's OS problems, bought QDOS from SCP, and sold it to IBM so the PC could ship on time with an OS: "The IBM Personal Computer DOS", or "PC-DOS".

So, to answer your first question strictly; PC-DOS was not made for IBM. It was made for SCP's own S-100 machines first. However, in a sense, it was made for IBM, in that Microsoft made plenty of customizations for them and it contains drivers specific to IBM machines. Bill Gates stipulated that they'd retain ownership of the OS; but the intended licensing model was for other PC vendors to buy the source code and customize it to their own, incompatible architectures. This was the licensing model CP/M used[2]; such a machine would be DOS-compatible but not run any IBM software.

Later on, of course, Compaq was able to legally reimplement the IBM BIOS, creating an "IBM-compatible" machine; and then everyone else in the industry figured out how to do the same and drastically undercut IBM. So MS-DOS was eventually turned from an OEM licensable into a consumer software product that would work on a "standard PC". Digital Research also got into the consumer OS market with DR-DOS, which Microsoft contemplated killing through underhanded tactics[3] but was able to outcompete through better OEM relationships. There are also legally distinct reimplementations of MS-DOS itself, such as DIP-DOS, X-DOS, and later FreeDOS.

[0] Blame Federico Faggin jumping ship and taking the patents with him. Or the Intel manager who demanded he show up to work on time.

[1] Not a front for the SCP Foundation. Or, at least that's what THEY want you to know!

[2] This licensing model also persisted into early Windows; there are special OEM versions of Windows 1.0 that run on Zenith Data Systems machines. It's also the reason why OS/2 1.x comes in "IBM" and "Microsoft" flavors. Microsoft OS/2 was more of a devkit for OEMs who were expected to customize OS/2 in the same way that they were supposed to customize DOS. But it also shipped with technologies that IBM OS/2 didn't have, like LADDR - pluggable disk I/O drivers that would later be reworked into Windows 95.

I should also note that the "DOS-compatible" business model looks a lot like Android fragmentation if you squint at it a little. Remember how I said this was more complicated than UNIX?

[3] The idea was to have Windows 2 or 3 throw fake, non-fatal error messages at the user if it was running on DR-DOS.


Such an excellent breakdown of the events. Thanks!


It was IBM's OEM version of MS-DOS, more or less.

Random Google result: https://www.techrepublic.com/blog/classics-rock/my-dos-versi...


It sounds eerily like geneticists trying to decipher where a gene or a piece of "junk" DNA came from and what it does.



