Recovering Fossil Data on IBM 8” Floppies [video] (youtube.com)
72 points by zdw on Feb 25, 2020 | 25 comments



I retrieved my files off of old PDP-11 8" floppies by contacting a friend of mine at cheshireeng.com who had an old 11 in a closet. He didn't know if it still worked, but it did (yay for DEC quality!), and was able to transfer the disk contents out via serial line. He was able to retrieve files and images from all my disks, didn't lose a bit (yay for DEC floppy quality!). This was after 30 years of sitting in a box.

Anyhow, I was able to retrieve my original 11 version of Empire this way:

https://github.com/DigitalMars/Empire-for-PDP-11

DEC made good machines. Not one of my machines from the 80s or 90s would power up, though I stored them in working condition in warm and dry places. But the 30 year old DEC worked great.


The typical thing that goes wrong with computers from the 80's and 90's is the electrolytic capacitors dry out and go bad. This got even worse with the "capacitor plague" of the late 90's/2000's.

I would guess the PDP-11 has fewer capacitors, or capacitors of a different design.


Higher quality caps


My vintage 1981 Carver amp still runs all day every day.


I notice Wikipedia states you wrote Empire for the PDP-10. Are there two versions or is Wikipedia wrong?


I love his videos. He has a webpage with merch if you want to support him: https://www.curiousmarc.com/

If you are new to his channel I cannot recommend enough the Apollo AGC videos https://www.youtube.com/watch?v=2KSahAoOLdU&list=PL-_93BVApb...


Somewhat related, I have a couple of tapes from the 80s that contained software for the ZX Spectrum (mostly my first attempts at writing code). They were written in a custom format by a custom device I no longer have, and of which I have very little information (I managed to track down one of the engineers who designed the thing some 30 years ago).

I have raw audio files of these tapes. I have managed to convert some other tapes in standard ZX Spectrum format to readable files I can load in an emulator. However, for the special tapes, there's no tooling available - all I have is a waveform.

If I had an array of bits, I could start trying to figure out the format of this thing. However, I have no idea how to go from a raw waveform to the zeros and ones it encodes. My best idea so far is to write a small program that looks for zero crossings, and depending on the timing output zeros or ones, but I suspect there might exist some software that does this already? I have next to no knowledge of signal processing.

Any suggestions on how to go about this?
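The zero-crossing idea described above could be sketched roughly like this. It is only a starting point under stated assumptions: the function names and the `threshold` parameter are made up for illustration, and it assumes each pulse's length encodes a bit, which may not be how the custom format works (many tape schemes use two pulses per bit, or frequency shifts).

```python
import numpy as np

def pulse_lengths(samples):
    """Return the number of samples between successive zero crossings."""
    x = np.asarray(samples, dtype=float)
    x = x - x.mean()                      # remove any DC offset
    positive = x >= 0
    # indices where the sign flips between neighbouring samples
    crossings = np.flatnonzero(positive[1:] != positive[:-1]) + 1
    return np.diff(crossings)

def classify_pulses(lengths, threshold):
    """Map each pulse to 0 (short) or 1 (long); threshold is in samples.

    ASSUMPTION: short pulse = 0, long pulse = 1. The real format may differ.
    """
    return (np.asarray(lengths) > threshold).astype(np.uint8)
```

A histogram of the values returned by `pulse_lengths` is usually the first thing worth looking at: if it shows two clear clusters, the gap between them is a reasonable place for `threshold`.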


Your typical tape from that era was simple FSK, 256 byte blocks + a checksum. You may want to start with trying that and simple variations on it. It shouldn't be too hard to figure out the FSK frequencies from the audio using a good scope.
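Once some bytes have been recovered, the "blocks plus a checksum" guess above can be tested mechanically. A minimal sketch, assuming one checksum byte directly follows each 256-byte block and that the checksum is either an 8-bit sum or an 8-bit XOR; both the layout and the two algorithms are guesses to try, not known facts about this format, and the function names are hypothetical.

```python
from functools import reduce

def candidate_checksums(block):
    """Compute a few simple 8-bit checksums over a block of bytes."""
    return {
        "sum8": sum(block) & 0xFF,                         # sum modulo 256
        "xor8": reduce(lambda a, b: a ^ b, block, 0),      # XOR of all bytes
    }

def check_blocks(data, block_size=256):
    """For each block + trailing byte, list which candidate checksums match.

    ASSUMPTION: layout is [block_size data bytes][1 checksum byte], repeated.
    """
    results = []
    step = block_size + 1
    for start in range(0, len(data) - step + 1, step):
        block = data[start:start + block_size]
        stored = data[start + block_size]
        matches = [name for name, value in candidate_checksums(block).items()
                   if value == stored]
        results.append(matches)
    return results
```

If the same candidate matches on nearly every block, the framing guess is probably right; if nothing matches, varying `block_size` or the start offset is the next thing to try.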


I'd recommend asking people at https://www.worldofspectrum.org/forums/. This forum has been around since the 90s (iirc). I'd expect the kind of detailed knowledge you're after to be present there.


Interesting question you posed in your deleted comment. Unlikely to get traction in this forum, but interesting nonetheless.


Now I'm curious.


If you just want a quick answer to basic questions about the encoding, such as where to put the zero/one threshold, the bits per second, or how bits are encoded, you can open the file in Audacity and zoom in on the waveform.


Start with this for a technical backgrounder.

http://www.worldofspectrum.org/tapsamp.html

What you have is probably some variant of FSK, so if you view a spectrum (in Audacity, or similar) you should be able to pick out the frequencies being used.
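Picking the FSK frequencies out of a spectrum can also be done in a few lines of NumPy rather than by eye. A rough sketch, assuming the signal really is dominated by two tones; the function name, `count` and `min_separation_hz` are illustrative choices, not part of any existing tool.

```python
import numpy as np

def dominant_frequencies(samples, sample_rate, count=2, min_separation_hz=100.0):
    """Return the `count` strongest, well-separated frequencies in Hz."""
    x = np.asarray(samples, dtype=float)
    x = x - x.mean()                                   # drop DC offset
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    picked = []
    for idx in np.argsort(spectrum)[::-1]:             # strongest bins first
        f = float(freqs[idx])
        # skip bins that belong to a peak we have already picked
        if all(abs(f - p) >= min_separation_hz for p in picked):
            picked.append(f)
        if len(picked) == count:
            break
    return sorted(picked)
```

Running this on a short stretch of the data portion of the tape (not the leader tone) should give a better picture than running it on the whole recording.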

Decoding the block structure is going to be harder, and it will also depend on the data representation. The data is probably tokenised. If it's standard ZX BASIC it should be easy-ish to decode, but if it's a custom format it's going to be much harder.


A .wav file is pretty simple, just an array of amplitudes indexed by time quanta. See if your audio files can be converted to .wav files.

Then, read the amplitudes into a C array (or a D array, even better!) and they're easy to do simple processing on.

I wrote a program a few years ago that would do this, looking for things that looked like clicks and smoothing them with a cubic interpolation. It was fun to dink around with it.
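For anyone who would rather stay in Python than C or D, reading the amplitudes into an array can be sketched with the standard `wave` module plus NumPy. This assumes 16-bit PCM, which is what most tools export by default; `read_wav_mono` is a made-up helper name.

```python
import wave
import numpy as np

def read_wav_mono(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    # 16-bit WAV samples are little-endian signed integers
    samples = np.frombuffer(raw, dtype="<i2")
    # keep only the first channel if the recording is stereo
    return rate, samples[::channels]
```

The returned array is indexed by sample number, so sample `i` corresponds to time `i / rate` seconds.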


> Any suggestions on how to go about this?

No audio domain specific related experience, so there may be something better for this, but, failing that, for format exploration at least, I'd think numpy/scipy+jupyter would be great for interactively mucking around with the bits/bytes e.g.

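above = x > (x.max() + x.min()) / 2

(roughly, am a bit rusty at the moment)

basically gives you a boolean array that is True wherever the signal sits above its midpoint; the approximate zero crossings are the positions where it flips (np.flatnonzero(above[1:] != above[:-1])), and so on.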

similarly, you could subslice on range boundaries to ignore imagined 'marker' bits, index this through an ASCII table & display results, etc.

if you're not up with numpy array syntax / dtypes it takes a bit of getting used to, but well worth the effort IMHO in terms of the overall data exploration skills gained
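The "subslice and index through an ASCII table" step might look something like the following once a candidate bit array exists. It is a sketch for exploration only: `bits_to_bytes`, `printable`, the `offset` parameter and the bit-order switch are all illustrative, and nothing here is known about how the real format frames its bytes.

```python
import numpy as np

def bits_to_bytes(bits, offset=0, msb_first=True):
    """Pack a 0/1 array into bytes, skipping `offset` leading bits."""
    b = np.asarray(bits, dtype=np.uint8)[offset:]
    b = b[: len(b) // 8 * 8]                 # drop a trailing partial byte
    order = "big" if msb_first else "little"
    return np.packbits(b, bitorder=order).tobytes()

def printable(data):
    """Render bytes as text, replacing non-printable ASCII with dots."""
    return "".join(chr(c) if 32 <= c < 127 else "." for c in data)
```

Trying all eight values of `offset` with both bit orders and eyeballing `printable(...)` for recognisable strings is a cheap way to find byte alignment.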


They should try Dave Dunfield's ImageDisk:

http://www.classiccmp.org/dunfield/img/index.htm

This is used by bitsavers to preserve old data from 8 inch floppy disks. I was able to write an emulator for a Motorola 6800 "Exorciser" that could boot disks saved this way.


I don't think it handles IBM formatted disks. You can use it if you have an FDC that does some handling for you. Problem is that those disks are practically 'punchcards-on-disk' for mainframes, not PC-based at all. Oddly enough, it does know about paper tape.

Then again, OmniDisk can't autodetect it initially either, so perhaps the whole concept of reading mainframe formatted disks and mainframe encoded data on a non-mainframe system was rather problematic anyway.


It does; IBM format is the most basic floppy standard (same for ASCII and EBCDIC). I watched the video again: they did use ImageDisk for the bulk of the transfer, and you can see it flash up a few times (13:46 is one point). It's the mostly blue screen with the red bar on the top.


Re-watching the video and re-reading the comments, this makes sense. I probably was too focused on the subtitles I had on.

It is pretty interesting that while older formats can be somewhat obscure these days, because the format was much simpler it can be 'understood' by one person much more easily.


Part 1 for a more detailed introduction to the situation: https://www.youtube.com/watch?v=MPOYHQTMnf8


If you haven't seen this already, CuriousMarc also did this awesome series last year on the restoration of an original AGC (Apollo Guidance Computer).

https://www.youtube.com/watch?v=2KSahAoOLdU&list=PL-_93BVApb...

And then there are real treats such as the restoration of a Teletype 33 ASR:

https://www.youtube.com/watch?v=QzfjT1mCRww&list=PL-_93BVApb...

Or them trying to get Fortran to compile on an IBM 1401 Mainframe dated 1959. https://www.youtube.com/watch?v=uFQ3sajIdaM


This skewers the notion that old programmers were honed and refined by their resource constraints. Whoever wrote these files was using a slow, expensive physical format and wasting virtually all of it on padding.


All data was typed in by hand, and the entire dataset fit into a few boxes. Resources were never an issue, so why optimize for them?


I think there's a practical difference between a folder full of data and a wheelbarrow full of data.


About six years ago I found several 8" floppies for the 1970s Ohio Scientific system. These had been in storage since the late 1980s and contained games and utilities. One disk label was dated 1/31/1979.

I sent them to the author of the OSI emulator and I believe all but one of them were fully readable and dumped for emulation.

http://osi.marks-lab.com/



