Yes, enjoyable and good ideas. I feel like I've been on that "project" before. You do a bunch of work, dig into things and then discover a simple answer that, in hindsight, could have been applied immediately and made all the investigation irrelevant.
Usually out of pride or the sunk cost fallacy (or something like it) I'll convince myself there was no other way the problem was going to be solved. Either way the next time around I spend just a little bit longer trying to think of an easy way out.
Although it is a historical (and disavowed) footnote, the Open Text Index launched in April 1995 and sported full-text indexing of web pages -- more than 6 months before AltaVista's launch in December of that year.
It did serve as a back-end search for Yahoo! though it never could keep up with the full query volume. It was lambasted by the creators of Google for experimenting with putting clearly-marked ads in search results. That transgression looks hilariously quaint to modern eyes.
The original arcade Mortal Kombat (and MK2, MK3 and UMK3) were all written in assembly for the TMS34010 processor. No doubt ports of it to the Sega Genesis did use 68K assembler.
The TMS34010 was something like a CPU and 2D graphics processor combined. It had the ability to change its word size to anywhere between 1 and 32 bits, with data addressable on bit boundaries. It's wild.
Like many games at the time the code in the ROMs had tamper protection. The basic idea is to detect if the ROM has been altered and if so assume it is a pirated copy and do something to cause the game to fail. Pirates could still make exact copies but they'd have to retain copyright notices and trademarks, which made them more vulnerable to detection and prosecution.
At the last minute Dave changed the position of some logos on screen but forgot to update the copy protection. When an illegal copy was detected, the game would use some of the digits of the end-of-game score to zap bytes in memory, but only if the overall score fell within some range. Scrambling that only triggers semi-randomly is better than copy protection that fires deterministically. Imagine if the game did a checksum of its ROM and immediately said "Illegal Copy Detected - Program Halted". That would make it easy to find the copy protection routines and disable them.
So the first version that shipped thought it was an illegal copy and triggered the scrambling. It so happened that sometimes the scrambling would give the player 40 credits.
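For the curious, here's a minimal C sketch of that kind of semi-random trigger. Every name, address and threshold here is invented for illustration -- it is not the actual MK routine, just the shape of the idea:

    #include <stdint.h>

    extern uint8_t game_ram[0x10000];      /* assumed RAM image     */
    extern uint32_t rom_checksum(void);    /* assumed tamper check  */
    #define EXPECTED_SUM 0xDEADBEEFu       /* placeholder constant  */

    void end_of_game_check(uint32_t score)
    {
        if (rom_checksum() == EXPECTED_SUM)
            return;                        /* ROM looks untampered  */

        /* Only fire for some scores, so failures look like random
           flakiness rather than an obvious protection routine. */
        if (score < 10000 || score > 500000)
            return;

        /* Walk the digits of the score, using each one to pick a
           byte to corrupt. */
        for (uint32_t s = score; s != 0; s /= 10) {
            uint16_t addr = (uint16_t)((s * 2654435761u) >> 16); /* scramble */
            game_ram[addr] ^= (uint8_t)(s % 10 + 1);             /* zap      */
        }
    }

Because the corruption lands wherever the score digits point, a given playthrough might crash, glitch graphics, or -- as happened here -- hand out free credits.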
I imagine the original labels would have been in English. The assembly source code I've seen for Japanese games has variables and labels in English with Japanese comments in Shift-JIS. I would guess the choice was forced because the assembler, linker, debugger or other tools did not support Shift-JIS properly. Often labels are restricted to 6 bytes, which at Shift-JIS's 2 bytes per character would be just 3 Japanese characters. Perhaps such a limit was also a factor.
This hybrid approach was used for a series of arcade games on XArcade for the Xbox 360: Frogger, Gauntlet, Joust, Gyruss and a bunch of others. I think we referred to it as "reskinning".
Mostly it was applying higher-resolution tiles and sprites but some had particle effects added. For instance, some exhaust smoke on the cars in Frogger.
My favorite was Gyruss, which is somewhat like Space Invaders or Galaxian but with a third-person view from behind the player's ship, so you see enemies come toward you in pseudo-3D. We found that the game internally used polar coordinates and had a fair bit of code that converted those to 2D screen positions and chose the right-sized sprites. We replaced all that with 3D models rendered at the same spot in space.
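To give a feel for it, here's a rough C sketch of the kind of polar-to-screen mapping a game like that needs. The function name, screen dimensions and perspective formula are all my own guesses, not Gyruss's actual code -- the point is that once you know each enemy's (angle, depth), you can just as easily place a 3D model there as pick a sprite size:

    #include <math.h>

    typedef struct { float x, y, scale; } ScreenPos;

    ScreenPos project_enemy(float angle, float depth)
    {
        const float cx = 160.0f, cy = 120.0f;  /* assumed screen centre  */
        const float ring = 140.0f;             /* ring radius at depth 0 */

        float r = ring / (1.0f + depth);       /* crude perspective      */
        ScreenPos p;
        p.x = cx + r * cosf(angle);
        p.y = cy + r * sinf(angle);
        p.scale = 1.0f / (1.0f + depth);       /* picks the sprite size  */
        return p;
    }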
I view these techniques along a continuum where you're essentially changing the porting layer. An emulation puts that layer at the hardware. A hybrid approach pushes it some ways into the game. A source code port is above the binary but some ways into the code. A remake is at the "game design" level.
Do you have a more direct link? That one just goes to Noah's home page and for the life of me I couldn't navigate to the ultra-fast math page nor find it via Google.
Yes, for better or worse ugly is definitely a 2nd-order predictor of success.
I guess "ugly" is a fair assessment of the 8086 at the time. Certainly the 68000 was a much cleaner and orthogonal architecture. On the other hand, I'd rate the 8086 at least as good as if not better than other contemporary microprocessors such as the Z-80 and 6502. The 6809 was sweet but a 16 bit address space rooted it in the previous direction and the 68000 make it clear that the 6809 wasn't in Motorola's future plans. Sure, had IBM chosen the 6809 there surely would have been a compatible follow-on but I can't imagine even the stanchest IBMer to have that kind of hubris.
But calling MS-DOS ugly at the time would have been unfair. It was as capable as any other OS in the home computer space. It was widely proclaimed to be a rip-off of CP/M, so we might take that as a compliment, and if you look at TRS-DOS, Apple DOS or whatever the PET used, it was just fine. It'd be unrealistic to suggest any mini or mainframe OS was an option, and Unix just wasn't there yet. If IBM had given Microsoft more lead time they might have gone with Xenix, which they did have out in 1981 for the Z-8001. I'm not so sure IBM would have been interested, though, as they wouldn't have had an exclusive license to the OS.
Not to mention that the overhead of the operating system was an important consideration. The machines didn't have much capacity to waste, and whatever you picked still had to perform well on a system with only floppy drives. Maybe that in itself doesn't rule out Unix but it sure cramps the design space.
In both cases, CPU and OS, the ugliness really took off with backward compatibility to maintain. The 80286 was already being designed so it drove that deeper into the weeds and there was no way of bypassing MS-DOS compatibility once it anchored the marketplace. The only way forward was to improve the OS while keeping MS-DOS programs running and the whole OS/2 debacle only helped to delay that upward path.
I mean, fair enough to say "ugly won" but some consideration should be given to the lay of the land when these long-term trends were set in motion.
You could have memory accesses at any bit size from 1 to 32, with both unsigned and sign-extended register loads. In fact, I think you got to pick two sizes, and for most normal operation you'd choose 16 and 32 bits. There were some load/store operations that always operated at 8 bits, so you'd get the usual complement of data sizes. Pretty useful feature for something that was intended to be a graphics processor. I've also seen the different bit sizes used for fast and simple Huffman decoding in gzip decompression.
The auto-increment/decrement addressing modes were aware of the bit size. Thus you could have polymorphic subroutines where the same code could sum an array of bytes, words, 13-bit quantities or whatever size you wanted (up to 32 bits).
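Here's a plain-C approximation of what that gave you, with helper names of my own invention (the 34010 did this in silicon via its configurable field sizes, if I recall correctly):

    #include <stddef.h>
    #include <stdint.h>

    /* Extract a 'width'-bit field starting at absolute bit address
       'bitaddr' -- a software stand-in for the hardware's
       bit-granular loads. */
    static uint32_t load_field(const uint8_t *mem, size_t bitaddr,
                               unsigned width)
    {
        uint32_t v = 0;
        for (unsigned i = 0; i < width; i++) {
            size_t b = bitaddr + i;
            v |= (uint32_t)((mem[b >> 3] >> (b & 7)) & 1u) << i;
        }
        return v;
    }

    /* Polymorphic sum: the same loop handles 8-, 13- or 32-bit
       elements because the "auto-increment" is just bitaddr += width. */
    uint64_t sum_array(const uint8_t *mem, size_t bitaddr,
                       unsigned width, size_t count)
    {
        uint64_t sum = 0;
        while (count--) {
            sum += load_field(mem, bitaddr, width);
            bitaddr += width;   /* step by the element's bit size */
        }
        return sum;
    }

Calling sum_array(mem, base, 13, n) versus sum_array(mem, base, 8, n) runs the exact same code path; only the step size changes.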
Since the processor had built-in circuitry to help drive graphics displays, it took a high input frequency of something like 40 or 50 MHz and divided that down to run the processor at around 10 MHz or so. The opposite of what we're used to now, and it made the darn things look scary (well, scarier) from an emulation programmer's perspective.
The memory interface was designed to work with shift-register VRAM. The idea there being that the VRAM chips had a built-in shift register 512 pixels wide. The display circuitry would make use of it by having each line from the frame buffer dumped into the shift register as the raster was moving down the screen. And then the shift register would clock out the pixels as the raster moved across each line.
During VBLANK (when the video circuitry is waiting for the raster to return to the top of the screen) you could use special instructions to load and store the shift register which would be as fast as any load/store operation. The entire frame buffer could be filled very quickly with any repetitive pattern or erased entirely by copying some fixed line to all the others.
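In ordinary C the trick looks something like this (dimensions and names assumed; on the real hardware the per-line copy was a single shift-register transfer rather than a memcpy):

    #include <stdint.h>
    #include <string.h>

    enum { WIDTH = 512, HEIGHT = 288 };   /* assumed frame buffer size */

    void fast_clear(uint8_t fb[HEIGHT][WIDTH], uint8_t pattern)
    {
        memset(fb[0], pattern, WIDTH);    /* fill the first line       */
        for (int y = 1; y < HEIGHT; y++)
            memcpy(fb[y], fb[0], WIDTH);  /* replicate it downward     */
    }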
Yes, and then the TMS34020 came out, which had a trapezoid-fill instruction. Essentially you would partition your polygon into trapezoids; for each trapezoid you would load the appropriate registers ... the slopes of the edges used 16.16 fixed-point numbers. I remember how much you could do with fixed-point arithmetic. Very few programmers even know how to program with fixed point anymore. A very useful lost art.
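For anyone who hasn't seen it, here's roughly what such a fill looks like in software, stepping the edges with 16.16 fixed-point slopes. The parameter layout and names are my invention, not the 34020's actual register interface:

    #include <stdint.h>

    typedef int32_t fix16;              /* 16.16 fixed point */
    #define FLOOR16(x) ((x) >> 16)     /* integer part      */

    void fill_trapezoid(uint8_t *fb, int pitch, uint8_t colour,
                        int y_top, int y_bot,
                        fix16 x_left, fix16 x_right,  /* edge x at y_top */
                        fix16 dxl, fix16 dxr)         /* dx per scanline */
    {
        for (int y = y_top; y < y_bot; y++) {
            /* one horizontal span between the two edges */
            for (int x = FLOOR16(x_left); x < FLOOR16(x_right); x++)
                fb[y * pitch + x] = colour;
            x_left  += dxl;     /* walk both edges down one scanline */
            x_right += dxr;
        }
    }

No floats anywhere: the fractional edge positions live in the low 16 bits and just accumulate as you descend.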
There was also a built-in Bresenham line-drawing instruction.
Then there was the TMS34082 floating point coprocessor. I essentially wrote a primitive OpenGL'ish pipeline API. That was fun.
Wow... I really wish a general-purpose processor like this had made it into the PC. The best stuff doesn't always win in the marketplace.