Hacker News
MetalNES: Transistor Level NES Simulation (github.com/iaddis)
282 points by ingve on Feb 26, 2022 | 49 comments



MetalNES is by the author of NESticle. Like Visual2C02 and Visual2A03, presumably it runs at such a slow speed that games are unplayable. Like, it may have taken hours to reach the SMB menu screen, but I would love to be corrected :)



Awesome... so we just need a computer that's about 200x faster than what we have available today. So 20 years from now.


Maybe ReactOS will reach feature parity with Windows 7 by that point.


A GPU should be able to do digital logic simulation a lot faster, no?


And the next step after that to improve performance would be an FPGA, then an ASIC... ;)


Then maybe we should just simulate the behaviour of the system in code... wait a minute.


Now you're getting it!


For a while now, companies have been using GPUs for parallel digital logic simulation, for things like automated testing.


> MetalNES is by the author of NESticle

Thanks Shitman!


Absolutely sick. Chip community is so rad.


Reminds me of GateBoy, which was posted a while back: https://news.ycombinator.com/item?id=28396927, https://github.com/aappleby/MetroBoy

GateBoy runs surprisingly fast for a gate-level simulation.


What does it mean to be a transistor-level simulation in this context?

I used to work with circuit simulators to design circuits (discrete and integrated), and there are many transistor models (mathematical models) depending on the use case. In spirit, not too unlike DFT for the Schrödinger equation in quantum mechanics.


Most game emulators are looking at the instructions from a ROM file and then implementing those CPU instructions in code. Usually with those approaches there are inconsistencies due to the original game console's quirks (or bugs/flaws) that show up during gameplay, making the emulation not completely true to the original hardware.

In this context it means emulating "perfectly" by actually simulating the hardware in full instead of taking the aforementioned approach.
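
To make the contrast concrete, here is a minimal sketch of the instruction-level approach described above: fetch an opcode, then run hand-written code for it. The `run` function and its four-opcode subset are hypothetical and only for illustration; a real emulator also models memory, cycle counts, interrupts, and the rest of the instruction set.

```python
def run(program: bytes) -> dict:
    """Interpret a tiny, illustrative subset of 6502 machine code."""
    a = x = pc = 0          # accumulator, X register, program counter
    z = n = False           # zero and negative flags

    def set_nz(value: int) -> None:
        nonlocal z, n
        z = value == 0
        n = bool(value & 0x80)

    while pc < len(program):
        opcode = program[pc]
        pc += 1
        if opcode == 0xA9:      # LDA #imm: load the next byte into A
            a = program[pc]
            pc += 1
            set_nz(a)
        elif opcode == 0xAA:    # TAX: copy A into X
            x = a
            set_nz(x)
        elif opcode == 0xE8:    # INX: increment X, wrapping at 8 bits
            x = (x + 1) & 0xFF
            set_nz(x)
        elif opcode == 0x00:    # BRK: treated here simply as "stop"
            break
        else:
            raise NotImplementedError(f"opcode {opcode:#04x}")
    return {"a": a, "x": x, "z": z, "n": n}


# LDA #$FF; TAX; INX; BRK
state = run(bytes([0xA9, 0xFF, 0xAA, 0xE8, 0x00]))
```

Every behavior here exists only because someone wrote it down by hand, which is where the inconsistencies with real hardware tend to creep in. A hardware-level simulation gets the same behavior as a side effect of the circuit itself.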


So, in this context, transistor-level simulation does not mean actually simulating transistors (voltages, currents, doping, etc)?


I would guess "transistor-level simulation" means you simulate a NAND gate by doing something like:

bool c = !(a && b);
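
As a rough sketch of where that idea leads, larger blocks can be composed from the same one-line NAND. The `nand`, `xor`, and `half_adder` names are made up for this example, and this is gate-level boolean evaluation only; it says nothing about how MetalNES actually models its transistors.

```python
def nand(a: bool, b: bool) -> bool:
    """The single primitive: output is low only when both inputs are high."""
    return not (a and b)


def xor(a: bool, b: bool) -> bool:
    """XOR built from four NAND gates."""
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))


def half_adder(a: bool, b: bool) -> tuple:
    """Return (sum, carry) using only NAND gates."""
    t = nand(a, b)
    return xor(a, b), nand(t, t)   # nand(t, t) inverts t, giving a AND b


for a in (False, True):
    for b in (False, True):
        print(a, b, half_adder(a, b))
```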


Maybe saying 'gate-level' would be more accurate


I love this man's work, even if getting it to run at 100% speed might see me an old man; if not me, then my children.


Would be really cool to see a video of this in action.


I don't really pretend to understand much of what I'm seeing, but I've recorded a short video of it running and uploaded it to my server [0].

From what I can tell, this 3-minute video shows the process of generating 2 and a bit frames.

[0] http://topazgryphon.org/metalNES.mp4


go lil' electron beam, go!

9000 cycles/s / 1.79 MHz = 0.5% of realtime. The only time you ever see rendering a line at a time is single stepping through an emulator.

I wonder if this is how FPGA developers working in VHDL/Verilog see the world? Here's a short clip of the VRAM of a PSX rendering about half of a completed frame of Final Fantasy 7's battle screen: https://www.patreon.com/posts/37038491


~9000 cycles/sec (displayed in the video), wow! For comparison, the NES ran at 1.8 MHz, or about 200x faster. From the number of frames drawn and time, I'm guessing this simulator is actually averaging much lower than 9 kHz.
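
The arithmetic behind that guess can be sketched as follows. The assumptions are: the on-screen figure counts CPU cycles, the NTSC CPU clock is 1.789773 MHz with roughly 29,780.5 CPU cycles per video frame, and the video shows about 2 frames in 3 minutes.

```python
CPU_HZ = 1_789_773            # NTSC NES CPU clock, cycles per second
SIM_HZ = 9_000                # rate displayed in the video
CYCLES_PER_FRAME = 29_780.5   # approximate NTSC CPU cycles per video frame

# How much slower than real hardware the displayed rate is.
slowdown = CPU_HZ / SIM_HZ

# Time per frame if the simulator really sustained 9,000 cycles/s.
seconds_per_frame_at_9k = CYCLES_PER_FRAME / SIM_HZ

# Assumption: about 2 frames in 180 seconds, as reported for the video.
observed_seconds_per_frame = 180 / 2
effective_hz = CYCLES_PER_FRAME / observed_seconds_per_frame

print(f"slowdown at 9 kHz:    {slowdown:.0f}x")
print(f"frame time at 9 kHz:  {seconds_per_frame_at_9k:.1f} s")
print(f"effective rate:       {effective_hz:.0f} cycles/s")
```

Under those assumptions a sustained 9 kHz would finish a frame in a few seconds, so 90 seconds per frame implies an average rate of only a few hundred cycles per second.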


Super curious about your personal server setup, mostly how you're serving content directly to the open web?


I mean you can always use a server on digital ocean or something. That’s how I serve web sites. It’s still just a Linux machine and you can throw whatever content you want on it.


That it would. I don't have access to a Mac or time to get it building elsewhere, so a video would be great.


Niiiice.

-Austin, GateBoy guy.


is this allowed for speedruns


I think it'd be considered TAS if you report your time by frame count. Slowing down an emulator is a TAS.


Allowed? Maybe, maybe not.

Helpful? Extremely not.

I just ran this on my high-end M1 Max, and it rendered 2 frames in roughly 3 minutes.

Hardware-accurate emulation is, generally speaking, not fast emulation.


Hardware-accurate emulation in software is typically not fast. FPGAs however can achieve low-level emulation with a high degree of hardware accuracy:

https://github.com/MiSTer-devel/Main_MiSTer/wiki


I mean, aside from the question of whether fpgas can truly accurately simulate everything at the transistor level, the truth is that most fpga emulation cores for this sort of thing don't. They are, afaik, still mostly 'just' cycle-level accurate emulations built off the same reverse engineering that accurate software emulators are. They can do it more efficiently for sure, but that doesn't mean it's worth it for them to go much farther.

Like, for an extreme example, there's no world in which the in-development PSX MiSTer core, running on a DE10-Nano, is particularly more accurate than a software emulator. The chips involved have too many transistors to completely simulate. But this will be true to some degree for everything above some complexity threshold. Maybe the Atari 2600 core is perfectly accurate.

I wouldn't be surprised if the most accurate software emulators are more accurate than the most accurate fpga cores, especially for anything newer than the NES, just because that's where most of the original work involved happens.

This idea that FPGAs are inherently more accurate seems to be rooted in marketing FUD from a company called Analogue.


I’ll 100% give you the marketing spin, but FPGAs provide an element of accuracy with frame delay. If you plug in a controller and use a CRT, there are no more lost frames than on the original hardware.

That’s accuracy beyond what an emulator can deliver on top of our Tower of Babel of a software stack on Windows.


That's not because of the FPGA, though. You can also plug a CRT into a PC. You can even chase the beam if you're willing to go to a low enough level. People generally don't, but the FPGA's advantage here is born of a limitation: there's no OS (sort of) to get in your way, and why would you put one there if it's not helping you?

(I say sort of because there usually is a system that could be described as an OS running somewhere on the fpga board, to load the fpga bitstreams, if nothing else, but also sometimes to supplement the fpga's abilities)


You will find projects that have zero input lag with FPGA-based emulators right now. You will not find PC-based emulators with zero input lag at all. That this is the reality has no bearing on whether a PC could do it or not.


The 6502 unfortunately relies on some effects that can't be described in FPGA-synthesizable logic. You might get a big speedup over the extremely simplified SPICE-style simulation you would otherwise have to do, and that might still get you to real time, but it's not the slam dunk you might think.


Sounds interesting. Any citations where I can read more about those effects?


You can sort of divine them from how the undocumented instructions work.

https://www.pagetable.com/?p=39

It more or less falls under true classic high-Z buses with multiple drivers.

You have three options that I can see:

1) treat it as a SPICE simulation that aggressively optimizes to synthesizable logic in most places other than the problem parts.

2) hand-replace the problem parts with synthesizable logic, sort of like how Nvidia replaces graphics shaders with hand-written replacements in their drivers.

3) just throw the whole thing into a SPICE simulation.


I think the decrement relies on open-bus behavior: the 6502 uses it to cheat and set a value to FF in the same cycle, which is then sent through the same path as increment or add, effectively adding -1 to a value.
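
The "adding -1" part is plain 8-bit two's-complement arithmetic and can be checked directly. This sketch only verifies the arithmetic identity; it does not model the 6502's internal buses or confirm how the chip produces the FF value. `add8` is a name invented for the example.

```python
def add8(a: int, b: int) -> int:
    """8-bit adder: the carry out of bit 7 is discarded."""
    return (a + b) & 0xFF


# Adding 0xFF matches subtracting 1 for every possible 8-bit value,
# including the wrap from 0x00 to 0xFF.
matches = all(add8(v, 0xFF) == (v - 1) & 0xFF for v in range(256))
print(matches)
print(hex(add8(0x10, 0xFF)), hex(add8(0x00, 0xFF)))
```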


Out of curiosity, what effects are you talking about?


The internal buses are basically true high-Z buses with multiple drivers and latches hanging off of them.
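
A minimal sketch of what resolving such a bus looks like in software, under two loudly stated assumptions: an undriven bus holds its previous value (stored charge), and a driver pulling low wins any conflict, as is typical for NMOS-style logic. `resolve_bus` is a hypothetical helper, not code from MetalNES or any FPGA core.

```python
from typing import Optional, Sequence


def resolve_bus(drivers: Sequence[Optional[bool]], previous: bool) -> bool:
    """Resolve one bus line for one simulation step.

    Each driver is True (driving high), False (driving low),
    or None (high-Z, not driving at all).
    """
    active = [d for d in drivers if d is not None]
    if not active:
        return previous   # nobody drives: the line keeps its old value
    if False in active:
        return False      # assumption: a pull-down wins any conflict
    return True           # only high drivers are active


print(resolve_bus([None, None], previous=True))    # floating bus holds its value
print(resolve_bus([None, True], previous=False))   # a single driver sets the line
print(resolve_bus([True, False], previous=True))   # conflict resolved low
```

FPGA fabric generally wants exactly one driver per internal net, so a rule like this has to be rewritten as explicit selection logic plus a register for the "holds its value" case.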


That shouldn't be a problem to implement on an FPGA with a tiny bit of extra logic.


Totally agreed, but you can't just throw the netlist at an FPGA and expect the synthesizer to be happy, is all I'm saying.


Is it still really emulation at that stage? Or is it more like a reimplementation?


yes, they're called slowruns


A beautiful small codebase from which to run little experiments.


iirc this is what they said would never be possible with any computer, whatever the specs (at real life speed).


It won't run at real-time speeds. Way, way slower. You could put this on an FPGA though, if you port it to one. Then it would run in real time. I doubt it would run exactly as it is, given the limitations an FPGA imposes. The logic should be fine though.


Not to be picky here, but silicon is not a metal, but a metalloid. MetalloidNES sounds great in my opinion.


I would get a little kick out of playing Metroid on that.



