Hah, I had the same thought. What kind of hacks can I do to convince the process...

djmips · 2024-07-05T15:58:51.000000Z

Back in the nineties, 3DFX had a synthetic rendering benchmark that relied on keeping the entire benchmark in L1, but the secret was taking over the entire machine so that no interrupt or other mechanism could pollute the cache.

dhosek · 2024-07-04T02:36:49.000000Z

A bit of ignorance on my part, but would the L1 be holding the data and the instructions? In which case we would be trying to fit our entire 6502 emulator in less than 64K of memory alongside the emulated RAM/ROM?

Someone · 2024-07-04T07:19:17.000000Z

https://en.wikipedia.org/wiki/Apple_M1#CPU:

“The high-performance cores have an unusually large 192 KB of L1 instruction cache and 128 KB of L1 data cache and share a 12 MB L2 cache; the energy-efficient cores have a 128 KB L1 instruction cache, 64 KB L1 data cache, and a shared 4 MB L2 cache.”

⇒ chances are you don’t even have to try to fit an emulator of an 8-bit machine and its memory into the L1 cache.
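A rough back-of-the-envelope check of that claim, as a sketch rather than a measurement. The cache sizes are the ones quoted above; the 64 KiB figure follows from the 6502's 16-bit address bus. `ASSUMED_EMULATOR_CODE` is a made-up number for illustration, and the check only compares capacities: it ignores associativity conflicts, the emulator's own tables and state, and anything else the OS runs on that core.

```python
KIB = 1024

# Per-core L1 sizes as quoted from the Wikipedia article above.
L1 = {
    "performance": {"instruction": 192 * KIB, "data": 128 * KIB},
    "efficiency": {"instruction": 128 * KIB, "data": 64 * KIB},
}

# A 6502 has a 16-bit address bus, so emulated RAM + ROM is at most 64 KiB.
EMULATED_MEMORY = 2 ** 16

# Hypothetical size of the emulator's hot code path (an assumption, not measured).
ASSUMED_EMULATOR_CODE = 32 * KIB


def fits(core, code_bytes=ASSUMED_EMULATOR_CODE, data_bytes=EMULATED_MEMORY):
    """Capacity-only check: code against L1 instruction, data against L1 data."""
    cache = L1[core]
    return code_bytes <= cache["instruction"] and data_bytes <= cache["data"]


def data_headroom(core, data_bytes=EMULATED_MEMORY):
    """Bytes of L1 data cache left over after the emulated address space."""
    return L1[core]["data"] - data_bytes


if __name__ == "__main__":
    for core in L1:
        print(core, fits(core), data_headroom(core) // KIB, "KiB data headroom")
```

On these numbers the performance core has 64 KiB of data cache to spare, while the efficiency core's 64 KiB data cache is exactly filled by the emulated address space, leaving nothing for the emulator's own data.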

filleduchaos · 2024-07-04T09:14:15.000000Z

I think you would very much have to try to fit a complete emulator of, say, the Game Boy into 128 + 64KB.

There's plenty of behaviour that is self-evident on real silicon but verbose and tricky to express in software.

throwaway2037 · 2024-07-05T04:22:48.000000Z

Real question about L1 caches. For a long time, x86 (Intel & AMD) L1 caches have been pretty much pegged at 32KB. Do you know why they didn't make them larger? My guess: There is a trade-off between economics and performance.

account42 · 2024-07-05T12:02:55.000000Z

There is a trade-off between cache size and latency.

throwaway2037 · 2024-07-05T16:12:39.000000Z

Ok, so why do the new Mx chips from Apple have an L1 cache size greater than 32KB? Did they solve some long-standing design issue?

cmrdporcupine · 2024-07-04T04:05:50.000000Z

The CPU decides what goes in there and when. You can only pray and offer sacrifices and guess at when and how.

pjc50 · 2024-07-04T09:15:41.000000Z

Depends on the precise architecture, but ARM (and other RISC designs) usually have separate data and instruction L1 caches. You may need to be aware of this if writing self-modifying code, because cache coherence may not be automatic.