
Just using the I$ hit ratio is problematic in many ways. E.g.:

- You'll probably not find implementations of different ISAs with identical cache configurations (size, associativity etc.); at least the geometry itself is easy to inspect, though (see the sketch after this list).

- It says little about what work is actually done (different ISAs = insns do different amounts of work).

- On x86 all bets are off w.r.t. the effect of the uop cache on the L1I cache hit ratio, and the uop cache hit ratio can't be compared to that of any other machine.

- You need to reproduce the same program flow on different architectures to be able to compare the numbers.

...etc.
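On the first point, at least the actual geometry of each machine is easy to inspect, so you know exactly what you'd be comparing. On Linux it's exported through sysfs; a minimal sketch in C (which index directories exist varies by machine):

  /* Minimal sketch: dump cache level/type/size/associativity from
   * Linux sysfs for cpu0. Which indexN directories exist, and in
   * what order, varies by machine. */
  #include <stdio.h>

  int main(void)
  {
      const char *fields[] = { "level", "type", "size",
                               "ways_of_associativity" };
      char path[128], buf[64];

      for (int idx = 0; idx < 8; idx++) {
          for (int f = 0; f < 4; f++) {
              snprintf(path, sizeof(path),
                       "/sys/devices/system/cpu/cpu0/cache/index%d/%s",
                       idx, fields[f]);
              FILE *fp = fopen(path, "r");
              if (!fp) {
                  if (f == 0) return 0;  /* no more cache levels */
                  continue;
              }
              if (fgets(buf, sizeof(buf), fp))
                  printf("index%d %s: %s", idx, fields[f], buf);
              fclose(fp);
          }
      }
      return 0;
  }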

I think that the only reasonable way to do it is to have a multi-ISA simulator where you are in full control of all these aspects. And it would be really hard work.
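The cache model itself would be the easy part of such a simulator. A minimal sketch of a parameterized set-associative I-cache with LRU replacement, fed the fetch address of every simulated instruction (the geometry constants here are made up for illustration):

  /* Minimal sketch of a parameterized I-cache model for a multi-ISA
   * simulator: call fetch() with the address of every executed
   * instruction and it counts hits/misses. Illustrative geometry:
   * 32 KiB, 8-way, 64 B lines -> 64 sets. */
  #include <stdint.h>
  #include <stdio.h>

  #define SETS 64
  #define WAYS 8
  #define LINE_SHIFT 6

  static uint64_t tags[SETS][WAYS];
  static uint64_t lru[SETS][WAYS];   /* bigger = more recently used */
  static uint8_t  valid[SETS][WAYS];
  static uint64_t tick, hits, misses;

  static void fetch(uint64_t pc)
  {
      uint64_t line = pc >> LINE_SHIFT;
      unsigned set = (unsigned)(line % SETS);
      uint64_t tag = line / SETS;
      int victim = 0;

      for (int w = 0; w < WAYS; w++) {
          if (valid[set][w] && tags[set][w] == tag) {
              hits++; lru[set][w] = ++tick; return;
          }
          if (lru[set][w] < lru[set][victim]) victim = w;
      }
      misses++;
      valid[set][victim] = 1;
      tags[set][victim] = tag;
      lru[set][victim] = ++tick;
  }

  int main(void)
  {
      /* Toy fetch stream: a hot loop plus an occasional cold target. */
      for (int i = 0; i < 100000; i++) {
          fetch(0x1000 + (uint64_t)(i % 256) * 4);
          if (i % 64 == 0) fetch(0x80000);
      }
      printf("hits=%llu misses=%llu hit ratio=%.4f\n",
             (unsigned long long)hits, (unsigned long long)misses,
             (double)hits / (double)(hits + misses));
      return 0;
  }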




Re 2, the work per instruction doesn't matter if you compare the same program/program execution; in practice you will get an estimate of the resident set size over the amount of work.

All your other points do stand, and that's what I mean by 'is very machine dependent'. And yes, if you want to fully isolate the effect of instruction density an emulator might be the only solution. Still, I think that profiling counters can get you 90% there.
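E.g. on Linux you can read the generic L1I read-miss counter around a workload with perf_event_open; a minimal sketch (whether the generic cache event is actually wired up depends on the CPU and kernel):

  /* Minimal sketch: count L1 i-cache read misses for a region of
   * code via Linux perf_event_open. On some machines this generic
   * cache event is not implemented and the open will fail. */
  #include <linux/perf_event.h>
  #include <sys/syscall.h>
  #include <sys/ioctl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
      struct perf_event_attr attr;
      memset(&attr, 0, sizeof(attr));
      attr.size = sizeof(attr);
      attr.type = PERF_TYPE_HW_CACHE;
      attr.config = PERF_COUNT_HW_CACHE_L1I
                  | (PERF_COUNT_HW_CACHE_OP_READ << 8)
                  | (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
      attr.disabled = 1;
      attr.exclude_kernel = 1;

      /* Count for this thread, on any CPU. */
      int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
      if (fd < 0) { perror("perf_event_open"); return 1; }

      ioctl(fd, PERF_EVENT_IOC_RESET, 0);
      ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

      /* ... run the workload under test here ... */

      ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
      uint64_t misses = 0;
      if (read(fd, &misses, sizeof(misses)) != sizeof(misses))
          perror("read");
      printf("L1I read misses: %llu\n", (unsigned long long)misses);
      close(fd);
      return 0;
  }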


I just don't see how I would run profiling counters on a z15 machine for instance ;-)

With my limited resources this got me much more architecture coverage.

And as usual, the answers are in the data; it's merely a matter of what questions you think you are asking...


Yes, z15 is an issue :)

Still, an ARM vs. x86 comparison should be doable, and maybe even RISC-V.



