I have a lot of questions myself that I hope to get clarified on August 5th, but I think the goals of Unicorn Engine and QEMU are quite different.
From the sample snippet, it looks like Unicorn Engine provides an interface that helps with reverse engineering tasks and with building tools that need to translate and execute individual functions or blocks of instructions from another architecture (it reminds me of the disassembly framework 'Capstone Engine' in that regard). QEMU's goal, on the other hand, is to emulate an entire system, including various audio/video hardware (basically, a virtual machine).
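To make that comparison concrete, here is roughly what "CPU emulation as a library" usage would look like. Since nothing has been released yet, the function names below (uc_open, uc_mem_map, uc_emu_start, ...) are my guess at a C binding based on the sample snippet, not a confirmed API:

```c
/* Hypothetical sketch: map a buffer, write two x86 instructions into it,
   set registers, run, and read the registers back. Names are assumed. */
#include <unicorn/unicorn.h>
#include <stdio.h>

#define BASE 0x1000000

int main(void) {
    const uint8_t code[] = { 0x41, 0x4a };   /* INC ecx; DEC edx (32-bit) */
    uc_engine *uc;
    int ecx = 0x1234, edx = 0x7890;

    if (uc_open(UC_ARCH_X86, UC_MODE_32, &uc) != UC_ERR_OK)
        return 1;

    uc_mem_map(uc, BASE, 2 * 1024 * 1024, UC_PROT_ALL);   /* 2 MB guest RAM */
    uc_mem_write(uc, BASE, code, sizeof(code));

    uc_reg_write(uc, UC_X86_REG_ECX, &ecx);
    uc_reg_write(uc, UC_X86_REG_EDX, &edx);

    uc_emu_start(uc, BASE, BASE + sizeof(code), 0, 0);     /* run both insns */

    uc_reg_read(uc, UC_X86_REG_ECX, &ecx);
    uc_reg_read(uc, UC_X86_REG_EDX, &edx);
    printf("ecx=0x%x edx=0x%x\n", ecx, edx);

    uc_close(uc);
    return 0;
}
```

The point is that, unlike with QEMU, nothing here involves devices, firmware or a boot process: you hand the engine raw instructions and guest memory and get modified register/memory state back.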
Could it be used as the CPU emulation component of a compatibility layer? For instance, if someone were to create a WINE-like Classic Mac OS compatibility layer, could Unicorn serve as the PowerPC emulation portion? Or is the intended use case something entirely different?
Regarding performance: an important question is how the target machine code is being emulated. From the scarce information we have so far, I think they are using an interpreter, which would be too slow for most applications. With a JIT (or even AOT) compiler this scenario would be more realistic.
Then there is the question of what Unicorn Engine does at user level with instructions like syscall on x86_64 or sc on ppc64. I see no references to any of this on the site. If your target is "high-level emulating" the underlying operating system (like WINE does), being able to register native handlers for such instructions is a must.
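To illustrate what I mean by native handlers: ideally you could register a callback that fires whenever the guest executes syscall and dispatch to host code yourself. The hook interface below (UC_HOOK_INSN on UC_X86_INS_SYSCALL) is again my guess at what such an API could look like; nothing on the site documents it:

```c
/* Hypothetical sketch: intercept x86_64 `syscall` and handle it natively.
   The hook API shown here is assumed, not documented anywhere yet. */
#include <unicorn/unicorn.h>

static void handle_syscall(uc_engine *uc, void *user_data) {
    uint64_t rax, rdi;
    uc_reg_read(uc, UC_X86_REG_RAX, &rax);   /* syscall number */
    uc_reg_read(uc, UC_X86_REG_RDI, &rdi);   /* first argument */

    if (rax == 60) {                         /* exit(2) on Linux x86_64 */
        uc_emu_stop(uc);
        return;
    }
    /* ...translate other syscalls into equivalent host behaviour... */
}

static void install_syscall_hook(uc_engine *uc) {
    uc_hook hh;
    /* begin > end: hook the instruction everywhere in guest memory */
    uc_hook_add(uc, &hh, UC_HOOK_INSN, handle_syscall, NULL,
                1, 0, UC_X86_INS_SYSCALL);
}
```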
If both conditions are satisfied, then it should be doable for most applications: load the binary, replace calls to dynamically linked system libraries with native implementations that match the specifications (on undocumented systems this means lots of reverse engineering), and ensure that any syscall/sc/etc. instruction results in the behaviour the application expects. From there you could take care of the rest yourself: multiple threads could be handled by multiple host threads, each running its own Unicorn Engine instance with separate CPU register state but sharing the entire virtual address space (sketched below). The target application's stdin/stdout could be redirected to the host emulator's stdin/stdout. And so on... :-)
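For the shared-address-space idea, the engine would have to let the caller map host-owned memory into the guest, so that two instances (= two emulated threads) alias the same bytes while keeping their own register files. The uc_mem_map_ptr-style call below is purely hypothetical on my part:

```c
/* Hypothetical sketch: two engine instances sharing one host-allocated
   guest image, each with independent register state. */
#include <unicorn/unicorn.h>
#include <stdlib.h>

#define IMAGE_BASE 0x400000
#define IMAGE_SIZE (4 * 1024 * 1024)

int main(void) {
    /* Page-aligned host buffer standing in for the guest address space. */
    void *image = aligned_alloc(4096, IMAGE_SIZE);
    uc_engine *thread_a, *thread_b;

    uc_open(UC_ARCH_X86, UC_MODE_64, &thread_a);
    uc_open(UC_ARCH_X86, UC_MODE_64, &thread_b);

    /* Both instances alias the same buffer: a store performed while
       emulating one "thread" is immediately visible to the other. */
    uc_mem_map_ptr(thread_a, IMAGE_BASE, IMAGE_SIZE, UC_PROT_ALL, image);
    uc_mem_map_ptr(thread_b, IMAGE_BASE, IMAGE_SIZE, UC_PROT_ALL, image);

    /* RIP/RSP/etc. live per instance, so each emulated thread gets its
       own stack and program counter while code and heap stay shared. */

    uc_close(thread_a);
    uc_close(thread_b);
    free(image);
    return 0;
}
```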
Disclaimer: I'm just guessing at what could be done if they provide a decent binary translator and the API cares about high-level emulation. Looking at their site, I'm not sure that's even a goal for them.
I'm very interested to see how well it goes. If it is as flexible as it seems, perhaps it could be used to emulate the Mill? It'd be great to mess around with all these architectures!
With no code release yet, it definitely seems early to be discussing this on HN. The author of the excellent Capstone disassembly engine (http://www.capstone-engine.org/) is involved, though, so it has my interest.
If the emulator detects a memory write to one of the already-recompiled blocks, it can invalidate the result of the previous translation. If a branch then occurs to that block, or to a new area of memory where new instructions have been emitted, it can translate that part and cache the result. With a JIT binary translator this is no big deal; AOT translation of self-modifying code, however, would not work.
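The bookkeeping behind that is essentially a translation-block cache that gets flushed whenever a guest store overlaps a cached block. A deliberately simplified sketch of the idea (not how QEMU or Unicorn actually implement it):

```c
/* Simplified sketch of invalidate-on-write for a JIT block cache. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t guest_start;   /* first guest address covered by the block */
    uint64_t guest_end;     /* one past the last covered guest address */
    void    *host_code;     /* translated host code for this block */
    bool     valid;
} tb_entry;

#define MAX_BLOCKS 1024
static tb_entry cache[MAX_BLOCKS];

/* Record a freshly translated block so later branches can reuse it. */
static void cache_block(uint64_t start, uint64_t end, void *host_code) {
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (!cache[i].valid) {
            cache[i] = (tb_entry){ start, end, host_code, true };
            return;
        }
    }
}

/* Called from the emulated store path: any cached translation that
   overlaps the written range is dropped, so the next branch into it
   forces a retranslation of the (possibly modified) instructions. */
static void invalidate_range(uint64_t addr, size_t len) {
    for (int i = 0; i < MAX_BLOCKS; i++) {
        if (cache[i].valid &&
            addr < cache[i].guest_end &&
            addr + len > cache[i].guest_start) {
            cache[i].valid = false;
        }
    }
}
```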