
Google for anything Rolf Rolles has published on the topic; believe it or not, there are general approaches to solving this. Someone already mentioned dumping the text segment, but that only works for silly 90s-era obfuscators.

Contemporary obfuscators _rewrite_ the protected code as a series of instructions executed on a virtual machine whose bytecode (and bytecode semantics!) are randomly generated at build time. The solution (AIUI) is to symbolically execute the instructions to determine their underlying architectural effect, synthesize some compiler IR that is equivalent to those effects, run an optimization pass (like a regular compiler) over that IR, and finally generate x86 from the result.
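To make the pipeline a bit more concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the three-opcode stack VM, the names make_vm, assemble, lift and simplify, and the tuple-based IR are all assumptions, not any real protector's format or Rolles' actual tooling. It also cheats by handing the lifter the opcode table; a real attack has to recover that table by analysing the interpreter's handlers. The final "generate x86" step is left out.

```python
# Toy illustration only: a hypothetical 3-opcode stack VM, not a real protector.
import random

def make_vm(seed):
    """Assign random opcode bytes to semantics, as a build-time randomizer might."""
    ops = ["push", "add", "mul"]
    codes = random.Random(seed).sample(range(256), len(ops))
    return dict(zip(codes, ops))  # opcode byte -> semantic name

def assemble(vm, program):
    """Encode [(name, operand), ...] using this build's opcode numbering."""
    enc = {name: code for code, name in vm.items()}
    return [(enc[name], arg) for name, arg in program]

def lift(vm, bytecode):
    """Symbolically execute the bytecode: the stack holds expression trees."""
    stack = []
    for code, arg in bytecode:
        op = vm[code]
        if op == "push":
            stack.append(arg)  # an int constant or a symbolic name such as "x"
        else:
            rhs, lhs = stack.pop(), stack.pop()
            stack.append((op, lhs, rhs))
    return stack.pop()

def simplify(e):
    """Tiny optimization pass: constant folding plus the x+0 and x*1 identities."""
    if not isinstance(e, tuple):
        return e
    op, a, b = e[0], simplify(e[1]), simplify(e[2])
    if isinstance(a, int) and isinstance(b, int):
        return a + b if op == "add" else a * b
    if op == "add" and b == 0:
        return a
    if op == "add" and a == 0:
        return b
    if op == "mul" and b == 1:
        return a
    if op == "mul" and a == 1:
        return b
    return (op, a, b)

# (x + 0) * (2 + 3), written for the stack machine
program = [("push", "x"), ("push", 0), ("add", None),
           ("push", 2), ("push", 3), ("add", None), ("mul", None)]

for seed in (1, 2):  # two "builds" with different opcode numbering
    vm = make_vm(seed)
    print(simplify(lift(vm, assemble(vm, program))))  # ('mul', 'x', 5) both times
```

The point of the sketch is that the encoding changes from build to build while the lifted, simplified expression stays the same, which is why working at the level of effects rather than bytes defeats the randomization.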

The optimization passes are necessary to remove side effects that do not impact the state of the program ("noise"), which modern obfuscators like Themida insert into the instruction stream in large quantities.
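A minimal sketch of what removing that noise can look like, again in Python and again hypothetical: the (dst, op, srcs) tuple IR and the eliminate_dead helper are invented for this example. It is ordinary dead-code elimination over a straight-line block, and it assumes instructions have no effects beyond their destination register (no flags, no memory writes), which real x86 does not guarantee.

```python
def eliminate_dead(ir, live_out):
    """Backward liveness pass over straight-line IR.

    ir: list of (dst, op, srcs) tuples; live_out: registers read after the block.
    """
    live = set(live_out)
    kept = []
    for dst, op, srcs in reversed(ir):
        if dst in live:
            live.discard(dst)
            live.update(s for s in srcs if isinstance(s, str))
            kept.append((dst, op, srcs))
        # otherwise the write is never observed, so treat it as noise and drop it
    kept.reverse()
    return kept

block = [
    ("eax", "mov", ("ecx",)),
    ("ebx", "mov", (0x1234,)),       # junk: ebx is overwritten before any real use
    ("ebx", "xor", ("ebx", "edx")),  # junk: feeds only the dead value above
    ("ebx", "mov", (7,)),
    ("eax", "add", ("eax", "ebx")),
]

cleaned = eliminate_dead(block, live_out={"eax"})
for inst in cleaned:
    print(inst)
```

Only the three instructions that contribute to the final value of eax survive; the two junk writes to ebx disappear because nothing ever reads them.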

In other words, rather than attempt to dump some particular part of the program, the binary as a whole is statically analysed to determine, regardless of the indirections inserted by any obfuscation pass, what machine instructions are ultimately executed for a given program input. The abstract representation is then compiled to an equivalent new program which is much easier to read, because all of the indirections and noise have been optimized away.

When I was reading about Rolles' work initially, I couldn't help but imagine this is the kind of approach Geordi La Forge would have come up with if cracking an encrypted binary were ever the plot for an episode of Star Trek :)




> instructions executed on a virtual machine whose bytecode (and bytecode semantics!) are randomly generated at build time

Like the one built into Windows: https://github.com/airbus-seclab/warbirdvm



