I had the pleasure of watching Brossard's presentation on wcc at DEF CON last weekend, and it was quite fascinating.
There was a live demo of wcc "unlinking" an ELF executable into a portable library that was then linked with and called by another library. Looking forward to where this project goes in the future! He did mention that help expanding this project would be much appreciated, as it takes time to reverse the relocation patterns of various binary formats, so please help out if you find this interesting!

EDIT: Talk outline here provides some nice context https://www.defcon.org/html/defcon-24/dc-24-speakers.html#Br...
The primary use of wcc is to "unlink" (undo the work of a linker) ELF binaries, either executables or shared libraries, back into relocatable shared objects.
If I understand correctly, this is a novel way of speeding up reverse engineering by recreating the missing middle step of compiling a binary: object code. Normal decompilers go from machine code to source code, and their output is difficult to work with. Going from machine code back to object code is not supposed to be easy, or even possible. (ELF is the only format supported right now because reverse-engineering the linking process for different architectures takes a lot of work.)
WCC unlinks machine code into a library, allowing you to interact with the object code and reliably include and call its functions from your own code, as if it were a shared system library rather than a black box.
I wonder how many folks are aware that ELF shared libraries and ELF executables are the exact same thing?
There's no substantial compiling or decompiling necessary to go from shared object to executable and back. One need only add or remove the appropriate symbols from the object and (perhaps even optionally) change the object type.
Have you ever tried to directly execute your libc.so.6 on a Linux system? You can; glibc's libc.so.6 has an entry point and prints its version banner when run.
There's nothing at all novel about this process, other than making it easy for folks who've never read the manual for their linker.
Edit: Someone downvoted this observation; I imagine they think I'm unfairly trivializing this so I'll add: Read the source code. This tool we're discussing is a simple ~100 line program. It's not complicated: https://github.com/endrazine/wcc/blob/master/src/wld/wld.c
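For a sense of scale, here's a minimal sketch (my own illustration, not the author's code) of the object-type flip described above; the real wld.c does a bit more bookkeeping than this:

    /* libify_sketch.c -- flip an ELF64 binary's type field so the loader
       treats it as a shared object. Essence only; wld.c does more. */
    #include <elf.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <elf64-binary>\n", argv[0]);
            return EXIT_FAILURE;
        }
        FILE *f = fopen(argv[1], "r+b");
        if (!f) { perror("fopen"); return EXIT_FAILURE; }

        Elf64_Ehdr ehdr;
        if (fread(&ehdr, sizeof ehdr, 1, f) != 1) {
            perror("fread"); return EXIT_FAILURE;
        }
        if (ehdr.e_type == ET_EXEC)
            ehdr.e_type = ET_DYN;   /* executable -> shared object */

        rewind(f);
        if (fwrite(&ehdr, sizeof ehdr, 1, f) != 1) {
            perror("fwrite"); return EXIT_FAILURE;
        }
        fclose(f);
        return EXIT_SUCCESS;
    }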
I don't get it. Isn't there a missing step of making the objects relocatable (i.e. what -fPIC does)? That requires rewriting branches and stuff. But clearly wcc isn't doing this?
PIC makes the code position independent; that is, the code can be moved without relocation fixups. ELF executables typically use relocatable code (with relocation sections) rather than PIC, because -fPIC has performance and size costs, especially on i386. On x86_64, non-PIC shared code is actively discouraged, but not actually prevented if you generate an ELF file that the dynamic linker is satisfied with.
The main issue with that latter part is that 64-bit code can contain 32-bit relocation fixups, and such addresses are impossible for the linker to relocate outside the 32-bit memory space; whether a given load-time linker will even try to handle that on a 64-bit platform varies.
If your code is free of 32-bit relocations, there "shouldn't" be any reason it can't be used as a shared object even though it contains relocations instead of using PIC (whether any specific load-time linker unnecessarily enforces restrictions on this is another matter).
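To make the 32-bit relocation issue concrete, here's a minimal sketch (my own illustration, not from wcc). Compile it both ways and compare the readelf -r output; exact relocation names vary by compiler and version:

    /* reloc_demo.c -- why non-PIC x86_64 code can carry 32-bit fixups.
       Taking a global's address as an immediate operand (non-PIC, small
       code model) typically leaves an R_X86_64_32 relocation, resolvable
       only inside the low 4 GiB; with -fPIC the address is fetched
       through the GOT instead and the text needs no such fixup.
       Try:  gcc -fno-pic -c reloc_demo.c && readelf -r reloc_demo.o
             gcc -fPIC   -c reloc_demo.c && readelf -r reloc_demo.o  */
    int global = 42;

    int *address_of_global(void)
    {
        return &global;   /* non-PIC: 32-bit absolute immediate */
    }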
> WCC unlinks machine code into a library, allowing you to interact with the object code and reliably include and call its functions from your own code, as if it were a shared system library rather than a black box.
A shared library is as much of a black box as an executable.
This is not what I'd call an "API dump"; all prototypes return void pointers and allow any arguments.
Hence, it's just as much of a black box as the original executable: you have no idea what the functions do, what arguments they take, or what values they return.
I don't know; having the ability to compile your own test driver against it, rather than manipulating the executable, is what I'd be after with this tool. You may be reading too many assumptions into my comment.
Suppose you have an executable compiled for /usr/local on an SVR4 UNIX system or on GNU/Linux (a severe violation of the SVR4 and LSB/FHS standards [1][2]). With these tools, you could theoretically unlink the application, re-link it with the $ORIGIN linker keyword, and install it into /opt, where the standards say it belongs. (The same goes for third-party unbundled applications linked to use their own shared libraries residing in /usr/lib[64] or /usr/lib[/64], another very serious architectural offense.)
Let's build upon the author's example to demonstrate the above (imagine for a moment that ls(1) is a third-party, unbundled application which clashes with the vendor's own ls(1) as delivered with the operating system):
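Something along these lines (a sketch with illustrative paths; it assumes main remains resolvable through the dynamic symbol table after libification, which may require more massaging on a real stripped binary): libify a copy of the binary with wld -libify as in the README, install it as /opt/TPls/lib/ls.so, then build the wrapper below with gcc optls.c -o ls -ldl -Wl,-rpath,'$ORIGIN/../lib' and install it as /opt/TPls/bin/ls.

    /* optls.c -- hypothetical wrapper for the /opt relink idea.
       Assumed layout (names invented):
           /opt/TPls/bin/ls      <- this wrapper
           /opt/TPls/lib/ls.so   <- libified copy of the original ls
       Because "ls.so" contains no slash, dlopen() searches the caller's
       DT_RUNPATH, in which the loader expands $ORIGIN to the wrapper's
       own directory -- so the whole bundle relocates as one unit. */
    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv, char **envp)
    {
        void *handle = dlopen("ls.so", RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return EXIT_FAILURE;
        }

        /* Assumes main is exported in .dynsym of the libified binary. */
        int (*ls_main)(int, char **, char **) =
            (int (*)(int, char **, char **))dlsym(handle, "main");
        if (!ls_main) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            return EXIT_FAILURE;
        }
        return ls_main(argc, argv, envp);
    }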
These tools really have potential! Hopefully wcc will gain support for non-Intel, non-GNU/Linux ELF files as well.
And the readme.md! I love how the author shows an example after every explanation. If only every readme.md were written that well!
Since shared object libraries must be generated as position-independent code, how does converting an executable that was not compiled as position-independent code work?
This is really cool. I've wondered occasionally whether something like this might be possible, but this looks like a really spiffy implementation. I wonder how hard it would be to add Mach-O support to it...
wcch sounds very useful. It seems similar to some tools in other languages, though I'm not sure how those are implemented; JNA comes to mind.
Recently I wrote a Rust library and hand-wrote its C header. It sounds like this could at least get things started, though my guess is it wouldn't generate the right 'const *' vs. plain '*' for pointers.
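For instance (prototypes invented for illustration), a hand-written header can carry information that no generator could recover from symbols alone:

    #include <stddef.h>

    /* Hand-written binding for a Rust extern "C" export: */
    size_t render(const char *input, char *out_buf, size_t out_len);

    /* versus the best a symbols-only generator could plausibly emit:
       void *render();  -- no const, no types, no arity */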
Could this be used to combine the output of projects written in incompatible languages? E.g., say I have a Rust program and a Go program, both open source, but I want to call them from a Swift program. Could I compile the Rust and Go programs into ELFs, then use wcc to break them out into "libraries" and call them from Swift?
Is there a better way to do this? Right now the alternative I'm considering is opening sockets from the Rust and Go sides and talking to them over TCP.
But the end result I want is a single binary for the target platform, with no installation complications beyond putting the binary on a stock Unix machine.
There is indeed a much easier way: just create `.so` libraries (with a C interface, for example) from Go or Rust, since they compile to native code anyway.
I wouldn't use wcc for that ever, only for reverse engineering and similar things.
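For the .so route, the consuming side only needs a C header. A sketch with invented names (on the Rust side the function would be a #[no_mangle] pub extern "C" fn in a crate built with crate-type = ["cdylib"]; on the Go side, a cgo //export function built with go build -buildmode=c-shared):

    /* bridge.h -- hypothetical C interface Swift can import through a
       bridging header or module map; link against the two .so files. */
    #ifndef BRIDGE_H
    #define BRIDGE_H

    #include <stddef.h>
    #include <stdint.h>

    /* From the Rust cdylib (e.g. libmyrust.so; name invented). */
    int32_t rust_process(const uint8_t *buf, size_t len);

    /* From the Go c-shared library (e.g. libmygo.so; name invented). */
    int32_t GoProcess(char *msg);

    #endif /* BRIDGE_H */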