Yeah, a large effort (objcopy, strip, sstrip) that I need to make only once; after that I already know how to do it and can just put it in a shell script. Now, I would like to know how quickly you'll be able to write a quicksort implementation that outperforms my C code while being at least 25% smaller. Yeah, and how big your binary will become once you link libc or OpenGL or whatever to make it able to do actual work.