One thing to consider is whether HLLs are inherently unable to compete with Asm on efficiency, or whether the common mantra that "the compiler can always do better" is ultimately true. Projects like this one certainly call that mantra into question.
Put another way, could the same results be achieved with C, or even something much higher level like Haskell or Lisp, if only compilers were better at generating code?
I've looked at a lot of disassembled code over the years, and it's extremely easy to tell whether something was generated by a compiler or hand-written by a human; the "texture" is quite different.
Remember that BASIC, Pascal, Modula, Oberon, Ada, and LISP were all used in the past for OSes and system software on machines with very little hardware by today's standards. Ada, Java subsets, and Astrobe's Oberon are still used in embedded systems today.
What makes today's software bloated is the crud built up over time, standardization, security, reliability, a trend toward easier maintenance and productivity over raw speed, and so on. Here's a simple program [1] that got trimmed down to mere bytes. You can see how much overhead the aforementioned items add to C code which, by itself, compiles to very efficient assembly. For people wanting a middle ground, there are High Level Assemblers such as Hyde's HLA [2], and I've speculated that we could do something similar with LLVM's bytecode.