Just for fun, I decided to pass the output.js file through Google Closure Compiler's advanced optimizations. It does a surprisingly good job at reconstructing part of the strings.
I'm not pasting the full thing, but it reduces the output.js file from ~118 KiB to ~9.92 KiB, which is pretty good!
There is technically not much stopping the compiler from inferring that 1/0 === Infinity, recognizing that (1/0+[])[4] is free of side effects, and eventually concluding it's safe to substitute the whole expression with "n". Google Closure already has optimizations for string concatenation, so if it were able to perform an optimization pass that folds Infinity, then it would also be able to emit the string "constructor" instead of "co"+(1/0+[])[4]+"structor".
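To make the folding idea concrete, here is a minimal sketch of such a pass over a tiny hand-rolled AST. The node shapes and helper names (num, str, emptyArr, bin, index, fold) are made up for illustration and are not Closure Compiler's real internals. It only folds literal operands, which is what makes the substitution safe.

```javascript
// Hypothetical mini-AST for illustration only; not Closure Compiler's node types.
const num = (value) => ({ type: "num", value });
const str = (value) => ({ type: "str", value });
const emptyArr = () => ({ type: "emptyArr" });
const bin = (op, left, right) => ({ type: "bin", op, left, right });
const index = (object, key) => ({ type: "index", object, key });

// Literals cannot have side effects, so evaluating them at compile time is safe.
const isLiteral = (node) =>
  node.type === "num" || node.type === "str" || node.type === "emptyArr";

// Convert between AST literals and real JS values.
const toValue = (node) => (node.type === "emptyArr" ? [] : node.value);
const toNode = (value) => (typeof value === "string" ? str(value) : num(value));

function fold(node) {
  if (node.type === "bin") {
    const left = fold(node.left);
    const right = fold(node.right);
    // Division is folded only for two numbers: 1/0 becomes Infinity.
    if (node.op === "/" && left.type === "num" && right.type === "num") {
      return num(left.value / right.value);
    }
    // Addition reuses JS semantics: Infinity + [] becomes the string "Infinity".
    if (node.op === "+" && isLiteral(left) && isLiteral(right)) {
      return toNode(toValue(left) + toValue(right));
    }
    return bin(node.op, left, right);
  }
  if (node.type === "index") {
    const object = fold(node.object);
    const key = fold(node.key);
    // Indexing a known string with an in-range integer yields a known character.
    if (
      object.type === "str" &&
      key.type === "num" &&
      Number.isInteger(key.value) &&
      key.value >= 0 &&
      key.value < object.value.length
    ) {
      return str(object.value[key.value]);
    }
    return index(object, key);
  }
  return node; // literals and anything unrecognized are left untouched
}

// "co" + (1/0 + [])[4] + "structor"
const expr = bin(
  "+",
  bin("+", str("co"), index(bin("+", bin("/", num(1), num(0)), emptyArr()), num(4))),
  str("structor")
);
const folded = fold(expr);
console.log(folded.value);
```

A real compiler would additionally have to prove that nothing non-literal in the expression has observable behaviour (getters, overridden valueOf, and so on), which is where most of the actual work is.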
Interesting. I wonder if an even more effective approach to deobfuscation would be an "anti-JIT" compiler: since what we're interested in is the actual execution flow, can we leverage the browser's whole optimization engine to pull that out for us?
Do the various JIT engines use an intermediate representation (IR) and what does it look like?
V8 starts with interpreting bytecode (Ignition), then hot code gets tiered up to a non-optimising JIT without an IR (Sparkplug), and even hotter code goes to an optimising one with an IR (TurboFan).
That way it can start executing quickly and not waste time on compiling/optimising things that only get run a few times
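If you want to watch the tiering happen, a rough sketch is to call a small function often enough that the engine considers it hot. The function name and iteration count below are arbitrary choices, and whether or when tier-up actually happens depends on the V8 version and its heuristics. Running the script under Node with V8's --trace-opt flag should log functions as they get picked up by the optimising compiler.

```javascript
// A tiny function that is cheap to interpret but called many times.
function add(a, b) {
  return a + b;
}

const ITERATIONS = 100000; // arbitrary; "hot" thresholds are engine heuristics

let total = 0;
for (let i = 0; i < ITERATIONS; i++) {
  // Always passing small integers keeps the type feedback stable,
  // which is the kind of code an optimising tier handles well.
  total += add(i, 1);
}

console.log(total);
```

The result is the same whichever tier ends up running the loop; the tiers only change how fast it gets there.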