> But it's also very easy to make stuff slow and it can be hard to guess what kind of stuff will be an order of magnitude slower
It's not that hard to guess. Just look at what most NPM/Electron programmers and guides recommend you do, and then do the opposite. I.e., think about what your program is doing, and write the most boring, 90s-style code you can. Works really well internally for the Firefox codebase, and has been for 20+ years (for longer than it has even been called "Firefox").
For those who don't want to go through all that, there are basically three simple rules to make your code optimizable (a short sketch follows the list):
* Objects should have a fixed set of keys (no adding more and especially no `delete`) and the type of the value should never change (note: in JS, a type must NOT be a union of multiple types if you want it to optimize)
* Arrays must be of one type and a fixed length
* Functions must take a fixed number of arguments and the type of those arguments must never change. This is by far the most important if you want to get your function's code optimized.
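A minimal sketch of what following those three rules looks like in practice (the names here are invented for illustration, not from the article):

```js
// Rule 1: one fixed shape. Every point has exactly these keys, always numbers.
function makePoint(x, y) {
  return { x, y };                     // never add or delete keys afterwards
}

// Rule 2: one element type, length decided up front, no holes.
const xs = new Array(1024).fill(0);    // always holds numbers

// Rule 3: fixed arity, and the argument types never change between calls.
function dist(a, b) {
  const dx = a.x - b.x;
  const dy = a.y - b.y;
  return Math.sqrt(dx * dx + dy * dy);
}

dist(makePoint(0, 0), makePoint(3, 4));  // always called with the same shapes
```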
And as a corollary from this article, it's not JS-specific, but converting string -> float and float -> string isn't a cheap operation. I'd go so far as to say it is the most expensive operation in a huge number of the functions where it happens. Int to string is especially egregious because it requires a lot of division, which is basically the slowest (common) operation you can do on a CPU.
```asm
imul  rcx, rcx, 1374389535        ; rcx = 12345 * (fixed-point reciprocal of 100)
shr   rcx, 37                     ; rcx >>= 37, i.e. 12345 / 100 = 123
imul  edi, ecx, 100               ; edi = 123 * 100 = 12300
mov   r8d, edx                    ; edx still holds the original input, 12345
sub   r8d, edi                    ; r8d = 12345 - 12300 = 45
movzx edi, WORD PTR .LC2[r8+r8]   ; load the two chars at offset 45*2 in the digit-pair table
mov   WORD PTR [rsi], di          ; store "45" into the output buffer
```
So if our example input is 12345, the first two instructions use the fixed-point reciprocal (the "field-inverse" property) to compute "123" with a multiply and a shift rather than a divide (throughput of one per clock, latency of about 4 cycles), then multiply back up to get "12300", then subtract that from the input to get "45". That can then be looked up at position 45+45 in the string "00010203040506070809101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899" to give you the two digits.
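For illustration, here's a rough JS sketch of the same two-digits-at-a-time idea (a hypothetical helper, not engine code; real implementations do the division with the multiply-and-shift trick shown above):

```js
// Pairs "00".."99" laid out back to back, the same idea as the .LC2 table above.
const DIGIT_PAIRS = Array.from({ length: 100 }, (_, i) =>
  String(Math.trunc(i / 10)) + String(i % 10)
).join("");

// Convert a non-negative integer to a string two digits at a time.
function uintToString(n) {
  let out = "";
  while (n >= 100) {
    const rem = n % 100;                           // last two digits
    n = Math.trunc(n / 100);
    out = DIGIT_PAIRS.slice(rem * 2, rem * 2 + 2) + out;
  }
  return (n < 10 ? String(n) : DIGIT_PAIRS.slice(n * 2, n * 2 + 2)) + out;
}

uintToString(12345);   // "12345"
```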
It's important, but I purposely stepped around this to not complicate things.
In truth, the only situation likely to run afoul of these rules is hand-copying a configuration object and then manually transposing it, and that isn't likely to happen in really hot code paths.
This is the one place where JS helps us: lazy typers are encouraged to return object literals from factory functions, and the return will almost always be a single object literal at the bottom of the function.
It also (intentionally) glosses over manually adding properties to an object directly after the object is instantiated. In truth, doing this consistently and immediately after creation can usually be optimized too, but not necessarily, and describing that would also complicate things (and it varies from one JIT to the next).
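A rough before/after of what's being described (function names invented):

```js
// Shape fixed at creation: the engine sees a single object layout.
function makeUser(name, age) {
  return { name, age, lastSeen: null };
}

// Same data, but the shape changes three times after instantiation.
// A JIT can often still handle this when it happens immediately and
// consistently, but it is easier to fall off the fast path.
function makeUserTheHardWay(name, age) {
  const u = {};
  u.name = name;
  u.age = age;
  u.lastSeen = null;
  return u;
}
```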
I was messing about with some profiling and seemed to find that TypedArrays are not always faster than carefully handled regular arrays. And what's worse, it seemed like a regression from node 16 (fast typed arrays) to 18 (slow typed arrays). I will have to dig up my profiles again and see if I can verify it. Posting this comment in the hope that someone who knows more than me has something to add!
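For anyone who wants to reproduce this, the comparison I had in mind was roughly the following crude sketch (results will vary by machine and Node/V8 version; the array kind is picked per process so the call site stays monomorphic):

```js
// Sum the elements of an array with a plain indexed loop.
function sum(arr) {
  let total = 0;
  for (let i = 0; i < arr.length; i++) total += arr[i];
  return total;
}

const N = 1e7;
// Pick the array kind per process so sum() only ever sees one kind.
const arr = process.argv[2] === "typed"
  ? new Float64Array(N).fill(1)
  : new Array(N).fill(1);          // "carefully handled": one type, no holes

for (let i = 0; i < 10; i++) sum(arr);   // warm-up so the JIT optimizes sum()

const start = process.hrtime.bigint();
sum(arr);
console.log(Number(process.hrtime.bigint() - start) / 1e6, "ms");
```

Save it as e.g. bench.js and run `node bench.js` and `node bench.js typed` on each Node version you want to compare.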
It would be great if there was a tool that shows you where deoptimization is taking place. IIRC the Chrome JavaScript profiler once flagged functions where optimization failed, but this feature is not available in current versions of the devtools performance tool. Although deoptimization is logged, I am not aware of a way to trace the entries back to the source code.
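One partial substitute I know of: V8 (and therefore Node) can log optimization and deoptimization events with the `--trace-opt` and `--trace-deopt` flags, which at least name the function and the deopt reason, even if mapping that back to source is still manual. The exact output format varies between versions. A contrived example:

```js
// Save as e.g. deopt-demo.js and run:
//   node --trace-opt --trace-deopt deopt-demo.js
// V8 prints "deoptimizing ..." lines naming the function and a reason.
function add(a, b) {
  return a + b;
}

for (let i = 0; i < 1e6; i++) add(i, 1);   // gets optimized for number arguments
add("oops", 1);                            // a type change like this can trigger a deopt
for (let i = 0; i < 1e6; i++) add(i, 1);
```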
> in JS, a type must NOT be a union of multiple types if you want it to optimize
Does that mean that tagged unions (functional programming-style data types) are inherently slow in JavaScript? Maybe there is a way for the TypeScript compiler to pass pragmas (via comments in generated code) to the interpreter.
A better way to think about it is that they can have pitfalls, rather than being inherently slow. Inheritance has a similar penalty.
The potential problem is that JS engines only see the ‘shape’ of an object, that is, its members and member order. This is slightly stricter than TS’s structural types, where order doesn’t matter. So union variants and subtypes each have different shapes if they have different members or member order.
Shapes are important for function calls. Object property access is optimised based on the number of ‘shapes’ a given access site has seen, falling into one of the following states:
* Monomorphic - the site has seen only one shape, so properties can be accessed by fixed offset directly.
* Polymorphic - a small inline cache of several shapes; access is a lookup into this cache, so slightly slower.
* Megamorphic - a lookup into a global table. This is a performance cliff.
So if you have lots of functions that take a tagged union or base class, *and at runtime* they see lots of different types passed in, *and this is in a hot path*, they can be a problem.
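A small illustration of how variants turn into distinct shapes (invented types; whether a particular engine actually tiers the access down depends on its inline-cache limits):

```js
// Four variants of a tagged union: four different shapes,
// because the members other than the tag differ.
const values = [
  { kind: "circle", r: 1 },
  { kind: "square", side: 2 },
  { kind: "rect", w: 2, h: 3 },
  { kind: "point", x: 0, y: 0 },
];

// The access `v.kind` sees four shapes here: polymorphic, still cached.
// With many more variants (or many subtypes of a base class) the same
// site can go megamorphic and hit the global-table lookup described above.
function tagOf(v) {
  return v.kind;
}

for (const v of values) tagOf(v);
```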
Relevant talk: <https://www.youtube.com/watch?v=p-iiEDtpy6I>
Relevant paper: <https://users.dcc.uchile.cl/~rrobbes/p/EMSE-features.pdf>
Relevant thread: <https://news.ycombinator.com/item?id=33790568>