Speeding up the JavaScript ecosystem, one library at a time (marvinh.dev)
481 points by fabian2k on Nov 29, 2022 | 188 comments



Heh, so many of these fixes boil down to "I'm converting this thing to a string and then searching the string". And, funnily, I've often stumbled on this problem with Java programmers. They'll take a nice rich type, turn it into a string and then spend a bunch of code/time turning that string back into something else.

Super wasteful when you start thinking about what all that means. Rather than having a representation that neatly fits into registers, now you have this blob of memory that has to be linearly searched to pull information from it. Just. yuck.

And why do devs do this? TBH, I really don't know. Because regexes are easier to understand? IDK.


>And why do devs do this? TBH, I really don't know.

it's one of the pillars of the unix philosophy at work - text is the universal interface.

your nice rich type requires figuring out how that type works, maybe even reading documentation. if you make it a string, you can just treat it like a string. and we all know how to handle strings, or at least how to copy code from somewhere else that handles strings.


C#'s System.Object base type contains a ToString, allowing any richer object to fallback into such inefficient systems if required.

I'd also caution that "everything is a string so no need to read the documentation" is unrealistic and even outright dangerous. *NIX's string philosophy has directly resulted in multiple security vulnerabilities.


Just FYI, I’m reasonably certain that parent is being satirical.


I'm not so sure about the documentation part, but this seems valid at face value:

> it's one of the pillars of the unix philosophy at work - text is the universal interface.

In general, text is indeed the lowest common denominator that's available to you and fits with the majority of GNU tools nicely (and how most *nix distros are structured). Of course, some structure helps, since dealing with something like Nginx or Apache2 text logs might not be as nice as having your logs in JSON, though in regards to configuration files or program output there's lots of nuance - structured text (like JSON) will be easier to work with programmatically but might not be as nice to look at. Working with binary data in comparison can be asking for problems, to the point where using SQLite for your application data might make more sense: https://sqlite.org/appfileformat.html

In this particular case, however, the conversions were made within the confines of a single program, so the interop is indeed not a valid concern here. A properly written type might indeed allow you to avoid various inefficiencies. If you're not serializing your objects and whatnot to make the life of the consumer of your data easier (be it another tool, or a REST/SOAP API client that needs JSON/XML, or a configuration file on the file system), then there's not that much point in doing those conversions.


Unfortunately "text is the universal interface" is pretty much a lie, for the exact same reason that "bytes in memory are the universal interface" is a lie.

An interface has some kind of structure for the data (and operations) that it represents. That structure may be explicitly defined or left implicit, but it is always there. And in the vast majority of cases, that structure is more complex than "a single independent string".

That's where the whole "pipe text between processes" approach breaks down - in many cases, you aren't just dealing with a single string, but with some sort of structured data that just happens to be encoded as human-readable text. That it's "text" is entirely immaterial to the structure of the data, or how to work with it.

And that's how you end up with every CLI tool implementing its own set of subtly incompatible data parsers and stringifiers. This is strictly worse than having a standardized structural format to communicate in.


> That structure may be explicitly defined or left implicit, but it is always there.

But this is something that every piece of software out there has to deal with. By this reasoning, claiming that RESTful APIs can evolve over time and don't need something like WSDL (though OpenAPI still exists) is also a similar lie, because there absolutely are assumptions about how things are structured by anyone and everyone who actually needs to integrate with the API. Even "schema-less" data storage solutions like MongoDB are then also built on similar lies, because while you can throw arbitrary data at them to store it, querying the data later will still depend on some assumptions about the data structure.

I'm inclined to agree, and yet, RESTful APIs are still very popular and seem to have much success (though some push for more "immutable" APIs that are versioned, which is nice), as do "schema-less" data stores in certain domains. Edit: you can also say that about everything from CLI tools or even dynamic programming languages as a whole. I think there's more to it.

> And that's how you end up with every CLI tool implementing its own set of subtly incompatible data parsers and stringifiers. This is strictly worse than having a standardized structural format to communicate in.

I suspect that this is where opinions might differ. Half-assing things and counting on vaguely compatible (at a given point in time) implementations has been the approach on which many pieces of software are built to solve numerous problems. On one hand, that is ignorant, but on the other hand - how often is the output from "ls -la" going to change its format, really? The majority of CLI tool writers seem to understand how brittle and bad everything is and thus treat their data formats carefully, adding a whole bunch of different flags as needed, or allowing for custom formats, like: "lsblk --output UUID,LABEL,SIZE"

For many out there, that's good enough. At least to the point where something like PowerShell isn't mainstream, despite some great ideas in it (or other shells that would let you work with objects, instead of text).


> Even "schema-less" data storage solutions like MongoDB are then also built on similar lies, because while you can throw arbitrary data at them to store it, querying the data later will still depend on some assumptions about the data structure.

They are. I have been complaining loudly about this for years :)

> I'm inclined to agree, and yet, RESTful APIs are still very popular and seem to have much success (though some push for more "immutable" APIs that are versioned, which is nice), as do "schema-less" data stores in certain domains. Edit: you can also say that about everything from CLI tools or even dynamic programming languages as a whole. I think there's more to it.

The part that I think you're missing is the difference between structure and schema. It's entirely valid to leave the schema implicit in many circumstances (but not all - generally not in a database, for example), but you still need the general structure of how data is represented.

When you deserialize something from JSON, CBOR, whatever, you will get back a bunch of structured data. You may not know what it means semantically, but it's clear what the correct (memory) representation is for each bit of data, and that's exactly the most fragile part of dealing with data, which is now solved for you.

You do not get the same benefit when passing around strings in ad-hoc formats that need custom parsers; it is very easy to mess up that fragile part of data handling, which is why parsers are so infamously difficult to write, whereas working with parsed data structures is generally considered much simpler.

Likewise, from a usability perspective, it's fairly trivial to let the user pass eg. a dotpath to some tool to select some nested data from its input; but letting the user pass an entire parsing specification as an argument is not viable, and that's why most tools just don't allow that, and instead come with one or more built-in formats. If those supported formats don't match between two tools you're using, well, sucks to be you. They just won't interoperate now.

Essentially, "standardized structure but flexible schema" is the 'happy compromise' where the most fragile part is taken care of for you, but the least predictable and most variable part is still entirely customizable. You can see this work successfully in eg. nushell or Powershell, or even to a more limited degree in tools like `jq`.

> I suspect that this is where opinions might differ. Half-assing things and counting on vaguely compatible (at a given point in time) implementations has been the approach on which many pieces of software are built to solve numerous problems. On one hand, that is ignorant, but on the other hand - how often is the output from "ls -la" going to change its format, really? The majority of CLI tool writers seem to understand how brittle and bad everything is and thus treat their data formats carefully, adding a whole bunch of different flags as needed, or allowing for custom formats, like: "lsblk --output UUID,LABEL,SIZE"

Speaking as someone who is currently working on a project that needs to parse a lot of CLI tools: the formats still change all the time, often in subtle ways that a human wouldn't notice but a parser would, and there's effectively zero consistency between tools, with a ton of edge cases, some of them with security impact (eg. item separators). It's an absolute nightmare to work with, and interop is just bad.

> For many out there, that's good enough.

I think there's a strong selection bias here. It's not without reason that so many people have an aversion to terminals today - they just aren't very good. And it's not just their terminal-ness either, because many people who cannot deal with a standard (eg. bash) shell will happily use terminal-like input systems in specialized software.

It's certainly true that most people who use terminals on a daily basis, consider this paradigm good enough. But that's probably because those who don't, just stop using terminals. We should be striving to make tech better and more accessible, not just "good enough for a bunch of Linux nerds", and certainly not upholding "good enough" paradigms as some sort of virtue of computing, which is what happens when people say "text is the universal interface".

Of course changing this is a long process, and there are very real practical barriers to adoption of other interop models. But that's not a reason to downgrade the 'ideal' to fit the current reality, only a reason to acknowledge that we're just not there yet.


> It's entirely valid to leave the schema implicit in many circumstances (but not all - generally not in a database, for example), but you still need the general structure of how data is represented.

> ...

> Essentially, "standardized structure but flexible schema" is the 'happy compromise' where the most fragile part is taken care of for you, but the least predictable and most variable part is still entirely customizable. You can see this work successfully in eg. nushell or Powershell, or even to a more limited degree in tools like `jq`.

That's a fair point, thanks for putting emphasis on this!

> I think there's a strong selection bias here. It's not without reason that so many people have an aversion to terminals today - they just aren't very good. And it's not just their terminal-ness either, because many people who cannot deal with a standard (eg. bash) shell will happily use terminal-like input systems in specialized software.

The question then becomes what can be reasonably done about it? Or rather, can it even be realistically achieved (in an arbitrary time scale that we care about), given how much of the culture and tooling is currently centered around passing text from one command to another. Change for something so foundational surely wouldn't be quick.

> But that's not a reason to downgrade the 'ideal' to fit the current reality, only a reason to acknowledge that we're just not there yet.

In the end, this probably sums it up nicely. Even if that current reality, which often will be the lowest common denominator, is what most people will earn their paychecks with (and possibly leave edge cases for someone else to deal with down the line).


> The question then becomes what can be reasonably done about it? Or rather, can it even be realistically achieved (in an arbitrary time scale that we care about), given how much of the culture and tooling is currently centered around passing text from one command to another. Change for something so foundational surely wouldn't be quick.

It definitely won't be quick, no. I think the tooling is actually not that big of a problem there; the functionality provided by the 'common tools' generally isn't that complex in scope (at least, in the context of modern development tools), and the many RIIR projects have shown that total do-overs for these sorts of tools are viable.

The bigger problem is going to be cultural, and particularly that persistent (but often unspoken) belief that "computers peaked with UNIX in the 70s/80s and what we have now is the best it will ever be". It often results in people actively pushing back on improvements that really wouldn't have any downsides for them at all, thereby creating unnecessary friction.

That same 'ideology', for lack of a better word, also makes it very difficult to talk about eg. hybrid terminal/graphical systems - because "GUIs are evil, text is better" and similar sentiments. That really gets in the way of advancing technology here.

Ultimately, I think the actual interop and/or reimplementation problems are going to be a walk in the park, compared to dealing with the messy human and cultural/ideological factors involved that actively resist change.


yeah, i'm not trying to say it's a good idea. but the parent asked why, and the answer is essentially because we're lazy, it's easy, and it works often enough that we can get away with it.

i figured the article we're all commenting on would be evidence enough of why it's not necessarily a good idea.


It’s not even text. It’s just UTF-8 at best and just ASCII in some cases. When did awk get Unicode support? 3 months ago?


I'm guilty of having written code like this on occasion, as I'm sure we all are. It's easy to look at these changes and say "of course it's slow", but for every instance of this that OP fixed there's probably 99 other occurrences of the same thing that _don't_ matter for performance. In a perfect world, sure, everyone would know whether they're on the hot path or the slow path and prioritise accordingly - it doesn't matter if something that is called once a day takes 15ms instead of 0.15ms, until someone adds it to the hot path and it gets called 290 times.


I stubbornly hate code like this - but largely because of the security and correctness nightmare this is when combined with user generated data.

For example, in a library one of my coworkers wrote, object paths were flattened into strings (eg “user.address.city”) and then regular expressions and things worked on that. Sometimes user data ended up in the path (eg “cookies.<uuid>”). But then - what happens if the user puts a dot in their cookie? Or a newline? Does the program crash? Are there security implications we aren’t seeing? What do we do if dots are meaningful? Do we filter them out? Error? Escape them (and add a flurry of unit tests?) Are there bugs in the regex if a path segment is an empty string?

It’s a nightmare. Much better to have paths be a list of keys (eg [“cookies”, uuid]). That’s what they represent anyway, and the code will be faster, and this whole class of string parsing bugs disappears.
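
For illustration, a minimal sketch of the array-of-keys approach (getPath, user and jar are hypothetical names, not from the library in question):

  // Keep the path as an array of keys instead of a dot-joined string,
  // so user-controlled segments never need escaping or re-parsing.
  function getPath(obj, path) {          // path: e.g. ["cookies", uuid]
    return path.reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
  }

  getPath(user, ["address", "city"]);
  getPath(jar, ["cookies", "my.weird\ncookie"]); // dots and newlines are harmless here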


Oh I totally agree. It's also my gripe with using Unix tools in general and chaining together tools in bash with pipes.

Perfect is the enemy of good though, and probably 90% of my code is done properly, and I cut corners where I think it's appropriate - sometimes that's ok, and sometimes in hindsight it's not.


> regexes are easier to understand

Not a statement I ever expected to see.


Same. I mean, they are tricky to get a handle on but seem logical once you gain familiarity, until you hit a use case that doesn't fit a previously encountered pattern, and then it's back to the drawing board. But I guess it's easier than reading the manual, but what do I know.


I’ve never seen this approach. I imagine I’m just not working in a domain that touches on problems “solved” this way. Any popular examples to point to?


Nope, just stories from my day job.

One time, to solve the problem of "we want floating point numbers that are close to be considered equal" on an object, the programmer decided to take every field of that object (reflectively, of course), turn that into a string, and then compare the two strings together.

Another time, in order to convert a LocalDate to our own internal date representation, a dev decided to toString the local date, then split on the `-`s to extract the day, month, and year. (Yes, LocalDate has day, month, and year methods).

One time, to convert an int to a double a dev decided to turn that int into a string and then used "Double.parseDouble" to turn that int into a double (I wish I were joking).

All these and more were found through the joys of profiling :D


I’m in pain reading this. But I feel the need to confess my own sin:

I once didn’t want to write or add a dependency for a Python deep equality checker so I just sorted, JSONified, and string compared the two. It worked fine for years. It wasn’t performance-critical code.

But it still keeps me up at night.


> I once didn’t want to write or add a dependency for a Python deep equality checker so I just sorted, JSONified, and string compared the two. It worked fine for years. It wasn’t performance-critical code.

I think this one is actually fine. You thought about the solution, considered your options and made a design decision based on the balance of your priorities. That's what engineering is: a bunch of trade-offs solidified into a product.

The situation where someone uses Double.parseDouble(number.toString()) to translate an int into a double is different. There's really no reason to do it like that: it's harder to understand, uses more memory and cpu and is slower. There's no trade-off that makes that line come out as the best decision.


It's stuff like this that makes me mad when I see "Optimization is the root of all evil" quotes thrown at me.

The code I'm most typically fixing for performance isn't some complex algorithm that needs inline assembly and AVX register weaving... No, the kind of code I most often fix for performance is code where someone does `Double.parseDouble(foo.toString())` and turn it into `double bar = (double)foo;`.

Or code where someone does n^2 when they could have had n by using a dictionary/hashmap. Or code where someone uses a dictionary/hashmap instead of using a simple POJO.
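
For the n^2 case, a minimal sketch of what that fix usually looks like (items and ids are hypothetical):

  // O(n^2): scans the whole ids array once per item
  const slow = items.filter(it => ids.includes(it.id));

  // O(n): build a Set once, then each membership check is constant time
  const wanted = new Set(ids);
  const fast = items.filter(it => wanted.has(it.id));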

It drives me bonkers when someone apparently goes out of their way to write code in the slowest form possible.


Eh - It's good to know when you're taking the slow route, but most times the slow route is fine.

It's easy to point to the title piece and assume the right advice is to try to optimize early - but the reality is that the gains mentioned here basically only matter because these libraries are used by millions of developers, and run often.

essentially - don't try to optimize until you have a reason to make that effort worth it. For these libraries - it's worth it.

For the times you're getting asked to fix performance, it's because performance has become a feature that matters enough to have someone ask you to fix it. It's fine to wait until that time to swing back around and pay attention to it. There are thousands of times where that never happens, because the performance was fine taking the scenic route through a string. Plus - that leaves you plenty of low hanging fruit to pick up some easy wins.

---

I think the real story here is not that folks are taking the slow route, I think the story is that the low hanging fruit hasn't been knocked out of these libraries. And really, that goes to show that a lot of software really is supported by a very small number of library authors, who aren't doing it as their day job, for very little in the way of real compensation.

Ex: SVGO is used by 9.1 million people (on github alone) but only has 163 contributors, and only 4 people have more than 10 commits. That means only one in 55,000 users has bothered to add anything. And there's only one real dev for every 2.25 million users. And "users" here also includes entire companies.


It's fine to not optimize from the start, but the examples they gave are of cases where the code is doing something more complex that's also slower.

Don't make your solution more complex than it needs to be. Start with the simplest solution and you will often get a faster solution as a side effect.


I disagree with this take. In both examples (the original article, as well as the one posted right above) the code under review was slow but simple.

Taking just the two from the comment right above this:

  Double.parseDouble(foo.toString())
This will work for basically any possible value/type of foo, assuming foo supports .toString(). It will work for NaN, it will work for infinity, it will work if the value is already a string.

It's explicitly avoiding the complexity of having to consider those cases at programming time, in exchange for taking the slow route (code someone else has written to already do that for you).

Simply casting to double will probably work, but not always - worse, the warnings you get, and the edge cases you might need to deal with will vary significantly across languages. Ex: Java will do different things than C++, which will do different things than C#, Python will vary by flavor (CPython and Jython use different underlying types).

Basically - it's saving the programmer a boat-load of complexity (reduces the edge cases down to just the exceptions parseDouble() might throw), in exchange for doing something slow.

---

Take the hashmap example right after the double example. Will a hashmap let you speed up some algorithms by avoiding n^2 in favor of n? Sure - at the expense of memory size and longer initialization. In most places that's probably a sane call. Is it always the right call? Nope. Would not be ideal for my arduino - the memory is so limited n^2 is unlikely to matter most times, and again - the memory is so limited I'd rather go slow and finish than try to go fast and segfault.

It's also not the right call when the expected length of the data is always going to be small values of n. If n is always going to be less than 5, I'd bet you good money the hash implementation will be slower in most cases.

----

Basically - optimize when optimizing becomes important, because something is slow and it sucks. Until then... slow is smooth and smooth is fast.


> This will work for basically any possible value/type of foo, assuming foo supports .toString(). It will work for NaN, it will work for infinity, it will work if the value is already a string.

A) the example above was explicitly for int -> double, so the fact that the path through .toString will work for other types is a benefit we won’t get but which has a cost we’ll have to pay regardless.

B) I find it hard to believe that the Double class doesn’t have a method you could call which will convert an int and catch NaN, Null, and MaxInteger.

> In most places that's probably a sane call. Is it always the right call?

Just as you’d bet that the hash implementation will become slower, I bet you that only 1 in a million people (if that) would write the Double code we’re talking about as the result of thoughtful consideration.

> optimize when optimizing becomes important, because something is slow and it sucks. Until then... slow is smooth and smooth is fast.

Slow is smooth assumes that the slow path you pick is sane.

There’s also a big difference between having to write code that needs optimization, and writing less code to do a job faster.


> the example above was explicitly for int -> double

?????

Show me where the type of foo was defined in that comment... I'll wait.

For all we know it was some god-awful custom bucket/container class that's hiding all sorts of polymorphism from you.


There’s no black and white. I’ve known really good optimizers who do harm to the company and their peers by obsessing over gains that simply don’t matter.

And I’ve known people who make programs unnecessarily slow and painful because of sloppiness.

That’s not to say you can’t endeavour to “do it right the first time” but very regularly that is not a realistic goal.

Good engineering requires understanding the big picture, particularly what matters and doesn’t matter to a business (and how those priorities change over time).


Engineering has pretty much always been the art and science of "good enough". Anyone can make a bridge with infinite resources, but an engineer is who can make a bridge just good enough to not fall down within budget and spec.

The challenge with software is the underlying "physics" of our world continue to get more powerful year over year (unlike real physics), so "good enough" ends up being sloppier and sloppier as time goes on, and for a lot of applications it doesn't matter.

Of course, until that day where it turns out it does matter. Good developers I think can anticipate some of those cases, and great developers can pick the few that they can sink the time into getting Right (tm) (c) 2022, to get the most bang for their (time) bucks.


Yeah, I generally favor the approach of first making the code easy to understand/validate, and easy to replace later if you find a performance problem. I’ve found it’s usually only a small percentage of the time that your first guess for what’s going to be a bottleneck is accurate.


wait wait wait, the actual quote is "PREMATURE optimization is the root of all evil". Completely different meaning


We also shouldn't forget the full quote

“There is no doubt that the holy grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.”

— From “Structured Programming with go to Statements” (Knuth, 1974).


> It drives me bonkers when someone apparently goes out of their way to write code in the slowest form possible.

We used to call that BASIC.


I just yesterday found a case in my codebase of linear searching dict.keys() for the dictionary key.


I had a candidate try that in an interview with me yesterday.


You did the right thing.

Deep clone and equals on deeply nested, probably heterogeneous, object graphs can be challenging. And certainly error prone.


Depending on how much that equality check affects the application's performance, JSONifying to compare stuff is fairly ok.

Just beware of any lists which you'll need to sort manually (on both sides) to prevent false negatives :)


That’s a tough one, because Python lacks a standard convention for deep comparisons, so your only options are some bespoke function (probably copied from some random website) that you stick in every library you use, or pulling in some third-party dependency, or using a standard module with a fairly straightforward serialization method that does what you need, which I would guess is either pickle or json. Is there a better option?


I’m having flashbacks to some contract Java developers who were calling toString() on deserialized JSON objects and then parsing the output (old school index/substring, not even regex) because they insisted it was too hard to use the built-in accessors. Bonus points for then trying more kludges when they got a reminder that JSON does not guarantee object key order so they couldn’t write code assuming values would be adjacent.

They also did that exact same thing parsing dates, except that they flubbed the logic and IIRC had an off-by-one on the month for the constructed Date object.


Lots of JSON libraries can be told to enforce key order. And you’ve gotta remember that if you’re going to do something so ghoulish!


These were definitely not the kind of developer who read the documentation looking for options like that. I've mostly avoided Java for years and still found myself going “this is like 20 lines of code [badly] re-implementing a built-in library function” on almost every code review.


Could this just be incompetence?

I have seen this kind of code many years ago, when an external company was brought in to sell us their implementation of a CMS on some (even then) outdated Java/js-on-server-before-node-era platform. I was asked to do a code review to check if their code could be integrated into our codebase.

I was quite inexperienced at that point in time, but enough so to notice that there was something off with the code. There was a lot of js that looked almost like an exercise in obfuscation. Things like doing join, append and split on an array to do a push - and it was not abstracted into a function - these three operations were repeated all over the code. And then I met the programmer - it was obvious after a few questions that he did not know what he was doing, and his direct manager, who was also present, knew nothing about programming (he was more of a business person, as he himself proclaimed).

It was as if this company hired the cheapest guy they could find who on paper looked competent, gave him the task of implementing a CMS, and only looked at the results (the CMS actually worked), but no one ever checked the code that was hiding underneath (and the guy never worked with anyone else and never read any other code besides his own!)


We saw it with contractor-developed code. They took JSON from an API, then converted it to a big blob of text, and did some gnarly searching.

It made no sense.

But we were also in a rush to get this app built, so we didn’t push back on this tortured logic. But we were saddled with this technical debt for an embarrassingly long time.


That sounds almost more excusable than the rush code I found the other day. They were storing something (to local storage and then to a DB) via JSON.stringify() but for whatever reason they hadn't thought after retrieving it back from the database to just JSON.parse() it and instead used a weird mash of string replacements, splits, and regexes to parse an ad hoc brittle subset of JSON.


People will convert a float to an integer by using parseInt. That function automatically converts its argument to a string, then parses it as an int, ignoring anything after the decimal point.

Completely fails for a number big enough to be stringified to 1.34e20.

Plot twist: if your number is in a reasonable range this is very fast, because there was once a popular JS benchmark that did this.
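
A small sketch of that failure mode, assuming a value large enough that toString switches to exponential notation:

  parseInt(1.5);        // 1 -- works, via the string "1.5"
  parseInt(1.34e21);    // 1 -- (1.34e21).toString() is "1.34e+21", parsing stops at the "."
  Math.trunc(1.34e21);  // 1.34e+21 -- truncates without the string round-trip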


Another common one is converting booleans to "Y" and "N".


Can be worse. In the old codebase where I work, I found booleans stored as: S, s, si, Yes, yes, Si, 1 ... I managed to unify them to be 1 and 0, but I still keep seeing an old "Si" in some places.


I have, and it boggles the mind. You'd think Java was a string-oriented language.


I have had to do it where all of the member methods are private (or otherwise inaccessible to me) and toString() is the only useful way to get information out of the object.


Common example I’ve seen is checking if an IP is in a range. Most (JS) libs I’ve seen do something like regex or, more commonly, split on periods and then have a set of nested ifs. Versus if you have a proper IP type, you check if its long value is between ranges. (Not saying you can’t do this, but because so many times an IP is stored as its string representation, that seems to be the default.)
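
A rough sketch of the numeric-comparison version (helper names are made up; IPv4 only, no validation):

  // Parse the dotted-quad string once into its 32-bit value,
  // then range checks become plain number comparisons.
  function ipv4ToNumber(ip) {
    return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
  }

  function inRange(ip, start, end) {
    const n = ipv4ToNumber(ip);
    return n >= ipv4ToNumber(start) && n <= ipv4ToNumber(end);
  }

  inRange("10.0.3.7", "10.0.0.0", "10.0.255.255"); // true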


Also haven't seen it, but I could imagine it happening if someone desperately misses duck typing due to too many similar classes not under their control (can't make them share interfaces), and they have a lot of practice with e.g. json from/to but none with reflection. (I do enjoy a hearty dose of tojson in my println debugging.)


> They'll take a nice rich type, turn it into a string and then spend a bunch of code/time turning that string back into something else.

They find the strongly typed system "inflexible", so they fall back to a "stringly typed" one.


I think it’s just because everyone knows how to work with strings, it’s so tangible, and you can see what data a string contains right there in the print(thing).

You don’t even have to know what the type of the thing is to work on its string representation.


Java has (had?) such beauties as "InternalFrameInternalFrameTitlePaneInternalFrameTitlePaneMaximizeButtonWindowNotFocusedStates".

Now imagine a bit of reflection on that to do some string manipulation.

Good times!


Smalltalk, Objective-C and Swift top that, because the parameter names are part of the method name as well.


Why? Because it’s all they know.


Most of these actually come down to supporting logic in comments to turn things on/off, etc. Or enhancing parsing. Not so much what you describe.


One thing is that there is no deep comparison for objects. Say you want an object as a key in a Map - how would you go about it?

In C#, I would use a Tuple. Or override equals to do a deep check.


In JS there are a pair of Stage 1 proposals to increase support for richer/more complex keys in collections like Map: https://github.com/tc39/proposal-richer-keys

Strings seem really easy for composite keys and "deep comparison" of objects, but you could do more of what C# actually does under the hood, which is to compute simple hash numbers (getting back to the article's point that computers are often faster with number calculations than with string work).

C# does most of its equality "magic" with GetHashCode() (surprisingly not Equals() and obviously not ToString()). (Relatedly, this is why C# linters will yell at you if you override Equals but not also GetHashCode. Equals seems the more "useful" override but GetHashCode is the actual work horse.) Most implementations of GetHashCode (even for deep comparison situations) generally start with some random prime constant, XOR in bits of all the data, and return the integer you have left after all the XORs. You can do the same thing in JS today if you wish. Modern .NET now has a nice System.HashCode helper class for doing it quickly with a lot of good best practices baked in (a strong starting constant, for instance), so it's easy to forget the bad old days of writing all those XORs by hand, and .NET has always had GetHashCode as a required method of System.Object, so "recursion" of hashcodes is guaranteed to be simple. But those things aside, it's still easier to compute hash codes by hand in JS to use for comparisons than to do string manipulation, if you put your mind to it.
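
As a rough illustration (not the TC39 proposal; hashPoint is a made-up helper, and collisions are ignored here, so a real version would still compare the underlying values):

  function hashPoint(p) {
    let h = 17;               // arbitrary starting constant
    h = (h * 31 + p.x) | 0;   // mix in each field, keeping a 32-bit int
    h = (h * 31 + p.y) | 0;
    return h;
  }

  const names = new Map();
  names.set(hashPoint({ x: 1, y: 2 }), "first");
  names.get(hashPoint({ x: 1, y: 2 })); // "first" -- equal contents, equal key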


My Java is a bit rusty, but can't you define the hashing function for this class?


Can you write a deep comparison? For a particular object? Or, for many objects representable as JSON, use a generic JSON comparison?


The problem is how to make Map use the deep comparison. I can do it manually, yes. But if standard libraries are going to do a referential check, then string is the only option left.


Or … just don’t use that stuff.

Postcss is not necessary for custom properties. IE isn’t a thing anymore, and using css custom properties directly without a transpilation step is fine. The same thing applies to the for … of issue. Babel is not necessary anymore, all browsers support ES8. JSX is the only thing babel still does that offers real value, but that can be replaced by the runtime alternative developit/htm. We can just stop using babel and sidestep this whole issue.

I see a lot of energy going into making faster build tools, and I feel this energy is misdirected. We need to dramatically strip down the build tools as most of them have become largely redundant since the death of IE.


I share your opinion, but recently got very disappointed to find out that safari still needs lots of -webkit- prefixes; even for stuff seemingly available for ages now. So we’re not fully out of the pit yet, at least until the most valuable company in the world finally cares to fix their fucking browser (yes, dear Apple engineer, I’m talking to you!)


> recently got very disappointed to find out that safari still needs lots of -webkit- prefixes; even for stuff seemingly available for ages now.

Safari is fairly conservative in de-prefixing new features, but also sometimes beats Firefox (e.g. Container Queries and Container Query Units).

For reference, here's the caniuse.com list of Safari CSS features which need prefixing, compared to Firefox: https://caniuse.com/?compare=safari+16.1,firefox+107,ios_saf...


I don’t know what you’re using but most developers most likely don’t need “lots of -webkit- prefixes”.

In the link posted by the sibling commenter, I see 2-3 properties that could be useful to the general public. The rest is esoteric, never-ratified, or straight up user hostile (font-smooth)


Apple's app monopoly gets stronger when the browser gets weaker.


It feels kinda weird but a lot of the time you can just use the -webkit- prefixed versions and they'll work in firefox too.


It's like user agent strings. Developers didn't fix their user agent sniffing, so other browsers just lied about what they were so the website would run properly. Only now they were stuck with the lie as long as devs depended on the user agent string for stuff, leading to abominations like this one from Chrome on Android, where it basically claims to be every single browser out there.

Mozilla/5.0 (Linux; Android 12; Pixel 6 Build/SD1A.210817.023; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/94.0.4606.71 Mobile Safari/537.36

Devs were NOT supposed to use prefixes in production, but they did anyway (best viewed in Chrome). Unfortunately they then proceeded to NEVER UPDATE these as other browsers added support. Non-webkit browsers had the choice of interpreting the CSS or leaving their users with a broken experience.

Major browsers moved away from vendor prefixes between 2012 and 2016 (when webkit finally stopped), but it was too late. Browsers 50 years from now will still be supporting this garbage because of what devs did.

The key takeaway: Never give devs a tool you don't want to be stuck with forever. Also, can we please not do this yet again in the future?


> Never give devs a tool you don't want to be stuck with forever.

That's right. You can pry the <layer> element out of my cold, dead, hands.


Firefox added support for -webkit prefixes several years ago now... I think mostly to capture things written for mobile Safari but also Chrome I imagine.

I don't know if anybody really does other directions of that, though. There was probably no percentage in working with -moz prefixes for other browsers by the time this became an idea to do.


Prefixes for what?


You still need postcss or something like it to use css modules. The alternative of CSS-in-JS requires a build step to extract either a CSS file or inline critical styles for server rendering. The browser-native alternative of CSS Module Scripts doesn't work in Safari.

You still need babel, tsc, or swc to transform typescript to JS.

And then there are imports of files that aren't JS (images, fonts, stylesheets, etc).


The best no-build CSS approach I've found is using BEM notation to scope the CSS of each component, and then @import'ing the separate CSS files into a master stylesheet. When hosted over HTTP/2 those @imports are downloaded pretty efficiently. However, when doing that the Lighthouse score is impacted and it doesn't scale for large websites.

Typescript is indeed a given. There's no good way to do that without build tools. The question is whether typescript's benefits are worth the overhead of the build tooling. I tend to prefer it for libraries with strict APIs, but I don't experience as much of a benefit for web applications. I think libraries also benefit a lot from bundling/minifying, so there I would choose a full build tools approach. For small web applications the overhead of the build tooling is not worth the benefit IMHO.

And then when it comes to direct imports of images, fonts, svg, I don't think people should do that at all, even when using heavy build tools. YMMV.


@imports are invisible to the browser's preparser, as the stylesheet that contains them must be downloaded and parsed before the browser can discover them.

This will lead to slower FCP times, etc., particularly if the stylesheets aren't in cache.


Funny that you mention htm, because developit is Jason, who built Preact, and the author of the article is Marvin, also working on Preact. So I'm pretty sure the author knows about what is needed and the alternatives, and still sees use cases and just wants faster builds. Just look at his earlier article about running unit tests in 1s. That's where he is coming from.


> Installing Tailwind CSS as a PostCSS plugin is the most seamless way to integrate it with build tools like webpack, Rollup, Vite, and Parcel.

:(


> I see a lot of energy going into making faster build tools, and I feel this energy is misdirected. We need to dramatically strip down the build tools as most of them have become largely redundant since the death of IE.

What if we standardized browsers? If that happened, we'd easily get all of the power from the browser itself. It's the same story as what ES6 and later versions did to underscore and lodash (long live lodash).

This would make all build tools OBSOLETE


I recently set up a Three.js app with otherwise vanilla JS using an import map. So refreshing to have zero build step.


Isn't Babel still needed for metaprogramming, since JS does not have those capabilities itself?


Maybe don't use metaprogramming, then?

The Babel example problem in the article is a fascinating one, because that's someone's metaprogramming that's just lost in the ether of time and the labyrinthine tower of a Babel install someone's too afraid to break to just tear down. Not only is it an unnecessary downlevel transform in 2022, but it doesn't look like the output of any of the official, common Babel transforms, so who knows what metaprogramming someone thought was a good idea is delivering that from that configured stack?

As others have pointed out: JSX transforms have non-Babel options (also including Vite, esbuild, and Typescript).

As for other metaprogramming commonly done in Babel: I would point out that the Stage 0 Decorators that Angular was built on (not to be confused with the Stage 3 Decorators that might actually come to JS) have relied on very specific Typescript build options anyway, and the double-up that Angular's tools do by default of also feeding that to Babel is kind of silly.

One of those parentheticals also slightly buries the lede that there is a Decorators proposal at Stage 3 and likely to get added to Browsers "in the near future" for those that desperately want to do metaprogramming in JS to have metaprogramming in JS proper, no Babel needed. (There are Stage 1 proposals to offer even more meta- and reflection tools beyond that directly in the language.)

In 2022 though, it is worth asking what in JS needs metaprogramming? A lot of applications get by without it just fine.


How would I remove all console.log from my code automatically during a build?

How would I add syntactically nice things to JS?

How would I get my environment variable values set for the build into the code, which is to be run in a browser?

JS being JS, there seem to be no good facilities in the language itself to do that inside the language. My understanding is that Babel was created to do these things outside of the running script. Maybe there is another way that I am not aware of?


> How would I remove all console.log from my code automatically during a build?

Make it a lint warning?

Add a higher-level logging abstraction and/or library?

I know Webpack and Esbuild both have ways to define symbols like `console` at build time and get dead code elimination. No Babel needed.
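
As a concrete sketch (not necessarily the commenter's setup): recent esbuild versions also have a dedicated `drop` option for console calls, and `define` covers build-time values; the file names here are made up.

  require("esbuild").build({
    entryPoints: ["src/app.js"],
    bundle: true,
    outfile: "dist/app.js",
    drop: ["console"],                          // strip console.* calls entirely
    define: {
      "process.env.NODE_ENV": '"production"',   // inlined into the bundle at build time
    },
  });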

> How would I add syntactically nice things to JS?

What do you need in 2022?

I definitely understand that in 2014 there were a lot of nice polyfills and prollyfills to be found in the Babel ecosystem for syntax that made the language a lot nicer to work in, but in the last eight years most of them have landed in the language itself and are supported in nearly every browser. ES2015+ has so much of that goodness baked in, and every browser generally supports ES2015+ well (and so does Node, mostly). Most caniuse statistics say browsers are caught up to at least ES2020.

We are over and on the side of the hump that Babel helped collectively prepare us for.

Yes, there's still a usefulness there in testing future ideas, and it will probably always be a collaborative tool for helping with TC39 early-stage testing of proposals. But at this point, should your production builds really rely on anything before Stage 3 with TC39?

For stuff in Stage 3+ there's always Typescript as an alternative to Babel.

> How would I get my environment variable values set for the build into the code, which is to be run in a browser?

You do that in Babel? I've never heard of anyone doing that in Babel. I've seen all sorts of dynamic import strategies in Webpack and other builders. I've done some neat things with JSON files, and sometimes builder imports, and sometimes just good old fetch().


One of the best bang for your buck optimizations I did recently was add the following to our next.config.js (we're using V13):

  const swcOptions = require('next/dist/build/swc/options')
  const { getLoaderSWCOptions } = swcOptions
  swcOptions.getLoaderSWCOptions = function (...args) {
    const result = getLoaderSWCOptions(...args)
    result.jsc.target = 'es2022'
    return result
  }

This overrides the target https://swc.rs/ uses from the default of es5 to es2022, which sped up our quite heavy app by up to 3X in hot code paths, just by using newer JS syntax instead of falling back to 'transpiled' versions. Unfortunately Next.JS doesn't allow setting the target directly. My hunch is that this syntax in our hot path in particular was a big factor:

  function cloneAndSetProperty(obj, prop, value) {
    return { ...obj, [prop]: value }
  }


You can enable the experimental `browsersListForSwc: true,` option which uses your browserlist to determine the appropriate target. You can also disable `legacyBrowsers`

ref: https://nextjs.org/blog/next-12-2#other-improvements


Nice, it seems this is now the default in Next v13, just updated! (We were actually using v12.0 still)


On your cloneAndSetProperty, consider if you can use a prototype instead of a clone:

  function chainedPrototypePlusProperty(obj, prop, value) {
    const newObject = Object.create(obj);
    newObject[prop] = value;
    return newObject;
    // Alternatively: return Object.create(obj, { [prop]: { configurable: true, enumerable: true, value } });
  }
It depends entirely on what you’re doing with it (not mutating obj while you keep the returned object, not depending on properties being own), but in some cases this will be suitable, and depending on the data and what you’re doing, could be drastically faster (it’s O(1) rather than O(n) on the number of properties, and access after could be much of a muchness).


I've tried this and found it to be much slower


My own experience is that JS can be really, really fast if you manage to avoid confusing the JIT. But it's also very easy to make stuff slow and it can be hard to guess what kind of stuff will be an order of magnitude slower because it avoids a fast path.


Author here.

True, optimizing code for JS engines is certainly an art form of its own. What usually works for me is to write boring imperative code when performance is a requirement. That seems to be easily optimizable by current engines.

Also, thanks for sharing my article here on HN. It's the first time one of them has made it to HN.


I enjoyed reading your blogpost. Thanks for sharing!


And that's why profiling is the step 1 for any optimization.


Which is why this article is so good IMO.


> But it's also very easy to make stuff slow and it can be hard to guess what kind of stuff will be an order of magnitude slower

It's not that hard to guess. Just look at what most NPM/Electron programmers and guides recommend you do, and then do the opposite. I.e., think about what your program is doing, and write the most boring, 90s-style code you can. Works really well internally for the Firefox codebase, and has been for 20+ years (for longer than it has even been called "Firefox").

Relevant talk: <https://www.youtube.com/watch?v=p-iiEDtpy6I>

Relevant paper: <https://users.dcc.uchile.cl/~rrobbes/p/EMSE-features.pdf>

Relevant thread: <https://news.ycombinator.com/item?id=33790568>


For those who don't want to go through all that, there's basically 3 simple rules to make your code optimizable:

* Objects should have a fixed set of keys (no adding more and especially no `delete`) and the type of the value should never change (note: in JS, a type must NOT be a union of multiple types if you want it to optimize)

* Arrays must be of one type and a fixed length

* Functions must take a fixed number of arguments and the type of those arguments must never change. This is by far the most important if you want to get your function's code optimized.

And as a corollary from this article, it's not JS specific, but converting string -> float and float -> string isn't a cheap operation. I'd go so far as to say that it is the most expensive operation in a huge amount of the functions where it happens. Int to string is especially egregious because it requires a lot of division which is basically the slowest (common) thing you can do on a CPU.
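
A tiny sketch of the third rule, which is the one that bites most often (illustrative only):

  function add(a, b) { return a + b; }

  add(1, 2);       // the engine's type feedback sees only numbers: fast path
  add("a", "b");   // mixing in strings at the same function can push it onto
                   // a more generic, slower path for later calls too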


> Int to string is especially egregious because it requires a lot of division which is basically the slowest (common) thing you can do on a CPU.

Yeah. There's some great benchmarking and tips: https://www.zverovich.net/2013/09/07/integer-to-string-conve...

Godbolt example: https://godbolt.org/z/M4b353PKv of which the meat is

        imul    rcx, rcx, 1374389535
        shr     rcx, 37
        imul    edi, ecx, 100
        mov     r8d, edx
        sub     r8d, edi
        movzx   edi, WORD PTR .LC2[r8+r8]
        mov     WORD PTR [rsi], di
So if our example input is 12345, the first two instructions use the field-inverse property to compute "123" with multiply rather than divide (1 clock, latency 4), then multiply up again to get "12300", then subtract that to get "45". That can then be looked up at position 45+45 in the string "00010203040506070809101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899" to give you the two digits.


> Objects should have a fixed set of keys.

Declared in a fixed order if you want functions to remain monomorphic! One case where using classes is a bit safer than object literals.


It's important, but I purposely stepped around this to not complicate things.

In truth, the only likely situation from these rules is hand-copying configuration then manually transposing. This isn't likely to happen in really hot code paths.

This is the one place where JS helps us because lazy typers are encouraged to return object literals from factory functions and the return will almost always be one object returned at the bottom of the function.

It also (intentionally) bypasses manually adding elements to an object directly after the object is instantiated. In truth, doing this consistently and directly after it is created can usually be optimized too, but not necessarily, and describing this also complicates things (and varies from one JIT to the next).


This makes me wish "use strong" had taken off, at least as a strict mode for development. Helping you stay within the JIT's happy path.


> Arrays must be of one type and a fixed length

If performance is critical then use a TypedArray where possible.

Your other points are spot on for high performance JS.


I was messing about with some profiling and seemed to find that TypedArrays are not always faster than regular arrays carefully handled. And what's worse it seemed like a regression from node 16 (fast typed arrays) to 18 (slow typed arrays). I will have to dig up my profiles again and see if I can verify it. Posting this comment in the hope that someone who knows more than me has something to add!


It would be great if there was a tool that shows you where deoptimization is taking place. IIRC the Chrome Javascript profiler once flagged functions where optimization failed, but this feature is not available in current versions of the devtools performance tool. Although deoptimization is logged, I am not aware of a way to trace the entries back to the source code.


> in JS, a type must NOT be a union of multiple types if you want it to optimize)

Does that mean that tagged unions (functional programming-style data types) are inherently slow in JavaScript? Maybe there is a way for the TypeScript compiler to pass pragmas (via comments in generated code) to the interpreter.


They can have pitfalls is a better way to think about it. Inheritance has a similar penalty.

The potential problem is that JS engines only see the ‘shape’ of an object, that is, the members and member order. This is slightly more strict than TS’s structural types, where order doesn’t matter. So union variants and subtypes are each different shapes if they have different members or member order.

Shapes are important for function calls. Object property access is optimised based on the number of ‘shapes’ that have been seen, into the following tiers:

Monomorphic - It’s seen only 1 shape and can access the properties by offset directly.

Polymorphic - A small inline cache of several shapes. Access is a lookup into this. Slightly slower.

Megamorphic - A lookup into a global table. This is a performance cliff.

So if you have lots of functions that take a tagged union or base class *and at runtime* they see lots of different types passed in *and this is in a hot path* they can be a problem.
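
A small sketch of what 'shape' means here (V8 calls these hidden classes; exact behaviour varies between engines):

  const a = { x: 1, y: 2 };
  const b = { x: 3, y: 4 };   // same members, same order: same shape as `a`
  const c = { y: 4, x: 3 };   // different member order: a different shape

  function len(p) { return Math.sqrt(p.x * p.x + p.y * p.y); }

  len(a); len(b);             // property access in len stays monomorphic
  len(c);                     // a second shape shows up: now polymorphic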


Thanks for the explanation!


> and write the most boring, 90s-style code you can

I think this advice goes way back to before the 90s. I've also heard it called FIAL (Fortran In Any Language).

But, yeah. It's basically always the fastest thing possible in anything even approaching the mainstream.


In other words: JavaScript is fast, JavaScript developers are slow.


More like JS doesn't do anything to discourage writing slow code. Typescript should have been that tool IMO, but instead, it is so flexible that it doesn't do a thing to help developers understand that they are writing slow code.


I appreciate TypeScript's stance of being a non-optimizing compiler. Having two optimization layers would make things weird.


I don't think that's what the other commenter suggested. The point is that TypeScript should disallow (or at least make harder) to write code which is hard to optimize by the VM - e.g. allowing function parameters to be of multiple types.

But TS has a design constraint to be essentially a superset of JavaScript so it would be difficult to realize this goal.


I had the impression my code gets less polymorphic when I use TypeScript.


If only.

This is polymorphic because union types are polymorphic.

    let stringProcessor = (foo: string | number): string => {
      //do stuff with foo
      return "string" + foo
    }
It's super easy to make non-optimized arrays and objects this way.

    type MyFoo = string | number //somewhere in the codebase

    //you will get zero hints that this is a polymorphic function
    let formatFoos = (foos: MyFoo[]) => foos.map((foo) => `${foo} deoptimized`)
    
Any use of optional object properties also immediately leads to polymorphism too. I doubt you can find ANY significant TS codebase where this isn't used pervasively.

    interface Foo {
      foo: MyFoo  //polymorphic
      bar: string
      baz?: number //this is polymorphic too
    }

    let doStuff = (obj: Foo): string => {
      //do stuff with obj
      return "string"
    }
    doStuff({foo: 1, bar: "abc"})
    doStuff({foo: 1, bar: "abc", baz: 123}) //we're now polymorphic
Even if we remove the `?`, it can still be polymorphic because the TS compiler doesn't reorder object properties when compiling.

    interface Foo {
      bar: string //all basic types, so we're good?
      baz: number //no more optional, so we're fine right?
    }

    let doStuff = (obj: Foo): string => {
      //do stuff with obj
      return "string"
    }
    doStuff({baz: 123, bar: "abc"})
    doStuff({bar: "abc", baz: 123}) //we changed the order, so now we're polymorphic


I'm very surprised the property ordering would matter. It seems like the JS engine itself could easily fix that, no? Silently re-order the properties to fit its own optimization.

Do you have a source for the above info? (both because I'm a little skeptical of some parts of it, but also because I'd love to learn more about this topic and I've had trouble finding concrete info in the past)


JS specifies that the order in which keys are added makes a difference. For this reason, two objects whose keys are entered in different orders are fundamentally different, non-interchangeable object types.
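
You can see the order being preserved directly (this applies to string keys; integer-like keys are enumerated in ascending numeric order first):

    Object.keys({ bar: "abc", baz: 123 }); // ["bar", "baz"]
    Object.keys({ baz: 123, bar: "abc" }); // ["baz", "bar"]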


You could preserve that information and still have the same native struct, since it only becomes relevant when iterating the keys


This seems like a moving target though, and I still don’t think the typescript compiler should be responsible for it. I think a linter could make sense though for obviously bad cases.


A linter can't tell if it's polymorphic -- only a compiler which is actually analyzing the types.


No, it's possible to write dodgy code in other languages too - 2 of the examples in this article are in Go rather than JS.


Just because it can be faster doesn't mean it's fast.


When you get down to basics, JS has arrays, structs, ints, floats, and strings + regex (all backed by very fast C++ implementations). If you stick to these things as if they were C analogs, the performance potential is very high.

Viewed in those terms, it shouldn't be surprising that JS can be incredibly fast (faster than C in some cases).
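
For instance, one flavour of that "C analog" style (just a sketch):

    // Preallocated storage, plain indexed loops, no per-element closures.
    const n = 1_000_000;
    const values = new Float64Array(n);
    for (let i = 0; i < n; i++) values[i] = Math.sqrt(i);

    let sum = 0;
    for (let i = 0; i < n; i++) sum += values[i];

Tight, monomorphic, allocation-free loops like this are exactly the kind of thing the JIT handles well.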


"really, really fast"

Compared to what? Polar bears?

Here: is it faster than a warm JVM?


V8 is almost never faster than a warm JVM, especially once you get to heap sizes bigger than a couple GB.

You can't even store more than 14m or so keys in a map in Node w/o reaching for an off heap solution due to hard limits in the runtime...


If needed you can make your own map that doesn't let you down: https://www.npmjs.com/package/infinity-map


I've used megahash with great success:

https://github.com/jhuckaby/megahash


I would freeze before the JVM would warm up.


Jokes aside, maybe your code cache size is too small ;p


    // Does the same as `Number.prototype.toFixed` but without casting
    // the return value to a string.
    function toFixed(num, precision) {
        const pow = 10 ** precision;
        return Math.round(num * pow) / pow;
    }
I know doubles are big, but doesn't this combination of multiply-and-divide introduce the possibility of precision issues that, I'm assuming, don't apply to `toFixed`?

Unless I'm wrong, I would be very hesitant to use this version of the method when I don't control the input.


There certainly are issues with precision:

    toFixed(1000.005, 2); // returns 1000.01
    toFixed(1.005, 2)     // returns 1
Rounding to a fixed number of decimal places is extremely tricky, much trickier than it appears at first glance.

This is beaten to death at Stackoverflow, but I can tell you that almost all (or all?) of the solutions provided don't work for all cases:

https://stackoverflow.com/questions/11832914/how-to-round-to...


This is an argument for having these fundamentals implemented in the language's standard library IMO.


This is processing numbers that appear in SVG files, which are basically coordinates; they aren't that big.


This need not be true: you can put objects and viewboxes wherever you like in SVG. <svg viewBox="1234567890 1234567890 100 100"> is perfectly valid.

Many years ago (Adobe SVG Viewer days) my Dad generated SVG files from geo/map data and at first used the original coordinates or something, but float error made it untenable—some viewers would end up quantising some things (e.g. the shape, the viewbox, individual nodes, even once you applied a transform or SMIL animation to them, the hover hitboxes—I don’t remember the particular details, but it was a few things) so they’d end up off by as much as hundreds of pixels in the most extreme projections. Therefore he ended up rezeroing to work around this, transforming into a more friendly space before producing the file.

In the context of SVGO: really, the whole decimal places thing is stupidly bad, because decimal places can be of wildly differing importance in different contexts, and it wouldn’t be that hard to figure most of them out. In some situations a third decimal place will be more valuable than another number’s hundreds place. And SVGO’s approach is particularly bad because it treats each number individually, not attempting to compensate for the error it introduces, and so the presence of fine detail can easily completely destroy things due to untreated cumulative error.


I wouldn't expect SVGs that require double precision to render anywhere properly. Most renderers these days use 32-bit floats simply because it's easier to get things to render on the GPU that way, so at very large or very small scales things tend to break in fun ways.


This toFixed is for when the precision is below 16. JS just doesn't work with more: you literally can't have more than 16 digits after the decimal, JS will round it (very poorly btw).

BTW, if one needs to round to more than 16, you can use toLocaleString and pass minimum and maximum fraction digits, which have a limit of 20. Ain't much, but you could not expect more from JS :D
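
For the record, that looks something like this (a sketch; the exact digits you get back will depend on the engine/ICU build):

    const x = 0.1 + 0.2;

    // Plain string conversion gives the shortest round-trip form:
    String(x); // "0.30000000000000004"

    // Asking Intl for up to 20 fraction digits (grouping turned off so the
    // result reads like a plain number):
    x.toLocaleString("en-US", { maximumFractionDigits: 20, useGrouping: false });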


It's crazy that their patch in Semver (something that doesn't really affect the final build) saved them 4.7s in their build! I never would have thought a package like that would have such an effect on build time.


Author here.

Yeah, that was surprising to me too. You always discover something you would have never thought of on your own. It's what makes profiling such an enjoyable task for me!


I'm curious to see the project that you tested this against. To me it seems like you either tested this against a rather large existing code base you work with, or you code-generated something pretty huge, as I've never seen such slow build times. If possible, could you put it up on GitHub or GitLab?


The old trap of tech debt. Basically, back when these tooling projects started, stuff like casting values around all the time, un-optimized tree algorithms, inefficient value objects or whatever didn't matter, because the projects built with them were so small that the inefficiencies were invisible. And once the projects got bigger, the inefficiencies weren't really looked at, for three reasons: the obvious "if the source gets larger, the build time goes up as well", the fact that people simply got more powerful hardware - and finally:

Because most JS developers simply aren't your old-guard neckbeard coders with decades of experience under their belt. They're fresh graduates of coding academies that don't teach stuff like flame graphs or performance tracing in general, and they are simply used to things taking a lot of time. And fwiw, knowledge of optimization techniques isn't really widespread even among senior developers. Modern computers are simply "fast enough" for many people (and especially corporate beancounters) not to care.


There's another issue at play here.

Nobody makes money from making SVGO faster.

Your customers don't run SVGO and only care about the performance of the output. Devs care, but not enough to do anything about it. Companies care about dev productivity, but the cost of the hours to run through the profiler isn't something they are willing to pay for.

And finally, there aren't enough maintainers/contributors on these projects. The core team is a handful of volunteers. When they have spare time for the project, it's almost always going to be used to keep up with web standards, add missing features, or chase down bugs that impact the final output. They know performance matters, but there's always something more important to do instead.


Very interesting, and great work contributing to the open source community!

Can you elaborate on how you did the profiling to identify the (potentially) slow parts?


Author here.

I mainly used node's `--cpu-prof` flag to generate a profile. I tried loading the trace into Chrome DevTools at first, but wasn't able to load the files, supposedly because they are too big. So I used https://www.speedscope.app/ instead, which loaded them without any issues and is very snappy overall. What's more, it allows you to display a left-heavy flamegraph, which makes it much easier to spot functions taking up a lot of time. Speedscope is an amazing tool!

For webpack based projects the traces were missing quite a bit of data and I had more luck with their own profiling plugin: https://webpack.js.org/plugins/profiling-plugin/


Awesome, thanks! I too have found Chrome's profiler lacking, so will definitely have to check out https://www.speedscope.app/.


Speedscope is the best.


Yes this would be helpful.

JavaScript programmers, in my experience, don't really use profilers like "traditional" programmers do.

Maybe it's as much the tooling as it is the culture?


The profiler in chrome is pretty good, and can be used fairly easily for both frontend and backend profiling.

If said programmer doesn't use it, it's because they lack engineering maturity and competency.


> it's because they lack engineering maturity and competency

There's no need to escalate into personal attacks.

Chrome's JS profiler is great, and is perfect for the kind of thing the OP is doing. But it's very rare that CPU cycles are a material problem in customer-facing JavaScript code, and it would be easy to get quite a few years into a career without ever getting exposure to it. I've had exactly one (very unusual) job where I used it frequently. The rest of the time, it pretty much collects dust.


Profilers also lie. Learning the ways in which they are wrong is its own skillset, and the current generation of profilers is missing bits of critical information. Which means that not only do users misunderstand the value of profiling data, but so do the profiler writers.

If the writers can't get it right, good fucking luck to someone with 3 years of programming experience.


I've done a lot of profiling over the years, and you're right.

In my experience though, it is correct _enough_ in the vast majority of cases to be able to gain significant speedups. You're not going to be eking cache misses out of your hot paths without understanding what you're looking at, but you will find things like "we spend more time converting to strings and back than we do actually compressing" or "we're spending 100ms of every request loading a config file from S3 that could be cached". If we eliminated the low-hanging fruit in some of our most used tools, the difference would be significant.


This may be controversial but I now believe that flame charts are hurting more than helping. These charts are meant to display problems with sequential code, but they end up obfuscating problems with asynchronous code. The evolution toward async-await semantics in Javascript and other languages is 'breaking' current generation profilers.

And this is controversial, but also true: 'low-hanging fruit' is death by a thousand cuts. It's a hill-climbing algorithm, and as we've all known for generations, greedy algorithms get stuck in local maxima. And if you know anything about farming, only amateurs pick the low-hanging fruit. Real growers harvest an entire tree at a time; otherwise you waste a ton of fruit.

Going module by module instead of chopping off tall tent poles lets you achieve much better results. One of the complaints of the Premature Optimization crowd is the potential for regressions in making changes to code that already 'works'. Refactoring reduces that possibility quite a bit. Module- or Concern-Oriented optimization mops up a lot of the rest. You can get better QA fidelity when making 10 changes in one area of functionality than you can by limiting yourself to 4 but spread across the code base.

Why I think more people don't use it, 1) I seem to be the sole proponent, 2) it cuts both ways regarding Instant Gratification. You will know about large problems that you aren't fixing until later. Later answers are often better answers. And on the other edge, it also gets to a bunch of short tent poles that will never make it out of the backlog later in the project. Nobody is going to give you permission to go around making 0.5% performance improvements, and it's a lot of wear and tear to do such work off the books. This is the death by 1000 cuts failure mode. I think the last time I saw someone bragging about a < 1% improvement was the compressed pointer discussion in the V8 blog. I can't even remember the previous example. You can deliver a 16% improvement in one concern instead of the 13% you get from the biggest wins, at very little additional cost or risk. You can keep doing that quarter after quarter, for years, and at the end you've gotten 25% farther than you would have by going the 'easy' route.

25% doesn't matter until it does. Inflection points tear up your project roadmap and disrupt plans. They force (risky) architectural changes farther up in the backlog, and without the benefit of these other little changes you've done along the way, because the best performance improvements also improve code quality, making other changes easier, not harder.


> Profilers also lie.

I've had the Chrome profiler tell me I was spending significant time in a function that was never called at all. Not that the idea of that function being called was unusual; it certainly was a possibility that would have indicated a bug in our code. But the code was correct and the function never ran. A breakpoint or log statement in it never executed. Yet it was the top culprit in the profile. To this day I still don't quite know what happened there.

Such things also undermine the trust one has in certain tools.


Sometime after I became aware of “Everything I Need to Know I Learned in Kindergarten”, I wrote an article with a cheeky title about counting things.

Basically one of the questions you should always ask is if the results match your expectations. You may find that some methods are calling the slow function that you didn’t expect, and refactoring the code to pass the answer to your business logic may save a lot more computation than making low level changes inside the function. Plus you just learn more about the architecture of the system by following this mental exercise. If ten requests are made I expect this routine to be called 51 times, so why is it being called 213?

Also when I have a smoking gun I like to pull the whole assembly out for the benchmarking phase. If you can’t reproduce the slowdown outside of the live environment it may mean the problem is split with some other place in the code. Cache poisoning, high GC overhead, lock contention, etc.

Each time you successfully pull a piece out, you get the opportunity to fix other problems in the vicinity as well, amortizing the cost of that effort. It’s also a canary for when your coworkers who are overfond of tight coupling get their mitts on the code.


> Profilers also lie

Yep, it becomes obvious when the Chrome profiler tells you that the CPU has spent a sizable amount of time on a javascript comment, which it sometimes actually does.


It's a reflection of their experience and tenure as a programmer, not a personal attack. The fact that it applies to most, if not all, JavaScript developers is a coincidence.


And I wouldn't fault a senior engineer's competency for not being familiar with JS profiling. Like I said, you could easily have a decade's worth of front-end jobs without it ever becoming relevant.


Sucrase is faster than SWC and esbuild in single-threaded performance and something like 20x faster than babel.

https://github.com/alangpierce/sucrase

The real issue isn't JS, but the lack of profiling (as shown here), combined with wanting to be everything to everyone, leading to layers of bloated abstraction.


One slightly annoying thing about esbuild/swc is that they're faster by literally not being equivalent to Babel. esbuild intentionally does not emit ASTs, and SWC has their own flavor that is not compatible with Babel. Furthermore, both are written in non-JS languages, meaning any sort of plugin system either has to be written in Go/Rust respectively, or suffer massive performance hits from crossing the JS runtime boundary. SWC abandoned plugins in JS for this reason.

The dirty secret is, as Sucrase (https://github.com/alangpierce/sucrase) reveals, just that Babel is slow and the codebase really pays homage to the project name. Simply computing the config for transforms takes dozens of milliseconds on my machine as it does a bunch of filesystem reads.

So esbuild/swc etc. aren't gaining some magical compiled language speedup, they're just... not doing the dumb things that Babel is. Not ragging on Babel here, it's a 7 year old project originally written by a guy fresh out of high school. I bet a ground up rewrite of Babel could accomplish the same.


> The real issue isn't JS

Yet it should be acknowledged that JS encourages various inefficient patterns.

Comparing with Rust, with which I am very familiar, I think that you’d be less likely to have most of the problems (some a little less, some a lot less):

• isBlockIgnored: it might still be tempting to do it the /…/.test(rule.toString()) way, but you’d be less likely to do it that way for two reasons: working with the actual AST is a good deal more pleasant than it is in JavaScript (due to a better type system, pattern matching and iteration tools), and you have to go a little further out of your way to use regular expressions since they’re not just built into the language.

• strongRound: somewhat less likely, because its inefficiency is more obvious, and because the developer is less likely to even find format!("{:.1$}", data[i], precision).parse().unwrap() than +data[i].toFixed(precision).

• stringifyNumber: similar remarks on regular expressions, you’re more likely to manipulate the strings rather than bothering with regular expressions (because they are a bother, even if minor, unlike in JavaScript).

• monkeys: well, in this case the issue is JS; the whole thing would be a non-issue in Rust.

• _loop: again JS problems and a misapplication of old-JS/new-JS–mixing techniques.

• semver: in good design this would be somewhat less likely in Rust because it’s likely to shift the parsing invocations to the caller in a way that makes the inefficiency a little more likely to be noticed, but in practice this would probably be about as likely to happen in Rust as in JavaScript.

And these are fairly tame cases, all things considered. Simplifying a bit, JavaScript makes it easy to do very inefficient things by accident, but Rust helps in many (though certainly far from all) cases, making it so that some of those can be optimised so that they aren’t inefficient after all, some simply don’t compile or can’t be expressed in the first place, and some must be expressed in slightly different ways that make the inefficiency more likely to be noticed and avoided.

—⁂—

You also should really specify that Sucrase achieves its speed by doing something completely different. SWC and esbuild are explicitly doing basically what Babel does, and so comparisons between them and Babel are meaningful, but comparisons with Sucrase cannot be used to vindicate JavaScript performance. Ports of what the likes of Sucrase or Bublé do to Rust or even Go would blow the corresponding JavaScript library out of the water.


Any idea how well Sucrase compares when multiple threads are allowed?


You can run multiple processes of Sucrase if you want, at which point it will still be faster than threaded SWC/esbuild. This is currently up to the user, but hopefully the authors follow through on this issue and integrate an easier way for users to do this (though I don't know if they want that as core functionality).

https://github.com/alangpierce/sucrase/issues/730

JS JIT creators don't get enough credit for their string implementations. Not much comes close when you consider both the performance and the robustness they packed into the C++ implementations that JS is actually using.


Is it just me, or why isn't the total build time and percentage presented? Without that, the time savings seem relatively irrelevant.


Relative percentages would be nice for more context, but saving ~12.5 seconds waiting on my build is significant enough that even if it were a small percentage of total time, it would still be well-worth doing.

I think the real point here is that more profiling could give some great results without tons of work.


Thanks for the great writeup and the great work!

The fact that half the comments here are dismissive or condescending is such a sad indictment of HN culture :(


Author here.

Thanks for the kind words!


If you're serving SVG over gzip or similar, you'd be hard-pressed to sell me on the value of SVG optimization.

I'm also not keen on JavaScript minification. Most of the delay on the client is not in the transport, but in the parsing. For this reason, I do think tree-shaking has good value.


What's the profiler being used?


Looks like the one built into Chrome. You can open your browser and attach to a Node instance so you can debug and profile it.

Here's a basic How-to:

* Run node with `node --inspect <file>`

* Open Chrome and go to `chrome://inspect`

* After it finds your target, click the inspect button to launch a debugger.


It looks to me like the profile viewer is actually speedscope ( https://www.speedscope.app/ ). I find it nicer for exploring profiles compared with Chrome's built-in viewer.

To use with Node.js profiling, do the `node --inspect` and `chrome://inspect` steps, then save the profile as a .cpuprofile file and drag that file into speedscope.

Another thing I've found useful is programmatically starting/stopping the profiler using `console.profile()` and `console.profileEnd()`.
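
For example (the label and doTheExpensiveWork are made up; as far as I know it only records anything while a DevTools/inspector session is attached):

    // Only the work between these two calls shows up in the recorded profile;
    // the label just names the resulting profile.
    console.profile("hot-path");
    doTheExpensiveWork(); // hypothetical stand-in for whatever you want to measure
    console.profileEnd("hot-path");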


That is a great tool. Also, I learned about perfetto for the first time today, after doing JavaScript profiling for a few years.


Author here.

Answered this elsewhere here: It's the `--cpu-prof` argument with node and the resulting profile is loaded into speedscope.


Good stuff, but something was mis-transcribed. This line can't possibly work:

     for (const i = data.length; i-- > 0; ) {


Author here.

Whoops, good catch! That's definitely an error on my part. The original code uses `var` there. Thanks for pointing it out.


The original used `var` for both `i` and `rounded`. `let` would work here.


Really fascinating read. Thanks for sharing.


reminds me of a bug bounty type post :) would be nice to get awarded similarly


Speed on roblox


Most JavaScript packages come with a memory leak as well as ease of use, and as the number of packages increases, unfortunately the memory leaks increase exponentially.


Don't speed it up, just remove it. I just logged into AdWords after a long time, and OMG, the page is 178MB of JavaScript and a clunky hell that surely turns a lot of computers into space heaters. This madness has to stop.


With that logic, just stop using AdWords. It bloats the web.

JavaScript isn't going away tomorrow; even if a replacement were released tomorrow, it would take years for all of the web to move.


I will. I set an end date for the campaign and don't plan to use it ever again until I am forced to (and maybe alcohol will help).

I'm not arguing to remove JavaScript altogether, just to stop this bloat.


>> Most popular libraries can be sped up by avoiding unnecessary type conversions or by avoiding creating functions inside functions.

I stopped reading there. This performance overhead is insignificant. The way to speed up libraries is to keep time complexity low; e.g. don't search through every item O(n) when you could just have done a constant time lookup O(1)... Or don't have nested for loops with O(n^2) complexity when the problem could have been solved with two non-nested loops in O(n) time (e.g. using a Set or Object for lookup).
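
Roughly the kind of change I mean (names made up):

    const ids = ["a", "b", "c"];
    const items = [{ id: "a" }, { id: "z" }];

    // O(n*m): re-scans `ids` for every item
    const slow = items.filter((item) => ids.includes(item.id));

    // O(n + m): build a Set once, then each lookup is ~constant time
    const idSet = new Set(ids);
    const fast = items.filter((item) => idSet.has(item.id));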


The article describes significant time savings from these changes.


All the issues mentioned are related to the usage of shitty frameworks and build tools. The problem is not caused by a single regex; the real issue is likely that a regex is being created too many times in a loop instead of being created once and re-used. Related to neglected time complexity.
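
To be concrete, the difference I mean is roughly this (names made up):

    const lines = ["item1", "item2", "other"];
    const prefix = "item";

    // Recompiles the pattern on every iteration:
    let slowCount = 0;
    for (const line of lines) {
      if (new RegExp(`^${prefix}\\d+`).test(line)) slowCount++;
    }

    // Compile once and reuse (no /g flag, so there's no lastIndex state to worry about):
    const re = new RegExp(`^${prefix}\\d+`);
    let fastCount = 0;
    for (const line of lines) {
      if (re.test(line)) fastCount++;
    }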

> Config parsing as a whole takes 4.3 seconds

WTF! If parsing a config file takes more than 20 milliseconds, you've got some serious issues with your project... There is no excuse. Even if your project is highly complex and feature-rich, you can just break it up into smaller sub-projects.

This article doesn't give any useful info because the project it's analyzing is just so bad.

It's like watching a turtle complete a 100-meter race and then taking away the lesson that crawling on one's stomach isn't a great strategy for a human to win the 100-meter race at the Olympics. Duh!

The whole project is a giant flaming pile of garbage. There is no interesting lesson to learn from it. Functions named 'monkeys' or 'perItem' or 'strongRound'?... Class named 'ConditionalRuleset' (WTF is a conditional ruleset, why would any project need it??? This abstraction makes no sense). What about 'isBlockIgnored(ruleOrDeclaration)'? Sounds like unnecessary complexity bloat when devs keep inventing new abstractions to give themselves more billable hours of work.

I would just dump all these bloated libraries and frameworks and start the project from scratch.


Your comment is sort of selfish, because you don't seem to understand that a good portion of the JS ecosystem is built on top of the libraries the author has touched. Welcome to the real world.


I know much of the JS ecosystem sucks. Don't blame me for industry leaders brainwashing and coercing us into using these horrible frameworks and libraries... And then making it into a mono-culture which shuns all alternatives.

You don't have to use them; you can find a different job at a smaller company. There are still companies doing it right and there are still good, lightweight tools out there; they're just hard to find.


20ms for config parsing is an eternity, really.


Over 3 seconds for booting up is fine, heck, sometimes a few minutes is OK, but just to parse the config file? No way. If it takes that long to parse, it's not even a config file; it's a monster. I bet it has many other problems beyond performance.


That's true to an extent. Many important optimizations will be performed by the JIT compiler, but only as long as the programmer doesn't make it hard to do so! The JIT compiler has to strike a delicate compromise between its own execution time and the performance of the code it generates. Type inference (to optimize object accesses), escape analysis (to move allocations from the heap to the stack and to optimize away closures), and other expensive interprocedural analyses are often not executed, since the user might navigate to another website before they have paid off.


Any tools to help analyze this type of thing (e.g. what is and is not optimal for the JIT)? I've heard the term "deopt" but am not sure how to get visibility into that.


The JVM can be instructed to emit diagnostics regarding JIT compilation. Dunno if similar things are possible with JavaScript engines. Apart from that, profiling. Coding style is being discussed to death in other comment threads here. In short, "boring code" (few closures and functional idioms, no reflection, careful use of dictionaries, no ultra-flexible function parameter lists, avoid regex unless absolutely required) avoids surprising (read: hard to optimize) behavior.

But maybe all of this is beside the point. It's unrealistic to do heavy-duty processing in a dynamic language and expect that the JIT will always save us. In comparison, Python programmers are under no illusions: their code is usually just "fast enough", and they call out to external libraries when heavy-duty processing is required. Modern browsers provide some of these optimized capabilities for JavaScript, and they should be used whenever possible.



