Game theory at work? Someone maintains legacy code for free that hosts thousands of sites and gets nothing but trouble (pride?) in return. Meanwhile the forces of the world offer riches and power in exchange for turning to the dark side (or maybe just letting your domain lapse and doing something else).
If security means every maintainer of every OSS package you use has to be scrupulous, tireless, and not screw up for life, not sure what to say when this kind of thing happens other than "isn't that the only possible outcome given the system and incentives on a long enough timeline?"
Kind of like the "why is my favorite company monetizing now and using dark patterns?" Well, on an infinite timeline did you think the service would remain high quality, free, well supported, and run by tireless, unselfish, unambitious benevolent dictators for the rest of your life? Or was it a foregone conclusion, only a matter of "when" not "if"?
It seems when proprietary resources get infected it's because hackers are the problem, but when open source resources get infected it's a problem with open source.
But there isn't any particular reason why a paid/proprietary host couldn't just as easily end up being taken over / sold to a party intending to inject malware. It happens all the time really.
Yes, the economic problem of absent rewards is exclusive to open source, and private software does not have it. It may have others, like an excess of rewards to hackers in the form of crypto ransoms, to the point that the defense department had to step in and ban payouts.
As long as business is not going as well as owners want, the same economic problem exists in private software too - in fact, private companies get acquired all the time as well, and they get shut down, causing a denial of service for many of their clients.
One difference is that closed-source software is usually much less efficient; I cannot imagine "100K+" customers from a commercial org with just a single developer. And when there are dozens or hundreds of people involved, it's unlikely that new owners would turn to outright criminal activity like malware; they are much more likely to just shut down.
Agreed, but if a company is making millions from the security of its software, the incentive is to keep it secure so customers stick with it. Remember the LastPass debacle: a big leak, and they lost many customers...
Directly security-focused products like LastPass are the only things that face any market pressure whatsoever on this, and that's because they're niche products for which security is the only value-add, marketed to explicitly security-conscious people and not insulated by a whole constellation of lock-in services. The relevant security threats for the overwhelming majority of people and organizations are breaches caused by the practices of organizations that face no such market pressure, including constant breaches of nonconsensually harvested data, which aren't even subject to market pressure from their victims in the first place.
Even for security-related products the incentives are murky. If they're not actually selling you security but a box on the compliance bingo then it's more likely that they actually increase your attack surface because they want to get their fingers into everything so they can show nice charts about all the things they're monitoring.
Aye. My internal mythological idiolect's trickster deity mostly serves to personify the game-theoretic arms race of deception and is in a near-constant state of cackling derisively at the efficient market hypothesis
I 100% agree. I feel a huge part of my responsibility as a software "engineer" is to manage complexity. But I feel I'm fighting a losing battle; most everyone seems to pull in the opposite direction.
Complexity increases your surface area for bugs to hide in.
I've come to the conclusion it's tragedy-of-the-commons incentives: People get promotions for complex and clever work, so they do it, at the cost of a more-complex-thus-buggy solution.
And yes, it's not just software, it's everywhere. My modern BMW fell apart, in many many ways, at the 7 year mark, for one data point.
But we have exceeded our ability to communicate the ideas and concepts, let alone the instructions of how to build and manage things.
Example: a junior Jiffy Lube high school dropout in 1960 could work hard and eventually own that store. Everything he would ever need to know about ICE engines was simple enough to understand over time… but now? There are 400 oil types, there are closed-source computers on top of computers, there are specialty tools for every vehicle brand, and you can’t do anything at all without knowing 10 different do-work-just-to-do-more-work systems. The high school dropout in 2024 will never own the store. Same kid. He hasn’t gotten dumber. The world just left him behind in complexity.
Likewise… I suspect that Boeing hasn’t forgotten how to build planes, but the complexity has exceeded their ability. No human being on earth could be put in a room and make a 747 even over infinite time. It’s a product of far too many abstract concepts in a million different places that have come together to make a thing.
We make super complex things with zero effort put into communicating how or why they work a way they do.
We increase the complexity just to do it. And I feel we are hitting our limits.
The problem w/ Boeing is not the inability of people to manage complexity but of management's refusal to manage complexity in a responsible way.
For instance, MCAS on the 737 is a half-baked implementation of the flight envelope protection facility found on modern fly-by-wire airliners (all of them, except the 737). The A320 had some growing pains with this; in particular, there were at least two accidents where pilots tried to fly the plane into the ground, expected the flight envelope protection system to stop them, and crashed anyway. Barring that bit of perversity right out of the Normal Accidents book, people understand perfectly well how to build a safe fly-by-wire system. Boeing chose not to do that, and they refused to properly document what they did.
Boeing chose to not develop a 737 replacement, so all of us are suffering: in terms of noise, for instance, pilots are going deaf, passengers have their head spinning after a few hours in the plane, and people on the ground have no idea that the 737 is much louder than competitors.
Okay but your entire comment is riddled with mentions of complex systems (flight envelope system?) which proves the point of the parent comment. "Management" here is a group of humans who need to deal with all the complexity of corporate structures, government regulations, etc.. while also dealing with the complexities of the products themselves. We're all fallible beings.
Boeing management is in the business of selling contracts. They are not in the business of making airplanes. That is the problem. They relocated headquarters from Seattle to Chicago and now DC so that they can focus on their priority, contracts. They dumped Boeing's original management style and grafted on the management style of a company that was forced to merge with Boeing. They diversified supply chain as a form of kickbacks to local governments/companies that bought their 'contracts'.
They enshittified every area of the company, all with the priority/goal of selling their core product, 'contracts', and filling their 'book'.
We are plenty capable of designing engineering systems, PLMs to manage EBOMs, MRP/ERP systems to manage MBOMs, etc., to handle the complexities of building aircraft. What we can't help is the human desire to prioritize enshittification if it means a bigger paycheck. Companies no longer exist to create a product; the product is becoming secondary and tertiary in management's priorities, with management expecting someone else to take care of the 'small details' of why the company exists in the first place.
Boeing is a kickbacks company in a really strange way. They get contracts partly by including agreements to source from the contractee's local area. Adding complexity for the sake of contracts and management bonuses, not efficiency, not redundancy, not expertise. Add onto that a non-existent safety culture and a non-manufacturing, non-aerospace-focused management philosophy grafted on from a company that failed and had to be merged into Boeing, replacing the previous Boeing management philosophy. Enshittification in every area of the company. Heck, they moved headquarters from Seattle to Chicago, and now from Chicago to DC. Prioritizing being where the grift is over, you know, being where the functions of the company are, so that management has a daily understanding of what the company does. Because to management what the company does is win contracts, not build aerospace products. 'Someone else' takes care of that detail, according to Boeing management. Building those products is now secondary/tertiary to management.
I did ERP/MRP/EBOM/MBOM/BOM systems for aerospace. We have that stuff down. We have systems for this kind of communication down really well. We can build, within a small window, an airplane with thousands of parts with lead times from 1 day to 3 months to over a year for certain custom config options, with each part's design/FAA approval/manufacturing/installation tracked and audited. Boeing's issue is culture, not humanity's ability to make complex systems.
But I do agree that there is a complexity issue in society in general, and a lot of systems are coasting on the efforts of those that originally put them in place/designed them. A lot of government seems to be this way too. There's also a lot of overhead for overheads sake, but little process auditing/iterative improvement style management.
Ironically I think you got that almost exactly wrong.
Avoiding "cowboyism" has instead lead to the rise of heuristics for avoiding trouble that are more religion than science. The person who is most competent is also likely to be the person who has learned lessons the hard way the most times, not the person who has been very careful to avoid taking risks.
And let me just say that there are VERY few articles so poorly written that I literally can't get past the first paragraph, and an article that cherry-picks disasters to claim generalized incompetence scores my very top marks for statistically incompetent disingenuous bullshit. There will always be a long tail of "bad stuff that happens" and cherry-picking all the most sensational disasters is not a way of proving anything.
I'm predisposed to agree with the diagnosis that incompetence is ruining a lot of things, but the article boils down to "diversity hiring is destroying society" and seems to attribute a lot of the decline to the Civil Rights Act of 1964. Just in case anybody's wondering what they would get from this article.
> By the 1960s, the systematic selection for competence came into direct conflict with the political imperatives of the civil rights movement. During the period from 1961 to 1972, a series of Supreme Court rulings, executive orders, and laws—most critically, the Civil Rights Act of 1964—put meritocracy and the new political imperative of protected-group diversity on a collision course. Administrative law judges have accepted statistically observable disparities in outcomes between groups as prima facie evidence of illegal discrimination. The result has been clear: any time meritocracy and diversity come into direct conflict, diversity must take priority.
TL;DR "the California PG&E wildfires and today's JavaScript vulnerability are all the fault of Woke Politics." Saved you a click.
A more fundamental reason is that society is no longer interested in pushing forward at all cost. It's the arrival at an economical and technological equilibrium where people are comfortable enough, along with the end of the belief in progress as an ideology, or way to salvation somewhere during the 20th century. If you look closely, a certain kind of relaxation has replaced a quest for efficiency everywhere. Is that disappointing? Is that actually bad? Do you think there might be a rude awakening?
Consider: it was this scifi-fueled dream of an amazing high-tech, high-competency future that also implied machines doing the labour, and an enlightened future relieving people of all kinds of unpleasantries like boring work, thereby preventing them from attaining high competency. The fictional starship captain, navigating the galaxy and studying alien artifacts, was always saving planets full of humans in a desolate mental state...
My own interpretation of the business cycle is that growth causes externalities that stop growth. Sometimes you get time periods like the 1970s where efforts to control externalities themselves caused more problems than they solved, at least some of the time. (e.g. see the trash 1974 model year of automobiles, where they hadn’t figured out how to make emission controls work.)
I’d credit the success of Reagan in the 1980s at managing inflation to a quiet policy of degrowth that the Republicans could get away with because everybody thinks they are “pro business”. As hostile as Reagan’s rhetoric was towards environmentalism, note that we got new clean air and clean water acts in the 1980s, but that all got put on pause under Clinton, when irresponsible monetary expansion restarted.
> along with the end of the belief in progress as an ideology, or way to salvation somewhere during the 20th century.
That 20th century belief in technological progress as a "way to salvation" killed itself with smog and rivers so polluted they'd catch on fire, among other things.
Thank you for summarizing (I actually read the whole article before seeing your reply and might have posted similar thoughts). I get the appeal of romanticizing our past as a country, looking back at the post-war era, especially the space race with a nostalgia that makes us imagine it was a world where the most competent were at the helm. But it just wasn't so, and still isn't.
Many don't understand that the Civil Rights Act describes the systematic LACK of a meritocracy. It defines the ways in which merit has been ignored (gender, race, class, etc) and demands that merit be the criteria for success -- and absent the ability for an institution to decide on the merits it provides a (surely imperfect) framework to force them to do so. The necessity of the CRA then and now, is the evidence of absence of a system driven on merit.
I want my country to keep striving for a system of merit but we've got nearly as much distance to close on it now as we did then.
>Many don't understand that the Civil Rights Act describes the systematic LACK of a meritocracy. It defines the ways in which merit has been ignored (gender, race, class, etc) and demands that merit be the criteria for success
The word "meritocracy" was invented for a book about how it's a bad idea that can't work, so I'd recommend not trying to have one. "Merit" doesn't work because of Goodhart's law.
I also feel like you'd never hire junior engineers or interns if you were optimizing for it, and then you're either Netflix or you don't have any senior engineers.
FWIW Michael Young, Baron Young of Dartington, the author of the 1958 book The Rise of the Meritocracy, popularised the term, which rapidly lost the negative connotations he put upon it.
He didn't invent the term though; he lifted it from an earlier essay by another British sociologist, Alan Fox, who apparently coined it two years earlier in a 1956 essay.
Everything has become organized around measurable things and short-term optimization. "Disparate impact" is just one example of this principle. It's easy to measure demographic representation, and it's easy to tear down the apparent barriers standing in the way of proportionality in one narrow area. Whereas, it's very hard to address every systemic and localized cause leading up to a number of different disparities.
Environmentalism played out a similar way. It's easy to measure a factory's direct pollution. It's easy to require the factory to install scrubbers, or drive it out of business by forcing it to account for externalities. It's hard to address all of the economic, social, and other factors that led to polluting factories in the first place, and that will keep its former employees jobless afterward. Moreover, it's hard to ensure that the restrictions apply globally instead of just within one or some countries' borders, which can undermine the entire purpose of the measures, even though the zoomed-in metrics still look good.
So too do we see with publicly traded corporations and other investment-heavy enterprises: everything is about the stock price or other simple valuations, because that makes the investors happy. Running once-venerable companies into the ground, turning mergers and acquisitions into the core business, spreading systemic risk at alarming levels, and even collapsing the entire economy don't show up on balance sheets or stock reports as such and can't easily get addressed by shareholders.
And yet now and again "data-driven" becomes the organizing principle of yet another sector of society. It's very difficult to attack the idea directly, because it seems to be very "scientific" and "empirical". But anecdote and observation are still empirically useful, and they often tell us early on that optimizing for certain metrics isn't the right thing to do. But once the incentives are aligned that way, even competent people give up and join the bandwagon.
This may sound like I'm against data or even against empiricism, but that's not what I'm trying to say. A lot of high-level decisions are made by cargo-culting empiricism. If I need to choose a material that's corrosion resistant, obviously having a measure of corrosion resistance and finding the material that maximizes it makes sense. But if the part made out of that material undergoes significant shear stress, then I need to consider that as well, which probably won't be optimized by the same material. When you zoom out to the finished product, the intersection of all the concerns involved may even arrive at a point where making the part easily replaceable is more practical than making it as corrosion-resistant as possible. No piece of data by itself can make that judgment call.
Well, on the web side, it'd be a lot less complex if we weren't trying to write applications using a tool designed to create documents. If people compiled Qt to WASM (for instance), or for a little lighter weight, my in-development UI library [1] compiled to WASM, I think they'd find creating applications a lot more straightforward.
Most apps don’t need to be on the web. And the ones that need to be can be done with the document model instead of the app model. We added bundles of complexity to an already complex platform (the browser).
I don't think there's any. Too many luminaries are going to defend the fact that we can have things like "poo emojis" in domain names.
They don't care about the myriad of homograph/homoglyph attacks made possible by such an idiotic decision. But they've got their shiny poo, so at least they're happy idiots.
> Too many luminaries are going to defend the fact that we can have things like "poo emojis" in domain names. They don't care about the myriad of homograph/homoglyph attacks made possible by such an idiotic decision.
There is nothing idiotic about the decision to allow billions of people with non-latin scripts to have domain names in their actual language.
What's idiotic is to consider visual inspection of domain names a neccessary security feature.
DNS could be hosted on a blockchain, with each person using their own rules for validating names, and rejecting, accepting or renaming any ambiguous or dangerous part of the name, in a totally secure and immutable way.
Blockchain has the potential to be the fastest and cheapest network on the planet, because it is the only "perfect competition" system on the internet.
"Perfect competition" comes from game theory, and "perfect" means that no one is excluded from competing. "Competition" means that the best performing nodes of the network put the less efficient nodes out of business.
For the moment unfortunately, there is no blockchain which is the fastest network on the planet, but that's gonna change. Game theory suggests that there will be a number of steps before that happens, and it takes time. In other words, the game will have to be played for a while, for some objectives to be achieved.
UTF-8 and glyphs are not related to supply chains, and that's a little bit off topic, but I wanted to mention that there is a solution.
in a strange way, this almost makes the behavior of hopping onto every new framework rational. The older and less relevant the framework, the more the owner's starry-eyed enthusiasm wears off. The hope that bigcorp will pay $X million for the work starts to fade. The tedium of bug fixes and maintenance wears on, and the game theory takes its toll. The only rational choice for library users is to jump ship once the number of commits and hype starts to fall -- that's when the owner is most vulnerable to the vicissitudes of Moloch.
> in a strange way, this almost makes the behavior of hopping onto every new framework rational.
Or maybe not doing that and just using native browser APIs? Many of these frameworks are overkill and having so many "new" ones just makes the situation worse.
Many of them predate those native browser APIs. Polyfills, the topic at hand, were literally created to add modern APIs to all browsers equally (most notably old Safaris, Internet Explorers, etc.).
Good point. What's often (and sometimes fairly) derided as "chasing the new shiny" has a lot of other benefits too: increased exposure to new (and at least sometimes demonstrably better) ways of doing things; ~inevitable refactoring along the way (otherwise much more likely neglected); use of generally faster, leaner, less dependency-bloated packages; and an increased real-world userbase for innovators. FWIW, my perspective is based on building and maintaining web-related software since 1998.
to be fair there is a whole spectrum between "chasing every new shiny that gets a blog post" vs. "I haven't changed my stack since 1998."
there are certainly ways to get burned by adopting shiny new paradigms too quickly; one big example in web is the masonry layout that Pinterest made popular, which in practice is extremely complicated to the point where no browser has a full implementation of the CSS standard.
To be fair, when it comes to React, I don't think there is a realistic "new shiny" yet. NextJS is (was?) looking good, although I have heard it being mentioned a lot less lately.
Perhaps. I view it as the squalor of an entirely unsophisticated market. Large organizations build and deploy sites on technologies with ramifications they hardly understand or care about because there is no financial benefit for them to do so, because the end user lacks the same sophistication, and is in no position to change the economic outcomes.
So an entire industry of bad middleware created from glued together mostly open source code and abandoned is allowed to even credibly exist in the first place. That these people are hijacking your browser sessions rather than selling your data is a small distinction against the scope of the larger problem.
Tea is not the “replacement for homebrew” apart from the fact that the guy that started homebrew also started tea. There’s a bunch of good reasons not to use tea, not least the fact that it’s heavily associated with cryptocurrency bullshit.
Alternatively, if you rely on some code then download a specific version and check it before using it. Report any problems found. This makes usage robust and supports open source support and development.
I'm afraid this is hitting the other end of inviolable game-theory laws. A dev who is paid for features and business value wants to read, line by line, a random package upgrade from version 0.3.12 to 0.3.13 in a cryptography or date lib that they likely don't understand? And this should be done for every change of every library for all software, by all devs, who will always be responsible, not lazy, and very attentive and careful.
On the flip side there is "doing as little as possible and getting paid" for the remainder of a 40 year career where you are likely to be shuffled off when the company has a bad quarter anyway.
In my opinion, if that was incentivized by our system, we'd already be seeing more of it, we have the system we have due to the incentives we have.
Correct. I don't think I have ever seen sound engineering decisions being rewarded at any business I have worked for. The only reason any sound decisions are made is that some programmers take the initiative, but said initiative rarely comes with a payoff and always means fighting with other programmers who have a fetish for complexity.
If only programmers had to take an ethics oath so they have an excuse not to just go along with idiotic practices.
Then there are the programmers who read on proggit that “OO drools, functional programming rules” or the C++ programmers who think having a 40 minute build proves how smart and tough they are, etc.
> Report any problems found. This makes usage robust and supports open source support and development.
Project maintainers/developers are not free labor. If you need a proper solution to any problem, make a contract and pay them. This idea that someone will magically solve your problem for free needs to die.
Vendoring should be the norm, not the special case.
Something like this ought to be an essential part of all package managers, and I'm thinking here that the first to adopt it should be the thousands of devs cluelessly using NPM around the world:
We've seen a lot more attacks succeed because somebody has vendored an old vulnerable library than supply chain attacks. Doing vendoring badly is worse than relying on upstream. Vendoring is part of the solution, but it isn't the solution by itself.
Not alone, no. That's where CI bots such as Dependabot help a lot.
Although it's also worrying how we seemingly need more technologies on top of technologies just to keep a project alive. It used to be just including the system's patched headers & libs; now we need extra bots surveilling everything...
Maybe a linux-distro-style of community dependency management would make sense. Keep a small group of maintainers busy with security patches for basically everything, and as a downstream developer just install the versions they produce.
In the old ways, you mostly relied on a few libraries that each solved a complete problem and were backed by a proper community. The odd dependency was usually small and vendored properly. Security was mostly an environment concern (the OS), as the data was either client-side or on some properly managed enterprise infrastructure. Now we have npm with its microscopic and numerous packages, everyone wants to be on the web, and they all want your data.
That isn't the plan. For this to work new versions have to be aggressively adopted. This is about accepting that using an open source project means adopting that code. If you had an internal library with bug fixes available then the right thing is to review those fixes and merge them into the development stream. It is the same with open source code you are using. If you care to continue using it then you need to get the latest and review code changes. This is not using old code, this is taking the steps needed to continue using code.
> did you think service would remain high quality, free, well supported, and run by tireless, unselfish, unambitious benevolent dictators for the rest of your life
I would run some of the things I run for free forever, if once in a while one user were grateful. In reality that doesn’t happen, so I usually end up monetising and then selling it off. People whine about everything and get upset if I don’t answer tickets within a working day etc. Mind you, these are free things with no ads. The thing is, they expect me to fuck them over in the end as everyone does, so it becomes a self-fulfilling prophecy. Just a single email or chat saying thank you once in a while would go a long way, but alas; it’s just whining and bug reports and criticism.
It's amazing to me that anyone who tried to go to a website, then was redirected to an online sports betting site instead of the site they wanted to go to, would be like "hmm, better do some sports gambling instead, and hey this looks like just the website for me". This sort of thing must work on some percentage of people, but it's disappointing how much of a rube you'd have to be to fall for it.
I can't find the reference now, but I think I read somewhere it only redirects when the user got there by clicking on an ad. In that case it would make a bit more sense - the script essentially swaps the intended ad target to that sport gambling website. Could work if the original target was a gaming or sport link.
This assumes that advertisers know how the traffic came to their site. The malware operators could be scamming the advertisers into paying for traffic with very low conversion rates.
It could be targeting people that already have such a website's tab open somewhere in the browser.
Assuming the user opened the website and didn't notice the redirect (this is more common on mobile), then forgot about it: when they opened their browser again a few days later, their favorite gambling website was waiting for them, and they proceeded to gamble as they usually do.
Hi, I'm the original author of the polyfill service. I did not own the domain name nor the GitHub account. In recent years I have not been actively involved in the project, and I'm dismayed by what's happened here. Sites using the original polyfill.io should remove it right away and use one of the alternatives or just drop it entirely - in the modern era it really isn't needed anymore.
People should have never started including javascript from third-party domains in the first place. It was always playing with fire and there were plenty of people pointing out the risks.
Do this person telling us not to use polyfill.io and the guy who sold polyfill.io to the Chinese company both work at Fastly? If so, that's kind of awkward...
It appears both currently do work for Fastly. I am pleased the Fastly developer advocate warned us, and announced a fork and alternative hosting service:
Neither of them had ownership of the project, so neither of them were responsible for the sale or benefited from it.
They both simply dedicated a lot of time, care and skill to the project. It's really a shame to see what they spent so much time building and maintaining now being used as a platform to exploit people. I'm sure its extremely disappointing to both of them.
- In mid-February 2024, the polyfillpolyfill account was created on GitHub and took ownership of the repo.
So I think sometime between October 2023 and February 2024, JakeChampion decided to sell the site to Funnull. I think the evidence is consistent with him having made a decision to sell the site to _somebody_ in December 2023, and the deal with Funnull closing sometime early February 2024.
You can reduce issues like this using subresource integrity (SRI), but there are still tradeoffs (around privacy & reliability - see the article above), and there is a better solution: self-host your dependencies behind a CDN service you control (just bunny/cloudflare/akamai/whatever is fine and cheap).
In a tiny prototyping project, a public CDN is convenient to get started fast, sure, but if you're deploying major websites I would really strongly recommend not using public CDNs, never ever ever ever (the World Economic Forum website is affected here, for example! Absolutely ridiculous).
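For anyone unfamiliar, SRI is just an integrity attribute on the script tag; the browser hashes the downloaded file and refuses to execute it if the hash doesn't match. A minimal sketch (the URL and hash below are placeholders, not a real release):

    <!-- Subresource Integrity sketch: the script only runs if its bytes hash
         to the declared value. URL and hash are made-up placeholders. -->
    <script
      src="https://cdn.example.com/libs/some-lib/1.2.3/some-lib.min.js"
      integrity="sha384-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
      crossorigin="anonymous"></script>

Note that this only protects byte-identical static files; it can't help with a service that tailors its response per user agent, which is exactly what polyfill.io did.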
I always prefer to self-host my dependencies, but as a developer who prefers to avoid an npm-based webpack/whatever build pipeline, it's often WAY harder to do that than I'd like.
If you are the developer of an open source JavaScript library, please take the time to offer a downloadable version of it that works without needing to run an "npm install" and then fish the right pieces out of the node_modules folder.
jQuery still offer a single minified file that I can download and use. I wish other interesting libraries would do the same!
(I actually want to use ES Modules these days which makes things harder due to the way they load dependencies. I'm still trying to figure out the best way to use import maps to solve this.)
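The general shape, as far as I can tell, is an import map that points bare specifiers at files you host yourself; a sketch, where "some-lib" and the /vendor/ paths are hypothetical:

    <!-- Import map sketch: bare specifiers resolve to self-hosted module files. -->
    <script type="importmap">
    {
      "imports": {
        "some-lib": "/vendor/some-lib/index.js",
        "some-lib/": "/vendor/some-lib/"
      }
    }
    </script>
    <script type="module">
      import { something } from "some-lib"; // resolved via the map above
    </script>

The trailing-slash entry maps any "some-lib/..." specifier onto the vendored directory.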
The assumption of many npm packages is that you have a bundler and I think rightly so because that leaves all options open regarding polyfilling, minification and actual bundling.
I would agree with you if minification delivered marginal gains, but it will generally roughly halve the size of a large bundle or major JS library (compared to just gzip'ing it alone), and this is leaving aside further benefits you can get from advanced minification with dead code removal and tree-shaking. That means less network transfer time and less parse time. At least for my use-cases, this will always justify the extra build step.
I really miss the days of minimal/no use of JS in websites (not that I want java-applets and Flash LOL). Kind of depressing that so much of the current webdesign is walled behind javascript.
Cool, I can download 20 MB of JavaScript instead of 40. Everyone uses minification, and "web apps" still spin up my laptop fans. Maybe we've lost the plot.
There might be a negative incentive in play: you may be compressing packages, but having your dependencies available at the tip of *pm install bloats overall size and complexity beyond what lack of bundling would give you.
The assumption shouldn't be that you have a bundler, but that your tools and runtimes support standard semantics, so you can bundle if you want to, or not bundle if you don't want to.
IME this has always been standard practice for production code at all the companies I've worked at and with as a SWE or PM - store dependencies within your own internal Artifactory, have them checked by a vuln scanner, and then pull and deploy from there.
That said, I came out of the Enterprise SaaS and Infra space, so maybe workflows are different in B2C, but I didn't see a difference in the customer calls I've been on.
I guess my question is why your employer or any other org would not follow the model above?
> I guess my question is why your employer or any other org would not follow the model above?
Frankly, it's because many real-world products are pieced together by some ragtag group of bright people who have been made responsible for things they don't really know all that much about.
The same thing that makes software engineering inviting to autodidacts and outsiders (no guild or license, pragmatic 'can you deliver' hiring) means that quite a lot of it isn't "engineered" at all. There are embarrassing gaps in practice everywhere you might look.
Yep. The philosophy most software seems to be written with is “poke it until it works locally, then ship it!”. Bugs are things you react to when your users complain. Not things you engineer out of your software, or proactively solve.
This works surprisingly well. It certainly makes it easier to get started in software. Well, so long as you don’t mind that most modern software performs terribly compared to what the computer is capable of. And suffers from reliability and security issues.
Counterpoint: It's not about being an autodidact or an outsider.
I was unlikely to meet any bad coders at work, due to how likely it is they were filtered by the hiring process, and thus I never met anyone writing truly cringe-worthy code in a professional setting.
That was until I decided to go to university for a bit[1]. This is where, for the first time, I met people writing bad code professionally: professors[2]. "Bad" as in best-practices, the code usually worked. I've also seen research projects that managed to turn less than 1k LOC of python into a barely-maintainable mess[3].
I'll put my faith in an autodidact who had to prove themselves with skills and accomplishments alone over someone who got through the door with a university degree.
An autodidact who doesn't care about their craft is not going to make the cut, or shouldn't. If your hiring process doesn't filter those people, why are you wasting your time at a company that probably doesn't know your value?
[1] Free in my country, so not a big deal to attend some lectures besides work. Well, actually I'm paying for it with my taxes, so I might as well use it.
[2] To be fair, the professors teaching in actual CS subjects were alright. Most fields include a few lectures on basic coding though, which were usually beyond disappointing. The non-CS subject that had the most competent coders was mathematics. Worst was economics. Yes, I meandered through a few subjects.
[3] If you do well on some test you'd usually get job offers from professors, asking you to join their research projects. I showed up to interviews out of interest in the subject matter and professors are usually happy to tell you all about it, but wages for students are fixed at the legal minimum wage, so it couldn't ever be a serious consideration for someone already working on the free market.
I was hoping ongoing coverage would answer that; it sounds like a perfect example. I heard that the tampered code redirects traffic to a sports betting site.
There are cheaper or free alternatives to Artifactory. Yes they may not have all of the features but we are talking about a company that is fine with using a random CDN instead.
Or, in the case of javascript, you could just vendor your dependencies or do a nice "git add node_modules".
I just gave Artifactory as an example. What about GHE, self-hosted GitLab, or your own in-house Git?
Edit: was thinking - would be a pain in the butt to manage. That tracks, but every org ik has some corporate versioning system that also has an upsell for source scanning.
I've been a part of a team which had to manage a set of geodistributed Artifactory clusters and it was a pain in the butt to manage, too - but these were self-hosted. At a certain scale you have to pick the least worst solution though, Artifactory seems to be that.
This is kinda sad. For introducing new dependencies, a vuln scanner makes sense (don't download viruses just because they came from a source checkout!), but we could have kept CDNs if we'd used signatures.
EDIT: Never mind, been out of the game for a bit! I see there is SRI now...
This supply chain attack had nothing to do with npm afaict.
The dependency in question seems to be (or claim to be) a lazy loader that determines browser support for various capabilities and selectively pulls in just the necessary polyfills; in theory this should make the frontend assets leaner.
But the CDN used for the polyfills was injecting malicious code.
I would expect latency (network round trip time) to make this entire exercise worthless. Most polyfills are 1kb or less. Splitting polyfill code amongst a bunch of small subresources that are loaded from a 3rd party domain sounds like it would be a net loss to performance. Especially since your page won’t be interactive until those resources have downloaded.
Your page will almost certainly load faster if you just put those polyfills in your main js bundle. It’ll be simpler and more reliable too.
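To make that concrete, a typical polyfill is just a feature check plus a small patch that you can ship inline or in your main bundle; a rough sketch using Array.prototype.at as an arbitrary example:

    <script>
      // Feature-detect, then patch only if the method is missing.
      // A few hundred bytes in your own bundle, no third-party round trip.
      if (!Array.prototype.at) {
        Object.defineProperty(Array.prototype, "at", {
          value: function at(n) {
            n = Math.trunc(n) || 0;          // NaN becomes 0
            if (n < 0) n += this.length;     // support negative indices
            return n < 0 || n >= this.length ? undefined : this[n];
          },
          writable: true,
          configurable: true
        });
      }
    </script>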
In practice, back when this wasn't a Chinese adware service, it proved to be faster to use the CDN.
You were not loading a "bunch" of polyfill script files; you selected what you needed in the URL via a query parameter, and the service took that plus the user agent of the request to determine which polyfills were needed and returned a minified file of just those.
As this request was to a separate domain, it did not run into the head-of-line / max-connections-per-domain issue of HTTP/1.1, which was still the more common protocol at the time this service came out.
Js dependencies should be pretty small compared to images or other resources. Http pipelining should make it fast to load them from your server with the rest
The only advantage to using one of those cdn-hosted versions is that it might help with browser caching
> Http pipelining should make it fast to load them from your server with the rest
That's true, but it should be emphasized that it's only fast if you bundle your dependencies, too.
Browsers and web developers haven't been able to find a way to eliminate a ~1ms/request penalty for each JS file, even if the files are coming out of the local cache.
If you're making five requests, that's fine, but if you're making even 100 requests for 10 dependencies and their dependencies, there's a 100ms incentive to do at least a bundle that concatenates your JS.
And once you've added a bundle step, you're a few minutes away from adding a bundler that minifies, which often saves 30% or more, which is usually way more than you probably saved from just concatenating.
> The only advantage to using one of those cdn-hosted versions is that it might help with browser caching
And that is not true. Browsers have separate caches for separate sites for privacy reasons. (Before that, sites could track you from site to site by seeing how long it took to load certain files from your cache, even if you'd disabled cookies and other tracking.)
There is still a caching effect of the CDN for your servers, even if there isn't for the end user: if the CDN serves the file then your server does not have to.
Large CDNs with endpoints in multiple locations internationally also give the advantage of reducing latency: if your static content comes from the PoP closest to me (likely London, <20ms away where I'm currently sat, ~13 on FTTC at home⁰, ~10 at work) that could be quite a saving if your server is otherwise hundreds of ms away (~300ms for Tokyo, 150 for LA, 80 for New York). Unless you have caching set to be very aggressive dynamic content still needs to come from your server, but even then a high-tech CDN can² reduce the latency of the TCP connection handshake and¹ TLS handshake by reusing an already open connection between the CDN and the backing server(s) to pipeline new requests.
This may not be at all important for many well-designed sites, or sites where latency otherwise matters little enough that a few hundred ms a couple of times here or there isn't really going to particularly bother the user, but could be a significant benefit to many bad setups and even a few well-designed ones.
--------
[0] York. The real one. The best one. The one with history and culture. None of that “New” York rebranded New Amsterdam nonsense!
[1] if using HTTPS and you trust the CDN to re-encrypt, or HTTP and have the CDN add HTTPS, neither of which I would recommend as it is exactly an MitM situation, but both are often done
[2] assuming the CDN also manages your DNS for the whole site, or just a subdomain for the static resources, so the end user sees the benefit of the CDNs anycast DNS arrangement.
I don't want a build pipeline. I want to write some HTML with a script type=module tag in it with some JavaScript, and I want that JavaScript to load the ES modules it depends on using import statements (or dynamic import function calls for lazy loading).
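In other words, something along these lines, with no bundler in the loop (the file paths are hypothetical):

    <!-- No build step: the browser resolves the imports itself. -->
    <script type="module">
      import { setup } from "./js/app.js";            // my own code
      import helper from "./vendor/helper/index.js";  // a vendored dependency
      setup(helper);

      // Lazy loading works too, via dynamic import:
      document.querySelector("#more")?.addEventListener("click", async () => {
        const { extra } = await import("./js/extra.js");
        extra();
      });
    </script>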
Do you not use CSS preprocessors or remote map files or anything like that... or do you just deal with all of that stuff manually instead of automating it?
I suspect this is more relevant for people who aren't normally JavaScript developers. (Let's say you use Go or Python normally.) It's a way of getting the benefits of multi-language development while still being mostly in your favorite language's ecosystem.
On the Node.js side, it's not uncommon to have npm modules that are really written in another language. For example, the esbuild npm package downloads executables written in Go. (And then there's WebAssembly.)
In this way, popular single-language ecosystems evolve towards becoming more like multi-language ecosystems. Another example was Python getting 'wheels' straightened out.
So the equivalent for bringing JavaScript into the Python ecosystem might be having Python modules that adapt particular npm packages. Such a module would automatically generate JavaScript based on a particular npm, handling the toolchain issue for you.
A place to start might be a Python API for the npm command itself, which takes care of downloading the appropriate executable and running it. (Or maybe the equivalent for Bun or Deno?)
This is adding still more dependencies to your supply chain, although unlike a CDN, at least it's not a live dependency.
Sooner or later, we'll all depend on left-pad. :-)
This is why I don't think it's very workable to avoid npm. It's the package manager of the ecosystem, and performs the job of downloading dependencies well.
I personally never want to go back to the pre-package-manager days for any language.
One argument is that Javascript-in-the-browser has advanced a lot and there's less need for a build system. (ex. ESM module in the browser)
I have some side projects that are mainly HTMX-based with some usage of libraries like D3.js and a small amount of hand-written Javascript. I don't feel that bad about using unpkg because I include signatures for my dependencies.
Before ESM I wasn't nearly as sold on skipping the build step, but now it feels like there's a much nicer browser native way of handling dependencies, if only I can get the files in the right shape!
npm is a package manager though, not a build system. If you use a library that has a dependency on another library, npm downloads the right version for you.
Yep. And so does unpkg. If you’re using JavaScript code through unpkg, you’re still using npm and your code is still bundled. You’re just getting someone else to do it, at a cost of introducing a 3rd party dependency.
I guess if your problem with npm and bundlers is you don’t want to run those programs, fine? I just don’t really understand what you gain from avoiding running bundlers on your local computer.
Oh lol yeah, I recently gave up and just made npm build part of my build for a hobby project I was really trying to keep super simple, because of this. It was too much of a hassle to link in stuff otherwise, even very minor small things
You shouldn't need to fish stuff out of node_modules though; just actually get it linked and bundled into one JS file so that it automatically grabs exactly what you need and its deps.
If this process sketches you out as it does me, one way to address that, as I do, is to have the bundle emitted with minification disabled so it's easy to review.
That was my thought too but polyfill.io does do a bit more than a traditional library CDN, their server dispatches a different file depending on the requesting user agent, so only the polyfills needed by that browser are delivered and newer ones don't need to download and parse a bunch of useless code. If you check the source code they deliver from a sufficiently modern browser then it doesn't contain any code at all (well, unless they decide to serve you the backdoored version...)
OTOH doing it that way means you can't use subresource integrity, so you really have to trust whoever is running the CDN even more than usual. As mentioned in the OP, Cloudflare and Fastly both host their own mirrors of this service if you still need to care about old browsers.
The shared CDN model might have made sense back when browsers used a shared cache, but they don't even do that anymore.
Static files are cheap to serve. Unless your site is getting hundreds of millions of page views, just plop the js file on your webserver. With HTTP/2 it will probably be almost the same speed, if not faster, than a CDN in practice.
If you have hundreds of millions of pageviews, go with a trusted party - someone you actually pay money to - like Cloudflare, Akamai, or any major hosting / cloud party. But not to increase cache hit rate (what CDNs were originally intended for), but to reduce latency and move resources to the edge.
Does it even reduce latency that much (unless you have already squeezed latency out of everything else that you can)?
Presumably your backend at this point is not ultra optimized. If you send a Link header and are using HTTP/2, the browser will download the js file while your backend is doing its thing. I'm doubtful that moving js to the edge would help that much in such a situation unless the client is on the literal other side of the world.
There of course comes a point where it does matter, i just think the cross over point is way later than people expect.
Stockholm <-> Tokyo is at least 400ms here, anytime you have multi-national sites having a CDN is important. For your local city, not so much (and of course you won't even see it locally).
I understand that ping times are different when geolocated. My point was that in fairly typical scenarios (worst cases are going to be worse) it would be hidden by backend latency since the fetch could be concurrent with link headers or http 103. Devil in details of course.
I'm so glad to find some sane voices here! I mean, sure, if you're really serving a lot of traffic to Mombasa, akamai will reduce latency. You could also try to avoid multi megabyte downloads for a simple page.
While there are lots of bad examples out there - keep in mind it's not quite that straightforward, as it can make a big difference whether those resources are on the critical path that blocks first paint or not.
It’s not an either or thing. Do both. Good sites are small and download locally. The CDN will work better (and be cheaper to use!) if you slim down your assets as well.
Even when it "made sense" from a page load performance perspective, plenty of us knew it was a security and privacy vulnerability just waiting to be exploited.
There was never really a compelling reason to use shared CDNs for most of the people I worked with, even among those obsessed with page load speeds.
In my experience, it was more about beating metrics in PageSpeed Insights and Pingdom, rather than actually thinking about the cost/risk ratio for end users. Often the people that were pushing for CDN usage were SEO/marketing people believing their website would rank higher for taking steps like these (rather than working with devs and having an open conversation about trade-offs, but maybe that's just my perspective from working in digital marketing agencies, rather than companies that took time to investigate all options).
I don’t think it ever even improved page load speeds, because it introduces another dns request, another tls handshake, and several network round trips just to what? Save a few kb on your js bundle size? That’s not a good deal! Just bundle small polyfills directly. At these sizes, network latency dominates download time for almost all users.
> I don’t think it ever even improved page load speeds, because it introduces another dns request, another tls handshake, and several network round trips just to what?
I think the original use case was when every site on the internet was using jQuery, and on a JS-based site this blocked display (this was also pre fancy things like HTTP/2 and TLS 0-RTT). Before cache partitioning you could reuse jQuery JS requested from a totally different site currently in cache, as long as the js file had the same URL, which almost all clients already had since jQuery was so popular.
So it made sense at one point but that was long ago and the world is different now.
I believe you could download from multiple domains at the same time, before HTTP/2 became more common, so even with the latency you'd still be ahead while your other resources were downloading. Then it became more difficult when you had things like plugins that depended on order of download.
You can download from multiple domains at once. But think about the order here:
1. The initial page load happens, which requires a DNS request, TLS handshake and finally HTML is downloaded. The TCP connection is kept alive for subsequent requests.
2. The HTML references javascript files - some of these are local URLs (locally hosted / bundled JS) and some are from 3rd party domains, like polyfill.
3a. Local JS is requested by having the browser send subsequent HTTP requests over the existing HTTP connection
3b. Content loaded from 3rd party domains (like this polyfill code) needs a new TCP connection handshake, a TLS handshake, and then finally the polyfills can be loaded. This requires several new round-trips to a different IP address.
4. The page is finally interactive - but only after all JS has been downloaded.
Your browser can do steps 3a and 3b in parallel. But I think it'll almost always be faster to just bundle the polyfill code in your existing JS bundle. Internet connections have very high bandwidth these days, but latency hasn't gotten better. The additional time to download (let's say) 10kb of JS is trivial. The extra time to do a DNS lookup, a TCP then TLS handshake, and then send an HTTP request and get the response can be significant.
And you won't even notice when developing locally, because so much of this stuff will be cached on your local machine while you're working. You have to look at the performance profile to understand where the page load time is spent. Most web devs seem much more interested in chasing some new, shiny tech than learning how performance profiling works and how to make good websites with "old" (well loved, battle tested) techniques.
Aren't we also moving toward giving cross-origin scripts very little access to information about the page? I read some stuff a couple years ago that gave me a very strong impression that running 3rd party scripts was quickly becoming an evolutionary dead end.
Definitely for browser extensions. It's become more difficult with needing to set up CORS, but like with most things that are difficult, you end up with developers that "open the floodgates" and allow as much as possible to get the job done without understanding the implications.
The same concept should be applied to container based build pipelines too. Instead of pulling dependencies from a CDN or a pull through cache, build them into a container and use that until you're ready to upgrade dependencies.
It's harder, but creates a clear boundary for updating dependencies. It also makes builds faster and makes old builds more reproducible since building an old version of your code becomes as simple as using the builder image from that point in time.
> The same concept should be applied to container based build pipelines too. Instead of pulling dependencies from a CDN or a pull through cache, build them into a container and use that until you're ready to upgrade dependencies.
Everything around your container wants to automatically update itself as well, and some of the changelogs are half emoji.
I can kind of understand why people went away from this, but this is how we did it for years/decades and it just worked. Yes, doing this does require more work for you, but that's just part of the job.
For performance reasons alone, you definitely want to host as much as possible on the same domain.
In my experience from inside companies, we went from self-hosting with largely ssh access to complex deployment automation and CI/CD that made it hard to include any new resource in the build process. I get the temptation: resources linked from external domains / cdns gave the frontend teams quick access to the libraries, fonts, tools, etc. they needed.
Thankfully things have changed for the better and it's much easier to include these things directly inside your project.
There was a brief period when the frontend dev world believed the most performant way to have everyone load, say, jquery, would be for every site to load it from the same CDN URL. From a trustworthy provider like Google, of course.
It turned out the browser domain sandboxing wasn’t as good as we thought, so this opened up side channel attacks, which led to browsers getting rid of cross-domain cache sharing; and of course it turns out that there’s really no such thing as a ‘trustworthy provider’ so the web dev community memory-holed that little side adventure and pivoted to npm.
Which is going GREAT by the way.
The advice is still out there, of course. W3schools says:
> One big advantage of using the hosted jQuery from Google:
> Many users already have downloaded jQuery from Google when visiting another site. As a result, it will be loaded from cache when they visit your site
Be good at a time when Google manually ranks domains, then pivot to crap when Google stops updating the ranking. Same as the site formerly known as Wikia.
> For performance reasons alone, you definitely want to host as much as possible on the same domain.
It used to be the opposite. Browsers limit the amount of concurrent requests to a domain. A way to circumvent that was to load your resources from a.example.com, b.example.com, c.example.com etc. Paying some time for extra dns resolves I guess, but could then load many more resources at the same time.
Not as relevant anymore, now that HTTP/2 allows multiplexing requests over a shared connection and bundling files is more common.
Years ago I had terrible DNS service from my ISP, enough to make my DSL sometimes underperform dialup. About 1 in 20 DNS lookups would hang for many seconds so it was inevitable that any web site that pulled content from multiple domains would hang up for a long time when loading. Minimizing DNS lookups was necessary to get decent performance for me back then.
Using external tools can make it quite a lot harder to do differential analysis to triage the source of a bug.
The psychology of debugging is more important than most allow. Known unknowns introduce the possibility that an Other is responsible for our current predicament instead of one of the three people who touched the code since the problem happened (though I've also seen this when the number of people is exactly 1)
The judge and jury in your head will refuse to look at painful truths as long as there is reasonable doubt, and so being able to scapegoat a third party is a depressingly common gambit. People will attempt to put off paying the piper even if doing so means pissing off the piper in the process. That bill can come due multiple times.
Maybe people have been serving those megabytes of JS frameworks from some single-threaded python webserver (in dev/debug mode to boot) and wondered why they could only hit 30req/s or something like that.
I don't think SRI would have ever worked in this case because not only do they dynamically generate the polyfill based on URL parameters and user agent, but they were updating the polyfill implementations over time.
>self-host your dependencies behind a CDN service you control (just bunny/cloudflare/akamai/whatever is fine and cheap).
This is not always possible, and some dependencies will even disallow it (think: third-party suppliers). Anyways, then that CDN service's BGP routes are hijacked. Then what? See "BGP Routes" on https://joshua.hu/how-I-backdoored-your-supply-chain
But in general, I agree: websites pointing to random js files on the internet with questionable domain independence and security is a minefield that is already exploding in some places.
I strongly believe that Browser Dev Tools should have an extra column in the network tab that highlights JS from third party domains that don't have SRI. Likewise in the Security tab and against the JS in the Application Tab.
I've seen people reference CDNs for internal sites. I hate that because it is not only a security risk but it also means we depend on the CDN being reachable for the internal site to work.
It's especially annoying because the projects I've seen it on were using NPM anyway so they could have easily pulled the dependency in through there. Hell, even without NPM it's not hard to serve these JS libraries internally since they tend to get packed into one file (+ maybe a CSS file).
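Even a small build step is enough to self-host them. A minimal sketch, assuming Node and that the package ships a prebuilt dist bundle (jquery and all paths here are just illustrative):

    // copy-vendor.js -- illustrative build step; package name and paths are assumptions
    const fs = require("node:fs");
    const path = require("node:path");

    const vendorDir = path.join(__dirname, "public", "vendor");
    fs.mkdirSync(vendorDir, { recursive: true });

    // Copy the prebuilt bundle out of node_modules into our own static directory,
    // so the page can load /vendor/jquery.min.js from the same origin.
    fs.copyFileSync(
      require.resolve("jquery/dist/jquery.min.js"),
      path.join(vendorDir, "jquery.min.js")
    );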
Also the folks who spec'ed ES6 modules didn't think it was a required feature to ship SRI from the start, so it's still not broadly and easily supported across browsers. I requested the `with` style import attributes 8 years ago and it's still not available. :/
Another downside of SRI is that it defeats streaming. The browser can't verify the checksum until the whole resource is downloaded so you don't get progressive decoding of images or streaming parsing of JS or HTML.
CF links to the same discussion on GitHub that the OP does. Seems less like they predicted it, and more like they just thought that other folks concerns were valid and amplified the message.
I think JS (well, ES6) has a ton of positive qualities, and I think it's a great fit for many of its current applications. However, this is a pretty good example of what bothers me about the way many people use it. I see a lot of folks, in the name of pragmatism, adopt a ton of existing libraries and services so they don't have to think about more complex parts of the problem they're solving. Great! No need to reinvent the wheel. But, people also seem to falsely equate popularity with stability-- if there's a ton of people using it, SURELY someone has vetted it, and is keeping a close eye on it, no? Well, maybe no? It just seems that the bar for what people consider 'infrastructure' is simply too low. While I don't think that, alone, is SO different from other popular interpreted languages, the light weight of JS environments means you need to incorporate a LOT of that stuff to get functionality that might otherwise be part of a standard library or ubiquitous stalwart framework, which dramatically increases the exposure to events like this.
I think a lot of people conflate criticism of JS with criticism of the way it's been used for the past number of years, putting a nuanced topic into a black-and-white "for or against" sort of discussion. I've done a number of projects heavily using JS-- vanilla, with modern frameworks, server-side and in the browser-- and aside from some fundamental annoyances with its approach to a few things, it's been a great tool. There's nothing fundamental about JS itself that makes applications vulnerable to this like a buffer overflow would, but the way it is used right now seems to make it a lot easier for inexperienced, under-resourced, or even just distracted developers to open up big holes using generally accepted techniques.
But I've never been a full-time front-end-web or node dev, so maybe I'm mistaken? Compared to the server-side stuff I've worked with, there's definitely a concerning wild west vibe moving into modern JS environments. I think there was a casual "well, it's all in client-side userspace" attitude about the way it was used before, which effectively shifted most of the security concerns to the browser. We should probably push a little harder to shake that. Think about how much JS your bank uses on its website. I'll bet they didn't write all of their own interaction libraries.
This is the end-stage of having third-party packaging systems without maintainers. This happens vanishingly infrequently for things like apt repositories, because the people maintaining packages are not necessarily the people writing them, and there's a bit of a barrier to entry. Who knew yanking random code from random domains and executing it was a bad idea? Oh yeah, everyone.
I have been full stack for the last decade. Vanilla JS and server-side rendering is about all I can tolerate these days. I will reach for Ajax or websockets if needed, but 90%+ of all interactions I've ever had to deal with are aptly handled with a multipart form post.
I do vendor my JS, but only for things like PDF processing, 3D graphics, and barcode scanning.
I've been through the framework gauntlet. Angular, RiotJS, React, Blazor, AspNetCore MVC, you name it. There was a time where I really needed some kind of structure to get conceptually bootstrapped. After a while, these things begin to get in the way. Why can't I have the framework exactly my way? Just give me the goddamn HttpContext and get off my lawn. I don't need a babysitter to explain to me how to interpolate my business into an html document string anymore.
I also now understand why a lot of shops insist on separation between frontend and backend. It seems to me that you have to be willing to dedicate much of your conscious existence to honing your skills if you want to be a competent full stack developer. It can't just be your 9-5 job unless you are already highly experienced and have a set of proven patterns to work with. Getting someone off the street to that level can be incredibly expensive and risky. Once you know how to do the whole thing, you could just quit and build for yourself.
Build what? Sell to whom? What, write my own subscription/billing module, on top of full stack development which, as you said, takes a lot of time just to be competent at. on top of building it, do sales and marketing and accounting and all that other business stuff? I mean, I guess
Maybe! Add in package management that isn't a complete clusterfustrum and that could be pretty attractive. Cloudron basically does something similar with FOSS server apps.
I remember Google suggesting that everyone use common libraries hosted by a shared CDN and then suggesting de-ranking slow websites and I think that’s what led to widespread adoption of this pattern.
The only reason I stopped using third-party hosted libraries was because it wasn’t worth the trouble. Using subresource integrity makes it safe but it was part of the trouble.
Sure... Though while I hate to say it, I don't blame people for trusting Google's hosted copy of something. For better or worse, they are more trustworthy than some "as seen on a million janky tutorials" whatever.io. A very privacy-focused employer precluded that possibility during peak adoption, but with what many sites load up, that's the least of your worries.
I'm surprised there is no mention of subresource integrity in the article. It's a low effort, high quality mitigation for almost any JS packages hosted by a CDN.
EDIT: Oh, it's because they are selling something. I don't know anything about their offerings, but SRI is made for this and is extremely effective.
Wouldn't work in this case because the whole selling point of polyfill.io was that, as new features came out and as browsers grew support for them, the polyfill that was loaded would dynamically grow or shrink.
Something like `polyfill.io.example.org/v1?features=Set,Map,Other.Stuff` would _shrink_ over time, while something like `polyfill.io.example.org/v1?features=ES-Next` would grow and shrink as new features came and went.
SRI generally won't work here because the served polyfill JS (and therefore the SRI hash) depends on the user agent/headers sent by the user's browser. If the browser says it's ancient, the resulting polyfill will fill in a bunch of missing JS modules and be a lot of JS. If the browser identifies as modern, it should return nothing at all.
Edit: In summary, SRI won't work with a dynamic polyfill which is part of the point of polyfill.io. You could serve a static polyfill but that defeats some of the advantages of this service. With that said, this whole thread is about what can happen with untrusted third parties so...
It absolutely would work if the browser validates the SRI hash. The whole point is to know in advance what you expect to receive from the remote site and verify the actual bytes against the known hash.
It wouldn’t work for some ancient browser that doesn’t do SRI checks. But it’s no worse for that user than without it.
The CDN in this case is performing an additional function which is incompatible with SRI: it is dynamically rendering a custom JS script based on the requesting User Agent, so the website authors aren't able to compute and store a hash ahead of time.
I edited to make my comment more clear but polyfill.io sends dynamic polyfills based on what features the identified browser needs. Since it changes, the SRI hash would need to change so that part won't work.
Ah! I didn’t realize that. My new hot take is that sounds like a terrible idea and is effectively giving full control of the user’s browser to the polyfill site.
And this hot take happens to be completely correct (and is why many people didn't use it, in spite of others yelling that they were needlessly re-inventing the wheel).
Yeah... I've generated composite fills with the pieces I would need on the oldest browser I had to support, unfortunately all downstream browsers would get it.
Fortunately around 2019 or so, I no longer had to support any legacy (IE) browsers and pretty much everything supported at least ES2016. Was a lovely day and cut a lot of my dependencies.
They are saying that because the content of the script file is dynamic based on useragent and what that useragent currently supports in-browser, the integrity hash would need to also be dynamic which isn't possible to know ahead of time.
Their point is that the result changes depending on the request. It isn't a concern about the SRI hash not getting checked, it is that you can't realistically know what to expect in advance.
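A quick way to see the problem: fetch the same URL with two different User-Agent headers and compare the digests. A minimal sketch, assuming Node 18+ and a placeholder polyfill-style endpoint (the URL and feature list are illustrative, not the real service):

    // sri-vs-dynamic.js -- same URL, two User-Agents, two different digests (Node 18+)
    const crypto = require("node:crypto");

    // Placeholder endpoint in the style of a dynamic polyfill service -- not the real domain.
    const url = "https://polyfill.example.com/v3/polyfill.min.js?features=fetch,Promise";

    async function sriHash(userAgent) {
      const res = await fetch(url, { headers: { "User-Agent": userAgent } });
      const body = Buffer.from(await res.arrayBuffer());
      return "sha384-" + crypto.createHash("sha384").update(body).digest("base64");
    }

    (async () => {
      const legacy = await sriHash("Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko"); // IE11
      const modern = await sriHash("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 Chrome/126 Safari/537.36");
      console.log({ legacy, modern, same: legacy === modern }); // a dynamic service typically prints same: false
    })();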
I would still (and do) do both. In the case that your site (for whatever reason) is still on, or simply accessible over, HTTP, a man-in-the-middle attack could still replace your script with another.
For self-hosted scripts that change with each build, I just add a task in my build process to calculate the SHA and add it to the <script src="..." integrity="sha384-..."> tag.
Otherwise, just calculate it once and hardcode it for 3rd party, legacy scripts...
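For reference, a minimal sketch of such a build task (assuming Node; the file paths are illustrative): hash the file with sha384 and emit the matching attribute.

    // sri-build-step.js -- emit the integrity attribute for a self-hosted bundle
    const fs = require("node:fs");
    const crypto = require("node:crypto");

    function integrityFor(file) {
      const digest = crypto.createHash("sha384").update(fs.readFileSync(file)).digest("base64");
      return `sha384-${digest}`;
    }

    // crossorigin would only be needed if the script were served from a different origin.
    const integrity = integrityFor("dist/app.bundle.js");
    console.log(`<script src="/dist/app.bundle.js" integrity="${integrity}"></script>`);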
I had this conversation countless times with developers: are you really ok if someone hijacks the CDN for the code you're including? They almost always seem to be fine with it, simply because everyone else is doing it like this. At the same time they put up with countless 2FAs in the most mundane places.
The follow up of "you know that the random packages you're including could have malware" is even more hopeless.
In general, SRI (Subresource Integrity) should protect against this. It sounds like it wasn't possible in the Polyfill case as the returned JS was dynamic based on the browser requesting it.
Why not? The `integrity` attribute accepts more than one value[0].
This would technically be feasible, if my understanding of the service is correct. Hashes could be recorded for each combination of feature -- you could then give those list of hashes to the user to insert into the attribute.
Of course, the main difficulty here would be the management of individual hashes. Hmm, definitely interesting stuff.
yeah that's nuts, I would never use a random site for that, but in general people's opinion on CDN use is dated. Tons of people still think that cached resources are shared between domains for example.
Why? If you host everything on the same domain there's no possibility of a typo. And, developers could maliciously make a typo that can get past code review, don't you take that into account?
In a lot of situations the system can be designed that a mistake has to be obvious at review for it to even pass the build step. Why not strive for that level of robustness?
I don’t understand why JS devs treat dependencies the way they do.
If you are writing scripts that run on other people’s computers I feel like the dependencies used should be vetted for security constantly, and should be used minimally.
Meanwhile most libraries seem to have 80 trillion dependencies written by random github accounts called “xihonghua” or something with no other projects on their account.
There’s a wide, wide range of devs that fit in the label of “JS dev”. At the more junior or casual end, npm and cdns are shoved in your face as the way to go. It shouldn’t be surprising that it’s the natural state of things.
I’ve worked with many JS devs who also have broader experience and are more than aware of issues like these, so it just depends I guess.
The bigger issue may just be the lack of a culture that vendors their code locally and always relies on the 3rd party infrastructure (npm or cdn).
It’s somewhat similar but any Rust project I’m building, I wind up vendoring the crates I pull in locally and reviewing them. I thought it would be more annoying but it’s really not that bad in the grand scheme of things - and there should be some automated things you could set up to catch obvious issues, though I defer to someone with more knowledge to chime in here.
This may be an extra level of headache with the JS ecosystem due to the sheer layers involved.
And meanwhile, years after it was well known that the JS dependency model was an utter security disaster, the Rust ecosystem went on to copy it.
The number of websites loading polyfill.io is decreasing, which is good:
clickhouse-cloud :) SELECT date, count()
    FROM minicrawl_processed
    WHERE arrayExists(x -> x LIKE '%polyfill.io%', external_scripts_domains)
      AND date >= now() - INTERVAL 5 DAY
    GROUP BY date
    ORDER BY date
┌───────date─┬─count()─┐
1. │ 2024-06-22 │ 6401 │
2. │ 2024-06-23 │ 6398 │
3. │ 2024-06-24 │ 6381 │
4. │ 2024-06-25 │ 6325 │
5. │ 2024-06-26 │ 5426 │
└────────────┴─────────┘
5 rows in set. Elapsed: 0.204 sec. Processed 15.70 million rows, 584.74 MB (76.87 million rows/s., 2.86 GB/s.)
Peak memory usage: 70.38 MiB.
> this is a wild microcosm example of how the web breaks in ways we don't expect.
I think the performance characteristics of the web are subject to change over time, especially to allow increased security and privacy.
https is another example of increased security+privacy at the cost of being slightly slower than non-https connections because of an extra round trip or two to create the connection.
The lesson I take from it is: don't use complicated optimization techniques that might get out of date over time. Keep it simple instead of chasing every last bit of theoretical performance.
For example, there used to be a good practice of using "Domain Sharding" to allow browsers to download more files in parallel, but was made obsolete with HTTP/2, and domain sharding now has a net negative effect, especially with https.
Now they're realizing that HTTP/2's multiplexing of a single TCP connection can have negative effects on wireless connections, so they're working on HTTP/3 to solve that.
Also don't use polyfills. If your supported browsers don't support the feature then don't use the feature, or implement the fallback yourself. Use the features that are actually available to you.
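If you do roll the fallback yourself, it can be a few reviewed lines living in your own repo. A tiny sketch, assuming Object.hasOwn is the only gap in your oldest supported browser:

    // local-fallbacks.js -- the only "polyfill" shipped, vendored and reviewed in our own repo
    if (typeof Object.hasOwn !== "function") {
      // Covers the common case; no third-party code involved.
      Object.hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
    }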
In addition to Cache Partitioning, it was never really likely that a user had visited another site that used the same specific versions from the same cdn as your site uses.
Making sure all of your pages were synchronized with the same versions and bundling into appropriate bits for sharing makes sense, and then you may as well serve it from your own domain. I think serving from your www server is fine now, but back in the day there were benefits to having a different hostname for static resources and maybe it still applies (I'm not as deep into web stuff anymore, thank goodness).
Because of modern cache partitioning, HTTP/2+ multiplexing, and sites themselves being served off CDNs, external CDNs are now also worse for performance.
If you use them, though, use subresource integrity.
Funnily enough I can't set up CDN on Azure at work because it's not approved but I could link whatever random ass CDN I want for external dependencies if I was so inclined.
Software supply chains feel like one of the Internet's last remaining high-trust spaces, and I don't think that's going to last long. A tidal wave of this is coming. I'm kind of surprised it's taken this long given how unbelievably soft this underbelly is.
Paying for dependencies sounds like a good idea. Reduces the incentive to sell out; allows contributors to quit day jobs to focus on fixing bugs and security holes; less likely to result in abandonware.
This only works at blocking full domains starting from uBO v0.9.3.0. Latest version is v1.58.1, so it's safe to assume most people are up to date. But noting it just in case.
I use both FF and Chrome, and I use multiple profiles on Chrome, so I have to go in an add the filter for each profile and browser. At least for my personal laptop where I can do this. Not sure about my work one.
Yes that is unfortunate. Safari had the option to toggle JavaScript via Shortcut until recently, but it was removed. The only browser I know which can easily toggle JavaScript now is Brave.
But uBlock Origin has that functionality, too, and I guess most people who would care about JavaScript have that already enabled anyways.
The web is so much nicer without JavaScript but easily activating it (via cmd-J) once it seems necessary without reloading.
>Yes that is unfortunate. Safari had the option to toggle JavaScript via Shortcut until recently, but it was removed. The only browser I know which can easily toggle JavaScript now is Brave.
Unless you design your own silicon, build your own pc and peripherals, and write all your own software, there's always going to be a level of trust involved. But at least NoScript is FOSS so you can in theory examine the source code yourself.
I think there’s a toggle to just disable JavaScript entirely somewhere in the menus, but it is sort of inconvenient, because you can’t selectively enable sites that are too poorly coded to run without JavaScript.
Mozilla has marked NoScript as a recommended extension, which is supposed to mean they reviewed the code. Did they do it perfectly? I don’t know. But the same logic could be applied to the patches they receive for their browser itself, right? It’s all just code that we trust them to audit correctly.
>Google’s Ads Security team uses Safe Browsing to make sure that Google ads do not promote dangerous pages.
This is already wrong in my experience. I had a coworker panicking two weeks ago because he googled youtube and clicked the first link. Which turned out to be a fake ransomware page ad designed to get you to call a scam call center.
There is no such thing as a safe ad anymore because no one is policing them appropriately. Especially if something like this can happen when searching a service google themselves owns.
I really wish we didn't have to trust the websites we go to. This exploit is a good example of the problem. I go to a website and something in the content of that website forces me to go to an entirely different place. I didn't click on a link; it can just do this based on any criteria it wants to apply.
I’ll go ahead and make an assumption that the Chinese government was involved. Countries badly need to figure out a way to punish bad actors in cybersecurity realms. It seems that this type of attack, along with many others, are quickly ramping up. If there isn’t competent policy in this area, it could become very dangerous.
Why would the Chinese government use this to load a gambling website? I'm sure there are many better uses that would be more subtle that they could come up with this opportunity.
The problem is that they have a CDN that can serve up custom JS depending on the headers and IP of the visitor, which enables remarkably precise targeting. No reason why they couldn’t, for example, send a targeted payload to people in a specific geographic area (by IP) and/or a specific language (by Accept-Language header). The sports betting stuff could be diversion in that case.
Of course, I don’t personally believe this to be the case; Occam’s Razor says this is a straightforward case of someone deciding they want to start monetizing their acquisition.
> No reason why they couldn’t, for example, send a targeted payload to people in a specific geographic area (by IP) and/or a specific language (by Accept-Language header). The sports betting stuff could be diversion in that case.
What I don't understand is why blow it sending people to a gambling site? They could have kept it going and sent payloads to specific targets making use of zero day browser bugs. Now they can still do that but to far fewer sites.
EDIT: To clarify, patch out just the polyfill (the first line in the snippet). You can of course keep using MathJax, and the second line alone should be enough. (Though still better host a copy by yourself, just in case).
Some of us already have learned that lesson. Imagine how it feels to have to watch the same people touch the same hot stove over and over again. And people wonder why programmers quit and go into farming...
One has to admit the game of cat and mouse that the web browser brought about has been quite valuable for advancing security as a field, unfortunately this sort of thing seems like we have to get burned to learn the pan is hot and mother warning us is not enough.
Security is everyone's job. You can't outsource responsibility. The security team should be compensated fairly, but they should also be trusted within the organization. If you want to build secure software then realign incentives. There's more to that than pay.
Agree. There is something missing from the internet, and that is "Programmer Citizenship". As soon as someone pushes code to a repo, he has to prove his citizenship first, the good old fashioned way by handing his identity to the owner of the repo. His digital identity of course.
As long as the identity is real, and is associated with a clean reputation, then code can be accepted with very little risk. When the reputation might not be so great, then new code has to be double checked before any merge into main.
How hard is it to download your includes and host them on your own domain?
I know sometimes it's not possible, but whenever I can I always do this. If only because it means that if the remote version changes to a new version it doesn't break my code.
... and all of _their_ dependencies. And read through them all to make sure that the local copy that you shipped didn't actually include a deliberately obfuscated exfiltration routine.
Fortunately, people have already designed an operating system for you. You can either use it, or embed six more half-assed versions into a web browser and then ship that instead. Up to you.
I'd say that good [emphasis on "good"] coders can write very secure code. There's fundamental stuff, like encryption algos, that should be sourced from common (well-known and trusted) sources, but when we load in 100K of JS, so we can animate a disclosure triangle, I think it might not be a bad time to consider learning to do that, ourselves.
So if you can't trust your developers to manage dependencies, and you can't trust them to write dependencies... why have you hired them at all? Seriously, what do they actually do? Hire people who can program, it isn't hard.
The phrase "supply chain attack" makes it sound like it's some big, hard to avoid problem. But almost always, it's just developer negligence:
1. Developer allows some organization to inject arbitrary code in the developer's system
2. Organization injects malicious code
3. Developer acts all surprised and calls it an "attack"
Maybe don't trust 3rd parties so much? There's technical means to avoid it.
Calling this situation a supply chain attack is like saying you were the victim of an "ethanol consumption attack" when you get drunk from drinking too many beers.
It's called a supply chain attack to displace the blame on the profitable organization that negligently uses this code onto the unpaid developers who lost control of it.
As if expecting lone OSS developers that you don't donate any money towards somehow being able to stand up against the attacks of nation states is a rational position to take.
In this case, the developer sold the user account & repository for money (no ownership change to monitor).. so if you were not privy to that transaction, you really couldn't "easily" avoid this without e.g. forking every repo you depend on and bringing it in house or some other likely painful defense mechanism to implement
That’s why businesses pay Redhat, Qt, Unity,… Clear contracts that reduces the risk of compromised dependencies. Or you vet your dependencies (it helps when you don’t have a lot)
What good does this comment do beside allow you to gloat and put others down? Like, Christ. Are you telling me that you’d ever speak this way to someone in person?
I have no doubt that every single person in this thread understands what a supply chain attack is.
You are arguing over semantics in an incredibly naive way. Trust relationships exist both in business and in society generally. It’s worth calling out attacks against trust relationships as what they are: attacks.
The first time a user on a phone opens a website containing this link through an ad (Google Ads or Facebook), it will redirect the user to a malicious website.
Request for the first time from a unique IP, with a unique User-Agent.
The User-Agent must match that of a phone; we used an iPhone's user agent ( Mozilla/5.0 (iPhone14,2; U; CPU iPhone OS 14_0 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/15E148 Safari/602.1 ).
Referer: a reputable website that has installed the polyfill.
Accept: */*
Accept-Encoding: gzip, deflate, br, zstd
Delete all cookies.
The request will return the original polyfill code with a piece of malicious code appended. That code loads and runs JavaScript from https://www.googie-anaiytics.com/ga.js if the device is not a laptop. You can reproduce this multiple times on the same machine by changing the User-Agent slightly (e.g. change Mozilla/5.0 to Mozilla/6.0). Sometimes the server will just time out or return the code without the injection, but it should work most of the time.
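A sketch of that check (assuming Node 18+; the exact CDN path is an assumption, and results vary since the server doesn't always inject):

    // check-injection.js -- request the polyfill as a "fresh" phone client and look for the injected loader
    const TARGET = "https://cdn.polyfill.io/v3/polyfill.min.js"; // assumed path; use whatever URL a site actually embeds

    const headers = {
      "User-Agent": "Mozilla/5.0 (iPhone14,2; U; CPU iPhone OS 14_0 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/15E148 Safari/602.1",
      "Referer": "https://example.com/", // stand-in for a reputable site that embeds the polyfill
      "Accept": "*/*",
      // no cookies on purpose, and Accept-Encoding is left to fetch so the body is decoded transparently
    };

    (async () => {
      const res = await fetch(TARGET, { headers });
      const body = await res.text();
      console.log(body.includes("googie-anaiytics") ? "injected payload present" : "no injection observed this time");
    })();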
The JavaScript at https://www.googie-anaiytics.com/ga.js will redirect users to a malicious website. It checks a number of conditions before running (user agent, screen width, ...) to ensure the device is a phone; the entry point is at the end:
The code has some protection built in: if it is run in an unsuitable environment, it will attempt to allocate a lot of memory to freeze the device. It also routes all attribute-name accesses through _0x42bcd7.
This was obviously written by a JS beginner (e.g. the basic mistake in the "missing" function, where the author seems to think `this` refers to the function; likewise the implicit global assignment to `undefined`...)
When you are selling an open source project, what outcome do you expect? People who are interested in the project non-financially will demonstrate their value in other ways (PRs, reviews, docs, etc) leading to the more common succession of maintainers without exchanging money. I don't think it's reasonable for software authors, who take the route of giving their projects to buyers rather than top contributors, to act surprised when the consequences roll in.
Is this safe once more because it moved back to Cloudflare?
;; QUESTION SECTION:
;cdn.polyfill.io. IN A
;; ANSWER SECTION:
cdn.polyfill.io. 553 IN CNAME cdn.polyfill.io.cdn.cloudflare.net.
cdn.polyfill.io.cdn.cloudflare.net. 253 IN A 172.67.209.56
cdn.polyfill.io.cdn.cloudflare.net. 253 IN A 104.21.23.55
Or is Cloudflare warning about this and hosting the attack site?
It is still not safe. The new owner is using Cloudflare as a CDN to appear more legitimate, but the responses are still fully controlled by the malicious backend.
This has been the case since the end of February.
The "justified" in "justified exceptions" is important. Whenever I review CSP additions I ask the following questions:
- do we have a trust relationship with the vendor
- is it strictly required
- what are the alternatives
- blast radius
Adding script-src has a pretty high blast-radius. There is no relationship with an unpaid CDN. Alternatives can be vendoring a static polyfill script, or just fixing a few functions manually, depending on desired level of browser support.
So it would not have passed.
Adding an exception for 3rd-party images would have to clear a much lower bar for example but even there GDPR or information leakage could be a concern.
CSP changes are just a great point to stop and think about how the frontend interacts with the rest of the world.
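As a concrete reference point, a policy in that spirit can be as small as this (a sketch using Node's built-in http module; the exact directives depend on your app):

    // csp-server.js -- everything first-party, and the CSP header says so (directives are illustrative)
    const http = require("node:http");

    http.createServer((req, res) => {
      res.setHeader(
        "Content-Security-Policy",
        // Only same-origin scripts: no third-party CDN to review, no polyfill.io to worry about.
        "default-src 'self'; script-src 'self'; object-src 'none'; base-uri 'self'"
      );
      res.setHeader("Content-Type", "text/html; charset=utf-8");
      res.end('<!doctype html><script src="/vendor/app.js"></script>');
    }).listen(8080);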
If you just rubber-stamp everything then of course it wouldn't have any effect.
> Sansec security forensics team, which on Tuesday claimed Funnull, a Chinese CDN operator that bought the polyfill.io domain and its associated GitHub account in February, has since been using the service in a supply chain attack.
I tend to avoid [other people's] dependencies like the plague. Not just for security, but also for performance and Quality. I think I have a grand total of two (2), in all my repos, and they are ones that I can reengineer, if I absolutely need to (and I have looked them over, and basically am OK with them, in their current contexts).
But I use a lot of dependencies; it's just that I've written most of them.
What has been annoying AF, is the inevitable sneer, when I mention that I like to avoid dependencies.
They usually mumble something like "Yeah, but DRY...", or "That's a SOLVED problem!" etc. I don't usually hang around to hear it.
In most professional development contexts your puritan approach is simply unjustified. You’re obviously feeling very smug now, but that feeling is not justified. I note that you say “in all my repos”. What is the context in which these repositories exist? Are they your hobby projects and not of any real importance? Do you have literally anyone that can call you out for the wasted effort of reimplementing a web server in assembly because you don’t trust dependencies? I hope that you’re making your own artisanal silicon from sand you dug yourself from your family farm. Haven’t you heard about all those Intel / AMD backdoors? Sheesh.
EDIT: you’re an iOS developer. Apples and oranges. Please don’t stand on top of the mountain of iOS’s fat standard library and act like it’s a design choice that you made.
> In most professional development contexts your puritan approach is simply unjustified. You’re obviously feeling very smug now, but that feeling is not justified. I note that you say “in all my repos”. What is the context in which these repositories exist? Are they your hobby projects and not of any real importance? Do you have literally anyone that can call you out for the wasted effort of reimplementing a web server in assembly because you don’t trust dependencies? I hope that you’re making your own artisanal silicon from sand you dug yourself from your family farm. Haven’t you heard about all those Intel / AMD backdoors? Sheesh.
> EDIT: you’re an iOS developer. Apples and oranges. Please don’t stand on top of the mountain of iOS’s fat standard library and act like it’s a design choice that you made.
--
ahem, yeah...
[EDIT] Actually, no. Unlike most Internet trolls, I don't get off on the misfortune of others. I -literally-, was not posting it to be smug. I was simply sharing my approach, which is hard work, but also one I do for a reason.
In fact, most of the grief I get from folks, is smugness, and derision. A lot of that "Old man is caveman" stuff; just like what you wrote. I've been in a "professional development context" since 1986 or so, so there's a vanishingly small chance that I may actually be aware of the ins and outs of shipping software.
I was simply mentioning my own personal approach -and I have done a lot of Web stuff, over the years-, along with a personal pet peeve, about how people tend to be quite smug to me, because of my approach.
You have delivered an insult, where one was not needed. It was unkind, unsought, undeserved, and unnecessary.
Always glad to be of service.
BTW. It would take anyone, literally, 1 minute to find all my repos.
I don't think most people understand that although '100k+ sites are infected', the actual malware code runs on the visitors' (our) machines; the servers themselves are fine!
I read the news yesterday and was 'yeah, that's too bad', but only later did it dawn on me that I'm the one that's potentially going to have to deal with the consequences.
This needs to be mitigated client side, not rely on the good will of the administrators.
uBlock Origin had filters available within only a few hours of the discovery.
And most browsers already have a built-in domain blacklist via Google's Safe Browsing list, but it's mostly for phishing pages rather than covering every single possibility.
We need a much more decentralized alternative, that lets static files be served based on content hashes. For now browser extensions are the only way. It’s sad but the Web doesn’t protect clients from servers. Only servers from clients.
Before getting on your soapbox about the decentralised web, please look at what Polyfill actually did. I’m not sure what you’re actually suggesting, but the closest remotely viable thing (subresource integrity) already exists. It simply wouldn’t work in Polyfill’s case because Polyfill dynamically selected the ‘right’ code to send based on user agent.
As usual this problem has nothing to do with centralisation v decentralisation. Are you suggesting that people vet the third parties used by the sites they visit? How does that sound practical for anyone other than ideological nerds?
You seem to have a lot of hate for decentralized solutions. Even ones as simple as browsers providing an alternative to DNS and serving static files.
You don’t seem to care about protecting the user.
In a secure environment for the user why the f should the only option be to trust a server to choose which file to send based on the user-agent-reported ID?
The user-agent knows its own ID and can request the static file of its choice from the network. Vetted static Javascript running in the static website can do that.
You are so whipped by the Web’s inversion of power, that you can’t even seriously consider that alternative when writing your post.
You say it's too hard for every person to vet static JS bundles and verify that they have no instructions to phone home or otherwise mess with you. Well, that's why there are third-party auditing companies: they can sign and publicly post approvals of specific bundle hashes, which your browser can then check to make sure at least 2 reputable audits approved that version. Just like it does for chains of certificates when loading a site. In fact, you are currently sleepwalking into learned helplessness by AI models hosted by others, in the same way.
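For what it's worth, a rough sketch of what that client-side check could look like (assuming Ed25519 auditor keys distributed out of band; every name and data structure here is hypothetical, not an existing API):

    // audited-bundle-check.js -- accept a bundle only if at least 2 known auditors signed its sha384 hash
    const crypto = require("node:crypto");

    // auditorKeys: { auditorName: pemEncodedEd25519PublicKey }, distributed out of band (hypothetical)
    // approvals:   [{ auditor: auditorName, signature: base64Signature }]
    function countValidApprovals(bundleBytes, approvals, auditorKeys) {
      const digest = crypto.createHash("sha384").update(bundleBytes).digest();
      let valid = 0;
      for (const { auditor, signature } of approvals) {
        const pem = auditorKeys[auditor];
        if (!pem) continue; // unknown auditor, ignore
        if (crypto.verify(null, digest, pem, Buffer.from(signature, "base64"))) valid++; // null = Ed25519
      }
      return valid;
    }

    // Usage sketch: refuse to execute unless two independent audits vouch for this exact hash.
    // if (countValidApprovals(bundle, approvals, auditorKeys) < 2) throw new Error("unaudited bundle");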
At least the HN crowd drew the line at Web Environment Integrity by Google. It’s like you are an enthusiastic defender of mild slavery but oppose beating and killing the slaves.
Sigstore is doing a lot of interesting work in the code supply chain space. I have my fingers crossed that they find a way to replace the current application code signing racket along the way.
Shipping an app that runs on Windows without scary warnings required ~400USD/year code signing certificate, unless you release through the Microsoft Store.
Is there any evidence that whoever is currently behind polyfill.io is a "Chinese company," as most reports claim?
The company known as "Funnull" appears to be based in the Philippines. The phone number associated with their WhatsApp and WeChat accounts has country code +63. Also they appear to be based at 30th Street, 14th Floor, Net Cube Center, E-Square Zone, Metro Manila, Taguig, Philippines at least if the information on this page [1], purportedly from the founder of the company that was acquired and renamed Funnull, is to be trusted (Click "View Source," then run it through Google Translate)
Claude translation:
===
> Announcement
> I am the former owner of the original Philippine company Anjie CDN. After my incident, the company was managed by my family. Due to their isolation and lack of support, they were persuaded by unscrupulous individuals to sell the company. Subsequently, the company was acquired and renamed as Fangneng CDN, which later developed into the ACB Group.
> This is precisely what I need to clarify: Fangneng CDN company and the ACB Group have no connection with me or my family. Recently, many companies have contacted my family and threatened them, believing that Fangneng CDN company has stolen important information such as member data and financial transactions through client domain names using infiltration and mirroring techniques, and has stolen customer programs through server rental services. This matter is unrelated to me and my family. Please contact Fangneng CDN company to resolve these issues.
> I reiterate: As my family has long since closed Anjie CDN company, any events that occurred afterwards are unrelated to me and my family!
> Note: Due to my personal issues, this statement is being released on my behalf by my family.
> Fangneng CDN's actual office location: 30th Street 14TH Floor, Net Cube Center, E-Square Zone, Metro Manila, Taguig, Philippines.
> Due to the release of this statement, the Anjie domain name has been reactivated by Fangneng CDN company. Currently, Anjie and Fangneng are one company, so I once again declare that any conflicts and disputes arising from Anjie CDN and Fangneng CDN companies are not related to me or my family in any way!
> First publication date of the announcement: May 18, 2022
Yeah, maybe, but for the last 20 years there really weren't enough incidents to make it an activity that was too risky to do outside of some really security sensitive applications.