Hacker News
Polyfill supply chain attack hits 100K+ sites (sansec.io)
887 points by gnabgib 5 months ago | 370 comments



Game theory at work? Someone needs to maintain legacy code for free that thousands of sites depend on, and gets nothing but trouble (pride?) in return. Meanwhile the forces of the world offer riches and power in return for turning to the dark side (or maybe just letting your domain lapse and doing something else).

If security means every maintainer of every OSS package you use has to be scrupulous, tireless, and never screw up for life, I'm not sure what to say when this kind of thing happens other than "isn't that the only possible outcome given the system and incentives on a long enough timeline?"

Kind of like the "why is my favorite company monetizing now and using dark patterns?" Well, on an infinite timeline did you think service would remain high quality, free, well supported, and run by tireless, unselfish, unambitious benevolent dictators for the rest of your life? Or was it a foregone conclusion that was only a matter of "when", not "if"?


It seems when proprietary resources get infected it's because hackers are the problem, but when open source resources get infected it's a problem with open source.

But there isn't any particular reason why a paid/proprietary host couldn't just as easily end up being taken over / sold to a party intending to inject malware. It happens all the time really.


Yes, the economic problem of reward absence is exclusive to open source, and private software does not have it. It may have others, like an excess of rewards to hackers in the form of crypto ransoms, to the point that the defense department had to step in and ban payouts.


Private software always wants more rewards, leading to identical symptoms.


Private software already has rewards that may be threatened by certain types of behaviour, leading to reduced symptoms.


Hasn't stopped Larry Ellison from laughing all the way to the bank.


private != profitable

As long as business is not going as well as the owners want, the same economic problem exists in private software too - in fact, private companies get acquired all the time too, and they get shut down, causing a denial of service for many of their clients.

(See for example: https://www.tumblr.com/ourincrediblejourney )

One difference is that closed-source software is usually much less efficient; I cannot imagine "100K+" customers from a commercial org with just a single developer. And when there are dozens or hundreds of people involved, it's unlikely that new owners would turn to outright criminal activity like malware; they are much more likely to just shut down.


Agreed, but if a company is making millions from the security of its software, the incentive is to keep it secure so customers stick with it. Remember the LastPass debacle: a big leak, and they lost many customers...


Directly security-focused products like LastPass are the only things that have any market pressure whatsoever on this, and that's because they're niche products for which security is the only value-add, marketed to explicitly security-conscious people and not insulated by a whole constellation of lock-in services. The relevant security threats for the overwhelming majority of people and organizations are breaches caused by the practices of organizations that face no such market pressure, including constant breaches of nonconsensually-harvested data, which aren't even subject to market pressure from their victims in the first place.


Even for security-related products the incentives are murky. If they're not actually selling you security but a box on the compliance bingo then it's more likely that they actually increase your attack surface because they want to get their fingers into everything so they can show nice charts about all the things they're monitoring.


Aye. My internal mythological idiolect's trickster deity mostly serves to personify the game-theoretic arms race of deception and is in a near-constant state of cackling derisively at the efficient market hypothesis


I wouldn’t point to LastPass as an exemplar…

https://www.theverge.com/2024/5/1/24146205/lastpass-independ...


I didn't, and my point was exactly that it's not a great one, so I think we largely agree here


Only if they have competition. Which long term is not the default state in the market.


> Which long term is not the default state in the market

Why not?


Why did Facebook buy Instagram and Whatsapp? Why did Google buy Waze? Why did Volkswagen buy Audi and Skoda and Seat and Bentley and so on?

Might as well ask why companies like money.


Sure, but the car industry is pretty old and there's plenty of competition still.


I’m not sure about plenty when you actually look at it, but in any case it’s because of government intervention.


Think about it for a second mate: competition leads to winners and winners don't like competition, they like monopolies.

And no, the car industry has almost no competition. It's an oligopoly with very few players and a notoriously hard industry to get in.


Oh yeah, corporations are so accountable. We have tons of examples of this.


I think for some reason some people still buy cisco products, so this reasoning doesn't seem to be applicable to the real world.


Real solution?

We’re in a complexity crisis and almost no one sees it.

It’s not just software dependencies of course. It’s everything almost everywhere.

No joke, the Amish have a point. They were just a few hundred years too early.


I 100% agree. I feel a huge part of my responsibility as a software "engineer" is to manage complexity. But I feel I'm fighting a losing battle; most everyone seems to pull in the opposite direction.

Complexity increases your surface area for bugs to hide in.

I've come to the conclusion it's tragedy-of-the-commons incentives: People get promotions for complex and clever work, so they do it, at the cost of a more-complex-thus-buggy solution.

And yes, it's not just software, it's everywhere. My modern BMW fell apart, in many many ways, at the 7 year mark, for one data point.


Right, no one is incentivizing simple solutions. Or making sure that the new smart system just cannot be messed up.

We need thinner, faster, lighter, leaner on everything… because, IDK why, MBAs have decided that reliability will just not sell.


It's a competence crisis, not a complexity one.

https://www.palladiummag.com/2023/06/01/complex-systems-wont...


We haven’t gotten smarter or dumber.

But we have exceeded our ability to communicate the ideas and concepts, let alone the instructions of how to build and manage things.

Example: a junior Jiffy Lube high school dropout in 1960 could work hard and eventually own that store. Everything he would ever need to know about ICE engines was simple enough to understand over time… but now? There are 400 oil types, there are closed-source computers on top of computers, there are specialty tools for every vehicle brand, and you can't do anything at all without knowing 10 different do-work-just-to-do-more-work systems. The high school dropout in 2024 will never own the store. Same kid. He hasn't gotten dumber. The world just passed him by in complexity.

Likewise… I suspect that Boeing hasn’t forgotten how to build planes, but the complexity has exceeded their ability. No human being on earth could be put in a room and make a 747 even over infinite time. It’s a product of far too many abstract concepts in a million different places that have come together to make a thing.

We make super complex things with zero effort put into communicating how or why they work the way they do.

We increase the complexity just to do it. And I feel we are hitting our limits.


The problem w/ Boeing is not the inability of people to manage complexity but of management's refusal to manage complexity in a responsible way.

For instance, MCAS on the 737 is a half-baked implementation of the flight envelope protection facility on modern fly-by-wire airliners (all of them, except for the 737). The A320 had some growing pains with this; in particular there were at least two accidents where pilots tried to fly the plane into the ground, thinking the attempt would fail because of the flight envelope protection system, but they succeeded and crashed anyway. Barring that bit of perversity right out of the Normal Accidents book, people understand perfectly well how to build a safe fly-by-wire system. Boeing chose not to do that, and they refused to properly document what they did.

Boeing chose not to develop a 737 replacement, so all of us are suffering: in terms of noise, for instance, pilots are going deaf, passengers have their heads spinning after a few hours in the plane, and people on the ground have no idea that the 737 is much louder than its competitors.


Okay but your entire comment is riddled with mentions of complex systems (flight envelope system?) which proves the point of the parent comment. "Management" here is a group of humans who need to deal with all the complexity of corporate structures, government regulations, etc.. while also dealing with the complexities of the products themselves. We're all fallible beings.


Boeing management is in the business of selling contracts. They are not in the business of making airplanes. That is the problem. They relocated headquarters from Seattle to Chicago and now DC so that they can focus on their priority, contracts. They dumped Boeing's original management style and grafted on the management style of a company that was forced to merge with Boeing. They diversified supply chain as a form of kickbacks to local governments/companies that bought their 'contracts'.

They enshittified every area of the company, all with the priority/goal of selling their core product, 'contracts', and filling their 'book'.

We are plenty capable of designing engineering systems, PLMs to manage EBOMs, MRP/ERP systems to manage MBOMs, etc to handle the complexities of building aircraft. What we can't help is the human desire to prioritize enshittification if it means a bigger paycheck. Companies no longer exist to create a product, and the product is becoming secondary and tertiary in management's priorities, with management expecting someone else to take care of the 'small details' of why the company exists in the first place.


>Example: a junior Jiffy Lube high school dropout in 1960 c

Nowadays the company wouldn't hire a junior to train. They'll only poach already experienced people from their competitors.

Paying for training isn't considered worthwhile to the company because people won't stay.

People won't stay because the company doesn't invest in employees, it only poaches.


Boeing is a kickbacks company in a really strange way. They get contracts based on including agreements to source partly from the contractee's local area. Adding complexity for the sake of contracts and management bonuses, not efficiency, not redundancy, not expertise. Add onto that a non-existent safety culture and a non-manufacturing, non-aerospace-focused management philosophy grafted on from a company that failed and had to be merged into Boeing, replacing the previous Boeing management philosophy. Enshittification in every area of the company. Heck, they moved headquarters from Seattle to Chicago, and now from Chicago to DC. Prioritizing being where the grift is over, you know, being where the functions of the company are so that management has a daily understanding of what the company does. Because to management, what the company does is win contracts, not build aerospace products. 'Someone else' takes care of that detail, according to Boeing management. Building those products is now secondary/tertiary to management.

I did ERP/MRP/EBOM/MBOM/BOM systems for aerospace. We have that stuff down. We have systems for this kind of communication down really well. We can build, within a small window, an airplane with thousands of parts, with lead times from 1 day to 3 months to over a year for certain custom config options, with each part's design/FAA approval/manufacturing/installation tracked and audited. Boeing's issue is culture, not humanity's ability to make complex systems.

But I do agree that there is a complexity issue in society in general, and a lot of systems are coasting on the efforts of those who originally put them in place/designed them. A lot of government seems to be this way too. There's also a lot of overhead for overhead's sake, but little process-auditing/iterative-improvement style management.


I see both, incentivized by the cowboy developer attitude.


Ironically I think you got that almost exactly wrong.

Avoiding "cowboyism" has instead lead to the rise of heuristics for avoiding trouble that are more religion than science. The person who is most competent is also likely to be the person who has learned lessons the hard way the most times, not the person who has been very careful to avoid taking risks.

And let me just say that there are VERY few articles so poorly written that I literally can't get past the first paragraph, and an article that cherry-picks disasters to claim generalized incompetence scores my very top marks for statistically incompetent disingenuous bullshit. There will always be a long tail of "bad stuff that happens" and cherry-picking all the most sensational disasters is not a way of proving anything.


I'm predisposed to agree with the diagnosis that incompetence is ruining a lot of things, but the article boils down to "diversity hiring is destroying society" and seems to attribute a lot of the decline to the Civil Rights Act of 1964. Just in case anybody's wondering what they would get from this article.

> By the 1960s, the systematic selection for competence came into direct conflict with the political imperatives of the civil rights movement. During the period from 1961 to 1972, a series of Supreme Court rulings, executive orders, and laws—most critically, the Civil Rights Act of 1964—put meritocracy and the new political imperative of protected-group diversity on a collision course. Administrative law judges have accepted statistically observable disparities in outcomes between groups as prima facie evidence of illegal discrimination. The result has been clear: any time meritocracy and diversity come into direct conflict, diversity must take priority.

TL;DR "the California PG&E wildfires and today's JavaScript vulnerability are all the fault of Woke Politics." Saved you a click.


A more fundamental reason is that society is no longer interested in pushing forward at all costs. It's the arrival at an economic and technological equilibrium where people are comfortable enough, along with the end of the belief in progress as an ideology, or way to salvation somewhere during the 20th century. If you look closely, a certain kind of relaxation has replaced the quest for efficiency everywhere. Is that disappointing? Is that actually bad? Do you think there might be a rude awakening?

Consider: it was this scifi-fueled dream of an amazing high-tech, high-competency future that also implied machines doing the labour, and an enlightened future relieving people of all kinds of unpleasantness like boring work, thereby preventing them from attaining high competency. The fictional starship captain, navigating the galaxy and studying alien artifacts, was always saving planets full of humans in a desolate mental state...


My own interpretation of the business cycle is that growth causes externalities that stop growth. Sometimes you get time periods like the 1970s where efforts to control externalities would themselves cause more problems than they solved, at least some of the time. (E.g. see the trash 1974 model year of automobiles, where they hadn't figured out how to make emission controls work.)

I’d credit the success of Reagan in the 1980s at managing inflation to a quiet policy of degrowth the Republicans could get away with because everybody thinks they are “pro business”. As hostile as Reagan’s rhetoric was towards environmentalism, note we got new clean air and clean water acts in the 1980s, but that all got put on pause under Clinton, when irresponsible monetary expansion restarted.


> My own interpretation of the business cycle is that growth causes externalities that stop growth.

The evidence seems to be against this.

https://eml.berkeley.edu/~enakamura/papers/plucking.pdf


> along with the end of the belief in progress as an ideology, or way to salvation somewhere during the 20th century.

That 20th century belief in technological progress as a "way to salvation" killed itself with smog and rivers so polluted they'd catch on fire, among other things.


Thank you for summarizing (I actually read the whole article before seeing your reply and might have posted similar thoughts). I get the appeal of romanticizing our past as a country, looking back at the post-war era, especially the space race with a nostalgia that makes us imagine it was a world where the most competent were at the helm. But it just wasn't so, and still isn't.

Many don't understand that the Civil Rights Act describes the systematic LACK of a meritocracy. It defines the ways in which merit has been ignored (gender, race, class, etc) and demands that merit be the criteria for success -- and, absent the ability of an institution to decide on the merits, it provides a (surely imperfect) framework to force them to do so. The necessity of the CRA, then and now, is evidence of the absence of a system driven on merit.

I want my country to keep striving for a system of merit but we've got nearly as much distance to close on it now as we did then.


>Many don't understand that the Civil Rights Act describes the systematic LACK of a meritocracy. It defines the ways in which merit has been ignored (gender, race, class, etc) and demands that merit be the criteria for success

Stealing that. Very good.


The word "meritocracy" was invented for a book about how it's a bad idea that can't work, so I'd recommend not trying to have one. "Merit" doesn't work because of Goodhart's law.

I also feel like you'd never hire junior engineers or interns if you were optimizing for it, and then you're either Netflix or you don't have any senior engineers.


FWIW Michael Young, Baron Young of Dartington, the author of the 1958 book The Rise of the Meritocracy, popularised the term, which rapidly lost the negative connotations he put upon it.

He didn't invent the term though, he lifted it from an earlier essay by another British sociologist Alan Fox who apparently coined it two years earlier in a 1956 essay.

https://en.wikipedia.org/wiki/The_Rise_of_the_Meritocracy


I think this is the wrong takeaway.

Everything has become organized around measurable things and short-term optimization. "Disparate impact" is just one example of this principle. It's easy to measure demographic representation, and it's easy to tear down the apparent barriers standing in the way of proportionality in one narrow area. Whereas, it's very hard to address every systemic and localized cause leading up to a number of different disparities.

Environmentalism played out a similar way. It's easy to measure a factory's direct pollution. It's easy to require the factory to install scrubbers, or drive it out of business by forcing it to account for externalities. It's hard to address all of the economic, social, and other factors that led to polluting factories in the first place, and that will keep its former employees jobless afterward. Moreover, it's hard to ensure that the restrictions apply globally instead of just within one or some countries' borders, which can undermine the entire purpose of the measures, even though the zoomed-in metrics still look good.

So too do we see with publicly traded corporations and other investment-heavy enterprises: everything is about the stock price or other simple valuation, because that makes the investors happy. Running once-venerable companies into the ground, turning mergers and acquisitions into the core business, spreading systemic risk at alarming levels, and even collapsing the entire economy don't show up on balance sheets or stock reports as such and can't easily get addressed by shareholders.

And yet now and again "data-driven" becomes the organizing principle of yet another sector of society. It's very difficult to attack the idea directly, because it seems to be very "scientific" and "empirical". But anecdote and observation are still empirically useful, and they often tell us early on that optimizing for certain metrics isn't the right thing to do. But once the incentives are aligned that way, even competent people give up and join the bandwagon.

This may sound like I'm against data or even against empiricism, but that's not what I'm trying to say. A lot of high-level decisions are made by cargo-culting empiricism. If I need to choose a material that's corrosion resistant, obviously having a measure of corrosion resistance and finding the material that minimizes it makes sense. But if the part made out of that material undergoes significant shear stress, then I need to consider that as well, which probably won't be optimized by the same material. When you zoom out to the finished product, the intersection of all the concerns involved may even arrive at a point where making the part easily replaceable is more practical than making it as corrosion-resistant as possible. No piece of data by itself can make that judgment call.


Probably both, IMHO.


Well, on the web side, it'd be a lot less complex if we weren't trying to write applications using a tool designed to create documents. If people compiled Qt to WASM (for instance), or for a little lighter weight, my in-development UI library [1] compiled to WASM, I think they'd find creating applications a lot more straightforward.

[1] https://github.com/eightbrains/uitk


Most apps don’t need to be on the web. And the ones that need to be can be done with the document model instead of the app model. We added bundles of complexity to an already complex platform (the browser).


> Real solution?

I don't think there's any. Too many luminaries are going to defend the fact that we can have things like "poo emojis" in domain names.

They don't care about the myriad of homograph/homoglyph attacks made possible by such an idiotic decision. But they've got their shiny poo, so at least they're happy idiots.

It's a lost cause.


> Too many luminaries are going to defend the fact that we can have things like "poo emojis" in domain names. They don't care about the myriad of homograph/homoglyph attacks made possible by such an idiotic decision.

There is nothing idiotic about the decision to allow billions of people with non-latin scripts to have domain names in their actual language.

What's idiotic is to consider visual inspection of domain names a necessary security feature.


DNS could be hosted on a blockchain, with each person using his own rules for validating names, and rejecting, accepting or renaming any ambiguous or dangerous part of the name, in a totally secure and immutable way.

Blockchain has the potential to be the fastest and cheapest network on the planet, because it is the only "perfect competition" system on the internet.

"Perfect competition" comes from game theory, and "perfect" means that no one is excluded from competing. "Competition" means that the best performing nodes of the network put the less efficient nodes out of business.

For the moment unfortunately, there is no blockchain which is the fastest network on the planet, but that's gonna change. Game theory suggests that there will be a number of steps before that happens, and it takes time. In other words, the game will have to be played for a while, for some objectives to be achieved.

UTF-8 and glyphs are not related to supply chains, and that's a little bit off topic, but I wanted to mention that there is a solution.


in a strange way, this almost makes the behavior of hopping onto every new framework rational. The older and less relevant the framework, the more the owner's starry-eyed enthusiasm wears off. The hope that bigcorp will pay $X million for the work starts to fade. The tedium of bug fixes and maintenance wears on, and the game theory takes its toll. The only rational choice for library users is to jump ship once the number of commits and the hype start to fall -- that's when the owner is most vulnerable to the vicissitudes of Moloch.


> in a strange way, this almost makes the behavior of hopping onto every new framework rational.

Or maybe not doing that and just using native browser APIs? Many of these frameworks are overkill and having so many "new" ones just makes the situation worse.


Many of them predate those native browser APIs. Polyfills, the topic at hand, were literally created to add modern APIs to all browsers equally (most notably old Safaris, Internet Explorers, etc.).


Good point. What's often (and sometimes fairly) derided as "chasing the new shiny" has a lot of other benefits too: increased exposure to new (and at least sometimes demonstrably better) ways of doing things; ~inevitable refactoring along the way (otherwise much more likely neglected); use of generally faster, leaner, less dependency-bloated packages; and an increased real-world userbase for innovators. FWIW, my perspective is based on building and maintaining web-related software since 1998.


To be fair, there is a whole spectrum between "chasing every new shiny that gets a blog post" and "I haven't changed my stack since 1998."

There are certainly ways to get burned by adopting shiny new paradigms too quickly; one big example on the web is the masonry layout that Pinterest made popular, which in practice is so complicated that no browser has a full implementation of the CSS standard.


CSS Masonry is not even standardized yet. There is a draft spec: https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_grid_la... and ongoing discussion whether it should be part of CSS grid or a new `display` property.


Would you consider yourself as to have "chased the new shiny"? If you don't mind, how many changes (overhauls?) would you say you've made?


To be fair, when it comes to React, I don't think there is a realistic "new shiny" yet. NextJS is (was?) looking good, although I have heard it being mentioned a lot less lately.


Perhaps. I view it as the squalor of an entirely unsophisticated market. Large organizations build and deploy sites on technologies with ramifications they hardly understand or care about, because there is no financial benefit for them in doing otherwise, and because the end user lacks the same sophistication and is in no position to change the economic outcomes.

So an entire industry of bad middleware created from glued together mostly open source code and abandoned is allowed to even credibly exist in the first place. That these people are hijacking your browser sessions rather than selling your data is a small distinction against the scope of the larger problem.


I believe tea, the replacement for homebrew, is positioning itself as a solution to this, but I rarely see it mentioned here: https://tea.xyz/


Tea is not the “replacement for homebrew” apart from the fact that the guy that started homebrew also started tea. There’s a bunch of good reasons not to use tea, not least the fact that it’s heavily associated with cryptocurrency bullshit.


Alternatively, if you rely on some code then download a specific version and check it before using it. Report any problems found. This makes usage robust and supports open source maintenance and development.


I'm afraid this is hitting the other end of inviolable game theory laws. A dev who is paid for features and business value wants to read, line by line, a random package upgrade from version 0.3.12 to 0.3.13 in a cryptography or date lib they likely don't understand? And this should be done for every change of every library for all software, by all devs, who will always be responsible, not lazy, and very attentive and careful.

On the flip side there is "doing as little as possible and getting paid" for the remainder of a 40 year career where you are likely to be shuffled off when the company has a bad quarter anyway.

In my opinion, if that was incentivized by our system, we'd already be seeing more of it, we have the system we have due to the incentives we have.


Correct. I don't think I have ever seen sound engineering decisions being rewarded at any business I have worked for. The only reason any sound decisions are made is that some programmers take the initiative, but said initiative rarely comes with a payoff and always means fighting with other programmers who have a fetish for complexity.

If only programmers had to take an ethics oath so they have an excuse not to just go along with idiotic practices.


Then there are the programmers who read on proggit that “OO drools, functional programming rules” or the C++ programmers who think having a 40 minute build proves how smart and tough they are, etc.


lmao you jogged a memory in my brain... in my senior final semester in college, our professor had us agree to the IEEE Code of Ethics:

https://www.computer.org/education/code-of-ethics

In retrospect I can say I've held up for the most part, but in some cases have had to quit certain jobs due to overwhelming and accelerating nonsense.

Usually it's best for your mental well-being to just shut up and get paid ;)


"Fetish for complexity" haha. This sounds so much better than "problem engineer".


> Report any problems found. This makes usage robust and supports open source maintenance and development.

Project maintainers/developers are not free labor. If you need a proper solution to any problem, make a contract and pay them. This idea that someone will magically solve your problem for free needs to die.

https://www.softwaremaxims.com/blog/not-a-supplier


Vendoring should be the norm, not the special case.

Something like this ought to be an essential part of all package managers, and I'm thinking here that the first ones should be the thousands of devs cluelessly using NPM around the world:

https://go.dev/ref/mod#vendoring


We've seen a lot more attacks succeed because somebody has vendored an old vulnerable library than supply chain attacks. Doing vendoring badly is worse than relying on upstream. Vendoring is part of the solution, but it isn't the solution by itself.


Not alone, no. That's how CI bots help a lot, such as Dependabot.

Although it's also worrying how we seemingly need more technologies on top of technologies just to keep a project alive. It used to be just including the system's patched headers & libs; now we need extra bots surveying everything...

Maybe a linux-distro-style of community dependency management would make sense. Keep a small group of maintainers busy with security patches for basically everything, and as a downstream developer just install the versions they produce.

I can visualize the artwork..."Debian but for JS"


In the old ways, you mostly relied on a few libraries that each solved a complete problem and were backed by a proper community. The odd dependency was usually small and vendored properly. Security was mostly a concern of the environment (the OS), as the data was either client-side or on some properly managed enterprise infrastructure. Now we have npm with its microscopic and numerous packages, everyone wants to be on the web, and they all want your data.


I guess that would work if new exploits weren’t created or discovered.

Otherwise your whole plan to “run old software” is questionable.


That isn't the plan. For this to work new versions have to be aggressively adopted. This is about accepting that using an open source project means adopting that code. If you had an internal library with bug fixes available then the right thing is to review those fixes and merge them into the development stream. It is the same with open source code you are using. If you care to continue using it then you need to get the latest and review code changes. This is not using old code, this is taking the steps needed to continue using code.


> did you think service would remain high quality, free, well supported, and run by tireless, unselfish, unambitious benevolent dictators for the rest of your life

I would run some of the things I run forever for free if, once in a while, one user were grateful. In reality that doesn’t happen, so I usually end up monetising and then selling it off. People whine about everything and get upset if I don’t answer tickets within a working day, etc. Mind you, these are free things with no ads. The thing is, they expect me to fuck them over in the end as everyone does, so it becomes a self-fulfilling prophecy. Just a single email or chat saying thank you for doing this once in a while would go a long way, but alas, it’s just whining and bug reports and criticism.


Yes it’s inevitable given how many developers are out of a job now. Everyone eventually has a price


This is an insightful comment, sadly


It's amazing to me that anyone who tried to go to a website, then was redirected to an online sports betting site instead of the site they wanted to go to, would be like "hmm, better do some sports gambling instead, and hey this looks like just the website for me". This sort of thing must work on some percentage of people, but it's disappointing how much of a rube you'd have to be to fall for it.


I'm genuinely puzzled that a group with the ability to hijack 100k+ websites can think of nothing more lucrative to do than this.


Even with low rates, my first thought would probably be crypto mining via wasm. I'd never do it, but it would have been less noticeable.


Or DDoS, residential proxy, product reviews... one would really think there'd be something more lucrative to sell.


maybe they did all of that too, and still had a lot of traffic/endpoints to spare/sell?


You’d think they’d contract with some hacker group or government and use it as a vector to inject something more nefarious if money was their goal


Yeah, I guess they just genuinely love sports betting


Yes, but if you reduce your overall risk, your chances of actually receiving a payout increase immensely.


That's the parasitic equilibrium: "try to take too much - oops, now you killed yourself by becoming too much of a nuisance."


"ability" here meaning they bought the domain and the "GH repo"


I can't find the reference now, but I think I read somewhere it only redirects when the user got there by clicking on an ad. In that case it would make a bit more sense - the script essentially swaps the intended ad target to that sport gambling website. Could work if the original target was a gaming or sport link.


This assumes that advertisers know how the traffic came to their site. The malware operators could be scamming the advertisers into paying for traffic with very low conversion rates.


It plants a seed. Could be a significant trigger for gambling addicts.


It could be targeting people that already have such a website's tab open somewhere in the browser.

Assuming the user opened the website and didn't notice the redirect (this is more common on mobile), then forgot about it, and when they opened their browser again a few days later their favorite gambling website was waiting for them, they proceeded to gamble as they usually do.


Important context given by the author of polyfill:

> If your website uses http://polyfill.io, remove it IMMEDIATELY.

I created the polyfill service project but I have never owned the domain name and I have had no influence over its sale. (1)

Although I wonder how the GitHub account ownership was transferred.

(1) https://x.com/triblondon/status/1761852117579427975


Hi, I'm the original author of the polyfill service. I did not own the domain name nor the GitHub account. In recent years I have not been actively involved in the project, and I'm dismayed by what's happened here. Sites using the original polyfill.io should remove it right away and use one of the alternatives or just drop it entirely - in the modern era it really isn't needed anymore.


People should have never started including javascript from third-party domains in the first place. It was always playing with fire and there were plenty of people pointing out the risks.


Do this person telling us not to use polyfill.io and the guy who sold polyfill.io to the Chinese company both work at Fastly? If so, that's kind of awkward...


It appears both currently do work for Fastly. I am pleased the Fastly developer advocate warned us, and announced a fork and alternative hosting service:

[1] https://community.fastly.com/t/new-options-for-polyfill-io-u...

But it leaves me with an uneasy feeling about Fastly.


literally went from cloudflare to fastly and now this stuff happens....

can't catch a break


Neither of them had ownership of the project, so neither of them were responsible for the sale or benefited from it.

They both simply dedicated a lot of time, care and skill to the project. It's really a shame to see what they spent so much time building and maintaining now being used as a platform to exploit people. I'm sure it's extremely disappointing to both of them.


https://web.archive.org/web/20240229113710/https://github.co...

Is JakeChampion not the one who sold the project? His bio says he currently works at Fastly


Funnull claims that Jake Champion owned the project and transferred it to them as part of an acquisition agreement[1].

1. https://x.com/JFSIII/status/1761385341951361182


The Internet Archive shows the progression of events pretty clearly.

- In ~May 2023, the FT transferred it to Jake Champion: https://web.archive.org/web/20230505112634/https://polyfill....

- In mid-Oct, the site stated it was "Proudly sponsored by Fastly": https://web.archive.org/web/20231011015804/https://polyfill....

- In November 2023, this was dropped: https://web.archive.org/web/20231101040617/https://polyfill....

- In December 2023, JakeChampion made a sequence of edits removing his name from the repo and pointing to "Polyfill.io maintainers": https://github.com/polyfillpolyfill/polyfill-service/commit/... https://github.com/polyfillpolyfill/polyfill-service/commit/...

- In mid-February 2024, the polyfillpolyfill account was created on Github, and took ownership over the repo.

So I think sometime between October 2023 and February 2024, JakeChampion decided to sell the site to Funnull. I think the evidence is consistent with him having made a decision to sell the site to _somebody_ in December 2023, and the deal with Funnull closing sometime early February 2024.


Seems like it. I hope the money was worth it.


> this domain was caught injecting malware on mobile devices via any site that embeds cdn.polyfill.io

I've said it before, and I'll say it again: https://httptoolkit.com/blog/public-cdn-risks/

You can reduce issues like this using subresource integrity (SRI), but there are still tradeoffs (around privacy & reliability - see the article above), and there is a better solution: self-host your dependencies behind a CDN service you control (just bunny/cloudflare/akamai/whatever is fine and cheap).

In a tiny prototyping project, a public CDN is convenient to get started fast, sure, but if you're deploying major websites I would really strongly recommend not using public CDNs, never ever ever ever (the World Economic Forum website is affected here, for example! Absolutely ridiculous).
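
For anyone who hasn't used it, this is roughly what SRI looks like on a script tag - a minimal sketch with a made-up URL and a placeholder hash, not real values:

    <!-- the integrity hash pins the exact file contents; if the server ever
         returns different bytes, the browser refuses to execute the script -->
    <script src="https://cdn.example.com/libs/some-polyfill.min.js"
            integrity="sha384-REPLACE_WITH_BASE64_SHA384_OF_THE_FILE"
            crossorigin="anonymous"></script>

The hash is just a base64-encoded SHA-384 digest of the file, which you can generate locally with openssl or a similar tool. The catch is that it only works for files whose contents never change, which is exactly why it can't save you from a service that tailors its response to each user agent.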


I always prefer to self-host my dependencies, but as a developer who prefers to avoid an npm-based webpack/whatever build pipeline it's often WAY harder to do that than I'd like.

If you are the developer of an open source JavaScript library, please take the time to offer a downloadable version of it that works without needing to run an "npm install" and then fish the right pieces out of the node_modules folder.

jQuery still offer a single minified file that I can download and use. I wish other interesting libraries would do the same!

(I actually want to use ES Modules these days which makes things harder due to the way they load dependencies. I'm still trying to figure out the best way to use import maps to solve this.)
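
For what it's worth, this is the shape of the import-map approach I've been experimenting with - a minimal sketch where the module name and the /vendor paths are hypothetical stand-ins for files you've downloaded and self-hosted:

    <!-- map a bare specifier to a locally hosted ES module copy -->
    <script type="importmap">
    {
      "imports": {
        "date-utils": "/vendor/date-utils/index.js"
      }
    }
    </script>
    <!-- module code can now import by the bare name -->
    <script type="module">
      import { formatDate } from "date-utils";
      document.body.textContent = formatDate(new Date());
    </script>

The part I haven't solved nicely is flattening a library's own dependency graph into files the map can point at - which is exactly where an npm-based build step keeps sneaking back in.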


The assumption of many npm packages is that you have a bundler and I think rightly so because that leaves all options open regarding polyfilling, minification and actual bundling.


polyfilling and minification both belong on the ash heap of js development technologies.


I would agree with you if minification delivered marginal gains, but it will generally roughly halve the size of a large bundle or major JS library (compared to just gzip'ing it alone), and this is leaving aside further benefits you can get from advanced minification with dead code removal and tree-shaking. That means less network transfer time and less parse time. At least for my use-cases, this will always justify the extra build step.


I really miss the days of minimal/no use of JS in websites (not that I want java-applets and Flash LOL). Kind of depressing that so much of the current webdesign is walled behind javascript.


I don’t. There's always Craigslist and Hacker News to give that 2004 UX.


Cool, I can download 20 MB of JavaScript instead of 40. Everyone uses minification, and "web apps" still spin up my laptop fans. Maybe we've lost the plot.


I wish. When our bundles are being deployed globally and regularly opened on out of date phones and desktops, it can't be avoided yet.


There might be a negative incentive in play: you may be compressing packages, but having your dependencies available at the tip of *pm install bloats overall size and complexity beyond what lack of bundling would give you.


The assumption shouldn't be that you have a bundler, but that your tools and runtimes support standard semantics, so you can bundle if you want to, or not bundle if you don't want to.


> I always prefer to self-host my dependencies

IME this has always been standard practice for production code at all the companies I've worked at and with as a SWE or PM - store dependencies within your own internal Artifactory, have it checked by a vuln scanner, and then call and deploy.

That said, I came out of the Enterprise SaaS and Infra space so maybe workflows are different in B2C, but I didn't see a difference in the customer calls I've been on.

I guess my question is why your employer or any other org would not follow the model above?


> I guess my question is why your employer or any other org would not follow the model above?

Frankly, it's because many real-world products are pieced together by some ragtag group of bright people who have been made responsible for things they don't really know all that much about.

The same thing that makes software engineering inviting to autodidacts and outsiders (no guild or license, pragmatic 'can you deliver' hiring) means that quite a lot of it isn't "engineered" at all. There are embarrassing gaps in practice everywhere you might look.


Yep. The philosophy most software seems to be written with is “poke it until it works locally, then ship it!”. Bugs are things you react to when your users complain. Not things you engineer out of your software, or proactively solve.

This works surprisingly well. It certainly makes it easier to get started in software. Well, so long as you don’t mind that most modern software performs terribly compared to what the computer is capable of. And suffers from reliability and security issues.


Counterpoint: It's not about being an autodidact or an outsider.

I was unlikely to meet any bad coders at work, due to how likely it is they were filtered by the hiring process, and thus I never met anyone writing truly cringe-worthy code in a professional setting.

That was until I decided to go to university for a bit[1]. This is where, for the first time, I met people writing bad code professionally: professors[2]. "Bad" as in ignoring best practices; the code usually worked. I've also seen research projects that managed to turn less than 1k LOC of Python into a barely-maintainable mess[3].

I'll put my faith in an autodidact who had to prove themselves with skills and accomplishments alone over someone who got through the door with a university degree.

An autodidact who doesn't care about their craft is not going to make the cut, or shouldn't. If your hiring process doesn't filter those people, why are you wasting your time at a company that probably doesn't know your value?

[1] Free in my country, so not a big deal to attend some lectures besides work. Well, actually I'm paying for it with my taxes, so I might as well use it.

[2] To be fair, the professors teaching in actual CS subjects were alright. Most fields include a few lectures on basic coding though, which were usually beyond disappointing. The non-CS subject that had the most competent coders was mathematics. Worst was economics. Yes, I meandered through a few subjects.

[3] If you do well on some test you'd usually get job offers from professors, asking you to join their research projects. I showed up to interviews out of interest in the subject matter and professors are usually happy to tell you all about it, but wages for students are fixed at the legal minimum wage, so it couldn't ever be a serious consideration for someone already working on the free market.


Would an unwisely-configured site template or generator explain the scale here?

Or, a malicious site template or generator purposefully sprinkling potential backdoors for later?


But wouldn't some sort of SCA/SAST/DAST catch that?

Like if I'm importing a site template, ideally I'd be verifying either its source or its source code as well.

(Not being facetious btw - genuinely curious)


I was hoping ongoing coverage would answer that; it sounds like a perfect example. I heard that the tampered code redirects traffic to a sports betting site.


> I guess my question is why your employer or any other org would not follow the model above?

When you look at Artifactory pricing you ask yourself 'why should I pay them a metric truckload of money again?'

And then dockerhub goes down. Or npm. Or pypi. Or github... or, worst case, this thread happens.


There are cheaper or free alternatives to Artifactory. Yes they may not have all of the features but we are talking about a company that is fine with using a random CDN instead.

Or, in the case of javascript, you could just vendor your dependencies or do a nice "git add node_modules".


I just gave Artifactory as an example. What about GHE, self-hosted GitLab, or your own in-house Git?

Edit: was thinking - would be a pain in the butt to manage. That tracks, but every org ik has some corporate versioning system that also has an upsell for source scanning.

(Not being facetious btw - genuinely curious)


I've been a part of a team which had to manage a set of geodistributed Artifactory clusters and it was a pain in the butt to manage, too - but these were self-hosted. At a certain scale you have to pick the least worst solution though, Artifactory seems to be that.


> have it checked by a vuln scanner

This is kinda sad. For introducing new dependencies, a vuln scanner makes sense (don't download viruses just because they came from a source checkout!), but we could have kept CDNs if we'd used signatures.

EDIT: Never mind, been out of the game for a bit! I see there is SRI now...

https://developer.mozilla.org/en-US/docs/Web/Security/Subres...


This supply chain attack had nothing to do with npm afaict.

The dependency in question seems to be (or claims to be) a lazy loader that determines browser support for various capabilities and selectively pulls in just the necessary polyfills; in theory this should make the frontend assets leaner.

But the CDN used for the polyfills was injecting malicious code.


Sounds like a bad idea to me.

I would expect latency (network round trip time) to make this entire exercise worthless. Most polyfills are 1kb or less. Splitting polyfill code amongst a bunch of small subresources that are loaded from a 3rd party domain sounds like it would be a net loss to performance. Especially since your page won’t be interactive until those resources have downloaded.

Your page will almost certainly load faster if you just put those polyfills in your main js bundle. It’ll be simpler and more reliable too.


In practice, back when this wasn't a Chinese adware service, it proved to be faster to use the CDN.

You were not loading a "bunch" of polyfill script files; you selected what you needed in the URL via a query parameter, and the service took that plus the user agent of the request to determine which were needed and returned a minified file of just the necessary polyfills.

As this request was to a separate domain, it did not run into the head-of-line / max-connections-per-domain issue of HTTP/1.1, which was still the more common protocol at the time this service came out.
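
To make that concrete, a typical embed looked something like this (the feature list is just an example, and obviously nobody should be loading anything from the original domain anymore):

    <!-- the service read the features list plus the request's User-Agent and
         returned only the polyfills that particular browser was missing -->
    <script src="https://polyfill.io/v3/polyfill.min.js?features=fetch,Promise,Array.prototype.includes"></script>

A modern browser got back an essentially empty file, while an old IE got the full set - which is also why a fixed SRI hash was never really an option for it.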


yes, but the NPM packaging ecosystem leads to a reliance on externally-hosted dependencies for those who don't want to bundle


> I always prefer to self-host my dependencies

JS dependencies should be pretty small compared to images or other resources. HTTP pipelining should make it fast to load them from your server with the rest.

The only advantage to using one of those cdn-hosted versions is that it might help with browser caching


> HTTP pipelining should make it fast to load them from your server with the rest

That's true, but it should be emphasized that it's only fast if you bundle your dependencies, too.

Browsers and web developers haven't been able to find a way to eliminate a ~1ms/request penalty for each JS file, even if the files are coming out of the local cache.

If you're making five requests, that's fine, but if you're making even 100 requests for 10 dependencies and their dependencies, there's a 100ms incentive to do at least a bundle that concatenates your JS.

And once you've added a bundle step, you're a few minutes away from adding a bundler that minifies, which often saves 30% or more, which is usually way more than you probably saved from just concatenating.

> The only advantage to using one of those cdn-hosted versions is that it might help with browser caching

And that is not true. Browsers have separate caches for separate sites for privacy reasons. (Before that, sites could track you from site to site by seeing how long it took to load certain files from your cache, even if you'd disabled cookies and other tracking.)


nope, browsers silo cache to prevent tracking via cached resources


There is still a caching effect of the CDN for your servers, even if there isn't for the end user: if the CDN serves the file then your server does not have to.

Large CDNs with endpoints in multiple locations internationally also give the advantage of reducing latency: if your static content comes from the PoP closest to me (likely London, <20ms away where I'm currently sat, ~13 on FTTC at home⁰, ~10 at work) that could be quite a saving if your server is otherwise hundreds of ms away (~300ms for Tokyo, 150 for LA, 80 for New York). Unless you have caching set to be very aggressive dynamic content still needs to come from your server, but even then a high-tech CDN can² reduce the latency of the TCP connection handshake and¹ TLS handshake by reusing an already open connection between the CDN and the backing server(s) to pipeline new requests.

This may not be at all important for many well-designed sites, or sites where latency otherwise matters little enough that a few hundred ms a couple of times here or there isn't really going to particularly bother the user, but could be a significant benefit to many bad setups and even a few well-designed ones.

--------

[0] York. The real one. The best one. The one with history and culture. None of that “New” York rebranded New Amsterdam nonsense!

[1] if using HTTPS and you trust the CDN to re-encrypt, or HTTP and have the CDN add HTTPS, neither of which I would recommend as it is exactly an MitM situation, but both are often done

[2] assuming the CDN also manages your DNS for the whole site, or just a subdomain for the static resources, so the end user sees the benefit of the CDNs anycast DNS arrangement.


> prefer to avoid an npm-based webpack/whatever build pipeline

What kind of build pipeline do you prefer, or are you saying that you don't want any build pipeline at all?


I don't want a build pipeline. I want to write some HTML with a script type=module tag in it with some JavaScript, and I want that JavaScript to load the ES modules it depends on using import statements (or dynamic import function calls for lazy loading).


Do you not use CSS preprocessors or remote map files or anything like that... or do you just deal with all of that stuff manually instead of automating it?


That's still allowed! :)


I suspect this is more relevant for people who aren't normally JavaScript developers. (Let's say you use Go or Python normally.) It's a way of getting the benefits of multi-language development while still being mostly in your favorite language's ecosystem.

On the Node.js side, it's not uncommon to have npm modules that are really written in another language. For example, the esbuild npm package downloads executables written in Go. (And then there's WebAssembly.)

In this way, popular single-language ecosystems evolve towards becoming more like multi-language ecosystems. Another example was Python getting 'wheels' straightened out.

So the equivalent for bringing JavaScript into the Python ecosystem might be having Python modules that adapt particular npm packages. Such a module would automatically generate JavaScript based on a particular npm package, handling the toolchain issue for you.

A place to start might be a Python API for the npm command itself, which takes care of downloading the appropriate executable and running it. (Or maybe the equivalent for Bun or Deno?)

This is adding still more dependencies to your supply chain, although unlike a CDN, at least it's not a live dependency.

Sooner or later, we'll all depend on left-pad. :-)


I always prefer to self-host my dependencies,

Wouldn't this just be called hosting?


As you might know, Lit offers a single bundled file for the core library.


Yes! Love that about Lit. The problem is when I want to add other things that have their own dependency graph.


This is why I don't think it's very workable to avoid npm. It's the package manager of the ecosystem, and performs the job of downloading dependencies well.

I personally never want to go back to the pre-package-manager days for any language.


One argument is that Javascript-in-the-browser has advanced a lot and there's less need for a build system. (ex. ESM module in the browser)

I have some side projects that are mainly HTMX-based with some usage of libraries like D3.js and a small amount of hand-written Javascript. I don't feel that bad about using unpkg because I include signatures for my dependencies.


Before ESM I wasn't nearly as sold on skipping the build step, but now it feels like there's a much nicer browser native way of handling dependencies, if only I can get the files in the right shape!

The Rails community are leaning into this heavily now: https://github.com/rails/importmap-rails


npm is a package manager though, not a build system. If you use a library that has a dependency on another library, npm downloads the right version for you.


Yep. And so does unpkg. If you’re using JavaScript code through unpkg, you’re still using npm and your code is still bundled. You’re just getting someone else to do it, at a cost of introducing a 3rd party dependency.

I guess if your problem with npm and bundlers is you don’t want to run those programs, fine? I just don’t really understand what you gain from avoiding running bundlers on your local computer.


Oh lol yeah, I recently gave up and just made npm build part of my build for a hobby project I was really trying to keep super simple, because of this. It was too much of a hassle to link in stuff otherwise, even very minor small things

You shouldn't need to fish stuff out of node_modules though; just actually get it linked and bundled into one JS file so that it automatically grabs exactly what you need and its deps.

If this process sketches you out as it does me, one way to address that, as I do, is to have the bundle emitted with minification disabled so it's easy to review.


That was my thought too, but polyfill.io does do a bit more than a traditional library CDN: their server dispatches a different file depending on the requesting user agent, so only the polyfills needed by that browser are delivered and newer browsers don't need to download and parse a bunch of useless code. If you check the source code they deliver to a sufficiently modern browser, it doesn't contain any code at all (well, unless they decide to serve you the backdoored version...)

https://polyfill.io/v3/polyfill.min.js

OTOH doing it that way means you can't use subresource integrity, so you really have to trust whoever is running the CDN even more than usual. As mentioned in the OP, Cloudflare and Fastly both host their own mirrors of this service if you still need to care about old browsers.


The shared CDN model might have made sense back when browsers used a shared cache, but they don't even do that anymore.

Static files are cheap to serve. Unless your site is getting hundreds of millions of page views, just plop the JS file on your webserver. With HTTP/2 it will probably be almost the same speed, if not faster, than a CDN in practice.


If you have hundreds of millions of pageviews, go with a trusted party - someone you actually pay money to - like Cloudflare, Akamai, or any major hosting / cloud provider. Not to increase cache hit rate (what CDNs were originally intended for), but to reduce latency by moving resources to the edge.


Does it even reduce latency that much (unless you have already squeezed latency out of everything else that you can)?

Presumably your backend at this point is not ultra optimized. If you send a Link header and are using HTTP/2, the browser will download the JS file while your backend is doing its thing. I'm doubtful that moving JS to the edge would help that much in such a situation unless the client is on the literal other side of the world.

There of course comes a point where it does matter; I just think the crossover point is way later than people expect.
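
For reference, this is the kind of thing meant by the Link header trick (the path is illustrative): an early hint tells the browser to start fetching your self-hosted JS while the backend is still rendering the page.

    HTTP/1.1 103 Early Hints
    Link: </static/app.js>; rel=preload; as=script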


> Does it even reduce latency that much

Absolutely:

https://wondernetwork.com/pings/

Stockholm <-> Tokyo is at least 400ms here; any time you have a multi-national site, having a CDN is important. For your local city, not so much (and of course you won't even notice it locally).


I understand that ping times are different when geolocated. My point was that in fairly typical scenarios (worst cases are going to be worse) it would be hidden by backend latency, since the fetch could be made concurrent with Link headers or HTTP 103. Devil in the details, of course.


I'm so glad to find some sane voices here! I mean, sure, if you're really serving a lot of traffic to Mombasa, akamai will reduce latency. You could also try to avoid multi megabyte downloads for a simple page.


Content: 50KB

Images: 1MB

Javascript: 35MB

Fonts: 200KB

Someone who is good at the internet please help me budget this. My bounce rate is dying.


While there are lots of bad examples out there, keep in mind it's not quite that straightforward: it can make a big difference whether those resources are on the critical path that blocks first paint or not.


What's all that JavaScript for?


Cookie banner


It’s not an either or thing. Do both. Good sites are small and download locally. The CDN will work better (and be cheaper to use!) if you slim down your assets as well.


> But not to increase cache hit rate (what CDNs were originally intended for)

Was it really cache hit rate of the client or cache hit rate against the backend?


Both.


Even when it "made sense" from a page load performance perspective, plenty of us knew it was a security and privacy vulnerability just waiting to be exploited.

There was never really a compelling reason to use shared CDNs for most of the people I worked with, even among those obsessed with page load speeds.


In my experience, it was more about beating metrics in PageSpeed Insights and Pingdom, rather than actually thinking about the cost/risk ratio for end users. Often the people that were pushing for CDN usage were SEO/marketing people believing their website would rank higher for taking steps like these (rather than working with devs and having an open conversation about trade-offs, but maybe that's just my perspective from working in digital marketing agencies, rather than companies that took time to investigate all options).


I don’t think it ever even improved page load speeds, because it introduces another dns request, another tls handshake, and several network round trips just to what? Save a few kb on your js bundle size? That’s not a good deal! Just bundle small polyfills directly. At these sizes, network latency dominates download time for almost all users.


> I don’t think it ever even improved page load speeds, because it introduces another dns request, another tls handshake, and several network round trips just to what?

I think the original use case was when every site on the internet was using jQuery, and on a JS-based site this blocked display (this was also before fancy things like HTTP/2 and TLS 0-RTT). Before cache partitioning you could reuse jQuery JS requested from a totally different site, as long as it was still in cache and the JS file had the same URL, which almost all clients already had since jQuery was so popular.

So it made sense at one point but that was long ago and the world is different now.


I believe you could download from multiple domains at the same time, before HTTP/2 became more common, so even with the latency you'd still be ahead while your other resources were downloading. Then it became more difficult when you had things like plugins that depended on order of download.


You can download from multiple domains at once. But think about the order here:

1. The initial page load happens, which requires a DNS request, TLS handshake and finally HTML is downloaded. The TCP connection is kept alive for subsequent requests.

2. The HTML references javascript files - some of these are local URLs (locally hosted / bundled JS) and some are from 3rd party domains, like polyfill.

3a. Local JS is requested by having the browser send subsequent HTTP requests over the existing HTTP connection

3b. Content loaded from 3rd party domains (like this polyfill code) needs a new TCP connection handshake, a TLS handshake, and then finally the polyfills can be loaded. This requires several new round-trips to a different IP address.

4. The page is finally interactive - but only after all JS has been downloaded.

Your browser can do steps 3a and 3b in parallel. But I think it'll almost always be faster to just bundle the polyfill code in your existing JS bundle. Internet connections have very high bandwidth these days, but latency hasn't gotten better. The additional time to download (let's say) 10kb of JS is trivial. The extra time to do a DNS lookup, a TCP then TLS handshake and then send an HTTP request and get the response can be significant.

And you won't even notice when developing locally, because so much of this stuff will be cached on your local machine while you're working. You have to look at the performance profile to understand where the page load time is spent. Most web devs seem much more interested in chasing some new, shiny tech than learning how performance profiling works and how to make good websites with "old" (well loved, battle tested) techniques.


Aren't we also moving toward giving cross-origin scripts very little access to information about the page? I read some stuff a couple of years ago that gave me a very strong impression that running 3rd-party scripts was quickly becoming an evolutionary dead end.


Definitely for browser extensions. It's become more difficult with needing to set up CORS, but like with most things that are difficult, you end up with developers that "open the floodgates" and allow as much as possible to get the job done without understanding the implications.


CORS is not required to run third-party scripts. CORS is about reading data from third parties, not executing scripts from third parties.

(Unless you set a Cross-Origin Resource Policy header, but that is fairly obscure)


The same concept should be applied to container based build pipelines too. Instead of pulling dependencies from a CDN or a pull through cache, build them into a container and use that until you're ready to upgrade dependencies.

It's harder, but creates a clear boundary for updating dependencies. It also makes builds faster and makes old builds more reproducible since building an old version of your code becomes as simple as using the builder image from that point in time.

Here's a nice example [1] using Java.

1. https://phauer.com/2019/no-fat-jar-in-docker-image/


> The same concept should be applied to container based build pipelines too. Instead of pulling dependencies from a CDN or a pull through cache, build them into a container and use that until you're ready to upgrade dependencies.

Everything around your container wants to automatically update itself as well, and some of the changelogs are half emoji.


I get the impression this is a goal of Nix, but I haven't quite digested how their stuff works yet.


> self-host your dependencies

I can kind of understand why people went away from this, but this is how we did it for years/decades and it just worked. Yes, doing this does require more work for you, but that's just part of the job.


For performance reasons alone, you definitely want to host as much as possible on the same domain.

In my experience from inside companies, we went from self-hosting with largely ssh access to complex deployment automation and CI/CD that made it hard to include any new resource in the build process. I get the temptation: resources linked from external domains / cdns gave the frontend teams quick access to the libraries, fonts, tools, etc. they needed.

Thankfully things have changed for the better and it's much easier to include these things directly inside your project.


There was a brief period when the frontend dev world believed the most performant way to have everyone load, say, jquery, would be for every site to load it from the same CDN URL. From a trustworthy provider like Google, of course.

It turned out the browser domain sandboxing wasn’t as good as we thought, so this opened up side channel attacks, which led to browsers getting rid of cross-domain cache sharing; and of course it turns out that there’s really no such thing as a ‘trustworthy provider’ so the web dev community memory-holed that little side adventure and pivoted to npm.

Which is going GREAT by the way.

The advice is still out there, of course. W3schools says:

> One big advantage of using the hosted jQuery from Google:

> Many users already have downloaded jQuery from Google when visiting another site. As a result, it will be loaded from cache when they visit your site

https://www.w3schools.com/jquery/jquery_get_started.asp

Which hasn’t been true for years, but hey.


The only thing I’d trust w3schools to teach me is SEO. How do they stay on top of Google search results with such bad, out of date articles?


Be good at a time when Google manually ranks domains, then pivot to crap when Google stops updating the ranking. Same as the site formerly known as Wikia.


> For performance reasons alone, you definitely want to host as much as possible on the same domain.

It used to be the opposite. Browsers limit the amount of concurrent requests to a domain. A way to circumvent that was to load your resources from a.example.com, b.example.com, c.example.com etc. Paying some time for extra dns resolves I guess, but could then load many more resources at the same time.

Not as relevant anymore, with http2 that allows sharing connections, and more common to bundle files.


Years ago I had terrible DNS service from my ISP, enough to make my DSL sometimes underperform dialup. About 1 in 20 DNS lookups would hang for many seconds so it was inevitable that any web site that pulled content from multiple domains would hang up for a long time when loading. Minimizing DNS lookups was necessary to get decent performance for me back then.


Using external tools can make it quite a lot harder to do differential analysis to triage the source of a bug.

The psychology of debugging is more important than most allow. Known unknowns introduce the possibility that an Other is responsible for our current predicament instead of one of the three people who touched the code since the problem happened (though I've also seen this when the number of people is exactly 1).

The judge and jury in your head will refuse to look at painful truths as long as there is reasonable doubt, and so being able to scapegoat a third party is a depressingly common gambit. People will attempt to put off paying the piper even if doing so means pissing off the piper in the process. That bill can come due multiple times.


Maybe people have been serving those megabytes of JS frameworks from some single-threaded python webserver (in dev/debug mode to boot) and wondered why they could only hit 30req/s or something like that.


Own your process – at best that CDN is spying on your users.


> and it just worked

Just to add... that is unlike the CDN thing, that will send developers into Stack Overflow looking how to set-up CORS.


I don't think SRI would have ever worked in this case because not only do they dynamically generate the polyfill based on URL parameters and user agent, but they were updating the polyfill implementations over time.


>self-host your dependencies behind a CDN service you control (just bunny/cloudflare/akamai/whatever is fine and cheap).

This is not always possible, and some dependencies will even disallow it (think: third-party suppliers). Anyways, then that CDN service's BGP routes are hijacked. Then what? See "BGP Routes" on https://joshua.hu/how-I-backdoored-your-supply-chain

But in general, I agree: websites pointing to random js files on the internet with questionable domain independence and security is a minefield that is already exploding in some places.


I strongly believe that Browser Dev Tools should have an extra column in the network tab that highlights JS from third party domains that don't have SRI. Likewise in the Security tab and against the JS in the Application Tab.


I've seen people reference CDNs for internal sites. I hate that because it is not only a security risk but it also means we depend on the CDN being reachable for the internal site to work.

It's especially annoying because the projects I've seen it on were using NPM anyway so they could have easily pulled the dependency in through there. Hell, even without NPM it's not hard to serve these JS libraries internally since they tend to get packed into one file (+ maybe a CSS file).


Also, the folks who spec'ed ES6 modules didn't think SRI was a required feature to ship from the start, so it's still not broadly and easily supported across browsers. I requested the `with`-style import attributes 8 years ago and it's still not available. :/
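
For what it's worth, the kind of thing being asked for looks roughly like this - purely illustrative, since no browser supports integrity metadata in import attributes today, and the URL and hash are placeholders:

    // hypothetical syntax - not currently supported by browsers
    import widget from "https://cdn.example.com/widget.js" with { integrity: "sha384-PLACEHOLDER" };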


Another downside of SRI is that it defeats streaming. The browser can't verify the checksum until the whole resource is downloaded so you don't get progressive decoding of images or streaming parsing of JS or HTML.


I can see CDNs like CF / Akamai soon becoming like an internet 1.2 - with the legitimate stuff in, and everything else considered the gray/dark/1.0 web.


I agree with this take, but it sounds like Funnull acquired the entirety of the project, so they could have published the malware through NPM as well.


> the World Economic Forum website is affected here, for example! Absolutely ridiculous

Dammit Jim, we’re economists, not dream weavers!


> World Economic Forum website is affected here, for example!

What did they say about ownership? How ironic.


Meanwhile ...

"High security against CDN, WAF, CC, DDoS protection and SSL protects website owners and their visitors from all types of online threats"

... says the involved CDN's page (FUNNULL CDN).-

(Sure. Except the ones they themselves generate. Or the CCP.)


Another alternative is not to use dependencies that you or your company are not paying for.


It seems like Cloudflare predicted this back in Feb.

https://blog.cloudflare.com/polyfill-io-now-available-on-cdn...


CF links to the same discussion on GitHub that the OP does. Seems less like they predicted it, and more like they just thought that other folks' concerns were valid and amplified the message.


Washington Post home page external content:

    app.launchdarkly.com
    cdn.brandmetrics.com
    chrt.fm
    clientstream.launchdarkly.com
    events.launchdarkly.com
    fastlane.rubiconproject.com
    fonts.gstatic.com
    g.3gl.net
    grid.bidswitch.net
    hbopenbid.pubmatic.com
    htlb.casalemedia.com
    ib.adnxs.com
    metrics.zeustechnology.com
    pixel.adsafeprotected.com
    podcast.washpostpodcasts.com
    podtrac.com
    redirect.washpostpodcasts.com
    rtb.openx.net
    scripts.webcontentassessor.com
    s.go-mpulse.net
    tlx.3lift.com
    wapo.zeustechnology.com
    www.google.com
    www.gstatic.com
Fox News home page external content:

    3p-geo.yahoo.com
    acdn.adnxs.com
    ads.pubmatic.com
    amp.akamaized.net
    api.foxweather.com
    bat.bing.com
    bidder.criteo.com
    c2shb.pubgw.yahoo.com
    cdn.segment.com
    cdn.taboola.com
    configs.knotch.com
    contributor.google.com
    dpm.demdex.net
    eb2.3lift.com
    eus.rubiconproject.com
    fastlane.rubiconproject.com
    foxnewsplayer-a.akamaihd.net
    frontdoor.knotch.it
    fundingchoicesmessages.google.com
    global.ketchcdn.com
    grid.bidswitch.net
    hbopenbid.pubmatic.com
    htlb.casalemedia.com
    ib.adnxs.com
    js.appboycdn.com
    js-sec.indexww.com
    link.h-cdn.com
    pagead2.googlesyndication.com
    perr.h-cdn.com
    pix.pub
    player.h-cdn.com
    prod.fennec.atp.fox
    prod.idgraph.dt.fox
    prod.pyxis.atp.fox
    rtb.openx.net
    secure-us.imrworldwide.com
    static.chartbeat.com
    static.criteo.net
    s.yimg.com
    sync.springserve.com
    tlx.3lift.com
    u.openx.net
    webcontentassessor.global.ssl.fastly.net
    www.foxbusiness.com
    www.googletagmanager.com
    www.knotch-cdn.com
    zagent20.h-cdn.com
So there's your target list for attacking voters.



"No sir, we have absolutely no idea why anyone would ever use an ad blocker."


Not just for mental sanity but for online safety.


I think JS (well, ES6) has a ton of positive qualities, and I think it's a great fit for many of its current applications. However, this is a pretty good example of what bothers me about the way many people use it. I see a lot of folks, in the name of pragmatism, adopt a ton of existing libraries and services so they don't have to think about more complex parts of the problem they're solving. Great! No need to reinvent the wheel. But, people also seem to falsely equate popularity with stability-- if there's a ton of people using it, SURELY someone has vetted it, and is keeping a close eye on it, no? Well, maybe no? It just seems that the bar for what people consider 'infrastructure' is simply too low. While I don't think that, alone, is SO different from other popular interpreted languages, the light weight of JS environments means you need to incorporate a LOT of that stuff to get functionality that might otherwise be part of a standard library or ubiquitous stalwart framework, which dramatically increases the exposure to events like this.

I think a lot of people conflate criticism of JS with criticism of the way it's been used for the past number of years, putting a nuanced topic into a black-and-white "for or against" sort of discussion. I've done a number of projects heavily using JS-- vanilla, with modern frameworks, server-side and in the browser-- and aside from some fundamental annoyances with its approach to a few things, it's been a great tool. There's nothing fundamental about JS itself that makes applications vulnerable to this like a buffer overflow would, but the way it is used right now seems to make it a lot easier for inexperienced, under-resourced, or even just distracted developers to open up big holes using generally accepted techniques.

But I've never been a full-time front-end-web or node dev, so maybe I'm mistaken? Compared to the server-side stuff I've worked with, there's definitely a concerning wild west vibe moving into modern JS environments. I think there was a casual "well, it's all in client-side userspace" attitude about the way it was used before which effectively shifted most of the security concerns to the browser. We should probably push a little harder to shake that. Think about how much JS your bank uses in their website? I'll bet they didn't write all of their own interaction libraries.


This is the end-stage of having third-party packaging systems without maintainers. This happens vanishingly infrequently for things like apt repositories, because the people maintaining packages are not necessarily the people writing them, and there's a bit of a barrier to entry. Who knew yanking random code from random domains and executing it was a bad idea? Oh yeah, everyone.


I have been full stack for the last decade. Vanilla JS and server-side rendering is about all I can tolerate these days. I will reach for Ajax or websockets if needed, but 90%+ of all interactions I've ever had to deal with are aptly handled with a multipart form post.

I do vendor my JS, but only for things like PDF processing, 3D graphics, and barcode scanning.

I've been through the framework gauntlet. Angular, RiotJS, React, Blazor, AspNetCore MVC, you name it. There was a time where I really needed some kind of structure to get conceptually bootstrapped. After a while, these things begin to get in the way. Why can't I have the framework exactly my way? Just give me the goddamn HttpContext and get off my lawn. I don't need a babysitter to explain to me how to interpolate my business into an html document string anymore.

I also now understand why a lot of shops insist on separation between frontend and backend. It seems to me that you have to be willing to dedicate much of your conscious existence to honing your skills if you want to be a competent full stack developer. It can't just be your 9-5 job unless you are already highly experienced and have a set of proven patterns to work with. Getting someone off the street to that level can be incredibly expensive and risky. Once you know how to do the whole thing, you could just quit and build for yourself.


Build what? Sell to whom? What, write my own subscription/billing module, on top of full-stack development which, as you said, takes a lot of time just to be competent at? On top of building it, do sales and marketing and accounting and all that other business stuff? I mean, I guess.


I wonder if anyone would be willing to pay for a service that vets and hosts (a very small subset) of popular JS libraries.


Maybe! Add in package management that isn't a complete clusterfustrum and that could be pretty attractive. Cloudron basically does something similar with FOSS server apps.


I remember Google suggesting that everyone use common libraries hosted by a shared CDN and then suggesting de-ranking slow websites and I think that’s what led to widespread adoption of this pattern.

The only reason I stopped using third-party hosted libraries was because it wasn’t worth the trouble. Using subresource integrity makes it safe but it was part of the trouble.


Sure... Though while I hate to say it, I don't blame people for trusting Google's hosted copy of something. For better or worse, they are more trustworthy than some "as seen on a million janky tutorials" whatever.io. A very privacy-focused employer precluded that possibility during peak adoption, but with what many sites load up, that's the least of your worries.


No problem with Google hosting it (although I’d still use sub-resource integrity) but anyone else?


Amusing: https://news.ycombinator.com/item?id=10143620

The original author probably should have done everyone a favor and just killed the site altogether.


I'm surprised there is no mention of subresource integrity in the article. It's a low effort, high quality mitigation for almost any JS packages hosted by a CDN.

EDIT: Oh, it's because they are selling something. I don't know anything about their offerings, but SRI is made for this and is extremely effective.


Wouldn't work in this case because the whole selling point of polyfill.io was that as new features came out and as the browser grew support for new features the polyfill that was loaded would dynamically grow or shrink.

Something like `polyfill.io.example.org/v1?features=Set,Map,Other.Stuff` would _shrink_ over time, while something like `pollyfill.io.example.org/v1?features=ES-Next` would grow and shrink as new features came and went.


SRI generally won't work here because the served polyfill JS (and therefore the SRI hash) depends on the user agent/headers sent by the user's browser. If the browser says it's ancient, the resulting polyfill will fill in a bunch of missing JS modules and be a lot of JS. If the browser identifies as modern, it should return nothing at all.

Edit: In summary, SRI won't work with a dynamic polyfill which is part of the point of polyfill.io. You could serve a static polyfill but that defeats some of the advantages of this service. With that said, this whole thread is about what can happen with untrusted third parties so...


Oooft. I didn't realize it's one that dynamically changes its content.


So maybe it’s less that the article is selling something and more that you just don’t understand the attack surface?


It absolutely would work if the browser validates the SRI hash. The whole point is to know in advance what you expect to receive from the remote site and verify the actual bytes against the known hash.

It wouldn’t work for some ancient browser that doesn’t do SRI checks. But it’s no worse for that user than without it.


The CDN in this case is performing an additional function which is incompatible with SRI: it is dynamically rendering a custom JS script based on the requesting User Agent, so the website authors aren't able to compute and store a hash ahead of time.


I edited to make my comment more clear but polyfill.io sends dynamic polyfills based on what features the identified browser needs. Since it changes, the SRI hash would need to change so that part won't work.


Ah! I didn’t realize that. My new hot take is that sounds like a terrible idea and is effectively giving full control of the user’s browser to the polyfill site.


And this hot take happens to be completely correct (and is why many people didn't use it, in spite of others yelling that they were needlessly re-inventing the wheel).


Yeah... I've generated composite polyfills with the pieces I would need on the oldest browser I had to support; unfortunately, all downstream browsers would get it.

Fortunately around 2019 or so, I no longer had to support any legacy (IE) browsers and pretty much everything supported at least ES2016. Was a lovely day and cut a lot of my dependencies.


They are saying that because the content of the script file is dynamic based on useragent and what that useragent currently supports in-browser, the integrity hash would need to also be dynamic which isn't possible to know ahead of time.


Their point is that the result changes depending on the request. It isn't a concern about the SRI hash not getting checked, it is that you can't realistically know what to expect in advance.


In all cases where you can use SRI, there's a better mitigation: Just host a copy of the file yourself.


I would still (and do) do both: in the case that your site (for whatever reason) is still accessible over plain HTTP, a man-in-the-middle attack could still happen and replace your script with another.

For self-hosted dynamic scripts, I just add a task in my build process to calc the SHA and add it to the script tag's integrity="sha..." attribute.

Otherwise just calc it and hardcode it once for 3rd party, legacy scripts...
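
For anyone doing the same, generating the value is a one-liner (the file path is just an example):

    # emits the base64 digest; prefix it with "sha384-" in the integrity attribute
    openssl dgst -sha384 -binary public/vendor/legacy-widget.js | openssl base64 -A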


I had this conversation countless times with developers: are you really ok if someone hijacks the CDN for the code you're including? They almost always seem to be fine with it, simply because everyone else is doing it like this. At the same time they put up with countless 2FAs in the most mundane places.

The follow up of "you know that the random packages you're including could have malware" is even more hopeless.


In general, SRI (Subresource Integrity) should protect against this. It sounds like it wasn't possible in the Polyfill case as the returned JS was dynamic based on the browser requesting it.


now you are a bummer and a roadblock so your career suffers


yes you just put integrity="sha384-whatever" and you're good to go


Can't do that with this one because it generates the polyfill based on the user agent.


Why not? The `integrity` attribute accepts more than one value[0].

This would technically be feasible, if my understanding of the service is correct. Hashes could be recorded for each combination of features -- you could then give that list of hashes to the user to insert into the attribute.

Of course, the main difficulty here would be the management of individual hashes. Hmm, definitely interesting stuff.
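
In other words, something roughly like this, with hashes as placeholders - the browser accepts the response if it matches any one of the listed hashes:

    <script src="https://polyfill.example/v3/polyfill.min.js?features=es6"
            integrity="sha384-HASH_FOR_VARIANT_A sha384-HASH_FOR_VARIANT_B sha384-HASH_FOR_VARIANT_C"
            crossorigin="anonymous"></script>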

[0]: https://developer.mozilla.org/en-US/docs/Web/Security/Subres...


We are talking potentially hundreds of hashes because of how the polyfills service worked.


That depends on how many polyfills are served via each script.


yeah that's nuts, I would never use a random site for that, but in general people's opinion on CDN use is dated. Tons of people still think that cached resources are shared between domains for example.


Sure, but why risk a developer making a typo, saying integrty="sha384-whatever", and that attribute simply being ignored in the html?


“A developer could typo something” is kind of weak because you could use this argument for basically anything.


Why? If you host everything on the same domain there's no possibility of a typo. And, developers could maliciously make a typo that can get past code review, don't you take that into account?

In a lot of situations the system can be designed that a mistake has to be obvious at review for it to even pass the build step. Why not strive for that level of robustness?


I don’t understand why JS devs treat dependencies the way they do. If you are writing scripts that run on other people’s computers I feel like the dependencies used should be vetted for security constantly, and should be used minimally.

Meanwhile most libraries seem to have 80 trillion dependencies written by random github accounts called “xihonghua” or something with no other projects on their account.


There’s a wide, wide range of devs that fit in the label of “JS dev”. At the more junior or casual end, npm and cdns are shoved in your face as the way to go. It shouldn’t be surprising that it’s the natural state of things.

I’ve worked with many JS devs who also have broader experience and are more than aware of issues like these, so it just depends I guess.

The bigger issue may just be the lack of a culture of vendoring code locally, rather than always relying on the 3rd-party infrastructure (npm or CDN).

It’s somewhat similar but any Rust project I’m building, I wind up vendoring the crates I pull in locally and reviewing them. I thought it would be more annoying but it’s really not that bad in the grand scheme of things - and there should be some automated things you could set up to catch obvious issues, though I defer to someone with more knowledge to chime in here.

This may be an extra level of headache with the JS ecosystem due to the sheer layers involved.


Combine that with electron and auto update everything

I've seen devs include scripts from templates when the app was for a bank's internal users and intranet only. They are clueless.


Just want to point out that there's nothing wrong with having a Chinese username by itself.


Because that is how the tutorials on "web development" teach it. Just use this thing and don't ask what it is.


And meanwhile, years after it was well known that the JS dependency model was an utter security disaster, the Rust ecosystem went on to copy it.


I checked, and here are the top domains that are still using Polyfill.io as of today: https://pastila.nl/?00008b47/8a0d821be418cdd5003a2d620d76589...

However, theguardian.com is using it from its own domain, which is safe. But most of the other 5000 websites aren't.


The number of websites is decreasing, which is good:

    clickhouse-cloud :) SELECT date, count() FROM minicrawl_processed WHERE arrayExists(x -> x LIKE '%polyfill.io%', external_scripts_domains) AND date >= now() - INTERVAL 5 DAY GROUP BY date ORDER BY date

       ┌───────date─┬─count()─┐
    1. │ 2024-06-22 │    6401 │
    2. │ 2024-06-23 │    6398 │
    3. │ 2024-06-24 │    6381 │
    4. │ 2024-06-25 │    6325 │
    5. │ 2024-06-26 │    5426 │
       └────────────┴─────────┘

    5 rows in set. Elapsed: 0.204 sec. Processed 15.70 million rows, 584.74 MB (76.87 million rows/s., 2.86 GB/s.)
    Peak memory usage: 70.38 MiB.
PS. If you want to know about this dataset, check https://github.com/ClickHouse/ClickHouse/issues/18842


Many .gov websites seen on this list.


Always host your dependencies yourself, it's easy to do & even in the absence of a supply chain attack it helps to protect your users' privacy.
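
A minimal sketch of what "easy" can look like (the package name and paths are only examples):

    # at build time, copy the one file you need out of node_modules into your own static assets
    cp node_modules/htmx.org/dist/htmx.min.js public/vendor/htmx.min.js

Then reference /vendor/htmx.min.js from your own pages: same origin, no third party in the serving path.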


But if the dependency from a CDN is already cached, it will skip an extra resource and site will load faster.

I agree with the points though.


That’s not been true since Site Isolation IIRC

e: Not sure it’s Site Isolation specifically, but it’s definitely still not true anymore: https://news.ycombinator.com/item?id=24745748

e2: listen to the commenter below, its Cache Partitioning: https://developer.chrome.com/blog/http-cache-partitioning


Right - and site isolation is about five years old at this point. The idea that CDNs can share caches across different sites is quite out of date.


Didn't know that. Then why are we (I mean, many web devs) still using it?

Just an old convention, simplicity, or saving bandwidth?


If that's true, this is a wild microcosm example of how the web breaks in ways we don't expect.


> this is a wild microcosm example of how the web breaks in ways we don't expect.

I think the performance characteristics of the web are subject to change over time, especially to allow increased security and privacy.

https is another example of increased security+privacy at the cost of being slightly slower than non-https connections because of an extra round trip or two to create the connection.

The lesson I take from it is: don't use complicated optimization techniques that might get out of date over time. Keep it simple instead of chasing every last bit of theoretical performance.

For example, there used to be a good practice of using "Domain Sharding" to allow browsers to download more files in parallel, but was made obsolete with HTTP/2, and domain sharding now has a net negative effect, especially with https.

https://developer.mozilla.org/en-US/docs/Glossary/Domain_sha...

Now they're realizing that HTTP/2's multiplexing of a single TCP connection can have negative effects on wireless connections, so they're working on HTTP/3 to solve that.

Also don't use polyfills. If your supported browsers don't support the feature then don't use the feature, or implement the fallback yourself. Use the features that are actually available to you.
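
"Implement the fallback yourself" is often smaller than people expect. A sketch (simplified - it ignores NaN and fromIndex handling, so only use something like it if that's acceptable for the browsers you support):

    // tiny hand-written fallback instead of pulling in a remote polyfill bundle
    if (!Array.prototype.includes) {
      Array.prototype.includes = function (value) {
        return this.indexOf(value) !== -1;
      };
    }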


In addition to Cache Partitioning, it was never really likely that a user had visited another site that used the same specific versions from the same cdn as your site uses.

Making sure all of your pages were synchronized with the same versions and bundling into appropriate bits for sharing makes sense, and then you may as well serve it from your own domain. I think serving from your www server is fine now, but back in the day there were benefits to having a different hostname for static resources and maybe it still applies (I'm not as deep into web stuff anymore, thank goodness).


Because of modern cache partitioning, HTTP/2+ multiplexing, and sites themselves being served off CDNs, external CDNs are now also worse for performance.

If you use them, though, use subresource integrity.


> and sites themselves being served off CDNs

Funnily enough I can't set up CDN on Azure at work because it's not approved but I could link whatever random ass CDN I want for external dependencies if I was so inclined.


This theory gets parroted endlessly, but nearly never actually happens in reality.


Gotta love the code excerpt verifying the device type that checks for "Mac68K" and "MacPPC" strings. Your retro Macs are safe!


Software supply chains feel like one of the Internet's last remaining high-trust spaces, and I don't think that's going to last long. A tidal wave of this is coming. I'm kind of surprised it's taken this long given how unbelievably soft this underbelly is.


going back to rolling it yourself, or relying on a few high quality stdlib providers that you likely have to pay for.


Paying for dependencies sounds like a good idea. Reduces the incentive to sell out; allows contributors to quit day jobs to focus on fixing bugs and security holes; less likely to result in abandonware.


Don't forget that the author of core-js, a library not a service like polyfill but with almost the same goal, is tired of open source as well https://github.com/zloirock/core-js/blob/master/docs/2023-02...


I added a ublock rule on mobile to exclude this domain

||polyfill.io^

Any other practical steps that mobile users can take?


Thanks for this, I added it to everywhere I use uBlock Origin.

In case anyone was wondering, here's more info: https://github.com/gorhill/uBlock/wiki/Strict-blocking

This only works at blocking full domains starting from uBO v0.9.3.0. Latest version is v1.58.1, so it's safe to assume most people are up to date. But noting it just in case.

https://github.com/gorhill/uBlock/releases


It's already been blocked by uBlock filters - badware risks: https://github.com/uBlockOrigin/uAssets/blob/master/filters/...


edit:DNS servers for Cloudflare, Quad9, Google and Level3 all cname cdn.polyfill.io to Cloudflare domains.

I assume those are the alt endpoints that Cloudflare setup in Feb. Lots of folks seem to be protected now.

Cloudflare notice: https://blog.cloudflare.com/polyfill-io-now-available-on-cdn...

Feb discussion: https://github.com/formatjs/formatjs/issues/4363

(edit:withdrawn) For my use cases, I made the local DNS servers authoritative for polyfill.io. Every subdomain gets a Server Failed error.

Might work for pihole too.


Might as well block googie-anaiytics.com while you're at it


I just set polyfill.io to Untrusted in NoScript.


Just added to my pihole.


But Microsoft Azure for GitHub ScanningPoint 2024 is SOC2 compliant. How could this happen?


Probably those auditors following a playbook from 2008. Using... VPN, AD, umm, pass.


Clientside mitigation: install noscript.

https://addons.mozilla.org/en-US/firefox/addon/noscript/

You can’t expect to remain secure on the modern web while running arbitrary javascript from anyone and everyone.


If you're running uBlock Origin, you can also add it to your filters.

discussed here: https://news.ycombinator.com/item?id=40792322

I wonder if it would just be better to edit your /etc/hosts file and add something like this to it:

    127.0.0.1 polyfill.io 
    127.0.0.1 www.polyfill.io
    127.0.0.1 cdn.polyfill.io
     
I use both FF and Chrome, and I use multiple profiles on Chrome, so I have to go in and add the filter for each profile and browser. At least for my personal laptop where I can do this. Not sure about my work one.

edit: looks like uBlock is already blocking it: https://news.ycombinator.com/item?id=40796938


Nice: to be secure on the web, you just need to install an add-on which needs to:

- Access browser tabs
- Store unlimited amount of client-side data
- Access browser activity during navigation
- Access your data for all websites


Yes that is unfortunate. Safari had the option to toggle JavaScript via Shortcut until recently, but it was removed. The only browser I know which can easily toggle JavaScript now is Brave.

But uBlock Origin has that functionality, too, and I guess most people who would care about JavaScript have that already enabled anyways.

The web is so much nicer without JavaScript, while still being able to easily activate it (via cmd-J) without reloading once it seems necessary.


>Yes that is unfortunate. Safari had the option to toggle JavaScript via Shortcut until recently, but it was removed. The only browser I know which can easily toggle JavaScript now is Brave.

There's a Firefox Addon[0] for that.

[0] https://addons.mozilla.org/en-US/firefox/addon/javascript-to...


Unless you design your own silicon, build your own pc and peripherals, and write all your own software, there's always going to be a level of trust involved. But at least NoScript is FOSS so you can in theory examine the source code yourself.

https://github.com/hackademix



I think there’s a toggle to just disable JavaScript entirely somewhere in the menus, but it is sort of inconvenient, because you can’t selectively enable sites that are too poorly coded to run without JavaScript.

Mozilla has marked NoScript as a recommended extension, which is supposed to mean they reviewed the code. Did they do it perfectly? I don’t know. But the same logic could be applied to the patches they receive for their browser itself, right? It’s all just code that we trust them to audit correctly.


You don't need an addon for https://safebrowsing.google.com


>Google’s Ads Security team uses Safe Browsing to make sure that Google ads do not promote dangerous pages.

This is already wrong in my experience. I had a coworker panicking two weeks ago because he googled YouTube and clicked the first link, which turned out to be an ad leading to a fake ransomware page designed to get you to call a scam call center.

There is no such thing as a safe ad anymore because no one is policing them appropriately. Especially if something like this can happen when searching a service google themselves owns.


> "no such thing as a safe ad anymore because no one is policing them appropriately"

That's one of the most Kafkaesque sentences i have read in a while.


You can't expect any modern page to work without JavaScript either. And auditing every page's JavaScript yourself isn't exactly feasible.


I really wish we didn't have to trust the websites we go to. This exploit is a good example of the problem. I go to a website and something in the content of the website forces you to go to an entirely different place. I didn't click on a link, it can just do this based on any criteria it wants to apply.


I can’t believe the Financial Times didn’t secure the domain for the project. They backed it for a long time then dropped support of it.

I wonder if the polyfills themselves are compromised, because you can build your polyfill bundles via npm packages that are published by JakeChampion.


I am always shocked at the amount of sites that still use remotely hosted dependencies.

Sure, in a true supply chain attack, you wouldn’t be able to trust npm or github or whatever, but at least you wouldn’t be compromised immediately.

And even outside of security concerns, why would you ever allow someone else to deploy code to your site without testing it first?


I’ll go ahead and make an assumption that the Chinese government was involved. Countries badly need to figure out a way to punish bad actors in cybersecurity realms. It seems that this type of attack, along with many others, are quickly ramping up. If there isn’t competent policy in this area, it could become very dangerous.


Why would the Chinese government use this to load a gambling website? I'm sure there are many better uses that would be more subtle that they could come up with this opportunity.


The problem is that they have a CDN that can serve up custom JS depending on the headers and IP of the visitor, which enables remarkably precise targeting. No reason why they couldn’t, for example, send a targeted payload to people in a specific geographic area (by IP) and/or a specific language (by Accept-Language header). The sports betting stuff could be diversion in that case.

Of course, I don’t personally believe this to be the case; Occam’s Razor says this is a straightforward case of someone deciding they want to start monetizing their acquisition.


> No reason why they couldn’t, for example, send a targeted payload to people in a specific geographic area (by IP) and/or a specific language (by Accept-Language header). The sports betting stuff could be diversion in that case.

What I don't understand is why blow it sending people to a gambling site? They could have kept it going and sent payloads to specific targets making use of zero day browser bugs. Now they can still do that but to far fewer sites.


MathJax [1] still recommends this way of using it:

  <script src="https://polyfill.io/v3/polyfill.min.js?features=es6"></script>
  <script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
Therefore, if you ever used MathJax, possibly by copying the above and forgetting about it, make sure to patch it out.

[1] https://www.mathjax.org/#gettingstarted

EDIT: To clarify, patch out just the polyfill (the first line in the snippet). You can of course keep using MathJax, and the second line alone should be enough. (Though still better host a copy by yourself, just in case).


I just filed a PR to remove that: https://github.com/mathjax/MathJax-website/pull/102


One of these days we're going to learn our lesson and just write our own damned code.


Some of us already have learned that lesson. Imagine how it feels to have to watch the same people touch the same hot stove over and over again. And people wonder why programmers quit and go into farming...


We won't, as can be seen by everyone recommending having an LLM write your code.


One has to admit the game of cat and mouse that the web browser brought about has been quite valuable for advancing security as a field. Unfortunately, this sort of thing suggests we have to get burned to learn the pan is hot; mother warning us is not enough.

Pay your security teams more, people.


Security is everyone’s job. You can’t outsource responsibility. The security team should be compensated fairly but they should also be trusted within the organization. If you want to build secure software then realign incentives. There’s more to that than pay.


> Security is everyone’s job.

Agree. There is something missing from the internet, and that is "Programmer Citizenship". As soon as someone pushes code to a repo, he has to prove his citizenship first, the good old fashioned way by handing his identity to the owner of the repo. His digital identity of course.

As long as the identity is real, and is associated with a clean reputation, then code can be accepted with very little risk. When the reputation might not be so great, then new code has to be double checked before any merge into main.


> One has to admit the game of cat and mouse that the web browser brought about

Supply chain attacks can affect any toolchain, command line tools, games, everything under the sun.


How hard is it to download your includes and host them on your own domain?

I know sometimes it's not possible, but whenever I can I always do this. If only because it means that if the remote version changes to a new version it doesn't break my code.


Or just ship local copies of your dependencies. It's not that hard.


... and all of _their_ dependencies. And read through them all to make sure that the local copy that you shipped didn't actually include a deliberately obfuscated exfiltration routine.


Bet we won’t.


Or just use subresource integrity


Subresource integrity wouldn't work with Polyfill.io scripts, since they dynamically changed based on user agent.


yeah let me just write an entire operating system to create a to-do list app


Fortunately, people have already designed an operating system for you. You can either use it, or embed six more half-assed versions into a web browser and then ship that instead. Up to you.


Then you get to write 10x the vulnerabilities yourself and not have nearly the same chance of any of them getting disclosed to you!


That argument doesn't seem to be aging well.

I'd say that good [emphasis on "good"] coders can write very secure code. There's fundamental stuff, like encryption algos, that should be sourced from common (well-known and trusted) sources, but when we load in 100K of JS, so we can animate a disclosure triangle, I think it might not be a bad time to consider learning to do that, ourselves.


It's impossible to tell how that argument is aging since we don't have the counterfactual.


So if you can't trust your developers to manage dependencies, and you can't trust them to write dependencies... why have you hired them at all? Seriously, what do they actually do? Hire people who can program, it isn't hard.


In times of widespread attacks, it would be better to list out actions that affected parties can take. Here is what I found:

- remove it fully (as per the original author); it is no longer needed
- use alternate CDNs (from Fastly or Cloudflare)

Also as a good practice, use SRI (though it wouldn’t have helped in this attack)

I posted a note here: https://cpn.jjude.com/@jjude/statuses/01J195H28FZWJTN7EKT9JW...

Please add any actions that devs and non-devs can take to mitigate this attack.


The phrase "supply chain attack" makes it sound like it's some big, hard to avoid problem. But almost always, it's just developer negligence:

1. Developer allows some organization to inject arbitrary code in the developer's system

2. Organization injects malicious code

3. Developer acts all surprised and calls it an "attack"

Maybe don't trust 3rd parties so much? There's technical means to avoid it.

Calling this situation a supply chain attack is like saying you were victim of a "ethanol consumption attack" when you get drunk from drinking too many beers.


It's called a supply chain attack to displace the blame on the profitable organization that negligently uses this code onto the unpaid developers who lost control of it.

As if expecting lone OSS developers that you don't donate any money towards somehow being able to stand up against the attacks of nation states is a rational position to take.


In this case, the developer sold the user account & repository for money (no ownership change to monitor), so if you were not privy to that transaction, you really couldn't "easily" avoid this without e.g. forking every repo you depend on and bringing it in-house, or implementing some other likely painful defense mechanism.


That’s why businesses pay Redhat, Qt, Unity,… Clear contracts that reduces the risk of compromised dependencies. Or you vet your dependencies (it helps when you don’t have a lot)


What good does this comment do beside allow you to gloat and put others down? Like, Christ. Are you telling me that you’d ever speak this way to someone in person?

I have no doubt that every single person in this thread understands what a supply chain attack is.

You are arguing over semantics in an incredibly naive way. Trust relationships exist both in business and in society generally. It’s worth calling out attacks against trust relationships as what they are: attacks.


All other things being equal, a computer system that doesn't depend on trusting some external entity is better than one that does.

Sometimes, trusting is inevitable (e.g. SSL certificate authorities), but in this case, it was very much a choice on part of the developers.


I love the ethanol consumption attack thing :-)


This has been expected since 4 months ago. I wish the past posts (https://news.ycombinator.com/item?id=39517757, https://news.ycombinator.com/item?id=39523523) got more traction and people implemented mitigations on a larger scale before exploitation started.


So what did the malware actually do?


The first time a user on a phone opens a website through an ad (Google Ads or Facebook) with this link, it will redirect the user to a malicious website.

The request sent to https://cdn.polyfill.io/v2/polyfill.min.js needs to match the following format:

        Request for the first time from a unique IP, with a unique User-Agent.
        User-Agent match that of a phone, we used an iphone's user agent ( Mozilla/5.0 (iPhone14,2; U; CPU iPhone OS 14_0 like Mac OS X) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Mobile/15E148 Safari/602.1 ).
        Referer from a reputable website that installed polyfill.
        Accept /
        Accept-Encoding gzip, deflate, br, zstd
        Delete all cookies
The request will return the original polyfill code, appended with a piece of malicious code. This code will run JavaScript from https://www.googie-anaiytics.com/ga.js if the device is not a laptop. You can reproduce this multiple times on the same machine by changing the User-Agent slightly (e.g. change Mozilla/5.0 to Mozilla/6.0). Sometimes the server will just time out or return code without the injection, but it should work most of the time.

The JavaScript on https://www.googie-anaiytics.com/ga.js will redirect users to a malicious website after checking a number of conditions before running (user agent, screen width, ...) to ensure it is a phone; the entry point is at the end:

bdtjfg||cnzfg||wolafg||mattoo||aanaly||ggmana||aplausix||statcct?setTimeout(check_tiaozhuan,-0x4*0x922+0x1ebd+0xd9b):check_tiaozhuan();

The code has some protection built in, so if it is run in a non-suitable environment, it will attempt to allocate a lot of memory to freeze the current device. It also routes all attribute-name accesses through _0x42bcd7.

https://github.com/polyfillpolyfill/polyfill-service/issues/...


If you don't use version pinning, you are rolling the dice every single build.


So who was the previous owner that sold us all out?


This was obviously written by a JS beginner (ex: basic mistake on the "missing" function, the author thinks that `this` refers to the function?, same about the `undefined` implicit global assignment..)


Since this is affecting end users of websites directly, instead of waiting for developers to fix it, this could help you stay protected until then:

https://medium.com/@wrongsahil/protecting-yourself-from-poly...


> "If you own a website, loading a script implies an incredible relationship of trust with that third party," he Xeeted at the time.

Are people actually calling Tweets "Xeets" now?


I propose "X-crete" as the new verb and "X-cretion" for the final product.


I was thinking about "Xitting out", but yours is better.


Xeet in the hip hop industry means something completely different.


How would you even pronounce it?


Maybe like "Zeet" or probably just don't bother because it's stupid


When you are selling an open source project, what outcome do you expect? People who are interested in the project non-financially will demonstrate their value in other ways (PRs, reviews, docs, etc) leading to the more common succession of maintainers without exchanging money. I don't think it's reasonable for software authors, who take the route of giving their projects to buyers rather than top contributors, to act surprised when the consequences roll in.


Is this safe once more because it moved back to Cloudflare?

    ;; QUESTION SECTION:
    ;cdn.polyfill.io.  IN A
    
    ;; ANSWER SECTION:
    cdn.polyfill.io. 553 IN CNAME cdn.polyfill.io.cdn.cloudflare.net.
    cdn.polyfill.io.cdn.cloudflare.net. 253 IN A 172.67.209.56
    cdn.polyfill.io.cdn.cloudflare.net. 253 IN A 104.21.23.55
Or is Cloudflare warning about this and hosting the attack site?


It is still not safe. The new owner is using Cloudflare as a CDN to appear more legitimate, but the responses are still fully controlled by the malicious backend. This has been the case since the end of February.


Is anyone taking the steps to inform affected site owners?


Isn't there some hash in the script tag for this kind of stuff? Maybe that should be mandatory or something? This broke half the internet anyway.


> However, in February this year, a Chinese company bought the domain and the Github account.

Github accounts of open source software are now for sale?


I don't understand that either. The original repo was owned by the Financial-Times account [0]. For sure that account has not been sold.

I would love to hear how the deal was made to include the repository transfer. It is really surprising.

[0] https://web.archive.org/web/20230524161733/https://github.co...


Sadly everyone has a price; it's pretty much tech real estate.

many js devs have used that on their resumes/portfolio - "I own a 10-line library downloaded over 1 billion times!"

^^ pretty easy target to poach github accounts for mass malware spreading.


start projects with

    Content-Security-Policy: default-src 'self';
then add narrow, individually justified exceptions.
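
A narrow exception might then look like this (the vendor domains are placeholders; when script-src is present it overrides default-src for scripts, so 'self' has to be repeated):

    Content-Security-Policy: default-src 'self'; script-src 'self' https://js.payment-vendor.example; img-src 'self' https://images.cdn.example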


That wouldn't have helped here.

Anyone adding CSPs would have had polyfill.io as permitted... which allowed this attack.


The "justified" in "justified exceptions" is important. Whenever I review CSP additions I ask the following questions:

- do we have a trust relationship with the vendor

- is it strictly required

- what are the alternatives

- blast radius

Adding script-src has a pretty high blast radius. There is no relationship with an unpaid CDN. Alternatives include vendoring a static polyfill script, or just fixing a few functions manually (see the sketch at the end of this comment), depending on the desired level of browser support.

So it would not have passed.

Adding an exception for 3rd-party images, for example, would have to clear a much lower bar, but even there GDPR or information leakage could be a concern.

CSP changes are just a great point to stop and think about how the frontend interacts with the rest of the world. If you just rubber-stamp everything then of course it wouldn't have any effect.
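
For the "fixing a few functions manually" route, a vendored polyfill can be as small as this sketch (illustrative; Array.prototype.includes picked as an example, and it skips the spec's NaN handling):

    // minimal vendored polyfill, served from your own origin
    if (!Array.prototype.includes) {
        Array.prototype.includes = function (search, fromIndex) {
            // unlike the spec, indexOf does not treat NaN as found
            return this.indexOf(search, fromIndex) !== -1;
        };
    }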


> Sansec security forensics team, which on Tuesday claimed Funnull, a Chinese CDN operator that bought the polyfill.io domain and its associated GitHub account in February, has since been using the service in a supply chain attack.

Quite the CDN, eh? :/


They are denying everything on their Twitter @Polyfill_Global lol

https://x.com/Polyfill_Global



I tend to avoid [other people's] dependencies like the plague. Not just for security, but also for performance and Quality. I think I have a grand total of two (2), in all my repos, and they are ones that I can reengineer, if I absolutely need to (and I have looked them over, and basically am OK with them, in their current contexts).

But I use a lot of dependencies; it's just that I've written most of them.

What has been annoying AF, is the inevitable sneer, when I mention that I like to avoid dependencies.

They usually mumble something like "Yeah, but DRY...", or "That's a SOLVED problem!" etc. I don't usually hang around to hear it.


In most professional development contexts your puritan approach is simply unjustified. You’re obviously feeling very smug now, but that feeling is not justified. I note that you say “in all my repos”. What is the context in which these repositories exist? Are they your hobby projects and not of any real importance? Do you have literally anyone that can call you out for the wasted effort of reimplementing a web server in assembly because you don’t trust dependencies? I hope that you’re making your own artisanal silicon from sand you dug yourself from your family farm. Haven’t you heard about all those Intel / AMD backdoors? Sheesh.

EDIT: you’re an iOS developer. Apples and oranges. Please don’t stand on top of the mountain of iOS’s fat standard library and act like it’s a design choice that you made.


> In most professional development contexts your puritan approach is simply unjustified. You’re obviously feeling very smug now, but that feeling is not justified. I note that you say “in all my repos”. What is the context in which these repositories exist? Are they your hobby projects and not of any real importance? Do you have literally anyone that can call you out for the wasted effort of reimplementing a web server in assembly because you don’t trust dependencies? I hope that you’re making your own artisanal silicon from sand you dug yourself from your family farm. Haven’t you heard about all those Intel / AMD backdoors? Sheesh.

> EDIT: you’re an iOS developer. Apples and oranges. Please don’t stand on top of the mountain of iOS’s fat standard library and act like it’s a design choice that you made.

--

ahem, yeah...

[EDIT] Actually, no. Unlike most Internet trolls, I don't get off on the misfortune of others. I -literally-, was not posting it to be smug. I was simply sharing my approach, which is hard work, but also one I do for a reason.

In fact, most of the grief I get from folks, is smugness, and derision. A lot of that "Old man is caveman" stuff; just like what you wrote. I've been in a "professional development context" since 1986 or so, so there's a vanishingly small chance that I may actually be aware of the ins and outs of shipping software.

I was simply mentioning my own personal approach -and I have done a lot of Web stuff, over the years-, along with a personal pet peeve, about how people tend to be quite smug to me, because of my approach.

You have delivered an insult, where one was not needed. It was unkind, unsought, undeserved, and unnecessary.

Always glad to be of service.

BTW. It would take anyone, literally, 1 minute to find all my repos.


Maybe this will lead to everyone building their own stuff. A pretty good outcome for SWEs, isn't it?


I rarely use a CDN for that reason; I prefer to have the libraries hosted with the site.


Is there an easy way to go about scanning my domains to see if I'm impacted?


I don't think most people understand that although '100k+ sites are infected', the actual malware code runs on the visitors' (our) machines; the servers themselves are fine! I read the news yesterday and thought 'yeah, that's too bad', but only later did it dawn on me that I'm the one who's potentially going to have to deal with the consequences.

This needs to be mitigated client side, not rely on the good will of the administrators.


uBlock Origin had filters available within only a few hours of the discovery.

And most browsers already have a built-in domain blacklist via Google's Safe Browsing list, but it's mostly for phishing pages rather than covering every single possibility.
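
The entries are standard uBlock/ABP domain blocks, roughly along these lines (illustrative; the actual shipped lists may differ):

    ||polyfill.io^
    ||googie-anaiytics.com^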


So glad I removed polyfill as soon as all that nonsense went down a few months ago.


Sadly this is how the Web works.

We need a much more decentralized alternative that lets static files be served based on content hashes. For now, browser extensions are the only way. It's sad, but the Web doesn't protect clients from servers, only servers from clients.


Before getting on your soapbox about the decentralised web, please look at what Polyfill actually did. I’m not sure what you’re actually suggesting, but the closest remotely viable thing (subresource integrity) already exists. It simply wouldn’t work in Polyfill’s case because Polyfill dynamically selected the ‘right’ code to send based on user agent.

As usual this problem has nothing to do with centralisation v decentralisation. Are you suggesting that people vet the third parties used by the sites they visit? How does that sound practical for anyone other than ideological nerds?
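
For reference, subresource integrity is just a hash declared on the script tag; the browser refuses to run the file if the bytes don't match, which is precisely why it can't work for a response that varies by user agent:

    <!-- integrity digest below is a placeholder, not the hash of a real file -->
    <script src="https://cdn.example.com/polyfill.min.js"
            integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC"
            crossorigin="anonymous"></script>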


You seem to have a lot of hate for decentralized solutions. Even ones as simple as browsers providing an alternative to DNS and serving static files.

You don’t seem to care about protecting the user.

In a secure environment for the user why the f should the only option be to trust a server to choose which file to send based on the user-agent-reported ID?

The user-agent knows its own ID and can request the static file of its choice from the network. Vetted static Javascript running in the static website can do that.

You are so whipped by the Web’s inversion of power, that you can’t even seriously consider that alternative when writing your post.

You say it's too hard for every person to vet static JS bundles and verify that they have no instructions to phone home or otherwise mess with you. Well, that's why there are third-party auditing companies: they can sign and publicly post approvals of specific bundle hashes, which your browser can then check to make sure at least 2 reputable audits approved that version, just like it does for chains of certificates when loading a site. In fact, you are currently sleepwalking into learned helplessness with AI models hosted by others in the same way.

At least the HN crowd drew the line at Web Environment Integrity by Google. It’s like you are an enthusiastic defender of mild slavery but oppose beating and killing the slaves.


Sigstore is doing a lot of interesting work in the code supply chain space. I have my fingers crossed that they find a way to replace the current application code signing racket along the way.


Sorry - I must have missed this - what’s the application code signing racket?


Shipping an app that runs on Windows without scary warnings requires a ~$400/year code-signing certificate, unless you release through the Microsoft Store.


The fact that two Fastly employees are involved does immense damage to its brand.

Had it not been for the greed of Fastly's employee, this whole thing could've been avoided.


> Had it not been for the greed of Fastly's employee

I don't think he had anything to do with the sale.


Name and shame the sports betting site


Name and shame people who put unwarranted trust in third parties to save 2ms on their requests.


Is there any evidence that whoever is currently behind polyfill.io is a "Chinese company," as most reports claim?

The company known as "Funnull" appears to be based in the Philippines. The phone number associated with their WhatsApp and WeChat accounts has country code +63. They also appear to be based at 30th Street, 14th Floor, Net Cube Center, E-Square Zone, Metro Manila, Taguig, Philippines, at least if the information on this page [1], purportedly from the founder of the company that was acquired and renamed Funnull, is to be trusted (click "View Source," then run it through Google Translate).

Claude translation:

===

> Announcement

> I am the former owner of the original Philippine company Anjie CDN. After my incident, the company was managed by my family. Due to their isolation and lack of support, they were persuaded by unscrupulous individuals to sell the company. Subsequently, the company was acquired and renamed as Fangneng CDN, which later developed into the ACB Group.

> This is precisely what I need to clarify: Fangneng CDN company and the ACB Group have no connection with me or my family. Recently, many companies have contacted my family and threatened them, believing that Fangneng CDN company has stolen important information such as member data and financial transactions through client domain names using infiltration and mirroring techniques, and has stolen customer programs through server rental services. This matter is unrelated to me and my family. Please contact Fangneng CDN company to resolve these issues.

> I reiterate: As my family has long since closed Anjie CDN company, any events that occurred afterwards are unrelated to me and my family!

> Note: Due to my personal issues, this statement is being released on my behalf by my family.

> Fangneng CDN's actual office location: 30th Street 14TH Floor, Net Cube Center, E-Square Zone, Metro Manila, Taguig, Philippines.

> Due to the release of this statement, the Anjie domain name has been reactivated by Fangneng CDN company. Currently, Anjie and Fangneng are one company, so I once again declare that any conflicts and disputes arising from Anjie CDN and Fangneng CDN companies are not related to me or my family in any way!

> First publication date of the announcement: May 18, 2022

> Announcement update date: May 18, 2024

[1] Original URL: https://cc.bingj.com/cache.aspx?q=%E5%AE%89%E6%8D%B7%E8%BF%9...

Archive.today version: https://archive.is/uDNoV


Update: The Register has updated their article to be less committal. See the diff at https://www.diffchecker.com/W3wmc71w/


[flagged]


> Relying on third-party, un-audited code was acceptable when the majority of contributors were from the West.

No, it wasn’t. It was always a really bad idea, and it’s been exploited repeatedly since the start.


Yeah, maybe, but for the last 20 years there really weren't enough incidents to make it an activity that was too risky to do outside of some really security-sensitive applications.



