The average size of Web pages is now the average size of a Doom install

jessriedel · on April 22, 2016

I'm skeptical that developers talking to each other about how bad web bloat is will change anything. They will still face the same incentives in terms of ad revenue, costs of optimization, etc.

Here's a random idea that might have more potential: create an adblocker browser plugin that also colors URLs based on how slow they are expected to load, e.g., smoothly from blue to red. The scores could be centrally calculated for the top N URLs on the web (or perhaps, an estimate based on the top M domain names and other signals) and downloaded to the client (so no privacy issues). People will very quickly learn to associate red URLs with the feeling "ugh, this page is taking forever". So long as the metric was reasonably robust to gaming, websites would face a greater pressure to cut the bloat. And yet, it's still ultimately feedback determined by a user's revealed preferences, based on what they think is worth waiting how long for, rather than a developer's guess about what's reasonable.

tyingq · on April 22, 2016

I think a plugin that enables the chrome "regular 2G" throttling (developer tools / network tab), permanently, for anyone that works in marketing or web development might help :)

Edit: Joking aside, that throttling feature is a nice easy way to let a dev team, or business counterpart, see what their site is like for say, a customer with a low-end DSL connection: https://developers.google.com/web/tools/chrome-devtools/prof...

logotype · on April 22, 2016

No need for a plugin if you are running Linux or OSX:

# Bandwidth trottling, enabling 150kB/s on port 80

sudo ipfw pipe 1 config bw 15KByte/s

sudo ipfw add 1 pipe 1 src-port 80

# Bandwidth trottling, disable

sudo ipfw delete 1

More useful stuff here as well, please star the repo :) https://github.com/logotype/useful-unix-stuff/blob/master/us...

jloughry · on April 22, 2016

It used to work; unfortunately, Apple removed ipfw from recent versions of OS X.

The new method uses dummynet and pf but isn't reliable and I've never got it to work consistently, despite trying for hours and hours.

The only method that works reliably on recent versions of OS X is the free Network Link Conditioner. It is absolutely bulletproof.

Edited to add: Network Link Conditioner seems to use pf and dummynet under the hood; you can see the rules appear. But there's an interaction with the nlcd daemon that I don't understand yet. I want to do protocol-specific bandwidth throttling and I've not got that to work with nlcd interfering. But if you can live with throttling all traffic on the box, NLC works a treat.

kedean · on April 22, 2016

It's not a plugin, it's built into chrome. His joke was about a plugin that enables it permanently for certain people.

parksy · on April 22, 2016

Not to mention my account managers and BDM's don't even know what a linux is.

mort96 · on April 23, 2016

Linux? But he was talking about OS X...

autokad · on April 22, 2016

sounds like that could have a lot of unintended consequences. especially since so much data is now transmitted over port 80

Joeri · on April 22, 2016

I have to think that the devs want to make the page lighter, but their managers keep pushing more tracking and ad networks at them that they're forced to integrate.

tracker1 · on April 22, 2016

Yeah, but I've seen projects using bootstrap, themes, extensions, and extras... in addition to jQueryUI & Mobile all loaded... as well as 2-3 versions of jQuery in different script tags.

That doesn't reflect responsible, light-minded development. Hell, load a couple of Airline website home pages... I don't think any of them are loading uncer 400kb of JS. that's for the homepage alone. Let alone the number of individual assets being requested (less of an issue once http/2 takes hold.. but still.

threesixandnine · on April 22, 2016

This. I have my own site slim and lean even with javascript (analytics) in there and it loads in 300-500ms. I developed a site for a client that was slim and lean and then it started. Sliders, full page background images, photos of random people, 3x analytics, this and that. It became a mammoth of 2-3mb and loading times of 1.5-3s.

When we tracked conversion the best converting page on the site was one made specifically for that and it is 100kb and loaded instantly. Two images and lots of text. They still insist on slow, beautiful pages elsewhere instead of making them convert as well.

bobwaycott · on April 22, 2016

I'm sure that's true of a certain class of developer. However, given what I've seen in the frontend & JS community in past years, I don't think there is a pervasive desire to make pages lighter. If it is, very few of them are showing that in the way they actually build things.

tracker1 · on April 22, 2016

Especially, since many frameworks are really easy to use from source (Bootstrap comes to mind), but almost nobody does... they have the framework, the patched css, and then in the project, maybe there's scss/less etc. Instead of having a project that starts off composed with the appropriate pieces of bootstrap.

That's just bootstrap, not even the shear number of jQuery UI bits floating around, and heaven forbid you see both in a project.. and all the "bugs" that the input in one page doesn't match others. sigh

tyingq · on April 22, 2016

That's true in some cases, but there are lots of big company sites with no ad networks that are crazy bloated and slow. Some have trackers, but would be bloated (2,3,4MB+ over the wire) without them as well.

davidivadavid · on April 22, 2016

Doesn't Segment.io address such issues?

Pxtl · on April 22, 2016

To me the bigger problem is the javascript bloat making it crawl on a mobile device with 1Gb of ram and a 1.3 ghz processor.

bobwaycott · on April 22, 2016

Sounds like the perfect addition to corporate provisioning policies/profiles for all their machines!

Razengan · on April 22, 2016

I think a more radical departure is needed. It's about time to acknowledge that the web is increasingly being used to access full-blown applications more often than "webpages."

Web browsers have more or less become mini operating systems, running elaborate virtual machines. There's way too much complexity for everyone involved — from web devs and browser devs to the users and the people who maintain the standards, then there's the devs who have to make native clients for the web apps — just to deliver products that don't have half the power of OS-native software. Everyone has to keep reinventing the wheel, like with WebAssembly, to fix problems that don't have to be there in the first place, not anymore:

Thanks to smartphones, people are already familiar with the modern concept of the standalone app; why not just make downloading an OS-native binary as easy as typing in a web address, on every OS?

Say if I press Cmd+Space on a Mac and type "Facebook" in Spotlight, it immediately begins downloading the native OS X Facebook app. The UI would be described in a format that can be incrementally downloaded, so the experience remains identical to browsing the web, except with full access to the OS's features, like an icon in the Dock and notifications and everything.

TL;DR: Instead of investing resources in browser development, Android/iOS/OSX/Windows should just work on a better standard mechanism to deliver native apps instead.

dasil003 · on April 22, 2016

I disagree.

This is a backward-looking argument which ignores the unique benefits of the web which made it inevitable that it would evolve into an application platform, regardless of how tortured the results may feel.

The web is the first truly cross-platform development environment. It is not controlled by a single vendor, and anyone implementing a new computing device must support the web (just stop for a second and consider from a historical perspective what a monumental accomplishment that is). Furthermore, it allows casual access of content and applications without any installation requirement. It comes with a reasonable security model baked-in, which, while imperfect, gets far more attention than most OS vendor sandboxing schemes. Last but not least, the web's primitive is a simple page, which is far more useful than an app as a primitive—for every app someone installs they probably visit 100 web pages for information that they would never consider installing an app for.

I agree that the web is sort of abused as an application platform, the problem is there is no central planning method which will achieve its benefits in more app-oriented fashion. No company has the power to create a standard for binary app deliverables that will have anywhere near the reach of the web. And even if one could consolidate the power and mastermind such a thing, I feel like it would run squarely into Gall's Law and have twice as many warts as the web.

timr · on April 22, 2016

"The web is the first truly cross-platform development environment."

No it isn't. Not even close. It's maybe the first cross-platform development "environment" of which millenials are widely aware. But it's only an "environment" in the most ecological sense -- it's a collection of ugly hacks, each building upon the other, with the sort of complexity and incomprehensibility and interdependency of organisms you'd expect to find in a dung heap.

"Last but not least, the web's primitive is a simple page, which is far more useful than an app as a primitive"

For whom, exactly? You're just begging the question.

I'll grant you that "installing" an app is more burdensome for users than browsing to a web page, but the amount of developer time spent (badly) shoe-horning UI development problems (that we solved in the 90s) into the "page" metaphor is mind-boggling. In retrospect, the Java applet approach seems like a missed opportunity.

The proper reaction to something like React, for example, should be shame, not pride. We've finally managed to kludge together something vaguely resembling the UI development platform we had in windows 3, but with less consistency, greater resource consumption, and at the expense of everything that made the web good in the first place. And for what reason? It's not as if these "pages" work as webpages anymore.

A proper "application development environment" for the web would be something that discards the page model entirely, and replaces it with a set of open components that resembles what we've had for decades in the world of desktop application development.

elviejo · on April 22, 2016

You are so right. The web as an application delivery platform sucks. And we are no even capable of producing UIs with the same level of polishment I did in visual basic 2.0 in 1996 or so.

Alan Kay has expressed the same filling.

PS to downvoters if you have not used a proper interface designer such a QT or Delphi, then you don't know what we mean. Please watch some videos to decide if the state of the art (angular and react) is what we should be using in 2016.

dasil003 · on April 22, 2016

The downvotes are because he is not engaging with my point. Never did I say the web is a proper application development environment. My point was that you can't create a proper application development environment that is both an open and defacto standard the way the web is.

timr · on April 22, 2016

How am I not engaging with your point? In response to a comment pointing out how we need to re-think web application development, you said:

"The web is the first truly cross-platform development environment."

...and then talked a bit about how it's open (yeah, ok, sure), and then you said it's not really a good application development environment (obviously).

I'm saying, your entire premise is wrong: it isn't an application development environment, any more than a box of legos is a "housing development environment". People have built houses out of legos, but that doesn't make "lego" a building material. It's a big, messy, nasty hack.

The fact that it's "open" is a non-sequitur response to "it's the wrong tool for the job", which is what the OP (and I, and elviejo) are arguing. It's also not a legitimate response to argue that any re-thinking of the model has to come from a company, or otherwise not be "open".

The reason that web apps happened is because web apps started as a hack. That doesn't mean we can't change the paradigm, but to do that, we have to stop defending the current model.

(Realistically, the reason I'm getting downvoted probably has more to do with my willingness to call out React as a pile of garbage than with the substance of the greater argument. C'est la vie...it's actually pretty amusing to watch the comment fluctuate between -3 and +3...)

dasil003 · on April 23, 2016

> That doesn't mean we can't change the paradigm, but to do that, we have to stop defending the current model.

Here's the crux of our disagreement. You believe that the web is such a broken application platform that it is possible to convince enough vendors and people to get behind a better solution. However, I (despite your presumptuous implication that I'm a millenial), have been around long enough to know that will never happen. Web standards will continue iterating, and companies will continue building apps on the web, even the most powerful app platforms today such as iOS and Android for all their market power can not stop this force. The reason is because it's a platform that works. The man-millenia behind the web can not be reproduced and focused into a single organized effort. You might as well argue that we replace Linux with Plan 9, it doesn't matter how much passion you have and how sound your technical argument is, Linux, like the web, is entrenched. It's gone beyond the agency of individual humans and organizations to become an emergent effect.

That's not to say that the web might not some be supplanted by something better, but it won't come because of angry engineers wringing their hands about how terrible the web is. It will come from something unexpected that solves a different problem, but in a much simpler and more elegant way, and over time it will be the thin edge of the wedge where it evolves and develops into a web killer.

Maybe I'm just cynical and lack vision, perhaps you can go start a movement to prove me wrong. I'll happily eat my hat and rejoice at your accomplishments when that time comes.

timr · on April 23, 2016

"You believe that the web is such a broken application platform that it is possible to convince enough vendors and people to get behind a better solution. However, I...have been around long enough to know that will never happen."

"That's not to say that the web might not some be supplanted by something better..."

Whomever wrote the first paragraph of your comment should get in touch with the person who wrote the second paragraph.

OK, seriously, though, let's summarize:

1) Person says "web development sucks, here's why: $REASONS"

2) You reply: "it's the only truly cross-platform development environment ever"

3) I (and others) reply: "no, it really isn't. it isn't even a development environment, by any reasonable measure."

Now you're putting words in my mouth about convincing vendors and starting movements. I'm not trying to start a revolution here, just trying to counter the notion that we can't do any better than the pile of junk we've adopted. You don't have to love your captors!

I have no idea if someone will come up with a revolutionary, grand unified solution tomorrow, but I know that this process starts with the acknowledgement that what we have sucks, and that we have lots of examples of better solutions to work from. Hell...just having a well-defined set of 1995-era UI components defined as a standard would be a quantum leap forward in terms of application development.

dasil003 · on April 23, 2016

The irony is I understand your qualitative opinion of the web, and I generally agree with it. What I believe makes you unable to see my argument is an inability to separate technical excellence from the market dynamics that govern adoption.

Declaring the web "not even a development environment" is just absolutist rhetoric that can in no way further the conversation. If you define "development environment" as a traditional GUI toolkit then your're just creating a tautology to satisfy your own outrage.

okc · on April 23, 2016

This is a great discussion. What is it about the English language that makes it so much easier to oppose someone than express nuances in general opinion? I would like to see more discussions like this based at implementation level, surely something valuable and innovative is being grasped at by both sides.

Khaine · on April 26, 2016

Lets be honest, the web is a developer environment in the same way that a paper aeroplane is a passenger plane. I mean I'm sure its possible to create a 747 from paper, but do you really want to?

Why is it that a new javascript framework pops up each week? Its because the web as a developer environment is deficient. Despite it being standardised so much stuff doesn't work without kludges in each browser

Razengan · on April 23, 2016

> It will come from something unexpected that solves a different problem, but in a much simpler and more elegant way, and over time it will be the thin edge of the wedge where it evolves and develops into a web killer.

So.. app stores?

It has already begun. The most popular webapps (Facebook, Twitter etc.) already have native clients in Android and iOS. I believe the majority of people already prefer and use the native FB/Twitter apps more often than accessing the FB/Twitter websites. So it's already obvious that native apps must be more convenient.

Right now however, app stores are a little clumsier to navigate compared to browsers.

For webapps:

• you have to open the browser,

• type in the address OR

• use a web search if you don't know the exact address.

But for apps:

• you have to open the app store,

• search for the app,

• potentially filter through unofficial third-party software,

• download the app, possibly after entering your credentials,

• navigate to the app icon,

• authorize any security permissions on startup (in the case of Android or badly-designed iOS apps.)

We just need the Big Three (Apple/Google/Microsoft) to actively acknowledge that app stores can supplant the-web-as-application-platform, and remove some of those hurdles.

Ideally an app store would be akin to searching for a website on Google.com (or duckduckgo.com) with a maximum of one extra click or tap between you and the app.

Apps should also be incrementally downloadable so they're immediately available for use just like a website, and Apple already has begun taking steps toward that with App Thinning.

Ultimately there's no reason why the OS and native apps shouldn't behave just like a web browser, because if web browsers keep advancing and evolving they WILL eventually become the OS, and the end result will be the same to what I'm suggesting anyway.

Currently though, both the native OS side and the web side exist in a state of neither-here-nor-there, considering how most people actually use their devices.

digi_owl · on April 23, 2016

Not so much works, as they have found a way to monetize as a rent.

TeMPOraL · on April 23, 2016

I'm in the middle of reading through The Unix-Haters Handbook and I must say that the arguments against web-as-application-platform (and the attempts to defend it) and are eerily similar to what this book reports about arguments against Unix that were circulated in the 80s and early 90s. Sadly, the fact Unix managed to a) win, and b) fuck up the computing world so badly that people don't even realize how much we've lost doesn't make me hopeful about the future of the web.

Bad stuff seems to win because it's more evolutionarily adapted than well thought out stuff. This happens to hold for programming languages too.

Dr_tldr · on April 23, 2016

Your takeaway of computers going from niche industry to the single largest driver of global economic activity is that the bad stuff won? What an incredibly myopic conclusion.

The recent cross-communication between JavaScript, Elm, and Clojure has been incredibly fruitful but hasn't been noticed by the bitter die-hards. And really, almost all of it could've happened literally 15 years ago with Lisp if the Lisp community hadn't been dismissive, arrogant douchebags that considered JavaScript a worthless toy language.

What's truly sad is that some people would rather be abstractly right while producing nothing of value than work with the dominant paradigm and introduce useful concepts to it.

TeMPOraL · on April 23, 2016

> Your takeaway of computers going from niche industry to the single largest driver of global economic activity is that the bad stuff won? What an incredibly myopic conclusion.

This did not happen thanks to Unix; if anything, you'd probably have to be grateful to Microsoft and Apple for introducing OSes that were end-user-usable. There's a reason the "year of Linux on Desktop" never happened and is always one year from now.

The point of The Unix-Haters Handbook, which also applies very much to modern web is that the so-called "advancement" didn't really bring anything new. It reinvented old things - things we knew how to do right - but in a broken way, full of half-assed hacks that got fossilized because everything else depends on it.

> And really, almost all of it could've happened literally 15 years ago with Lisp if the Lisp community hadn't been dismissive, arrogant douchebags that considered JavaScript a worthless toy language.

I don't know where you're getting that from, but it's probably a good opportunity to remind you that JavaScript was supposed to be Scheme twice, both time it didn't happen because Netscape wanted a Java-looking solution right fucking now to compete first with Java, and then with Microsoft, and somehow no-one thought to pause for the moment and maybe do it right.

(Also don't blame Lisp community for the fact that companies reinvented half of Lisp in XML. Rather ask yourself why most programmers think the history of programming is a linear progression of power from Assembler and C, and why they remain ignorant of anything that happened before ~1985.)

JavaScript got a bad rep because a) it was terribly broken (less so now), and b) because of all the stupid stuff people were writing in it those 15 years ago. But the current problems of the Web are not really the fault of JavaScript, but of the community moving forward at the speed of typing, without stopping for a second and thinking if those layers on layers on layers of complexity are actually needed or useful. Simple landing pages are now built on frameworks that are more complex than what we used to call "Enterprise Edition" 10 years ago.

Dr_tldr · on April 23, 2016

Unix isn't for end users, it's for developers to build on top of to give things to end users. Linux on desktop already happened years and years ago if you work for a tech company, and that's probably about as far as it needs to go.

http://steve-yegge.blogspot.com/2006/04/lisp-is-not-acceptab...

This is Steve Yegge's understanding of the Lisp community, and I should clarify that I don't think the XML monstrosities we all work with are "all their fault", but that, on the whole, the Lisp community and enterprise coders were mutually antagonistic.

Javascript after ES3 really wasn't broken at all, just most people coding in it didn't know how to take advantage of it. No language can prevent someone dedicated to bad code architecture from writing bad code. A Lisp programmer would've found a lot of comfortable features and powerful patterns and been able to share them, but most of their efforts were wasted denouncing everyone outside of their tiny sect. The end result was that most people learning JavaScript were taught how to code like it was broken Java because most of the resources were written by Enterprise Java devs who didn't understand what a fundamentally impoverished language Java is.

TeMPOraL · on April 23, 2016

Thanks for the link to that post. I'd also advise to read through its comment though - some people there, especially Pascal Costanza, point out quite a lot of problems that basically reduce it to ranting of a person who doesn't understand the language and the culture he's writing about ;).

Also, the influx of enterprise patterns into JavaScript is quite a recent phenomenon - personally, I blame Google (who, for a reason I can't understand to this day, embraces enterprise-level Java as their primary platform for everything...), but regardless, the problem with JavaScript culture is mostly that of very fast growth coupled with lack of experience and (probably unwilling) ignorance of the past. Since this community basically controls the Internet, it's hard for voices expressing some restraint and thoughtfulness to get trough the noise.

And I really do recommend The Unix-Haters Handbook. Funny thing is - over a decade ago, when I was acquainting myself with the Linux world (after many years of DOS and Windows experience), I've been noticing and complaining about various things that felt wrong or even asinine. Gradually I got convinced by people I considered smarter than me that those things are not bugs but features, they're how a Good Operating System works, etc. Only now I realize that my intuition back then was right, but I got Stockolm-syndromed to accept the insanity. Like most of the world. The sad thing is, there were better solutions in the past, which once again shows how IT is probably the only industry that's totally ignorant of its own history and constantly running in circles.

sethrin · on April 25, 2016

What part of UHH do you think is not obsolete? I dislike many aspects of Unix, but it seems like 90% of the book is either wrong, meaningless invective, or obsolete. It's hard to view it as of more than historical value -- which is a shame, because Unix is far from perfect.

sethrin · on April 25, 2016

To clarify, I'm about half of the way through the book, and I have found a couple arguably-correct points. Complaints about Usenet and Unix vendors have mostly gone the way of the dodo. The author consistently ignores any distinction between the filesystem and the rest of the OS, which may have been accurate at the time of writing, but hasn't been true for decades. Similarly, we don't distinguish between a given shell and Unix as a whole, even though the book makes explicit mention that other shells exist. And why there is a chapter on Usenet passes understanding. Suggesting that shell expansion be handled by an external library is equally bizarre. So far the valid points are: * command input syntax is inconsistent. I don't know that this has a feasible solution, but it is true. * tar specifically sucks * sendmail configuration sucks

The real crime of UHH is that it merely hates, it does not instruct. When we do find valid criticisms, there is no suggestion for how to fix things, or how other OSes are better at the same role. I've resigned myself to read the entirety, but for all the authors' complaints about not learning anything from history, one can only feel like they have themselves to blame.

Dr_tldr · on April 23, 2016

If you know C++/Perl/PHP but don't know ES6, websockets, Node, etc, would you not say that your opinion about the web being a bad environment for rich applications might be in some way colored by your personal economic interests?

A common defensive mechanism among people with outdated skills is to try to delegitimize new frameworks and technologies in the hopes of convincing the broader community not to use things they don't know.

I'm inferring this from your arguments being driven by analogies and insinuations rather than concrete critique. It's not my intention to attack you personally, but an aggressively dismissive attitude towards unfamiliar concepts should be properly contextualized.

As for React, isn't it more likely that you don't know React very well, have never looked at its internals, and in general don't feel like you have the time or ability to learn much about modern web development?

If you build a few projects with React and still dislike it, good! Your critiques will be a lot more valid and useful at that point, whereas right now...yeah.

sievebrain · on April 23, 2016

I know C++, used to know Perl, and know JavaScript pretty well. The fact that you actually name WebSockets as a technology sort of sums up the issue we are discussing here. WebSockets is not something that can be compared to C++ or even Node. It's a dumb hack which justifies its existence primarily by allowing web apps to circumvent corporate firewall policies.

The web is a joke of an app platform. Those of us who have wider experience of different kinds of programmings see some web devs struggling with this concept and conclude, I think quite reasonably, that the only plausible explanation is lack of experience. This is not due to "outdated skills" - I daresay everyone criticising the web as a platform in this thread has, in fact, written web apps. It's the opposite problem. It's to do with developers who haven't got the experience of older technologies having nothing to compare it too, so "web 2.2" gets compared to compared to "web 2.0" and it looks like progress.

And in case you're tempted to dismiss me too, I recently tried to write an app using Polymer. It sucked. The entire experience was godawful from beginning to end. Luckily the app in question didn't have to be a web app, so I started over with a real widget toolkit and got results that were significantly better in half the time.

Dr_tldr · on April 23, 2016

I want to disagree with you more here, but Polymer really does suck.

I would be interested in a detailed explanation of why websockets are a "dumb hack." Duplex streams much more closely map to what web apps actually need to do than dealing with an HTTP request-response cycle. In what way is streaming a hack and making requests that were originally designed to serve up new pages not a hack?

jt2190 · on April 22, 2016

  > My point was that you can't create a proper application 
  > development environment that is both an open and defacto 
  > standard the way the web is.

Why? Is there a technical limitation? Or are you saying that it's technically possible, but that you have no faith that vendors will cooperate and support such a thing?

dasil003 · on April 23, 2016

The latter.

clevernickname · on April 23, 2016

>My point was that you can't create a proper application development environment that is both an open and defacto standard the way the web is.

In addition to the countless examples posted in this thread, I would argue that if it's nearly impossible to create your own implementation of a platform or standard from scratch, then it's not really open in a practical sense. Who cares if the specs are available if it takes dozens or hundreds of man years to deliver a passable implementation?

shwouchk · on April 25, 2016

"Who cares if you have the formula of a drug to cure cancer if it takes many manyears to produce it from said formula?"

rayiner · on April 23, 2016

The web makes Java look clean and elegant. Which is saying something. But it's an open question whether anything as popular as the web could have been any less crappy. CPU cycles and memory are way cheaper than social cooperation.

dasil003 · on April 22, 2016

> No it isn't. Not even close. It's maybe the first cross-platform development "environment" of which millenials are widely aware.

Name one then instead of this patronizing BS.

timr · on April 23, 2016

Just off the top of my head: Delphi, Smalltalk, Java, Flash, Mono, and many different cross-library thunks like wxWindows, Qt, GTK+, and so on.

Two seconds of googling will find you dozens.

dasil003 · on April 23, 2016

Those aren't cross-platform the way the web is cross-platform, they all depend primarily on one vendor's implementation. None of those ever had any hope of crossing the chasm to ubiquity, they only gained as much traction as their vendor had market clout on a limited set of platforms for a limited time period.

timr · on April 23, 2016

"Those aren't cross-platform the way the web is cross-platform, they all depend primarily on one vendor's implementation"

And web browsers are different how, exactly?

(Other than the fact that "the web" is a mish-mash of hundreds of different "standards" with varying levels of mutual compatibility and standardization, of course.)

sethrin · on April 23, 2016

Web browsers have standards they're supposed to meet. Where's the standard for qt? Where's the competing implementation? You can denigrate the web standards, even with some degree of truth, but you yourself have pointed out the difference. Perhaps instead of being dismissive you can elaborate how this difference is ultimately valueless despite its apparent success.

Skinney · on April 23, 2016

Both the JVM and the CLR have multiple implementations. Both JVM and CLR have a standard, as do Java and C#. The primarily implementation for both is also open source. So no, they don't primarily depend on one vendor's implementation.

cbd1984 · on April 22, 2016

> But it's only an "environment" in the most ecological sense -- it's a collection of ugly hacks, each building upon the other, with the sort of complexity and incomprehensibility and interdependency of organisms you'd expect to find in a dung heap.

That's what every practical environment is. Only the environments which never get used remain "pristine" in the architectural sense, because you cannot fundamentally re-architect an ecology once you have actual users beyond the development team and a few toy test loads.

TeMPOraL · on April 23, 2016

Given how fast JS world jumps between frameworks I'm thinking that yes, a strong enough actor could in fact re-architect the Web, or at least clear out some of that dung heap...

_w3e8 · on April 25, 2016

I had this discussion the other day with a junior engineer in regards to web assembly and the need for more stable compile targets than javascript.

I think a big part of the problem is that web developers have forgotten (or never learned) about a lot of the ui innovation that has already been done for native platform development.

I blame the page / html / dom model for this. It has forced generations of web developers to figure out clever (or not) workarounds to the point that they actually think they are innovating when they arrive at the point qt was at years ago.

niutech · on April 24, 2016

This post is about web bloat, not about app platforms.

abiox · on April 22, 2016

> Android/iOS/OSX/Windows should just work on a better standard mechanism to deliver native apps instead

one might imagine that after these competing and incompatible native apps become a headache for crossplatform pursuits, a new platform will emerge that provides a uniform toolset for developing (mostly) native-platform independent applications.

perhaps this toolset will utilize a declarative system for specifying the user interface, and a scripting system that is JIT'd on each platform.

arcticfox · on April 22, 2016

Magic! We could even have standardized links between the apps, and companies that specialize in finding content no matter what app it's in.

I think you're on to something...it could be huge... Heh.

TeMPOraL · on April 23, 2016

> I think you're on to something...it could be huge... Heh.

Could be. Sad that it isn't. Think how awesome it would be if app developers actually cared about interoperability instead of trying to grab the whole pie for themselves while giving you a hefty dose of ads in return. This is mostly the fault of developers, but the platform itself could help a lot if it was more end-user programmable. You'd have at least a chance to force different apps to talk to each other.

bendbro · on April 23, 2016

He's being sarcastic. Links are literal web links and the Company he's referring to is Google.

TeMPOraL · on April 23, 2016

Yes, I know. And I'm trying to subtly point out that it doesn't even work on the web, because it got fucked up by cowboy companies who ignore standards and do whatever they like to get easier $$$.

sievebrain · on April 23, 2016

Android has declarative UI, JIT compiled app logic and a way to link apps together (via intents). It is definitely not the web though.

I think people confuse ideas with implementations. The web is a pretty reasonable implementation of the idea "let's build a hypertext platform". It is not at all a reasonable implementation of the idea "let's build an app platform" which is why in the markets where reasonable distribution platforms exist (mobile) HTML5 got its ass kicked by professionally designed platforms like iOS and Android.

Razengan · on April 22, 2016

Well, why do native apps for all the popular websites exist? Surely nobody would need or download the FB/Twitter/Reddit/etc. apps on mobile or desktop if the website itself was the optimum experience...

My point is that even with the web the developers are still going to make the native clients, so either the web has to become good enough for the need for native apps to disappear eventually, and the browser becomes the OS, or native apps become convenient enough to completely replace webapps.

Of course if the browser becomes the OS then the end result would be the same as the suggestion in my original post.

niutech · on April 25, 2016

There is one such platform called Haxe (http://haxe.org)

tnorgaard · on April 22, 2016

Interestingly, this was what Alan Kay was advocating for on the web: "A bytecode VM og something like X for graphics" [1].

When Lars Bak and Kasper Lund launched Dart [2], I found it sad that they weren't more bold - left CSS and the DOM alone, and created an alternative Content-Type. So you can choose to Accept 'application/magic-bytecode' [3] before text/html, if your client supports so. Sadly, we ended up with Web Assembly, which by the few talks I've seen, appears to only cater to that of graphic/game developers, with no support for dynamic or OO languages.

[1] https://www.youtube.com/watch?v=FvmTSpJU-Xc [2] https://www.dartlang.org [3] Or in Dart lingo, a VM snapshot.

elviejo · on April 22, 2016

Yes I think of dart as a missed opportunity, it isn't smalltalk for the web, neither is it strongly typed... I think this falt of character makes it that nobody hate it, but also no body loves it.

Go doesn't have generics, some hate it, some love it. But it took a strong stand on that point.

MustardTiger · on April 22, 2016

>Web browsers have more or less become mini operating systems

I wish. No, web browsers have become massively bloated operating systems. And since they didn't intend to, they are terrible at it. You have little to no control over anything.

makomk · on April 22, 2016

Have you looked at the size of native apps these days? They make the web look almost unbloated by comparison. (It feels like half of them are shipping an entire HTML rendering engine too.)

niutech · on April 25, 2016

Exactly. E.g. have a look at the Facebook app.

mirimir · on April 22, 2016

Maybe because I don't want a bloody app for every site that I browse? Apps of that sort are just dumb. And huge security risks.

Razengan · on April 22, 2016

Of course webpages would still be around and needed, and desired for more basic content, but by the time you want to offer something as complex or regularly used as Facebook or Twitter or Reddit or Soundcloud and so on, it'd be better as a native app, as the current native clients for each of those websites already prove.

I mean the UI is undeniably smoother, and they can seamlessly hook into the OS notifications systems and better multitasking (for example I see separate entries for each native app in the OS's task switcher, but have to go through the extra step of switching into a browser then its tabs for webapps) and energy saving and everything else.

mirimir · on April 22, 2016

I don't want seemless and hooked. I want isolated, and sandboxed.

sdkmvx · on April 23, 2016

> regularly used as Facebook or Twitter or Reddit or Soundcloud

And there's the problem. I don't use any of those four sites regularly, but I have visited all of them. Hyperlinks provide for that, and they (a) don't exist or at best would be awkward in a native app (not that webapps handle them well to begin with, though all of the above do allow for them at least) and (b) work between apps, platforms, and what-have-you. If I got a link to some image, <160 character sentence, comment thread, or song and was prompted to download an app, I would probably not view that content instead.

khedoros · on April 22, 2016

That sounds ambitious, interesting, and like a lot more work than keeping a webpage running. My first thought is "cross-platform nightmare". My second thought is that they'd have to rethink their expectations of continuous deployment/release and a/b testing.

We're either looking at making individual pages maintain binaries for the platforms they support (implying support of only those platforms that make sense to the site) or some kind of compilation framework running on the local machine.

PretzelFisch · on April 23, 2016

This native delivery is pretty much a solved problem, just look at how firefox and chrome auto update.

khedoros · on April 23, 2016

Firefox and Chrome don't do "move fast and break things"-style continuous deployment, can't push updates to a user seamlessly (i.e. without restarting the program), and can be downloaded and compiled on platforms that Mozilla and Google don't want to officially support.

Native delivery of a monolithic browser based on an open-source codebase is a fixed problem. Trying to do the same with a website, using current techniques, would cause issues to both their current workflow and my expectations as a user that websites don't currently have.

I'm not saying that it's impossible to do, I'm just saying that it's not a good fit for current trends in web development, and I'm not convinced that it would be great for the users either.

tmd83 · on April 23, 2016

I also agree that the web is full of bloat and a mishmash of compromised features. I think you can design a web thats more powerful yet simpler if you actually solve the problems in a more forward thinking manner.

For example there has been so many things for powerful layouts whereas everyone knew 10 years ago we need powerful layout solutions (flexbox or whatver) and now we have grid frameworks and years of craft on older css enhancement that has to be supported. They keep adding features here and there to sort of address lots of problems where individually those features might be cheaper but the overall cost of implementing both by browsers and the us regular developers is much higher.

Here are couple of the things I want from the web and quite a few of them are there already if not in super ideal forms. Powerful layout thats simple enough to use, concept of webpage, a bundle (http2? all your resource together), making the whole partial rendering (ajaxified page) a natural concept. Even making the UI/markup delivery being made separate from content (you can do that with all sorts of library but I think it should be at the core). Security concepts that are easier to implement (CSRF, url tampering etc.).

One of the idea I had is that browsers make a new engine that does the right things from the start and hopefully thats a much lighter engine and if you serve new pages they are really fast and if you serve old pages there is an optional transpoiler kind of thing that translates to the new version of the fly. Now it won't be terribly good to start with so its optional but essentially the old version is frozen and the more people start to only use the new engine (with transpoiler).

tammer · on April 22, 2016

Not sure about the conclusion but you're absolutely right about the problem. This is why the social networks are eating such a great percentage of screen time: pure content, outside of web applications, is better suited in a single format rather than the superfluous loud magazines that currently pepper the web.

Perhaps rather than native apps what we need is the return of gopher. I think that's what Apple's trying to do with Apple News.

TeMPOraL · on April 23, 2016

That's a good point. Too much control over the form is left with website owners and too little with web users. Most of the web looks much better when you strip out the styling it comes with.

In a way, this is why I like doing more and more things from inside Emacs. I get a consistent interface that I control, and that is much more powerful than what each app or website I'd otherwise use can offer. Hell, it's a better interface than any popular OS has.

Aleman360 · on April 22, 2016

> Say if I press Cmd+Space on a Mac and type "Facebook" in Spotlight, it immediately begins downloading the native OS X Facebook app

That's exactly what Windows 10 does.

http://imgur.com/EqvATwL

nsgi · on April 22, 2016

Does it auto-download? You still have to install it anyway.

tracker1 · on April 22, 2016

I'm fine with that.. however, I've seen MANY instances of websites that load multiple versions of jQuery... that's just one library. Let alone how poorly constructed most web applications are.

When it comes down to it, for so long many front-end guys only cared about how it looked, and backend guys didn't care at all, because it's just the ui, not the real code.

We're finally at a point where real front end development is starting to matter. I honestly haven't seen this much before about 3-5 years ago... which coincides with node and npm taking over a lot of mindshare. There's still a lot of bad, but as the graph show, there's room to make it better.

PavlovsCat · on April 22, 2016

> It's about time to acknowledge that the web is increasingly being used to access full-blown applications more often than "webpages."

I think that is orthogonal to bloat. Sure, a complex app will always have more to load and compute than a static page with one blog post on it, but that doesn't mean an app can't be bloated on top of that, just like pages with just a single blog post on them can be bloated.

Razengan · on April 22, 2016

Speaking of bloat,

https://news.ycombinator.com/item?id=11548816 - The average size of Web pages is now the average size of a Doom install

PavlovsCat · on April 23, 2016

That's not really bloat, because anybody who sees your comment will already have that link in the browser cache ^_^

(To make this comment not entirely frivolous, does anyone remember the "bloatware hall of shame", or however it was called? I couldn't find it or anything decent like it, sadly. How about something like it for websites?)

Maakuth · on April 23, 2016

> Thanks to smartphones, people are already familiar with the modern concept of the standalone app; why not just make downloading an OS-native binary as easy as typing in a web address, on every OS?

Congratulations, you've invented ActiveX.

anthk · on April 23, 2016

>Say if I press Cmd+Space on a Mac and type "Facebook" in Spotlight, it immediately begins downloading the native OS X Facebook app

Epic malware vector.

SonicSoul · on April 22, 2016

What constitutes "way too much complexity" ? What if browsers are evolving with whatever resources are available to them at just the right pace. Why would you want the browser to be hindered by an arbitrary speed/resource limit. Let it soar to the sound of fans going full speed!

stepanhruda · on April 22, 2016

Isn't WebAssembly a big step towards this?

niutech · on April 24, 2016

WebAssembly is even not supported natively in any browser, is it?

xj9 · on April 22, 2016

In my opinion, if there is something that the web can't do as well as native it is a bug.

Web technologies can already do most of what you are proposing, including notifications. There are some performance issues, but they are well on their way to being fixed.

TeMPOraL · on April 23, 2016

Thing is, those performance issues matter. In user interfaces, they mean a lot. And they're also important when you're trying to do actual work. I'm yet to see a web app that wouldn't choke on the amount of data a power user may need. Yes, that includes Google's office suite, which is so horribly laggy when you put more than a screenfull of content into it that the very experience justifies switching to Windows and buying Microsoft Office.

What we would need, if the browser is to become a platform for actual productivity tools and not shiny toys, is a decent persistent storage interface - one that would be controlled by users, not by applications, and that could be browsed, monitored. And most importantly, one that would be reliable. And then, on top of that, a stronger split between what's on-line and what's off-line. Because some data and some tasks should really not be done through a network.

sievebrain · on April 23, 2016

The problem with the web is not one specific missing feature, or even performance. Man-centuries of effort by browser makers have been able to make performance not-quite-competitive instead of just hopelessly uncompetitive.

The problem with the web is that the developer experience is nightmarish. The fact that native apps don't suffer XSS should be a hint about where to start looking, but it's really just a house of horrors in there.

d33 · on April 22, 2016

I would say that a possible solution is also to better rank websites that mention the checksums of their external resources and make web browsers keep them in cache much longer... if pretty much every website uses jQuery, perhaps we should ship jQuery with the web browser?

kevincox · on April 22, 2016

I hear this argument a lot, and I very much disagree. Now you have browser vendors having to device which libraries are "popular" and shipping them in the initial download of the browser.

It turns out that this technology already exists in a much better form. It's called cache. The problem is that almost everyone hosts their own version of jQuery. If everyone simply linked the "canonical" version of jQuery (the CDN link is right on their site) then requiring jQuery will be effectively free because it will be in everyone's cache.

Also the cache is supported by all browsers with an elegant fallback. Instead of having to manually having to check if your user's browser has the resource you want preloaded you just like the URL and the best option will automatically be used.

TL;DR Rather then turning this into a political issue stop bundling resources, modern protocols and intelligent parallel loading allow using the cache to stove this problem.

nostrademons · on April 22, 2016

> If everyone simply linked the "canonical" version of jQuery (the CDN link is right on their site) then requiring jQuery will be effectively free because it will be in everyone's cache.

It's not, though. I ran this experiment when I tried to get Google Search to adopt JQuery (back in 2010). About 13% of visits (then) hit Google with a clean cache. This is Google Search, which at the time was the most visited website in the world, and it was using the Google CDN version of JQuery, which at the time was what the JQuery homepage recommended.

The situation is likely worse now, with the rise of mobile. When I did some testing on mobile browsing performance in early 2014, there were some instances where two pageviews was enough to make a page fall out of cache.

I'd encourage you to go to chrome://view-http-cache/ and take a look at what's actually in your cache. Mine has about 18 hours worth of pages. The vast majority is filled up with ad-tracking garbage and Facebook videos. It also doesn't help that every Wordpress blog has its own copy of JQuery (WordPress is a significant fraction of the web), or for that matter that DoubleClick has a cache-busting parameter on all their JS so they can include the referer. There's sort of a cache-poisoning effect where every site that chooses not to use a CDN for JQuery etc. makes the CDN less effective for sites that do choose to.

[On a side note, when I look at my cache entries I just wanna say "Doubleclick: Breaking the web since 2000". It was DoubleClick that finally got/forced me to switch from Netscape to Internet Explorer, because they served broken Javascript in an ad that hung Netscape for about 40% of the web. Grrrr....]

PavlovsCat · on April 22, 2016

> The vast majority is filled up with ad-tracking garbage and Facebook videos.

There is the problem then, and the solution? I for one don't make bloated sites nilly-willy, I suck at what I do but at least I do love to fiddle and tweak for the sake of it, not because anyone else might even notice; and I like that in websites and prefer to visit those, too. Clean, no-BS, no hype "actual websites". So I'd be rather annoyed if my browser brought some more stuff I don't need along just because the web is now a marketing machine and people need to deploy their oh so important landing pages with stock photos and stock text and stock product in 3 seconds. It was fine before that, and I think a web with hardly any money to be made in it would still work fine, it would still develop. The main difference is that it would be mostly developed by people who you'd have to pay to stay away, instead of the other way around. I genuinely feel we're cheating ourselves out of the information age we could have, that is, one with informed humans.

jakub_g · on April 22, 2016

Interesting data, thanks for sharing.

On top of that, while everyone uses jquery, everyone uses different version of it (say, 1.5.1, 1.5.2, ... hundreds of different versions in total probably).

d33 · on April 22, 2016

The problem with caching is that you're sharing the referer with the canonical URL. Another problem is that you're using someone else's bandwidth. And if you combine the two, you can be sure that info about your visitors will be sold, which is why quite a lot of people would prefer to host their own versions of jQuery...

kevincox · on April 22, 2016

For the referrer problem you can apply a referrer policy to prevent this but unfortunately the referrer policy isn't very granular.

Also for my sites I have a fallback to a local copy of the script. This allows me to do completely local development and remain up if the public CDN goes down (or gets compromised). With small (usually) performance impact.

lolive · on April 22, 2016

Couldn't the HTTP cache of your ISP be a good in-betweener, in that case? Would he send the referer to the canonical URL?

mdavidn · on April 22, 2016

Not for sites using TLS. The only option for secure sites would be a CDN. That is, an HTTP cache in a relationship with the content publisher rather than the subscriber.

dkopi · on April 22, 2016

The problem with hosting JS libraries on CDNs is that the cache has a network effect.

You only gain performance if the browser already has a cached version of this specific version on this specific CDN. If you don't - you end up losing performance, because now an additional DNS lookup needs to be performed, and an additional TCP connection needs to be opened.

Here are a few reasons people choose to avoid CDNized versions of JS libraries. http://www.sitepoint.com/7-reasons-not-to-use-a-cdn/

This is a 6 year old post, but it raises some valid concerns: https://zoompf.com/blog/2010/01/should-you-use-javascript-li...

bryanrasmussen · on April 22, 2016

the reason why I prefer to use a cdn is because it is a games theory example come to life here - if everyone used the cdn version then any user coming to your site would most likely have the cdn version in their cache and thus performance would go up, but if you use the cdn version and your competitors don't their performance is slightly better than yours and so on and so forth. Games theory indicates that in most games of this sort cooperation is better than non-cooperation.

And really if you are using one of the major libraries and a major CDN (Google, JQuery, etc.) over time your users will end up having the stuff in the cache, either from you or from others having used the same library version and cdn.

I suppose someone has done a study on CDN spreading of libraries and CDNS among users, so that you could figure out what the chance is that a user coming to your site will have a specific library cached - there's this http://www.stevesouders.com/blog/2013/03/18/http-archive-jqu... but it is 3 years ago, really this information would need to be maintained at least annually to tell you what your top cdn would be for a library.

tracker1 · on April 22, 2016

but there isn't 1 CDN, or 1 version... if you need two libraries, but the canonical CDN for jquery is on one, and your required extension is on another... that's two DNS lookups, connections, request cycles, etc.

So you use the one that has both, but one is not canonical, which means more cache misses. That doesn't even count the fact that there are different versions of each library, each with it's own uses, and distribution, and the common CDN approach becomes far less valuable.

In the end, you're better off compositing micro-frameworks and building yourself. Though this takes effort... React + Redux with max compression in a simple webpack project seems to take about 65K for me, before actually adding much to the project. Which isn't bad at all... if I can keep the rest of the project under 250K, that's less than the CSS + webfonts. It's still half a mb though... just the same, it's way better than a lot of sites manage, even with CDNs

bryanrasmussen · on April 23, 2016

that's 2 dns lookups on any user that hasn't already done that somewhere in the past and had it cached.

The question then is how likely are they to have done that in regards to your particular cdn and version of the library.

I agree that a lot of possible cdns, versions and so forth decreases the value of the common CDN approach, but there are at least some libraries that have a canonical CDN (JQuery for example) and not using that is essentially being the selfish player in a games theory style game.

Since I don't know of any long running tracking of CDN usage that allows you to predict how many people who visit your site are likely to have a popular library in their cache it's really difficult to talk about it meaningfully (I know there are one-off evaluations done in one point in time but that's not really helpful).

Anyway it's my belief that widespread refusal to use CDN versions of popular libraries is of course beneficial in the short run for the individual site but detrimental in the long run for a large number of sites.

extrapickles · on April 22, 2016

Latency of a new request as mentioned in one of those articles is the main reason why I self host everything.

Since HTTPS needs an extra round trip to startup, it's now even more important to not CDN your libraries. The average bandwidth of a user is only going to go up, and their connection latency will remain the same.

If you are making a SaaS product that business want, using CDNs also make it hard to offer a enterprise on-site version as they want the software to have no external dependencies.

cortesoft · on April 22, 2016

This might make sense, if all of your users are located near your web servers and you can comfortably handle the load of all the requests hitting your web servers.

If the user making the request is in Australia, for example, and your web server is in the US, the user is going to be able to complete many round trip requests to the local CDN pop in Australia in the time it takes to make a single request to your server in the US.

Latency is one of the main reasons TO use a CDN. A CDN's entire business model depends on making sure they have reliable and low latency connections to end users. They peer with multiple providers in multiple regions, to make sure links aren't congested and requests are routed efficiently.

Unless you are going to run datacenters all around the world, you aren't going to beat a CDN in latency.

extrapickles · on April 22, 2016

If the only thing you have on the CDN is libraries, it's faster to have your site host them even if it's on the other side of the world. When HTTP2 file push is widely supported, it becomes even more in favor of hosting locally, as you can start sending your libraries right after you are done sending the initial page without waiting for the browser to request them.

If you are using a CDN for images/video, then yes, you would have savings from using a CDN since your users will have to nail up a connection to your CDN anyways.

Then again a fair number of the users for the site I'm currently working on have high latency connections (800ms+), so it might be distorting my view somewhat.

detaro · on April 22, 2016

Or you use a CDN in front of your site, caching the content under your domain. But certainly something to be aware of.

kevincox · on April 22, 2016

This is why I recommended using the CDN recommended by the project, most do recommend a CDN, for example jQuery has it's own CDN.

As for adoption, that is very much a chicken and egg problem.

tracker1 · on April 22, 2016

Even then, different versions will have their own misses... not to mention 3rd party libraries on another CDN means another DNS hit.

DNS resolution time is a pretty significant impact for a lot of sites.

brianjackson · on April 23, 2016

Ya I would have to agree with you tracker. Ever 3rd party dependency is introducing another DNS lookup. The whole point behind using a CDN effectively, besides lowering latency, is to reduce your DNS lookups to a bare minimum. For example, I use https://www.keycdn. They support HTTP/2 and HPACK compression along with Huffman encoding which reduce the size of your headers.

The benefits of hosting say Google Fonts, Font Awesome, jQuery, etc. all with KeyCDN is that I can take better advantage of parallelism if I have one single HTTP/2 connection. Not to mention I have full control over my assets to implement caching (cache-control), expire headers, etags, easier purging, and the ability to host my own scripts.

newjersey · on April 22, 2016

What if the checksum was the same and you accepted the cache hit if the checksum agrees and get your own copy if it doesn't? Maybe the application should get to declare a canonical URL for the js file instead of the browser? So something like

Would this cause more problems than it would solve? I'm assuming disk access is faster than network access.

I'm concerned about people like me who use noscript selectively. How easy is it to create a malicious file that matches the checksum of a known file?

eertami · on April 22, 2016

>How easy is it to create a malicious file that matches the checksum of a known file?

I'd say not easy at all, practically impossible.

https://en.wikipedia.org/wiki/Preimage_attack

AnkhMorporkian · on April 22, 2016

> I'm concerned about people like me who use noscript selectively. How easy is it to create a malicious file that matches the checksum of a known file?

SHA-256? Very, very, very, very hard. I don't believe there are any known attacks for collisions for SHA-256.

jval43 · on April 22, 2016

I think even a collision (any collision) has yet to found.

dinkumthinkum · on April 23, 2016

People make too big a deal of this collision stuff, a lot of times these are very theoretical would require tremendous computation. Anyway, for this use case, even md5, how likely really to make a useful malicious that file collides with a particular known and widely used one? I dunno seems pretty unlikely.

dexterdog · on April 22, 2016

And if you worry about that you can always use 384. Plus a side benefit is that 384 is faster on a 64-bit processor.

kevincox · on April 22, 2016

It would be interesting if browsers start implementing a content-addressable cache. So as well as caching resource by URI also cache by hash. Then SRI requests could be served even if the URL was different.

Of course this would need a proposal or something but it would be interesting to consider.

SixSigma · on April 23, 2016

Plan9's Venti file storage system is content addressable.

http://plan9.bell-labs.com/sys/doc/venti/venti.html

Also available on *nix

jMyles · on April 22, 2016

> How easy is it to create a malicious file that matches the checksum of a known file?

As others have pointed out, it's quite difficult. But here's another way to think about it: if hash collisions become easy in popular libraries, the whole internet will be broken and nobody will be thinking about this particular exploit.

Servers won't be able to reliably update. Keys won't be able to be checked against fingerprints. Trivial hash collisions will be chaos. Fortunately, we seem to have hit a stride of fairly sound hash methods in terms of collision freedom.

llis · on April 22, 2016

I think this vaguely reminds me of the Content Centric Networking developed by PARC. There's 1.0 implementation of a protocol on github (https://github.com/PARC/CCNx_Distillery). A CCNx enabled browser could potentially get the script from a CCN by referring to it's signature alone (it being a sha-256 checksum or otherwise).

wallacoloo · on April 22, 2016

This seems a little redundant - why not just

    <script src="jQuery-1.12.2.min.js" sha-256="31be012d5df7152ae6495decff603040b3cfb949f1d5cf0bf5498e9fc117d546"></script>

? If you wanted to explicitly fetch from google if the client doesn't have a cached copy, then instead do

    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.12.2/jquery.min.js" sha-256="31be012d5df7152ae6495decff603040b3cfb949f1d5cf0bf5498e9fc117d546"></script>

The first would seem preferable though, as loading from an external source would expose the user to cross-site tracking.

newjersey · on April 22, 2016

> The first would seem preferable though, as loading from an external source would expose the user to cross-site tracking.

You're right in that the first one you had with just the sha-256 would be pretty much equivalent as what I had especially given that hn readers have resoundingly given support to the idea that it is non-trivial to create a malicious file with the same hash as our script file. I was simply trying to be cautious and retain some control for the web application (even if the extra sense of security is misplaced).

This is the use case I'm trying to protect by adding a new "canonical" reference that the web application decides. As others in this thread have said, it is very unlikely that someone will be able to craft a malicious script with the same hash as what I already have. The reason I still stand by including both is firstly compatibility (I hope browsers can simply ignore the sha-256 hash and the authorized cache links if they don't know what to do with it).

As a noscript user, I do not want to trust y0l0swagg3r cdn (just giving an example, please forgive me if this is your company name). NoScript blocks everything other than a select whitelist. If the CDN happens to be blocked, my website should still continue to function loading the script from my server.

My motivation here was to allow perhaps even smaller companies to sort of pool their common files into their own cdn? <script src="jimaca.js" authoritative-cache-provider="https://cdn.jimacajs.example.com/v1/12/34/jimaca.js""></scri... I also want to avoid a situation where Microsoft can come to me and tell me that I can't name my js files microsoft.js or something. The chances of an accidental collision are apparently very close to zero so I agree with you that there is room for improvement. (:

This is definitely not an RFC or anything formal. I am just a student and in no position to actually effect any change or even make a formal proposal.

15155 · on April 23, 2016

If accompanied by the exact same sha-256 hash idea, loading from any external source cannot expose the user to any additional risk.

SHA + CDN url list (for whitelisting/reliability purposes - public/trusted, and then private for reliability) would be ideal.

JoshTriplett · on April 22, 2016

> The problem is that almost everyone hosts their own version of jQuery.

Any site that expects users to trust it with sensitive data should not be pulling in any off-site JavaScript.

As for checksumming, browser vendors don't need to pre-load popular JavaScript libraries (though they might choose to do so, especially if they already ship those libraries for use by their extensions and UI). But checksum-based caching means the cost still only gets paid by the first site, after which the browser has the file cached no matter who references it.

sametmax · on April 22, 2016

Yes but no.

    - jquery.com becomes a central point of failure and attack;
    - jquery gets to be the biggest tracker of all time;
    - cache does not stay forever. Actually, with pages taking 3Mo everytime they load, after 100 click (not much), I invalidated 300 Mo of cache. If firefox allow 2go of cache (it's lot for only one app), then an the end of the day, all my cache has been busted.

Lawtonfogle · on April 22, 2016

> If everyone simply linked the "canonical" version of jQuery (the CDN link is right on their site) then requiring jQuery will be effectively free because it will be in everyone's cache.

So create one massive target that needs to be breached to access massive numbers of websites around the world?

Imagine if every Windows PC ran code from a single web page on startup every time they started up. Now imagine if anything could be put in that code and it would be ran. How big of a target would that be?

While there are cases where the performance is worth using a CDN, there are plenty of reasons to not want to run foreign code.

(Now maybe we could add some security, like generating a hash of the code on the CDN and matching it with a value provided by the website and only running the code if the hashes matched. But there are still business risks even with that.)

eertami · on April 22, 2016

The solution to this is including a checksum with the link to the file, and if the checksum doesn't match, don't load the file.

See https://developer.mozilla.org/en-US/docs/Web/Security/Subres... though it isn't universally supported yet.

jonlucc · on April 22, 2016

Just so I understand, I pull the file and make a checksum, then hardcode it into the link to the resource in my own code? Then, when the client pulls my code, follows the link, checks the checksum against the one I included in the link.

dexterdog · on April 22, 2016

Yes. It's very simple. I don't know why more library providers don't have it in their copyable <script> snippets.

hyperpape · on April 22, 2016

I agree with you that putting JQuery in the browser is a bad idea, but the second part of the argument, that the browser will have the libraries in cache is not really that reliable. Here's some notes I collected on the subject: http://justinblank.com/notebooks/BrowserCacheEffectiveness.h....

kevincox · on April 22, 2016

I was definitely speaking whimsically because it is a huge chicken and egg problem. But theoretically if every site that used jQuery referenced a version at https://code.jquery.com/ there would be a very good hit ratio, even considering different versions. However we are a very, very far way away from that.

throwaway_1011 · on April 22, 2016

That may seem like a good solution for some sites, but the name of the page and the requestor's IP address and other information is also 'leaked' to jquery.com. This is not always welcomed. For example, a company has an acquisition tracking site (or other legal-related site) and the name of the targets are part of the page name (goobler, foxulus, etc.) which get sent as the referrer page and IP address to jquery.com or other third party sites/CDNs. While not a security threat, you may unwittingly be recommending an unwanted information leak.

dexterdog · on April 22, 2016

But it's only leaked when you don't have it cached. Otherwise the client doesn't even make the request.

j3097736 · on April 22, 2016

There's the firefox addon Decentraleyes https://github.com/Synzvato/decentraleyes which is trying to solve those problems, currently only the most common versions of each library are bundled, but there are plans to create more flexible bundles.

There's no reason to hit the webserver with an If-modified when the libraries already include their version in the path.

Sanddancer · on April 23, 2016

If everyone linked the "canonical" version of jquery, then that location would be worth a considerable amount of money to any sort of malicious actor. Just getting your version there for a few minutes could rope in many thousands of machines. Look at China's attack on github earlier in the year for examples of the damage that that sort of decision could do.

rmc · on April 22, 2016

> if pretty much every website uses jQuery, perhaps we should ship jQuery with the web browser?

There's a firefox plugin that does just that! DecentralEyes - https://addons.mozilla.org/en-US/firefox/addon/decentraleyes...

Although mostly for privacy reasons, so google doesn't know all the sites you visit.

douche · on April 22, 2016

Hmmmm, or javascript could be standardized on something that isn't a barely-working kludge with no batteries included? If something like jQuery or Lodash is basically table-stakes for doing anything really worthwhile, then just maybe that de facto standard should be made de jure.

JustSomeNobody · on April 22, 2016

We absolutely, most certainly, should NOT ship libraries with browsers. This is a horrible idea. Browser vendors would then have too much influence on development libraries.

delan · on April 22, 2016

“We absolutely, most certainly, should NOT ship libraries with operating systems.”

ArkyBeagle · on April 22, 2016

There used to be a distinction between browsers and operating systems.

JustSomeNobody · on April 22, 2016

Pretty sure there still is.

mindcrime · on April 22, 2016

It's getting pretty hard to tell.

JustSomeNobody · on April 22, 2016

All the major browsers ship with JavaScript.

If JS, absolutely needed JQuery in order to, say, select an element the way C needs a library to output to the console, then sure, you may have an argument here.

But, no.

limelight · on April 23, 2016

And why do you think web development has become the dominant form of development?

pikzen · on April 23, 2016

Because literally any child can start developing, and then grows up and pulls every library under the sun because the language they use is a piece of crap, without a single thought about the consequences?

Thankfully node.js is bringing this clusterfuck to the desktop too.

dinkumthinkum · on April 23, 2016

It's more about deployment and dependency problems.

The dependencies we are talking about are transparent to the user, save only for some download performance issues which are pretty minimal, if not overstated in many cases, compared to the former issues faced by native applications.

notahacker · on April 22, 2016

I think the direction of the influence would mostly works the other way round. If there's an attribute on the HTML link tag that some browsers recognise as calling the browser's client-side version of that library (with local/CDN fallback for browsers that don't) then users are going to notice that browser X seems to load a lot of websites more quickly than browser Y, which puts pressure on the vendor of browser Y to make sure it also supports these frequently-used js libraries (especially if they're large frequently-used libraries with multiple minor version numbers and no widely-accepted-as-canonical CDN source)

This is still a shame if you're maintaining a heavyweight alternative to jQuery or React which is far too obscure to be a consideration for browser vendors, but it's a big boon for users, especially users on slow or metered connections that download several different version numbers of jQuery from several different CDNs every day.

dinkumthinkum · on April 23, 2016

I'm not sure that is true. No one is saying external libraries cannot be loaded. It is actually a good idea to be able to rely on certain popular libraries already on the client; this makes a lot of sense. You can still have the existing mechanism so I don't think browser vendor are going to hold everyone hostage.

If you like, there could be mutually agreed upon standard repository that browser vendors routinely update from.

Sure, your less popular experimental library won't be in the list, that's what "<script src=..." is for.

It probably won't happen but it is hard to defend the position that it is a bad idea, I think.

mod · on April 22, 2016

It could be a 'use the default or link your own' situation, which then wouldn't influence a whole lot.

But I agree, overall.

LeoNatan25 · on April 22, 2016

In this case, very very little would change. People would still link their own, as they do now.

JustSomeNobody · on April 22, 2016

Or worse, people would use the buggy, out of date default.

Xophmeister · on April 22, 2016

Common libraries and frameworks are usually loaded from a CDN. These should stay in the browser's cache for some time.

d33 · on April 22, 2016

Yeah, but per domain. Unless you use something like cjdns, if you end up with static.yourlovelydomain.com, the resource's going to get downloaded from scratch at the first attempt. If we had checksums attached to resources, there'd be provably no point bothering to download them in many cases.

tehbeard · on April 22, 2016

The problem is the quantity of CDNs on which jQuery for example can be found.

For instance, your site might pull it in from cdnjs, while I have the one from ajax.googleapis.com already cached. I still have to go fetch your selected version.

binarymax · on April 22, 2016

Only if everyone is using the same CDNs :)

lostgame · on April 22, 2016

Isn't this what the '.min' versions of these is for?

jquery-2.2.3.min.js is only 86KB for me. For the amount of functionality it adds, sure seems like a sweet deal.

Part of the problem (IMHO) is the growing requirement for larger, alpha-channel-using image formats like PNG, that then need to be expressed in a Retina format - I mean, shit looks terrible on my retina MacBook Pro that isn't properly Retina formatted. (Here's looking at you, Unity3D...even the current beta which supports Retina displays is extremely unstable and half the graphics (even stupid buttons on the UI) are still SD...)

With bandwidth costs declining [1] and basic RAM increasing [2] is there a particular reason a Web application should be much smaller than a typical desktop application? We have caches for a reason.

[1] http://www.networkworld.com/article/2187538/tech-primers/exp...

[2] http://en.yibada.com/articles/118394/20160422/macbook-air-is...

achairapart · on April 22, 2016

Why only jQuery, anyway?

What if browsers shipped with a repository instead of a cache? Download once, stay forever.

illumen · on April 22, 2016

There are checksums available for resources now. As of last few months. Also things like jQuery are available from multiple CDN, and they are used by many of the top websites and html frameworks. So you probably already have it in cache.

yxlx · on April 22, 2016

Penalize any historically slow loading (for some arbitrary cut-off value, say 600 ms) site client side with an additional 10 seconds of waiting before any request is sent. This way users will hit the back button well before any data is sent to the server and the site operators will see traffic plummet.

If any browser vendor implemented my proposal, their users would switch to another browser. If it was an addon, only a handful of people would use it so there would be no impact. If it was done on the network level of a corporate business, users would complain. Still, one can dream :)

dayjah · on April 22, 2016

This whole thread is really interesting but I think it misses a key point: developing countries. The issue with these pages getting larger is that it creates two webs: one that work on the slower networks of less developed countries and another that works for the richer world.

It's often suggested that we'll solve poverty by providing education and given the internet provides a unique opportunity to learn then surely we benefit as a species by ensuring there isn't a rich web and a poor web? This risk is compounded by rich people being in a better position to create better learning opportunities.

Much like there's a drive toward being aware of the accessibility of your website (colour blind-check tools, facebooks image analysis for alt text tags, etc, etc) we should be thinking about delivery into slower networks in poorer countries.

mtgx · on April 22, 2016

I also like how Opera now allows you to benchmark websites with and without ads. I think other browsers should expand on that idea (benchmarking sites, and not just for ads).

Give some kind of reminder to both users and developers about how slow their sites are. Those with the slowest websites probably won't like it too much initially, but it's going to be better for all of us in the long term.

http://www.opera.com/blogs/desktop/2016/03/native-ad-blockin...

wnevets · on April 22, 2016

Google is already starting to base rankings on page speed, thats enough to get companies to spend money on page speed.

mirimir · on April 22, 2016

Yes, this would for sure be valuable.

Also, browsers could default to http://downforeveryoneorjustme.com/ for sites that loaded too slowly. AdBlock Plus and NoScript speed loads greatly. Maybe browsers could do triage on sites that load too slowly. Perhaps switch to reader mode or whatever.

alexpersian · on April 22, 2016

Something that would be more effective is if Google began factoring page load speeds in their rankings just like they did for mobile optimization.

PretzelFisch · on April 23, 2016

Why would that matter? If the website has the content I need it doesn't matter that this other site that doesn't match my query as well but loads faster is now the first result.

IMTDb · on April 22, 2016

They do

hippich · on April 23, 2016

i think they already do it.

vmarsy · on April 22, 2016

the Disconnect Me (https://disconnect.me) add-on is doing something sort of similar. It will try blocking the useless external content (tracking, etc.) and will show you how much your loading time decreased and how many KB you saved by not loading those.

niutech · on April 24, 2016

In order to load much lighter web pages, just use Opera Mini or http://gdriv.es/l/ [append url]

stavros · on April 22, 2016

People have an innate metric for "this thing is slow", it's when lots of time passes between its start and end. Why is color-coding more evident than our own perception of time?

jasode · on April 22, 2016

I think his key phrase is "expected to load," which means the color coding would provide an indicator before the user clicks on the links. Therefore, the user skips the unnecessary pain of perceiving the slowness for himself.

jessriedel · on April 22, 2016

Yes exactly.

fudged71 · on April 22, 2016

The problem is the same as "Downforeveryoneorjustme". Just because a site is loading slow once or twice doesn't necessarily mean that it's the site's fault, it could be any number of issues. But by crowdsourcing this data the truth is revealed.

ma2rten · on April 22, 2016

I thought Google already punishes slow loading websites.

jessriedel · on April 22, 2016

When Google does this they are explicitly making a trade off between relevance and speed on behalf of the user. I'd rather leave that choice to the user, especially when the user can infer that this is the correct destination from the search result text.

Also, it doesn't work other places.

limelight · on April 23, 2016

You don't even need to do a complicated extension mechanism.

Users already respond to page load time. There's extensive evidence to support this.

robotnoises · on April 22, 2016

Before everyone jumps onto the JQuery/Bootstrap/etc sucks bandwagon, just a reminder that the minified jquery from cdnjs is 84.1kb. Bootstrap is 43.1kb.

If you want your page to load fast, the overall "size" of the page shouldn't be at the top of your list of concerns. Try reducing the # of requests, first. Combine and minify your javascript, use image sprites, etc.

r1ch · on April 22, 2016

That's still 84 KB of highly compressed javascript code to parse and execute. Even on a 4.4 GHz CPU core and a cached request, jquery takes upwards of 22ms to parse - that's just parsing - not even executing anything! Now add a bunch of other frameworks and utility scripts and your 100ms budget for feeling "fast" is gone before you even reach the HTML content.

dclowd9901 · on April 22, 2016

How long does it take you to start Word versus loading Google docs?

Has everyone gone insane? The web is absolutely incredible and JavaScript is absolutely killing it as a portable application language. Remember the dark days of flash? Stop complaining.

supergauntlet · on April 22, 2016

This is disingenuous - he's obviously referring to image and text articles, not complicated web apps. Obviously such a website is going to take longer to load, just as complex native applications take a while to load. But to claim that because Word takes a while that it's okay that A WEBSITE WITH LITERALLY JUST TEXT AND IMAGES is 18 megs big and takes 6000 ms to load is fucking stupid.

TeMPOraL · on April 23, 2016

Also, Word takes a while to load, but then works fast and is responsive - as well-written native apps are by default - and doesn't suck up your system resources. Something which can't be said about many websites "with literally just text and images".

Skinney · on April 23, 2016

You're comparing a well-written native app with a poorly written website. I'm currently working on a web-app that requires an about 600kb (minified, yes I know) app.js file.

It takes a while to load (not really, 100ms + download), but after that it is silky smooth due to client side rendering (faster than downloading more html) and caching.

On the most demanding page, heap usage is a little more than 20mb.

Sure, there are a lot of websites which are slow and huge memory hogs. But that goes for many native apps as well.

dclowd9901 · on April 23, 2016

Come on, you know that's the exception and not the rule. No one should use a revenue generating site as a basis for how the web works. Unless they want to compare it to a similar native product.

Sanddancer · on April 23, 2016

Yes, I remember flash, when it was trivial to save things I found with a couple of clicks. The typical webapp is ephemeral -- no way to save it, no way to run it a few years from now after the site it was hosted on goes bye-bye. The typical webapp is terrible at interoperability -- getting it to work with native libraries is pretty much impossible unless you've written your native code in one of the few languages that have transpilers available. The typical webapp is terrible at multithreading -- webworkers are a horrible hack, with no way to send/receive objects between worker threads other than routing everything via the main thread. When you start wanting to do anything interesting, the web's a set of blinders and shackles, keeping you from using resources.

sievebrain · on April 23, 2016

Um, Word has pretty much always started in a second or two even on computers from the year 2000. You're picking on the wrong app there: Microsoft was notorious for caring about startup time of their Office suite and going to tremendous lengths to improve it.

Meanwhile, how fast Google Docs loads depends entirely on the speed of my internet connection at the time. Good luck even opening it at all if your connection is crappy, flaky, if any of the ISPs between you and Google have congestion issues, or if there's a transient latency problem in one of the dozens of server pools that makes up an app like Docs.

swiley · on April 23, 2016

Vim/LaTeX is way faster though. And there's no way to make that in any clean way with the web.

robotnoises · on April 22, 2016

That's a fair point, though my threshold for fast is probably a bit higher than 100ms. That might be a conditioned thing, though.

adrianN · on April 22, 2016

It's not much higher than 100ms

https://www.nngroup.com/articles/response-times-3-important-...

spiderfarmer · on April 22, 2016

I see lots of websites that don't even cache static resources and use > 10 scripts and stylesheets, from a lot of different domains. Terrible.

eterm · on April 22, 2016

And link the non-minified versions.

(but the different domains actually helps speed up loading with http because it helps parallelise transfers.)

spiderfarmer · on April 22, 2016

Domain sharding almost never makes any sense, especially on mobile, and with HTTP2 it's working against you.

http://www.mobify.com/blog/domain-sharding-bad-news-mobile-p...

madawan · on April 22, 2016

or use HTTP2 which should make the number of requests largely irrelevant.

acdha · on April 22, 2016

HTTP/2 helps with that but the total size still matters. This is particularly relevant for resources like CSS which block rendering – even with HTTP/2 making it less important whether that's one big resource or a dozen small ones, the page won't render until it's all been transferred.

https://github.com/filamentgroup/loadCSS#recommended-usage-p... has a rather nice way to load CSS asynchronously in browsers which support rel=preload.

tedmiston · on April 22, 2016

Not to mention that using the latest stable jQuery/Bootstrap from CDN means it's likely to be cached before a user visits your site.

K0nserv · on April 22, 2016

Quite happy with my own web page/blog. Pages hover at around 10kb, 30kb if I include some images. I think the page size can be attributed a lot to there being no JS except for GA.

I have taken a lot of inspiration from http://motherfuckingwebsite.com/ and http://bettermotherfuckingwebsite.com/

Of course the size will differ depending on the site's purpose, but I feel like most web pages could stand to loose a lot of weight.

EDIT: I have a guide to setup a similar blog/site here[0]

0: https://hugotunius.se/2016/01/10/the-one-cent-blog.html

nickpsecurity · on April 22, 2016

That loaded SOOO fast on mobile with no delays or scrolling problems. Great job!

Note: Might try it with NoScript later. Always a good indicator of... something...

K0nserv · on April 22, 2016

Fun story about NoScript. I used to use AMP[0]. But I reverted that for two reasons.

1. The size of their js is ~170kb roughly 17 times larger than most of my pages.

2. It loads from a 3d party domain which means that NoScript/uBlock Origin users will see a blank white page.

The project does impose some constraints that will help reduce page weight and increase page speed, but for my page sizes it's ridiculous to use their JS.

0: https://www.ampproject.org/

nickpsecurity · on April 22, 2016

The site certainly loaded fast. Never heard of them before. Thanks for the link.