The average size of Web pages is now the average size of a Doom install (mobiforge.com)
857 points by Ovid on April 22, 2016 | 449 comments



I'm skeptical that developers talking to each other about how bad web bloat is will change anything. They will still face the same incentives in terms of ad revenue, costs of optimization, etc.

Here's a random idea that might have more potential: create an adblocker browser plugin that also colors URLs based on how slow they are expected to load, e.g., smoothly from blue to red. The scores could be centrally calculated for the top N URLs on the web (or perhaps, an estimate based on the top M domain names and other signals) and downloaded to the client (so no privacy issues). People will very quickly learn to associate red URLs with the feeling "ugh, this page is taking forever". So long as the metric was reasonably robust to gaming, websites would face a greater pressure to cut the bloat. And yet, it's still ultimately feedback determined by a user's revealed preferences, based on what they think is worth waiting how long for, rather than a developer's guess about what's reasonable.
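To make the coloring idea concrete, here is a rough sketch of the client-side lookup. Everything specific in it is an assumption for illustration: the `EXPECTED_LOAD_SECONDS` table, the example domains, and the 0.5 s / 10 s thresholds are made up, and a real plugin would need a far more robust score than a single load time per domain.

```python
from urllib.parse import urlparse

# Hypothetical precomputed load times in seconds, keyed by domain.
# In the proposal these would be calculated centrally and shipped to the client.
EXPECTED_LOAD_SECONDS = {
    "fast.example": 0.4,
    "average.example": 3.0,
    "bloated.example": 12.0,
}

FAST_SECONDS = 0.5   # assumed: at or below this, the link is fully blue
SLOW_SECONDS = 10.0  # assumed: at or above this, the link is fully red


def slowness(seconds: float) -> float:
    """Scale a load time to [0, 1], clamping outside the assumed range."""
    fraction = (seconds - FAST_SECONDS) / (SLOW_SECONDS - FAST_SECONDS)
    return min(1.0, max(0.0, fraction))


def color_for_url(url: str, default: str = "#808080") -> str:
    """Return a hex color from blue (fast) to red (slow), or grey if unknown."""
    host = urlparse(url).hostname
    if host is None or host not in EXPECTED_LOAD_SECONDS:
        return default
    score = slowness(EXPECTED_LOAD_SECONDS[host])
    red = round(255 * score)
    blue = round(255 * (1.0 - score))
    return f"#{red:02x}00{blue:02x}"
```

Because the table is downloaded once and looked up locally, the client never has to report which URLs it is about to visit, which is the privacy property the idea depends on.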


I think a plugin that enables the chrome "regular 2G" throttling (developer tools / network tab), permanently, for anyone that works in marketing or web development might help :)

Edit: Joking aside, that throttling feature is a nice easy way to let a dev team, or business counterpart, see what their site is like for, say, a customer with a low-end DSL connection: https://developers.google.com/web/tools/chrome-devtools/prof...


No need for a plugin if you are running Linux or OSX:

# Bandwidth throttling, enabling 15 KByte/s on port 80

sudo ipfw pipe 1 config bw 15KByte/s

sudo ipfw add 1 pipe 1 src-port 80

# Bandwidth throttling, disable

sudo ipfw delete 1

More useful stuff here as well, please star the repo :) https://github.com/logotype/useful-unix-stuff/blob/master/us...


It used to work; unfortunately, Apple removed ipfw from recent versions of OS X.

The new method uses dummynet and pf but isn't reliable and I've never got it to work consistently, despite trying for hours and hours.

The only method that works reliably on recent versions of OS X is the free Network Link Conditioner. It is absolutely bulletproof.

Edited to add: Network Link Conditioner seems to use pf and dummynet under the hood; you can see the rules appear. But there's an interaction with the nlcd daemon that I don't understand yet. I want to do protocol-specific bandwidth throttling and I've not got that to work with nlcd interfering. But if you can live with throttling all traffic on the box, NLC works a treat.


It's not a plugin, it's built into chrome. His joke was about a plugin that enables it permanently for certain people.


Not to mention my account managers and BDMs don't even know what a linux is.


Linux? But he was talking about OS X...


sounds like that could have a lot of unintended consequences. especially since so much data is now transmitted over port 80


I have to think that the devs want to make the page lighter, but their managers keep pushing more tracking and ad networks at them that they're forced to integrate.


Yeah, but I've seen projects using bootstrap, themes, extensions, and extras... in addition to jQueryUI & Mobile all loaded... as well as 2-3 versions of jQuery in different script tags.

That doesn't reflect responsible, light-minded development. Hell, load a couple of airline website home pages... I don't think any of them are loading under 400kb of JS, and that's for the homepage alone. Let alone the number of individual assets being requested (less of an issue once HTTP/2 takes hold... but still).
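For anyone who wants to check their own pages for the duplicate-jQuery problem, a quick audit can be sketched with nothing but the standard library. This is only a heuristic: it assumes the version number appears in the script filename (e.g. `jquery-1.11.3.min.js`), so it will miss unversioned or bundled copies, and the `JQUERY_CORE` pattern and function names are my own, not from any tool.

```python
import re
from html.parser import HTMLParser

# Assumed filename convention: jquery-1.11.3.min.js, jquery.2.2.0.js, and so on.
# Requiring a digit right after the separator skips jquery-ui and jquery.mobile.
JQUERY_CORE = re.compile(r"jquery[-.](\d+(?:\.\d+)+)(?:\.min)?\.js", re.IGNORECASE)


class ScriptAudit(HTMLParser):
    """Collect the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def jquery_versions(html: str) -> list[str]:
    """Return the distinct jQuery core versions referenced by script tags."""
    audit = ScriptAudit()
    audit.feed(html)
    versions = set()
    for src in audit.sources:
        match = JQUERY_CORE.search(src)
        if match:
            versions.add(match.group(1))
    return sorted(versions)
```

If `jquery_versions(page_html)` returns more than one entry, the page is shipping the library several times over.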


This. I have my own site slim and lean even with javascript (analytics) in there and it loads in 300-500ms. I developed a site for a client that was slim and lean and then it started. Sliders, full page background images, photos of random people, 3x analytics, this and that. It became a mammoth of 2-3mb and loading times of 1.5-3s.

When we tracked conversion the best converting page on the site was one made specifically for that and it is 100kb and loaded instantly. Two images and lots of text. They still insist on slow, beautiful pages elsewhere instead of making them convert as well.
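To put rough numbers on that gap, here is a back-of-the-envelope calculation. It is a deliberately naive model, assuming raw size divided by bandwidth plus an optional per-request round trip, and it ignores compression, caching, and parallel connections. The 15 KByte/s figure is borrowed from the ipfw throttling example earlier in the thread; the 2500 KB figure is simply the midpoint of the 2-3mb range mentioned above.

```python
def transfer_seconds(
    page_kilobytes: float,
    bandwidth_kbytes_per_s: float,
    round_trip_s: float = 0.0,
    requests: int = 1,
) -> float:
    """Naive estimate: payload time plus one round trip per request."""
    if bandwidth_kbytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return page_kilobytes / bandwidth_kbytes_per_s + round_trip_s * requests


if __name__ == "__main__":
    # Page weights from the comment above; bandwidths are illustrative assumptions.
    pages = {"lean landing page": 100, "bloated page": 2500}
    links = {"throttled (15 KByte/s)": 15, "assumed 1 MByte/s link": 1000}
    for page_name, size_kb in pages.items():
        for link_name, rate in links.items():
            seconds = transfer_seconds(size_kb, rate)
            print(f"{page_name} over {link_name}: {seconds:.1f} s")
```

Even with this crude model the ratio is what matters: the heavy page takes 25 times as long on any link, and on a slow link that is the difference between seconds and minutes.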


I'm sure that's true of a certain class of developer. However, given what I've seen in the frontend & JS community in past years, I don't think there is a pervasive desire to make pages lighter. If there is, very few of them are showing that in the way they actually build things.


Especially, since many frameworks are really easy to use from source (Bootstrap comes to mind), but almost nobody does... they have the framework, the patched css, and then in the project, maybe there's scss/less etc. Instead of having a project that starts off composed with the appropriate pieces of bootstrap.

That's just bootstrap, not even the sheer number of jQuery UI bits floating around, and heaven forbid you see both in a project... and all the "bugs" that the input in one page doesn't match others. sigh


That's true in some cases, but there are lots of big company sites with no ad networks that are crazy bloated and slow. Some have trackers, but would be bloated (2,3,4MB+ over the wire) without them as well.


Doesn't Segment.io address such issues?


To me the bigger problem is the javascript bloat making it crawl on a mobile device with 1GB of RAM and a 1.3 GHz processor.


Sounds like the perfect addition to corporate provisioning policies/profiles for all their machines!


I think a more radical departure is needed. It's about time to acknowledge that the web is increasingly being used to access full-blown applications more often than "webpages."

Web browsers have more or less become mini operating systems, running elaborate virtual machines. There's way too much complexity for everyone involved — from web devs and browser devs to the users and the people who maintain the standards, then there's the devs who have to make native clients for the web apps — just to deliver products that don't have half the power of OS-native software. Everyone has to keep reinventing the wheel, like with WebAssembly, to fix problems that don't have to be there in the first place, not anymore:

Thanks to smartphones, people are already familiar with the modern concept of the standalone app; why not just make downloading an OS-native binary as easy as typing in a web address, on every OS?

Say if I press Cmd+Space on a Mac and type "Facebook" in Spotlight, it immediately begins downloading the native OS X Facebook app. The UI would be described in a format that can be incrementally downloaded, so the experience remains identical to browsing the web, except with full access to the OS's features, like an icon in the Dock and notifications and everything.

TL;DR: Instead of investing resources in browser development, Android/iOS/OSX/Windows should just work on a better standard mechanism to deliver native apps instead.


I disagree.

This is a backward-looking argument which ignores the unique benefits of the web which made it inevitable that it would evolve into an application platform, regardless of how tortured the results may feel.

The web is the first truly cross-platform development environment. It is not controlled by a single vendor, and anyone implementing a new computing device must support the web (just stop for a second and consider from a historical perspective what a monumental accomplishment that is). Furthermore, it allows casual access of content and applications without any installation requirement. It comes with a reasonable security model baked-in, which, while imperfect, gets far more attention than most OS vendor sandboxing schemes. Last but not least, the web's primitive is a simple page, which is far more useful than an app as a primitive—for every app someone installs they probably visit 100 web pages for information that they would never consider installing an app for.

I agree that the web is sort of abused as an application platform; the problem is there is no central planning method which will achieve its benefits in a more app-oriented fashion. No company has the power to create a standard for binary app deliverables that will have anywhere near the reach of the web. And even if one could consolidate the power and mastermind such a thing, I feel like it would run squarely into Gall's Law and have twice as many warts as the web.


"The web is the first truly cross-platform development environment."

No it isn't. Not even close. It's maybe the first cross-platform development "environment" of which millennials are widely aware. But it's only an "environment" in the most ecological sense -- it's a collection of ugly hacks, each building upon the other, with the sort of complexity and incomprehensibility and interdependency of organisms you'd expect to find in a dung heap.

"Last but not least, the web's primitive is a simple page, which is far more useful than an app as a primitive"

For whom, exactly? You're just begging the question.

I'll grant you that "installing" an app is more burdensome for users than browsing to a web page, but the amount of developer time spent (badly) shoe-horning UI development problems (that we solved in the 90s) into the "page" metaphor is mind-boggling. In retrospect, the Java applet approach seems like a missed opportunity.

The proper reaction to something like React, for example, should be shame, not pride. We've finally managed to kludge together something vaguely resembling the UI development platform we had in windows 3, but with less consistency, greater resource consumption, and at the expense of everything that made the web good in the first place. And for what reason? It's not as if these "pages" work as webpages anymore.

A proper "application development environment" for the web would be something that discards the page model entirely, and replaces it with a set of open components that resembles what we've had for decades in the world of desktop application development.


You are so right. The web as an application delivery platform sucks. And we are not even capable of producing UIs with the same level of polish I did in Visual Basic 2.0 in 1996 or so.

Alan Kay has expressed the same feeling.

PS to downvoters: if you have not used a proper interface designer such as Qt or Delphi, then you don't know what we mean. Please watch some videos to decide if the state of the art (Angular and React) is what we should be using in 2016.


The downvotes are because he is not engaging with my point. Never did I say the web is a proper application development environment. My point was that you can't create a proper application development environment that is both an open and de facto standard the way the web is.


How am I not engaging with your point? In response to a comment pointing out how we need to re-think web application development, you said:

"The web is the first truly cross-platform development environment."

...and then talked a bit about how it's open (yeah, ok, sure), and then you said it's not really a good application development environment (obviously).

I'm saying, your entire premise is wrong: it isn't an application development environment, any more than a box of legos is a "housing development environment". People have built houses out of legos, but that doesn't make "lego" a building material. It's a big, messy, nasty hack.

The fact that it's "open" is a non-sequitur response to "it's the wrong tool for the job", which is what the OP (and I, and elviejo) are arguing. It's also not a legitimate response to argue that any re-thinking of the model has to come from a company, or otherwise not be "open".

The reason that web apps happened is because web apps started as a hack. That doesn't mean we can't change the paradigm, but to do that, we have to stop defending the current model.

(Realistically, the reason I'm getting downvoted probably has more to do with my willingness to call out React as a pile of garbage than with the substance of the greater argument. C'est la vie...it's actually pretty amusing to watch the comment fluctuate between -3 and +3...)


> That doesn't mean we can't change the paradigm, but to do that, we have to stop defending the current model.

Here's the crux of our disagreement. You believe that the web is such a broken application platform that it is possible to convince enough vendors and people to get behind a better solution. However, I (despite your presumptuous implication that I'm a millennial) have been around long enough to know that will never happen. Web standards will continue iterating, and companies will continue building apps on the web; even the most powerful app platforms today, such as iOS and Android, for all their market power cannot stop this force. The reason is that it's a platform that works. The man-millennia behind the web cannot be reproduced and focused into a single organized effort. You might as well argue that we replace Linux with Plan 9; it doesn't matter how much passion you have and how sound your technical argument is, Linux, like the web, is entrenched. It's gone beyond the agency of individual humans and organizations to become an emergent effect.

That's not to say that the web might not someday be supplanted by something better, but it won't come because of angry engineers wringing their hands about how terrible the web is. It will come from something unexpected that solves a different problem, but in a much simpler and more elegant way, and over time it will be the thin edge of the wedge where it evolves and develops into a web killer.

Maybe I'm just cynical and lack vision, perhaps you can go start a movement to prove me wrong. I'll happily eat my hat and rejoice at your accomplishments when that time comes.


"You believe that the web is such a broken application platform that it is possible to convince enough vendors and people to get behind a better solution. However, I...have been around long enough to know that will never happen."

"That's not to say that the web might not someday be supplanted by something better..."

Whoever wrote the first paragraph of your comment should get in touch with the person who wrote the second paragraph.

OK, seriously, though, let's summarize:

1) Person says "web development sucks, here's why: $REASONS"

2) You reply: "it's the only truly cross-platform development environment ever"

3) I (and others) reply: "no, it really isn't. it isn't even a development environment, by any reasonable measure."

Now you're putting words in my mouth about convincing vendors and starting movements. I'm not trying to start a revolution here, just trying to counter the notion that we can't do any better than the pile of junk we've adopted. You don't have to love your captors!

I have no idea if someone will come up with a revolutionary, grand unified solution tomorrow, but I know that this process starts with the acknowledgement that what we have sucks, and that we have lots of examples of better solutions to work from. Hell...just having a well-defined set of 1995-era UI components defined as a standard would be a quantum leap forward in terms of application development.


The irony is I understand your qualitative opinion of the web, and I generally agree with it. What I believe makes you unable to see my argument is an inability to separate technical excellence from the market dynamics that govern adoption.

Declaring the web "not even a development environment" is just absolutist rhetoric that can in no way further the conversation. If you define "development environment" as a traditional GUI toolkit then you're just creating a tautology to satisfy your own outrage.


This is a great discussion. What is it about the English language that makes it so much easier to oppose someone than express nuances in general opinion? I would like to see more discussions like this based at implementation level, surely something valuable and innovative is being grasped at by both sides.


Let's be honest, the web is a developer environment in the same way that a paper aeroplane is a passenger plane. I mean, I'm sure it's possible to create a 747 from paper, but do you really want to?

Why is it that a new javascript framework pops up each week? It's because the web as a developer environment is deficient. Despite it being standardised, so much stuff doesn't work without kludges in each browser.


> It will come from something unexpected that solves a different problem, but in a much simpler and more elegant way, and over time it will be the thin edge of the wedge where it evolves and develops into a web killer.

So.. app stores?

It has already begun. The most popular webapps (Facebook, Twitter etc.) already have native clients in Android and iOS. I believe the majority of people already prefer and use the native FB/Twitter apps more often than accessing the FB/Twitter websites. So it's already obvious that native apps must be more convenient.

Right now however, app stores are a little clumsier to navigate compared to browsers.

For webapps:

• you have to open the browser,

• type in the address OR

• use a web search if you don't know the exact address.

But for apps:

• you have to open the app store,

• search for the app,

• potentially filter through unofficial third-party software,

• download the app, possibly after entering your credentials,

• navigate to the app icon,

• authorize any security permissions on startup (in the case of Android or badly-designed iOS apps.)

We just need the Big Three (Apple/Google/Microsoft) to actively acknowledge that app stores can supplant the-web-as-application-platform, and remove some of those hurdles.

Ideally an app store would be akin to searching for a website on Google.com (or duckduckgo.com) with a maximum of one extra click or tap between you and the app.

Apps should also be incrementally downloadable so they're immediately available for use just like a website, and Apple already has begun taking steps toward that with App Thinning.

Ultimately there's no reason why the OS and native apps shouldn't behave just like a web browser, because if web browsers keep advancing and evolving they WILL eventually become the OS, and the end result will be the same as what I'm suggesting anyway.

Currently though, both the native OS side and the web side exist in a state of neither-here-nor-there, considering how most people actually use their devices.


Not so much works, as they have found a way to monetize as a rent.


I'm in the middle of reading through The Unix-Haters Handbook and I must say that the arguments against web-as-application-platform (and the attempts to defend it) are eerily similar to what this book reports about arguments against Unix that were circulated in the 80s and early 90s. Sadly, the fact Unix managed to a) win, and b) fuck up the computing world so badly that people don't even realize how much we've lost doesn't make me hopeful about the future of the web.

Bad stuff seems to win because it's more evolutionarily adapted than well thought out stuff. This happens to hold for programming languages too.


Your takeaway of computers going from niche industry to the single largest driver of global economic activity is that the bad stuff won? What an incredibly myopic conclusion.

The recent cross-communication between JavaScript, Elm, and Clojure has been incredibly fruitful but hasn't been noticed by the bitter die-hards. And really, almost all of it could've happened literally 15 years ago with Lisp if the Lisp community hadn't been dismissive, arrogant douchebags that considered JavaScript a worthless toy language.

What's truly sad is that some people would rather be abstractly right while producing nothing of value than work with the dominant paradigm and introduce useful concepts to it.


> Your takeaway of computers going from niche industry to the single largest driver of global economic activity is that the bad stuff won? What an incredibly myopic conclusion.

This did not happen thanks to Unix; if anything, you'd probably have to be grateful to Microsoft and Apple for introducing OSes that were end-user-usable. There's a reason the "year of Linux on Desktop" never happened and is always one year from now.

The point of The Unix-Haters Handbook, which also applies very much to the modern web, is that the so-called "advancement" didn't really bring anything new. It reinvented old things - things we knew how to do right - but in a broken way, full of half-assed hacks that got fossilized because everything else depends on it.

> And really, almost all of it could've happened literally 15 years ago with Lisp if the Lisp community hadn't been dismissive, arrogant douchebags that considered JavaScript a worthless toy language.

I don't know where you're getting that from, but it's probably a good opportunity to remind you that JavaScript was supposed to be Scheme twice; both times it didn't happen because Netscape wanted a Java-looking solution right fucking now to compete first with Java, and then with Microsoft, and somehow no-one thought to pause for a moment and maybe do it right.

(Also don't blame Lisp community for the fact that companies reinvented half of Lisp in XML. Rather ask yourself why most programmers think the history of programming is a linear progression of power from Assembler and C, and why they remain ignorant of anything that happened before ~1985.)

JavaScript got a bad rep because a) it was terribly broken (less so now), and b) because of all the stupid stuff people were writing in it those 15 years ago. But the current problems of the Web are not really the fault of JavaScript, but of the community moving forward at the speed of typing, without stopping for a second and thinking if those layers on layers on layers of complexity are actually needed or useful. Simple landing pages are now built on frameworks that are more complex than what we used to call "Enterprise Edition" 10 years ago.


Unix isn't for end users, it's for developers to build on top of to give things to end users. Linux on desktop already happened years and years ago if you work for a tech company, and that's probably about as far as it needs to go.

http://steve-yegge.blogspot.com/2006/04/lisp-is-not-acceptab...

This is Steve Yegge's understanding of the Lisp community, and I should clarify that I don't think the XML monstrosities we all work with are "all their fault", but that, on the whole, the Lisp community and enterprise coders were mutually antagonistic.

Javascript after ES3 really wasn't broken at all, just most people coding in it didn't know how to take advantage of it. No language can prevent someone dedicated to bad code architecture from writing bad code. A Lisp programmer would've found a lot of comfortable features and powerful patterns and been able to share them, but most of their efforts were wasted denouncing everyone outside of their tiny sect. The end result was that most people learning JavaScript were taught how to code like it was broken Java because most of the resources were written by Enterprise Java devs who didn't understand what a fundamentally impoverished language Java is.


Thanks for the link to that post. I'd also advise reading through its comments though - some people there, especially Pascal Costanza, point out quite a lot of problems that basically reduce it to ranting of a person who doesn't understand the language and the culture he's writing about ;).

Also, the influx of enterprise patterns into JavaScript is quite a recent phenomenon - personally, I blame Google (who, for a reason I can't understand to this day, embraces enterprise-level Java as their primary platform for everything...), but regardless, the problem with JavaScript culture is mostly that of very fast growth coupled with lack of experience and (probably unwilling) ignorance of the past. Since this community basically controls the Internet, it's hard for voices expressing some restraint and thoughtfulness to get through the noise.

And I really do recommend The Unix-Haters Handbook. Funny thing is - over a decade ago, when I was acquainting myself with the Linux world (after many years of DOS and Windows experience), I kept noticing and complaining about various things that felt wrong or even asinine. Gradually I got convinced by people I considered smarter than me that those things are not bugs but features, they're how a Good Operating System works, etc. Only now I realize that my intuition back then was right, but I got Stockholm-syndromed to accept the insanity. Like most of the world. The sad thing is, there were better solutions in the past, which once again shows how IT is probably the only industry that's totally ignorant of its own history and constantly running in circles.


What part of UHH do you think is not obsolete? I dislike many aspects of Unix, but it seems like 90% of the book is either wrong, meaningless invective, or obsolete. It's hard to view it as of more than historical value -- which is a shame, because Unix is far from perfect.


To clarify, I'm about half of the way through the book, and I have found a couple arguably-correct points. Complaints about Usenet and Unix vendors have mostly gone the way of the dodo. The author consistently ignores any distinction between the filesystem and the rest of the OS, which may have been accurate at the time of writing, but hasn't been true for decades. Similarly, we don't distinguish between a given shell and Unix as a whole, even though the book makes explicit mention that other shells exist. And why there is a chapter on Usenet passes understanding. Suggesting that shell expansion be handled by an external library is equally bizarre. So far the valid points are:

• command input syntax is inconsistent. I don't know that this has a feasible solution, but it is true.

• tar specifically sucks

• sendmail configuration sucks

The real crime of UHH is that it merely hates, it does not instruct. When we do find valid criticisms, there is no suggestion for how to fix things, or how other OSes are better at the same role. I've resigned myself to read the entirety, but for all the authors' complaints about not learning anything from history, one can only feel like they have themselves to blame.


If you know C++/Perl/PHP but don't know ES6, websockets, Node, etc, would you not say that your opinion about the web being a bad environment for rich applications might be in some way colored by your personal economic interests?

A common defensive mechanism among people with outdated skills is to try to delegitimize new frameworks and technologies in the hopes of convincing the broader community not to use things they don't know.

I'm inferring this from your arguments being driven by analogies and insinuations rather than concrete critique. It's not my intention to attack you personally, but an aggressively dismissive attitude towards unfamiliar concepts should be properly contextualized.

As for React, isn't it more likely that you don't know React very well, have never looked at its internals, and in general don't feel like you have the time or ability to learn much about modern web development?

If you build a few projects with React and still dislike it, good! Your critiques will be a lot more valid and useful at that point, whereas right now...yeah.


I know C++, used to know Perl, and know JavaScript pretty well. The fact that you actually name WebSockets as a technology sort of sums up the issue we are discussing here. WebSockets is not something that can be compared to C++ or even Node. It's a dumb hack which justifies its existence primarily by allowing web apps to circumvent corporate firewall policies.

The web is a joke of an app platform. Those of us who have wider experience of different kinds of programming see some web devs struggling with this concept and conclude, I think quite reasonably, that the only plausible explanation is lack of experience. This is not due to "outdated skills" - I daresay everyone criticising the web as a platform in this thread has, in fact, written web apps. It's the opposite problem. It's to do with developers who haven't got the experience of older technologies having nothing to compare it to, so "web 2.2" gets compared to "web 2.0" and it looks like progress.

And in case you're tempted to dismiss me too, I recently tried to write an app using Polymer. It sucked. The entire experience was godawful from beginning to end. Luckily the app in question didn't have to be a web app, so I started over with a real widget toolkit and got results that were significantly better in half the time.


I want to disagree with you more here, but Polymer really does suck.

I would be interested in a detailed explanation of why websockets are a "dumb hack." Duplex streams much more closely map to what web apps actually need to do than dealing with an HTTP request-response cycle. In what way is streaming a hack and making requests that were originally designed to serve up new pages not a hack?


  > My point was that you can't create a proper application 
  > development environment that is both an open and defacto 
  > standard the way the web is.
Why? Is there a technical limitation? Or are you saying that it's technically possible, but that you have no faith that vendors will cooperate and support such a thing?


The latter.


>My point was that you can't create a proper application development environment that is both an open and defacto standard the way the web is.

In addition to the countless examples posted in this thread, I would argue that if it's nearly impossible to create your own implementation of a platform or standard from scratch, then it's not really open in a practical sense. Who cares if the specs are available if it takes dozens or hundreds of man-years to deliver a passable implementation?


"Who cares if you have the formula of a drug to cure cancer if it takes many man-years to produce it from said formula?"


The web makes Java look clean and elegant. Which is saying something. But it's an open question whether anything as popular as the web could have been any less crappy. CPU cycles and memory are way cheaper than social cooperation.


> No it isn't. Not even close. It's maybe the first cross-platform development "environment" of which millenials are widely aware.

Name one then instead of this patronizing BS.


Just off the top of my head: Delphi, Smalltalk, Java, Flash, Mono, and many different cross-library thunks like wxWindows, Qt, GTK+, and so on.

Two seconds of googling will find you dozens.


Those aren't cross-platform the way the web is cross-platform, they all depend primarily on one vendor's implementation. None of those ever had any hope of crossing the chasm to ubiquity, they only gained as much traction as their vendor had market clout on a limited set of platforms for a limited time period.


"Those aren't cross-platform the way the web is cross-platform, they all depend primarily on one vendor's implementation"

And web browsers are different how, exactly?

(Other than the fact that "the web" is a mish-mash of hundreds of different "standards" with varying levels of mutual compatibility and standardization, of course.)


Web browsers have standards they're supposed to meet. Where's the standard for qt? Where's the competing implementation? You can denigrate the web standards, even with some degree of truth, but you yourself have pointed out the difference. Perhaps instead of being dismissive you can elaborate how this difference is ultimately valueless despite its apparent success.


Both the JVM and the CLR have multiple implementations. Both JVM and CLR have a standard, as do Java and C#. The primary implementation for both is also open source. So no, they don't primarily depend on one vendor's implementation.


> But it's only an "environment" in the most ecological sense -- it's a collection of ugly hacks, each building upon the other, with the sort of complexity and incomprehensibility and interdependency of organisms you'd expect to find in a dung heap.

That's what every practical environment is. Only the environments which never get used remain "pristine" in the architectural sense, because you cannot fundamentally re-architect an ecology once you have actual users beyond the development team and a few toy test loads.


Given how fast JS world jumps between frameworks I'm thinking that yes, a strong enough actor could in fact re-architect the Web, or at least clear out some of that dung heap...


I had this discussion the other day with a junior engineer in regards to web assembly and the need for more stable compile targets than javascript.

I think a big part of the problem is that web developers have forgotten (or never learned) about a lot of the ui innovation that has already been done for native platform development.

I blame the page / html / dom model for this. It has forced generations of web developers to figure out clever (or not) workarounds to the point that they actually think they are innovating when they arrive at the point qt was at years ago.


This post is about web bloat, not about app platforms.


> Android/iOS/OSX/Windows should just work on a better standard mechanism to deliver native apps instead

one might imagine that after these competing and incompatible native apps become a headache for crossplatform pursuits, a new platform will emerge that provides a uniform toolset for developing (mostly) native-platform independent applications.

perhaps this toolset will utilize a declarative system for specifying the user interface, and a scripting system that is JIT'd on each platform.


Magic! We could even have standardized links between the apps, and companies that specialize in finding content no matter what app it's in.

I think you're on to something...it could be huge... Heh.


> I think you're on to something...it could be huge... Heh.

Could be. Sad that it isn't. Think how awesome it would be if app developers actually cared about interoperability instead of trying to grab the whole pie for themselves while giving you a hefty dose of ads in return. This is mostly the fault of developers, but the platform itself could help a lot if it was more end-user programmable. You'd have at least a chance to force different apps to talk to each other.


He's being sarcastic. Links are literal web links and the Company he's referring to is Google.


Yes, I know. And I'm trying to subtly point out that it doesn't even work on the web, because it got fucked up by cowboy companies who ignore standards and do whatever they like to get easier $$$.


Android has declarative UI, JIT compiled app logic and a way to link apps together (via intents). It is definitely not the web though.

I think people confuse ideas with implementations. The web is a pretty reasonable implementation of the idea "let's build a hypertext platform". It is not at all a reasonable implementation of the idea "let's build an app platform" which is why in the markets where reasonable distribution platforms exist (mobile) HTML5 got its ass kicked by professionally designed platforms like iOS and Android.


Well, why do native apps for all the popular websites exist? Surely nobody would need or download the FB/Twitter/Reddit/etc. apps on mobile or desktop if the website itself was the optimum experience...

My point is that even with the web the developers are still going to make the native clients, so either the web has to become good enough for the need for native apps to disappear eventually, and the browser becomes the OS, or native apps become convenient enough to completely replace webapps.

Of course if the browser becomes the OS then the end result would be the same as the suggestion in my original post.


There is one such platform called Haxe (http://haxe.org)


Interestingly, this was what Alan Kay was advocating for on the web: "A bytecode VM or something like X for graphics" [1].

When Lars Bak and Kasper Lund launched Dart [2], I found it sad that they weren't bolder: they could have left CSS and the DOM behind and created an alternative Content-Type, so that you could choose to Accept 'application/magic-bytecode' [3] before text/html if your client supports it. Sadly, we ended up with Web Assembly, which, from the few talks I've seen, appears to cater only to graphics/game developers, with no support for dynamic or OO languages.

[1] https://www.youtube.com/watch?v=FvmTSpJU-Xc [2] https://www.dartlang.org [3] Or in Dart lingo, a VM snapshot.


Yes, I think of Dart as a missed opportunity: it isn't Smalltalk for the web, and neither is it strongly typed... I think this lack of character means that nobody hates it, but also nobody loves it.

Go doesn't have generics, some hate it, some love it. But it took a strong stand on that point.


>Web browsers have more or less become mini operating systems

I wish. No, web browsers have become massively bloated operating systems. And since they didn't intend to, they are terrible at it. You have little to no control over anything.


Have you looked at the size of native apps these days? They make the web look almost unbloated by comparison. (It feels like half of them are shipping an entire HTML rendering engine too.)


Exactly. E.g. have a look at the Facebook app.


Maybe because I don't want a bloody app for every site that I browse? Apps of that sort are just dumb. And huge security risks.


Of course webpages would still be around and needed, and desired for more basic content, but by the time you want to offer something as complex or regularly used as Facebook or Twitter or Reddit or Soundcloud and so on, it'd be better as a native app, as the current native clients for each of those websites already prove.

I mean the UI is undeniably smoother, and they can seamlessly hook into the OS notifications systems and better multitasking (for example I see separate entries for each native app in the OS's task switcher, but have to go through the extra step of switching into a browser then its tabs for webapps) and energy saving and everything else.


I don't want seamless and hooked. I want isolated and sandboxed.


> regularly used as Facebook or Twitter or Reddit or Soundcloud

And there's the problem. I don't use any of those four sites regularly, but I have visited all of them. Hyperlinks provide for that, and they (a) don't exist or at best would be awkward in a native app (not that webapps handle them well to begin with, though all of the above do allow for them at least) and (b) work between apps, platforms, and what-have-you. If I got a link to some image, <160 character sentence, comment thread, or song and was prompted to download an app, I would probably not view that content instead.


That sounds ambitious, interesting, and like a lot more work than keeping a webpage running. My first thought is "cross-platform nightmare". My second thought is that they'd have to rethink their expectations of continuous deployment/release and a/b testing.

We're either looking at making individual pages maintain binaries for the platforms they support (implying support of only those platforms that make sense to the site) or some kind of compilation framework running on the local machine.


This native delivery is pretty much a solved problem, just look at how firefox and chrome auto update.


Firefox and Chrome don't do "move fast and break things"-style continuous deployment, can't push updates to a user seamlessly (i.e. without restarting the program), and can be downloaded and compiled on platforms that Mozilla and Google don't want to officially support.

Native delivery of a monolithic browser based on an open-source codebase is a fixed problem. Trying to do the same with a website, using current techniques, would cause issues to both their current workflow and my expectations as a user that websites don't currently have.

I'm not saying that it's impossible to do, I'm just saying that it's not a good fit for current trends in web development, and I'm not convinced that it would be great for the users either.


I also agree that the web is full of bloat and a mishmash of compromised features. I think you can design a web that's more powerful yet simpler if you actually solve the problems in a more forward-thinking manner.

For example, there have been so many attempts at powerful layouts, whereas everyone knew 10 years ago that we needed a powerful layout solution (flexbox or whatever), and now we have grid frameworks and years of cruft from older CSS enhancements that have to be supported. They keep adding features here and there to sort of address lots of problems; individually those features might be cheaper, but the overall cost of implementation, both for browsers and for us regular developers, is much higher.

Here are a couple of the things I want from the web, and quite a few of them are there already, if not in ideal forms: powerful layout that's simple enough to use; the concept of a webpage as a bundle (HTTP/2? all your resources together); making partial rendering (the ajaxified page) a natural concept; even delivering the UI/markup separately from the content (you can do that with all sorts of libraries, but I think it should be at the core); and security concepts that are easier to implement (CSRF, URL tampering, etc.).

One of the ideas I had is that browsers make a new engine that does the right things from the start, and hopefully that's a much lighter engine. If you serve new pages they are really fast, and if you serve old pages there is an optional transpiler kind of thing that translates to the new version on the fly. Now, it won't be terribly good to start with, so it's optional, but essentially the old version is frozen and more and more people start to use only the new engine (with the transpiler).


Not sure about the conclusion but you're absolutely right about the problem. This is why the social networks are eating such a great percentage of screen time: pure content, outside of web applications, is better suited in a single format rather than the superfluous loud magazines that currently pepper the web.

Perhaps rather than native apps what we need is the return of gopher. I think that's what Apple's trying to do with Apple News.


That's a good point. Too much control over the form is left with website owners and too little with web users. Most of the web looks much better when you strip out the styling it comes with.

In a way, this is why I like doing more and more things from inside Emacs. I get a consistent interface that I control, and that is much more powerful than what each app or website I'd otherwise use can offer. Hell, it's a better interface than any popular OS has.


> Say if I press Cmd+Space on a Mac and type "Facebook" in Spotlight, it immediately begins downloading the native OS X Facebook app

That's exactly what Windows 10 does.

http://imgur.com/EqvATwL


Does it auto-download? You still have to install it anyway.


I'm fine with that.. however, I've seen MANY instances of websites that load multiple versions of jQuery... that's just one library. Let alone how poorly constructed most web applications are.

When it comes down to it, for so long many front-end guys only cared about how it looked, and backend guys didn't care at all, because it's just the ui, not the real code.

We're finally at a point where real front end development is starting to matter. I honestly haven't seen this much before about 3-5 years ago... which coincides with node and npm taking over a lot of mindshare. There's still a lot of bad, but as the graph shows, there's room to make it better.


> It's about time to acknowledge that the web is increasingly being used to access full-blown applications more often than "webpages."

I think that is orthogonal to bloat. Sure, a complex app will always have more to load and compute than a static page with one blog post on it, but that doesn't mean an app can't be bloated on top of that, just like pages with just a single blog post on them can be bloated.


Speaking of bloat,

https://news.ycombinator.com/item?id=11548816 - The average size of Web pages is now the average size of a Doom install


That's not really bloat, because anybody who sees your comment will already have that link in the browser cache ^_^

(To make this comment not entirely frivolous, does anyone remember the "bloatware hall of shame", or however it was called? I couldn't find it or anything decent like it, sadly. How about something like it for websites?)


> Thanks to smartphones, people are already familiar with the modern concept of the standalone app; why not just make downloading an OS-native binary as easy as typing in a web address, on every OS?

Congratulations, you've invented ActiveX.


>Say if I press Cmd+Space on a Mac and type "Facebook" in Spotlight, it immediately begins downloading the native OS X Facebook app

Epic malware vector.


What constitutes "way too much complexity" ? What if browsers are evolving with whatever resources are available to them at just the right pace. Why would you want the browser to be hindered by an arbitrary speed/resource limit. Let it soar to the sound of fans going full speed!


Isn't WebAssembly a big step towards this?


WebAssembly isn't even supported natively in any browser, is it?


In my opinion, if there is something that the web can't do as well as native it is a bug.

Web technologies can already do most of what you are proposing, including notifications. There are some performance issues, but they are well on their way to being fixed.


Thing is, those performance issues matter. In user interfaces, they mean a lot. And they're also important when you're trying to do actual work. I'm yet to see a web app that wouldn't choke on the amount of data a power user may need. Yes, that includes Google's office suite, which is so horribly laggy when you put more than a screenful of content into it that the very experience justifies switching to Windows and buying Microsoft Office.

What we would need, if the browser is to become a platform for actual productivity tools and not shiny toys, is a decent persistent storage interface - one that would be controlled by users, not by applications, and that could be browsed, monitored. And most importantly, one that would be reliable. And then, on top of that, a stronger split between what's on-line and what's off-line. Because some data and some tasks should really not be done through a network.


The problem with the web is not one specific missing feature, or even performance. Man-centuries of effort by browser makers have been able to make performance not-quite-competitive instead of just hopelessly uncompetitive.

The problem with the web is that the developer experience is nightmarish. The fact that native apps don't suffer XSS should be a hint about where to start looking, but it's really just a house of horrors in there.


I would say that a possible solution is also to better rank websites that mention the checksums of their external resources and make web browsers keep them in cache much longer... if pretty much every website uses jQuery, perhaps we should ship jQuery with the web browser?


I hear this argument a lot, and I very much disagree. Now you have browser vendors having to decide which libraries are "popular" and shipping them in the initial download of the browser.

It turns out that this technology already exists in a much better form. It's called cache. The problem is that almost everyone hosts their own version of jQuery. If everyone simply linked the "canonical" version of jQuery (the CDN link is right on their site) then requiring jQuery will be effectively free because it will be in everyone's cache.

Also the cache is supported by all browsers with an elegant fallback. Instead of having to manually check whether your user's browser has the resource you want preloaded, you just link the URL and the best option will automatically be used.

TL;DR: Rather than turning this into a political issue, stop bundling resources; modern protocols and intelligent parallel loading allow using the cache to solve this problem.


> If everyone simply linked the "canonical" version of jQuery (the CDN link is right on their site) then requiring jQuery will be effectively free because it will be in everyone's cache.

It's not, though. I ran this experiment when I tried to get Google Search to adopt JQuery (back in 2010). About 13% of visits (then) hit Google with a clean cache. This is Google Search, which at the time was the most visited website in the world, and it was using the Google CDN version of JQuery, which at the time was what the JQuery homepage recommended.

The situation is likely worse now, with the rise of mobile. When I did some testing on mobile browsing performance in early 2014, there were some instances where two pageviews was enough to make a page fall out of cache.

I'd encourage you to go to chrome://view-http-cache/ and take a look at what's actually in your cache. Mine has about 18 hours worth of pages. The vast majority is filled up with ad-tracking garbage and Facebook videos. It also doesn't help that every Wordpress blog has its own copy of JQuery (WordPress is a significant fraction of the web), or for that matter that DoubleClick has a cache-busting parameter on all their JS so they can include the referer. There's sort of a cache-poisoning effect where every site that chooses not to use a CDN for JQuery etc. makes the CDN less effective for sites that do choose to.

[On a side note, when I look at my cache entries I just wanna say "Doubleclick: Breaking the web since 2000". It was DoubleClick that finally got/forced me to switch from Netscape to Internet Explorer, because they served broken Javascript in an ad that hung Netscape for about 40% of the web. Grrrr....]


> The vast majority is filled up with ad-tracking garbage and Facebook videos.

There is the problem then, and the solution? I for one don't make bloated sites willy-nilly, I suck at what I do but at least I do love to fiddle and tweak for the sake of it, not because anyone else might even notice; and I like that in websites and prefer to visit those, too. Clean, no-BS, no hype "actual websites". So I'd be rather annoyed if my browser brought some more stuff I don't need along just because the web is now a marketing machine and people need to deploy their oh so important landing pages with stock photos and stock text and stock product in 3 seconds. It was fine before that, and I think a web with hardly any money to be made in it would still work fine, it would still develop. The main difference is that it would be mostly developed by people who you'd have to pay to stay away, instead of the other way around. I genuinely feel we're cheating ourselves out of the information age we could have, that is, one with informed humans.


Interesting data, thanks for sharing.

On top of that, while everyone uses jQuery, everyone uses a different version of it (say, 1.5.1, 1.5.2, ... probably hundreds of different versions in total).


The problem with caching is that you're sharing the referer with the canonical URL. Another problem is that you're using someone else's bandwidth. And if you combine the two, you can be sure that info about your visitors will be sold, which is why quite a lot of people would prefer to host their own versions of jQuery...


For the referrer problem you can apply a referrer policy to prevent this but unfortunately the referrer policy isn't very granular.

Also for my sites I have a fallback to a local copy of the script. This allows me to do completely local development and remain up if the public CDN goes down (or gets compromised). With small (usually) performance impact.


Couldn't the HTTP cache of your ISP be a good in-betweener, in that case? Would he send the referer to the canonical URL?


Not for sites using TLS. The only option for secure sites would be a CDN. That is, an HTTP cache in a relationship with the content publisher rather than the subscriber.


The problem with hosting JS libraries on CDNs is that the cache has a network effect.

You only gain performance if the browser already has a cached version of this specific version on this specific CDN. If you don't - you end up losing performance, because now an additional DNS lookup needs to be performed, and an additional TCP connection needs to be opened.
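
To put rough numbers on that trade-off, here is a minimal sketch in Python. The function names and any millisecond figures you feed in are assumptions for illustration, not measurements; only the shape of the calculation (a hit is free, a miss pays for DNS plus a new TCP connection plus the download) comes from the paragraph above. It also assumes a first visit, so the self-hosted copy is not cached either.

```python
def expected_cdn_cost_ms(hit_rate, dns_ms, tcp_ms, download_ms):
    """Expected time to obtain a library from a shared CDN.

    A cache hit costs nothing; a miss pays for a new DNS lookup,
    a new TCP connection, and the download itself.
    """
    miss_cost = dns_ms + tcp_ms + download_ms
    return (1.0 - hit_rate) * miss_cost


def self_hosted_cost_ms(download_ms):
    """Self-hosting reuses the connection to your own server: only the download is paid."""
    return download_ms


def break_even_hit_rate(dns_ms, tcp_ms, download_ms):
    """Hit rate above which the shared CDN wins on average."""
    miss_cost = dns_ms + tcp_ms + download_ms
    return 1.0 - download_ms / miss_cost
```

With a hypothetical 50 ms DNS lookup, 100 ms connection setup and 150 ms download, `break_even_hit_rate(50, 100, 150)` comes out at 0.5: in this toy model the CDN only pays off if more than half of your visitors already have that exact file from that exact CDN.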

Here are a few reasons people choose to avoid CDNized versions of JS libraries. http://www.sitepoint.com/7-reasons-not-to-use-a-cdn/

This is a 6 year old post, but it raises some valid concerns: https://zoompf.com/blog/2010/01/should-you-use-javascript-li...


The reason I prefer to use a CDN is that it is a game theory example come to life: if everyone used the CDN version, then any user coming to your site would most likely have the CDN version in their cache and thus performance would go up, but if you use the CDN version and your competitors don't, their performance is slightly better than yours, and so on and so forth. Game theory indicates that in most games of this sort cooperation is better than non-cooperation.

And really if you are using one of the major libraries and a major CDN (Google, JQuery, etc.) over time your users will end up having the stuff in the cache, either from you or from others having used the same library version and cdn.

I suppose someone has done a study on the spread of libraries and CDNs among users, so that you could figure out what the chance is that a user coming to your site will have a specific library cached. There's this: http://www.stevesouders.com/blog/2013/03/18/http-archive-jqu... but it is from 3 years ago; really this information would need to be updated at least annually to tell you what your top CDN would be for a library.


but there isn't 1 CDN, or 1 version... if you need two libraries, but the canonical CDN for jquery is on one, and your required extension is on another... that's two DNS lookups, connections, request cycles, etc.

So you use the one that has both, but one is not canonical, which means more cache misses. That doesn't even count the fact that there are different versions of each library, each with its own uses and distribution, and the common CDN approach becomes far less valuable.

In the end, you're better off compositing micro-frameworks and building yourself. Though this takes effort... React + Redux with max compression in a simple webpack project seems to take about 65K for me, before actually adding much to the project. Which isn't bad at all... if I can keep the rest of the project under 250K, that's less than the CSS + webfonts. It's still half a mb though... just the same, it's way better than a lot of sites manage, even with CDNs


that's 2 dns lookups on any user that hasn't already done that somewhere in the past and had it cached.

The question then is how likely are they to have done that in regards to your particular cdn and version of the library.

I agree that a lot of possible CDNs, versions and so forth decreases the value of the common CDN approach, but there are at least some libraries that have a canonical CDN (jQuery for example), and not using that is essentially being the selfish player in a game-theory-style game.

Since I don't know of any long running tracking of CDN usage that allows you to predict how many people who visit your site are likely to have a popular library in their cache it's really difficult to talk about it meaningfully (I know there are one-off evaluations done in one point in time but that's not really helpful).

Anyway it's my belief that widespread refusal to use CDN versions of popular libraries is of course beneficial in the short run for the individual site but detrimental in the long run for a large number of sites.


Latency of a new request as mentioned in one of those articles is the main reason why I self host everything.

Since HTTPS needs an extra round trip to start up, it's now even more important to not CDN your libraries. The average bandwidth of a user is only going to go up, and their connection latency will remain the same.

If you are making a SaaS product that businesses want, using CDNs also makes it hard to offer an enterprise on-site version, as they want the software to have no external dependencies.


This might make sense, if all of your users are located near your web servers and you can comfortably handle the load of all the requests hitting your web servers.

If the user making the request is in Australia, for example, and your web server is in the US, the user is going to be able to complete many round trip requests to the local CDN pop in Australia in the time it takes to make a single request to your server in the US.

Latency is one of the main reasons TO use a CDN. A CDN's entire business model depends on making sure they have reliable and low latency connections to end users. They peer with multiple providers in multiple regions, to make sure links aren't congested and requests are routed efficiently.

Unless you are going to run datacenters all around the world, you aren't going to beat a CDN in latency.


If the only thing you have on the CDN is libraries, it's faster to have your site host them even if it's on the other side of the world. When HTTP2 file push is widely supported, it becomes even more in favor of hosting locally, as you can start sending your libraries right after you are done sending the initial page without waiting for the browser to request them.

If you are using a CDN for images/video, then yes, you would have savings from using a CDN since your users will have to nail up a connection to your CDN anyways.

Then again a fair number of the users for the site I'm currently working on have high latency connections (800ms+), so it might be distorting my view somewhat.


Or you use a CDN in front of your site, caching the content under your domain. But certainly something to be aware of.


This is why I recommended using the CDN recommended by the project; most do recommend a CDN, for example jQuery has its own CDN.

As for adoption, that is very much a chicken and egg problem.


Even then, different versions will have their own misses... not to mention 3rd party libraries on another CDN means another DNS hit.

DNS resolution time is a pretty significant impact for a lot of sites.


Ya I would have to agree with you tracker. Every 3rd party dependency is introducing another DNS lookup. The whole point behind using a CDN effectively, besides lowering latency, is to reduce your DNS lookups to a bare minimum. For example, I use https://www.keycdn. They support HTTP/2 and HPACK compression along with Huffman encoding, which reduces the size of your headers.

The benefits of hosting say Google Fonts, Font Awesome, jQuery, etc. all with KeyCDN is that I can take better advantage of parallelism if I have one single HTTP/2 connection. Not to mention I have full control over my assets to implement caching (cache-control), expire headers, etags, easier purging, and the ability to host my own scripts.


What if the checksum was the same and you accepted the cache hit if the checksum agrees and get your own copy if it doesn't? Maybe the application should get to declare a canonical URL for the js file instead of the browser? So something like

<script src="jQuery-1.12.2.min.js" authoritative-cache-provider="https://ajax.googleapis.com/ajax/libs/jquery/1.12.2/jquery.m... sha-256="31be012d5df7152ae6495decff603040b3cfb949f1d5cf0bf5498e9fc117d546"></script>

Would this cause more problems than it would solve? I'm assuming disk access is faster than network access.

I'm concerned about people like me who use noscript selectively. How easy is it to create a malicious file that matches the checksum of a known file?


>How easy is it to create a malicious file that matches the checksum of a known file?

I'd say not easy at all, practically impossible.

https://en.wikipedia.org/wiki/Preimage_attack


> I'm concerned about people like me who use noscript selectively. How easy is it to create a malicious file that matches the checksum of a known file?

SHA-256? Very, very, very, very hard. I don't believe there are any known attacks for collisions for SHA-256.


I think even a collision (any collision) has yet to be found.


People make too big a deal of this collision stuff; a lot of the time these attacks are very theoretical and would require tremendous computation. Anyway, for this use case, even with md5, how likely is it really that someone makes a useful malicious file that collides with a particular known and widely used one? I dunno, seems pretty unlikely.


And if you worry about that you can always use SHA-384. Plus a side benefit is that SHA-384 is faster on a 64-bit processor.
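
For anyone who wants to see what these digests look like, here is a small Python sketch using only the standard library. The `<algorithm>-<base64 digest>` form is the one Subresource Integrity uses in the `integrity` attribute; the helper name `integrity_value` is made up for this example, and it says nothing about relative speed.

```python
import base64
import hashlib


def integrity_value(data: bytes, algorithm: str = "sha384") -> str:
    """Return '<algorithm>-<base64 digest>' for the given bytes."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("unsupported algorithm: " + algorithm)
    # hashlib.new() looks the algorithm up by name and hashes the bytes.
    digest = hashlib.new(algorithm, data).digest()
    return algorithm + "-" + base64.b64encode(digest).decode("ascii")
```

Calling `integrity_value(open("jquery.min.js", "rb").read())` on your own copy of a library (the file name here is only a placeholder) gives a string you could compare against the one a page declares.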


It would be interesting if browsers start implementing a content-addressable cache. So as well as caching resource by URI also cache by hash. Then SRI requests could be served even if the URL was different.

Of course this would need a proposal or something but it would be interesting to consider.
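
A toy version of that idea, sketched in Python, assuming an in-memory cache and SHA-256 as the hash. The class and method names are invented for illustration and are not any browser's actual API.

```python
import hashlib
from typing import Dict, Optional


class ContentAddressableCache:
    """Toy cache indexed both by URL and by the SHA-256 of the body."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, bytes] = {}  # hex digest -> body
        self._by_url: Dict[str, str] = {}     # URL -> hex digest

    def store(self, url: str, body: bytes) -> str:
        """Cache a response and return its hex digest."""
        digest = hashlib.sha256(body).hexdigest()
        self._by_hash[digest] = body
        self._by_url[url] = digest
        return digest

    def lookup(self, url: str, expected_sha256: Optional[str] = None) -> Optional[bytes]:
        """Return the cached body, or None on a miss."""
        if expected_sha256 is not None:
            # The page declared a hash: any cached body with that hash
            # will do, whatever URL it was originally fetched from.
            return self._by_hash.get(expected_sha256.lower())
        digest = self._by_url.get(url)
        return self._by_hash.get(digest) if digest is not None else None
```

The point of the second index is that a file first fetched from one site's URL can satisfy a later request for a different URL, as long as the page supplies the same hash.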


Plan9's Venti file storage system is content addressable.

http://plan9.bell-labs.com/sys/doc/venti/venti.html

Also available on *nix


> How easy is it to create a malicious file that matches the checksum of a known file?

As others have pointed out, it's quite difficult. But here's another way to think about it: if hash collisions become easy in popular libraries, the whole internet will be broken and nobody will be thinking about this particular exploit.

Servers won't be able to reliably update. Keys won't be able to be checked against fingerprints. Trivial hash collisions will be chaos. Fortunately, we seem to have hit a stride of fairly sound hash methods in terms of collision freedom.


I think this vaguely reminds me of the Content Centric Networking developed by PARC. There's a 1.0 implementation of a protocol on github (https://github.com/PARC/CCNx_Distillery). A CCNx enabled browser could potentially get the script from a CCN by referring to its signature alone (it being a sha-256 checksum or otherwise).


This seems a little redundant - why not just

    <script src="jQuery-1.12.2.min.js" sha-256="31be012d5df7152ae6495decff603040b3cfb949f1d5cf0bf5498e9fc117d546"></script>
? If you wanted to explicitly fetch from google if the client doesn't have a cached copy, then instead do

    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.12.2/jquery.min.js" sha-256="31be012d5df7152ae6495decff603040b3cfb949f1d5cf0bf5498e9fc117d546"></script>
The first would seem preferable though, as loading from an external source would expose the user to cross-site tracking.


> The first would seem preferable though, as loading from an external source would expose the user to cross-site tracking.

You're right in that the first one you had with just the sha-256 would be pretty much equivalent to what I had, especially given that HN readers have resoundingly given support to the idea that it is non-trivial to create a malicious file with the same hash as our script file. I was simply trying to be cautious and retain some control for the web application (even if the extra sense of security is misplaced).

This is the use case I'm trying to protect by adding a new "canonical" reference that the web application decides. As others in this thread have said, it is very unlikely that someone will be able to craft a malicious script with the same hash as what I already have. The reason I still stand by including both is firstly compatibility (I hope browsers can simply ignore the sha-256 hash and the authorized cache links if they don't know what to do with it).

As a noscript user, I do not want to trust y0l0swagg3r cdn (just giving an example, please forgive me if this is your company name). NoScript blocks everything other than a select whitelist. If the CDN happens to be blocked, my website should still continue to function loading the script from my server.

My motivation here was to allow perhaps even smaller companies to sort of pool their common files into their own cdn? <script src="jimaca.js" authoritative-cache-provider="https://cdn.jimacajs.example.com/v1/12/34/jimaca.js""></scri... I also want to avoid a situation where Microsoft can come to me and tell me that I can't name my js files microsoft.js or something. The chances of an accidental collision are apparently very close to zero so I agree with you that there is room for improvement. (:

This is definitely not an RFC or anything formal. I am just a student and in no position to actually effect any change or even make a formal proposal.


If accompanied by the exact same sha-256 hash idea, loading from any external source cannot expose the user to any additional risk.

SHA + CDN url list (for whitelisting/reliability purposes - public/trusted, and then private for reliability) would be ideal.


> The problem is that almost everyone hosts their own version of jQuery.

Any site that expects users to trust it with sensitive data should not be pulling in any off-site JavaScript.

As for checksumming, browser vendors don't need to pre-load popular JavaScript libraries (though they might choose to do so, especially if they already ship those libraries for use by their extensions and UI). But checksum-based caching means the cost still only gets paid by the first site, after which the browser has the file cached no matter who references it.
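
A minimal sketch of what checksum-based caching could look like, assuming the cache is keyed by the SHA-256 of the file body instead of by URL. The class name, the `fetch` callback, and the example URLs are all hypothetical; no browser is claimed to work this way.

```python
import hashlib


class ContentAddressedCache:
    """Toy cache keyed by the SHA-256 of the body rather than by URL."""

    def __init__(self):
        self._by_hash = {}
        self.downloads = 0  # how many times we actually hit the network

    def get(self, url, sha256_hex, fetch):
        # Any site that declares the same hash gets the same cached bytes.
        body = self._by_hash.get(sha256_hex)
        if body is not None:
            return body
        body = fetch(url)
        self.downloads += 1
        # Refuse to cache (or run) anything that does not match the declared hash.
        if hashlib.sha256(body).hexdigest() != sha256_hex:
            raise ValueError("checksum mismatch for " + url)
        self._by_hash[sha256_hex] = body
        return body


def fake_fetch(url):
    # Stand-in for a network request; every URL serves the same library bytes.
    return b"/* pretend this is jquery.min.js */"


digest = hashlib.sha256(fake_fetch("")).hexdigest()
cache = ContentAddressedCache()
first = cache.get("https://site-a.example/js/jquery.min.js", digest, fake_fetch)
second = cache.get("https://site-b.example/vendor/jquery.min.js", digest, fake_fetch)
```

The second site never triggers a download even though its URL is different, which is the "cost only gets paid by the first site" property. A URL-keyed cache would have fetched the file twice.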


Yes but no.

    - jquery.com becomes a central point of failure and attack;
    - jquery gets to be the biggest tracker of all time;
    - cache does not stay forever. Actually, with pages taking 3 MB every time they load, after 100 clicks (not much), I have invalidated 300 MB of cache. If Firefox allows 2 GB of cache (a lot for only one app), then at the end of the day, all my cache has been busted.
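
The arithmetic in the last point can be checked with a rough sketch. It assumes decimal units, that no two pages share any cached resource, and that the cache evicts old entries once full; the function name is made up for illustration.

```python
MB = 1000 * 1000
GB = 1000 * MB


def page_loads_until_cache_turnover(cache_bytes, page_bytes):
    """How many distinct page loads fit before the whole cache has been replaced."""
    return cache_bytes // page_bytes


bytes_after_100_clicks = 100 * 3 * MB
loads_to_fill_2gb = page_loads_until_cache_turnover(2 * GB, 3 * MB)
```

Under those assumptions a 2 GB cache turns over after several hundred 3 MB page loads, so a library cached in the morning may well be gone by the evening.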


> If everyone simply linked the "canonical" version of jQuery (the CDN link is right on their site) then requiring jQuery will be effectively free because it will be in everyone's cache.

So create one massive target that needs to be breached to access massive numbers of websites around the world?

Imagine if every Windows PC ran code from a single web page every time it started up. Now imagine if anything could be put in that code and it would be run. How big of a target would that be?

While there are cases where the performance is worth using a CDN, there are plenty of reasons to not want to run foreign code.

(Now maybe we could add some security, like generating a hash of the code on the CDN and matching it with a value provided by the website and only running the code if the hashes matched. But there are still business risks even with that.)


The solution to this is including a checksum with the link to the file, and if the checksum doesn't match, don't load the file.

See https://developer.mozilla.org/en-US/docs/Web/Security/Subres... though it isn't universally supported yet.


Just so I understand: I pull the file and make a checksum, then hardcode it into the link to the resource in my own code? Then the client pulls my code, follows the link, and checks the checksum of what it downloaded against the one I included in the link?


Yes. It's very simple. I don't know why more library providers don't have it in their copyable <script> snippets.
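
For anyone who wants to generate the value themselves, here is a small sketch. It assumes the Subresource Integrity format, where the `integrity` attribute holds the hash algorithm name, a dash, and the base64-encoded digest of the file. The helper names and the example URL are made up.

```python
import base64
import hashlib


def sri_integrity(body: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity value such as 'sha384-<base64 digest>' for a file body."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("unsupported algorithm: " + algorithm)
    digest = hashlib.new(algorithm, body).digest()
    return algorithm + "-" + base64.b64encode(digest).decode("ascii")


def script_tag(src: str, body: bytes) -> str:
    """Render a copyable <script> snippet with the checksum hardcoded into it."""
    return (
        '<script src="' + src + '" integrity="' + sri_integrity(body)
        + '" crossorigin="anonymous"></script>'
    )


library_bytes = b"/* pretend this is the library you downloaded and reviewed */"
snippet = script_tag("https://cdn.example.com/lib.min.js", library_bytes)
```

The `crossorigin="anonymous"` attribute is included because browsers need a CORS-enabled fetch before they will verify a cross-origin file. Compute the hash from a copy you have actually inspected, not from whatever the CDN happens to serve that day.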


I agree with you that putting JQuery in the browser is a bad idea, but the second part of the argument, that the browser will have the libraries in cache, is not really that reliable. Here are some notes I collected on the subject: http://justinblank.com/notebooks/BrowserCacheEffectiveness.h....


I was definitely speaking whimsically because it is a huge chicken-and-egg problem. But theoretically, if every site that used jQuery referenced a version at https://code.jquery.com/ there would be a very good hit ratio, even considering different versions. However, we are very, very far away from that.


That may seem like a good solution for some sites, but the name of the page, the requestor's IP address and other information are also 'leaked' to jquery.com. This is not always welcome. For example, a company has an acquisition tracking site (or other legal-related site) and the names of the targets are part of the page name (goobler, foxulus, etc.), which get sent along with the IP address as the referrer page to jquery.com or other third-party sites/CDNs. While not a security threat, you may unwittingly be recommending an unwanted information leak.


But it's only leaked when you don't have it cached. Otherwise the client doesn't even make the request.


There's the Firefox addon Decentraleyes https://github.com/Synzvato/decentraleyes which is trying to solve those problems. Currently only the most common versions of each library are bundled, but there are plans to create more flexible bundles.

There's no reason to hit the webserver with an If-modified when the libraries already include their version in the path.


If everyone linked the "canonical" version of jquery, then that location would be worth a considerable amount of money to any sort of malicious actor. Just getting your version there for a few minutes could rope in many thousands of machines. Look at China's attack on github earlier in the year for examples of the damage that that sort of decision could do.


> if pretty much every website uses jQuery, perhaps we should ship jQuery with the web browser?

There's a firefox plugin that does just that! DecentralEyes - https://addons.mozilla.org/en-US/firefox/addon/decentraleyes...

Although mostly for privacy reasons, so google doesn't know all the sites you visit.


Hmmmm, or javascript could be standardized on something that isn't a barely-working kludge with no batteries included? If something like jQuery or Lodash is basically table-stakes for doing anything really worthwhile, then just maybe that de facto standard should be made de jure.


We absolutely, most certainly, should NOT ship libraries with browsers. This is a horrible idea. Browser vendors would then have too much influence on development libraries.


“We absolutely, most certainly, should NOT ship libraries with operating systems.”


There used to be a distinction between browsers and operating systems.


Pretty sure there still is.


It's getting pretty hard to tell.


All the major browsers ship with JavaScript.

If JS absolutely needed JQuery in order to, say, select an element, the way C needs a library to output to the console, then sure, you may have an argument here.

But, no.


And why do you think web development has become the dominant form of development?


Because literally any child can start developing, and then grows up and pulls every library under the sun because the language they use is a piece of crap, without a single thought about the consequences?

Thankfully node.js is bringing this clusterfuck to the desktop too.


It's more about deployment and dependency problems.

The dependencies we are talking about are transparent to the user, save only for some download performance issues which are pretty minimal, if not overstated in many cases, compared to the former issues faced by native applications.


I think the direction of the influence would mostly work the other way round. If there's an attribute on the HTML link tag that some browsers recognise as calling the browser's client-side version of that library (with local/CDN fallback for browsers that don't) then users are going to notice that browser X seems to load a lot of websites more quickly than browser Y, which puts pressure on the vendor of browser Y to make sure it also supports these frequently-used js libraries (especially if they're large frequently-used libraries with multiple minor version numbers and no widely-accepted-as-canonical CDN source).

This is still a shame if you're maintaining a heavyweight alternative to jQuery or React which is far too obscure to be a consideration for browser vendors, but it's a big boon for users, especially users on slow or metered connections that download several different version numbers of jQuery from several different CDNs every day.


I'm not sure that is true. No one is saying external libraries cannot be loaded. It is actually a good idea to be able to rely on certain popular libraries already on the client; this makes a lot of sense. You can still have the existing mechanism so I don't think browser vendors are going to hold everyone hostage.

If you like, there could be a mutually agreed-upon standard repository that browser vendors routinely update from.

Sure, your less popular experimental library won't be in the list, that's what "<script src=..." is for.

It probably won't happen but it is hard to defend the position that it is a bad idea, I think.


It could be a 'use the default or link your own' situation, which then wouldn't influence a whole lot.

But I agree, overall.


In this case, very very little would change. People would still link their own, as they do now.


Or worse, people would use the buggy, out of date default.


Common libraries and frameworks are usually loaded from a CDN. These should stay in the browser's cache for some time.


Yeah, but per domain. Unless you use something like cjdns, if you end up with static.yourlovelydomain.com, the resource's going to get downloaded from scratch at the first attempt. If we had checksums attached to resources, there'd be provably no point bothering to download them in many cases.


The problem is the quantity of CDNs on which jQuery for example can be found.

For instance, your site might pull it in from cdnjs, while I have the one from ajax.googleapis.com already cached. I still have to go fetch your selected version.


Only if everyone is using the same CDNs :)


Isn't this what the '.min' versions of these are for?

jquery-2.2.3.min.js is only 86KB for me. For the amount of functionality it adds, sure seems like a sweet deal.

Part of the problem (IMHO) is the growing requirement for larger, alpha-channel-using image formats like PNG, that then need to be expressed in a Retina format - I mean, shit looks terrible on my retina MacBook Pro that isn't properly Retina formatted. (Here's looking at you, Unity3D...even the current beta which supports Retina displays is extremely unstable and half the graphics (even stupid buttons on the UI) are still SD...)

With bandwidth costs declining [1] and basic RAM increasing [2] is there a particular reason a Web application should be much smaller than a typical desktop application? We have caches for a reason.

[1] http://www.networkworld.com/article/2187538/tech-primers/exp...

[2] http://en.yibada.com/articles/118394/20160422/macbook-air-is...


Why only jQuery, anyway?

What if browsers shipped with a repository instead of a cache? Download once, stay forever.


There are checksums available for resources now, as of the last few months. Also, things like jQuery are available from multiple CDNs, and they are used by many of the top websites and HTML frameworks. So you probably already have it in cache.


Penalize any historically slow loading (for some arbitrary cut-off value, say 600 ms) site client side with an additional 10 seconds of waiting before any request is sent. This way users will hit the back button well before any data is sent to the server and the site operators will see traffic plummet.

If any browser vendor implemented my proposal, their users would switch to another browser. If it was an addon, only a handful of people would use it so there would be no impact. If it was done on the network level of a corporate business, users would complain. Still, one can dream :)


This whole thread is really interesting but I think it misses a key point: developing countries. The issue with these pages getting larger is that it creates two webs: one that works on the slower networks of less developed countries and another that works for the richer world.

It's often suggested that we'll solve poverty by providing education and given the internet provides a unique opportunity to learn then surely we benefit as a species by ensuring there isn't a rich web and a poor web? This risk is compounded by rich people being in a better position to create better learning opportunities.

Much like there's a drive toward being aware of the accessibility of your website (colour-blindness check tools, Facebook's image analysis for alt text tags, etc.) we should be thinking about delivery into slower networks in poorer countries.


I also like how Opera now allows you to benchmark websites with and without ads. I think other browsers should expand on that idea (benchmarking sites, and not just for ads).

Give some kind of reminder to both users and developers about how slow their sites are. Those with the slowest websites probably won't like it too much initially, but it's going to be better for all of us in the long term.

http://www.opera.com/blogs/desktop/2016/03/native-ad-blockin...


Google is already starting to base rankings on page speed; that's enough to get companies to spend money on page speed.


Yes, this would for sure be valuable.

Also, browsers could default to http://downforeveryoneorjustme.com/ for sites that loaded too slowly. AdBlock Plus and NoScript speed loads greatly. Maybe browsers could do triage on sites that load too slowly. Perhaps switch to reader mode or whatever.


Something that would be more effective is if Google began factoring page load speeds in their rankings just like they did for mobile optimization.


Why would that matter? If the website has the content I need it doesn't matter that this other site that doesn't match my query as well but loads faster is now the first result.


They do


I think they already do it.


The Disconnect Me (https://disconnect.me) add-on is doing something sort of similar. It will try blocking the useless external content (tracking, etc.) and will show you how much your loading time decreased and how many KB you saved by not loading it.


In order to load much lighter web pages, just use Opera Mini or http://gdriv.es/l/ [append url]


People have an innate metric for "this thing is slow", it's when lots of time passes between its start and end. Why is color-coding more evident than our own perception of time?


I think his key phrase is "expected to load," which means the color coding would provide an indicator before the user clicks on the links. Therefore, the user skips the unnecessary pain of perceiving the slowness for himself.


Yes exactly.


The problem is the same as "Downforeveryoneorjustme". Just because a site is loading slow once or twice doesn't necessarily mean that it's the site's fault, it could be any number of issues. But by crowdsourcing this data the truth is revealed.
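
A rough sketch of how the crowdsourcing could work, assuming clients report load times in milliseconds and the score uses the median so that one user on a bad connection doesn't brand a site as slow. The thresholds and the blue-to-red mapping are invented for illustration.

```python
from statistics import median


def typical_load_ms(samples):
    """Median of reported load times; robust to a few wildly slow outliers."""
    return median(samples)


def url_color(load_ms, fast_ms=500.0, slow_ms=5000.0):
    """Map a load time onto a blue (fast) to red (slow) hex colour."""
    t = (load_ms - fast_ms) / (slow_ms - fast_ms)
    t = min(1.0, max(0.0, t))  # clamp anything outside the chosen range
    red = round(255 * t)
    blue = round(255 * (1 - t))
    return "#{:02x}00{:02x}".format(red, blue)


# Four normal reports and one from somebody on a terrible connection.
reports = [800, 900, 850, 30000, 820]
score = typical_load_ms(reports)
color = url_color(score)
```

With a mean, the single 30-second report would dominate the score; with the median it is simply ignored.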


I thought Google already punishes slow loading websites.


When Google does this they are explicitly making a trade off between relevance and speed on behalf of the user. I'd rather leave that choice to the user, especially when the user can infer that this is the correct destination from the search result text.

Also, it doesn't work other places.


You don't even need to do a complicated extension mechanism.

Users already respond to page load time. There's extensive evidence to support this.


Before everyone jumps onto the JQuery/Bootstrap/etc sucks bandwagon, just a reminder that the minified jquery from cdnjs is 84.1kb. Bootstrap is 43.1kb.

If you want your page to load fast, the overall "size" of the page shouldn't be at the top of your list of concerns. Try reducing the # of requests, first. Combine and minify your javascript, use image sprites, etc.
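
To illustrate why request count can matter more than raw size, here is a deliberately crude model. It assumes HTTP/1.1-style loading where each batch of parallel requests costs one round trip, plus a fixed bandwidth; every number below is made up.

```python
import math


def estimated_load_seconds(total_bytes, requests, rtt_s=0.1,
                           bandwidth_bytes_per_s=1_000_000,
                           parallel_connections=6):
    """Very rough estimate: latency cost per batch of requests plus transfer time."""
    round_trips = math.ceil(requests / parallel_connections)
    return round_trips * rtt_s + total_bytes / bandwidth_bytes_per_s


# Same 500 KB of assets, split into 60 files versus combined into 6.
many_small_files = estimated_load_seconds(500_000, requests=60)
few_combined_files = estimated_load_seconds(500_000, requests=6)
```

Under these assumptions the page size is identical but the 60-request version spends most of its time waiting on round trips. Real browsers, caches and connection reuse make the picture messier than this.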


That's still 84 KB of highly compressed javascript code to parse and execute. Even on a 4.4 GHz CPU core and a cached request, jquery takes upwards of 22ms to parse - that's just parsing - not even executing anything! Now add a bunch of other frameworks and utility scripts and your 100ms budget for feeling "fast" is gone before you even reach the HTML content.


How long does it take you to start Word versus loading Google docs?

Has everyone gone insane? The web is absolutely incredible and JavaScript is absolutely killing it as a portable application language. Remember the dark days of flash? Stop complaining.


This is disingenuous - he's obviously referring to image and text articles, not complicated web apps. Obviously such a website is going to take longer to load, just as complex native applications take a while to load. But to claim that because Word takes a while that it's okay that A WEBSITE WITH LITERALLY JUST TEXT AND IMAGES is 18 megs big and takes 6000 ms to load is fucking stupid.


Also, Word takes a while to load, but then works fast and is responsive - as well-written native apps are by default - and doesn't suck up your system resources. Something which can't be said about many websites "with literally just text and images".


You're comparing a well-written native app with a poorly written website. I'm currently working on a web app that requires a roughly 600kb (minified, yes I know) app.js file.

It takes a while to load (not really, 100ms + download), but after that it is silky smooth due to client side rendering (faster than downloading more html) and caching.

On the most demanding page, heap usage is a little more than 20mb.

Sure, there are a lot of websites which are slow and huge memory hogs. But that goes for many native apps as well.


Come on, you know that's the exception and not the rule. No one should use a revenue generating site as a basis for how the web works. Unless they want to compare it to a similar native product.


Yes, I remember flash, when it was trivial to save things I found with a couple of clicks. The typical webapp is ephemeral -- no way to save it, no way to run it a few years from now after the site it was hosted on goes bye-bye. The typical webapp is terrible at interoperability -- getting it to work with native libraries is pretty much impossible unless you've written your native code in one of the few languages that have transpilers available. The typical webapp is terrible at multithreading -- webworkers are a horrible hack, with no way to send/receive objects between worker threads other than routing everything via the main thread. When you start wanting to do anything interesting, the web's a set of blinders and shackles, keeping you from using resources.


Um, Word has pretty much always started in a second or two even on computers from the year 2000. You're picking on the wrong app there: Microsoft was notorious for caring about startup time of their Office suite and going to tremendous lengths to improve it.

Meanwhile, how fast Google Docs loads depends entirely on the speed of my internet connection at the time. Good luck even opening it at all if your connection is crappy, flaky, if any of the ISPs between you and Google have congestion issues, or if there's a transient latency problem in one of the dozens of server pools that makes up an app like Docs.


Vim/LaTeX is way faster though. And there's no way to make that in any clean way with the web.


That's a fair point, though my threshold for fast is probably a bit higher than 100ms. That might be a conditioned thing, though.



I see lots of websites that don't even cache static resources and use > 10 scripts and stylesheets, from a lot of different domains. Terrible.


And link the non-minified versions.

(But the different domains actually help speed up loading over HTTP because they help parallelise transfers.)


Domain sharding almost never makes any sense, especially on mobile, and with HTTP2 it's working against you.

http://www.mobify.com/blog/domain-sharding-bad-news-mobile-p...


or use HTTP2 which should make the number of requests largely irrelevant.


HTTP/2 helps with that but the total size still matters. This is particularly relevant for resources like CSS which block rendering – even with HTTP/2 making it less important whether that's one big resource or a dozen small ones, the page won't render until it's all been transferred.

https://github.com/filamentgroup/loadCSS#recommended-usage-p... has a rather nice way to load CSS asynchronously in browsers which support rel=preload.


Not to mention that using the latest stable jQuery/Bootstrap from CDN means it's likely to be cached before a user visits your site.


Quite happy with my own web page/blog. Pages hover at around 10kb, 30kb if I include some images. I think the page size can be attributed a lot to there being no JS except for GA.

I have taken a lot of inspiration from http://motherfuckingwebsite.com/ and http://bettermotherfuckingwebsite.com/

Of course the size will differ depending on the site's purpose, but I feel like most web pages could stand to lose a lot of weight.

EDIT: I have a guide to set up a similar blog/site here[0]

0: https://hugotunius.se/2016/01/10/the-one-cent-blog.html


That loaded SOOO fast on mobile with no delays or scrolling problems. Great job!

Note: Might try it with NoScript later. Always a good indicator of... something...


Fun story about NoScript. I used to use AMP[0]. But I reverted that for two reasons.

1. The size of their js is ~170kb roughly 17 times larger than most of my pages.

2. It loads from a 3rd-party domain, which means that NoScript/uBlock Origin users will see a blank white page.

The project does impose some constraints that will help reduce page weight and increase page speed, but for my page sizes it's ridiculous to use their JS.

0: https://www.ampproject.org/


The site certainly loaded fast. Never heard of them before. Thanks for the link.


>1. The size of their js is ~170kb roughly 17 times larger than most of my pages.

Presumably this would be heavily cached or baked into the browser, though -- assuming AMP takes off.


"You're a fucking moron if you use default browser styles." - Eleanor Roosevelt


I find it hard to take advice from anyone who makes such strong claims, in such a provocative manner, whilst failing so utterly in their own demonstration: https://imagebin.ca/v/2eeSxQABPMX9

Apparently that's "legible" and "looks the same in all... browsers".

I certainly agree that BMFW has less contrast than the MFW; so little that it's difficult to read!

For comparison, here's how MFW looks: https://imagebin.ca/v/2eeVa5QksF6e

White text on a black background, exactly as I asked for when I configured Firefox. Content taking up my whole widescreen monitor, exactly as I asked for when I made the Firefox window that size.

The only thing BMFW seems to have correct is using sans-serif font; which I expect is because I unticked Firefox's "allow pages to override these fonts" option. Other than that, it looks like a me-too cargo-cult of MFW which completely misses the point.

Presumably the creators of BMFW are using, and only ever test anything with, a black-on-white style which is, let me guess, the browser's default? What would that make them, in their own (jokingly "quoted") words?


I'm a fan of white-on-black too, but whatever setting/addon you are using to achieve that effect seems to be very broken. It's unfair to blame BMFW for a problem created by, and unique to, your specific configuration.


The setting isn't broken. It's not an add-on. It's BMFW's fault, and it's fair to blame it for this. BMFW sets a foreground color, and does not set a background color.

Here are the settings in stock Firefox. http://imgur.com/h4SmKYs
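
To put a number on why foreground-only styling breaks here, this is a sketch of the WCAG 2.0 contrast ratio calculation. The dark grey `#444444` is only an assumed stand-in for a site-chosen text colour; the actual value BMFW sets isn't quoted in this thread.

```python
def _linear(channel_8bit):
    """Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.0 formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical colours) to 21 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


site_text = "#444444"  # assumed dark grey chosen by the site
on_white_default = contrast_ratio(site_text, "#ffffff")
on_black_default = contrast_ratio(site_text, "#000000")
```

The same text colour clears the usual 4.5:1 guideline against a white default background and falls well short of it against a black one, which is why a site has to set both colours or neither.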


The author even addresses the fact that they didn't set the background on the site.

"I would've even made this site's background a nice #EEEEEE if I wasn't so focused on keeping declarations to a lean 7 fucking lines."

Also the whole site is satire which you seemed to have missed


> The author even addresses the fact that they didn't set the background on the site.

So? JS template engines allow server-side rendering, to address the fact their pages are slow, and unusable without JS. JS libraries have polyfills, to address browsers implementing older JS versions. Drupal allows Varnish, to address how slow its rendering is. SVG graphics libraries have canvas fallbacks to address IE. Video players have MP4 and Webm versions to address fragmented codec support. And so on.

Do these things make the MFW argument wrong or moot? No, because a) they involve extra work to solve problems that wouldn't exist if people just made a motherfucking website, and b) because many devs don't even bother to implement these fixes even when they exist: case in point, BMFW's authors wrote about setting the background but didn't bother actually doing it, so their site is broken.

> Also the whole site is satire which you seemed to have missed

I get that it's "haha only serious", but it's self-defeating. If the premise were "I built a Web-scale Uber-for-toilets with isomorphic React" and the text were unreadable, that would add to the charm. If the premise were "Stop breaking your sites with junk" (which is the message of MFW) and the text were unreadable, that would be unfortunate and a bit ironic. Yet BMFW's premise is "MFW is right, but there's no excuse to leave out these things", and those things break sites. The site itself is a demonstration of why the only point it makes is wrong. It's funny, but in a "laughing at" rather than a "laughing with" kind of way.


> the whole site is satire which you seemed to have missed

You seem to have missed it too :-)

> I have taken a lot of inspiration from … http://bettermotherfuckingwebsite.com/


I don't get the point of those settings. You seem to prefer white-on-black text, but the "Only with High Contrast themes" override means that it will almost never apply.

I tried setting this in my Firefox and visited a bunch of sites. None of them showed me white text on black background, because almost every site sets both background and foreground color styles, which override the preference setting. In fact the only site that was affected was the BMFW.

So what is the point of that setting? If you actually prefer white on black, set it to "Always" and you'll always get it--even on the BMFW.


> I don't get the point of those settings. You seem to prefer white-on-black text, but the "Only with High Contrast themes" override means that it will almost never apply.

The idea of defaults is to be default. I don't mind sites specifying the use of particular colours, and it's useful e.g. for syntax highlighting, for text above background images, for transparent images assuming a particular background colour (e.g. pre-rendered LaTeX images, like those on Wikipedia), etc.

The problem is setting either the foreground or the background, but not both.

> I tried setting this in my Firefox and visited a bunch of sites. None of them showed me white text on black background, because almost every site sets both background and foreground color styles, which override the preference setting. In fact the only site that was affected was the BMFW.

If only that were true. Many sites set either just the foreground colour, assuming the background will be white; or just the background, assuming the foreground will be black. I've emailed many sites about this over the years, ended up opening countless others in a separate browser with different settings, and just rage-quitted many more.

Last week I emailed ticketmaster.es about the black-on-black text in their registration form (amongst many other issues; their service is terrible, I recommend avoiding them if possible)

This is also the reason I stopped using Yahoo Mail back in 2007 when they switched off access to their old Web UI (with no option to use IMAP or POP without paying) http://2.bp.blogspot.com/_6BhjMzysLTs/Rg1mh0pLjMI/AAAAAAAAAA...

I've even tried fixing pages with custom Javascript http://chriswarbo.net/blog/2015-10-01-web_colours.html with mixed success. I also have a key bound to `xcalib -invert -alter` so I can quickly invert my screen colours.

The MFW concludes with the following:

> What I'm saying is that all the problems we have with websites are ones we create ourselves. Websites aren't broken by default, they are functional, high-performing, and accessible. You break them. You son-of-a-bitch.

In their attempts to be "better", BMFW's authors broke it. Sure, it can be argued that light-on-dark default colours are an edge case; that the authors say they were "going to" add a background colour but didn't; etc. and those are all perfectly reasonable arguments. But those are exactly the arguments MFW and BMFW are disagreeing with, only with "light-on-dark default colours" instead of "disabled Javascript" or "out-of-date Android browser" or "retina display", etc.; or with "background colour" instead of "server-side rendering", "error handler", "CDN", etc.


So basically you reversed the defaults in order to make it easier to find sites with partial style declarations, so you can yell at them.

Seems like a big waste of time to me.


> So basically you reversed the defaults in order to make it easier to find sites with partial style declarations, so you can yell at them.

Nope, it's only the especially egregious that get yelled at. In the case of ticketmaster.es, their purchase form looks like this with the default black-on-white colour settings https://imagebin.ca/v/2ekv8S6pbKls

It looks like they're forcing the foreground colour of native text input fields to be black, and the background of native drop-down lists to be white. I refuse to change my entire GTK+ theme (and hence, every application I use on a machine which I spend the majority of every day staring at) to an eye-straining black-on-white colour scheme just to make up for some Web devs going out of their way to break their own sites.

Plus, in the case of ticketmaster, their form handler mangled my input; their "change details" form didn't work; they provide no contact details other than accounts on social media sites which require signing up to; their entire "help" section is an "ask us a question" form which displays all submissions publicly; and so on. Besides, it's not "yelling", it's bug reporting; they're free to ignore it.


It's not for everyone.


"I messed up my browser's settings, and now I'm upset that my browser looks messed up."


> "I messed up my browser's settings" ...

Wrong. I changed my browser's settings to perfectly reasonable values. White text on a black background. Now I'm reasonably upset because my browser looks messed up.


Plus one for you. It used to be the internet was a book, now it's a movie.


I think the case of the WWW and the Internet is one of the biggest collective misunderstandings and tragic failures in the history of technology (and maybe humanity). If there were a study of the average ratio of information in kilobytes to page size in kilobytes, the result would make us cry. I have a Notes.org file that holds most of my notes, a list of most movies I have watched and will watch, most books I have read and will read, most places I have seen and will see... basically my life, and it's about 200kb, of which most probably nearly 100% is information, whereas most multi-meg webpages have two paragraphs' worth of information on them.
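
A crude way to measure that ratio for a single HTML document, as a sketch: treat the visible text (everything outside script and style blocks) as the "information" and divide by the document size. This is only a proxy, and it ignores images and externally loaded scripts, which would make real pages look even worse.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def information_ratio(html):
    """Bytes of visible text divided by bytes of HTML, between 0 and 1."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.chunks).split())  # collapse whitespace
    total = len(html.encode("utf-8"))
    return len(text.encode("utf-8")) / total if total else 0.0


plain_page = "<p>hello world</p>"
bloated_page = "<script>" + "x" * 1000 + "</script><p>hello world</p>"
plain_ratio = information_ratio(plain_page)
bloated_ratio = information_ratio(bloated_page)
```

Both pages carry the same two words, but the second one buries them under a kilobyte of inline script, so its ratio collapses.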


http://i.imgur.com/wB5B35g.png

My user experience was great, I have to agree.


Great writeup! I might try this out later and see how it fares. I haven't used the Jekyll + GitHub combo yet but I have heard great things.


Thanks, glad you liked it :) If you do give it a try and run into problems tweet me at @k0nserv.


I have begun a 1 KB framework based on this website. If anyone is interested, I just pushed the code: https://github.com/BafS/mu


There has long been http://www.1kbgrid.com



Interesting comparison, if a bit arbitrary. It raises a couple of questions though.

1) How do the numbers come out when you exclude images?

It's valid and good to know the total sizes, including images, but that can hide huge discrepancies in the experienced performance of a site.

For example, a page with 150KB of HTML/CSS/JS and a single 2.1MB hero image can feel very different from a page with 2MB of HTML/CSS/JS and a few 50KB images.

If we're just interested in total bandwidth consumption, then sure, total size is a good metric. If we're interested in how a user experiences the web, there's a lot of variability and nuance buried in that.

2) What device and methodology were used to take the measurements?

In this age of responsive design, CSS media queries, and infinite scrolling/deferred loading, it really matters how you measure and what you use to measure.

For example, if I load a page on my large retina screen and scroll to the bottom, many sites will send far more data than if I load them on my phone and don't scroll past the fold.

I only skimmed the article and didn't dig in to the references. These questions may be answered elsewhere.


The speed index was invented for benchmarking this: https://sites.google.com/a/webpagetest.org/docs/using-webpag...


Ah, very cool, thanks for sharing. I'm familiar with a lot of the tools for evaluating the performance of a single site (e.g., that I'm developing), but I'm pretty ignorant of the standard approaches for these types of larger scale benchmarking projects.


If you're developing you may not be aware of Page Speed Insights, extremely handy for SEO. https://developers.google.com/speed/pagespeed/insights/


I actually wasn't aware that the site could also perform the test, but the Page Speed Insights Chrome extension is one of my go-to performance tools. The output looks the same, though formatted a little better in some places. The extension is nice though, because it lets you check private sites.


Lots of people are focusing on excessive JavaScript and CSS but these combined are easily dwarfed by a single high quality image.

Try visiting Apple's website for example. I can't see how you can have a small page weight if your page includes several images that are meant to look good on high quality screens. You're not going to convince marketing and page designers to go with imageless pages.

Doom's original resolution was 320x200 = 64K pixels in 8-bit colour mode. Even an Apple Watch has 92K pixels and 24-bit colour (three times more space per pixel) now, and a 15" MacBook display shows 5.2M pixels. The space used for high quality images on newer displays is orders of magnitude higher than what Doom hardware had to show.
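
To put rough numbers on that comparison, here is a small sketch of the uncompressed size of one full-screen frame on each display. The Apple Watch (272x340, the 38 mm model) and 15" MacBook (2880x1800) resolutions are assumptions chosen to match the pixel counts quoted above; the comment itself does not state them.

```javascript
// Uncompressed size of one full-screen frame: width * height * bytes per pixel.
function frameBytes(width, height, bitsPerPixel) {
  return width * height * (bitsPerPixel / 8);
}

// Resolutions other than Doom's are assumptions, see the note above.
const displays = [
  { name: "Doom (320x200, 8-bit)", width: 320, height: 200, bpp: 8 },
  { name: "Apple Watch 38 mm (assumed)", width: 272, height: 340, bpp: 24 },
  { name: "15-inch MacBook Retina (assumed)", width: 2880, height: 1800, bpp: 24 },
];

const doomBytes = frameBytes(320, 200, 8);

const report = displays.map(function (d) {
  const bytes = frameBytes(d.width, d.height, d.bpp);
  return {
    name: d.name,
    pixels: d.width * d.height,
    bytes: bytes,
    timesDoom: bytes / doomBytes, // how many Doom frames fit in one frame
  };
});

report.forEach(function (r) {
  console.log(r.name + ": " + r.pixels + " px, " + r.bytes + " bytes, " +
    r.timesDoom.toFixed(1) + "x Doom");
});
```

Under these assumptions a single uncompressed MacBook frame is 243 times the size of a Doom frame, which is why image weight scales so quickly with display quality.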


> Try visiting Apple's website for example.

Indeed, right now on mobile the biggest asset on apple.com is a 1.7 MB picture.

http://images.apple.com/v/home/cm/images/heros/environment_e...

The total size of the webpage being 2.5 MB.


Images can load progressively, so pure HTML + one large image absolutely can appear faster to the user. The JS and CSS to load a SPA won't. Page weight is a good rule of thumb, but it isn't the be all end all of a good experience.


That's why everybody should optimize their images, e.g. by using Kraken.io or converting to WebP.


It is hardly surprising, considering that a single picture taken with an average smartphone is probably already surpassing that by quite a bit.

Times change, and 20 years in tech is equivalent to several geological ages.

If anything, it cannot really be overstated how well some developers were able to craft such compelling gaming experiences with the limited resources available at the time.

My personal favorite as "most impressive game for its size":

https://en.wikipedia.org/wiki/Frontier:_Elite_II


Maciej Cegłowski has a great talk/writeup on this very problem:

The Website Obesity Crisis

http://idlewords.com/talks/website_obesity.htm

Here’s the video of the talk if you prefer to hear him speak: https://vimeo.com/147806338


Oh God.

Every discussion about the web will continue to be a mess until we clarify what we're talking about.

Let's try rephrasing the title a couple times.

Rephrase 1: "The average size of a webapp is now the average size of a Doom install".

Response: Interesting, but not bad! Heck, some webapps are games. "The average size of a web game is now the average size of Doom" isn't a sentence that damns the web, it's a sentence that compliments the web! (or would if it was true, and it might be for all I know)

Rephrase 2: "The average size of a web document is now the average size of a Doom install".

Response: Well this sucks (or would if it was true -- still we don't know). Simple documents should be a few KB, not the size of a game.

Basically our terminology is shot to crap. Imagine if 19th century engineers used the same word for "hand crank" and "steam engine". "Hand crank prices are skyrocketing! What's causing this massive bloat!" Whelp, that could mean anything.

The best solution: web browsers should enforce a clear distinction between "web documents" and "web apps". These are two different things and should be treated separately. This won't happen though, which leaves us (the rest of the tech community) to explore other options . . .


Web apps/documents both look pretty good compared to modern games, those things are 100s of MB at least.


In the late 00s I remember turning on an old computer with a 650 MHz Athlon CPU and being surprised that web browsing performance in Firefox wasn't bad. Now if I try that with a 1 GHz Pentium 3, performance is absolutely horrible. Is this why?


Basically -- ultimately, it's the rise of javascript for third-party content (ads, trackers, APIs, add-ons, etc etc) that drives both page bloat and poor rendering performance.


This, and Javascript, and browser bloat.


Unless the browser ends up swapping a lot I doubt that browser bloat would cause significant perceptible performance regressions compared to the quite significant amount of optimization that has happened in the meanwhile.


The average data plan here is 10GB :

1,000,000 * 10 / 2250 = 4444 web pages a month

4444 / 31 = 143 web pages a day at most on mobile.
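
The arithmetic above can be reproduced with a short sketch. It assumes decimal units (10 GB = 10,000,000 KB) and the 2250 KB average page weight used in the comment; the function names are made up for illustration.

```javascript
// Whole pages that fit in a monthly data plan.
// Assumes decimal units: 1 GB = 1,000,000 KB.
function pagesPerMonth(planGB, avgPageKB) {
  return Math.floor((planGB * 1000000) / avgPageKB);
}

// Whole pages per day, spreading the monthly budget evenly.
function pagesPerDay(planGB, avgPageKB, daysInMonth) {
  return Math.floor(pagesPerMonth(planGB, avgPageKB) / daysInMonth);
}

console.log(pagesPerMonth(10, 2250));   // 4444
console.log(pagesPerDay(10, 2250, 31)); // 143
```

Halving the average page weight doubles both figures, so the budget is directly sensitive to page bloat.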

While it is somewhat acceptable, I don't see data plans getting cheaper, yet the size of the average webpage is rising fast.

It doesn't seem like most websites have heavily invested in using HTML5 offline capabilities or actual mobile first design either, something easy to check with chrome dev tools.

Also, let's talk about ads. Polygon.com, a site I visit often, first article on the homepage with an iPhone 5:

- with ads/trackers: 1.5 MB

- without ads: 623 KB

More than half of the load is ad/tracking related. This isn't normal.


With the majority of users moving towards mobile, I really think this is an issue, and I've been consciously building projects as lean as possible. Removing bloated jQuery libraries was a big one. With native calls like document.querySelectorAll and document.querySelector, I've found I can get by without it 90% of the time. For the rest, I use something like vue.js, which takes care of all the DOM manipulation, data binding, etc.
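
As a rough illustration of the native calls mentioned above, here is a minimal sketch of a jQuery-style "add a class to everything matching a selector" written against the plain DOM API. The helper name addClassToAll and the selector in the usage comment are hypothetical; `root` is normally `document`.

```javascript
// Add a class to every element matching a CSS selector, using only
// native DOM APIs. `root` is any object that offers querySelectorAll
// (normally `document`, or an element to search within).
function addClassToAll(root, selector, className) {
  const matches = root.querySelectorAll(selector);
  // The result is array-like, so a plain index loop works everywhere.
  for (let i = 0; i < matches.length; i++) {
    matches[i].classList.add(className);
  }
  return matches.length; // number of elements touched
}

// In a browser (selector is a placeholder):
// addClassToAll(document, ".menu li", "active");
```

The jQuery equivalent would be a one-liner, but this version ships no library at all.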


I lived in $nowheresville, TN for a bit. This drove me insane. Sure, your webapp works great in downtown SF, but try loading it with the spotty connection the rest of the country has, not to mention the rest of the world.

One of the places I worked at had a computer hooked up to a ~800 kbps modem, and would test all their web pages on that. It was really eye-opening, and I wish more companies would benchmark like that.


Chrome network tools has throttling that can simulate really shitty networks, I use that a fair bit.


I really wish more companies cared about their mobile presence. You don't need much, especially since native mobile widgets go so far towards making the web a nice thing. Also, for God's sake, allow zooming. I think of that as such a major plus for mobile and yet so many mobile sites disallow it.


How about not using js at all? I think that for the majority of pages it doesn’t add any value.


With the instant gratification generation, not having dynamic content is going to be a tough sell. I will however agree, there is a gross overuse of it.


The funny part about your reply is JS-heavy websites delay instant gratification on slow connections or machines. They can take 10-30 seconds then objects start jumping around as you try to click them.

Whereas a cached static page or a templated dynamic page loads pretty much instantly if it's mostly text content. In a way they understand if it's graphics.


Oh, you just want to add a class to the element? *adds whole jQuery* That's what's wrong with the web.

Oh, and you need a loop? *adds underscore.js*


Well, not that adding the whole of underscore.js is reasonable, but maybe if Javascript had provided proper enumeration of their data structures earlier like all decent programming languages do, that wouldn't be an issue.

To this day, you still can't enumerate an object literal, yielding each key/value pair. You still can't properly enumerate a NodeList and Object.values is still "experimental technology".

I understand that people just abuse libraries in and out, but come on, let's not forget the reason those humongous libraries exist in the first place (and why they're having success): it's because JS has long been crippled in terms of abstractions and people needed something to alleviate the pain.

EDIT #1: Regarding enumeration of object literals, it turns out it's actually possible, though Object.entries is still marked as "experimental" and returns an array of [key, value] pairs, as arrays, which doesn't exactly scream great design. I just wish I could do something like someObject.forEach(function(key, value) {}) (and same for map, reduce, filter, some, every, etc).

EDIT #2: The latter in my previous edit is actually possible if you're willing to create a Map from Object.entries, which you can call forEach on. Quite convoluted, but possible, so my bad.
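
A minimal sketch of the approach from EDIT #2, using hypothetical sample data. One detail to be aware of: Map's forEach passes (value, key) to the callback, the reverse of the (key, value) order wished for above.

```javascript
// Hypothetical sample data: asset sizes in KB.
const sizes = { html: 150, css: 80, js: 900 };

// Object.entries returns an array of [key, value] pairs...
const pairs = Object.entries(sizes);

// ...which the Map constructor accepts directly, giving us forEach.
// Note the callback order: (value, key), not (key, value).
const lines = [];
new Map(pairs).forEach(function (value, key) {
  lines.push(key + " = " + value);
});

console.log(lines.join("\n"));
```

The Map is only a stepping stone here; calling pairs.forEach with a destructured [key, value] parameter gives the same result without the extra object.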


Yeah, there will be Object.keys/Object.values/Object.entries and you can use them in a for...of loop.

for (let [key, value] of Object.entries({a: 1})) { console.log(key, value); }


var obj = {...}

// native JavaScript

Object.keys(obj).forEach(function(key){ console.log('obj.', key, ' = ', obj[key]) })

// Lodash

_.forEach(obj, function(value, key) { console.log('obj.', key, ' = ', value) })

Object.keys(obj) creates an array that contains the keys of the object, which makes it pretty trivial to call filter, map, reduce, etc. And getting the value from an object's key is also pretty straightforward...
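
For instance, a short sketch of filter, map and reduce over Object.keys, using a made-up object of page asset sizes:

```javascript
// Hypothetical page weights in KB, keyed by asset type.
const assetKB = { html: 40, css: 120, js: 900, images: 1100 };

const keys = Object.keys(assetKB);

// filter: which asset types weigh more than 100 KB?
const heavy = keys.filter(function (key) {
  return assetKB[key] > 100;
});

// map: build a readable label for each entry.
const labels = keys.map(function (key) {
  return key + ": " + assetKB[key] + " KB";
});

// reduce: total page weight.
const totalKB = keys.reduce(function (sum, key) {
  return sum + assetKB[key];
}, 0);

console.log(heavy, labels, totalKB);
```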


Ruby: obj.each{|k,v| ...}

Sure you can do it in JS, but calling Object.keys then forEach (in whose closure you'll still have to reference obj[key] by the way) still doesn't feel optimal.


In which case, if you really can't live without it, just attach it to the prototype.

Object.prototype.each = function(k,v){ //... }


I would argue that automatically adding methods to a user-created object isn't optimal either.
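
One way to get the Ruby-style call without touching Object.prototype is a standalone helper. This is only a sketch, and `each` is a hypothetical name, not a built-in:

```javascript
// A standalone helper instead of Object.prototype.each.
// Calls callback(key, value) for each own enumerable property.
function each(obj, callback) {
  Object.keys(obj).forEach(function (key) {
    callback(key, obj[key]);
  });
}

const seen = [];
each({ a: 1, b: 2 }, function (key, value) {
  seen.push(key + "=" + value);
});
console.log(seen.join(", "));

// By contrast, a method assigned directly onto Object.prototype is
// enumerable, so it leaks into every for...in loop on every object:
//   Object.prototype.each = function () {};
//   for (var k in {}) console.log(k); // would print "each"
```

The helper costs one extra argument at each call site but leaves every other object in the program untouched.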


Element.classList doesn't work in IE9 or below.

Array.forEach doesn't work in IE8 or below.

Fetch API for ajax isn't in any version of IE or Safari, or any mobile browser apart from Chrome for Android.

And so on.

It's not so much that front end devs are lazy, but more a case that building a set of polyfills for each and every browser support problem is actually hard. "just use jQuery" gets around the problem. I think most developers want to write better code, but most projects don't have the development bandwidth for them to do things better.
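
To give a sense of what one such polyfill-style shim looks like, here is a minimal sketch that adds a class via classList where available and falls back to the className string elsewhere. The function name addClass is made up, the fallback only handles space-separated class names, and it is deliberately written in ES3-era syntax (var, no array methods) so that the fallback path could actually run on the old browsers it targets.

```javascript
// Add a class to an element, preferring element.classList and
// falling back to the className string when classList is missing.
function addClass(el, name) {
  if (el.classList) {
    el.classList.add(name);
    return;
  }
  // Pad with spaces so "nav" does not match inside "navbar".
  var padded = " " + el.className + " ";
  if (padded.indexOf(" " + name + " ") === -1) {
    el.className = el.className ? el.className + " " + name : name;
  }
}

// In a browser (element id is a placeholder):
// addClass(document.getElementById("menu"), "open");
```

Multiply this by every missing API and every supported browser, and the appeal of a single library that has already done the work becomes clear.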


It would help if the basic javascript APIs were even vaguely useful.

I recently tried to force myself not to use any libraries for a simple site. After a while I realised I had so much re-invention of stuff (AJAX in particular is madness without a library) that I ended up adding Zepto.

With all the crap they're adding in ES6 you would have hoped they would add an ajax function at least.


it's called the "fetch" API. Not sure if it's in ES6
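
For comparison with library AJAX helpers, a minimal sketch of a JSON GET built on fetch. The name getJSON and the URL in the usage comment are hypothetical, and the optional fetchImpl parameter exists only so the sketch can run where no global fetch is available.

```javascript
// Fetch a URL and parse the body as JSON.
// `fetchImpl` is optional; it defaults to the global fetch.
function getJSON(url, fetchImpl) {
  const doFetch = fetchImpl || fetch;
  return doFetch(url, { headers: { Accept: "application/json" } })
    .then(function (response) {
      // fetch only rejects on network failure, so HTTP error
      // statuses have to be checked by hand.
      if (!response.ok) {
        throw new Error("HTTP " + response.status + " for " + url);
      }
      return response.json();
    });
}

// In a browser (URL is a placeholder):
// getJSON("/api/items").then(function (items) { console.log(items); });
```

The explicit response.ok check is the part that tends to surprise people coming from library helpers, which usually treat 4xx and 5xx responses as failures automatically.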


ah awesome.

Now we just have to wait 5+years until it has 95% browser share :)


Neither XMLHttpRequest, nor the Fetch API mentioned in a sibling comment are part of the JavaScript language. They are both web platform specs maintained by the WHATWG. The JavaScript committee (TC39) does not control these platform APIs.


A lack of skills, a low barrier to entry and low budgets are causing this.

I hoped that with page performance becoming a ranking factor, this would change, if only a tiny bit. But I see very slow brand websites still ranked higher than the highly optimized indie website, so that didn't work.


The community is the worst. The "adds whole jQuery" habit comes from every single JS topic on SO in the past five+ years being answered by "just use jQuery". Many times the questions are not even web related at all.

> "How do you do something in javascript?"

> "With jQuery you do it like this..."

I had the worst time ever when I had to work with jscript. I really wonder if my dislike of the language comes from the language itself or the community around it.


A long time the web was like this:

In IE6, you do it like this.

In IE8, you do it like this.

In Firefox, you do it like this.

Then jQuery came along. And it was like 'Now you do it like this, and jQuery handles it for all browsers perfectly'.

Just because some tasks are now performed easily in native javascript on all browsers, doesn't mean it was always that way.


There are lots of legitimate reasons to not use jQuery. Not being in a browser might be one of them. So if someone asks about javascript, one should answer the question (and perhaps suggest that there are libraries such as jquery that handle it better).


SO feels more like a race to answer fast rather than to answer with quality. I think the best way to learn about the language is through the Mozilla Developer Network. I can't speak for everyone but I feel incredibly productive when programming in javascript.

I'd argue that jQuery was great, but now most (if not all) of it can be replaced by native javascript features (e.g. document.querySelector). Today I wouldn't recommend jQuery to anyone.

I suggest learning a functional language (in my case it was Haskell) in parallel, as it opens up new ways of thinking about javascript and problem solving.


correct! reminds me of: http://i.stack.imgur.com/ssRUr.gif

I think SO has gotten a little better about this, but not much, IME


Imagine how fast the web would be if browser manufacturers could "approve" frameworks, and store them as a "standard library" of sorts. A global cache. Then you have ONE copy of jQuery for all sites on your machine.

Wanting to use a framework shouldn't be discouraged. The issue here isn't developers trying to save time, the issue is the shitty tools.


Yeah, javascript historically lacks a stdlib and now it's so big that nobody could probably agree on one. Chalk another one for "the road to hell is paved with good intentions".


Look how long it took for browsers to kinda agree on how to interpret Javascript and HTML. Do you really want to add another way for browsers to behave differently?


This is basically what [Decentraleyes](https://github.com/Synzvato/decentraleyes) does. It helps especially on mobile, though it's not exactly life-changing IME.


I wish this could be possible.

Eventually you'll have to track every minor version of every library when not every site stays up to date. At that point it's (almost) effectively a CDN.


I totally agree - I love raw JS too, and I hate seeing underscore used instead of stdlib - but the stdlib and DOM API has been getting better very slowly:

- element.classList is relatively recent

- NodeLists still don't have .forEach()

- string.includes() and array.includes() were only added in ES6 and ES7

- As of Chrome 50 a few weeks ago you can now see the values inside FormData.
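
Until NodeLists get their own forEach, the usual workaround is to borrow the array method. A minimal sketch, with a hypothetical helper name and a stand-in array-like object so it can run outside a browser:

```javascript
// Iterate any array-like collection (NodeList, HTMLCollection,
// arguments) by borrowing Array.prototype.forEach.
function forEachNode(list, callback) {
  Array.prototype.forEach.call(list, callback);
}

// In a browser:
// forEachNode(document.querySelectorAll("img"), function (img) {
//   console.log(img.src);
// });

// Stand-in for a NodeList: indexed properties plus a length.
const fakeNodeList = { 0: "header", 1: "main", 2: "footer", length: 3 };

const visited = [];
forEachNode(fakeNodeList, function (node, index) {
  visited.push(index + ":" + node);
});
console.log(visited.join(", "));
```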


jQuery is not that large and very likely to be cached on a user's machine. So, it's IMO disingenuous to harp on it.

If you're playing with larger and less 'standard' libraries then that's a different story.


There are approximately 5 million different CDNs hosting different versions of jQuery. Cache hits are actually quite low.


I have a small website that uses jQuery, but I by default point to the Google-hosted version (with a fallback to my own site if that fails), but I also use an older version that's a little over half the size of the current version.

What I'd like to see, however, is a way of trimming down jQuery to only include the functions I actually use. I only use two functions out of the whole library IIRC, so I really don't need to load the whole thing. Maybe one day when I have some spare time and other projects out of the way I'll try stripping those functions out of jQuery and just hosting them locally.


Three popular ones is not "approximately 5 million": Google, cdnjs, MaxCDN.


even the sites using one of these 3 CDNs probably specify a specific minor version of jQuery from this list http://code.jquery.com/jquery/

(OK, so I doubt many sites are specifying the beta or uncompressed versions, but there's still a pretty good chance that somebody hasn't downloaded the relevant version from the relevant CDN recently enough for it to be in their cache, especially if they're browsing on their phone)


Source? And even then it's ~83KB vs the average page of ~2250K.


If instead of JavaScript a more fitting domain language was used for scripting web apps, this would be less of an issue.


I am not a JS dev, but... At this point there should be tree shaking / dead code removal for JS widely deployed. Why is it not? I know that the dynamic nature of JS causes some of it, but most code out there is not that dynamic. How good is Google's Closure compiler?


rollupjs.org as well as webpack2 are doing this. As you said, the dynamic nature will prevent this from being native. Using a build tool isn't terrible though.


These days, for simple interactions, youmightnotneedjquery.com is a good resource for those trying to curb that sort of behavior.

On the framework end of the spectrum, we have efforts like Mithril.js that try to provide a minimalist set of tools for more ambitious web applications.

So all is not lost :)


I don't think jQuery is the problem. jQuery is not 1.2 MB. It's not even 0.1 MB.


But shouldn't the browser have those libraries cached?


Given the number of combinations of CDNs and jQuery versions out there, I'm guessing the chances of already having the 'right' jQuery in cache the first time you visit a site are quite low. Would be interesting to see some numbers though.
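
The reasoning can be illustrated with a toy model. It assumes the browser cache is keyed by the full resource URL, so the same library from a different host or at a different version counts as a different entry. The three URLs follow the public CDNs' usual path layout but are only examples, and the model produces no real-world hit rates.

```javascript
// A toy HTTP cache keyed by full URL.
const cache = new Set();

function load(url) {
  const hit = cache.has(url);
  cache.add(url);
  return hit ? "hit" : "miss";
}

// Three sites that all "use jQuery", each in a slightly different way.
const siteA = "https://ajax.googleapis.com/ajax/libs/jquery/1.12.0/jquery.min.js";
const siteB = "https://cdnjs.cloudflare.com/ajax/libs/jquery/1.12.0/jquery.min.js";
const siteC = "https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js";

// Visit A, B, C, then a fourth site that matches A exactly.
const results = [siteA, siteB, siteC, siteA].map(load);
console.log(results.join(", ")); // miss, miss, miss, hit
```

Only the exact CDN-plus-version match is served from cache; a different host or a different minor version is a fresh download.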


I wanted to see how one of my personal projects compared, so I looked at Groove Basin.

Groove Basin [1] is an open source music player server. It has a sophisticated web-based client with the ability to retag, label, create playlists, stream, browse the library, chat with other users, upload new music, and import music by URL.

I just checked the payload size of a cold load, and it's 68 KB.

I'll just keep doing my thing over here.

[1]: https://github.com/andrewrk/groovebasin


Not too long ago Medium pushed an "invisible" 1MB image to clients

https://binarypassion.net/digital-decadence-6ea59251d64d

and the video it refers to https://vimeo.com/147806338


At my first job I once had to fix a WordPress site that was 'too slow', which loaded a 12MB 4k by 4k favicon. I kid you not. It also loaded jQuery 4 times and had about 20 unminified stylesheets linked in the header. When I told this to my manager he told me 'oh but you only need to load that stuff once and then it's cached, so that can't be an issue'. I did not last long in that place. The backend was even more of a wasteland.


A 16 megapixel favicon does sound scary.


If web bloat is a problem, I don't think that looking at whether <insert buzzword framework of CURRENT_YEAR> can be removed is the answer.

I suggest that at the moment, we have basically two camps of website, with rough, fuzzy boundaries.

1. A place where someone sticks up an insight, or posts a wiki page, or whatever, to share some thought to others (if anyone actually cares). The blogs of many users of HN. Hacker News itself. Wikipedia. The Arch Linux Wiki. lwn.net. Etc. The sites are very roughly concerned with 'this is what I care about, if you do, great, this is useful to you'.

2. Commercial web sites that employ sophisticated means to try and enlarge market share and retain users. AB testing. 'Seamless' experiences which are aimed at getting more views, with user experience as an afterthought (a sort of evolutionary pressure, but not the only one).

Complaining that camp #2 exists is strange. It's a bit like lamenting the fact that chocolate bars aren't just chocolate bars, they have flashy wrappers, clever ingredients, optimized sugar ratio, crunchy bit and non crunchy bit, etc.

It works! A snickers bar is a global blockbuster, and 'Tesco chocolate bar' is the functional chocolate bar that just does the job, but will never attain that level of commercial success, it serves a different role.

-----

My personal view:

Fundamentally what I want when we click a link from an aggregator, is an 'article.txt' with perhaps a relevant image or two. Something like http://motherfuckingwebsite.com/ maybe.

But if a site actually does that, a website like The Guardian, I'd fire up wget, strip all the advertising, strip the fact it's even The Guardian, and read it like a book. If everyone does it then no-one makes any money, site dies.

So what we actually have is this constant DRM-style race to try and fight for our brains to get us to look at adverts. It's not about jQuery, it's about advertising, branding, 'self vs other' (the integrity of a company as a coherent thing), etc.

I don't know what the answer is here. I think this is why I find concepts like UBI so appealing - I find it kind of alarming that we seem doomed to infect more and more of the commons with commercialization because we haven't found a solution to keep each other alive otherwise.


With stories like these, you have to know where to find the real problem. At least some percentage can be ignored whenever nerds use the word "bloat" because it's a code word for "the only things that should be allowed are things I approve" in a surprisingly large number of contexts.


I dig that you tie this domain-specific problem with that much larger social issue.


Thanks. I've been struggling a lot with life in general lately (depression, etc), and it's nice to have some glimpse that someone else shares the view, maybe.

I think we have a real issue at the moment with people funneling their efforts into ways of dealing with issues at the micro level, rather than the macro level. Local optima. Something like that.

It makes me happy to see the sort of political revolutions that the Internet can bring about, but sad to see that we're still thinking so small. e.g. "Get x into tech" rather than "Make it so that you don't need to be in tech to live a decent life". That sort of thing.


How about browser bloat? Each chromium tab on linux takes an extra ~50-150mb depending on the site -- and I still have no idea what they need all of that memory for...


Chromium is a memory hog, it is a well known fact. Use Otter Browser or QupZilla instead.


Mostly Javascript and caches.


I once bought a pre-made landing page template with all kinds of whizz bang Javascript libraries built in. The demo page was 4 MB. In the time it took to strip all the trash out of the template I could have designed the page myself. I'll never do that again.

I wonder how much of the problem is due to bloated templates.


Looks like most of the discussion here is on network traffic.

Minifying JS and CSS, compression, CDNs and caching won't keep your browser from having to render all the stuff.

---

The stewardess on a new jet airliner:

- Ladies and gentlemen, welcome aboard of our new airplane. On the second deck you'll find a couple of bars and a restaurant. The golf course is on the third deck. You're also welcome to visit the swimming pool on the fourth deck. Now - ladies and gentlemen - please fasten your seatbelts. With all this sh*t we'll try to take off.


Just to clarify, since I was confused (I remembered that Doom 2 was about 30 megs uncompressed, which websites are still a long ways from), this metric appears to refer to the compressed size of the Doom 1 shareware distribution.

http://www.doomarchive.com/ListFiles.asp?FolderId=216&Conten...


> Recall that Doom is a multi-level first person shooter that ships with an advanced 3D rendering engine and multiple levels, each comprised of maps, sprites and sound effects.

Doom isn't in true 3D, it's an advanced raycasting engine. The levels are all 2D, there are no polygons, you can't look up and down. Doom has been ported to a TI Calculator. Let's maintain some perspective here.


>Lets maintain some perspective here.

I see what you did here.


Visited a website a few days ago, which used 2048x1365 jpegs for 190x125 buttons. They had multiple buttons like this on multiple pages. I sent them an e-mail about this, but I don't expect them to fix it.
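
A quick sketch of how much is wasted per button, using the dimensions quoted above. It compares pixel counts only; the actual byte savings depend on JPEG compression and are not estimated here.

```javascript
// Dimensions quoted in the comment above.
const source = { width: 2048, height: 1365 };   // image as served
const displayed = { width: 190, height: 125 };  // size it is shown at

const sourcePixels = source.width * source.height;
const displayedPixels = displayed.width * displayed.height;

// How many times more pixels are downloaded than are ever shown.
const ratio = sourcePixels / displayedPixels;

console.log(sourcePixels + " px served for " + displayedPixels +
  " px shown, about " + Math.round(ratio) + "x too many");
```

Each button ships roughly 118 times as many pixels as the layout can display.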


Send “models” rather than code. Low-level code is relatively unexpressive, contains considerable redundancy, and as a result, is relatively large. By sending high-level models instead, which are then expanded on the client to working code, application download size can be greatly decreased. Models typically provide one to two orders of magnitude of compression over code.

This video shows how we do it: https://www.youtube.com/watch?v=S4LbUv5FsGQ

This document gives some results (like a GMail client that is 100X smaller): https://docs.google.com/a/google.com/document/d/1Kuw6_sMCKE7...


I remember thinking years ago that my CV in Word took up more memory than my first computer (Acorn Electron, 32 KB RAM). It amazes me that I used to play Elite on that machine.


The Doom install image was 35x the size of the Apollo guidance computer.

Thirty-five times! Apollo software got us to the moon. Doom wasted millions of man-hours on a video game.

My point of course is that these comparisons are not actually that illuminating.

Are web pages much heavier than they need to be? Yes. This presentation very capably talks about that problem:

http://idlewords.com/talks/website_obesity.htm

Does comparing web pages to Doom help understand or improve the situation? No, not any more than comparing Doom to Apollo memory size helps us understand the difference between a video game and a history-altering exploration.


> Are web pages much heavier than they need to be? Yes

What about the question "do web pages work any better than they did in 2007?", when we were using full page reloads and server-side logic instead of Javascript tricks?

I see so much basic brokenness on the web today, from the back button not working to horribly overloaded news websites with mystery-meat mobile navigation, that I find myself wondering what we have really achieved in the last 9 years. This stuff used to work.


I think you are looking at the past through incredibly rose-tinted glasses. The web has been a mess for a long time, and we used to have to make sure our computer was set to 800x600 and we were using Internet Explorer 6 in order to even use it.


I'm pretty sure I had > 1024x768 graphics and have used nothing but Netscape/Seamonkey (because inertia) since the mid '90s.


Sadly, a lot of laptops come with only 1366x768 displays, 20 years later. What a lack of progress.


Laptops sit in a different corner ( as in corner solutions ).


>make sure our computer was set to 800x600

Thaaaaaaaat's nonsense. I had relatively high-res CRTs (1600x1200) in the late 90s and early 2000s.

My father and I were able to get by with Netscape Navigator and Firefox for quite awhile as well.


Here's a message from 2003. (scroll up)

https://groups.google.com/forum/#!msg/uk.comp.os.linux/5c40N...

> http://www.argos.co.uk

> "Sorry, the Argos Internet site cannot currently be viewed using Netscape 6 or other browsers with the same rendering engine.

> In the meantime, please use a different web browser or call 0870 600 2020 to order items from the Argos catalogue."

> Sorry, I think I'll shop elsewhere until you get it fixed...

Argos was sniffing the useragent. I think people tried changing the useragent, and it worked fine.

This kind of thing wasn't rare, even in 2003.


>This kind of thing wasn't rare, even in 2003.

And it is not rare now either. Nothing has changed in that regard. Things have just gotten massively slower, use insane amounts of CPU, and are less functional.


Even the first NeXT computer in 1990 was 1120×832...albeit greyscale...still...800x600 died in the mid-90s - especially for professionals.


Firefox was released in November 2004.


The initial release of Phoenix was in 2002. Many long-time Netscape/Mozilla users (myself included) switched very early on.


I remember even relatively computer un-savvy folk in '98 having 1024x768 monitors.


I know that. The date range was not exclusive. The point was that high resolution CRTs date quite a bit back even for consumer use. We had no problems browsing the web.


That was 1997, not 2007.


IE 6 was released with Windows XP in 2001 and IE 7 didn't come out until the end of 2006.


Opera was awesome even in 2000.


2007 I was working for a shop that only officially supported IE6. It was a health insurance company too. Your premiums at work!


This website itself is not far behind, weighing in at 937kb and 57 requests: http://www.webpagetest.org/result/160422_KJ_18KN/1/details/


network latency still fails to catch up to human awareness' time resolution.

client-side logic, done right, is much improved over a server-side solution.


Which network and which latency? My local network runs audio way below the Haas limit - I can record over the network without incurring any latency penalty.


Comparing local-network performance with "random" cross-internet traffic IMHO isn't very useful, because there is a wide range for internet latency.

My wired desktop gets DNS responses from 8.8.8.8 nearly as if it were in my network, in way under 10 ms, ping responses in 2 ms or so. Accessing websites hosted in e.g. Korea takes >100 ms.

Add a congested wireless connection somewhere (WLAN or mobile network) and you can add another few hundred ms. And neither cross-continent nor congested wireless latency is going to go away.


Perhaps I should have been more explicit - comparing local audio traffic to local web traffic, there's a heck of a lot of difference. That would be the stacks.


[flagged]


English much?

I said audio. I provided this as a counterexample to the stated thesis of your post. There exist things that can be done over a network such that latency is not an issue. I am obviously not pulling data over a cross-continental link.

FWIW, the protocols I write at work can do a full data pull - a couple thousand elements and growing - in under a half second end to end. I don't know of any HTML/Web based protocols that can even get close to that over localhost.

So yeah - we know the Web is an utter pig. My point is that it probably doesn't have to be.


Reading comprehension much?

The article was specifically about web page payload size. My comment was comparing UX of dynamic client-side logic vs full round trips.

You must have replied to the wrong comment, I would hope.


> Apollo software got us to the moon. Doom wasted millions of man-hours on a video game.

Well, to be honest, Episode 1 and Episode 2 of Doom take place on Phobos and Deimos, so you could say Apollo software got us to the moon but Doom got us to Mars :)


Since 2.1 megs is only the compressed size of the shareware distribution[1], we are not going any further than Phobos. Since this is only for v1.0, it is going to be a buggy Phobos.

[1] http://www.doomarchive.com/ListFiles.asp?FolderId=216&Conten...


That link inspires me to print and bind the code of a webpage into a book, and put it on a shelf next to other literary masterpieces.

One Hundred Years of Solitude

The Count of Monte Cristo

Anna Karenina

Don Quixote

Amazon.com: Online Shopping for Electronics, Apparel, Computers, Books, DVDs & more


I want the audio version, narrated by Malcolm McDowell.


Less than head question mark, doctype html, greater than less than html class equals opening quote a dash no dash js closing quote. Data dash nineteen a x five a nine j f equals opening quote dingo closing quote greater than less than head greater than.


> The Doom install image was 35x the size of the Apollo guidance computer.

Keep in mind that the AGC was a necessary but not sufficient piece of hardware for navigating to the moon, and was extremely special-purpose. NASA had several big (for the time) mainframes that

1) calculated the approximation tables that were stored in AGC ROM (each mission required a new table because the relative positions of the earth, sun and moon were different)

2) reduced soundings from earth-based radars to periodically update the AGC's concept of its position.

3) other things that I've forgotten

In other words, the AGC required the assistance of a ground-based computer with dozens of megabytes of RAM and hundreds of megabytes of storage. That will fit on your phone quite easily, but let's not minimize the requirements for celestial navigation.


Apollo era space navigation is not that complex, mainly a matter of (i) pointing the ship in the right direction, (ii) firing the engines until a certain velocity change had happened, and (iii) assessing the result. (ii) in particular is a one-dimensional problem, and (iii) can be done by the guys on the ground via radar.
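
Step (ii) can be sketched as a one-dimensional loop: integrate measured acceleration until the accumulated velocity change reaches the target, then cut the engine. This is only an illustration; the function name, the fixed time step and the constant-thrust readings are assumptions, not anything taken from the AGC.

```python
def burn_until_delta_v(target_dv, accel_readings, dt):
    """Return (steps_burned, achieved_dv) for a simple 1-D burn.

    target_dv      -- desired velocity change in m/s (hypothetical value)
    accel_readings -- iterable of accelerometer samples in m/s^2
    dt             -- sample interval in seconds
    """
    achieved = 0.0
    steps = 0
    for accel in accel_readings:
        if achieved >= target_dv:
            break  # engine cut-off: the target velocity change is reached
        achieved += accel * dt  # integrate acceleration over one time step
        steps += 1
    return steps, achieved


# Example: constant 2 m/s^2 thrust sampled every 0.5 s, target 10 m/s.
steps, dv = burn_until_delta_v(10.0, [2.0] * 100, 0.5)
```

Assessing the result, step (iii), is deliberately left out, since the comment assigns that to ground radar.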

What the shuttle did was much more complex because it was an unstable aircraft that required many "frames per second" applied to the control surfaces to keep it stable during reentry and landing.

Back around 2006 or so I wrote a simple software 3-d rendering engine in Javascript that was 8k in size without much effort towards minimizing size other than being (a) maybe the only AJAX application that actually used XML (to represent 3d models) and (b) using XML element and attribute names that were just one character long.

Not long after that, libraries such as Prototype and JQuery were becoming popular and these were all many times bigger than my 3d engine before you even started coding the app.


To be fair, Doom wasn't a waste by any metric. Doom has had a history-altering impact on 3D engines. The huge advancements that came from the Doom and Quake engines have found their way into software that benefits society as a whole.


Doom was delivering an experience far more complex than the Apollo guidance computer was. The average webpage is not. It's delivering an experience as complex as a pamphlet with a few phone numbers on it.


The experience may not be complex, but the software that builds and displays the experience is way more complex than the Apollo guidance computer.


The software that builds and displays the experience is the browser. Most webpages could be cut by 90% and be visually and functionally indistinguishable. For 90% of the last 10%, the functionality cut out would improve the user experience, especially on mobile.


..and the server, protocols, and language used to build that web page. It's not like a powerpoint presentation.


Which only highlights the point of how overengineered it is. You shouldn't need (and in fact you don't need) several-layer tall stacks of programming languages and support equipment to render something that's less functional than a PowerPoint presentation.


Doom was created for entertainment. Web pages are (typically) created for entertainment. The guidance computer was created for calculating trajectories.

IMO, Doom and Web pages are remarkably close in terms of purpose and required assets, and the comparison is apt. Especially when you can play Doom on a web page...


Remember: It's not a waste if you (and millions of other people) have fun!


That's what NASA wants you to believe...


The Web is heavy because there's no negative feedback for the weight factor. And in people's minds, it nearly seems that the difference between a video game and history-altering exploration diminishes day by day.


Pretty sure he was just being funny about the doom bit.


that's because the moon landing was staged! wake up sheeple!!!!!!


Yes, it was recorded in a sound stage on Mars.


Honestly if a native Android or iOS app can be several GB in size then a webapp can be a few MB. That said there are lots of optimizations that we're missing out on. I hope http2 and other advances in server side render + tree shaking help reduce the size of payloads further.


>Apollo software got us to the moon. Doom wasted millions of man-hours on a video game.

How many millions of man-hours did the Apollo project waste on a PR stunt?



This argument proves too much. If we wanted better sneaker materials or retractable football stadium roofs surely we could have accomplished those tasks for far less money. The opportunity cost of the Apollo missions were enormous, even if there were positive outcomes that spun out of it.

Or to put it more elegantly: "Stop there! Your theory is confined to that which is seen; it takes no account of that which is not seen."


> The opportunity cost of the Apollo missions were enormous, even if there were positive outcomes that spun out of it.

True, but if you're talking opportunity costs then I see Apollo, and the space race in general, as a great success story. They took the political atmosphere of nationalism, paranoia, one-upmanship, costly signalling, etc. and funnelled some of it into exploration, science and engineering at otherwise unthinkable levels.

If it weren't for the space race, it's likely the majority of those resources would be poured into armaments, military-industrial churn, espionage, corruption/lobbying, (proxy) wars, etc.


Reached Moon vs Had planet-destroying nuclear war.

Sounds like a bargain to me.


>If we wanted better sneaker materials or retractable football stadium roofs surely we could have accomplished those tasks for far less money.

The problem with this type of attitude is that discovery doesn't work like this. Incremental improvements can sometimes work this way, but big discoveries do not. If there had been a mandate to "find a way to communicate without wires" I'm going to guess that it would not have gotten very far. Instead, this came about as a side effect of pure science research.


Going to the moon wasn't pure science research. Such research could have been an alternative use of those millions of man-hours of effort.

That said, I do take chriswarbo's point that it could have easily instead been even more baroque weapons or proxy wars, as well as yours and manaskarekar's about the uncertainty inherent in counterfactuals. I just wanted to make the point that just finding some positives is not enough; you need to look at opportunity costs. If we both look at them and come to different conclusions, that's life, but at least we agree on the basis of measurement.


I somewhat agree with what you're saying, but eventually it's hard to prove if one way benefits us more than the other, because the benefits of the Apollo program perhaps influenced the space endeavors that improve our life today.

It's definitely debatable and hard to gauge. I just thought I'd throw in the link to show the other side of the argument.


My rule of thumb is that if a goal is thought to be technically possible (although it may not have been done before) and there are people who can execute and have the resources to execute it to any degree, it will probably be done. Especially regardless of anyone else's opinion if they are not intimately involved in such processes.


Principal benefit from Apollo: a sense of shared purpose that does not depend on human savagery.


Even if you could argue that Apollo was nothing more than a "PR Stunt", what is the issue with it being that? From a nationalism perspective, there's a tremendous advantage to be gained from plugging an entire generation with patriotism from a moon landing.


And from an individual perspective, there is a lot of pleasure to be gained from playing Doom. :-)


> The top ten sites are significantly lighter than the rest (worth noting if you want to be a top website).

Wow. That's nice to see actually.


I suspect that the top ten sites are lighter because they must perform under load rather than being a top ten site because they load quickly, but it is nice to see. Imagine the billions of man hours that would be wasted by humanity waiting for the top ten sites to load.


I just watched https://www.youtube.com/watch?v=Q4dYwEyjZcY this video about the early HTML standardization process, and it seems to explain all the ills of HTML.

So indeed, there is a huge optimization opportunity of having a stricter error model.

Also, I'm really wondering how much battery could be saved when surfing such pages.

Also I'm sure there is a lot of potential going in the pre-parsed document model. But that's a next level kind of engineering I guess.


Yesterday I discovered that Twitter's HTTP headers alone are ~3500 bytes long (25 tweets!) with several long cookies, custom headers and the Content Security Policy[1] containing ~90 records. Is this considered normal nowadays?

[1] https://en.wikipedia.org/wiki/Content_Security_Policy
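
The "25 tweets" figure follows from the 140-character tweet limit of the time, assuming roughly one byte per character. A rough sketch of the arithmetic, plus a helper for estimating the HTTP/1.1 wire size of a header set (the sample headers are made up for illustration and are not Twitter's):

```python
TWEET_LIMIT = 140  # characters per tweet in 2016, assumed ~1 byte each


def header_bytes(headers):
    """Approximate HTTP/1.1 size of headers: 'Name: value\r\n' per field."""
    return sum(len(name) + len(": ") + len(value) + len("\r\n")
               for name, value in headers.items())


tweets_equivalent = 3500 / TWEET_LIMIT  # 25.0

# Hypothetical sample, not Twitter's real headers.
sample = {"Content-Type": "text/html", "Cache-Control": "no-cache"}
sample_size = header_bytes(sample)
```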


I'm genuinely excited by ensuring great response times and minimal load on a website.

Locally I see so many companies building good looking but horrendously optimized websites for their clientele who don't know enough to ask for it.

The last company I worked at was building a local search engine and was displaying thumbnails whilst loading full size pictures which were hot linked from businesses' websites. With an auto loading feature at the bottom of the page by the php backend, an initial 5-6 MB page load could turn into 30+ MB within a few seconds of scrolling. Add to this no gzipping, and caching was not properly configured either.

I tried my best to get some changes going but the senior (and only other) dev wouldn't allow any modifications to the current system "for the moment". It was a bit frustrating to see so many easy fixes ignored.
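
Gzipping is probably the easiest of those fixes to demonstrate. A minimal sketch using Python's standard `gzip` module on made-up, repetitive markup (real savings depend on the content, and already-compressed images gain little):

```python
import gzip

# Hypothetical, highly repetitive markup standing in for a search results page.
html = ("<div class='result'><a href='/biz/1'>Example business</a></div>\n" * 500).encode("utf-8")

compressed = gzip.compress(html)

original_size = len(html)
compressed_size = len(compressed)
ratio = compressed_size / original_size  # fraction of the original size sent
```

In practice this would be switched on in the web server configuration rather than in application code.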


So, where can I play Doom in the browser?


Here: http://playdosgamesonline.com/doom-ii-hell-on-earth.html (yes, it's Doom 2, and yes it requires a fast machine since it emulates a DOS PC in the browser)


I can't help but add that it's a DOS PC + a game that still loads faster than some web pages. At some point, even the apples to oranges comparisons start saying something because it's glaringly obvious.


Ok, awesome. Now I'm wondering, if we have PC DOS in a browser...

1) Can we say the claim (I think of mozilla's) that the web was going to be the new OS is true?

2) If DOS today, will it be 5 or 10 years before we get BSD or *nix in the browser?


The only real solution is a search engine that allows the end user to clip the results based on the maximum size of the total page. I've often wondered why Duck Duck Go doesn't do this as well as filter search results based on number of ad networks used, etc...


Enable Ghostery and load cnn.com and you will see why the web pages are so heavy these days.


disclaimer: my own blog - https://blog.thekyel.com/?anchor=Why_I_Block_Scripts_and_Ads

I kept looking for a "minimal" blogging platform, but they all had too much bloat/JS/etc. I guess minimal means different things to different people. I ended up just writing my own. The biggest post I have is 7.41 KB.

I used to be interested in front-end design, but since it's the industry standard to use $latest_framework, instead of tried and proven practices, I've given up on that idea.


One extreme end of the scale ;)

I feel like investing a few hundred extra bytes on some styling might be worth it, a la http://bettermotherfuckingwebsite.com/ vs http://motherfuckingwebsite.com/


I'm of the mindset that our browsers are built backwards. For "content-oriented" sites, we should present text, and give the users tools in their browsers to present them as the user sees fit. Instead, we expect the site to design it. Something like Firefox Reader View.

Some people may like bettermotherfuckingwebsite better, but I personally don't.

"presentation-oriented" sites are a different story, of course


Basic styles don't mess with Reader View or other customization, so I don't see a large downside to include them, they just make it nicer for people not using such tools.

Unless the point is to get people to use different tools?


> Unless the point is to get people to use different tools?

Kind of. I dream of an Internet where people have their own CSS/styles, and they can make it look how they want, rather than how the website wants it. I think of it like this: I'm a linux guy. I want my tools to output plainly to stdout. If I want to format them, I'll pipe them into my own formatting tools. I wish websites worked that way as well.


> The top ten sites are significantly lighter than the rest (worth noting if you want to be a top website)

Isn't that because the top websites have a lot more resources available to improve asset management, cleanup and refactoring?


This set of benchmarks is often helpful:

http://www.dynatrace.com/en/benchmarks/united-states/

Particularly the "last mile" or "Chrome Homepage" tabs.

They cover the top websites, grouped into categories like "Retail", "Travel", "Media", etc.

The disparity across competitors is pretty stunning, with some websites getting close to 1 second to download/render a page, and others taking 6, 7, 8, even 10 seconds. And these are all big, well-known companies.

And, for the most part, it all correlates very well to total page weight, and total number of artifacts on the page (js/css/images/etc). There are some exceptions, but it's a pretty strong correlation.


From this title; maybe hacker news needs a twitter?

Also: You can't use average page weight when you are just looking at the top ten. That downturn could represent a single website; all others could be increasing in size.
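
A quick illustration of that point, with invented page weights (in KB) for ten sites: one site shrinking dramatically pulls the mean down even though the other nine all grew.

```python
from statistics import mean

# Made-up page weights in KB for a top-ten list, before and after.
before = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 6000]
# Nine sites grow by 100 KB each; the single heavy site drops to 1000 KB.
after = [w + 100 for w in before[:-1]] + [1000]

mean_before = mean(before)
mean_after = mean(after)
sites_that_grew = sum(a > b for a, b in zip(after, before))
```

Here the average falls while `sites_that_grew` is nine out of ten, which is exactly the situation an average over a small group cannot rule out.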


This all comes down to cost. It is much cheaper to have "bloat" than it is to pay devs to fix it. And customers find it much cheaper to deal with "bloat" than to find smaller alternatives. Sure the average webpage is bigger than Doom, but the CPU in my phone is approximately 100x (times multicore too?) faster than the 486 that ran Doom.

Sure, if man hours were free, we could trim it all down to (my rough guess) about 1/10th the size. But at $100 or even $10 an hour it's just not worth it. Pay the GBs to your carrier, spend $50 more on a better phone.


It would cost no extra time if devs made websites from scratch with performance in mind. Save optimized images, minify code as a build task, etc.


Wondering if an idea like this would work:

Bundlers like Webpack already import JS in a modular structure. I'm wondering if we could do some profiling into popular npm module combinations (I know many people using React + Lodash + Redux Router, etc), bundle them up, and have Webpack load in those combos from a CDN via <script>?

Now this would probably require some work on webpack's end (the __webpack_require__(n) would have to be some sort of consistent hash), but at least everyone who blindly require('lodash') will see an improvement?
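
The "consistent hash" idea could look roughly like this: derive the module ID from the package name and exact version, so that independent builds agree on the ID (and hence on a shared CDN URL). This is a sketch of the concept only; the ID length, the URL layout, the `cdn.example.com` host and the version numbers are assumptions, and this is not how Webpack actually assigns module IDs.

```python
import hashlib


def module_id(name, version):
    """Deterministic ID for a package at an exact version."""
    digest = hashlib.sha256(f"{name}@{version}".encode("utf-8")).hexdigest()
    return digest[:12]  # shortened for readability; the length is arbitrary


def bundle_url(packages):
    """CDN URL for a combination of (name, version) pairs, order-independent."""
    ids = sorted(module_id(n, v) for n, v in packages)
    combo = hashlib.sha256("+".join(ids).encode("utf-8")).hexdigest()[:16]
    return f"https://cdn.example.com/bundles/{combo}.js"  # hypothetical host


# Two sites listing the same packages in a different order get the same URL.
site_a = bundle_url([("react", "15.0.1"), ("lodash", "4.11.1")])
site_b = bundle_url([("lodash", "4.11.1"), ("react", "15.0.1")])
```

The catch is visible in `module_id`: any difference in version yields a different ID, so the shared bundle only helps sites that pin exactly the same versions.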


Web pages shouldn't need JS at all, except for the basic eye candy it was originally used for, and most importantly shouldn't break without it. A web page that doesn't work without JS is broken.

Web apps are another deal though.


and the .kkrieger beta only uses 96k! https://en.wikipedia.org/wiki/.kkrieger


It's also about 2 times bigger than a lot of SNES and Mega Drive games. Or about 4 times bigger than Super Mario World (512KB).

As for why it's getting so insane, probably either:

1. Frameworks, since most people don't remove the code they're not using. For Bootstrap or Foundation, that can be a lot of extra code.

2. Content Management Systems, since stuff like WordPress, Drupal, Joomla, any forum or social network script, tend to add a lot of extra code (more so if you've added plugins).

3. The aforementioned tracking codes, ads, etc.



The only conclusion I can legitimately draw from this article is that in twenty years a single web page will be larger than the 65GB Grand Theft Auto V install.


Arguably sites have been increasing in size for one simple reason: It directly results in increased sales.

Everything is sales.

If cleaner, 'purer' sites made more money you bet the average web page would be 10kb.

It's all about what translates to more sales. As such, you won't ever see a return to more traditional websites. Look at Amazon with its virtual dress models, heavy as hell, but they most certainly land more sales.


Quite the opposite. It has been shown that longer page load times mean lower conversion.


Amazon isn't what I'd call "light" at 4.7 MB, but looking at my company's market all the bigger players are way lighter than us.


What happened to semantic markups? In the name of rendering optimizations, many web sites use css background image instead of <img>


This page clocks in at 935kb in my browser. According to this same page, that is roughly the size of Sim City 2000.


The publishers really have no incentive to address this until a critical mass of users install adblock software.


It's initially cheaper to make larger web pages, you don't have to optimize for size (most of the time it would probably execute faster if it was smaller but probably not always). Some others make it larger on purpose for obfuscation (like Google).


Google developed SPDY, an efficient binary representation of HTTP messages. Maybe they will do the same thing but for HTML. It would be much more efficient if one could design a binary representation of HTML that can only express well-formed HTML.
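
As a toy illustration of what such a format might look like (purely a sketch, with an invented four-entry tag table and byte layout): if the encoding stores the element tree itself, with a child count instead of opening and closing tags, then mismatched or unclosed tags simply cannot be written down.

```python
import struct

# Small, fixed tag table (an assumption; a real format would need a full one).
TAGS = ["html", "body", "p", "em"]
TEXT = 255  # marker byte for a text node


def encode(node):
    """Encode a node: either a str (text) or a (tag, [children]) tuple."""
    if isinstance(node, str):
        data = node.encode("utf-8")
        return bytes([TEXT]) + struct.pack(">I", len(data)) + data
    tag, children = node
    body = b"".join(encode(child) for child in children)
    # Element = tag index, child count, then the children themselves.
    return bytes([TAGS.index(tag)]) + struct.pack(">I", len(children)) + body


def decode(buf, pos=0):
    """Decode one node starting at pos; return (node, next_pos)."""
    kind = buf[pos]
    (count,) = struct.unpack_from(">I", buf, pos + 1)
    pos += 5
    if kind == TEXT:
        return buf[pos:pos + count].decode("utf-8"), pos + count
    children = []
    for _ in range(count):
        child, pos = decode(buf, pos)
        children.append(child)
    return (TAGS[kind], children), pos


doc = ("html", [("body", [("p", ["Hello, ", ("em", ["world"])])])])
encoded = encode(doc)
decoded, end = decode(encoded)
```

A truncated or corrupted buffer can still fail to decode, so this removes tag-soup ambiguity rather than every possible error.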



Why can't all these frameworks just be cached? If a cross site request to cdn.com/react-v1.0.js is cached under cdn.com, at most one download will trigger. That seems to solve the problem, but maybe I'm missing something.


I've seen people mention that caching is broken since it only works when the name and version are the same; however, I don't see how this is any different than two packages of the same functionality being uploaded to an APT repo with different names.
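
A toy model of a URL-keyed browser cache shows both sides: a second site requesting the exact same URL costs nothing, but a different version or a different CDN host is a different key and triggers a fresh download. The class and the `othercdn.com` host are invented for illustration.

```python
class UrlCache:
    """Minimal cache keyed by full URL, counting simulated downloads."""

    def __init__(self):
        self.store = {}
        self.downloads = 0

    def fetch(self, url):
        if url not in self.store:
            self.downloads += 1  # cache miss: simulate a network download
            self.store[url] = f"<contents of {url}>"
        return self.store[url]


cache = UrlCache()
cache.fetch("https://cdn.com/react-v1.0.js")       # site A: miss
cache.fetch("https://cdn.com/react-v1.0.js")       # site B: hit
cache.fetch("https://cdn.com/react-v1.1.js")       # newer version: miss
cache.fetch("https://othercdn.com/react-v1.0.js")  # other host: miss
```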


...and my first computer had 128 bytes of RAM. And a 300-baud modem.


That's amazing, what was it? My first computer had 5K of RAM.


Altair 8800, kit from MITS. And I wrote a game on it, using keyboard and screen.


GTA 5 is ~65GB in size. One day, web pages will be bigger than that.


> The average size of Web pages is now the average size of a Doom install

It's not really surprising in a world where a graphical driver is > 100 MB (Nvidia driver for Windows).


They mean the original Doom, which ran with a software renderer anyways.


Oh come on, let's talk about Adobe Reader and Skype. :)


Only a hundred megabytes? You're getting off light there. I went to install the latest Nvidia CUDA driver the other day, and it's a gigabyte download.


And 43 MB chromium mini_installer.exe


Is there a Chrome extension that shows the size of a web page? There's a good one for page load time that I use, but I want kilobytes with and without cache.


Doom is not a good measure, when web pages become bigger and more bloated than your average printer or scanner driver, then it will be alarming :)


Isn't the answer to just create a reasonable standard library for javascript so people don't need to link in megabytes of frameworks


Great, another kooky unit of measure.

"This new re-design gets us down to 0.4 Doom installs without sacrificing any of the visual elements."


There's a reason that the economics of web development mostly work and the economics of games development mostly do not.


This probably requires more explanation.


Too bad we cannot measure it in football fields


But they load faster than Doom used to.


In 20 years the average size of web pages will be the size of a Quake 3 install. This is progress.


This is one of the reasons why I love a simple website without too many whistles and bells.


Are we really complaining about webpage size when fully 30% of web traffic is Netflix? This might be an unpopular opinion but websites are no longer just html, css and js. They're full on applications with rich interaction and data visualization. Call me when they're larger than an average modern native app install.


For Internet 3 we should call John Carmack and put all of the internet in a MegaTexture.


Can anyone explain why a simple web page is so much bigger now than a whole game?


Because we can! Games were slim back then because they had to be. Webpages are fat now because our computers are fast.

Now, if a webpage were to approach the size of a contemporary game...


The biggest culprits are the layers of Javascript for ads and tracking/analytics and unoptimized, high-resolution images and videos...

http://idlewords.com/talks/website_obesity.htm


More because that "whole game" was incredibly small, not because a web page is particularly big.

A single good quality jpeg is around 1MB. If you have 2 nice pictures on your web page, suddenly it's bigger than doom.


For one, Javascript is used as source code text, whereas games benefit from being compiled to binary form. There's also a monetary cost for file size if your game requires two floppy disks rather than one.


Because modern web sites circumvent the HTML engine and request JSON from the server which they then render via JavaScript.

Other bugs with the modern web authoring style:

1. broken back/forward functionality

2. broken scroll bars

3. broken accessibility

4. broken memory of where the page was at when closed, leading to a fresh load of the page on Re-Open-Closed-Tab

5. infinite scroll being highly unnatural


A simple web page nowadays is bigger than a whole game in 1993. A whole budget game nowadays is several gigabytes in size and that is far from a simple website.


ronan has an account here, commented on this as it was developing a few months ago:

https://news.ycombinator.com/item?id=9981707


and here I remember that the PDF of the Turbo Pascal manual was so many multiples of the compiler's size I needed a calculator to figure it out


what is the average size of a native mobile app?


Doom would have been the superior solution to the World Wide Web.


For the last project I built the initial page load with the absolutely minimal JS that was embedded into the page. Then it loaded the rest whenever it needed it. My coworkers were shocked how quickly the page loaded.

It's actually better to show the user some progress bar, than the standard browser's "Waiting for yoursite.com".

You can get away with a lot without jQuery, while still having clean-ish code.


I would argue two things:

1) This is an irrelevant statistic.

2) Even if this were true it's not that big of a deal.

This is irrelevant because most people don't browse the average web page. They browse the top few sites on the internet and that's it. A more relevant statistic would be what have the sizes of the top 50 sites been over the last 15 years. I imagine they still may have grown on average, but download speeds have also grown over that time. Especially on mobile.

Even if we accept the premise that web sites as a whole, including the most popular ones are all growing and are now an average of 2.2MB each. Who cares? 2.2MB is nothing in 2016. Even on an LTE connection that's probably between 4 and 1.5 seconds to download the full page. And a lot of that size is probably in ads, which nobody minds if they load last or not at all.

Lastly, this is a self fixing problem. If a site is too bloated, users will stop going to it.

But I would propose that a lot of this increase in size is due to users (especially mobile) having higher and higher resolution displays, which necessitates higher resolution content, which of course is bigger.


I care. I know a lot of people (here in western Europe) who are on a 1 GiB/month data plan. 2.2 MiB per page means that you can visit 15 pages per day. Better think twice before you click a link.

Besides, 2.2 MiB for a page is pure bloat. Unless the page is heavy on images, you can usually fit all of the content that matters in a tenth of the size. In 300 KiB you can fit a 500-word article with 160 KiB picture, a few webfonts, headers/footers and stylesheets. Using a factor ten more is just ridiculous.
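
The arithmetic behind the "15 pages per day" figure, and what the same plan would allow with 300 KiB pages (the 30-day month is an assumption):

```python
PLAN_MIB = 1 * 1024         # 1 GiB monthly allowance, in MiB
PAGE_MIB = 2.2              # average page weight from the article
LEAN_PAGE_MIB = 300 / 1024  # the 300 KiB page described above
DAYS = 30                   # assumed month length

pages_per_day = PLAN_MIB / PAGE_MIB / DAYS
lean_pages_per_day = PLAN_MIB / LEAN_PAGE_MIB / DAYS
```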


I would recommend using Opera Mini or GDrives Web Light (http://gdriv.es/l/)


Who cares? Everyone does. Everyone wants faster sites and downloading 2.2MB is not easy or quick.

It's a ridiculous amount of data needed to display sites that don't have any special media or functionality and it burns up battery life and data plans for most mobile users.

Also latency is a far bigger effect than bandwidth when interacting with sites so I would say even with bandwidth going up, the actual experience of loading a site has gone down considerably over the last few years.
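
A back-of-the-envelope model of the latency claim: load time is roughly the number of sequential round trips times the latency, plus size divided by bandwidth. All the numbers below are illustrative assumptions, and the model ignores TCP slow start, parallel connections and rendering.

```python
def load_time(size_mb, bandwidth_mbit, rtt_ms, round_trips):
    """Rough page load time in seconds."""
    transfer = size_mb * 8 / bandwidth_mbit  # seconds spent moving bytes
    waiting = round_trips * rtt_ms / 1000    # seconds spent waiting on latency
    return transfer + waiting


# 2.2 MB page, 10 Mbit/s link, 100 ms round trip, 30 sequential round trips.
base = load_time(2.2, 10, 100, 30)
double_bandwidth = load_time(2.2, 20, 100, 30)  # halves only the transfer part
half_latency = load_time(2.2, 10, 50, 30)       # halves only the waiting part
```

With these made-up inputs, halving latency saves more time than doubling bandwidth, because most of the wait is round trips rather than bytes.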


The reality, however, is that depending on where they are many users don't have an LTE connection to begin with. My personal experience is that every so often a page begins to load, then shows text invisibly, then loads ads, and only after around 10 secs, I guess, I see the content I was interested in in the first place. Well, usually I closed my browser by then already.


Internet infrastructure is far from being cheap, and CPUs are not free either.

> Who cares? 2.2MB is nothing in 2016.

Multiply that by the amount of internet users, and now try to imagine all the hardware running this.

It's also about having a format that discourages bloat so that the web can be faster on a large scale.


What? The size of a page has nothing to do with the CPU. Most of that size is going to be in images and animation, and most web pages put basically no load on your CPU. If one does it's going to be because of badly written JS, not because of the size.

Shrinking the average page size even by half is not going to make "the web run faster on a large scale." Most of the time your page load speed is not affected by congestion, and if it is it's likely due to a local tower or something. The problem there will be the number of users per tower, not the size of the pages they are loading.


Parsing text can hardly be parallelized. Overall it will increase the price of your phone so it can show that webpage fast enough, which will be a cost of both CPU and battery. I have a shitty dual-core smartphone, and it really doesn't like webpages. When you look at the source of a page, there is no justification for all of this, unless you like complexity.

I did not want to talk about HTML on smartphones, since it already seems to be non-existent now that apps have replaced HTML, but do you really think we should have apps instead of webpages? Apps are less open. I really think HTML parsing is power intensive, and that you could actually increase battery life and speed with HTML, instead of just having bigger batteries. Seems to be the same argument as fuel efficiency regulations.

Maybe I say all this because I like minimalism, but at the end of the day, most core formats are designed with acceptable performance in mind. In my opinion, HTML doesn't have acceptable performance.


Are you sure that parsing HTML is the step that is the biggest issue for your phone, and not e.g. rendering or JS execution? (It's certainly possible, parsing HTML with bad parsers or pathological content can take surprisingly long, but it's generally the last thing I'd expect as a reason for a browser to be slow)


I linked a computerphile video explaining how HTML ended up as a loosely defined language, meaning browsers will often struggle to render something.

I think it's part of the whole thing, HTML is too permissive, and it also will make rendering slower. If HTML was better defined from the ground up, rendering would also be faster. The sites I visited were not really JS intensive. JS is also a problem for webpages.

Anyway I hate HTML/CSS/JS in general for all those reasons. Parsing text is already painful, why do it each time you visit a webpage?


> Even on an LTE connection that's probably between 4 and 1.5 seconds to download the full page

So on the fastest mobile connection type available it's an order of magnitude or more too slow to feel fast, and that's a good state of things?


The average Web page now does more than the average Doom install, I don't see the relevance of this.

Although I get really annoyed when I visit a blog post whose page is 100x larger than Dostoevsky's novels in .txt format. On my blog (https://pljns.com/blog/), JQuery and genericons are often my largest file transfers, but I still clock under 500kb.


https://pljns.com/

> 40-pound jQuery file and 83 polyfills give IE7 a boner because it finally has box-shadow

[x] check

> You loaded all 7 fontfaces of a shitty webfont just so you could say "Hi." at 100px height at the beginning of your site?

[x] assuming 404'ed fontawesome as shitty webfont, check

> You thought you needed media queries to be responsive, but no

[x] check

> Your site has three bylines and link to your dribbble account, but you spread it over 7 full screens and make me click some bobbing button to show me how cool the jQuery ScrollTo plugin is

[x] check

Still pretty good site, but it's funny how accurately creator of http://motherfuckingwebsite.com/ has described the situation with the modern web :)


These are all good points that I knew when I hastily pushed the site last week. Still way under 1000kb!

Also, I said my blog, but serves me right I guess ;-)


Displaying text on screen with some crappy ads & animations is not doing more than Doom, not even close.


Once you remove Ad Tech how much does the average web page actually do?



