Page Weight Matters (2012) (chriszacharias.com)
137 points by Tomte on July 31, 2022 | 57 comments



I'm a bit pessimistic about this now, and have largely given up beating the drum.

Even in industries where customers are on extremely bad connections [0], I couldn't convince people to give up their bloated component libraries or cache things in IndexedDB. The fact is, not many web developers know how to write for these environments, and not many businesses will incentivize it, even when their customers complain.
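For anyone unfamiliar, the IndexedDB approach is roughly this kind of thing (a minimal sketch with made-up database/store names, not production code):

    // Sketch: cache an API payload in IndexedDB so a repeat visit on a bad
    // connection can render from local data before the network answers.
    function openCache(): Promise<IDBDatabase> {
      return new Promise((resolve, reject) => {
        const req = indexedDB.open('app-cache', 1);
        req.onupgradeneeded = () => req.result.createObjectStore('responses');
        req.onsuccess = () => resolve(req.result);
        req.onerror = () => reject(req.error);
      });
    }

    async function cachedFetch(url: string): Promise<unknown> {
      const db = await openCache();
      const cached = await new Promise<unknown>((resolve) => {
        const get = db.transaction('responses').objectStore('responses').get(url);
        get.onsuccess = () => resolve(get.result);
        get.onerror = () => resolve(undefined);
      });
      if (cached !== undefined) return cached;       // stale data beats a spinner
      const fresh = await (await fetch(url)).json(); // otherwise hit the network
      db.transaction('responses', 'readwrite').objectStore('responses').put(fresh, url);
      return fresh;
    }

A real version would also refresh stale entries in the background, but even this much keeps a return visit usable when the connection is terrible.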

Though the optimist in me will say this - if anyone actually does web dev where people do actually care about this, I'd love to hear about it.

[0] My experience here https://lewiscampbell.tech/blog/220205.html


In DayJob we produce web-based applications, as do friends in other companies. One common refrain when commenting on the heavy payload for the applications is that they are not targeting mobile phones (just as well, as the apps are not terribly usable on a small display) and that the clients won't care (which is actually true, for the most part at least).

Some don't seem to realise that tethering to the mobile network is a thing when something is amiss with the local network's connection to the outside world (more common now than it used to be, with more people working remotely), or when trying to get stuff done while travelling - so users might have a decent-sized screen & chunky processor but still be talking to us over a low-bandwidth & high-latency link.

A lot of devs should spend more time with Chrom{ium|e}'s network throttling options turned on. It is far from perfect, but gives at least some insight into the troubles experienced by those with less than ideal connectivity. Perhaps I'll see if, next time I have my infrastructure hat on, there is a way to enforce that as a policy that can't be overridden!


I hear that "we're not really targeting mobile" excuse less and less often these days (phones are, what, half of users?). But I do hear another excuse frequently: performance isn't free, as in spending time on performance optimization means something else has to give: more features, quicker delivery, better testing, readability and documentation, package reinvention, whatever.

People cobble together npm packages because it saves time vs having to reinvent their whole color conversion solution, or page routing system, or unit converter or whatever.

And sometimes management is the one unwilling to budge. I couldn't get us to spend ANY time on performance until I showed them a video like, "Hey, I know this works fine on our latest Macs, but check out what happens when I try to run this on a 2-year-old budget Windows laptop, like our users would have. It barely even runs." Only then did we start testing with not just network throttling but CPU throttling too. There are still some really slow computers out there.
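If you want that baked into CI rather than relying on devs remembering the DevTools dropdowns, the DevTools protocol exposes both knobs. A rough sketch via Puppeteer (the numbers and URL are illustrative, and the exact session API varies a bit between Puppeteer versions):

    import puppeteer from 'puppeteer';

    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      const client = await page.target().createCDPSession();

      // Roughly "bad 3G": 400 kbit/s down, 300 kbit/s up, 400 ms latency
      await client.send('Network.emulateNetworkConditions', {
        offline: false,
        latency: 400,                         // ms
        downloadThroughput: (400 * 1024) / 8, // bytes per second
        uploadThroughput: (300 * 1024) / 8,
      });

      // Pretend the CPU is 4x slower than the dev machine
      await client.send('Emulation.setCPUThrottlingRate', { rate: 4 });

      await page.goto('https://example.com', { waitUntil: 'load' });
      console.log('load event fired under throttling');
      await browser.close();
    })();

Fail the build if the load under those conditions blows some budget, and the regression gets noticed before the users notice it.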


One thing I remember thinking was neat when I visited friends at FB (Meta) HQ in Menlo Park was they had fb.com/shitty (or something like that), which simulated terrible connections. They even had little stations with crappy, yellowed-with-age Compaq CRT monitors and really old hardware, old Nokia phones, etc.

I thought that was pretty cool, don't know if that's something they still do though (this would have been back in 2013 or so).


My company let me expense super-low-end cell phones from 7-Eleven and put each one on a different MVNO.

Once a year I take them into the actual places where our users will use the web sites, to see how the sites perform. The visitors are 70% mobile users, and I shadow a few of them in the locations where the sites are used, like parking garages and basements and industrial buildings with lots of opportunities for interference.

Very illuminating. And I suspect more accurate than many of the bad connection simulators I've seen.


>they had fb.com/shitty (or something like that)

https://mbasic.facebook.com/ perhaps? It is still available and is designed for the lowest of low-end dumbphones.

It's a decent workaround if you need access to facebook's chat services on your phone but don't want to install that accursed messenger app.


I make my living from a website I run. Performance is super important to me for a few reasons:

1. It's just text on a page. There are no excuses.

2. People use my website on the train, where the reception is so-so.


I’ve never met anybody using Angular who could actually justify why they were using it at all. They just use it because everybody else is.


It's batteries-included and leans heavily on reactive programming (RxJS is first-class). The dependency injection is a really nice and clean pattern. I have nothing but good things to say about Angular as both a React and Angular user. They both do the same things in different ways and, depending on the team, one may provide benefits over the other, but it's case-by-case.
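For anyone who hasn't used it, the DI pattern amounts to this (a trivial sketch with a hypothetical service name):

    import { Component, Injectable } from '@angular/core';

    // Registered once, injectable anywhere, trivially swappable in tests.
    @Injectable({ providedIn: 'root' })
    export class VideoService {
      load(id: string) { /* fetch metadata, etc. */ }
    }

    @Component({
      selector: 'app-player',
      template: '<button (click)="play()">Play</button>',
    })
    export class PlayerComponent {
      constructor(private videos: VideoService) {}
      play() { this.videos.load('abc123'); }
    }

No manual wiring, and it's easy to mock in tests, which is most of what people mean by "clean" here.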

I've also never met a React user who could explain why React is supposedly better than Angular, other than because everybody else is.


I don’t understand this chain of “I’ve never met anybody who…” when you can just ask. “I had to use Angular 2 and 3, and I find React to have a better, simpler abstraction.” Was this so hard?

This thread paints people who don’t share your preferences as simpletons without preferences at all. That should be a red flag.


I started in Angular v1 and it's exactly what you said: batteries included. We were able to build out an ecommerce website with few (maybe zero? can't remember exactly) external dependencies, just what shipped with Angular, because it was so well-engineered, feature-complete, and well thought out. Yeah, there was a bit of a learning curve, but not any more so than learning React + eight other big packages.

Fast forward half a decade, of course I'm using React now because every employer demands it, but god, what a duct-taped ecosystem! You have to reinvent SO MANY wheels and hack them together, when Angular had it all so tightly integrated. I am really pretty sad that React won out in the end. It's honestly kinda useless on its own (there's not even basic page routing).

Thankfully, Next.js solves a lot of that and makes React useful and easy, with a bunch of preconfigured libraries and features that are well integrated together. React by itself is nowhere near Angular, but Next.js is amazing.


        Reactivity   Data       Initial Page View
    
    SS  Delayed      Delayed    Quick
    CS  Instant      Delayed    Super Slow
It depends on what kind of site you are optimizing for (SS = server-side rendering, CS = client-side rendering). If you run a site where you expect to see someone for only a few minutes, go with SS. If you expect them to stick around, like they are checking statuses and updates, CS is better for user experience.

It's not impossible to have CS with lighter components too, but generally apps that have had a longer lifetime and more revisions end up with heavyweight components.


You didn't include DS and ES...


There are some interesting lessons in the article, one being that following metrics blindly will lead you down the wrong path. You have to know why the numbers are the way they are, not just what they are.

The second lesson, I guess, is that inside an evolving product this is a constant battle against entropy. The YouTube homepage is once again 15MB/90 requests, and that's with an ad-blocker.


The one thing that makes me scratch my head is that the page generally loads quickly, but videos load and play slowly. Rewinding a video to a point that I've already played causes the spinner to show up for what feels like an eternity -- at that point I often just decide the video is not worth watching and try to find what I'm looking for elsewhere.

So a website that is supposed to be for playing videos has optimized the page loading pretty well, gotten the video player showing as quickly as possible, then after all that the experience of playing the video is depressing. Without knowing details, it feels like a case where the devs have optimized for the wrong things.


Standard YouTube behavior: do anything other than a single linear playback and it's "I need to buffer that again!". One of the worst video player experiences. The default HTML5 player is better than that, and by a landslide. That's one of the reasons to use Invidious instances.


Video playback is very fast for me. However, there used to be an issue where my ISP was giving lower priority to YouTube videos and I had to route around it. https://www.google.com/get/videoqualityreport/


Seeking anywhere in a video on YouTube is instantaneous for me. 1 gig symmetrical fiber in Portland, OR. You should look into network tools and see what’s causing the slowdown, because I doubt it’s on YouTube’s side.


> The YouTube homepage is once again 15mb/90 requests

You look to be measuring the wrong thing there: 15MB is the uncompressed size; the actual amount transferred is only 3MB. (The two biggest JavaScript files are about 10.5MB, but 2MB on the wire.) And it’s the transfer size that the article is talking about.

(Now the uncompressed size does matter too, especially for JavaScript where that code has to all be parsed and executed, and 10MB will block for multiple seconds on particularly slow mobile devices, but that’s about CPUs, whereas the article is focusing on networks.)


I wonder how much of that 3MB came from the Google+ integration (the post was 2012 and three years prior would be 2009; G+ was 2011).

I also wonder how much of that 3MB affects time-to-video-play vs being loaded after the video starts.


> vs being loaded after the video starts.

On a slow or bitty connection that could still impact the video playback, perhaps even causing a stutter or two, by competing with it for a share of the available throughput, if only for a short time. So I think it is still valid to consider it potentially blocking content.


> The YouTube homepage is once again 15mb/90 requests, and that's with an ad-blocker.

The article is about the Feather frontend, which was never the same as the normal YouTube frontend. So, it’s not really a fair comparison. Also, Feather was introduced in 2009 and they axed it after some years.



I assume such senior engineers who used to care about page weight have left YouTube by now. The trend of every big site (Facebook, Twitter, YouTube) is to go towards a fat multi-megabyte SPA.


I'm gonna assume that users of YouTube can probably tolerate a one-time multi-megabyte download before streaming hundreds to thousands of megabytes of video.


Slightly tangential, but if you've ever looked at the data transferred for a YouTube video in the network tab, it's actually surprisingly tiny (as far as I've seen). Often only like 10 MB for multi-minute videos. Top-notch compression (not surprising).


If you are on your phone that's probably true. But on my desktop it defaults to 4k and sometimes 60fps which still comes out to some pretty big file sizes. I have never once been inconvenienced by the initial JS bundle download of youtube. After the first load, the site functions almost like a native app with very snappy page changes.


On a bad enough connection YouTube can switch to 144p, that's on the order of one megabyte per minute.


YouTube logs me out and deletes my preferences every week or so. I doubt the page cache is “one time”.


I wonder if these people in remote places, who had to wait two minutes to download a 100KB page, were really able to use the now-loadable page to actually watch the video on the page? I wonder how much data per second is required to view a YouTube video at the lowest quality setting?


I love this story. The moral is not that folk in Siberia could now watch video, but that a whole chunk of the distribution lying between "fast US fiber" and "Siberian 2.4kbps wet string" had their experience dramatically shifted to the left, enough so that new outliers impacted the average. Even after noting the lesson about the limitations of relying on averages, the article still skips over this key element.

I remember the YouTube Feather opt-in being spread by word of mouth at the time by folk in the tech community on fast connections. People really care about this stuff, and the only reliable way to know is by engaging them directly.


It seems like lots of people who didn't have low bandwidth, and so previously used YouTube anyway, switched to using Feather because the page was smaller by being focused more narrowly on their needs.

By the sounds of it, though, it was a net loss for Google, which is why the page is still so big. Much of the code might have been irrelevant (even harmful) for the end user, but was necessary for Google's business model.

So a side-learning ignored by the article is that there are more types of stakeholder than just the end users, especially in applications that are free (as in beer) at the point of use. And more stakeholders means more complex solutions. And more complex solutions typically mean more code and bigger pages.


> I wonder how much data per second is required to view a YouTube video at the lowest quality setting?

It depends a lot on the content. The example I’m about to show comes to about 114kbps (14.25KB/s).

You can inspect this sort of thing with tools like yt-dlp (the superior fork of the probably-better-known youtube-dl). Here’s one example from a well-known music video:

  $ yt-dlp -F 'https://www.youtube.com/watch?v=dQw4w9WgXcQ'
  [youtube] dQw4w9WgXcQ: Downloading webpage
  [youtube] dQw4w9WgXcQ: Downloading android player API JSON
  [info] Available formats for dQw4w9WgXcQ:
  ID  EXT   RESOLUTION FPS │  FILESIZE   TBR PROTO │ VCODEC          VBR ACODEC      ABR ASR MORE INFO
  ────────────────────────────────────────────────────────────────────────────────────────────────────────────
  sb2 mhtml 48x27        0 │                 mhtml │ images                                  storyboard
  sb1 mhtml 80x45        1 │                 mhtml │ images                                  storyboard
  sb0 mhtml 160x90       1 │                 mhtml │ images                                  storyboard
  139 m4a   audio only     │   1.23MiB   49k https │ audio only          mp4a.40.5   49k 22k low, m4a_dash
  249 webm  audio only     │   1.18MiB   46k https │ audio only          opus        46k 48k low, webm_dash
  250 webm  audio only     │   1.55MiB   61k https │ audio only          opus        61k 48k low, webm_dash
  140 m4a   audio only     │   3.27MiB  130k https │ audio only          mp4a.40.2  130k 44k medium, m4a_dash
  251 webm  audio only     │   3.28MiB  130k https │ audio only          opus       130k 48k medium, webm_dash
  17  3gp   176x144      6 │   1.97MiB   78k https │ mp4v.20.3       78k mp4a.40.2    0k 22k 144p
  394 mp4   256x144     25 │   1.71MiB   68k https │ av01.0.00M.08   68k video only          144p, mp4_dash
  160 mp4   256x144     25 │   1.82MiB   72k https │ avc1.4d400c     72k video only          144p, mp4_dash
  278 webm  256x144     25 │   2.27MiB   90k https │ vp9             90k video only          144p, webm_dash
  395 mp4   426x240     25 │   3.37MiB  134k https │ av01.0.00M.08  134k video only          240p, mp4_dash
  133 mp4   426x240     25 │   2.96MiB  117k https │ avc1.4d4015    117k video only          240p, mp4_dash
  242 webm  426x240     25 │   4.02MiB  159k https │ vp9            159k video only          240p, webm_dash
  396 mp4   640x360     25 │   6.57MiB  260k https │ av01.0.01M.08  260k video only          360p, mp4_dash
  134 mp4   640x360     25 │   5.55MiB  220k https │ avc1.4d401e    220k video only          360p, mp4_dash
  18  mp4   640x360     25 │  15.04MiB  595k https │ avc1.42001E    595k mp4a.40.2    0k 48k 360p
  243 webm  640x360     25 │   6.93MiB  274k https │ vp9            274k video only          360p, webm_dash
  397 mp4   854x480     25 │  11.45MiB  453k https │ av01.0.04M.08  453k video only          480p, mp4_dash
  135 mp4   854x480     25 │   8.52MiB  337k https │ avc1.4d401e    337k video only          480p, mp4_dash
  244 webm  854x480     25 │  10.04MiB  397k https │ vp9            397k video only          480p, webm_dash
  22  mp4   1280x720    25 │ ~66.22MiB 2559k https │ avc1.64001F   2559k mp4a.40.2    0k 44k 720p
  398 mp4   1280x720    25 │  23.71MiB  938k https │ av01.0.05M.08  938k video only          720p, mp4_dash
  136 mp4   1280x720    25 │  16.56MiB  655k https │ avc1.4d401f    655k video only          720p, mp4_dash
  247 webm  1280x720    25 │  16.97MiB  671k https │ vp9            671k video only          720p, webm_dash
  399 mp4   1920x1080   25 │  45.16MiB 1786k https │ av01.0.08M.08 1786k video only          1080p, mp4_dash
  137 mp4   1920x1080   25 │  89.66MiB 3547k https │ avc1.640028   3547k video only          1080p, mp4_dash
So the audio for that 3:32 video is around 1.2MB at the lower quality (format 249, 46kbps), and the video starts at about 1.7MB (format 394, 68kbps). Cumulatively that’s 114kbps. Well, there’s also the 3gp form where you can appreciate the entire thing at 6fps for under 2MB. Given the size, it’s pretty tolerable, really, and a fun curio if nothing else.

(The source article speaks of 100KB taking 2 minutes, which corresponds to 6.7kbps, which is extraordinarily slow. Well, at that rate it would take you an hour to load the 3MB of the video in this example, though experience suggests that at those speeds other stuff is extremely likely to clog things up further. But then, maybe that clogging is already taken into account.)


Tens of kilobytes per second, at the lowest 144p setting.


Back in the day, my internet connection was so Comcastic that I couldn't watch a video on youtube, either. I'd have to cache it with youtube-dl and watch it the next day, or whenever it finished. It was weird the first time I tried to watch a video on the web page and it actually played.


Comcastic = the opposite of fantastic :-)


I was expecting something about (paper) page grammage. I suppose disambiguating by adding _web_ page is out of the question given the audience.


I was also expecting something about buying heavier weight paper, something I thought I remembered reading about on HN in the past. I remember reading this post too -- it opened my eyes to just how hard a problem optimizing page loads can be.


I wasn't, but then I thought I should. Déformation professionnelle and all.


I think the future of the high-performance web involves a WebSocket and streaming content at the element level. Preloading all the source for a session that will use only a subset of the features is wasteful, in my opinion.

Even on high-latency and crappy connections this can provide a better UX. TCP still applies on that WebSocket, so it's not like things will arbitrarily derail because of some packet loss.

If your web application is doing anything useful, it probably has to periodically check in with the server anyways. Why not just keep all of the state on the server and stream the UI to the client in a more granular fashion?
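The client side of that pattern can be almost trivially small. A sketch, assuming the server renders HTML fragments and tags each one with the id of the element to replace (endpoint and field names are placeholders):

    // Server owns the state and streams rendered fragments; the client just
    // swaps the one element that changed instead of shipping a whole SPA.
    type Patch = { target: string; html: string };

    const socket = new WebSocket('wss://app.example/ui'); // placeholder endpoint

    socket.addEventListener('message', (event) => {
      const patch: Patch = JSON.parse(event.data);
      const el = document.getElementById(patch.target);
      if (el) el.innerHTML = patch.html;
    });

    // User interactions go back up as small events instead of page loads.
    document.addEventListener('click', (e) => {
      const action = (e.target as HTMLElement).dataset?.action;
      if (action) socket.send(JSON.stringify({ action }));
    });

This is essentially the Phoenix LiveView / Hotwire Turbo Streams model, so it's not hypothetical; the trade-off is that every interaction now costs a round trip, which is exactly where high-latency connections bite back.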


Isn't that what the browser does by default if you don't involve any websockets or JS? When you load a page it streams (and incrementally parses and loads) the content for that page. When you follow a link to a different page on the site, it streams the content for that page and only that page.


I hate hearing these sorts of arguments when you have an in-house business app served to a handful of repeat users. Reducing the payload size will make very little difference to most users, as they visit the site every day and have most of it in cache. Also, speeding up the initial load times for internal users often does little to increase profits. Until it does, it's hard to justify the work.

That said, it's worth aiming to keep things as small as possible from the start - just not worth investing in a rewrite.


Ironically, in-house web apps would be the easiest to make lightweight. No one's going to complain that your dropdowns aren't sexy enough, so you can largely just use the browser.


> Also, speeding up the initial load times for internal users often does little to increase profits.

I used to be on the Performance team at Box. Our charter was to get the files page to load in under 1s for the median domestic (US) user.

Box is largely sold to CTOs, not the people using it. The folks demoing/testing/evaluating are almost always on fast computers with fast internet. That didn't negate our goal, though, for a few reasons:

1. It turns out that lots of business users fly on airplanes. If you have to store/send a file and the app loads too slowly, you'll just use Dropbox or GDrive, or the OneDrive that Microsoft threw into your Office365 contract for free instead. And when _that_ loads instead, you'll remember it come contract negotiation time.

2. If all of the people who are supposed to use the app hate it, they'll actively avoid using it. If they have no choice, they'll complain and make sure decision makers are well-aware of how bad it sucks. That does eventually trickle up. With Box, folks would switch to email or flash drives, which makes security/compliance orgs salty, which introduces new, stricter limitations on your USB ports and email, which pisses off the execs paying for Box. Bad business software—contrary to popular belief—does eventually get phased out, just not on a short timescale.

3. Geography does matter. One project to bring the files page on Box from ~3s to ~2.3s ended up improving EU load times by multiple seconds in some cases. Companies who would previously have ruled Box out after testing it once suddenly became meaningful prospects.

4. Sites that load slowly usually load slowly because of underlying complexity. That complexity makes the code harder to change. The slowness and the ability to make meaningful improvements to the site in other areas are directly linked. In 2014, I realized we'd picked almost all of our low-hanging (performance) fruit and pushed a full rewrite/redesign. It turned out to be one of the biggest software ships Box has had in the last decade (see "All New Box" [0]), and was a huge boon for sales and marketing. Something something life gives you lemons. Nobody plans on making their site slow; "not investing in a rewrite" makes no sense in the real world.

It's also worth calling out that this is woefully inaccurate:

> make very little difference to most users as they visit the site every day and have most of it in cache

If you're a serious company that releases software changes regularly (Box had ~nightly releases when I worked there), no, your users will not have the assets cached. One small JavaScript change (bug fix, feature, whatever) invalidates the cache.

We spent many man-months getting our third-party libraries into a "vendor" bundle that didn't change with meaningful frequency, but that accounted for less than 30% of the total JS payload. With the exception of weekends and Mondays, you could be guaranteed to load a few hundred kilobytes of fresh JavaScript every morning. Organizations with CD would be in a worse situation.
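For what it's worth, the modern equivalent of that vendor split is a few lines of bundler config (a webpack sketch; the names and hashing scheme here are just the common pattern, not what Box actually shipped):

    // webpack.config.ts: third-party code goes into its own content-hashed
    // chunk, so routine app releases leave the vendor chunk cached.
    import type { Configuration } from 'webpack';

    const config: Configuration = {
      output: { filename: '[name].[contenthash].js' },
      optimization: {
        splitChunks: {
          cacheGroups: {
            vendor: {
              test: /[\\/]node_modules[\\/]/,
              name: 'vendors',
              chunks: 'all',
            },
          },
        },
      },
    };

    export default config;

It still only helps for the slice of the payload that genuinely doesn't change, which, as above, was under a third for us.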

But in the real world, the experience feels worse than it is: the times your cache is the least fresh are the times when you're most likely to notice the slowness. You just paid for Gogo in-flight wifi, you just came back from vacation or sick leave, you just opened your laptop at a customer site.

[0] https://blog.box.com/introducing-all-new-box-where-all-your-...


Your points are all true, but Box is not at all the kind of app I was talking about. Box must be deployed to tens of thousands of users and likely operates in a competitive environment.

I'm talking about the kind of app that exists in one organization, has 1 - 10 users, and is not a huge part of one person's job.

Something like an EDI tool that sends invoices to that one creditor that requires some strange EDI format and only just makes up enough of your business for it to be worthwhile. Where the EDI tool was written 15 years ago and no one knows much about it. Then you get a new dev asked to make a small change who comes back saying they need a week to improve its performance or shrink its binary.


> ...Bad business software—contrary to popular belief—does eventually get phased out, just not on a short timescale...

So true! But, man, that not-so-short timescale often is painful! :-)


This is an interesting and thorough response, and I liked reading it, but GP was talking about internal, captive-audience stuff. Box is an app you sell. B2B, yes, not B2C, so your users are “enterprise.” But it’s still a product that competes with others and that seems very unrelated to GP’s point.


In that case, who is concerned with page size for those apps? The apps with bundle sizes that people complain about are almost exclusively from third parties. I've worked on some big internal apps, and almost none of them have payloads big enough to notice. If it ain't broke, don't fix it.


> who is concerned with page size for those apps?

That was kind of their point: No one is, so why bother with a lot of effort?

> The apps with bundle sizes that people complain about are almost exclusively from third parties.

Right, like Box. Not the internal apps they were referring to.

> I've worked on some big internal apps, and almost none of them have payloads big enough to notice.

That echoes my own experience.

> If it ain't broken don't fix it.

I think you’re agreeing with the person you responded to. It was just that Box wasn’t the kind of app they were talking about. All I was saying was that your response, while interesting, was not 100% relevant. It was still interesting, though!


What I wonder about is which is best:

1) lightweight pages that link out to lots of stuff (JavaScript libraries, images, ...), with AJAX requests to load page content.

2) a more heavyweight page with everything inlined: only inline JavaScript, images as data URIs (<img src="data:image/gif;base64,R0lGODlhEgA...), everything in one.

Because page weight doesn't really matter. Time to display matters.


Satellite users have long latency: each separate resource request waits 500ms before a connection can begin. That argues for "all in one".

However: the user may be throttled, in which case any connection is likely to see an interruption after about 3MB of data transfer.

So you get bit coming and going.


Time to interact matters as much as (perhaps even more than?) time to display; I would rather wait a little bit longer for the page to display than have the page display immediately but behave erratically until everything has settled.


Every site is different, but in general I have seen the greatest correlation between bounce rate and FCP (First Contentful Paint). Ensuring sites are quickly responsive also matters, of course. But users will straight up leave after a few seconds of a blank page.


Resources could be cached with (1), so subsequent visits will be a lot faster.


Until you make any sort of update. Many web apps constantly have updates pushed, so there's no guarantee that when any particular user loads the app they will have a valid cache. Browsers also no longer share caches between origin domains, so even if App A and App B load a resource from the same CDN they won't share the cached version. On mobile, caches tend to be evicted more frequently than on desktop.

This all adds up to relying on cached resources for page-load performance being fragile at best and completely broken at worst. Relying on caches ends up with a worse experience on average, because whenever a user loads the app they may get wildly different load performance based on the state of the cache.


I would think a realistic assumption is an empty cache between visits but a full cache between different pages of the same visit, i.e. not relying on the cache to speed up the initial page load, but accepting that it will speed up subsequent pages.


A slow initial page load won't necessarily get you subsequent page loads. Resources used in the same visit will be resident in memory rather than loaded from the cache. The cache will be hit if the load takes place in a new tab or window.

Also, if your uncached page is slow to load, then for any of the multitude of reasons a returning user has an empty/invalid cache, they get a nice unexpected slow-load experience. Even two page loads a day apart might have had the cache invalidated by an update on the server. Use patterns don't neatly align with deployment patterns.



