How HTML 5 link prefetching can make your site load faster with one line of code (keyboardy.com)
69 points by ronnoch on June 3, 2010 | hide | past | favorite | 29 comments



Google already prefetches the top result on some search results pages. For instance, in Firefox, if you search for "hacker news":

http://www.google.com/search?q=hacker+news

there is a link element added to prefetch news.ycombinator.com.

  <link href="http://news.ycombinator.com/" rel="prefetch">

I assume that if you aren't careful when reading your logs, you could be overestimating your traffic.


Wow, I certainly didn't realise Google was doing this.

I guess some people with top search results can infer some pretty accurate stats from this.


How will this affect web analytics? Let's say that Google chooses to prefetch the #1 result for any given search term. Will the #1 ranking site's analytics package report a unique visitor for every time that query is searched? If so, that would be a serious problem for those who rely on analytics.


It shouldn't affect Google Analytics and other JavaScript based analytics, but from the article,

* If this becomes popular it has the potential to skew logs and stats. Consider what happens when a bunch of prefetch requests are made to one of your pages, but the user never actually visits the page. The server (or stats package) doesn’t know the difference.

To clear this up, Firefox sends along an HTTP header, X-moz: prefetch, but you need some logic on the server side to detect it.
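
A minimal sketch of that server-side logic, in Python. The function name `is_prefetch` and the plain-dict shape of the request headers are assumptions for illustration, not any particular framework's API. HTTP header names are case-insensitive, so the comparison is done in lowercase.

```python
def is_prefetch(headers):
    """Return True if the request headers mark a Firefox link prefetch.

    `headers` is assumed to be a plain mapping of header name to value
    (a hypothetical shape; adapt it to your framework's request object).
    """
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-moz" and value.strip().lower() == "prefetch":
            return True
    return False
```

A stats package could call this per request and skip counting the hit whenever it returns True.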


Do you just count the prefetch for what it is, a "maybe" page view? Or, when the browser shows the prefetched page, can it send a "view" (or load it again in the background, though that seems wasteful)?


You'll only get the "maybe" hit, and never the real hit, so you can't accurately detect this on the server side.

JavaScript on the page isn't run until you actually visit the page, so that will be unaffected.

I don't think this is a big issue, since JS analytics are already superior to server-side log analysis (because of caching alone).


That's what I thought, but I wasn't sure. Thanks.


Good point, it wouldn't affect JavaScript-based stats, but it's surely going to mess up web server log-based analytics such as Webalizer, AWFFull, etc.


This guy is deathly truthful: "Link prefetching will probably pop up in Opera, Chrome and Safari soon, and in Internet Explorer sometime around 2020."

Neat little tidbit though. I'm interested in how this works with regard to the cache control/expiry headers (for dynamic pages).


I would expect the browser-makers to allow more than 2-4 connections per host - that would speed things up. This links-prefetching thing feels a bit dodgy for both performance and security reasons.


That, and I thought link rel="next" already existed… and had a more semantic meaning than pre-fetching.

It (and its brother, rel="prev") are meant to inform the browser (which then hopefully informs the user, but that never took off) of what the next or previous page is (in contexts where that makes sense). Maybe it's the next page of an article (evil!), or the next blog post in line. Whatever.

Anyway, this has been around for a while, even mentioned in this post from 2008: http://diveintomark.org/archives/2008/06/21/minimalism

So it's more accurate to say that Firefox likes to prefetch pages, and will use <link rel="next"> tags to guide it. If you're going to be pre-fetching pages anyway, that seems a reasonable enough way to decide what to fetch, but I agree that I'd prefer pre-fetching to be off.


The Firefox extension "Link Widgets" also uses those to generate a menu bar for the site (it also uses the rel="top", rel="up", rel="first" and rel="last").


I, for one, would appreciate a browser setting to not prefetch.


This has been around for years. Load up mail-archive.com in Firefox 2 (for example) and browse threads through Charles proxy or Fiddler so you can see what it requests: it prefetches the next message in the thread thanks to rel="next".

It does make browsing feel super snappy, though.


I can think of ways to abuse this already.


Anything that you couldn't already do by firing off an Ajax request after the page loads?


No same-origin-policy, since you can't access the data in the response, but you could do that already with hidden iframes.


You could already stick hundreds of huge images into a page, forced to 1x1 px size.

Maybe the "annoy a minority of people with tight bandwidth-caps"-attack isn't all that big of a threat.


Care to share?


Not the OP, but: on your site, force them to "preload" tons of "content" that they'll never really need. With low-limit tiered bandwidth, it could be extra evil.


Prefetching could explain the occasional log entries I've seen where the page is retrieved by a modern, graphical browser (i.e. not lynx, links, elinks, w3m), but no images or stylesheets are downloaded.


Definitely block prefetch on your site; it completely screws up stats to save the user 100ms of load time.

In .htaccess:

  # Refuse (403) requests that carry Firefox's "X-moz: prefetch" header
  RewriteEngine On
  RewriteCond %{HTTP:X-moz} ^prefetch [NC]
  RewriteRule ^.* - [F]


When did we decide that getting accurate serverside log-analysis is definitely more important than saving 100ms of load time?


For people who want an accurate representation of the number of people visiting their sites and the marketeers who would very much like correct data too?


JS Analytics


All client side analytics can be blocked.

Check out Ghostery for Firefox.

Only server-side is reliable.


Not if someone caches your stuff.


Wouldn't it be better to exclude those requests from your stats rather than punishing your users for your choice of log analysis software?
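
One way to do that, sketched in Python. It assumes the access log was written with an Apache LogFormat that appends the header as a final quoted field (`"%{X-moz}i"`, which logs `-` when the header is absent), and it drops the prefetch lines before the log reaches the stats package. The log layout and the function name `drop_prefetch_lines` are assumptions for illustration.

```python
def drop_prefetch_lines(lines):
    """Return only the log lines that are not marked as prefetch requests.

    Assumes each line ends with the X-moz request header as a quoted
    field: "prefetch" for a prefetch request, "-" otherwise.
    """
    kept = []
    for line in lines:
        if line.rstrip().lower().endswith('"prefetch"'):
            continue  # skip the "maybe" page view
        kept.append(line)
    return kept
```

The filtered lines can then be handed to whatever log analyzer you already use, so users keep the faster page loads.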


While your idea is good and Apache has the CustomLog directive which could do it, the process is much more complex because of typically customized logging directories (especially where there's more than one client per host).



