How will this affect web analytics? Let's say that Google chooses to prefetch the #1 result for any given search term. Will the #1 ranking site's analytics package report a unique visitor every time that query is searched? If so, that would be a serious problem for those who rely on analytics.
It shouldn't affect Google Analytics and other JavaScript-based analytics, but from the article:
* If this becomes popular it has the potential to skew logs and stats. Consider what happens when a bunch of prefetch requests are made to one of your pages, but the user never actually visits the page. The server (or stats package) doesn’t know the difference.
To clear this up, Firefox sends along an HTTP header, X-moz: prefetch, but you need some logic on the server side to detect it.
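For what it's worth, the server-side logic can be tiny. A rough Python sketch, assuming your framework hands you the request headers as a dict-like object (the function names here are made up for illustration, and it only checks the X-moz header mentioned above):

```python
def is_prefetch(headers):
    """Return True if the request headers carry Firefox's 'X-moz: prefetch' marker.

    `headers` is assumed to be a mapping of header name to value.
    Header names are compared case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() == "x-moz" and value.strip().lower() == "prefetch":
            return True
    return False


def is_prefetch_wsgi(environ):
    """Same check for a WSGI environ, where X-moz shows up as HTTP_X_MOZ."""
    return environ.get("HTTP_X_MOZ", "").strip().lower() == "prefetch"
```

A stats package could then count such hits separately, or skip them, instead of treating them as page views.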
Do you just count the prefetch for what it is: a "maybe" page view? Or, when the browser actually shows the prefetched page, can it send a "view" (or load it again in the background, though that seems wasteful)?
Good point: it wouldn't affect JavaScript-based stats, but it's surely going to mess up web server log-based analytics such as Webalizer, AWFFull, etc.
I would expect the browser-makers to allow more than 2-4 connections per host - that would speed things up. This links-prefetching thing feels a bit dodgy for both performance and security reasons.
That, and I thought link rel="next" already existed… and had a more semantic meaning than pre-fetching.
It (and its brother, rel="prev") is meant to inform the browser (which then hopefully informs the user, but that never took off) of what the next or previous page is (in contexts where that makes sense). Maybe it's the next page of an article (evil!), or the next blog post in line. Whatever.
So it's more accurate to say that Firefox likes to prefetch pages, and will use <link rel="next"> tags to guide it. If you're going to be pre-fetching pages anyway, that seems a reasonable enough way to decide what to fetch, but I agree that I'd prefer pre-fetching to be off.
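If you want to see which URLs a page is offering up as prefetch candidates, something like this Python sketch would do. It assumes the browser treats <link> elements with rel="next" or rel="prefetch" as hints; the class and function names are invented for the example, and it only uses the standard library's html.parser:

```python
from html.parser import HTMLParser


class PrefetchHintParser(HTMLParser):
    """Collect href values of <link> tags whose rel contains 'next' or 'prefetch'."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, e.g. rel="next prefetch"
        rels = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if href and ("next" in rels or "prefetch" in rels):
            self.hints.append(href)


def prefetch_hints(html):
    """Return the list of hinted URLs found in an HTML string, in document order."""
    parser = PrefetchHintParser()
    parser.feed(html)
    return parser.hints
```

Feeding it a page's source gives you the hrefs that a prefetching browser might request without the user ever clicking anything.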
The Firefox extension "Link Widgets" also uses those to generate a menu bar for the site (it also uses the rel="top", rel="up", rel="first" and rel="last").
This has been around for years. Load up mail-archive.com in Firefox 2 (for example) and browse threads through Charles proxy or Fiddler so you can see what it's requesting; you'll see it pre-fetching the next message in the thread thanks to rel="next".
Not the OP, but: on your site, you could force them to 'preload' tons of 'content' that they'll never really need. With low-limit tiered bandwidth, that could be extra evil.
Prefetching could explain the occasional log entries I've seen where the page is retrieved by a modern, graphical browser (i.e. not lynx, links, elinks, w3m), but no images or stylesheets are downloaded.
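A quick way to look for that pattern in your own logs, sketched in Python. It assumes Apache's combined log format and a hand-picked list of asset extensions, and the function name is hypothetical. It is only a heuristic: a client with everything cached, or with images turned off, will look the same.

```python
import re
from collections import defaultdict

# Matches the Apache "combined" log format (assumed; adjust for custom formats).
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Extensions treated as page assets (an assumption, extend as needed).
ASSET_EXTS = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico")


def pages_without_assets(lines):
    """Map (ip, user agent) -> page paths for clients that never fetched an asset."""
    pages = defaultdict(list)
    fetched_assets = set()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that are not in the expected format
        client = (m["ip"], m["agent"])
        path = m["path"].split("?", 1)[0].lower()
        if path.endswith(ASSET_EXTS):
            fetched_assets.add(client)
        else:
            pages[client].append(m["path"])
    return {c: p for c, p in pages.items() if c not in fetched_assets}
```

Running it over an access log (for example `pages_without_assets(open("access.log"))`, with whatever path your server uses) lists the clients worth a closer look.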
For people who want an accurate representation of the number of people visiting their sites and the marketeers who would very much like correct data too?
While your idea is good and Apache has the CustomLog directive which could do it, the process is much more complex because of typically customized logging directories (especially where there's more than one client per host).
http://www.google.com/search?q=hacker+news
there is a link element added to prefetch news.ycombinator.com.
I assume if you aren't careful in your logs, you could be overestimating your traffic.