Chrome will Soon Let You Share Links to a Specific Word or Sentence on a Page (chromestory.com)
361 points by kumaranvpl on Feb 15, 2019 | 319 comments



Technical question, apart from the (valid IMO) political arguments: Google's idea to implement this is to append the text fragment to the URL as a "#targetText" pseudo-query-argument in the fragment part. This sounds like it would very easily break existing web pages that use the fragment part for their own state.

In the blog post, there doesn't seem to be anything addressing this (or even just specifying how Chrome would combine the "targetText" fragment with an already existing fragment part)

Did they spend any thought on this?

Edit: Oh, right, the readme answers this:

> Web pages could potentially be using the fragment to store parameters, e.g. http://example.com/#name=test. If sites are already using targetText in the URL fragment for their own purposes, this feature could break those sites.

> We expect this usage to be exceedingly rare. It's an abuse of the purpose of URL fragments and page arguments are typically encoded using the '?' character (e.g. http://example.com?name=test). Still, we may run an experiment to see if targetText is available enough to use as a reserved token.

We'll just completely change the commonly understood processing model of fragment identifiers, but it'll be fine since we don't think anyone is using them anyway...

No further questions...


Feature author here.

I'd like to first clarify that this is still at a super-early stage of development; none of this is shipped or finalized yet. The feature hasn't even requested approval to ship, which is the point at which these kinds of issues would be brought up. We take web compat very seriously. If this breaks even a small percentage of pages, it won't ship. Part of the shipping process is ensuring we have at least a draft spec (W3C, WHATWG) and at least some support from other vendors.

Sorry, the explainer text came off more dismissive than I intended. I wanted to get something implemented so we could start experimenting and see how this would work in the wild. #targetText= is a first attempt at syntax; any criticisms or data on how this might break things would be appreciated.

From my (limited) understanding of how fragments are used like this today, the way this would break a page is if the page itself was using "targetText=..." and parsing the result. This is something we can and will measure and see if we have a naming collision. For pages that use a "#var=name" type fragment, we could append "&targetText=...".
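
Purely for illustration (this syntax is not final, and the example URLs are invented), the resulting links might look something like:

  https://example.com/article.html#targetText=some%20phrase
  https://example.com/app#view=inbox&targetText=some%20phrase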

I'm not tied to any particular syntax here so if I'm missing why this is a monumentally bad idea, please file a bug on the GitHub repo: https://github.com/bokand/ScrollToTextFragment/issues


This is not rare at all; I use it almost every time I link to a Wikipedia article, for example: https://en.wikipedia.org/wiki/Newton%27s_method#Examples

That said, it seems easy to make this backwards compatible: if the existing #blah syntax is a valid link, that should take precedence.

FWIW, I think there's way too much cynicism in this thread. My first reaction on seeing the title was "cool! I would use that!".

edit: Hmm, I may have misunderstood the compatibility issue.


> That said, it seems easy to make this backwards compatible: if the existing #blah syntax is a valid link, that should take precedence.

That's indeed how it works. The majority of the compatibility concerns appear to be with apps that are using _custom_ parsing of the fragment to perform app-specific logic, which is a valid concern.


Angular 1 does this.


FYI, there's even a Chrome extension that makes this easier:

https://chrome.google.com/webstore/detail/display-anchors/po...


The parent comment's example includes a trailing equals sign on the fragment that could be used to differentiate (#targetText= vs #targetText).


Why is this not being pushed to become part of the web spec first? That seems more reasonable than pushing a feature with the possibility of breaking pages that are spec compliant.


It will. The specification process needs to be informed by implementation and experimentation. When implementing a feature we'll learn all sorts of things and hit bumps that will help guide the design. Once we have a working implementation, it helps to be able to use it to answer things like:

- How does this perform on existing pages?
- How does this feel from a user perspective?
- How can I write pages that use this?

Specification-up-front is very theoretical in the absence of an implementation. IMHO, where it's (rarely) happened in practice, the specification is often unimplemented. The value of a specification is that it allows other browser vendors to add an interoperable implementation.

Again, I'd like to stress, this is in the very early-stage experimental phase. We aren't dropping a feature that'll break the existing web.

Edit: To clarify, "implementation" does not necessarily mean shipped to users.


> Edit: To clarify, "implementation" does not necessarily mean shipped to users.

Literally just today we got an article about Web Bluetooth on the front page of HN, a user-shipped capability in Chrome that is still in the Draft phase of standardization.[0]

Beyond that, we have the Web USB API, which is also in the draft phase, but is of course shipping to users in Chrome, on by default.[1]

Beyond that, we have HTML imports, which have been rejected from the standard but are still shipping to users in Chrome.[2]

I don't doubt your intentions, but if you think that Chrome is going to wait for standardization on this before it ships, you are not paying enough attention to the teams you're working with. And once Chrome ships a feature to users, the web standards body has basically two options: accept Google's vision of the standard wholesale, or change the standard and break websites that are already using Google's implementation.

I would be more confident and trusting of the process you describe if there was some kind of official commitment from the Chrome team that this feature will stay behind a browser flag until the standardization process is completely finished. But I think some of the reason you're getting immediate pushback to an extremely early draft of the spec is because developers don't trust Google not to ship this on-by-default once it reaches a WD stage.

[0]: https://developer.mozilla.org/en-US/docs/Web/API/Web_Bluetoo...

[1]: https://developer.mozilla.org/en-US/docs/Web/API/USB

[2]: https://developer.mozilla.org/en-US/docs/Web/Web_Components/...


I feel like the actual, underlying issue worth bringing to the standards body is: how do we address arbitrary content on a web page, even when it moves around or changes over time? Aside from these neat Chrome links, this would enable some super interesting features, such as the ability to add persistent and shareable annotations to webpages.


the ability to add persistent and shareable annotations to webpages

Microsoft did this many moons ago. People ended up defacing web sites like the New York Times and spreading proto fake news.


This is my only gripe. Please work with a spec first, so that the feature flows naturally through to other vendors. Especially because it's changing the way URLs work.

You're going to get people generating millions of these links around the web. That's a long term legacy of hyperlinks generated by the citizens of the web. Although the fallback is obviously pretty harmless, what about a future feature? If this doesn't work out, but some other feature down the line does, suddenly thousands of these links start working in weird ways. Or worse, the feature doesn't happen at all, because of the legacy of broken links. I know that's pessimistic, but URLs are the foundation of the web, changing how they work should be funneled through the spec.


Historically, web specs often start out more descriptive than prescriptive. Building a spec without a reference implementation is a great way to build an unimplementable spec.


Thank you very much for replying on this thread. It's absolutely a very useful feature, and when done in a standardized and privacy-conscious way, I think it would be an enrichment for the web platform. (Can we extend the same to images, too, btw?)

I think the reason this sparked concern is because (by using fragments) this intrudes into a field that was previously under full author control.

I think clear guarantees about which aspects of a webpage are the responsibility of authors and which are under browser control are important - only going by real-world usage and assuming everything not directly used is free for the taking is not enough here.

E.g., I think there are failure modes for SPAs that are not easily found with a usage search. [1] Additionally, this would make it harder for new applications to know which kinds of fragment identifiers are "safe" to use and which are not.

There seem to be some existing specs that deal with the same problem [2].

Maybe those could be a starting point for the feature to go forward without interop/responsibility problems?

[1] https://news.ycombinator.com/item?id=19170230

[2] https://news.ycombinator.com/item?id=19169582


I'm not sure what internal resources exist or what access is available to your team, but I would think that Google's search indices would be the best resources on the planet for analyzing existing URL fragment design patterns. It seems like you could classify various implementations and run tests against each of the major groupings, so that you can be confident that xxx.xxx% of sites that currently use fragments will be supported by this design.

It’s an exciting possibility to link more directly to resources. I hope it is implemented in such a way that all browsers can follow those links with parity in the future. If that is the case I have a few hundred thousand outlinks that could be refined for clarity.


Would you append the & to the fragment, or to the query? If to the fragment couldn’t that affect routing for SPAs?

To be clear, I agree that this format is exceedingly unlikely to collide with real fragments, but still seems like there is a safer approach.


> Would you append the & to the fragment, or to the query? If to the fragment couldn’t that affect routing for SPAs?

To the fragment. It could, depending on how the app was written. Is there a framework or some common techniques that use this that I could read up on?


There was a time when several SPA frameworks used explicit hash routing rather than URL-like pushState, but I don't have sources handy. I thought Angular was one of them, but they may have done away with this behavior.

I work in analytics so I’ve seen things like UTM params get couched into the hash, breaking parsing, so at least it’s something to keep an eye out for.


Thanks, I'll make sure to dig into that some more!


At Microsoft, I worked on a popular Web app that for various reasons had to resort to using the # to enable deep linking. If you put anything in the hash it will break things. We assumed an empty hash unless it was something we set.


Everything after # in the URL is not sent to the server, which makes it handy for storing tokens.


Just to clarify your phrasing: are you saying that it would be OK if this breaks fewer than 20 million web pages? Because that is still a hell of a lot of pages.


It could also break pages that rely on full control over the hash; having targetText= injected unexpectedly would break a lot of crude param parsers.
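
As an illustration, here is a hypothetical crude hash handler of the sort that could trip on an injected key; the app, its key names, and its error handling are all made up:

    // Naive hash "router": treats every key in the hash as one the app itself
    // put there, and rejects anything it doesn't recognize.
    var KNOWN_KEYS = ['view', 'id'];

    function routeFromHash() {
      var parts = window.location.hash.slice(1).split('&');
      for (var i = 0; i < parts.length; i++) {
        var key = parts[i].split('=')[0];
        if (key && KNOWN_KEYS.indexOf(key) === -1) {
          document.body.textContent = 'Invalid URL';  // an appended targetText=...
          return;                                     // would land here
        }
      }
      // ...otherwise render the view named in the hash...
    }

    window.addEventListener('hashchange', routeFromHash);
    routeFromHash();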


That's true, but presumably these pages would have the same problem with existing id-fragments, no?


Tell me how autoplay audio is doing with its rollout, rollback, and butchered secondary rollout.

You can claim to have high standards, but Google is trying to dictate the future of the web.


> It's an abuse of the purpose of URL fragments

So what they're saying is, they expect a clash of their abuse of URL fragments with others' abuse of URL fragments to be rare because...

The arrogance here is astounding: "It's OK for us to use this, because others shouldn't be using it"


Aren't URL fragments originally used for page position already? Like url.com/#title would scroll the page to the div with id "title"? In that case I don't feel like using the URL fragment for a word position is so much breaking it; it keeps the use of the URL fragment for info about page position, right?


Yes but this is about namespacing/syntax and likelihood of clashes.

Normally, URL fragments were intended to be used by page authors, making clashes impossible (the page author controls both page and anchor syntax). Google are breaking this contract, but argue that their parameter-style syntax should not clash with author usage, because using parameter-style syntax is "an abuse" of the feature.


This is a good point, which I didn't get until your comment. I think they should surface this on that page, because it seems everyone is thinking this will be a big clash, but as you say, if someone shares a sentence, the likelihood is low.

To double check, does this means things like,

  https://www.spa/#!/path/to/view
Can become

  https://www.spa/#!/path/to/view&targetText=percent%20encoded
The only part I don't get about the likelihood and this extension is how it interacts with these kinds of hash routes.

I haven't thought that much about the next part, but how do people feel about the following?

  web+text://https://www.spa/#!/route/to/view#targetText=percent%20encoded
or something. Rather than extending URLs vertically, extend horizontally into a different protocol the URI of which can encode the syntax elegantly.


I think that's more risky in that most browsers and many intermediate proxies or other traffic inspectors are likely to have special casing around http: (at least) and https: and for most traffic to be over http(s): and a tiny sliver over web+text: seems fraught with more peril than a (fairly small and intention-aligned) abuse of #fragments. Even mail agents, markdown processors, and other text-display applications often have special casing to turn http(s) URLs into touchable/clickable links.

Page authors are unlikely to use web+text: if it means cutting off old browsers from access.


Google's fragment solution should gracefully degrade to behaving like a normal link in other browsers (as long as there isn't some fragment conflict), but creating a new pseudo-protocol would have to be supported by each browser to even get to the site.


You could use two hashes like that and stick with http. I reckon browsers would cope, and you could strip the second one.


Well, if we're talking about intention, they were originally a browser feature allowing the browser to scroll to a specific <a name=...>. They were invented before JS if I'm not mistaken, so the assumption was that the browser was handling it.


That is true of any URL; the user can decide to use it in ways you don't want. But that is also an intrinsic feature of the web. The best approach is to give users flexibility, but accept that websites may override that behaviour.


Originally used with <a> anchors IIRC. So, yes it breaks the original use, instead of scrolling to the anchor with the correct "name" it could/would scroll to a different text search point.

Why not "##" as a new URL fragment, or add a "scrollto" to the link tag that the browser can use?

I guess those were discussed, is there an RFC?


>Aren't url fragment originally used for page position already? Like url.com/#title would scroll to the page the div with id "title"?

That was the pre-SPA, traditional role of anchors...

If you did any HTML between 1995-6 and 2005, you'd use it for that.

Then, with AJAX/SPAs, # was used for state (pre-history API) -- and I'd assume many webpages are still left at that use.


Of course, page anchors are still in wide use, and not just on old pages. https://github.com/facebook/react/blob/master/README.md#inst...


Very wide use.

I just finished a large healthcare web site that uses page anchors.

Page anchors are recommended by one of the (many) federal web accessibility guidelines. This is so that people with limited motion or poor motor control, or visual impairments don’t have to scroll to find what they’re looking for on a page.

It’s also how the “Skip to content” links work, which is Accessibility 101.

Sorry, I’m not at work or on a computer so I can’t link to a reference.


Here's a reference for "Skip to content" for those who have never bothered to make their web sites usable for people who aren't 20-somethings in perfect health (Google):

https://webaim.org/techniques/skipnav/


It's another story point in the tale of how Google shaped the neo-modern web.

Google's influence is massive, anyone wanting to build a profitable website needs to bend to their standards of performance and structure.

Normally this was a general push in the right direction for everyone. With how they've been handling AMP however, part of me wonders if they're not trying to build a feature moat around the web itself. Although misguided use of their power by individual product-managers is probably the less sinister, more likely reason.


Hmm, who provided the first popular JS framework that used fragments for navigation again?

I think it was called "Angular".


Gmail was probably the first very-widely-used app that used the fragment for navigation, and Google drove the whole #! thing.

That technique is now considered obsolete, with the History API (history.pushState and all that lot) having supplanted it. IE9 is the most popular browser currently in use that doesn’t support it; there are still definitely some systems out there that will use the History API if it’s available, or fall back to using the hash if it’s not, but I find that using the hash for navigation in deployed apps is surprisingly dead, given that it’s shy of six and a half years since IE10 came out, and Firefox and Chrome only had a year and two’s lead (and Chrome’s implementation especially was unreliable for quite some time).

Using the hash for navigation is still very common in development workflows. But that can be ignored for a feature like this.


pushState is bad for many use cases because it completely breaks linking.


It doesn't have to, if you also provide routes on the backend so that pushState URLs actually resolve to a real URL that serves the same content. Or just provide a wildcard route and use JS to detect the URL like you would with hash-based navigation.

It does require more coordination between the front and back-end though.
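
A rough sketch of the client side of that approach; render() and the data-spa attribute are invented names here, not any particular framework's API:

    // Intercept in-app links, push a real path, render client-side. The server
    // must answer every such path with the same app shell for deep links to work.
    function render(path) {
      document.getElementById('app').textContent = 'Rendering ' + path;  // stub
    }

    document.addEventListener('click', function (e) {
      var link = e.target.closest('a[data-spa]');
      if (!link) return;
      e.preventDefault();
      history.pushState({}, '', link.getAttribute('href'));
      render(window.location.pathname);
    });

    window.addEventListener('popstate', function () {
      render(window.location.pathname);  // back/forward buttons
    });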


I meant using pushState with the state object for something other than caching. If I click on "About us" and this results in pushState({page:"about-us"}, "", "#"), then linking just doesn't work any more. It's a regression back to the days of Flash.


Ah, gotcha. Yes, that's a truly awful way to use it.


You can go back even further. GWT (formerly Google Web Toolkit) uses the fragment to store navigation tokens. I actively maintain a GWT SPA in my day job.


> It's an abuse of the purpose of URL fragments

Would you just look at who's preaching... If it ain't the same Google that all but standardized the use of the #! contraption in the pre-HTML5 days - "you don't use it, we don't crawl it" kinda thing.

This is an abuse alright, but a better choice of wording might've been appropriate.


While it does smack a little of the arrogance of a big player, they are using the fragment part for more-or-less what it used to be used for (navigating to a specific location on a page), and if you use something in a way it isn't intended (using a navigation aid for state management) perhaps you should expect certain things to break at some point when combined with other hacks of the same feature.

It is important to note that this won't break your site/app in the normal use case. People using your work will continue to do so uninhibited unless/until they wish to use this new feature in Chrome.


Couldn't they put this in a second fragment? #pageNavigation=home##targetText=foo

That way both could live together, if the page uses # in the fragment, just percent encode it.

At most this would break old bookmarks.


It should only cause a problem for pages using “targetText” as an anchor or id where someone would link to “page#targetText”

I would expect that it's possible for both to work, since Chrome's hack requires the equals sign and its parameters. So with no equals sign, it works as per the spec; with an equals sign, it works with the hack (which will likely make it into the spec).

Even when a page has the targetText anchor and receives the text link param, I think the desired user behavior is to link to the text, not the anchor.


The proposal, as currently stated, is to use fragment id processing first. If that fails to find a target, fallback to text matching.

In other words, even if you give an element "id='targetText=something'", this feature wouldn't break that.
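
Roughly, the order described above could be sketched like this (a paraphrase of the stated behaviour, not the actual Chromium implementation):

    // 1. Try normal fragment processing (element id, then <a name>).
    // 2. Only if nothing matched, treat targetText=... as a text search.
    function navigateToFragment(rawFragment) {
      var decoded = decodeURIComponent(rawFragment);
      var target = document.getElementById(decoded) ||
                   document.getElementsByName(decoded)[0];
      if (target) { target.scrollIntoView(); return; }
      if (rawFragment.indexOf('targetText=') === 0) {
        var text = decodeURIComponent(rawFragment.slice('targetText='.length));
        window.find(text);  // non-standard, scrolls to the first match
      }
    }

So a literal id of "targetText=something" keeps working because the id lookup runs first and wins.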


Good lord, I see pages that would break from this constantly...


Which is exactly why we build an experiment first, to find potential issues before shipping.

If you do see a page that you think might break and you're on Chrome newer than 74.0.3706.0, turn on chrome://flags#enable-text-fragment-anchor and give it a try. If something looks out of place, that would be extremely useful data for us. File a bug either at

https://github.com/bokand/ScrollToTextFragment/issues

or

https://crbug.com/new


you constantly see pages with #targetText=blah fragments?


This assumes all SPAs would ignore unknown parameters they see in the fragment part. Even if the SPA itself doesn't use "targetText", it could still try to validate the fragment and abort with errors when unknown data is present.

Also, this would require the "targetText" parameter to live peacefully alongside other parameters. (That in turn would require there were a standard definition of "parameter" inside the fragment at all)

E.g., if Chrome encounters a URL with an existing fragment, it would somehow have to combine this with its own targetText parameter and hope the SPA still understands the combined fragment.


I think he means it the other way around: you have a single-page application with routing based on #someRoute, then someone tries to share a link with #targetText=..., which will lead to an invalid route. Or am I misunderstanding something?


You’re thinking of ?targetText= or &targetText=

Hash isn’t used for query parameters.


Not for query parameters sent to the remote server, but there are definitely pages/applications that store state in the hash.

For example, look at the URLs used by mega.nz, or any encrypted pastebin (they store the decryption key in the hash so it's not sent to the server).


Backbone.js uses the hash for routing. Case in point: http://backbonejs.org/#Router. There should be a lot of sites from the early SPA era that still exist and use hash params as routes.
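
For reference, hash-owned routing in Backbone looks roughly like this (a minimal sketch, assuming Backbone and its dependencies are loaded):

    // The app owns the whole hash: #help, #search/kiwis, etc.
    var AppRouter = Backbone.Router.extend({
      routes: {
        'help':          'help',    // matches #help
        'search/:query': 'search'   // matches #search/<query>
      },
      help:   function ()      { /* render the help view */ },
      search: function (query) { /* render results for query */ }
    });
    new AppRouter();
    Backbone.history.start();       // start listening for hash changes

A URL like #help&targetText=foo would then fail to match the 'help' route, which is exactly the kind of breakage being discussed.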


It was used in SPAs. It was a common technique in the mid-2000s up to 2010 or so.

But not for query parameters per se, for app state stored in the query string (and bookmarkable).

https://stackoverflow.com/questions/15238391/hash-params-vs-...



doesn't seem to break on addition of a targetText "hash-param"


It is in client-side JS single-page-apps...


Lots of older or poorly written SPAs use the hash exactly in this way. I guess it's their fault for being poorly written, like those people who allowed GET requests to /deleteaccount on the theory that you had to be logged in with a browser for it to ever happen, not considering that Google would make an extension that issued all the GET requests on a page on entry.


They didn't use it because they were "poorly written"; it was a "state of the art" technique back in the day.

Gmail used it, others used it.

https://stackoverflow.com/questions/15238391/hash-params-vs-...


sure, it was state of the art back in the day, but state of the art over time depreciates to technical debt, and the ones that are left using what was once state of the art are now poorly written.


>and the ones that are left using what was once state of the art are now poorly written.

They could just be perfectly written (for their time), just legacy and not updated, is my distinction.


At some point, the accretion of legacy and the failure to fix issues to match better understanding turns an application that was perfectly written for its time into one that is poorly written for the present.


That doesn't give Google the right to break all those sites though.


Right, I guess I should have indicated sarcasm on the "I guess it's their fault" part.


Yes, in particular many B2B web apps' help systems (looking at you, adobe) do very unholy things with these fragments.


Wouldn't it be valid to anchor the link to a div with an 'id' of targetText? I fail to see how that's in any way an abuse of URL fragments.


> We'll just completely change the commonly understood processing model of fragment identifiers, but it'll be fine since we don't think anyone is using them anyway...

That's not true. Truth is:

> We'll just completely change the commonly understood processing model of fragment identifiers, it may not be fine, but can you really stop us?


We shall stop using the turn signals (blinkers)... It is an abuse, but anyway, nobody uses them.


I would like to urge the browser developers/makers to adopt existing proposals which came through open consensus and which cover precisely the same use cases (and more!)

W3C Reference Note on Selectors and States: https://www.w3.org/TR/selectors-states/

It is part of the suite of specs that came through the W3C Web Annotation Working Group: https://www.w3.org/annotation/

More examples in W3C Note Embedding Web Annotations in HTML: https://www.w3.org/TR/annotation-html/

Different kinds of Web resources can combine multiple selectors and states. Here is a simple one using `TextQuoteSelector` handled by the https://dokie.li/ clientside application:

http://csarven.ca/dokieli-rww#selector(type=TextQuoteSelecto...

A screenshot/how-to: https://twitter.com/csarven/status/981924087843950595


Are there any guidelines for Chromium development? It seems like one can just commit a feature to it and make half the world use it.

Regarding URL hash abuse elsewhere: this current development in Chromium is different from front-end frameworks and specific websites doing it, because SomeSite can't break OtherSite even if both handle the hash differently.

Now Chromium randomly claims part of the hash which others now need a workaround for? Are these devs serious?

edit: dear downvoter, explain yourself :)


> Are there any guidelines for chromium development?

Absolutely! Launching a change to the web platform is a long, arduous process. This feature is currently taking the very early first steps.

For more details, see https://www.chromium.org/blink/launching-features.


Thank you for the link and I may have come off a bit harsh with that last sentence.

Having some feature like this for the web would in itself be good; just claiming a key in the hash part for this would not (and I know that the hash doesn't really have a format, so "it's not a key" etc., but it does get used that way).

A proposition: instead of using plain 'targetText' how about having some prefix for all Chromium's claims in the hash part so that the front-end (framework) developers can filter these out without needing to keep a list?

Good luck!
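
As a sketch of the filtering such a prefix would allow on the framework side (the "ua-" prefix here is purely hypothetical):

    // Hypothetical: suppose everything the browser claims starts with "ua-".
    var RESERVED_PREFIX = 'ua-';
    var appHash = window.location.hash
      .slice(1)
      .split('&')
      .filter(function (part) { return part.indexOf(RESERVED_PREFIX) !== 0; })
      .join('&');
    // route only on appHash; browser-claimed keys are ignored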


Explanation:

> ... and make half the world use it.

vs.

> Add about:flag for Scroll-To-Text

> Adding chrome://flag to allow users (particularly on Mobile) to easily enable and test the feature.

(Commit message via https://chromium-review.googlesource.com/c/chromium/src/+/14...)

So nobody will use the feature unless they explicitly choose to.


Sure, but that won't stay the case, since otherwise no needle-moving group would start to use it. So what would step 2 be?


"Integration with W3C Web Annotations" https://github.com/bokand/ScrollToTextFragment/issues/4

> It would be great to be able to comment on the linked resource text fragment. W3C Web Annotations [implementations] don't recognize the targetText parameter, so AFAIU comments are then added to the document#fragment and not the specified text fragment. [...]

> Is there a simplified mapping of W3C Web Annotations to URI fragment parameters?


I'm so glad that Google just sometimes ignores the W3C's horrible design-by-committee nonsense. It's the same reason XHTML failed.

    #selector(type=TextQuoteSelector,exact=foo)
vs.

    #targetText=foo
/edited to be a more fair comparison


I urge you to dig a little deeper and see why things like `prefix`, `exact`, `suffix` exists in that particular example:

https://www.w3.org/TR/selectors-states/#TextQuoteSelector_de...

Please note that you've actually changed the example!

If you just want to include `exact` or "targetText", you can still do:

http://csarven.ca/dokieli-rww#selector(type=TextQuoteSelecto...

or equivalent:

http://csarven.ca/dokieli-rww#selector(type=TextQuoteSelecto...

but that selects all instances of the text "annotation" in the document. Which is precisely why we need `prefix` and `suffix` to be able to identify the actual selection that the user made.

So, here is the equivalent of your example:

http://csarven.ca/dokieli-rww#selector(type=TextQuoteSelecto...

I hope that clarifies.

This is just tip of the iceberg! Let's stick to fair comparison and consider extensibility.


Even with that in mind, the W3C spec adds unnecessarily verbose syntax fluff.


We can only compare what's on the table, i.e. arbitrary text selection. The targetText proposal doesn't necessarily result in a unique identifier that deterministically corresponds to the user's original selection. So, I'm not sure if there is any point in discussing the syntactical differences on specificity further at this point. Having said that, at the end of the day, it is going to be the application that generates and uses the identifier.

Please have a closer look into why we should be mindful of different use cases here, especially those pertaining to different resource representations, ways to "select", and factor in the state of the resource/representation at the time of the selection. It is there for use. It doesn't mean that all possible options need to make their way into the identifier. This is important because people are going to use these identifiers in documents, and preserving context matters. Like I said earlier, the W3C Note has considered these, as well as permitting the use of different (RFC) selectors for different media. Let's reuse instead of NIH-FUD.


It's not like (un)necessarily verbose syntax is that uncommon in URLs. Or rather, that's exactly where you'd expect an excruciatingly verbose description of what you're linking to.


Such fragments don’t need to be particularly human-readable, only machine-readable. Given that, greater flexibility is generally a virtue: different matching strategies will work better in different contexts, and different tools can benefit from it. Consider the increased scope of the annotations feature—it’s designed for things like more robust referencing, bookmarking with annotations and other things like that.

There then remains the question of cross-compatibility: do all relevant tools implement the same techniques? That is a legitimate concern, but it’s well-enough specced that it shouldn’t be a problem.

Also as noted, various tools out there already use this format. The Chrome team ignoring that and using their own, functionally inferior format is hostile.


it adds a huge amount of unnecessary complexity for such a simple feature

There is a reason nobody implemented the W3C proposal. If I have the choice between a simple working solution that anyone can describe in a single-sentence specification, and a hundred-page specification with tons of unneeded features that nobody is ever going to implement, I choose the former. I wonder how much hate Mozilla would've gotten for the same feature; probably none.


No, Mozilla would've gotten plenty of hate for potentially breaking sites, and adding features without discussing things with other vendors and Internet bodies. The thing is, we all know that Mozilla can't get away with that, but Google can, given Blink's market dominance.


But various tools have implemented the W3C proposal; just not browsers yet, probably mostly because getting the UI right is tricky, and the risk of breaking things higher. The problem is not in the particular syntax, but the feature as a whole.


The W3C design seems to be extensible for future use cases with a consistent syntax while keeping everything under one keyword. Something like this could also be applied to other types of documents, not only to text. Let's say to quickly draw a highlight on an image or specify a page in a PDF document. While "#targetText=" is simple, it is not a generic solution.


Yeah.. I remember Microsoft once thought it would be great to not follow standards. What we got is Internet Explorer.


W3C takes into consideration any platform where a browser might run, instead of which OSes are supported by Chrome.


Isn't it that Google don't have to care about breaking millions of websites but W3C do? At this point every web developer is used to bending to Google's will, W3C have similar power but need to get consensus first.


The bottom one looks prettier, yes. Does it also take into account all the problems the first one does?


Not saying it's the best one, but I built a system we used on the NYTimes web site for a while that was fault tolerant and adjusted to changes in the page:

https://github.com/NYTimes/Emphasis


Typical GOOG move. Add some code, break the web outside the Google way. Why can't those guys write an RFC "Content sublocator-protocol for diverse media"...

Well, I started writing this and read your answer. This is not a problem specific to html/the web, for example I regularly encounter this when referencing things in pdfs. Will definitely have a look at this!


Back in the days of yore, I made a super simple userscript [1] that was used in my close circles exactly for this purpose; it's basically just

    window.find(document.location.hash.substring(1))
And yes, it was really convenient to exchange links without the need to add "and there search for %word%". I probably haven't seen any other use of window.find [2] ever since.

[1] http://userscripts-mirror.org/scripts/review/8356 [2] https://developer.mozilla.org/en-US/docs/Web/API/Window/find
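
A slightly fuller sketch along the same lines (the decoding and the hashchange handling are additions for illustration, not part of the original script):

    // Jump to whatever text follows '#', and again whenever the hash changes.
    function jumpToHashText() {
      var text = decodeURIComponent(window.location.hash.substring(1));
      if (text) window.find(text);  // non-standard, but widely available
    }
    jumpToHashText();
    window.addEventListener('hashchange', jumpToHashText);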


Cool, it's like it's exposing Ctrl+F.


Why aren't XPointer/XPath fragments used instead of yet another half-baked reimplementation of the same concept?

XPointer and XPath were ratified by the W3C in (...checks...) 2003. They have since been revised, slimmed down, and used in countless XML-based applications and specs.

The cool thing about XPointer is that it is basically impossible that its fragments are already used in the target document. No site would break if XPointer were used.

Examples of XPointers:

* Pointer to _all_ the occurrences of "target text" anywhere in the page (goodbye JS-inserted <mark>)

#xpointer(string-range(//*,"target text"))

* Pointer to all occurrences of "target text" in a <p> in the <main> article

#xpointer(string-range(//main//p,"target text"))

* Pointer to the second <h2> heading

#xpointer(//h2[2])

* Pointer to that specific bullet point that you want your colleague to change ASAP

#xpointer(//ul[7]/li[3])

* Abbreviated fall-back to XPath

#//ul[7]/li[3]


XPointer is one of options for a fragment selector in W3C Reference Note on Selectors and States: https://www.w3.org/TR/selectors-states/

See also https://news.ycombinator.com/item?id=19169582


I suppose it could be argued that the DOM might change on a page, but the text content is less likely to. We want deep links to the text content, not the DOM element.


At first glance, this seems like a really half baked idea to me. Some immediate thoughts:

- Wasn't there a draft spec floating around in the early 00s that tried to accomplish this, but didn't catch on?

- Fragments are already used to link to fragments, albeit more author friendly than user friendly I suppose

- This won't fly very well with sites that still make use of fragment based navigation, will it? (Believe it or not, I still see new projects use this over the history API!)

- What happens if there's an element in a page with the id or name `targetText`? This is not mentioned in the very tiny compatibility section[1] (where some actual data wouldn't hurt, btw)

- The linked proposal[2] already mentions that more specific fragment semantics is defined in other media types, so why not follow suit and propose the HTML spec is amended, instead of making this a user agent feature?

- Fragments in URLs are already pretty useless in sites that don't load content until after the dom is ready, i.e. anything that just loads a bunch of JS and then builds the site in the client – how would this fare any better?

[1]: https://github.com/bokand/ScrollToTextFragment#web-compatibi...

[2]: https://github.com/bokand/ScrollToTextFragment#encoding-the-...


> (Believe it or not, I still see new projects use this over the history API!)

Want to guess which one doesn't need JS, or server-side handling, to work?


I think maybe you're misunderstanding what I meant by fragment-based navigation; that's probably my fault. I refer to the shebang-style fragments that were used to implement deep linking in primarily JS-driven sites. This of course watered down the use of fragments in pages, and frankly nearly made them useless. Ironically, the history API, which of course as you say requires JS to work, is what probably saved fragments from being completely useless, since it meant everyone could return to using paths for deep linking and still implement their sites as JS blobs if they so chose. Moreover, it opened up hybrid approaches, where the server can actually serve a pre-rendered site that's then amended by JS (or not!), since the server would now see the full path that users were navigating to, as opposed to /my-catch-all-entry-point. Servers never get to see the fragment, after all.


The issue is that your server now needs to somehow understand which part of the path is the "fragment" without any distinguishing characters by which to parse it out of the URL.

If the server now sees deep URLs like `/products/patagonia/parkas`, how does it know which part is the fragment? Is it /parkas or /patagonia/parkas? What if you have URLs that are not completely uniform, or have different uniformity at different prefixes?

The router needs to be more complex. But I agree that the history API has advantages over shebangs for deep app links.


The point isn't so much that the server needs to do anything; you can still just serve up the same response no matter the path, and have the client render different results, just as you would with shebang-style navigation in the past. The difference is that you now have options, which you didn't before, since the server never saw the fragment part of the URL, only / or /entry-point or whatever. So you can have more complex logic server side (really, this was done long before in-client routing was a thing), but you don't have to; it can still be deferred to the client. Using shebang-style navigation you never even really had the option; the history API enabled this.
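
As a sketch of the "same response no matter the path" option, assuming an Express setup (any server with a catch-all route works the same way):

    // Serve static assets normally, and fall back to the app shell for every
    // other path so pushState URLs still resolve on a full page load.
    const express = require('express');
    const path = require('path');
    const app = express();

    app.use(express.static(path.join(__dirname, 'dist')));
    app.get('*', function (req, res) {
      res.sendFile(path.join(__dirname, 'dist', 'index.html'));
    });
    app.listen(3000);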


Lots of (mostly free) hosting just serves the files you upload, so that wouldn't work unless you copied the files to be served at every possible route.


“#” Fragments aren’t sent to the server at all. They are ONLY used to get to a specific anchor (old school) or processed by JavaScript to do something (often fake navigation by showing and hiding content). This article is adding a 3rd magic use for them.


Yeah, I know. Parent was saying that the History API doesn't need any server support, because you can just serve the same file no matter what the subpath, but my point is that lots of servers don't even have support for that. That's why the fragment is still more useful than the History API in some cases.


Fair point, thanks for adding this perspective! I of course assumed there was at least a modicum of control over this server side but as you correctly point out if there is no such control then shebang style navigation is your only option if indeed you must do client side routing. I once abused custom 404 pages (I think in GH pages, but it may have been some other free host with similar functionality) in order to achieve this kind of behaviour. It was just a demo so didn't matter much, but of course this is far from ideal for anything else.


Ahh! That makes more sense. Thanks


I'm confused, at what point is a server involved? Fragments aren't sent to the server.


[flagged]


> Honest question: is there a specific person who is smugly grinning about this at google.

I sort-of understand where you're coming from, but isn't this needlessly hostile? Unlike in the case of "spying innovation" you brought up, Chrome's "Scroll to Text" seems to be intended with the user's benefit in mind (and only indirectly Google's — via greater user satisfaction, but that's exactly the product design model we want).

It would have obviously been far preferable if this had gone through the "proper channels" (discussion at the W3C, standardisation etc. first), but I don't think that there's any evidence to believe that anybody at Google thinks that they're the first to come up with this.


I would hope not, but who knows? I guess it's more like the road to hell is paved with good intentions.


Something open, standardized, quite similar and already working is Web annotations. This includes 'robust anchoring' capability which could surely be used to solve this problem.

https://www.w3.org/community/openannotation/

Hypothes.is is a well established open implementation.


https://web.hypothes.is/ is the URL parent was referring to, but there was a typo.

This looks similar to Genius.com's annotations. I wonder how this will evolve with multiple providers allowing similar functionality. Ideally you'd be able to combine annotation networks for a single URI to get all possible annotations you want on a given page.

Problems with these types of systems are the same as product reviews like on Amazon. Astroturfing & false information. It'll be interesting to see how that evolves.


Hypothes.is is general purpose but has a focus on scholarship. They have different namespaces (including global) and work with different communities (and customers) to provide overlaid environments.

Moderation and curation are perennial issues. Maybe these overlays will help. It's not the same, but Reddit has different communities on different subreddits (with different moderation policies) discussing the same thing.

And re portability, they've made some very encouraging noises about data portability and exchange. It's baked into the standard, and their default content license is, I believe, CC something.

Thanks for catching the typo!


> https://web.hypothes.is/ is the URL parent was referring to, but there was a typo.

> This looks similar to Genius.com's annotations

» At least one member of the Hypothesis team came from Genius (formerly Rap Genius) which is the largest annotation service on the web.« https://news.ycombinator.com/item?id=13739965


Because that is exactly what we have all been clamoring for. Gee thanks Google.

OK, that is more than a bit dismissive and I shouldn't be as cynical as I am about it, but I switched to Firefox for a reason, and this crap just keeps interfering. But oh wait, you want to watch YouTube TV in Firefox? Oh, so sorry, you can't, because it won't do video acceleration. I remember when you had to run Silverlight to watch Netflix too; that sucked as well.


I recently started to run into new IE6-style stuff: a bank plugin for certificates that was made only for Chrome, then on mobile some e-commerce site which simply didn't work in mobile FF (the real one, on Android), stuff like that. I suspect there will be way more of this monopolistic crap now that FF has fallen below 10% and Edge has died.


I use Chrome for a few spare things (like Youtube and full-screen gslides) but for all else FF or Safari (both with ublock+umatrix) are much better. FF handles XML rendering way way faster than Chrome - great for my job.


How do you run umatrix with Safari? I’ve thought it’s not available.


I got that part wrong - sorry to tease. No uMatrix in my Safari. Just content blocker + ublock-origin.


For youtube watching, youtube-dl works nicely for much of youtube, and after you download the video, presumably your local player can make use of all your hardware acceleration that might be available.

http://rg3.github.io/youtube-dl/about.html


I can't find an RFC about this, but it looks like something similar has been proposed in the past, with a slightly different API.

> In URIs for MIME text/plain documents RFC 5147 specifies a fragment identifier for the character and line positions and ranges within the document using the keywords "char" and "line". Some popular browsers do not yet support RFC 5147.[5] The following example identifies lines 11 through 20 of a text document: http://example.com/document.txt#line=10,20

https://en.wikipedia.org/wiki/Fragment_identifier


Right on. Different fragment selectors can be used depending on the media type of a representation: https://www.w3.org/TR/selectors-states/#selector

Expanded here: https://news.ycombinator.com/item?id=19169582


I must say I prefer the RFC version - while still vulnerable to rotting from text changes, at least it should still link to the same general vicinity.


HTML documents don't generally have the concept of lines, since text reflows based on the way the user agent (browser) renders the document. That's why the RFC only talks about the text/plain MIME type: browsers render those with all line breaks intact. You could maybe use CSS selectors, but that adds a tremendous amount of complexity and is not easy for non-web-developers to read anymore.


Oops, good point. I suppose you could do by newline/paragraph plus character instead, but that would be even more vulnerable to text changes.


Very significant limitation: it only links to the first instance of the target text.

What is the expected behavior if you highlight a second instance of some target text and try to create a link to it? It’s bad that the spec doesn’t discuss it. I can see this happening with short selections, like linking to a single word.

Seems like this could use more thought.


I think this is being overthought a bit. Even in its most primitive form, it could still be quite useful in practice.

It's pretty easy to construct cases where it fails, but all such systems will fail either due to changes in content or changes in structure, and many will fail due to ambiguity as well. It's only really useful to compare how this might fare against other approaches; an example would be producing an XPath expression, which may be more robust to ambiguity but weaker to structural changes.

This system will fail if the content is different enough that the target text is gone, which I believe is very reasonable. And while you could construct ambiguous text that couldn't be disambiguated by picking a larger string, in practice the value added with the complexity of trying to solve that problem is probably not very good. For the purpose of linking to sentences and paragraphs, this system seems sufficient and probably even better than many alternatives. I don't think many articles have a lot of repeating sentences.

Other use cases than linking to actual text would probably have their issues. Like, trying to link to a cell of a table would be quite problematic. That case may be better handled with something like an XPath expression, but in any case it's a fairly different use case imo.

All in all, my initial opinion is that this design is useful and easy to understand. It gives you the equivalent of Ctrl+F which is not bad.

(Disclosure: I work at Google, but this is the first I've heard of this and I don't work on Chrome.)


> I don't think many articles have a lot of repeating sentences.

This misrepresents the comment you're replying to. The case I gave was "linking to a single word." Articles have a lot of repeating words.

I can see users highlighting and linking to a key word or two in the section they want to link to. This is going to result in a broken link if it's not unique. This is not an edge case. It's not even mentioned as a potential issue in the spec.

Should the browser at least warn that this is going to result in a broken link as the doc currently stands? Should it disable the feature until sufficiently unique text has been selected? Should it just silently let you copy a broken link and chalk it up to PEBKAC (which is what the spec seems to imply)?

Needs more thought IMHO.


Why not just use fragments? It's pretty easy to place a unique fragment at all places you care to link, except perhaps in the very niche case of reimplementing Ctrl+F, which would need JS anyway, so why not just expose an API call, like window.find[1]?

I really don't understand what this brings to the table, besides making people upset.

[1] https://developer.mozilla.org/en-US/docs/Web/API/Window/find


Simple: it doesn't depend on the content having lots of identifiers. While you may convince some web developers to just add more identifiers, you can't convince everyone. And for largely unstructured plain text like mailing list archives, there's not really a logical way to do that anyway.

Re: making people upset. I mean, nearly anything will upset people. One way a search engine could incentivize people to add more usable fragment identifiers is by ranking pages with them higher, but I am certain many wouldn't be thrilled by that.

If this is implemented I don't think it will actually raise serious compatibility issues, and it's not a huge deal to remove it later, it should degrade softly.


That depends on the author annotating their page with IDs, so it's highly content-dependent. In addition, authors can't always predict what will be interesting to users, and pages frequently don't have IDs on elements you might want to link to; this is particularly true of long passages of text.


This whole thing is messy and fragile by its very nature. I'd argue that if you have to get as fancy as linking to the second instance of a string, then your odds of the link going to the right place later are not that great anyway. So it might not be worth even trying to solve that problem.


Perhaps it could automatically expand the selection until it becomes unique, and support an index property for cases in which that would result in an unreasonably large selection (or hits EOF).

Using indices alone would also work, but would be exceptionally prone to link rot as text changes, especially in the case of short (e.g. 1-word) selections.
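
A rough sketch of the "expand until unique" idea (the function names are invented; this is not from the proposal):

    // Grow the selection outward until it occurs only once in the page text,
    // or until it cannot grow any further.
    function countOccurrences(haystack, needle) {
      var count = 0, i = haystack.indexOf(needle);
      while (i !== -1) { count++; i = haystack.indexOf(needle, i + 1); }
      return count;
    }

    function uniqueQuote(fullText, start, end) {
      while (countOccurrences(fullText, fullText.slice(start, end)) > 1 &&
             (start > 0 || end < fullText.length)) {
        if (start > 0) start--;
        if (end < fullText.length) end++;
      }
      return fullText.slice(start, end);  // unique, though possibly very long
    }

If the resulting quote ends up unreasonably long, that would be the point at which to fall back to an index, as suggested above.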


Although you are right, it's already how browsers work in relation to IDs - only the first element with that ID is selected. (Yes, IDs are supposed to be unique, but it's not like HTML "breaks" if you don't follow this rule).

Plus, you can always select more text.


Doesn't the spec specify that the link points to the first instance of the selected text? Does that not accurately define what happens when you try to link to a latter instance: it links to the first one.


That's not great from a UX point of view - for most users, their experience with the feature will be trying to use it, selecting a phrase, and having it silently link to an entirely different area.


This isn't a question about where the browser will scroll. This is a question about what will the browser do if the user tries to create a link to something that can't be scrolled to.


Perhaps it could actually prefill the standard find dialogue and scroll to the first match; then you could easily see if there were more matches and jump to them.


They could fix this by having a second query-string parameter, say "targetInstance", which would take the number of the instance. Indeed, things like this would have come to light sooner if they had gone through a more normal RFC process.


They could add an index too, but linking to a search phrase is already prone to breakage if the text changes, so there's not much point in complicating it.

Usually the same phrase is not repeated in the text if you select more than 1-2 words.


Yeah, it needs something like occurrenceNumber=5, i.e. navigate to the fifth occurrence.


Also, will it be able to make the link-opening party execute malicious requests? Think 'delegated fuzzing'.


It seems like more and more, Google is leveraging their near-monopoly on web browsing to just skip the proposal process and do whatever they want without discussion.


Do you actually follow spec discussions, or are you just guessing? From what I have observed the Chrome, Firefox, and Safari teams are in constant communication. They discuss new features, implementation details, and potential pitfalls.

Despite all the suggestions of "browser wars", they're actually very friendly and collaborative.


The proof is in the article:

> It is most likely that this will be added to other browsers like Firefox too.

> Microsoft’s new version of Edge is going to be based on Chromium, so this will be available on Edge too.

So, although some discussions with Firefox may have begun, it's clearly using the Microsoft decision to use Chromium as a way to force their standpoint. Note also the lack of mention of Safari.

This is the kind of stuff that should go through a central committee, because different browsers will react differently to what should be a simple URL.


I've followed a few directly, and as a web developer I generally have my ear to the ground for these things. It's possible this feature was discussed, but the article made it sound like it wasn't in a formal capacity, and I know for a fact that Google has forced things through in the past, even when there's been active protest from people at Mozilla. And that was while Microsoft still had their own engines.


I think he's right. SPDY and WebP both started as Chrome-only features, and the world has essentially been forced to adopt them.

So far I think that's been a good thing on balance, because they are good technologies. But it is worrying how much power Google has in this space.


SPDY was donated to the IETF for adoption into the HTTP/2 standard. Other browser vendors only implemented it at that time.

WebP is a file format and doesn't fall into the normal standardization process. Browsers implement support for new file formats at their own behest.


Yes. After SPDY had already been implemented and shipped in Chrome.


Of course, that's how it works. Chrome is hardly alone in field testing their new feature proposals.

You'll note that they're also willing to kill features which are not adopted, like PPAPI.


> You'll note that they're also willing to kill features which are not adopted, like PPAPI.

After adding whitelists for Google properties like Hangouts, of course.


It's not like interoperability suffered because of that. HTTP fallback was always there.


Nevertheless it's some kind of fait accompli, especially in a market with only three main actors.

Where one actor can also start using the new "standard" overnight and break lots of services for people who are not using their browser... You know what I mean. It's the kind of enforcement you can only pull off with a monopoly.


I agree with you on Google's monopoly, but imagine if every browser had to go through a proposal process to release any feature? Everything would just stagnate.

Browsers copy features from each other. If this is a useful feature, it would make its way to other browsers pretty soon.


It's not just a browser feature, now is it, it's an interoperability feature since you're supposed to share links (it's in the title). This means that the near-monopoly is going to drive people who receive the shared link to consider using the browser that supports this. That puts other browsers quite on the defense, does it not?


"Your link doesn't work with firefox" sounds indeed really bad.


Stagnation is not always bad. For one, it helps keep complexity down. We're at a point where building a browser is impossible unless you have literally millions and millions of dollars to invest. That is not good.


Why is the possibility of a wheel being reinvented a desirable thing?


Is a browser really a "wheel"? There are so many choices to make, so many lines drawn in the sand. And besides, you really don't want the marketing and tracking giant Google to make all the decisions regarding the software that delivers so much content to billions of users' screens, do you? They've been okay so far, but there's a huge conflict of interest here.


Because competition is good?


The hundreds if not thousands of ad supported calculator and flashlight apps on play store beg to differ.


A reasonable amount of competition is good.


Competition that spurs innovation is good, more like, but I am skeptical of the claim that competition is necessary to prompt innovation, and of whether the costs of such a model (leading to massive waste of society's productive capacity) are worth the purported benefits. For instance, a good deal of mathematics and science was done not on the principle of competition, but in search of improvement. The real question is whether the development of goods and services can use, or could have used, similar principles, which is not only more efficient but seems conducive to better mental well-being; I'd rather desire to improve than to slight (and possibly sabotage) someone who must be my opponent.


Good for the consumer.


If building from scratch maybe, but when was the last time a browser was built from scratch? Even Chromium was forked from WebKit.


That's my point. The web is so complex that browser engines are becoming a monoculture. Even if you forked Chromium now, you'd need hundreds of engineers just to keep up with all the new things that are constantly added.


Yeah, I was rather sad to see EdgeHTML go. Although I never used Edge, rendering engine diversity is good for the ecosystem.


And yet the fact that the major engine implementation is open source still allows each vendor to innovate by adding new useful features for users.

We're really far from IE6 era of complete stagnation.


Not really, as far as the core browser engine tech goes. In practice it's too difficult to maintain a Chromium fork that diverges in any significant way other than the front end. Opera and Edge (per announcement) use vanilla upstream Blink.


> new things that are constantly added

By golly, that's the problem right there.


One thing I’ve discovered from learning and using Common Lisp seriously is that it’s really important for a system to be specified in a way that allows extensions to the system to be implemented in terms of primitives the language provides (e.g. if you want a new language construct, write a macro to transform your desired syntax into Common Lisp). In languages and systems that don’t have this property, an implementation that introduces a useful extension begins to develop lock-in and makes it harder to compete. (E.g. no one writes the Haskell specified in the Haskell report anymore, because all the libraries you might want to use use one or more ghc-specific extensions.)


There are significant tradeoffs between shared constructs and self contained systems. Self contained systems resist ossification; they can be unilaterally replaced. A shared construct has to evolve in step with all of its users.

But the self contained system will have thousands of incompatible implementations, and the shared system will be easier to interact with and build on top of.

What this means is that a browser should standardize on things meant to be interfaces, and leave out all of the rest. Unfortunately, they didn't and now browser engines are huge and impossible to evolve except by adding more stuff.

Similarly, the value of programming languages is almost entirely in the interfaces they provide which allow you to develop code. Lisp provides almost none of that, and that is why it failed to become mainstream.


> no one writes the Haskell specified in the Haskell report anymore, because all the libraries you might want to use use one or more ghc-specific extensions

This is a bit of a misconception. One can still write Haskell98 whilst using a library that uses a GHC-specific extension. The library doesn't (have to) force its consumers to use extensions too!


I phrased that a bit wrong: my experience (from talking to the Haskellers in my company) is that Haskellers generally prefer to turn on a bunch of extensions when they write code. So, while you might be able to write pure Haskell98, it tends not to be idiomatic to do so.


But since when is it a good thing that the last time someone managed to build a web render engine from scratch was thousands of years ago?


Hopefully Mozilla will have some luck modernizing by replacing pieces one at a time with Rust components. ;)


What do Firefox implementation details have to do with this?


They are modernizing the browser by replacing pieces over time. That's relevant to the topic of browsers being built on legacy tech.


I don't see how this supports the argument that you can easily build a new, competing browser by forking. If you fork and then need the same kind of resources to replace the forked codebase with your own, I don't see what you'd win.


Even if you fork, you'll still need constant investment just to keep it up to date with the ever-evolving "living standard".


It sounds not unlike what used to be the role of W3C before HTML5. Everything stagnated, while Microsoft with IE largely did not care, and everyone ended up making their pages in Flash because it was the only really consistent target.


> Everything will just stagnate.

And that is bad exactly why?


Because we've all been heavily brainwashed into constantly requiring "new" stuff. It's going to take a world-changing event to break us out of that cycle IMO.


in this case it looked OK

> Though existing HTML support for id and name attributes specifies the target element directly in the fragment, most other mime types make use of this x=y pattern in the fragment, such as Media Fragments (e.g. #track=audio&t=10,20), PDF (e.g. #page=12) or CSV (e.g. #row=4).

Fragments are already there, but most pages do not have a clickable ID attribute. If Chrome could automatically provide one, I think that's more convenient than opening devtools (F12) to build a link.


There already was a proposed standard from a W3C group for doing this, although it is a lot more complicated (with a bunch of variations, but even implementing one of those would be nice): https://www.w3.org/TR/selectors-states/#json-examples-conver...


> in this case it looked OK

That’s not really the point


I’m fine skiing down this slippery slope


IE 5 was also welcomed with joy when it arrived.


web standards are descriptive, not prescriptive, and always have been: nobody asked before adding the <blink> tag to their web browser; it was just added (and later removed), and consensus on how it was supposed to behave was found only after it was used by more than one implementation.

For example, WebSQL died because there was only one implementation: implementations come before the standards.


Not strictly true.

In the W3C process, a Working Draft can be written before any implementations. Even a CR doesn't need to have been realised. To progress to PR and Recommendation, it needs to have implementations.

WHATWG is more descriptive; HTML5 has been described as a collection of everything Hixie has seen somewhere on the web and thought was cool.


Not strictly true, okay.

Still, it's nothing new that browser vendors just add the things they think may be useful and see what sticks (which then eventually ends up in some standard at some point like asm.js/WebAssembly - or fizzles out like NaCl).

These days with vendor prefixes, polyfills and generally a focus on backward compatibility there's typically some care taken to not leave users of other browsers in the dust.

That's very different from past efforts like, for example, ActiveX which made a full (but undocumented) Win32 environment part of the web browser design.


Browsers have always done that


We've had a time when standards were at the forefront, between the death of the last browser monoculture, and the establishment of the current one.

It was good while it lasted.


if you don't specify a timeframe you can't be contested and therefore your nostalgia reads like it is for a time that never existed except in your mind


Judging from this chart[1], definitely somewhere between 2008 (the x-axis starts at 2009, but if I recall correctly Chrome was released in 2008) and second half of 2012. Possibly the original comment was referring to a larger time span than this, but judging from the linked chart that's when Chrome was still on the up-and-up compared to IE. In my opinion though, this timeframe was 2008 to about 2014, when Google (and Microsoft) essentially strong armed Mozilla into accepting EME[2]. I'm not sure that's when the slippery slope really started, but it's definitely when I started believing there's a new boss in town – same as the old boss.

[1]: https://en.wikipedia.org/wiki/Google_Chrome#/media/File:Web_...

[2]: https://hacks.mozilla.org/2014/05/reconciling-mozillas-missi...


I remember writing websites at that time. Developers were worried about there being "yet another browser to support". It was a big deal because supporting all the browsers was already a lot of work. Nothing rendered consistently. Javascript would run fine in one browser, but not in another. You got to deal with fun features like quirks mode and worrying about transparent PNGs. You had to use Flash if you wanted to embed a video.

The web is a completely different place today. Everything renders consistently across the different browsers - even Edge for the most part. Javascript is a modern language now. Video/audio embeds were finally standardized. Layout tools are significantly better with flexbox and soon, grid.

The idea that we were at some web standards pinnacle in 2008-2012 is crazy to me. The pinnacle is now.


> The idea that we were at some web standards pinnacle in 2008-2012 is crazy to me. The pinnacle is now.

I don't think the point was that we were at a pinnacle in terms of standards quality, completeness, or user agent consistency. Rather I think the point – at least for me – was that during those years it was more of a conversation, not Google choosing a direction and everyone else more or less forced to follow suit. We are certainly in a better spot today in terms of capability, I agree, but I'm not sure I'd agree that we're in a better spot in terms of collaboration.


At what time has there been better collaboration between browser vendors than right now? That is the reason that browsers are so well in sync with each other now.


Well that's the argument, isn't it? Collaboration doesn't necessarily mean being in sync or following suit. When there's a hegemony then a level of collaboration might still exist, but if the interests don't align with the dominant party they can just go ahead anyway and more or less force others to fall in line – e.g. EME. Even if everyone syncs up, it's not necessarily good collaboration. Conversely, the dominant party can choose to just ignore others' contributions and ideas if their interests don't align, effectively making those contributions largely pointless.

Whether or not you agree is one thing, but I hope my point is clearer now anyways. :o)


Thanks for being respectful and well-reasoned in your response.

My opinion is that the collaboration is probably the best it's ever been right now. The example I gave lower down in this thread is that when WebAssembly was introduced, the Chrome team decided to deprecate their own solution, PNaCl.

We're far from the days of ActiveX.

https://blog.chromium.org/2017/05/goodbye-pnacl-hello-webass...


> Thanks for being respectful and well-reasoned in your response.

Likewise!

WebAssembly is a very good counter example, but still my feeling – or fear, really – is that this is not in spite of Google's hegemony but because of it. Let's say they hadn't gotten on board with WebAssembly and instead doubled down on PNaCL or come up with a different competing proposal. Even if all other parties rallied around WebAssembly there's a very good chance the dominance of Google would make it stillborn. If it's not in Chrome, it's simply not worth bothering, purely due to its dominance.

I recognize though that this is a bit of a straw man argument, and one based more on my opinions and feelings on the matter than anything resembling objective truths. I suppose that's also why we'll have to agree to disagree. :o)

> We're far from the days of ActiveX.

Thankfully!


Cheers! It's a difficult thing to quantify anyway, so agreeing to disagree sounds good to me. Do enjoy your weekend!


Aren't they "obviously" talking about Internet Explorer's hegemony "then", Chrome's rise "now"?


That’s exactly what I would do in their shoes, I think. Proposals are cool and I understand why it is better for an open web etc… but getting features immediately is pretty cool too.


But then they only work in one browser and that is massively uncool.

So yeah, standards please. And not Googley “Comments? What do you mean ‘comments’? We’ve already shipped this and we’re not changing it, so you better follow suit”-standards.


It’s almost as if Google is the new Microsoft!


You mean: the old Microsoft!


No, no, Microsoft was the old Microsoft but now they're the new Apple. Google is the new Microsoft, and Apple looks like the new Google but is actually the old Apple (the old one, mind, not the middle one where Jobs came back). Oh, and Dell is the new Gateway. :P


Embrace, Extend, Extinguish


Who are they embracing? They ARE a web based company, why extinguish web browsing?


With AMP they are trying to become the web.


What does that even mean? AMP is built on WebComponents, which are an existing web standard. That's no less standard than anything else using custom elements.


AMP is a self-serving standard.

Nowhere in the standard is "higher search ranking on Google" mentioned as a must-have feature. Yet it is the only feature anyone cares about when discussing adoption.


If that were true (and I'm not so sure it is), then you'd blame the search team for letting it be a ranking factor. It has nothing to do with web standards.


You are not wrong (about what is in the standard). But you are being downvoted because everyone who did implement AMP in the real world did so for the SEO benefits.


Oh I have votes disabled on social media. I'd rather form my own opinion when reading comments.

Regarding AMP, I appraised it from a speed perspective and found it partially effective, but only a bandaid fix. Addressing real site speed issues is the better approach.


Are we blaming the wrong hand?


They are caching a chunk of the web and routing traffic from google.com to their cache. Since traffic never leaves google.com, it is now in a way the web itself. This is achieved thanks to AMP.


Technically that's AMP Cache, not AMP itself, but I get your point. It's worth noting however that both Cloudflare and Bing also run their own instances of AMP Cache.

https://amp.cloudflare.com/

https://blogs.bing.com/Webmaster-Blog/September-2018/Introdu...


And they are going to hide from the user the fact that they are indeed on an AMP page on a Google server by hiding all prefixes.

They already started doing part of this plan btw - https://bugs.chromium.org/p/chromium/issues/detail?id=881410


Reading the comments on this post, it all seems to be envy.

While Chrome does useful stuff like this, Firefox's progress in the web space is... eh... adding ads to the home page?

Maybe Firefox should've gone with a proposal on how to add ads into the browser, or on how to send all of your URL history to a third party. :^)


Almost caught up to the capabilities of Ted Nelson's Xanadu Hypertext System from the mid 20th century.


I was just watching a documentary about him (about the Internet, actually) by Werner Herzog. It was the first time I had heard about Xanadu or Ted Nelson. His ideas were quite radical compared to what I know as the Web. And I got to thinking how the web might have been had Xanadu become mainstream.

But then I opened the Wikipedia page on Project Xanadu; its second paragraph is:

> Wired magazine published an article called "The Curse of Xanadu", calling Project Xanadu "the longest-running vaporware story in the history of the computer industry".[2] The first attempt at implementation began in 1960, but it was not until 1998 that an incomplete implementation was released. A version described as "a working deliverable", OpenXanadu, was made available in 2014.

So... I don't know what to make of it.


It's basically as if he proposed a teleportation device but couldn't deliver any of it. And when Tim Berners-Lee built an airplane which everybody started using, he started complaining about how much better his idea was. In recent years he's only been doing that last thing.


1980s is not "mid 20th century".


It was proposed in the 1960s as per Wikipedia.


Seems very cool!

But the functionality feels like it'd be prone to breakage as new text is added to the document (possible in Wikipedia articles for example). But it'd be super-helpful for search engines as they will have a relatively recent copy of the text, and can auto-generate the links on the backend before showing the search results to the user.


I'm on the one hand skeptical of the idea being useful, and on the other acknowledging that emacs' ability to open a file on a line and word is quite useful. (Basically, the results of grep can be directly fed to emacs to jump to the result.)(I'm sure this is not limited to emacs.)

You are right that the link is somewhat time limited. However, tools like emacs and ctags show this can be somewhat worked with. Instead of the very specific links that grep creates, generate one that has a location, but also a small pattern that indicates it. To that end, you could generate a link to the word "breakage" with a selector saying where it is expected. And the browser can then just fudge around that selector to find that word. Or highlight where it was expected.


As described in the "Here is How This will Work" section, the implementation is humble. It either scrolls the page to the first instance of the string in the fragment, or it scrolls and highlights the text included in and between two comma-separated strings (commas in the strings themselves would need to be URL-encoded). If you wanted to scroll to the nth instance of a string, you'd have to use the latter format and include more text, making the match unique or at least the first instance.

Yes, it would fail if the target text no longer exists in the page, but link rot is already commonplace. It would be important for it to ignore DOM element boundaries; it shouldn't fail just because the target text crosses two paragraphs or a span.
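Roughly, the consuming side just has to find the text and scroll to it. A minimal sketch of that idea (not Chrome's actual code; it assumes the proposed #targetText= syntax and, for simplicity, only matches text contained in a single text node):

    // Parse a #targetText= fragment and scroll the first match into view.
    // Simplified: ignores the optional end string and multi-node matches.
    function scrollToTargetText() {
      const match = location.hash.match(/targetText=([^&]*)/);
      if (!match) return;
      const needle = decodeURIComponent(match[1].split(',')[0]);
      const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
      let node;
      while ((node = walker.nextNode())) {
        const idx = node.data.indexOf(needle);
        if (idx === -1) continue;
        const range = document.createRange();
        range.setStart(node, idx);
        range.setEnd(node, idx + needle.length);
        node.parentElement.scrollIntoView({ block: 'center' });
        // A real implementation would also highlight the range here.
        return;
      }
    }
    document.addEventListener('DOMContentLoaded', scrollToTargetText);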


I'd assume a hash of the content would be included with the link, and if the hash changes, then it would make a best-effort guess at the right location, with a different color highlight indicating that it's a guess. At least that's how I would do it.
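Something like this, conceptually (the hash suffix here, e.g. #targetText=<text>;h=<hash>, is invented for illustration and not part of Google's proposal):

    // Hash the selected text when the link is created; re-hash whatever text
    // is found when the link is opened. A mismatch means "best-effort guess".
    async function shortHash(text) {
      const buf = await crypto.subtle.digest('SHA-256', new TextEncoder().encode(text));
      return Array.from(new Uint8Array(buf).slice(0, 4))
        .map(b => b.toString(16).padStart(2, '0'))
        .join('');
    }

    async function isExactMatch(foundText, expectedHash) {
      return (await shortHash(foundText)) === expectedHash;
    }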


What would you hash though? Content on a page changes frequently and dynamically so the hash would mismatch very frequently.


Is this going to lead to an API of #<keywords> for different scroll-to behaviors?

Also, what happens if the page already has an anchor text that happens to be the exact match? i.e. #targetText=My%20Heading is already an existing anchor


Then the = should be encoded as %3D in the URL, so that it isn't interpreted as #targetText=<encoded target text>


Let me guess, it has nothing to do with https://www.w3.org/community/openannotation/


"Look at you, standards committee - a pathetic creature of humans and ideals, panting and sweating as you chase after my unrestrained "innovations". How can you challenge a perfect, immortal GOOGLE!" (c) Shodan, probably


A similar JavaScript feature has existed for a long time:

Link to first occurrence of "pseudo" on page: https://trac.edgewall.org/wiki/TracLinks#/pseudo

Link to last occurrence of "highlight" on page: https://trac.edgewall.org/browser/trunk/trac/htdocs/js/searc...


How is this supposed to work when most web pages these days are a mash of dynamically generated javascript?


Same as regular URL hash I guess? (Browsers tend to wait a little if they don't find the target)


Can this be used to gain some information about the content of a webpage with a timing attack?

Searching a phrase on a webpage will be faster when the phrase exists, on average. If this timing is observable, this could be exploited to guess the content of a web page.

Someone did this already with selector-based anchors: https://blog.sheddow.xyz/css-timing-attack/


Feature author here.

The CSS timing attack actually influenced the design process heavily. The original design was to use a stripped down CSS selector but we found this too large of an attack surface.

There are definitely still concerns around making sure a third party can't exfiltrate information from the page using this, but we think we've found a set of restrictions that should prevent it: https://github.com/bokand/ScrollToTextFragment#security


Using the hash for navigation is still quite common, because usage of history pushState requires server support to correctly work on reload.

I’d also argue that SPAs using it for navigation are not abusing it. The purpose is to allow linking to different sections in a document. Navigation is effectively an advanced form of the same concept. It’s just that as you navigate, the DOM happens to mutate to bring the content you linked to into view, instead of just scrolling down to it.

Hash fragments are for page authors to define, not the browser. While this sounds like a useful feature, its implementation is simply wrong.

What they really want to do is extend the RFC for URLs to add an additional component. There are already unsafe characters that could be used for this purpose that would be nonbreaking.

For example, add a | at the end of the hash fragment, and anything after that can be for the browser to communicate page state independent of the page author’s intent.
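For illustration (the | delimiter and the parsing here are just a sketch of that idea, nothing specified anywhere):

    // https://example.com/app#/inbox/42|targetText=hello%20world
    const [authorFragment, browserFragment = ''] = location.hash.slice(1).split('|', 2);
    // authorFragment ("/inbox/42") stays under the page's control;
    // browserFragment ("targetText=hello%20world") would be reserved for the browser.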

The current proposal, however, should be firmly rejected.


So Chrome users will unknowingly be offered to create links which won’t work in other browsers?

How come I’m not surprised?

These links should, if nothing else, be prefixed chrome:// to be explicit about them not being real web links.


Does that help? Surely that just means other browsers would have to support chrome:// if they decide to adopt that feature.


Track what you and your friends like, build a graph, mine that data, and earn more ad revenue.


I'd like this for PDFs. Often I want to share a PDF opened at a specific piece of text. You can create a PDF URL which jumps to a certain page. You can even link to a search for a certain word (a single word).

But I can't link to a search phrase in a PDF. For some reason, they implemented searching for single words, but not phrases.
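For reference, the existing style looks like this (hypothetical URLs; the parameters come from Adobe's PDF Open Parameters, and viewer support varies):

    https://example.com/paper.pdf#page=12
    https://example.com/paper.pdf#search=word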


Searching for text in PDFs is not easy to do in general.


I actually wrote a really similar extension a year ago, please give me feedback!

https://chrome.google.com/webstore/detail/target-search/nohm...


Hey, let's break standards in a dumb way, what could possibly go wrong?


Interesting. I think this is useful if implemented correctly.

I had an idea similar to this a few years ago. Basically, you would be able to bookmark your place on the page and come back to it later or send the link to someone else. I didn't think that using a word or phrase was useful because it could show up multiple times on the page. Instead, I used the scroll distance. Here's a crude demo:

http://jboyer87.github.io/page-progress/

Of course, this isn't device-independent. I never went much further with it.
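The gist was something like this (a reconstructed sketch with an invented #progress= fragment, not the actual demo code):

    // Store the reading position as a percentage of the scrollable height...
    function saveProgress() {
      const max = document.documentElement.scrollHeight - window.innerHeight;
      const pct = max > 0 ? Math.round((window.scrollY / max) * 100) : 0;
      history.replaceState(null, '', '#progress=' + pct);
    }

    // ...and restore it when the shared link is opened.
    function restoreProgress() {
      const m = location.hash.match(/progress=(\d+)/);
      if (!m) return;
      const max = document.documentElement.scrollHeight - window.innerHeight;
      window.scrollTo(0, (Number(m[1]) / 100) * max);
    }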


I remember how many websites (15+ years ago) used to offer something similar to this. JS would pick out the query used on a search engine (and sent with the referrer) to scroll to and highlight matches to the query.


This is indeed a very useful feature for sharing specific content. I built larynx, 'a tool to share content along with voice', with a similar intention. In larynx a user can share a screenshot of a particular snippet along with voice.

I wanted to build a similar feature in the future where clicking the link would take the user directly to the snippet; of course it would work only inside larynx apps, but with Chromium's penetration Google can implement this feature seamlessly.

[1]: https://larynx.io


I'm very excited about this. I've always thought this should be a fundamental part of linking the web. Along with linking to times in videos, audio, etc. I think youtube allowing you to jump to a certain time via the url is a great example. I've wanted to share podcasts down to a specific point in a conversation many times, and if it wasn't a video, I didn't know how to do it. It just feels like a fundamental way to share information.


This breaks so much stuff. So many websites use #something for a myriad of purposes. And it's their discretion, their website. I've never even seen #arg=val used in URIs ever, so nothing would be expecting that. Or #arg1=val1&arg2=val2.

This feels just like when I discovered that one day Chrome had decided to use the canonical address of a page instead of the actual address when sharing it, telling no one. These people are a plague on this world.


Glad to see google wasting no time on their new browser monopoly to screw up everyone else’s experience and push their own vision for the web. /s


    #targetText=whatever
Ew, this is so hacky and bolted-on.

But I guess this is the price we pay for letting the WHATWG go on with their coup against W3C.


Great feature, and a pretty straightforward polyfill; it could even be implemented before Chrome releases it (the context menu to get the link is the hard part, but select text + a tooltip with "Copy link to this text" would work, or I'm pretty sure good designers will find great UIs for it; it might even become normalized as a new pattern)
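The link-building half of such a polyfill could be as small as this (a sketch assuming the proposed #targetText= syntax; the selection UI is the real work):

    // Build a "link to this text" URL for the current selection.
    function linkToSelection() {
      const text = window.getSelection().toString().trim();
      if (!text) return null;
      const url = new URL(location.href);
      url.hash = 'targetText=' + encodeURIComponent(text);
      return url.toString();
    }

    // e.g. behind a "Copy link to this text" tooltip:
    // navigator.clipboard.writeText(linkToSelection());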


This really reminds me of https://indieweb.org/fragmention

What I don't get is how it will be handled if you have more than one occurrence of the phrase on that page.


The usage isn't as _pretty_ as Chrome's, but it allows changes in the original text without breaking the link:

https://github.com/NYTimes/Emphasis


I love this, or at least the general idea that it represents, though the specific implementation might be a little odd. (I'd rather the link to be some sort of 'coordinate' -- like a CSS selector, but for text?)




In my mind's eye, I can see someone chaining multiple events to trigger malicious behavior with this.

-Would the scrolling occur AFTER page load?

-What if the text that is passed in is removed/never existed?

-What if the text exists in multiple places? I guess it immediately goes to the first instance, but if you are trying to link to the second, third, etc., this API wouldn't work...

-What if the text screws up a URL query parameter?

-How will this impact SEO?

-Would this still work if the parent element of a text node is set to "display:none"?

I just hope it doesn't become a thing with sites relying on this functionality.


During my, I dunno, 25 years on the internet, the number of times I've wanted to do this is 0. Not sure about the value proposition here.


I bet this has been possible to accomplish with a custom browser extension. And also, services like Evernote or Genius might help.


So... like xpath?


Yes, but worse. There is not a single day when I'm not annoyed that xhtml was abandoned.


It would be nice if google would link to the part of the webpage where your search text appears.

Maybe this is a step towards that goal.


Another feature inching closer to Xanadu. Excited for this!


Like Medium?


good news for UX


wow


Everyone in this thread needs to put their money where their mouth is and go back to Firefox if they're currently using Chrome.


Did exactly this, a few weeks ago. Was heavy Chrome user, now fully on Firefox (and Safari on mobile). I am so satisfied with this decision, and the transition was much easier than expected.


I would do it instantly if the devtools were half as convenient as Chrome. Unfortunately they are super clunky


What are you missing? I was afraid of this too, but they seem to be just fine.


Can't inspect websocket frames. It's ridiculous.


While I run into this too from time to time, it's not something everyone uses. Also it takes me 5 seconds to workaround by debug logging incoming ws messages. Most web devs I talk to can do everything with FF devtools more than fine.


That's outside my area, but maybe somebody else in here has an idea for how to fix your problem.


My solution is to use chrome when faced with that particular limitation. There's a bug open on the mozilla tracker and it's been there for a while but no one seems to care.


I use both, Chrome for client side dev, Firefox for everything else. Chrome is just the dev tool. Works like a charm because switching between tabs inside a browser is just as quick as alt-tabbing between browsers.


I use them all the time, don't seem clunky to me. What else is not working other than the websockets inspection you mentioned?


Did it on Monday evening. Great choice, no problems at all. Could import all data, also switched to Firefox on my Smartphone where I can now run uBlock, too.


It makes such a difference on mobile being able to run a good blocker.


Totally. I am also running my own Ad blocking VPN with Pihole/PiVPN and routing all my mobile traffic through that. It's just great and probably the best $4/month (for the Vultr instance) I have ever spent.


This. I would argue that Chrome seems faster (probably proprietary tech on most sites), but with µBlock, this is gone and there's no ads.


Can't, locked in by the password manager and android integration.


No hardware video decoding on Linux ...


Indeed - but it's hardly an issue in practice.


Well, watching videos is one of the things I do most with my browser. So this is a no-go for me, as the fans of my laptop will start spinning with Firefox (and battery life is worse).


The real use case for this is so Google can poke into the context of links shared via services they have no ownership of.

They got used to knowing exactly what people share and, much more importantly, what people click on in email, thanks to Gmail rewriting all links to a Google tracking URL (while misleading the user with a fake status text on mouse-over). When they combine 1. who sent it, 2. who clicked on it, and 3. what the email context was, they can further classify the page content (for ads, search ranking, whatever they do today).

But they can't do that on WhatsApp, Facebook, etc. With this "feature", they now have at least a modicum of context when the page impression shows up on their Google Analytics trackers. Did you visit that Wikipedia page because of recent news events, or to do homework?

Or, we could just go with Occam's razor and believe that a bunch of bored engineers at Google decided to simply add a feature completely disconnected from any of their business plans, and spent a few months tweaking a git repo with nothing but a README file... while the actual code review has follow-up questions that were left open and merged anyway: https://chromium-review.googlesource.com/c/chromium/src/+/14... ¯\_(ツ)_/¯


It's a shame because it is actually a good idea, but now Google has first-mover advantage, so any other implementations will just be a clone of what Google already did.


http://archive.is/ got there first, sort of. If you archive a page then highlight some text, the URL changes to a direct shareable link to that text.

Bonus: being its own site, and implemented in javascript, it's already cross-browser.


Isn't that...better than every browser having an incompatible implementation of it, so Firefox users could only share links with Firefox users, etc?

I get the argument that it's better for these things to be done by consensus among the browser makers. Is it in scope for WHATWG?


> Or, we could just go with occam's razor and believe that a bunch of bored engineers at google decided to simply add a feature completely disconnected from any of their business plans

I understand the skepticism about Google's motivations but "making Chrome better and more feature rich so people will want to use it" is completely in line with Google's business plans.


I believe you. That's what the fine folks at the IE team did with ajax too, for example. And I am not being ironic, but the line is so thin that I have to point it out either way.


> what people click on email, thanks to gmail rewriting all links to a google track url (while misleading the user by a fake status text on mouse over)

They use the same trick for Google Search in a way that is transparent to the user - as long as you have JS turned on. When you switch it off, you realize instead of clicking on HNs URL, you actually click https://www.google.com/url?q=https://news.ycombinator.com/&s...

Now, they want to move a step further. I wonder when they're going to stop.


Actually, it seems like the <a> you click on Google Search has the right href (which is what appears on the status text) but it also has a ping attribute to it for tracking.

https://www.w3schools.com/tags/att_a_ping.asp


I'm surprised there is not a chrome extension to remove these.
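It would be a very small content script, something like this sketch (not an existing extension; it only strips the ping attribute, not any href-rewriting handlers):

    // Remove the ping attribute from all links, including ones added later.
    function stripPings() {
      document.querySelectorAll('a[ping]').forEach(a => a.removeAttribute('ping'));
    }
    stripPings();
    new MutationObserver(stripPings).observe(document.documentElement, {
      childList: true,
      subtree: true,
    });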


I'm not surprised at all. Fortunately we still have Firefox: https://addons.mozilla.org/en-US/firefox/addon/google-no-tra...

Edit: it seems to remove ping, but the onmousedown event is still there.


Nobody cares, really, except for a fringe percent of a percent of Google users.

On a more general point, once again, Google Search is provided to you for free. If you don't want to use it by their rules, you are entirely free to use Bing or Yahoo. Actually using the service regularly, because it is tailored to your workflow and provides value to you, while complaining that the tailored/value-adding part is driven by an analysis of your behaviour on the service... that's very hypocritical.


I don't use Google, I use StartPage. I am plainly stating that I am surprised nobody made an extension to remove the ping= link attribute from all webpages, considering that the only reason it exists is to track user activity.


That is exactly what is hypocritical about your behaviour: Startpage exclusively uses results from Google, which the company pays Google for. So you are using Google regularly while complaining about it.


check out one called NoRedirect on both chrome and firefox.


This technique (redirecting through some dummy ?url=... page) is a common way of doing an external redirect on any service you're building. It's a simple way of making it impossible for the external website to see the actual Referer, which is more secure for your data and the service you're building (e.g. Gmail).


There's a standard way to do that, which is to add a rel="noreferrer" attribute to your outgoing links.


Code usually doesn't carry final HTML in its data structures, whereas converting a URL in the single place where it's extracted is easy. In other words, making sure you didn't forget to put rel=noreferrer everywhere is way harder.


And it still makes some sense not to rely on it entirely, given that people still use browsers that do not support it.


> It's a simple way of making it impossible to see the actual Referer to external website

It's about tracking outbound links, not about masking the referrer. As already mentioned, there are standard ways for that.


Tracking can be done in JS via onclick event also. What makes you so sure it's not about masking referrer?


Because I've done tracking like that for several years, lots of services are still doing it this way and because rel="noopener" or rel="noreferrer" is the way to go if you want to mask the referrer. Even Google recommends to do it that way[0].

If you are not using it for tracking, there is no point in doing it like this after all.

[0] https://developers.google.com/web/tools/lighthouse/audits/no...


But browsers seem to have stopped giving out referrer some years back.


I am having trouble deciding if Google has finally beaten Microsoft on the 'we can do what we want' front, especially given that Microsoft has been running backwards in that race of late.



