Yes, this has been a big issue for a very long time now. Google wants to push a release where the browser will display the hostname of the original AMP site even if the content is being served from google.com[1].
Mozilla (and Apple) are strictly against it, and thank god for Mozilla. If Google had a bigger market share, we would already be living with this. I'm sure there are better sources for this, but here is the first result:
Being completely against AMP for obvious reasons, I'm personally not against signed exchanges themselves. This feature could spawn a whole new class of decentralised, harder-to-censor web hosting, which sounds like a great addition.
It's also going to spawn a whole new class of semi-persistent malicious pages (say, created via XSS) that, once signed and captured, can be continuously replayed to clients until expiration.
What? The signing allows the content to be mirrored in other locations with guarantees about consistency. It doesn't imply anything more about the content than SSL does.
If Google AMP acts like a cache of content, then cache poisoning attacks are a concern. How those cached items expire will determine how long an attacker who poisons the cache can serve malicious content.
At that point, wouldn't the approach be to defend from the client side? Namely, we can instruct the client not to trust any content signed by such-and-such keys. This can be done by pushing out a certificate revocation, etc.
This would be pretty cool (remotely revoking signed exchanges); however, it's not part of Google's proposal. Unless every previous security consideration about caches is accounted for in signed exchanges, it's probably not safe to start faking the URL bar.
Why does Archive.org get a pass on this one? Signed responses mean that there's a very clear way to leverage the browser's domain blacklisting technology to stop the spread of malware, which isn't presently possible for any content mirrors on the web.
Archive.org makes it clear you are on archive.org. The URL shows archive.org. The page content shows archive.org at the top. [1]
Google AMP doesn't show Google on the page. Google is pushing for the URL to show the origin site's URL instead of Google[2].
If an attacker poisons a nytimes.com article served by Google AMP, how does a browser's domain blacklisting help? Block google? Block nytimes.com? Neither makes sense.
I believe you might be misunderstanding the idea behind signed exchanges. To be clear, Signed Exchanges are how AMP should have worked all along.
example.com generates a content bundle and signs it. Google.com downloads the bundle and decides to mirror it from their domain. Your browser downloads the bundle from google.com, and verifies that the signature comes from example.com. Your browser is now confident that the content did originate from example.com, and so can freely say that the "canonical URL" for the content is example.com.
Malicious.org does the same thing, and the browser spots that malicious.org is blocked. At this point it doesn't matter if the content came from google.com, because the browser knows that the content is signed by malicious.org and so it originated from there.
Hope this helps clarify. Obviously blacklisting isn't a great security mechanism; my point is just that signed exchanges don't really open any NEW vectors for attack.
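To make that trust model concrete, here is a minimal Python sketch (using the third-party `cryptography` package, with ECDSA purely as an example) of the underlying idea; it is just the concept, not the actual SXG wire format, which layers certificate requirements, encoding details, and expiry metadata on top:

    # Toy illustration of the signed-exchange trust model -- not the real SXG format.
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec
    from cryptography.exceptions import InvalidSignature

    # Publisher (example.com) side: sign the content bundle with a private key
    # that never leaves the publisher.
    private_key = ec.generate_private_key(ec.SECP256R1())
    bundle = b"<html>...content published by example.com...</html>"
    signature = private_key.sign(bundle, ec.ECDSA(hashes.SHA256()))

    # Any mirror (e.g. google.com) can now serve (bundle, signature) unchanged.

    # Browser side: verify against example.com's public key; any tampering fails.
    public_key = private_key.public_key()
    try:
        public_key.verify(signature, bundle, ec.ECDSA(hashes.SHA256()))
        print("content really originated from example.com")
    except InvalidSignature:
        print("bundle was modified in transit -- reject it")

Note the mirror can still withhold content or replay a stale but validly signed copy, which is exactly the concern raised below.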
I think the concern was more that if I can XSS example.com, Google is now serving that for some period of time after example.com's administrators notice + fix this. (In the absence of a mechanism to force AMP to immediately decache the affected page(s), that is.)
Imagine that example.com builds the bundle by pulling data from a database. If an attacker can find a way to store malicious content in that database (stored XSS) and that content ends up in a signed bundle that Google AMP serves (similar to cache poisoning), then users will see malicious content. When the stored XSS is removed from the database, Google AMP may continue to serve the malicious signed bundle. So an extra step may be needed to clear the malicious content from Google AMP.
How exactly the attacker influences the bundle is going to be implementation dependent, so some sites may be safe while others are exploitable.
I think most of the comments in this thread mean "malicious" in the sense of injecting malware (say, a BTC miner) or a phishing attack or something into the signed-exchange content. However, you also have to consider that the content (text, images) itself could be "malicious", in the sense of misinformation.
If, purely as a hypothetical, Russian operatives got a credible propaganda story posted on the NYT website 24 hours before the November elections, and an AMP-hosted version of it stayed live long after the actual post got removed from nyt.com, I'd certainly call that "malicious". Of course, just like archive.org, I suspect that in a case as high-profile as that, you'd see a human from the NYT on the phone with a human at Google to get the cached copy yanked ASAP, but maybe on a slightly smaller scale the delay could be hours-to-days, which is bad enough.
Along with signing, we need explicit content cache busting and an explicit allowed-mirrors list (which can be revoked instantly). Then it would be on par with TLS + current cache busting mechanisms on top of TLS.
As long as javascript on the page has some way to inspect the signatures and where it was delivered from, you can implement cache busting, allowed mirrors, and invalidation yourself however you please.
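For illustration, a sketch of the kind of page-level policy being described, assuming the page had some way to learn which host actually delivered it (that introspection hook is hypothetical; nothing standard exposes it today, and the mirror hostnames are placeholders):

    # Hypothetical policy check a page could run if it knew its delivering host.
    ALLOWED_MIRRORS = {"amp-cache.example", "mirror-a.example"}

    def accept_delivery(publisher_origin: str, delivering_host: str,
                        revoked_mirrors: set) -> bool:
        """Accept content only from the publisher itself or a still-trusted mirror."""
        if delivering_host == publisher_origin:
            return True
        return (delivering_host in ALLOWED_MIRRORS
                and delivering_host not in revoked_mirrors)

    print(accept_delivery("example.com", "amp-cache.example", set()))  # True
    print(accept_delivery("example.com", "evil.example", set()))       # False
    print(accept_delivery("example.com", "amp-cache.example",
                          {"amp-cache.example"}))                      # False: revoked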
I don't see how this tech would further help create malicious pages via XSS; any thoughts? It sounds like the same issue exists with or without signed exchanges.
The point wasn't that this technology would uniquely enable XSS attacks, but rather that it could allow malicious actors to persist particular attacks for the duration of the validity of the signed content. Any brief vulnerability in a website now becomes serializable. They considered this already. Look in the draft "6.3. Downgrades":
"Signing a bad response can affect more users than simply serving a bad response, since a served response will only affect users who make a request while the bad version is live, while an attacker can forward a signed response until its signature expires. Publishers should consider shorter signature expiration times than they use for cache expiration times."
I see. I don't think they are going with the right approach here; there should be an automatic way to upgrade signed content / check for updates. Short signature lifetimes just destroy the benefits of the feature.
It's the only way to do it. TLS has shown that OCSP and the like do not add significant security, and short certificate expiration is the only way to go.
The serving nodes are not necessarily under control of a well intended party that complies with upgrade requests.
And I don't see the issue with short expiry. The point of a cache is to reduce load, not to entirely eliminate it. Even with a 5-minute expiry, one origin fetch per window instead of 100+ QPS is a reduction of more than four orders of magnitude.
Parent considers that the feature could be used to turn a temporary problem into a long-term problem. Sort of like how certificate pinning could be twisted to ransom an entire domain.
What you describe sounds a lot like URL spoofing, which is already pretty much used entirely to trick and scam unsuspecting users into clicking malicious links. Signed exchanges would just be an even harder to detect version of this.
No... no, it would not. It would centralize web hosting and make it less censorship resistant.
Sure, you could move it somewhere else and have it show up in the address bar the same, but the actual URL has changed and you need to somehow get the new URL into people's hands. And ultimately you've centralized a lot of websites under a smaller number of service providers which, before, would have been on their own domains.
I'm not sure what you describe there but it sounds much more complicated than it is.
Aren't signed exchanges basically CDNs without having to set up DNS? In theory it's no different from using CloudFlare to serve your content, except any CDN can serve it without you giving them access to your domain.
Right, the point is that anyone can serve the content through any platform, which is why it allows decentralization. But I just don't understand why people are hanging so tightly to the idea that a URL is a direct path to a server, because that just isn't true.
I don’t really think Google’s plan is that weird. And it would be amazing for decentralized networks, archiving, and offline web apps. Google can’t just serve nyt.com — they can serve a specific bundle of resources published and signed by nyt.com verified by your browser to be authentic and unmodified.
The current implementation of the AMP cache servers obviously doesn't help the decentralization.
I think what Spivak is saying, though, is right. If we could move from location addressing (DNS+IP) to content addressing, but not via the AMP cache servers, then in general anyone could serve any content on the web. Add in signing of the content addressing, and now you can also verify that content is coming from NYTimes, for example.
Also, I'd say that the internet (transports, piping, glue) is decentralized. The web is not. Nothing seems to work with each other and most web properties are fighting against each other, not together. Not at all like the internet is built. The web is basically ~10 big silos right now, that would probably kill their API endpoints if they could.
I think this would require an entirely new user interface to make it abundantly clear that publisher and distributor are separate roles and can be separate entities.
I don't think this should be shoehorned into the URL bar or into some meta info that no one ever reads hidden behind some obscure icon.
Isn't it already the case though with CloudFlare and other CDNs serving most of the content? Very few people really get their content from the actual source server anymore.
That's a good point. I just feel that there is an important distinction to be made between purely technical distribution infrastructure like Cloudflare's and the sort of recontextualisation that happens when you publish a video on Youtube. I'm not quite sure where in between these two extremes AMP is positioned.
Thank you for this explanation. AMP has put a really bad taste in my mouth but what you describe here does have some interesting implications. Something to consider for sure.
Please fact check me on this, but the ostensible initial justification for AMP wasn't decentralization, but speed. Businesses had started bloating up their websites with garbage trackers and other pointless marketing code that slowed down performance to unbrowsable levels. Some websites would cause your browser to come close to freezing because of bloat.
So Google tried to formalize a small subset of technologies for publishers to use to allow for lightning fast reading, in other words, saving them from themselves. AMP might be best viewed as a technical attempt to solve a cultural problem: you could already achieve fast websites by being disciplined in the site you build, Google was just able to use its clout to force publishers to do it.
As for what it’s morphed into, I’m not really a fan because google is trying to capitalize on it and publishers are trying various tricks to introduce bloat back into AMP anyway. The right answer might be just for Google to drop it and rank page speed for normal websites far higher than it already does.
They’re suggesting a web technology which would allow any website to host content for any other website, under the original site’s URL, as long as the bundle is signed by the original site. That could be quite interesting for a site like archive.org, as the URL bar could show the original URL.
But AMP is a much narrower technology, I’d imagine only Google would be able to impersonate other websites, essentially centralised as you say. The generic idea would just be a distraction to push AMP.
Everything would be so much better if the original websites were not so overloaded with trackers, ads and banners, then there would be no need for these “accelerated” versions.
I see where you are going, but what if my website is updated? Is the archive at address _myurl_ invalidated, or is there a new address where it can be found? I am thinking of reproducible URLs for academic references or qualified procedures, for example, which might or might not matter in the intended use case.
Could there be net-neutrality-like questions in all this as well?
+1. The way I think about it is that signed exchanges are basically a way of getting the benefits of a CDN without turning over the keys to your entire kingdom to a third party. Instead you just allow distribution of a single resource (perhaps a bundle), in a cryptographically verifiable way.
Stated another way, with a typical CDN setup the user has to trust their browser, the CDN, and the source. With signed exchanges we're back to the minimal requirement of trusting the browser and the source; the distributor isn't able to make modifications.
It seems like there is a risk that an old version of a bundle will get served instead of a new one by an arbitrary host? Maybe the bundle should have a list of trusted mirrors?
There is a publisher-selected expiration date as part of the signed exchange which the client inspects. The expiration also cannot be set to more than 7 days in the future on creation. This minimizes, but of course does not eliminate, this risk.
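A minimal sketch of that client-side freshness check, with illustrative field names rather than the actual signature parameters:

    from datetime import datetime, timedelta, timezone

    MAX_LIFETIME = timedelta(days=7)  # cap mentioned above

    def exchange_is_fresh(signed_at: datetime, expires_at: datetime) -> bool:
        """Reject exchanges that are expired or that claim a lifetime over 7 days."""
        now = datetime.now(timezone.utc)
        if expires_at - signed_at > MAX_LIFETIME:
            return False  # over-long lifetime: treat the signature as invalid
        return signed_at <= now < expires_at

    signed = datetime.now(timezone.utc) - timedelta(days=6)
    print(exchange_is_fresh(signed, signed + MAX_LIFETIME))       # True: one day left
    print(exchange_is_fresh(signed, signed + timedelta(days=2)))  # False: already expired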
Browsers could have a setting to optionally display the content anyway, along with a warning to the effect of "site X is trying to show an archive of site Y", similar to how we currently handle expired or self-signed SSL certificates.
Alternatively super short expiry times. It doesn't seem like it would be that concerning to have another site serving a bundle that was 5 minutes out of date. It doesn't seem like it should be too much load to be caching content every 5 minutes.
The New York Times surely already serves their pages through a CDN, silently, and with the CDN having the full technical capability to modify the pages arbitrarily. Signed exchange allows anyone to serve pages, without the ability to modify them in any way.
(Disclosure: I work for Google, speaking only for myself)
My objection is that it's no longer clear if you're dealing with content addressing or server addressing. If I see example.com in the URL bar, is it a server pointed from the DNS record example.com (a CDN that server tells me to visit), or am I seeing content from example.com? If I click a link and it doesn't load, is it because example.com is suddenly down, or has it been down this whole time? Is the example.com server slow, or is the cache slow? Am I seeing the most recent version of this content from example.com, or did the cache miss an update?
What if there was a `publisher://...` or `content-from://...` or `content://...` protocol, somehow? (Visible in the address bar, maybe with a different icon too, so one would know it wasn't normal https:.)
And by hovering, or one-clicking, a popup could show both the distributor's address (say, CloudFlare) and the content's/publisher's address (say, NYT)?
The session key, which is given carte blanche by the TLS cert to sign whatever it wants under the domain, is still controlled by Cloudflare.
To put it simply, Cloudflare still controls the content. The proposal here would avoid that, by allowing Cloudflare to transmit only pre-signed content.
Your browser would have a secure tunnel to CloudFlare which is encrypted with their key. But then that tunnel would deliver a bundle of resources that your browser verifies separately, signed with a key that CF doesn't have.
The plan is bad because Google currently tracks all of your activity inside AMP-hosted pages, as stated in their support article.
Google controls the AMP project and the AMP library. They can start rewriting all links in AMP containers to Google’s AMP cache and track you across the entire internet, even when you are 50 clicks away from google.com.
Technically yes, but not very practically. The domain is cookieless, so it would be difficult to even identify a specific user, other than by IP. Also, the JavaScript resource is delivered from the cache with a 1 year expiry, which means most times it's loaded it will be served from browser cache rather than the web.
Really? Could you publish how you are inspecting an unknown program to determine if it exhibits a specific behavior? There are a lot of computer scientists interested in your solution to the halting problem.
Joking aside, we already know from the halting problem[1] that you cannot determine if a program will execute the simplest behavior: halting. Inspecting a program for more complex behaviors is almost always undecidable[2].
In this particular situation where Google is serving an unknown Javascript program, a look at the company's history and business model suggests that the probability they are using that Javascript to track user behavior is very high.
    # Searches for an odd perfect number; whether one exists is an open problem.
    def divisors(n):
        for d in range(1, n):
            if n % d == 0:
                yield d

    n = 1
    while True:
        if n == sum(divisors(n)):
            break
        n += 2
    print(n)
I don’t know if this program halts. But I’m pretty sure it won’t steal my data and send it to third parties. Why? Because at no point does it read my data or communicate with third parties in any way: it would have to have those things programmed into it for that to be a possibility. At no point did I have to solve the halting problem to know this.
Also, if I execute a program and it does exhibit that behaviour, that’s a proof right there.
The same kind of analysis can be applied to Google’s scripts: look what data it collects and where it pushes data to the outside world. If there are any undecidable problems along the way, then Google has no plausible deniability that some nefarious behaviour is possible. Now, whether that is a practical thing to do is another matter; but the halting problem is just a distraction.
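As a very crude illustration of that kind of review, the sketch below just greps a script for the usual browser network sinks; it's a heuristic that minification or obfuscation defeats easily, and the file path is whatever script you downloaded:

    import re
    import sys

    # Common ways a browser script can push data to the outside world.
    NETWORK_SINKS = [
        r"\bfetch\s*\(",
        r"\bXMLHttpRequest\b",
        r"navigator\.sendBeacon",
        r"\bWebSocket\s*\(",
        r"new\s+Image\s*\(",   # classic tracking-pixel trick
    ]

    def flag_network_sinks(source: str) -> list:
        """Return the sink patterns that appear anywhere in the script text."""
        return [p for p in NETWORK_SINKS if re.search(p, source)]

    if __name__ == "__main__":
        # Usage: python scan.py some-downloaded-script.js
        with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
            hits = flag_network_sinks(f.read())
        print("possible exfiltration points:", hits or "none")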
Tracking doesn't require reading any of your data. All that is necessary is to trigger some kind of signal back to Google's servers on whatever user behavior they are interested in tracking.
> or communicate with third parties
Third parties like Google? Which is kind of the point?
> [example source code]
Of course you can generate examples that are trivial to inspect. Real world problems are far harder to understand. Source is minified/uglified/obfuscated, and "bad" behaviors might intermingle with legitimate actions.
Instead of speculating, here is Google's JS for AMP pages:
How much tracking does that library implement? What data does it exfiltrate from the user's browser back to Google? It obviously communicates with Google's servers; can you characterize if these communications are "good" or "bad"?
Even if you spent the time and effort to manually answer these questions, the javascript might change at any time. Unless you're willing to stop using all AMP pages every time Google changes their JS and you perform another manual inspection, you are going to need some sort of automated process that can inspect and characterize unknown programs. Which is where you will run into the halting problem.
Funny how people can literally "forget" that Google is a third party. Probably people at Google believe they are not third parties. Not even asking for trust, just assuming it. No other alternatives. Trust relationship by default.
> Could you publish how you are inspecting an unknown program to determine if it exhibits a specific behavior? There are a lot of computer scientists interested in your solution to the halting problem.
This has nothing to do with the halting problem, because that is concerned with all possible programs, not some programs.
We obviously know if some programs halt.
    while True: pass

Is an infinite loop.

    x = 1
    y = x + 2

Halts.
More complex behaviours can be even easier to rule out. Neither of my programs there makes network calls.
As a user I can choose to block GA, either through URL blocking or through legally mandated cookie choices in some regions (e.g. France). When served from Google I have no choice in the matter.
The AMP spec REQUIRES you include a Google controlled JavaScript URL with the AMP runtime. So technically the whole signing bit is a little moot, given that the JS could do whatever it wanted.
The same could be said of any CDN hosted javascript library. For example: jquery. There is an open intent to implement support for publishers self-hosting the AMP library as well.
For most JS served by CDN, you can (and should) use Subresource Integrity to verify the content. At least the last time I was involved in an AMP project, Google considered AMP to be an "evergreen" project and did not allow publishers to lock in to a specific version.
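For what it's worth, generating a Subresource Integrity value for a pinned copy is simple; a sketch (the file name is just an example):

    import base64
    import hashlib

    def sri_hash(path: str) -> str:
        """Compute a Subresource Integrity value for a local copy of a script."""
        with open(path, "rb") as f:
            digest = hashlib.sha384(f.read()).digest()
        return "sha384-" + base64.b64encode(digest).decode("ascii")

    # The value goes into the script tag, e.g.
    #   <script src="https://cdn.example/lib.js"
    #           integrity="sha384-..." crossorigin="anonymous"></script>
    # and the browser refuses to run the file if the CDN serves different bytes.
    print(sri_hash("lib.js"))

Of course, as noted above, that only works if the host lets you reference a fixed version of the file.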
"Registrant Organization: Google LLC
Registrant State/Province: CA
Registrant Country: US
Admin Organization: Google LLC"
Note that jQuery, as mentioned in some GP comment, has no such requirement. Google AMP is quite unique in this regard. This is NOT some general CDN type issue. Also... agreed, WTF is "open intent"?
I agree, if we finally got a way to have working bundles on the web, that would be extremely useful. (And it would also restore some of the ability of browsers to work without an internet connection.)
It seems to me a lot of the security concerns come from the requirement to make pages served live and pages served from bundles indistinguishable to a user - a requirement that really only makes sense if you're Google and want to make people trust your AMP cache more.
I'd be excited about an alternative proposal for bundles that explicitly distinguishes bundle use in the URL (and also uses a unique origin for all files of the bundle).
I believe the issue with this is that users already largely don't understand decorations in the URL. For example, the difference between a lock and an Extended Validation certificate bubble. Educating a user on what a bundle URL means technically may be exceedingly challenging.
The problem is ownership. Google is “stealing” or caching content for what they consider a better web.
I don’t support ads but I also don’t support Google serving a version of the page that steals money from content creators. So, therein lies the problem: choice.
I can imagine a future where amp is ubiquitous and Google begins serving ads on amp content. Luckily, companies have to make money and amp is not in most people's or companies' best interests.
If amp was opt-in only, this would be much more ethically sound.
Signed exchanges guarantee that the content cannot be modified by the cache, such as ad injection.
Google has never injected ads into any cache served AMP document (technically if the publisher uses AdSense, this is false, but that's not the point you are making).
It's difficult to follow what definition of theft is being suggested. The cache does not modify the document rendering, it's essentially a proxy. In a semantic sense, this is no different than your ISP delivering the page or your WiFi router.
Just hearing about this from the thread, I'm getting an IPFS vibe from this. It would be interesting to see that tech get more native integration with the browser from this idea.
If I publish mycoolthing.com/thing, it could be mirrored over a P2P network as peer1.com/rehosted/mycoolthing.com/thing, peer2.com/rehosted/mycoolthing.com/thing, etc., in a way that would make it evident to end-users not familiar with the protocol that the content is from mycoolthing.com.
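A toy sketch of that rewriting scheme, with the peer hostname and the /rehosted/ prefix taken as the hypothetical convention from above:

    from urllib.parse import urlparse

    def rehosted_url(original_url: str, peer_host: str) -> str:
        """Map a canonical URL onto a mirror that keeps the origin visible in the path."""
        parts = urlparse(original_url)
        return f"https://{peer_host}/rehosted/{parts.netloc}{parts.path}"

    print(rehosted_url("https://mycoolthing.com/thing", "peer1.com"))
    # -> https://peer1.com/rehosted/mycoolthing.com/thing

With signed exchanges on top, the browser could additionally verify that the mirrored bytes really came from mycoolthing.com.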
I think the point is that signed exchanges ( https://developers.google.com/web/updates/2018/11/signed-exc...) could potentially be useful, if separated from AMP, and made an actually secure thing. Like, for example, the spec doesn't require specific Google-controlled JS URLs to be in the content.
Signed exchanges is actually a separate spec from AMP. The browser implements it independently. There is no requirement for AMP pages to use signed exchanges, nor for signed exchanges to be AMP.
Remember when Google was telling us that third-party cookies are there to protect us, and Safari/Firefox/Edge are just reckless and pose a risk to users by blocking them?
> By undermining the business model of many ad-supported websites, blunt approaches to cookies encourage the use of opaque techniques such as fingerprinting (an invasive workaround to replace cookies), which can actually reduce user privacy and control.
I find their "removing 3rd party cookies will incentivise businesses to rely on fingerprinting" discourse dangerous.
It implies that other browser vendors (Mozilla, Safari/WebKit, new Edge) are in fact making the Web a more dangerous place.
I believe it's dangerous because it creates a harmful, unproductive PR narrative—people might just assume this is a true statement, without learning about both sides of the problem. I'm not trying to strip anyone of agency, I just don't think most of my friends would have time to research this topic and might decide to follow the main opinion instead.
The answer I'd like to hear: Yes, it does push some actors towards fingerprinting, but preventing fingerprinting should be dealt with regardless. Changes should happen both on legislative and browser-vendor level.
Note that Cloudflare-hosted AMP pages still mandate AMP requirements, like including a Google-controlled JS URI in your content. Signing is moot if you allow Google to run arbitrary JS on your content. They haven't abused it yet, but it's allowed by spec. Subresource integrity isn't mandated, explained, or recommended.
This has been discussed over & over again and there has been no response from the AMP team to make it any better. I was surprised to realise how much my life changed when I started using Firefox + DuckDuckGo. Full time, at work & home, on macOS & Android.
Just having this idea already tells how important it is to actively resist Google.
One should never forget that at a certain point, Google will likely invoke the loser's argument ("protect you from terrorists and pedophiles") to require proof of identity prior to granting access to any resource or service it controls.
Anything that helps them advance in that direction must be fought fiercely.
Not exactly. For a CDN to work, the DNS is repointed towards the CDN's servers. In this case, Google is trying to cover up that Google, and not NYTimes, is serving the page.
Does NYTimes' use of Fastly subvert the meaning of the URL by literally covering it up in the address bar? Nope? Not the same thing, then.
Personally I don't think there's anything wrong with the fundamental concept of signed exchanges. The only problem is that it's just that: a signed exchange of content, which should have nothing to do with the domain name authority in the URL. By all means, display "Content from: a.com" in a box next to the URL, but don't change b.com to a.com in the URL as though it doesn't already have a well defined meaning.
The issue is that the technical meaning of the URL is very far from what most users think of.
Is the URL an address for NYT's server? Not really, because you are actually hitting Fastly's server. So when NYT sets up a magical DNS config, it's suddenly fine, but when they use crypto to sign the package and serve it from a CDN that way, it's suddenly "subverting the meaning of the URL"?
We can have a real discussion of what the meaning of a URL is, but I think your interpretation is unfair. I think it's entirely fair to argue that it makes sense for URLs to be an address to a specific content.
> The issue is that the technical meaning of the URL is very far from what most user think of.
My argument is not really concerned with what most users think of, but humor me, what do they think of?
> So when NYT sets up a magical DNS config, it suddenly is fine, but using crypto to sign the package and serve it on a CDN that way, then it's suddenly "subverting the meaning of the URL"?
Yes, because HTTP/S scheme URLs have a definition that implies a meaning, which is subverted when you create exceptions to that meaning. NYT setting up a "magical" DNS config that resolves to some third party server is perfectly fine by that definition, and resolving one FQDN while displaying another is not. It's not sudden, this standard has existed in one form or another since 1994.
> We can have a real discussion of what the meaning of a URL is
Yeah, let's do that instead of harping on about what's fair and unfair. It's not a matter of fairness, it's a matter of standardized definitions. By all means, create a new "amp:" URI scheme where the naming authority refers to whoever signed the data and resolves to your favorite AMP cache, but don't call it http or https.
I think the subtle shift of view here is that the URL shows the address where the content is located, more so than where the content was actually fetched from.
An example of where this occurs today is caching. You could be hitting a cache anywhere along the way. Hell you could be seeing an "offline" version, but the website would still show you the "address" of the content.
This is no different, you're hitting a different cache, but the "URL" you see is the canonical address of the content you are looking at, not where it was actually fetched from.
> I think the subtle shift of view here is that the URL shows the address where the content is located, more so than where the content was actually fetched from.
The only sense in which content is located anywhere is as data on a memory device somewhere. With the traditional URI in which the host part of the authority is an address of or a domain name pointing towards an actual host, you have a better indication of where the content is located than you do if this is misrepresented as being some other domain name which in fact does not at all refer to the location of the content.
The shift, if any, is that people may be less interested in where the content is located and more interested in its publishing origin.
> An example of where this occurs today is caching. You could be hitting a cache anywhere along the way. Hell you could be seeing an "offline" version, but the website would still show you the "address" of the content.
Yes, because that's how domain names work.
> This is no different, you're hitting a different cache, but the "URL" you see is the canonical address of the content you are looking at, not where it was actually fetched from.
It's different in the sense that a host name as displayed by the browser then has multiple, conflicting meanings that have no standardized precedent.
Google by definition wants to cover up the domain name that serves AMP web pages. Over half of the article discusses that.
For your question about Fastly, I already answered that in the comment you replied to. The Fastly CDN requires that the DNS is configured to point at Fastly servers. Take a look at https://docs.fastly.com/en/guides/sign-up-and-create-your-fi... under "Start serving traffic through Fastly":
"Once you’re ready, all you need to do to complete your service setup and start serving traffic through Fastly is set your domain's CNAME DNS record to point to Fastly. For more information, see the instructions in our Adding CNAME records guide."
A CNAME record is a DNS mechanism that aliases an alternate domain to a canonical domain.
Users don't see DNS records. In the old world they click on a nytimes.com link and get something served from a Fastly server, but in the future AMP world they click on nytimes.com and get it served from a Google server. It isn't different.
You bring up an interesting point - if AMP hosting is the same as a CDN, then why do companies use both solutions? "Because they appear the same" doesn't mean they are the same.
AMP requires that you consume other Google products, which requires that additional JS is loaded. When your mobile site doesn't use AMP, Google limits the SEO ranking your mobile site can have. Google AMP requires your pages to meet Google's Content Policies or they won't host them.
AMP and CDN delivered pages are architected differently and Google imposes restrictions and requirements that don't exist in a CDN.
I agree with you, if I understand the signed exchange proposal correctly the trust model is effectively similar (NYT explicitly opts to let Fastly pretend to be them through their DNS config in the same way that a signed exchange would let them explicitly opt into letting Google pretend to be them).
I'm still opposed to the change, I see this centralization of the web through CDNs as a bad thing, I don't want to make it easier.
The trust model is pretty different: in the traditional model NYT has to trust their CDN to serve the content unmodified. In the signed exchange model, any modification will cause the content not to validate, and the browser will reject it.
It's very different. And the difference lies in the URL bar. When you use a CDN, your visitors will still see your domain. With Amp, they see google.com.
Not if nyt hasn't authorized google to act on their behalf. "Yes we will serve your stuff on your behalf at your request, now that you have stuck your sign on our door via a DNS record" vs "We're putting your sign on our door because as the authors of a browser we can do that, whether you like it or not".
Well, it's the going opinion of HN for years that the main problem with AMP is it shows the actual origin instead of the proxied origin. Lying about the URL is something hundreds of HN comments have angrily demanded.
Why do you say that? I don’t think people want it to show the “proxied origin”, they want AMP to get out of the way and google to link to the real website.
This is not correct. Anyone can host AMP. See, for example, amp.cnn.com. Google hosts AMP content for its customers who elect to use that service. It’s not a nefarious plot.
People have been railing against Google's Amp on HN for years, and I think I finally figured out what it's for.
It's Google way of combatting phone apps.
If all of the world's information — especially current news and similar information — moves from the open web into apps, then Google can no longer crawl, index, or scrape that information for its own use. The rise of the mobile phone app is a threat to Google on so many levels from ad revenue to data for training its AIs.
So Google comes up with Amp to convince publishers to keep their content on the open web, where it can be collated, indexed, and otherwise used by Google for Google's services like search and those search result cards that keep people from visiting the content creators.
Google's explicit carrot in all this is the user benefit of page loading speed. Google's implicit carrot in all of this is page rank. But Google's real motivation is to have all of that information available to itself.
Can you imagine what would happen if content from even one of the big providers was no longer visible to Google? New York Times, WaPo, or even Medium? It would create a huge hole in a number of Google products and services, make its search results look even weaker than they already are, and cause people to look for search alternatives.
Amp was a reaction to Apple News and Facebook Instant Articles: using those applications to read the news was a much better experience than using the web. Why? Mainly for two reasons:
1/ Apple and Facebook were hosting all the content.
2/ The content did not come with megabytes of JS and other unnecessary crap.
Amp is an attempt at saving the web, and Google is interested in that for the reason that you gave: they make their money from the web.
> Amp is an attempt at saving the web, and Google is interested in that for the reason that you gave: they make their money from the web.
Yes; attempting to save the web in much the same way that the parasitic wasp is trying to oviposit in your thorax and take over your behavior, in order to save you from being eaten by the spider.
This has already happened in China, where Baidu (The Chinese equivalent of Google) can’t crawl any articles from WeChat (The Chinese equivalent of Medium), as a result, the usefulness of its search result has deteriorated significantly. Recently, Baidu has been trying to start its own publishing platform with little success.
Also food ordering, travel reservations, health care appointments, banking, government services, and a whole lot of other things that would take too long to list. It's not an exaggeration to say the entire Chinese consumer experience runs through WeChat.
I think this is a fairly cynical take, as having news on the web is also pretty great for users.
Imagine if instead of having all news stories a quick search away you instead had to install apps from X different news sources (and inevitably grant them permission to access your location, contacts list, name of first born child etc.). It'd create lots of little silos of news with very little ability to go outside those silos.
Put another way, the web is a great platform for news. It does benefit Google, but it also benefits the billions of people who can freely access a huge range of sources.
Interesting theory. One hole is that companies want to be on Google's results. It hurts WaPo not to be in the top N results, so they have an incentive to make it at least possible.
I bet non-techie people _already_ read their daily dose of news from 1 to 2 news websites at most. Installing a dedicated app is not much different than surfing the same 2 websites everyday.
Also, for techie people, do you consider RSS as part of the "web"? To me, an RSS aggregator app is superior to browsing 20 different news websites, all with different formats.
"web is just way more practical" isn't obvious. It depends about what you put in the "web" bag, and the use cases. Most apps use "web" protocols, so they are technically part of the web.
Apple and Facebook really don't care if the web dies as long as their platforms take the lion's share. But for Google, search as a product can exist only if the web itself remains relevant, and this is why it's trying to keep display ads alive, even though they don't really bring in much money compared to search ads and come with all the privacy complications of third-party trackers.
Interesting, though the barrier for users to install a new app seems to be very high these days. Most people only install a few necessary apps and that's it.
In addition, we are talking about publishers here. There are thousands of news sites, no user has more than a couple of news apps. That’s why they have to keep up their website anyway, with or without amp.
I think the main issue is the limited number of AMP Cache providers and the inability for the publisher to choose their own AMP Cache provider, which is being exploited by the two search engines.
AMP project by itself is open-source and it explicitly states 'Other companies may build their own AMP cache as well'.[1] There are only 2 AMP Cache providers - Google, Bing. Further, 'As a publisher, you don't choose an AMP Cache, it's actually the platform that links to your content that chooses the AMP Cache (if any) to use.'[2]
Say, if Cloudflare provided an AMP Cache and the site publisher could choose their own cache provider, this could be resolved effectively, since AMP by design makes it easy for a layman to create high-performance websites; of course there is no excuse for hiding URLs.
Can we please stop trying to pretend AMP is some sort of community-driven open source project? AMP was created by Google, for the benefit of Google. We are not obligated to play along every time a company says “open source.”
>We are not obligated to play along every time a company says “open source.”
I agree. IMO, Google has been using 'open-source' as weaponized marketing, the same way Apple has been using 'Privacy'. But either of them could be much worse without those.
> We are not obligated to play along every time a company says “open source.”
This is the point.
People easily confuse "open source" with "free software" and "community driven".
A lot of corporate-driven open source greenwashed the dark patterns of closed source: centralized development, user lock-in, walled gardens, poor backward compatibility, forced software and hardware upgrades.
This concern has been raised time and again with every major Google open-source project, e.g. Android, Chromium, Golang etc., and those concerns have helped improve certain aspects of the projects.
But I wonder whether a huge corporation like Google can build such large-scale projects without attracting such criticism: if the project is to be successful, they need to gain from it; after all, they are investing their employees and other resources in it. And them being invested in it is a major reason for adoption by other parties, resulting in a successful open-source project.
Moreover, such large projects have helped the overall SW ecosystem and even startups economically. I for one would say that without such large open-source projects I wouldn't have even been able to build products from a village in India and compete with products from the valley.
All I'm saying is, them being open-source at least helps us raise concerns and make them take actions; being a complete walled garden and just asking to 'trust us' is much worse.
>They hugely harmed competing projects and competing companies including Mozilla, many phone OSes, many grassroots programming languages.
IMO, we're the reason it failed. We as consumers didn't buy FirefoxOS phones over Android and iOS. We haven't adopted the Firefox browser enough for it to have a major market share. The same argument can be levelled against any proprietary product vs. open-source product.
That proves my point, being 'completely community driven' open-source project isn't the only criteria for the success of a project.
I didn't know about it, AMP site lists only Google, Bing. But I know for a fact that cloudflare has no issues caching AMP sites like any other sites though.
There has been no regulatory action since Microsoft (which happened as Google was being born), so the tech giants have forgotten fear and no longer self-regulate out of simple self-interest.
> Another way of looking at it is: they absolutely are self-regulating.
If they do, it's not really visible. I don't see any self-regulation in how Google is behaving regarding search and the web; if anything, it looks like anti-competitive monopoly behaviour.
AMP is faster? I’ve never been on an AMP page where I didn’t eventually need to go to the actual site to get the full content. So it’s really just an annoying step between me and the content I searched for.
It’s been one of the primary things that’s driven me away from google and into DDG. I don’t really care about privacy enough to leave google, but I end up leaving more and more of their services because their competition is just less annoying.
> At the same time AMP is "faster" because it gets rid of all the nagware and JS crap that the original page has
Google gives preference to AMP content whether the source page is lightning fast or not. I get the frustration with crappy web pages, but a big part of the reason web pages are getting increasingly crappy is because Google and Facebook (and to a much lesser extent Amazon, in a weird way) have a stranglehold on the web advertising market and publishers are getting smaller and smaller slices of advertising revenue. AMP increases Google's lock on the market. Since AMP pages can only really be monetized by the publisher, this puts even more power in Google's hands.
> AMP is "faster" because it gets rid of all the nagware and JS crap that the original page has.
AMP is faster only for poorly optimized, JS-heavy pages, but its design is fundamentally flawed in requiring all of its own large amount of JavaScript to run before anything displays, whereas most of the traditional bloat doesn't block rendering. That means any optimized page - Washington Post, NYT, etc. - loads noticeably faster even before you factor in how often you need to wait for AMP to load, realize that some part of the content is missing, and then wait for the real page to load anyway.
That design forces it to be less reliable, too: before I stopped using Google on mobile to avoid AMP, I would see on a near-daily basis failed page loads due to the AMP JS failing in some way and when it wasn’t failing it was still notably slow (5+ seconds or worse on LTE). Since all of that JavaScript is forced into the critical path, anything less than unrealistically high cache rates means the experience is worse than a normal web page.
If that were true, AMP would be consistently faster. Since anyone who’s used it knows that it’s not, you would find it educational to learn about the issues with detecting user intent, reliably prefetching dependencies, and the relatively small / frequently purged caches on mobile browsers.
AMP’s design is very fragile: if you are using Google search results, they correctly guess what you’re going to tap on before you do and your browser fully preloads it, it _might_ be faster to run all of that JavaScript before anything is allowed to load and render. If any part of that chain fails, it will almost certainly be slower or, because it disables standard browser behavior, prevent you from seeing content at all.
> If that were true, AMP would be consistently faster.
It is. AMP results load instantly for me.
> you would find it educational to learn about the issues with detecting user intent, reliably prefetching dependencies, and the relatively small / frequently purged caches on mobile browsers.
And you might find it educational to learn why AMP doesn't rely on these things. There are no dependencies that need to be fetched for the initial render.
This idea isn't surprising. Multiple other systems use the same ideas, including Apple News, many RSS readers, and Facebook Instant Articles. AMP just does it in a way that isn't anti-competitive (like the former) and allows for multiple monetization schemes and rich formatting (unlike RSS).
> if you are using Google search results, they correctly guess what you’re going to tap on before you do and your browser fully preloads it, it _might_ be faster to run all of that JavaScript before anything is allowed to load and render
AMP doesn't rely on fully prerendering the page, only the portion above the fold, which it can calculate because the link aggregator page knows the display size, and the elements allowed in AMP are required to report their dimensions. This allows multiple pages to be prerendered.
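A rough sketch of why declared dimensions make that possible: if every element states its height up front, the fold cut-off can be computed without fetching or executing anything (the layout numbers here are made up):

    def above_the_fold(elements, viewport_height):
        """elements: list of (name, declared_height_px) in document order."""
        visible, y = [], 0
        for name, height in elements:
            if y >= viewport_height:
                break
            visible.append(name)
            y += height
        return visible

    # Hypothetical AMP article layout on a 640px-tall viewport.
    layout = [("amp-img hero", 300), ("headline", 80), ("paragraph", 400), ("amp-ad", 250)]
    print(above_the_fold(layout, 640))   # ['amp-img hero', 'headline', 'paragraph']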
It is ironic considering that 7 out of the top 10 most used third-party connections on websites are owned by Google.
So you can see why there must be some kind of internal struggle at Google. They understand the value of a faster web but they also cannot go after the main cause of the slow web. And this is how technology such as AMP gets invented and makes things worse.
In what way is it anti-competitive? Google's competitors also consume AMP pages and prerender them using AMP caches. Anti-competitive would be requiring the publishers to integrate directly with Google like Apple News, not asking the publishers to publish pages that all link aggregators can consume.
I'm confused, you make it sound like a free CDN is somehow a bad thing. You do realize people actually pay money to have their content on a CDN. I don't think Bing makes money on their AMP cache, and doubt they would want or even allow Google to link to content on their AMP cache...
The point of AMP cache is for Google (and Bing) to waste money making content faster for their users, in the hope that the user will then spend more time on search so they see more ads. The cache itself has nothing to do with the monopoly, and the fact that Bing can use AMP at all (since its open source) to get the same benefits actually shows the exact opposite.
> Google Search uses its monopoly to push their own AMP cache. I can't search in Google and load the content through Bing's AMP cache.
That's nonsensical. That would reveal what the person searched for to a third party (Microsoft) even if they don't click on any results. The AMP Cache has to be controlled by the link aggregator in order to support safe prerendering, so Bing's AMP cache is used to prerender Bing results, and Google's AMP cache is used to prerender Google results. Compare to directly integrating with Google, in which case, Bing wouldn't get to take advantage of prerendering. The latter (the Apple News setup) is anti-competitive. AMP is not.
> To be blunt, this is a really dangerous pattern: Google serves NYTimes’ controlled content on a Google domain.
No, "Google serves NYTimes' controlled content" is an oxymoron. Google controls the content that is served, and that's all your browser is verifying. Google could very well make the NYTimes content on there display something else and your browser wouldn't show a warning. NYTimes could do nothing about that.
I disagree that this pattern is dangerous. While Google taking over serving the world's content is hardly a thing to celebrate, at least we're seeing that it's doing so here.
The pattern is dangerous because it trains the user to dissociate URL and legitimate content, and the best tool at our disposal against phishing is still the ability to use the URL to ascertain the legitimacy of a content.
URLs haven't been associated with legitimate content for a long time now, since most of the things come from giant CDN companies like CloudFlare anyway. What you're seeing in URL bar has very little to do with where the JS code executed on your computer is coming from.
With signed exchanges, the NYT is cryptographically opting into allowing Google (or other cache providers) to represent specific articles as being the NYT. It doesn't seem much different.
You might want to back that up with research: people don’t look at full URLs but that’s exactly why it’s so important that the highly-prominent domain name display is accurate.
With Signed HTTP Exchanges, for Google to modify the content that is served, Google would need access to a private key for a certificate for nytimes.com, no? Either nytimes.com has handed over that key or Google would have to create a key/certificate for nytimes.com. Believing Google would maliciously issue certificates seems a stretch to me.
I don't get this. Clearly the content is served by Google, and so they can do whatever they like with it. How is an end user going to know whether the message was signed before it was passed on or not?
> How is an end user going to know whether the message was signed before it was passed on or not?
Your web browser will show a scary warning and refuse to display the bundle if it's not correctly signed. Google is not going to fake signatures for other sites, as certificate mis-issuance would open up Google to legal consequences.
Please review and stick to the site guidelines when posting here (https://news.ycombinator.com/newsguidelines.html). They explicitly ask you not to lead with things like "That's nonsense. Take a moment to breathe." The rest of your first paragraph is just fine.
Re the second paragraph: people are wrong on the internet. If you want a site that doesn't have this problem, you're going to have to look for something considerably smaller, where countering entropy is an option. I think you might be running into the notice-dislike bias, though: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...
Edit: I'm dismayed to see that your last 7 comments have all broken the site guidelines. Trashing this place because you don't like some of what other people post, or because you feel superior to the community, is not ok. I'm sure you wouldn't litter in a city park, so please stop doing the equivalent here.
> Is there some site where people actually have a clue? On any topic that I have any passing knowledge about, HN is just completely spewing nonsense and the worst part about it is that they think they have a clue.
Given that HTTP Signed Exchanges are nowhere near a web standard at this point, I think you should tone down your vitriol considerably.
Currently, what the parent commenter is saying is completely valid and true; if you're serving things on your domain and have a cert for it, you can serve https://yourdomain.com/<anything>, where <anything> could be www.nytimes.com, www.google.com, or whatever. HTTP Signed Exchanges propose a breaking change to this, and are therefore non-intuitive for the vast majority of users.
> but let's stick to what's actually happening with SXG
No, Signed HTTP exchanges are something that Google dreamed up so people don't have to see their hegemony over the modern web (or as the article you linked calls it, a shakedown). It's not a browser standard so far, because of Apple and Mozilla's resistance.
There are legitimate ways for NYTimes to allow Google to serve content on behalf of them, like so many other CDNs around the world (it usually involves the CDN generating the certificate for your site as well). Why should people create new standards for HTTPS and URLs simply for Google's benefit?
I don't deny that there's a way to make "nytimes.com" work where everything is served by "google.com". What I'm questioning is why we need a completely new web standard for doing so that affects the URL, something that has been standard for decades.
> Why should people create new standards for HTTPS and URLs simply for Google's benefit?
Because of the exact reasons that people are complaining about in this very thread: they want NYT to control the content and display the domain name appropriately, but they want to serve it from Google servers and allow for eager prefetching without leaking private details.
Today it would be easily possible if NYT just gave Google the private key for their certificate, but then Google would be able to serve any content they want as NYT. With the proposed solution they can display the content NYT wants without being able to serve arbitrary other content.
That's correct. Only with the private key can one sign a Signed Exchange for the publisher. Like TLS, if you have the key you can already do quite a lot.
Disable this setting in Chrome by going to chrome://flags and switching #omnibox-context-menu-show-full-urls to Enabled. Then right-click the URL bar and select "Always show full URLs".
I think the flag (but not the checkbox) will be enabled by default at some point... this option is the best thing to happen to the URL bar since they broke it by removing http:// and WontFix'ing the resulting complaints a decade ago.
Really, I couldn't care less about stuff getting pruned from the URL bar, as long as there's an easy and permanent way to show everything.
Always. Main reason I cut something is I want to paste the hostname into a terminal so it can be an argument of whois or dig or traceroute or whatever, in no case have I ever been glad of the scheme prefix.
Isn't it more common to paste into a utility that uses the prefix, like curl or wget? Or pasting into a chat? Besides, all of those tools could just strip out the prefix, while there's no way to add the protocol back to a bare domain name.
More information is better, so the URL shown in the browser should include the protocol. Consistent behavior is better, so copy/paste should only include text that is actually highlighted.
It's wrong to trust the URL bar. For example, this search [1] has as top link an ad that boasts "google.com", and it really is! And if you click on it, you'll end up on a google.com site, which nominally helps with printers, but in reality it's a tech support scam.
So much of the distrust here is that google wants to be everything: to host their content and publisher content and user content; to broker ads and recommend links; to run their software on your computer and phone, to store your data on their servers. They serve too many masters.
"It's wrong to trust the URL bar" is true but only because the companies operating services like... Google... don't bother trying to protect their URLs. It's not hard to have a separate 'user content' domain for your user content, we've done it at places I used to work for. But for some reason people think it's enough to use a subdomain or get cute and use the same domain with a different TLD (looking at you, github.io)
So it is kind of frustrating to see someone offering to fix a problem they helped create in the first place through neglect or carelessness.
Agreed, that's problematic. But Google didn't even have to not host content, they would just have to use a different domain. They have such weird blind spots.
Google wants to be a content host and an ad broker and a search engine. Each of these is reasonable in isolation. Yet you can search on google, and Google will serve you an ad linking to a google.com site, and that site scams you out of money. This isn't theoretical, I know because my family was hit.
New York Times and all the other publishers don't have to participate in this crap. It's shameful that they cede authority over their content so easily in exchange for a vague promise of more visibility. There are so many better ways.
It’s not a vague promise, it’s an extremely explicit one. Search results for news contain a “top carousel”, a horizontally scrolling box that shows cards for different articles. On most phones it takes up most of the screen. If you want to be in the carousel (i.e. if you want your site to be visible near the top of search results) you must use AMP. No ifs and buts about it.
If NYTimes and every other news organisation refused to participate then yes, Google would be in trouble. But they can rely on good old divide and conquer: these news organisations all compete with each other. All it would take is for one to start producing AMP content again and they'd vacuum up all the search traffic, and all the other sites would follow them immediately.
In an ideal world where publications didn't rely on ad revenue and page views but were supported by their readers, that assumption would be correct.
But we are not living in that ideal world, and because all the other publications are doing it, they have to follow if they don't want to risk losing visibility to the competition.
So of course they don't "have to" but they also kinda do.
This is the question I always had and confused myself over.
In addition to this, I previously stumbled upon a few situations where I visited an AMP site to read an article and I noted down the site name in my mind.
A few days later I tried to visit that site and when I put the site name in the address bar in hopes of getting helped by autocomplete, guess what?! It was nowhere to be found.
I suppose 328 votes so far show the usefulness of repeat discussions. This article is a catalyst to keep this one going. The votes prove that it's an important enough issue to continue talking about.
AMP seems like a solution in search of a problem. Are people really having trouble with loading speed in 2020? I travel to remote areas in third world countries regularly for work and still don't really have problems loading pages with mobile data.
Even if it didn't have all of the problems associated with it, I just don't get the point. I don't need Google to repackage a website with less usability. It's frequently not even faster.
I lived in Africa and the only internet I had was cellular and paid by the GB. AMP is a massive improvement over the extremely large web pages we now have to endure.
It's also much faster to render, which makes a huge difference on the crappy Android phones that are everywhere. Hell, I'm using a $200 Android phone right now because my iPhone broke, and browsing the web is painful on it. And with the terrible $40 Huawei phones that have taken over Africa, most of the web is unusable.
I don't like Google's control of AMP, but it exists because of the original sin of HTML and JS. Everything about HTML is terrible: bloated, pointlessly verbose, etc.
I have a dream that we all just start using Gopher and dump the www, but it's never going to happen. Maybe the browser vendors could even get together and design a super lightweight markup based on S-exps or something, but that's probably not going to happen either. AMP is the best we've got and it solves a real problem. And it solves the problem well.
But does AMP make the internet usable on those $40 phones? I have a recent mid-range $200 phone and pretty much the only website that regularly hogs my browser is Google News, which coincidentally is also the only one I visit that uses AMP. It's anecdotal, but in my experience AMP (or whatever else Google News does) degrades performance to an amazing extent.
Google News is far from being the only one using AMP and there's a massive difference in loading times and rendering speed for most news sites between AMP and non-AMP versions even on my 1GBit internet connection.
FWIW my impression is not that Google News is bandwidth heavy, but that it is JavaScript heavy. It works fine on the computer but it's hard to use on the phone, even on the same connection.
Fair points, although to your last point, I wouldn't necessarily agree it solves the problem well. AMP makes some websites almost unusable (intentionally disables core functionality) and there's no way to disable AMP except manually re-typing the URL for every page. If its goal is just to serve a smaller page, it is a rough workaround with high costs IMO (often slower loading times, weird performance issues, disabled functionality, less open internet).
I appreciate that not everyone has fast data, but not having enough data speed to load a basic web page is really becoming the exception, not the norm. Data transmission is getting cheaper, faster, and available in more remote places every year.
I wouldn't have a huge problem with AMP if I could opt out. Unfortunately I can't. So despite my blazing fast unlimited plan on a flagship device, I'm getting served crippled pages with degraded performance. It's like I own a Ferrari kitted out with all the extras and Google is saying "here have you tried out this cool bicycle? It has special pedals so you can't go too fast and we reconfigured the handlebars so you don't accidentally do something like steering! It even has a bell. Ting-ting, ting-ting! How cool is that?"
In all seriousness, it is neat if it makes the web more usable for low-connectivity users, but maybe then limit AMP to those places (which are shrinking every year) and don't serve needlessly crippled pages when I'm standing in downtown Amsterdam or Hong Kong at the center of the internet, connected to blazing fast Wi-Fi.
As much as I smack my lips at a supported non-XML, S-exp language for markup, isn't this what Brotli's dictionary of all the bloated XML tags et al. sets out to solve with its compression?
Sure, but that's just another layer on a steaming pile of shit. The webpage is still super large, it still takes up ram, it still takes up cpu to decompress and compress, etc etc. It's the kind of solution one comes up with when they recognize that nothing can actually be done to solve the real problem.
> Are people really having trouble with loading speed in 2020?
Huh? Yes. Hugely. I'm on my fast home internet using a new iPhone I bought two months ago, and loading a NYTimes article just took 8 seconds. God only knows if it's bounded by network or CPU or both, if the problem is frameworks or ads or what. And it isn't even "stuck" on anything -- I watch the blue loading bar in Safari move pretty smoothly across the top.
I did a search for a NYT article on Google, clicked it, and it appeared instantaneously.
That's an insane difference. I know everyone hates AMP here, but when I've got my user hat on rather than my developer hat... it's unbelievably more performant.
I do, and I don't even live in a third-world country - I live in one of the largest cities in Germany.
But even if I can load both pages in roughly the same time, the AMP experience is just so much better: pages always load at least as fast as the original website, there's no weird custom scrolling, no annoying popups, etc.
I always choose AMP pages over the "native" ones when possible, because I know for a fact that I'll get fast loading and the other things mentioned above.
Google has already effectively become the address bar - people go to google.com to go to any other website. Now they are just solidifying it so that you don't even remember the url of a website after a while.
I despise AMP for the entirely selfish and pedestrian reason that it hijacks my phone's browser bar and won't let me access tab management until I scroll all the way back up to the top of the page.
Totally disagree with this drama, whoever is putting it on for the sake of being in the 'anti-Google so I look so smart and such a know-it-all, Google is trying to control everyone and nobody sees it except me, so now I'm writing a post to tell the world how different I am' crowd.
As a user, before I learned anything about computers, I was thankful for and amazed by those AMP pages, because they are really fast! And I barely looked at this URL thing to care about security, which is such a huge deal to the conspiracy queens, because as a non-tech user I don't know a heck about URLs; all I care about is how fast a page is presented to me.
So, no, the problem is only you. Yes, you can use a dramatic title just because you are so bored with your life that you need to cause a scene, but you are only embarrassing yourself and adding noise to this already chaotic world. Please go find yourself something to do instead of trying so hard to be internet famous. Thank you.
> Accelerated Mobile Pages (AMP) are lightweight pages designed to load quickly on mobile devices. AMP-compliant pages use a subset of HTML with a few extensions. Accelerated Mobile Pages (AMP), is a very accessible framework for creating fast-loading mobile web pages.
That itself sounds awesome and something we should promote. The other part of AMP is of course that it's served through Google's servers. While their global edge caches probably do speed things up, I think that's the less important part.
In other words: AMP as a framework that forces sites to be built as lightweight pages without bloat is a good thing. Google's control is a bad thing.
I think many of the comments here treat this as an all-or-nothing topic. I want to see a more nuanced discussion of the possible alternatives and solutions instead of just "Google bad, AMP bad".
Does it actually matter where you are, or is that just an implementation detail?
One interpretation is that Google is changing the URL bar from "where" to "who", which may be the more relevant information for most users. Signed exchanges are an interesting way to achieve that.
AMP is a lot like how I was browsing the web on my phone before the iPhone came out. Opera Mini's servers would proxy every single page I visited and fetch and pre-compile it before sending it compressed to my phone. It was way more performant than trying to render the page natively on my crappy phone. (That's why the iPhone was so unique: it was the first phone that could natively render websites really well.) Sure, there were a lot fewer security and privacy concerns back then, but I think the majority of users simply don't care as long as it works.
If The New York Times is unhappy with this use of their branding, it seems to me that they could easily claim trademark infringement.
They could argue that Google is using The New York Times's branding and domain name to make it look like this content is controlled and provided by The New York Times, when in fact it isn't, and that an average person (“idiot in a hurry”) could be deceived.
If The New York Times willingly gives Google permission (or The New York Times willingly abets Google's monopoly position), then I guess Google can do whatever they like.
"Don't be evil", using primary colors in the logo to bring that kindergarten familiarity, fun doodles, a dumb funny movie and quirky April Fools projects were a great marketing strategy to distract your average person in lowering their guard and feel safe to give all the data to an advertisement company. I wonder if they currently teach this case in marketing/PR classes.
I mean, I don't think that was the intention. I'm sure Page and Brin really wanted to be different when they started. But as a company grows, the vision of the founders is excised and replaced with the same shit found in all large corporations.
To be clear, I don't think Google or any other large company is evil. It's just the way things turn out, how the incentives are structured.
I think that’s a great way to put it. Each decision to grow the business is not necessarily bad or evil. As google has grown and acquired market share they leveraged that to spur more growth and market share. They act in their own best interest. It’s not necessarily evil, but selfish motives and evil sometimes look a lot alike.
I am sure tons of people here know Google history better than I. What was the year they fully turned on the advertising spigot?
I feel like in the first few years it was not yet an advertising giant. Wikipedia says they had small text ads in 2000, but seems to imply that the advertising didn't get really huge for them until after IPO. Correct me if I am wrong, I was not following super closely in those years. But that would provide a few years of "not being evil".
Ah yes. And I remember when doubleclick arrived on the scene, it was a popular opinion to say they were evil, or sneaky, big brother like. Google bought them and somehow the reputation lifted.
I have my own proxy filtering all my desktop and mobile traffic, and anything 'AMP' is filtered on the spot. Sometimes nothing shows up, sometimes the original server responds after a few seconds. I'd rather not see the page at all than play this game.
Try the Redirect AMP to HTML[0] browser add-on to get out of the game without getting blank pages. There's also Privacy Redirect[1] for getting out of the Youtube / Twitter / Instagram / Google Maps game.
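For anyone curious what such a rewrite boils down to, here is a minimal heuristic sketch in Python, assuming the common www.google.com/amp/... and *.cdn.ampproject.org/c/... URL shapes. This is not how any particular add-on or proxy is actually implemented, and it ignores query strings for brevity:

    from urllib.parse import urlsplit

    def deamp(url: str) -> str:
        # Best-effort: map a Google AMP viewer/cache URL back to the publisher URL.
        parts = urlsplit(url)
        rest = None
        if parts.netloc == "www.google.com" and parts.path.startswith("/amp/"):
            rest = parts.path[len("/amp/"):]
        elif parts.netloc.endswith(".cdn.ampproject.org") and parts.path.startswith("/c/"):
            rest = parts.path[len("/c/"):]
        if rest is None:
            return url                   # not an AMP URL we recognise; leave it alone
        if rest.startswith("s/"):        # 's/' marks an https origin in these URL shapes
            return "https://" + rest[len("s/"):]
        return "http://" + rest

    print(deamp("https://www.google.com/amp/s/www.example.com/story.amp.html"))
    # -> https://www.example.com/story.amp.html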
Regardless of your feelings on AMP, the premise of this article is wrong. Security standards and expectations are still exactly the same in this model. You see "google.com" in the address bar and trust that Google is serving you the right content.
AMP is the main reason I switched to DuckDuckGo from Google. Webpage rendering often used to break on iOS, in particular scrolling where the page would just go blank.
Yeah AMP is one of many reasons I switched to DuckDuckGo. “Why am I giving Google this much power? Why am I contributing to them being a monopoly?” were the general reasons.
Hearing people mention low quality search results was what kept me off, but I’ve actually only needed to do a google search about once a week, far less than I was expecting.
Google AMP links are so annoying too when you want to send a link to other people of something you've searched for. One of the main reasons I use DuckDuckGo.
AMP is the consequence of HTML and CSS being awful at performance. I'm not sure why the underlying problem hasn't been addressed. Rendering text and images on a page shouldn't require a secondary cache and an AMP rendering framework on top of a ton of CSS and layers of JavaScript. It's text and images.
I know Google wants browsers to lie to the user about the website they're visiting. But the article screenshot is a case where that's not happening, it's displaying the real URL.
With the utmost respect to you and the other commenters here, when I see positivity about the abstract, hypothetical technical merits of something with a long history of, in practice, being part of an extremely controversial power play it reminds me a lot of the comments I see promoting a widely installed piece of process management software — one which a lot of people don’t really want, whose subtle changes to layers of abstraction introduce new and unexpected bugs that can only be fixed by further coupling, and which can also be reasonably described as a single entity politically maneuvering itself to bring order to the chaos at the expense of living in, for want of a better term, a dictatorship.
Well at least under Google AMP, the pages loaded on time.
I don't like systemd and I think it's an overengineered mess of dubious value but comparing an open source linux init system to the long game Google seems to be playing to act as a proxy for the web is absurd and doesn't make any sense at any level.
Your comment is pure flamebait without any insight.
The long game which X seems to be playing to take control of Y
Oh, exactly that: Google/AMP with the web, systemd with Linux.
The point though isn't about the technology or the tactics. It's about the seemingly-benign apologia from third parties that bit-by-bit chips away at the objectors' arguments. It's part of how X wins their long game and takes control of Y.
I don't even think the original comment to which I replied does this, but it reminded me of a pattern. The way in which AMP/web and systemd/Linux are playing out are similar enough to be worth thinking about.
(I almost certainly lack any kind of meaningful insight and – according to quite a few others here – an ability to write. It's disappointing to be accused of pure flamebait though.)
I struggle to see how comparing people who like particular technical products to supporters of dictatorships shows any of the "utmost respect" to any commenters that you claim at the start of your comment.
To me all the difficulty is caused by the obscuration. Once you get to the point where he refuses to say which software he's talking about, your attention scatters to all the different things he might be referring to. If I talk about Oracle's subtle changes to layers of abstraction, you can read the sentence and decide whether that describes Oracle well or not. If I talk about "a widely installed piece of process management software" doing that, everything else I say is just a riddle trying to figure out which one.
I think that's part of the point. It makes his statement conversationally non-falsifiable, because if I say something about (taking a guess here) systemd, that will say more about my own biases than about what I am responding to.
My attention span is long enough to read and understand what was written here, but that's beside the point. Long, multipart sentences are hard to read in general. This isn't my own anecdotal data point: most newspaper or book editors would have similar guidelines. Whether someone decides to use such rules is up to them.
Clear and simple language wins. Stilted language alienates. FDR simplified speeches to make the language more accessible. For example, he rewrote “we want a more inclusive society,” to “we want a society in which nobody is left out.”
Why would providing an unambiguous pointer to the name be any more controversial than just saying it? The only extra purpose doing it this way serves is being annoying.
(It's systemd, for anyone who doesn't feel like googling it)
It's a reference to "Say what you like about Mussolini, he made the trains run on time."
This quip is generally used sarcastically or wryly. "Say what you like about [something that's seriously bad], at least [frivolous matter] has improved."
Yes well it's not really true of AMP either (using a strict content blocker beats AMP load times.) The meme of 'Mussolini making trains run on time' is often used in cases where the trivial upside is dubious at best.
Gorgoiler says "a common English aphorism about punctuality on 20th century Italian railways" and links to https://www.economist.com/science-and-technology/2018/11/03/... Both are clearly about Mussolini making trains run on time (or rather, not actually doing so.) Wikipedia describes the origins of the quip as such:
> Mussolini was keen to take the credit for major public works in Italy, particularly the railway system.[109] His reported overhauling of the railway network led to the popular saying, "Say what you like about Mussolini, he made the trains run on time."[109] Kenneth Roberts, journalist and novelist, wrote in 1924: "The difference between the Italian railway service in 1919, 1920 and 1921 and that which obtained during the first year of the Mussolini regime was almost beyond belief. The cars were clean, the employees were snappy and courteous, and trains arrived at and left the stations on time — not fifteen minutes late, and not five minutes late; but on the minute.[110]"
The dubious premise of Mussolini being responsible for reliable trains predates the Holocaust.
As for "it’s in bad taste and offensive", I agree that comparing fascism to systemd is in bad taste.
It can't do that on other sites, it's a technical limitation. Part of the reason why they push AMP and provide every website with free CDN hosting is exactly to enable caching. If the "purpose" of AMP counts as cheating, then sure?
Once signed exchanges become a thing, that may change, but as seen in this thread, there's a lot of push back for that.
An arbitrary technical limitation. You could preload any site you want. Google wants to push AMP, so they only preload those.
Well of course it's cheating if you compare load times between pre-loaded sites vs. not preloaded ones. And then argue that "AMP is faster", which is obviously wrong because the conditions are not the same.
The only one who wins when media outlets integrate AMP is Google. Stop the madness, for the love of an open internet. You gain nothing; you are just giving Google control over content as the new norm.
The beta of Firefox on Android still doesn't allow any add-ons to auto-redirect away from AMP-enabled sites. I was excited to see Privacy Badger enabled two releases ago, but even its tracking-token-stripping capabilities seem to be missing on mobile.
For real. For the most part I love all things Google, but AMP makes my blood boil. How does it save me any time when I have to tap that little ‘i’ to get to the original website through all that janky scrolling? Ugh, infuriating. I feel like a walrus on a tar floor maneuvering through AMP pages. What an awful piece of technology.
Can you elucidate what you'd consider a "bad" internet connection?
I live Out In The Styx, in a Shithole country, at the end of an allegedly 2MB/s piece of wet string masquerading as an internet connection that seldom lives up to its advertised performance. AMP has never once made any significant difference to my web experience.
Yes, but not acknowledging the problem and that AMP is a solution, even if it's the wrong solution, defeats any meaningful discussions. We need something like AMP but outside of Google's control.
My point is that it's so easy to see just the drawbacks and none of the benefits when you're sitting on a good connection. All threads on HN become completely one-sided, where everyone is just backslapping each other's complaints.
"Does google rig the system to squash its rivals and hurt us?"
Well, this is one kind of modern skepticism I particularly like: Does gravity kill if one jumps off a cliff? Is a sphere round? Is it really bad if we give up our freedom? Who are we to think for ourselves?
When questions like this are asked, the damage is already done. And it seems like it's already beyond repair.
Users executing their code are the product. Giving those people the independence of knowing who they are talking to is contrary to their business model.