Cloudflare Radar, which is presumably a much bigger and better sample, reports Bytespider as the #5 AI crawler behind FB, Amazon, GPTBot, and Google:
https://radar.cloudflare.com/explorer?dataSet=ai.bots
And that's not including most of the highest-volume spiders overall, like Googlebot, Bingbot, Yandex, Ahrefs, etc.
Not to say it isn't an issue, but that Fortune article they reference is pretty alarmist and thin on detail.
The difference is that, AFAIK, those bigger AI crawlers do respect robots.txt. Google even provides a way to opt-out of AI training without opting-out of search indexing.
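To make that opt-out concrete, here is a minimal sketch using Python's standard urllib.robotparser. Google-Extended is the user-agent token Google documents for this purpose. The robots.txt contents, the example path, and the allowed() helper are made up for illustration, and this only shows how a compliant parser reads the rules, not what any given crawler actually does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of Google's AI-training token,
# but leave ordinary crawling (including Googlebot) allowed.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed(agent: str, path: str = "/blog/post") -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, allowed(agent))
```

The same pattern would apply to any other token a site wants to block (Bytespider, for instance), but it only matters if the crawler chooses to honor the file.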
Possibly unpopular opinion: I trust the bigger companies more than small ones on stuff like this. It would be so much easier to not offer anything than to intentionally create a Potemkin setting and risk the blowback that would occur if discovered. Hopefully this comment does not age poorly.
full disclosure: worked there [edit: google] a while ago, not in search, not in AI.
I'm in a bit of a hurry and don't have time for a close reading. Does that article say some Google apps (notably Maps) store locations on your device even if you have configured them to not store it in your Google account? I may be missing something; I don't have time to read between the lines today.
It says, "A prominently located direct link or button which may be located within either a customer account or profile, or within either device or user settings."
I think the interpretation that one-click sub == one-click unsub comes from this passage:
"The ability to cancel or terminate an automatic renewal or continuous service pursuant to subdivision (c) or (d) shall be available to the consumer in the same medium that the consumer used in the transaction that resulted in the activation of the automatic renewal or continuous service, or the same medium in which the consumer is accustomed to interacting with the business, including, but not limited to, in person, by telephone, by mail, or by email."
The idea being that one-click is a medium, which doesn't seem to be the intent here.
With GA4, the tracker code is loaded from www.googletagmanager.com (even if the tag isn't loaded via a GTM container).
The measurement requests can be sent to (region1|www).google-analytics.com or analytics.google.com (to share cookies with Google login better).
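If you want to spot these requests in a proxy log or a blocklist test, the hostnames above can be turned into a small classifier. This is a rough sketch: the hostnames are only the ones named in this comment (Google may well use others), and classify_host is a hypothetical helper name.

```python
import re

# Hostnames taken from the comment above; the list is not claimed to be exhaustive.
LOADER_HOST = "www.googletagmanager.com"
MEASUREMENT_HOST = re.compile(
    r"(?:(?:region1|www)\.google-analytics\.com|analytics\.google\.com)"
)


def classify_host(host: str) -> str:
    """Label a hostname as the GA4 'loader', a 'measurement' endpoint, or 'other'."""
    host = host.lower().rstrip(".")
    if host == LOADER_HOST:
        return "loader"
    # fullmatch avoids false positives such as analytics.google.com.example.com
    if MEASUREMENT_HOST.fullmatch(host):
        return "measurement"
    return "other"


if __name__ == "__main__":
    for h in ("www.googletagmanager.com", "region1.google-analytics.com", "example.com"):
        print(h, classify_host(h))
```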
https://www.quantable.com/blog
Analytics experiments and opinion, with around 10 years of deep-dive articles in the archive, mostly on Google Analytics, web performance, bots, and SEO.
One of the sites (coop.se) in this decision did use a server-side GTM container to mask the IP before it was sent to Google. They were still told to stop using GA, though they weren't fined. The DPA said that the _gads, _ga, and _gid cookies were enough to be identifiable. I don't follow the logic there, but that rules out using a proxy for compliance (at least done as coop did it).
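For context, "masking the IP" in a proxy often means truncating the address before forwarding it. A minimal sketch, assuming the last octet is zeroed for IPv4 and the address is cut to a /48 for IPv6. Those prefix lengths and the mask_ip helper are my assumptions for illustration, not a description of what coop's container actually did.

```python
import ipaddress


def mask_ip(ip: str) -> str:
    """Truncate an IP address so it no longer points at a single host.

    Assumed policy: keep a /24 for IPv4 and a /48 for IPv6.
    """
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48
    # strict=False lets us pass a host address and get its enclosing network
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    print(mask_ip("203.0.113.77"))
    print(mask_ip("2001:db8:abcd:1234::1"))
```

Note that truncation like this does nothing about the cookie values, which is exactly the part the DPA pointed at.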
I do know Plausible, and their motivation is to make a sustainable business providing basic web analytics, which is why they charge for their service and Google doesn't. The data they provide to the users of their service is like an order of magnitude less detailed than what Google provides.
I get the cynicism about the industry in general since Google led this merger between web analytics and advertising, but there are plenty of providers in the analytics space that aren't following that path.
Cloudflare Web Analytics is extremely simplistic and does not allow for any persistent identification of users or storage of personal information. It uses HTTP Referrers to count visitors and that's it.
One could argue that since it's a US-based company it can't be Schrems II compliant, but you can make that argument about a lot of things.
As a US-based company, they process (even if they don't store) the IP address. As such, the personal data of the EU users is transmitted under the control of the US Surveillance Act. No SCCs nor commercial contracts can shield this data.
You might have a legitimate interest in processing the IP, but because of the aforementioned issues, you cannot provide sufficient controls nor protection of Personal Data.
As such, using Cloudflare as your Data Processor exposes you, the Data Controller, to DPA scrutiny. As always with GDPR in the EU, whether it is illegal/non-compliant depends on each DPA.
The decisions don't explicitly mention a version; they say that these particular sites:
"...shall cease to use the version of the Google Analytics tool used on 14 August 2020".
They don't say if that's UA or GA4. The original complaints from NOYB refer to UA, but the issues cited in this decision would apply to GA4 as well.
So when the DPA says "Companies must stop using Google Analytics", there's no reason to think they only mean the version that was already shut off when they published that post.
I guess they can't ban a product for all eternity. In the decision [1] they are a bit more specific:
"This shall be done in particular by ceasing to use that version of the tool Google Analytics as used on August 14, 2020, if not sufficient protective measures have been taken."
Most alternatives are not made by advertising companies, but they also frequently aren't free... Rolling your own from the ground up is not necessary or typically advisable when there are so many good options, including many self-hosted and open source options if you're wanting that level of control.
I usually describe the cost of GA as "subsidized by your customers' data".