
PSA for any devs out there implementing FB Pixels:

– Facebook's pixel will, by default, attach click listeners to the page and send back associated metadata. This simplifies implementation, but can create unintentional information leaks in privileged contexts. To disable this behavior, there's a flag[1] you can use (sketch below), after which you can manually trigger FB Pixel hits and control both when they're fired and what information is included in them.

– There's a feature for the FB Pixel called Advanced Matching[2] that allows you to send hashed PII as parameters with your FB events. "Automatic Advanced Matching" can be enabled at any time via a toggle in the FB interface. I believe that setting autoConfig to false as mentioned above will similarly prevent Automatic Advanced Matching from working (since it disables the auto-creation of all those listeners to begin with). When manually triggering pixel calls, you can still use this functionality via "Manual Advanced Matching"[3], also shown in the sketch below.
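
Putting both together, a rough sketch (pixel ID and values are placeholders; verify the exact calls against the current docs[1][3]):

  // Disable auto config (click listeners, auto events, automatic
  // advanced matching). Must be set before the init call:
  fbq('set', 'autoConfig', false, 'YOUR_PIXEL_ID');

  // Manual Advanced Matching: only the fields you explicitly pass are
  // sent (the pixel normalizes and hashes them before sending):
  fbq('init', 'YOUR_PIXEL_ID', {
    em: 'user@example.com'
  });

  // Hits now only fire when you call them, with only what you include:
  fbq('track', 'PageView');
  fbq('track', 'Purchase', {value: 9.99, currency: 'USD'});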

As a general rule, I'd strongly encourage anyone implementing a Facebook pixel to also include the autoConfig = false flag. This makes it work like most other pixels, where the base tag just instantiates an object; hits then only occur when explicitly defined in the site code, and include only the details you pass to them. That way you're fully aware of the scope of data disclosure happening, and any request from marketing to include sensitive (or potentially sensitive) information in these calls has to be made explicitly (and theoretically vetted) as part of the standard dev process.

[1] https://developers.facebook.com/docs/facebook-pixel/advanced...

[2] https://www.facebook.com/business/help/611774685654668

[3] https://developers.facebook.com/docs/facebook-pixel/advanced...




This is good info, but Backblaze shouldn't even be using FB pixels to begin with.


I don't necessarily disagree with you, and whether a pixel should be there at all is definitely a discussion in itself.

But for those who are implementing FB Pixels, I wanted to put out some potentially useful information that can help protect against unintended data disclosure, after mentioning the auto-listener behavior in a reply to another comment and being met with surprise[1].

[1] https://news.ycombinator.com/item?id=26537078


Seems like it's designed to make it easy to accidentally send more of your users'/customers' data back to FB than intended. Oopsie!


Given their (paying) customer base, which skews more towards content producers, I suspect it's more likely intended to ease setup for less-technically-savvy users.

ie they don't really want your truly-security-critical customer data. But if they can boost their conversion rate with sites like dogfoodreviews.com by 5%, and the price is sending backblaze.com's fantastically-sensitive paid customer data into an unsecured data path, they will absolutely do it.

Comparable to the absolute havoc that Zoom wreaked on browser security to save one click on starting a call.


> I suspect it's more likely intended to ease setup for less-technically-savvy users.

> ie they don't really want your truly-security-critical customer data.

It's both. It eases implementation with a one-and-done snippet, and then slaps a user-friendly GUI on the other side for marketers to sort through the firehose and use what they want.

While also making it trivially easy for marketers to toggle a button that OKs the turbo-boost mode, which siphons up (hashed) sensitive customer information that can then be cross-referenced against what Facebook has for the people exposed to your ads, in order to claim credit for additional conversions.


Literally every external privacy setting of fb across its entire business is designed that way.


Exactly. When the most important thing for a backup provider is the trust people put in you, don't do anything that risks that trust.


Edit: my biases against Facebook keep me from making cogent points.


This feels like an almost intentional misunderstanding.


Pixel tracking on an internal customer page... scanning and uploading metadata about users backups? How much am I misunderstanding?


> I guess they can’t afford to cover costs given current prices so they sell customer metadata to Facebook

This sentence, your thesis, is absurd. Where does it say they make money from selling data?

Furthermore, "I guess they can’t afford to cover costs given current prices" is a really strange foundation to leap from. Do you have any facts? Or are you just speculating on Backblaze and making wild assumptions?

> now that other like-minded crazies can find each other faster than ever

You are right, there is nowhere else online where "crazies" gather - not 4chan, not reddit, not voat, not twitter, just Facebook?


Where does it say they make money from selling data?

What would be their motivation for just giving their customers data to Facebook for free?


Apparently it's incompetence, if they're unaware that they could have been making money on the deal.


The last point I’ll admit exposes my bias against Facebook.

But the previous two points make no difference on why tracking is baked into an internal portal so I speculated as to the reasoning as anyone would do — it’s not a stretch to think that data owners could sell customer data to an aggregator like Facebook for an additional line item of revenue.


Facebook does not, and never will, pay for data like this.


Which raises the question - why would anyone ever include a Facebook tracking pixel in their web page, if it provides no benefits, and has a cost?


> "if it provides no benefits"

Because it does? There are valid reasons for using the pixel in advertising.


> I guess they can’t afford to cover costs given current prices

This is plain wrong. One of the reasons I was a fan of B2 was their tech and how they achieved such low costs:

https://www.backblaze.com/blog/design-thinking-b2-apis-the-h...

https://www.backblaze.com/blog/backblaze-and-cloudflare-part...

While this FB pixel debacle is obviously a very big screw-up, it appears to have been unintentional from what I understand so far. And they have already fixed it, which is a positive step towards redemption.

From my speculation, the screw-up seems to have happened from including Google Tag Manager. They probably only wanted it on the home page of B2 (for ad conversion tracking, if I were to guess), not on the dashboard itself after login. The screw-up caused it to be on the dashboard too.


Why is that? Perhaps they get business value out of it?


If that business value destroys trust in their main business, that's a problem.


This is the thing that a lot of executives don’t ever seem to understand.


It looks like they’re about to lose a bunch of business value because of it.


Won't somebody please think of the business value???


That's how businesses make decisions.


> Backblaze shouldn't even be using FB pixels to begin with.

> Why is that? Perhaps they get business value out of it?

Oh, they get value out of invading my privacy? Carry on then!


Running and measuring ads is one of many things that delivers value to a business, yes. The privacy issue in this case is clearly an implementation mistake and seems to have been resolved.

Ignoring the situation and context to make a comical statement doesn't really add anything to the discussion.


The overwhelming mentality on HN is that all ads are bad.

And all targeted ads are bad, because by their definition, all ads are tracking ads.

This mentality also fits the current internet and Twitter narrative, especially when the ads come from Facebook, which happens to be pure evil on HN, the Twittersphere, and mainstream media.


> The overwhelming mentality on HN is that all ads are bad.

Ads will at best waste my time/bandwidth/processing power and at worst compromise my privacy and/or convince me to make a bad financial decision.

I don't see why this mentality is wrong?


Ads are micropayments that work. I don’t like them per se and run an ad blocker and PiHole, but the fact that others don’t allows me to micropay for a lot of content with my time.


> Ads are micropayments that work.

It's one thing if those micro-payments end up enriching the creators, but another when they're blackholed into the coffers of a few tech cos.


Surely you meant to say:

> but the fact that others don’t allows them to micropay for a lot of content with their time.

I'm not objecting. I do the same. But in what sense are you paying? I'm sure I'm not.


When I read a 5 minute Medium article, I'm paying with 5 minutes of my time. If I decide to bail out 1 minute into the article, I have still lost 1 minute of my time.

The creator isn't getting any benefit from it, but I'm still paying.


> Because by their definition, All ads are tracking ads.

John Gruber has an ad at daringfireball.net, currently for a company called Simris. IIRC the ad is pure text, not loaded by a script, and does not track you. Other blogs (usually security professionals ime) have text-based ads that are probably part of the theme in a static site generator.


I should have put /s at the end of that sentence.

I know all of what you said, but somehow HN still thinks Ads are bad.


I assume they get business value by retargeting site visitors - to do this, just run the FB pixel (properly configured!) on the marketing pages of the website. Not in the logged-in part!


Why? Using ads to increase business is completely valid. This issue is data leakage due to an implementation error and has nothing to do with using advertising services from Facebook or other companies.


By this reasoning, using guns to shoot people is completely valid; the issue is stray shots due to inadequate aim and has nothing to do with being a criminal engaged in a drug war.

I'll resist the temptation to draw a parallel between "advertising services from Facebook or other companies" and a crime syndicate.


No, that's not the same reasoning at all. It's an irrelevant and outrageous strawman where you compared the use of advertising to "using guns ... as a criminal engaged in a drug war". Ridiculous at best and I'm not sure what temptation you resisted.

If you have a real rebuttal against advertising then reply with that instead and we can discuss how technical implementations can be fraught with security mistakes and errors, regardless of industry or product.


The point is that "technical implementations", such as how to shoot properly, shouldn't be discussed "regardless of industry or product", such as being a gangster: sharing PII with Facebook is something most web sites should avoid, not something they should do properly.


> "shouldn't be discussed"

Why not? Technical implementations can always be discussed separately from the context they're used in, and even your extreme example of guns has perfectly valid uses in the police and military. Yet you're making the strange comparison to being "a gangster". Why? What's the point of this convoluted analogy?

> "sharing PII with Facebook is something most web sites should avoid, not something they should do properly"

Again, why? You seem to claim a lot without any basis. Data has valid uses, and being used properly is foundational to providing privacy.


It shouldn't be controversial that not sharing sensitive data with Facebook is "foundational to providing privacy" and therefore using a Facebook tracker to fuck users needs a solid, extraordinary justification like "valid" gun use by the police and military.

You seem to believe that this particular breach is accidental, but reckless incompetence on Backblaze's part isn't much better than deliberate disregard for user privacy: any online service from Facebook should raise a red flag.


If you agree that the sensitive part was in error, then you're just against sharing data with Facebook for ads? That's certainly not some unanimous global perspective as I'm sure you know, since that's their actual business used by millions of other companies.


There are ways to use ads without violating privacy or breaking the law (remember that this practice is illegal under the GDPR).

Either way, if you must do ad tracking, do so on your homepage. Once the user is logged in and has paid you money for a service there shouldn’t be any ads nor tracking.


> "without violating privacy"

Yes, that's covered by this being a mistake in implementation as I said.

> "there shouldn’t be any ads nor tracking"

Again, based on what exactly? Finding new users that are similar to your existing customers is a completely valid strategy.

Most people in this thread are making wild statements from the typical emotional/outrage-driven pile-on when anything happens.


> Finding new users that are similar to your existing customers is a completely valid strategy.

What on earth does “valid” mean here? It’s certainly not acceptable (to me as a customer) if it involves exposing your existing customers to these risks. Those ends can not justify those means.


Valid as in it's a common, reliable and efficient way to gain new customers.

Customers weren't intentionally exposed to that risk nor was it part of a trade-off, it was an implementation mistake for many reasons, something I've repeated 3 times now. What is so complicated to understand here?


Customers were intentionally exposed to the risk, because they intentionally added this third-party code. If they’re not thinking in terms of risk management when they add third-party trackers to their site they do not have an adequate security process. There is a trade-off to security whenever you allow code like that in your product. They can’t just wave it off as a mistake, because it’s a mistake that is very telling about their priorities.


The mistake was allowing code into that specific part of the product.

Under your definition, there can never be mistakes at all, which is impossible.


It’s very simple: if you include un-vettable third-party code in your system, and the system also handles sensitive data, you are dealing with a huge risk. You need to make sure that the code is unable to touch the sensitive data. As it turns out, it’s a lot harder than not having untrusted code and sensitive data in the same system in the first place. The direct mistake was probably that the wrong code was included on the wrong page, but if the risks involved had been taken seriously, such a small mistake would not have been able to have such a catastrophic effect.


Based on respect, common sense and the GDPR?

> Finding new users that are similar to your existing customers is a completely valid strategy.

But this can be achieved with tracking in the homepage without embedding trackers in the actual product right next to sensitive data?

> Most people in this thread are making wild statements from the typical emotional/outrage driven pile-on when anything happens.

This doesn't make these statements any less valid though? Most people are indeed outraged that a paid professional product is ratting them out to Facebook which makes total sense as nobody would've expected that.


> "But this can be achieved ... without embedding trackers in the actual product right next to sensitive data?"

Yes, it was an implementation mistake. How many times do I have to repeat that? See, this is the outrage that doesn't even read the actual comment.


The thing that concerns me about the FB Pixel (and GTM) is that the host is completely free to do anything and everything to the page. Even if they don't do anything "evil" today, tomorrow is a different story completely. This scares the pants off of me and makes me want to rip out any "tracking" that I've ever installed on any site anywhere. Actually, that's probably not a bad idea.

Are there no browser level protections for this type of thing? I thought CORS was supposed to prevent these activities from happening.


Virtually all tracking boils down to 1x1 images getting embedded on the page, with various metadata attached to that image call. The javascript libraries may include other functionality (like additional fingerprinting and such), but are primarily just convenient abstractions that generate and embed the tracking images for you. Most provide the details needed[1] to build your own generator function, which would allow you to integrate the tracking you want while reducing your security exposure to third party code.
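
For illustration, a minimal first-party generator might look something like this (the id/ev/cd[] query parameters mirror what the official script sends to the /tr endpoint; treat it as a sketch rather than a drop-in replacement):

  // Build the same kind of image request the official script makes.
  function trackFb(pixelId, eventName, customData) {
    var params = new URLSearchParams({id: pixelId, ev: eventName});
    // custom event data goes over as cd[key]=value query parameters
    for (var key in (customData || {})) {
      params.append('cd[' + key + ']', customData[key]);
    }
    // requesting the image is what records the hit
    new Image().src = 'https://www.facebook.com/tr?' + params.toString();
  }

  // e.g. trackFb('YOUR_PIXEL_ID', 'Purchase', {value: 9.99, currency: 'USD'});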

As for GTM – a deployed container is self-contained. If you don't want to expose your site to third party code, but want to use GTM as a convenient control plane for configuration of tags and tagging rules, you can do that. Instead of using the standard snippet that loads the container from Google, you can just grab the generated javascript file for the container after a new deploy and self-host it. It gives you the convenience of GTM (central control plane for tagging-related stuff, versioning and commenting, etc) but without the security exposure of embedding externally hosted scripts.

[1] https://developers.facebook.com/docs/facebook-pixel/advanced...


The actual 1x1 pixel is a leftover from the previous generation of tracking tools, and even the page you linked to recommends _against_ using that method because it can’t spy on users enough.

Here we are talking about a tracking _script_ embedded in the page and sending to Facebook everything the user does (“standard or custom events triggered by UI interactions”).

Using only a pixel to track how users move around the app wouldn’t have landed Backblaze in as much hot water. Instead, it looks like the Facebook _tracking script_ (automatically) exfiltrated sensitive data like file names, and that crosses a line.


It's not a leftover – these scripts work on the exact same principle. Even when using the JS tracking library, if you look at the network calls to Facebook after the initial script download, they're all hits to https://www.facebook.com/tr/ with the metadata for the call in query parameters, returning an image content type (image/gif).

As I mentioned in my original comment, the tracking scripts are more than just generator functions for the image pixels. They also do stuff like browser fingerprinting and cookie management[1], and ensure these things get tacked onto generated pixel calls. This improves the fidelity of the data sent back to Facebook, but ultimately it all boils down to image calls with tracking data tacked on as query parameters to the call.

The reasons Facebook (and others) don't recommend doing this are:

– As you mentioned, they have way more freedom to do what they want on the page when you load their actual script. So of course that's going to be their preference.

– Advertisers use these pixels for attribution purposes, but ad networks also use the opportunity to further fingerprint and profile users for targeting within their platform.

– The tracking script abstracts away the actual tracking protocol being used (i.e. the query parameters and their associated values). Which helps ensure calls are made correctly, as well as provides flexibility to make changes in the underlying protocol while retaining a stable interface via the JS SDK.

– Takes care of things like generating a unique user id, looking for and saving Facebook Click IDs when seen on incoming traffic, and tacking those values onto pixel calls when they occur.

Any user ID can actually be used, so long as it's unique (and Facebook's methodology is documented and easily replicated in [1], if you want to be consistent with the SDK). And persisting a query parameter into a cookie is actually more robust if done by a first-party script, since ITP has made the lifespan for cookies written by third-party scripts so short.

As long as your custom image generator accounts for those two components (generates a client id if none exists and persists + includes a fbclid if seen on incoming traffic), you will get close to parity with the JS tracking library as far as attribution in Facebook Ads without any need to load third party scripts from Facebook (or other advertisers). Which, as an advertiser, is the only part that you care about. What isn't at parity is all of the secondary fingerprinting that ad networks do, but that's the ad network's problem and preventing that shady shit from happening on your site is the precise reason you'd want to roll your own tracking calls to begin with.
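
Roughly, those two pieces might look like this (cookie names and value formats follow the _fbp/_fbc conventions described in [1]; double-check before relying on them):

  var NINETY_DAYS = 60 * 60 * 24 * 90;  // cookie lifetime in seconds

  // Client ID: reuse the _fbp cookie if present, otherwise generate one
  // in the fb.1.<timestamp>.<random> format described in [1].
  function getClientId() {
    var match = document.cookie.match(/(?:^|; )_fbp=([^;]+)/);
    if (match) return match[1];
    var fbp = 'fb.1.' + Date.now() + '.' + Math.floor(Math.random() * 1e10);
    document.cookie = '_fbp=' + fbp + '; max-age=' + NINETY_DAYS + '; path=/';
    return fbp;
  }

  // Click ID: if the landing URL carries ?fbclid=..., persist it as _fbc
  // so later tracking calls can include it for attribution.
  function captureClickId() {
    var fbclid = new URLSearchParams(location.search).get('fbclid');
    if (fbclid) {
      document.cookie = '_fbc=fb.1.' + Date.now() + '.' + fbclid +
        '; max-age=' + NINETY_DAYS + '; path=/';
    }
  }

Your generator function would then tack those values onto each image call alongside the rest of the parameters.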

[1] https://developers.facebook.com/docs/marketing-api/conversio...


As a first-party site owner, you can use subresource integrity checks[0] (which someone else already linked elsewhere in this thread) to at least determine, at the browser request level, whether a third-party script has changed since you installed (and hopefully audited) it.

0: https://developer.mozilla.org/en-US/docs/Web/Security/Subres...


The ad-tech response to SRI is to provide script tags where the entrypoint script loads additional scripts, which you cannot pin. ¯\_(ツ)_/¯


Hmm, that's a good point. That does seem like a flaw in the plan.


For various reasons including this, advertising tracking is moving server-side, where the company can much more tightly control what gets sent to the vendors, and where third party JavaScript no longer has access to the DOM, network requests, or cookies.


And pesky users can't point out GDPR violations using the browser Developer Tools.

Server side analytics will prove much more powerful and opaque when it gets integrated deep enough into web dev stacks to work properly.


This is scary to think about.

The upside of third-party trackers is that you can completely block all of them by just blocking third-party javascript. What are we going to do once all of this tracking code starts getting served from the first party domain instead? Or even served inside the same source files as site code?

I imagine we will start seeing a new class of privacy extensions that behave more like anti-virus. Checking for known hashes of tracking scripts, monitoring for certain patterns of behaviour during execution.


The future is entirely server-side tracking, with no JavaScript executed in the client unless for UX tracking like Hotjar or A/B testing like Target or Optimize.

Personally, I haven't seen a desire in companies to skirt GDPR. Rather companies just want to be compliant and not have to worry about data breaches or reputational damage from their marketing tools. This example with Backblaze is exactly what companies are trying to avoid.


CORS is about safely allowing two cooperating, different origins to communicate. In this case, Facebook and the host are cooperating.


As a protection for the users, addons like Facebook Container for Firefox [0] can isolate all Facebook tracking and prevent the scripts from running on pages that are not facebook.com.

[0] https://addons.mozilla.org/en-US/firefox/addon/facebook-cont...


Even just using an ad-blocker will prevent this: https://github.com/gorhill/uBlock


And if that doesn't tick your creepy boxes, let's try the financials. If a user hits your tracking pixel, they (and those like them) will be more likely to see ads similar to yours, meaning your potential customers will become more expensive to acquire.

Don't give data to Facebook lmao.


I have a question about tracking scripts: can they read what we type into browser addons? Eg, your master password when unlocking a password manager?



