The case for doing what they did is that it helped a lot of people access books (e.g. school curriculum ones) in the middle of Covid (not everyone knows how to pirate things).
Not saying it was the right move, but you can see where they're coming from: they wanted to help people.
Yes, I love the idea and I want it to be legal. Perhaps it is even worth someone testing the law here intentionally to get the courts to rule.
The Internet Archive's mission, imo, should above all be to protect and steward its archives and its ability to keep maintaining them. Putting that at risk for a morally righteous additional cause is not worth it. This was a leadership failure.
The timeline doesn't change the conflict of interest... he likely received pressure after his first comments that made him change his statement so as not to lose the grant.
The corruption that we are seeing right now is alarming. More alarming is how this person is protected by a big chunk of the media and population.
The Pfizer/EU text messages are pretty alarming too.
The fact that most of the research is still redacted to this day, together with the recent lab-leak news, ties everything together into something that is no longer a conspiracy theory but reality.
Can you expand a bit on the quality of Google's internal practices with user data? E.g., what would an engineer on Google Photos need to do to be able to access my photos, if that's possible at all?
In general, SWEs/SREs do not have access to user data. SWEs have no prod access at all, and SREs generally only have the ability to run signed binaries whose code has been reviewed and submitted, though there are obviously break-glass features with auditing; I am not super familiar with those.
ML is where things tend to be a little more grey, since being able to look at data is very useful for development, so some things are scrubbed for PII but then accessible in some form. But for things like Gmail or Photos, I would assume nobody (including ML engineers) can read your data, as these are basically impossible to sanitize.
Some products have systems that train ML models without engineers ever seeing the data (e.g. spam filtering), even when the underlying data is considered sensitive.
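A rough sketch of what that pattern looks like (purely illustrative; the scrub rules and names below are my own assumptions, not Google's actual pipeline): an automated scrubbing step strips obvious PII before anything is written to the training corpus, so nothing downstream ever sees the raw form.

    // Illustrative TypeScript sketch, not a real pipeline: scrub obvious
    // PII before an example is ever logged for training.
    const EMAIL = /[\w.+-]+@[\w-]+\.[\w.]+/g;
    const PHONE = /\+?\d[\d\s().-]{7,}\d/g;

    function scrubPii(text: string): string {
      return text.replace(EMAIL, "<EMAIL>").replace(PHONE, "<PHONE>");
    }

    // Only the scrubbed form leaves the sensitive store; neither engineers
    // nor the training job ever see the raw message.
    function emitTrainingExample(rawMessage: string, label: "spam" | "ham"): void {
      const example = { text: scrubPii(rawMessage), label };
      console.log(JSON.stringify(example)); // stand-in for a corpus write
    }

    emitTrainingExample("Win $$$: mail win@scam.example or call +1 555 123 4567", "spam");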
Basically all the data is available with little real oversight: you just ask permission and make some allusion to a relevant $JOB reason to need it.
The fairly recent case of people's private conversations being shipped out to basically unvetted contractors for labeling and analysis (and subsequently leaked) should serve as sufficient evidence that "shit happens." If private conversations, recorded without the user even having initiated an interaction with their Google devices, are being tossed around and leaked, forgive me if I don't believe that the tagging, timeline, and album features in Google Photos were built without some underpaid, unwatched contractor snooping through my photos without my permission.
I believe I can share an anecdote: a while ago I uploaded a bunch of photos of a work event to the corporate account. This being corporate, it runs pre-release versions of everything. The gallery managed to hit some bug in the jillions of lines of JavaScript, which I never cared to understand.
I reported the bug. Knowing how security works technically, I added to the bug the words "I'm happy for whoever works on this to take a look at the gallery, here's a world-readable sharing link". A couple of rounds of bug comments later, I was asked to sign a legally binding consent form allowing an engineer to look at the gallery. Then somehow they decided I needed to sign a different form to satisfy whatever other legal spirit needed appeasing. Only then did someone finally look at the bundle of photos. They figured out whatever was triggering the bug, generated a gallery reproducing it with generic sample images, and whoever fixed the bug and added a regression test worked off that synthetic gallery instead.
My concern—and I think what everyone's concern should be—is not that Google employees have access to my data. Or even that it's used to train ML models. It's that my data is sold to advertisers via a shady behind-the-scenes marketplace, and later used to profile me in order to show me content that manipulates me into spending money.
And that, moreover, I get none of the profits from these transactions, and have no control over whom it's sold to and under what terms.
IIUC the content scripts that extensions use load after the page is loaded, so this shouldn't affect page load times. But please reach out if you feel like it does!
Most pages these days still do a lot of additional loading after "page load", so you're still going to be delaying that unless you take special pains to wait until all other JavaScript activity has died down.
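For what it's worth, one way to be extra careful here (a minimal sketch; deferUntilQuiet and doExtensionWork are made-up names, and it assumes a content script injected at document_idle) is to hold off on heavy work until the browser reports an idle slice:

    // Content-script sketch: defer heavy work until after "load" AND an
    // idle callback, to stay out of the way of the page's own
    // post-load JavaScript.
    function doExtensionWork(): void {
      // ... scan the DOM, inject UI, etc.
    }

    function deferUntilQuiet(): void {
      // requestIdleCallback only fires when the main thread has slack;
      // the timeout caps how long we are willing to wait.
      requestIdleCallback(() => doExtensionWork(), { timeout: 5000 });
    }

    if (document.readyState === "complete") {
      deferUntilQuiet();
    } else {
      window.addEventListener("load", deferUntilQuiet, { once: true });
    }

That still won't catch pages that keep streaming data forever, but it avoids competing with the main burst of post-load activity.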
I don't know why, but this idea really seems to rub some people up the wrong way. Yet we all know that on a big team, only a handful of people will be doing the majority of the work.
Yep, accounting for carbon offshoring narrows the difference (US emissions gain 7% if you count consumption rather than just production, and China's lose 14%), but it still holds that China emits more in total.
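Back-of-the-envelope with ballpark 2020-era territorial totals (the ~5 and ~11 Gt base figures are my rough assumptions; only the percentages come from above):

    // Rough consumption-based adjustment; base totals are ballpark guesses.
    const usTerritorial = 5.0;     // Gt CO2 / year, approx.
    const chinaTerritorial = 11.0; // Gt CO2 / year, approx.

    const usConsumption = usTerritorial * 1.07;       // ~5.35 Gt (+7%)
    const chinaConsumption = chinaTerritorial * 0.86; // ~9.46 Gt (-14%)

    // China's total still exceeds the US's, but the gap narrows from
    // ~6.0 Gt to ~4.1 Gt under this rough accounting.
    console.log(usConsumption.toFixed(2), chinaConsumption.toFixed(2));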
I blog about books, podcasts, courses I've done, product management thoughts du jour, and the occasional rant.