Yes we know. brew tells you that. And you can disable it. Data is being sent to Google every time you do almost anything on almost all websites, using the same technology. So what?
Some of the times it's to make the product better, or better targeted. Some other times it's just for spying on the users.
Let's stop complaining about stuff that someone does and tells you they do. There are many more that do the same things without telling you.
Furthermore, now that you know, for brew specifically, will you opt out? Is your brew command history so secret that you care more about no one looking at it than helping make brew better? Chances are you use brew quite often. Chances are brew is not perfect. Will you choose to make its progress slower because of no real security reason?
How do we expect open source to become better if everyone is being a crybaby because brew got a history of how often they do brew update?
The US approach to privacy is "let the market sort it out". It has sorted it out by transforming the internet into a giant spying machine sucking up contacts, photos, documents, videos, metadata, what packages you use, when, where and how you exercise, everything.
The EU approach to privacy (eroded by lobbying and lack of control over US companies) is that citizens have a right over their information. They can ask what personal information a company has on them, ask that it be corrected or deleted.
This has resulted in companies that are more careful with data.
Whenever I read a post like yours I immediately think of so-called "useful innocents", to put it euphemistically [1].
Fighting for a non-goal of better open source through analytics (get real), corporate dominance and fewer rights for individuals.
Companies don't need analytics to improve software. They certainly don't need analytics from Google, the leading spyware-as-a-service company in the world.
I expect open source to become better through writing quality software, engaging with users and if need be doing some surveys or organising other reach-out initiatives in the open. Not analytics turned on by default but it's-ok-cause-we-tell-you-we're-fucking-you.
Do you use brew?
Have you contributed code to it?
Have you installed something using brew, before updating brew for a while?
For me, the answers are yes, no, yes. A while ago, if you did that, you got an old version of the thing you just installed. With analytics, they saw that many people forget to update before installing after a while, so they added an update step in that case.
I call that progress, no matter how small. I didn't change that and you didn't change that. The brew devs changed that. Maybe they would have changed it anyway later. Who knows. But the fact that brew became a tiny tiny bit better because of analytics, in part from me, makes me a happier user. Now I don't need to remember to update before installing.
Of course it's not the best way to "contribute", but it's something. Definitely better than "contributing" your data to companies, like you said.
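To make the "update step" concrete, here is a rough sketch of the idea of updating first when the last update was a while ago. It is not brew's actual code. The threshold and the function names are made up for illustration.

```python
from typing import List, Optional
import time

# Hypothetical threshold: treat package metadata older than a day as stale.
STALE_AFTER_SECONDS = 24 * 60 * 60


def needs_update(last_update: float, now: Optional[float] = None) -> bool:
    """Return True if the last update is older than the staleness threshold."""
    if now is None:
        now = time.time()
    return now - last_update > STALE_AFTER_SECONDS


def plan_install(package: str, last_update: float,
                 now: Optional[float] = None) -> List[str]:
    """List the steps to run: an update first if metadata is stale, then the install."""
    steps = []
    if needs_update(last_update, now):
        steps.append("update")
    steps.append(f"install {package}")
    return steps
```

The point is only that the tool, not the user, remembers to check how old the local metadata is.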
I can totally see the benefits of telemetry. But it would be way less fishy if it were opt-in, or at least opt-out with a very visible information message.
The problem with this is, do you want every program you ever use, to start prompting you with a series of questions about various opt-in / opt-out questions? I get annoyed enough that gnu parallel keeps asking me about citing it, and that's one program.
Bash would like to record analytics. y/n/more information
> y
> ls *.c
ls would like to record information about how you use it.
y/n/more information.
> y
file1.c file2.c file3.c
> grep 'string' *.c
grep would like to record some information about your
usage. y/n/more information.
> y
file1.c:8: string
> vim file1.c
vim would like to record usage information, to help
improve vim in future. y/n/more information.
> y
The 'C' package in vim would like to ...
If I saw something like that, I'd format my drive and install another OS.
The very idea of your shell or file listing program sending analytics to the mothership is ridiculous. They shouldn't be even talking to a network directly. They have a set of well-defined tasks, and spying on you is not one of them.
"The problem with this is, do you want every program you ever use, to start prompting you with a series of questions about various opt-in / opt-out questions?"
Simple answer, Yes.
It would lead to conversations about the elephant in the room. It also removes the assumption that I trust all providers (and comprehend the EULA/ToS) and puts their actions under scrutiny. If more people realized what data is collected, by whom and for whom, they could make an informed cost/benefit choice. Other useful results: it would raise the questions of how applicable that data is to the project and what characterizes the 'Trusted Partners' the data is shared with, and it might inject some restraint into the collectors' decisions about what to collect.
Then we hit the problem that if we don't push that request at users, probably a tiny fraction will turn it on.
Worse, that fraction will be the statistically unusual people who bother reading and finding such options, meaning we can't derive any statistically useful results about the user base from them!
Frankly, that's your problem. I don't see even a true desire to make the software better as a legitimate reason to exfiltrate data from people without asking.
Maybe you should ask them better. For example you could ask for help on the home page of your project or in the beginning of a tutorial. Or maybe they do not want to participate. Is it right to ignore this and turn analytics on by default?
Browser history is useful to the user (but maybe the access to it could be restricted by means of authentication). I'm fine with off-line browser history that's accessible only to the user (and not JS running on a website). But sending any of it over the wire should definitely be opt-in.
An inexperienced user might not know that the history is recorded and can later be accidentally seen, for example by another member of the family. Good software would not allow that.
I remember some IM app for linux had history turned off by default. I was really surprised then.
Come now, a bajillion other software projects have "figured out" that people forget to update everything, always, without resorting to this sort of behavior. There's even a whole library, Sparkle, that exists just to solve this problem. Your nonexample of the necessity of analytics merely serves to reinforce the point you are trying to rebut.
> The EU approach to privacy (eroded by lobbying and lack of control over US companies) is that citizens have a right over their information. They can ask what personal information a company has on them, ask that it be corrected or deleted. This has resulted in companies that are more careful with data.
More like, this has resulted in things like nonsensical "Cookie warnings" that only waste the user's time.
Cookie law is one of the stupidest laws that EU forced. I had to install additional rules to adblocker to get rid of this nonsense and it still doesn't catch everything and keeps irritating me... I don't care, damn it! http://nocookielaw.com/
The law is pretty simple: If you track users, or transmit information to third parties which could track the user, you have to get the user to opt-in.
The original intention was to get rid of Facebook’s shadow profiles, which it did – in the EU, websites don’t embed Facebook’s like button anymore, but have a two-click solution, where you have to click it, then it’s actually loaded, then you can like.
It is not worth it. I would really want a good way to protect myself as an Internet user from this law that the EU is enforcing on me. To protect myself not from Facebook, not from Google, but from the EU. It also gets in my way as a web developer. The law is not simple. 100 words sentences of law jargon are not simple. Many, many pages of such sentences is the opposite of simple. To be honest I have never really understood the law and I don't think I ever will. Even keeping track of it is far from simple.
But according to this website: http://ec.europa.eu/ipg/basics/legal/cookies/index_en.htm
Cookies clearly exempt from consent according to the EU advisory body on data protection (WP29) include: (...) third‑party social plug‑in content‑sharing cookies, for logged‑in members of a social network.
So, I think what you say about Facebook is not true.
I was talking about non-logged in users. The shadow profiles.
> The law is not simple. 100 words sentences of law jargon are not simple. Many, many pages of such sentences is the opposite of simple.
The German version of the law is just as simple to read as any newspaper or book you read in German high school, so at least I haven't had an issue before.
IMO, the intent was good... but yeah, the implementation is a grand demonstration of what you get when legislators don't understand the technology they're regulating. It would be far better if they'd required browsers, not websites, to show the notifications - much like they already do when a site wants to access your webcam/mic/location/etc[0]. That would mean much less implementation work (once per browser instead of per site), no way for underhanded sites to use cookies without the user being notified or despite the user declining them, consistent UI across all sites...
Such a thing could provide significantly more useful information, too - I envisage a notification with "This site wants to use a cookie on your computer" at the top, "allow/deny, now/always" buttons and a "What are cookies?" link at the bottom, and a user-friendly breakdown of this particular case in between, things like:
• "only visible to this site" vs "visible to ad.doubleclick.net" etc - maybe including, say, the Organization Name from the cookie domain's SSL cert, at least in the case of cookies set to "Secure" (maybe only if the cert's EV)
• "until you close your browser" vs "for a week" etc - perhaps with a way for the user to force session-only if desired
• possibly some kind of warning about snooping risk if the cookie's not marked secure, or not HTTP-only & 3rd-party scripts are on the page, etc
• for the case of 3rd-party cookies, it'd be possible to list which other sites have used the same cookie in the past
And so forth. The most important point being that you could actually trust this information - your browser has no motivation to lie to you about it, but any random site might.
Or, to look at it another way... so what if the browser pops up notifications more often than the law requires? Might encourage people to make less use of cookies for functionality that doesn't really need them, which would be no tragedy. And if your site really can't do without them, you could pop up a message explaining the situation & politely asking to be unblocked.
I suppose there'd be a danger of alert fatigue... maybe some heuristic analysis of the cookies themselves would be in order, to at least tentatively classify them as "tracking" vs "other". Eg, Google Analytics & Piwik cookies could be identified pretty reliably.
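For what it's worth, a name-based heuristic like that could be sketched roughly as follows. The patterns cover the cookie names Google Analytics and Piwik are known to set (`_ga`, `_gid`, `_gat...`, the legacy `__utm*` family, and `_pk_id.`/`_pk_ses.`/`_pk_ref.`). The function name and the two labels are just placeholders, and the list is illustrative rather than exhaustive.

```python
import re

# Cookie-name patterns for two well-known analytics tools.
# Illustrative only; a real classifier would need a maintained list.
TRACKING_PATTERNS = [
    re.compile(r"^_ga$|^_gid$|^_gat"),   # Google Analytics (analytics.js)
    re.compile(r"^__utm[abctvz]$"),      # Google Analytics (legacy ga.js)
    re.compile(r"^_pk_(id|ses|ref)\."),  # Piwik
]


def classify_cookie(name: str) -> str:
    """Tentatively label a cookie as 'tracking' or 'other' based on its name."""
    if any(pattern.search(name) for pattern in TRACKING_PATTERNS):
        return "tracking"
    return "other"
```

Anything not recognised falls into "other", so this can only reduce the number of prompts for known trackers; it proves nothing about unknown cookies.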
> I suppose there'd be a danger of alert fatigue...
But that's happening with the current solution as well (not to mention all those crappy sites with their modal popups that teach you to close anything opening asap anyway)
Cookie warnings are an indicator showing that a website takes part in a global spying network. But I think instead of warnings browser developers should just disable third-party cookies by default. Sadly, the most popular browser developer by coincidence owns a popular tracking service so that is not going to happen.
> Companies don't need analytics to improve software.
I was with you except this line. Data is needed to make better choices and prioritize improvements. How would you like people to collect data? Google Analytics is free, easy, and powerful. Adding the cost of building a KPI tracking and reporting layer from scratch... simply would be too much for most open source projects.
It's the combination of companies gathering ever more personal data and companies proving to be absolutely incapable of preventing leaks of personal data that I find worrying even more than the direct spying.
The EU approach of “let’s regulate everything and let incompetent out of touch bureaucrats write the legislation” led to annoying (much more than ads for me) cookie banners and other such nonsense. Starting next year, cookies — i.e. often random nonsense numbers — will have to be treated as personal data in EU. I envy US.
>Data is being sent to Google every time you do almost anything on almost all websites, using the same technology. So what?
This still doesn't mean we should just shut up and take it for desktop software. What anyone does on their OS shouldn't leave the LAN - the same critique stands for the recent MS endeavour with Windows 10.
I definitely don't mean that we should shut up and take it. But before we complain about an open source project that maaaaany devs use happily, let's complain about those other cases first, yea?
Ok, sure. At least complain about both. I didn't see the post referring to anything besides brew. I'm all for protesting and trying to fix any size and kind of injustice. But just don't target the less guilty. brew is still guilty for not having an even bigger announcement. But let's target others too, at least. I would hate to see brew or any other similar software be the sole target and have to change for the worse, because it was unlucky enough to be the scapegoat.
>I didn't see the post referring to anything besides brew.
That's because this post is about brew. Windows 10 got tons of flak when it pulled this crap as well. "There are worse cases" is not an excuse for a project doing something shitty.
>Data is being sent to Google every time you do almost anything on almost all websites, using the same technology. So what?
No, it is not so what. This is a major problem with Google instead. The fact that a single company can track the majority of the web traffic is a huge risk.
Sure. Opt in would be nice. But only if it was opt in always and everywhere, and all users, technical or otherwise, knew the reasons why analytics are gathered for a product they use. I think in that case, many people would still opt in. I would. And I would hope users of my products would too.
Yes, and maybe it's okay if we don't have perfect visibility. We've been building things for thousands of years without Javascript event notifications, I'm sure we will do just fine with a little less data.
Yes, there are arguments for organ donation to be opt-out. But that's an extreme case, when one loses literally nothing - by virtue of being already dead - and another person stands to gain additional years of life[0]. Some other socially beneficial things also have strong arguments for being opt-out (like retirement plans, because opt-out protects people from their own stupidity/short-sightedness).
But that doesn't change the validity of the proposal that opt-in should be the standard - exceptions from which must have solid reasons. Just making it easier for someone to make money off people is not one of those reasons. Neither is vague "making the product better".
--
[0] - INB4 yes, there are also valid arguments that opt-out organ donations will reduce doctors' willingness to fight for a patient's life. Human societies are complicated.
> Yes, there are arguments for organ donation to be opt-out. But that's an extreme case, when one loses literally nothing
Actually, I've opted out because my next of kin stand to lose quite a lot (the harvesting of organs needs to be done quickly, well within the mourning period of the people I care about the most). The decision to leave sight of my body needs to be theirs, and theirs alone.
Well if you can't bear the thought of your friends and relatives not being able to stare at your dead corpse for a couple of hours and value that over doing something amazing and saving someone's life then more fool you.
If you're in a position to donate organs you most likely died in an accident. You likely won't have that rosy picture of your family and friends around your bedside. You might not have anyone.
Yes, fuck you too. Try a little empathy next time, and failing that, reading comprehension.
I said the decision should be theirs. I did not say my body couldn't be parted from them, I said it was their decision to do so. I used to be registered as a donor, but because the law has now changed so that my donorship overrides the wishes of my loved ones, I have withdrawn it.
And I said they may not be there to give that decision, so that's moot really. Organs like lungs and hearts expire very quickly. Reading comprehension indeed.
> Try a little empathy next time
For the dead person, or the person who misses out on a life saving organ due to the dead person?
It's your body and your choice obviously, but I think saying it's up to my relatives to decide isn't a great reason to be taken off the donor registry. There is plenty of time for them to grieve but only a few hours to take a vital organ. If you want to do something great if you expire unexpectedly then it's up to you, don't put that on your partner/relatives.
Also somewhat related to this - recently I've realized that this privacy paranoia is going to slow down medical advances coming out from big data so much.
For example, your wearables get to collect so much biometric info; if that data can be connected to detecting conditions early it would provide a lot of value down the road. At some point we will have the option to collect data about what you ate, what you did, where you went and then how that affected your biometrics, and we can inexpensively collect huge-scale DNA samples, etc. All that data, if available publicly, could really provide insights into things that are really not practical in limited group studies.
For reasons such as this I think I'm fine if services collect anonymized (unless we solve identity theft and such security concerns) information about me, I'd just want them to make this data free.
I mean the general sentiment. I'm not saying people should have access to your full medical history, personal info, etc. on demand.
What I am saying is that these benign things are opt-out not because most people wouldn't want to do them if they weigh the pros and cons but because they don't want to put in the effort of doing so and will just be conservative - which is logical from an individual perspective - but will cause us to lose out on opportunities as a whole.
Also this data is getting collected whether you want it or not; even intelligence people are just taping over their webcams as a security measure. The attitude that we must protect every bit of privacy by default will lead to a future where hidden data collection is the only way to access data: people will be making money off it, it won't be available to the general public (e.g. for public research) and there will be no transparency about it. And if you think the government will protect you, well, they are the biggest transgressor here.
So instead of fighting a lost battle with trying to keep absolute privacy why not just make most of that data public and available and focus on protecting the really sensitive stuff.
> So instead of fighting a lost battle with trying to keep absolute privacy why not just make most of that data public and available and focus on protecting the really sensitive stuff.
Because you can't. You cannot build something that protects the really sensitive stuff out of stuff that's leaking data left and right, making it harder and harder to protect the really sensitive stuff.
> Some of the times it's to make the product better, or better targeted.
Homebrew lead maintainer here. This is exactly why we do it. To repeat myself from below: we're chronically understaffed and underfunded.
> Furthermore, now that you know, for brew specifically, will you opt out?
It's worth noting: if everyone who uses a niche piece of software opts out, then when it breaks we'll look at our analytics and likely remove it from Homebrew instead of fixing it. This may be unfair but given the amount of work: we need ways to prioritise. Analytics have helped us do that so we can prioritise fixes on things that are critical for many people.
The team behind Brew is doing a great job and if analytics can help them do a better job, then so be it. Nobody is forcing you to use it and opting out is simple enough.
Does it have to be Google, the company that already sucks up the most information about pretty much everyone on the internet? Why don't you send the data to your own machines?
We don't have our own machines that are capable of handling this. If you're offering to host them for us for at least two years for free: get in touch.
Hmmm.... probably better to see how it turns out before jumping to conclusions. :)
From initial discussion with Mike, there are two parts that would be needed for a (self hosted) setup:
a) The hosting itself (this is something I can likely help with)
b) Setup and ongoing management of Piwik. This isn't a skill set I have (nor am interested in). However I'm generally thinking the Homebrew Community is big enough to rustle up at least a few people with the right skill set for getting this done (and keeping it done :>).
Trying to get both hosting + Piwik setup/mgmt from the same set of people may be a tall order, so doing them separately is probably more achievable.
As a data point, this kind of approach - sponsored hosting/hw/similar + volunteers to look after Community infra - is used successfully in other projects. PostgreSQL is a good example.
Can't really think of any obvious reasons it wouldn't work for Homebrew too. ;)
It's the cost of running the servers, the experience and time to run the servers and the time to adjust our analytics code to use the new system (which I'm willing to do).
Additionally, this (kind) offer is pointing to other organisations that may be able to host the servers and discussing potential solutions to the other options. It's not a slam-dunk by any means.
This is all done in my free time so I don't need an excuse to do or not do any of this.
µblock blocks and removes ads and trackers; µmatrix does the same for cookies and JS. Clean Links improves page loading times: when you click on a link on reddit or Google, it extracts the destination link and takes you directly to the destination without a redirect, so Google and Facebook don't know that you clicked the link.
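The link-cleaning part is simple enough to sketch. This is not how any particular extension does it, just the general idea, assuming the redirect wrapper carries the real destination in a query parameter. The parameter names below are guesses for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Query parameters that commonly carry the real destination in redirect
# links. This set is an assumption for illustration.
DESTINATION_PARAMS = ("url", "q", "u")


def clean_link(link: str) -> str:
    """Return the embedded destination URL if the link looks like a redirect wrapper."""
    query = parse_qs(urlparse(link).query)  # parse_qs also percent-decodes values
    for param in DESTINATION_PARAMS:
        for value in query.get(param, []):
            if value.startswith(("http://", "https://")):
                return value
    # No embedded URL found: leave the link untouched.
    return link
```

For example, `clean_link("https://www.google.com/url?q=https%3A%2F%2Fexample.org%2Fpage&sa=U")` gives back `https://example.org/page`, while an ordinary link such as `https://example.org/?q=cats` comes back unchanged. A real tool would need per-site rules, since a legitimate page can also take a URL as a parameter.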
>Yes we know. brew tells you that. And you can disable it.
No, we don't know. This article wouldn't have hit the front page if it wasn't subtle and user-hostile by default (opt-out).
>Data is being sent to Google every time you do almost anything on almost all websites, using the same technology. So what?
Brew is not a website, it's a CLI tool. People expect this behavior of websites, not of sysadmin tools. How would you feel if you found out your compiler shipped off a copy of your code to a remote server each time you encountered a compiler error "to improve the compiler" and you had to opt out of this behavior?
>Some of the times it's to make the product better, or better targeted. Some other times it's just for spying on the users.
It's always spying, regardless of the motivation.
>How do we expect open source to become better if everyone is being a crybaby because brew got a history of how often they do brew update?
Somehow open source has functioned all of these years without silently harvesting users' usage data. I don't know how new you are to open source, but this is definitely not the way you do QA for an open source project.
That's a bit extreme. A compiler sending your codebase has potential IP implications. Sending your brew commands back to brew is much different.
What commands do you run in brew that aren't specific to brew? Whereas, compiling code is a very specific-to-you scenario. If brew fails, it's likely to be caused by brew itself or a package that is part of the repository. If your compiler runs into an issue, it's not necessarily inherent to the compiler. I'm not a fan of brew at all, and I can't really find the malice here. Either opt out, stop using brew, or accept it. Why are you on OSX if you care that much about your package manager anyways? I've worked at many companies, and any that let you choose a MacBook will let you opt for a Linux laptop instead (YMMV I guess).
Command line tools do not usually do this at all, particularly FOSS command line tools. This is a very worrying development and reduces my trust in FOSS.
--edit-- to whoever downvoted me, can you explain why?
I love FOSS but I also love my privacy. I have come to generally have a default level of trust for popular FOSS projects. Things like this make me question that trust. Why is this wrong?
Please don't distrust a whole category because of a single tool. Most FOSS tools don't do this, especially if they're "Free Software" as opposed to just being "open-source", since "Free Software" indicates a moral, not just practical stance on software.
Well perhaps my assumptions about well-known and widely used open source stuff were just too naive.
I guess I'm not really thinking "distrust them like they actively include malware" just that it looks like I need to at least keep a look out for stuff like this.
I love FOSS. I believe in it, I use it all the time, I've made a few minor contributions here and there too. It's not like I'm saying "OMG FOSS is teh evil!", just that maybe my trust level was calibrated a little high :)
> Some of the times it's to make the product better, or better targeted. Some other times it's just for spying on the users. Let's stop complaining
These kinds of complaints are fine, I don't think we should stop them.
It's a bit like corruption in politics, it's not going to be solved anytime soon, but complaints do raise awareness and prevent abuses. Who knows what's going to happen if we stop complaining?
> Will you choose to make its progress slower because of no real security reason?
No real security reason? We can test that. The developer who added this feature can easily earn trust in such a statement by changing the disclaimer and making themselves liable if information extracted through this mechanism causes any identifiable harm.
If this is a sure bet, no risk, perfectly guaranteed to be safe, then such a change should be trivial.
It takes almost zero effort to implement Piwik instead of giving all your data to Google. A simple donate button will take care of the costs of a $15 per month shared server for this purpose.
Really that is all it takes. Everything else just goes against the fundamentals of open source software, which is, respect user rights.
This is going to sound insulting, but have you tried being an open-source developer?
I've worked on products with tens of thousands of users, and the vast majority of those users never communicate.
Worse (in my opinion), the users who communicate most tend to be experts, leading to programs tending to become better for expert users, and worse for beginners / occasional users -- see for example how many linux programs have options you can turn on to make them more user-friendly, but of course as a beginner how do you find them?
> Worse (in my opinion), the users who communicate most tend to be experts, leading to programs tending to become better for expert users, and worse for beginners / occasional users
On the other hand, not listening to users but analytics leads to dumbed-down inflexible programs that can't get out of their own way for expert users.
If you have "tens of thousands of users" and most of them don't communicate, I'd call that successful. They're satisfied.
If you look at Google Play Store you will see a lot of comments from non-expert users (you can easily see it from their messages). Maybe that is because writing a comment in Play Store is easier than subscribing to a mailing list many projects still use or registering in a forum.
Also maybe they do not say anything because everything works ok for them?
I’ve also worked on products with tens of thousands of users, and tracking is definitely convenient (I recently added a feature where, on first start, the user is informed that crash reports would be sent, and is required to choose "dismiss" or "opt out" before using the app), but it’s no reason to hide it like this.
Nor is it a reason to even try tracking with Google Analytics.
In my case, I ensure that the backend server is also FLOSS, that no problematic data is transferred, and so on. All FLOSS, everything verifiable by the users.
But running Google Analytics? Without prompting the user?
I don't think the problem is using Google or not. The problem is that software collects and sends data to a network without the user's authorization. It is a package manager, not a program for reporting what software you are installing, how exactly you do that and what operating system you are using, and the user might not expect such behaviour.
Free software is supposed to do what the user wants, not what its developer wants.
I can understand why this might upset people, but for me I hope it will make brew a better product. I'm sure we all know as developers how annoying it can be to not know the problems users have, and what they are using your software for -- that's why most websites have analytics.
I'm hoping to add something similar to software I work on, with an opt-out of course. I believe it will help me serve my users better.
Please make it opt in. If you truly believe people closely read the prompts for having their data harvested, a default to "No" should not impact you at all.
If you don't believe they read the prompts closely, you're an asshole for stealing data by default.
Is it "stealing data" to record that certain options in my program are never used, or used frequently, or whenever a user clicks on options A,B and C in sequence the app crashes?
My problem is that I believe that 90% of the users of my app won't care one way or the other if I record these stats. If I default opt-out these people, then I lose all that useful data and in the process, I believe, make my app worse for everyone.
I won't deny this is not an obvious choice, but I think personally, on balance, it's better to opt users in by default. Especially when the data is anonymised and doesn't contain any private/identifying information. But I understand how others would disagree. If there was some kind of OS-wide "don't record what I do" option I could hook into, I would use it.
Your belief may not match the real world. You can only know that by actually asking them.
Look at the outcry when Microsoft put this stuff in Win 10.
At the very least you should present a question on install. Hiding a notice amongst licenses and other information, as often happens, is underhanded and scammy.
>My problem is that I believe that 90% of the users of my app won't care one way or the other if I record these stats. If I default opt-out these people, then I lose all that useful data and in the process, I believe, make my app worse for everyone.
If those people wouldn't volunteer to opt-in, it's just as likely that the reason they are not opting-out is because they missed the notification that someone is collecting data about them.
It's a UX anti-pattern to default behavior to something the users may not want. If you're worried they'll accept whatever default there is, just explicitly ask them if they want to relay usage stats and you'll be surprised how much of the 90% you claim don't care will start caring.
Look at the comments on this thread, there are multiple accounts of people surprised by this. The very fact that this article is on the front page is proof that Brew tricked users.
>Especially when the data is anonymised and doesn't contain any private/identifying information.
You've either made massive breakthroughs in the field of information security or this is a bogus statement. If the user's computer even connects to an analytics service, they've already got an IP address, frequency of connections from that IP, etc and all of the correlation that their other data sets enable. Just because it's anonymized by the time it comes out of Google in Brew usage reports doesn't mean it hasn't given Google additional information to profile people.
You should, perhaps, question your beliefs. When I get "do you want to send...", I only very rarely click "yes/OK", and then only if that and the OS were the only things running and the problem is locally reproducible. The state of my machine is none of your damned business.
You could be right. I could pop up a message early on, saying I'm doing analytics.
I think, based on the conversation here, I'm going to make it very clear that there are analytics, but not allow turning them off. This is (I think, you are welcome to disagree of course) the best situation:
* Everyone knows there is analytics going on, no hiding.
* I get good quality analytics from my actual users, as no-one can opt out (well, you could start port blocking, I'm not going to make the program stop working in that situation).
I imagine it would get REALLY annoying if every command line program started doing that, every time you logged into a new machine.
Of course, that suggests we should introduce some kind of standard, then bash (or whatever shell you use) could prompt you once, and every other program could use that setting.
At least this would make the problem with how many commandline tools are tracking you visible, and perhaps would start a discussion on the necessity of tracking for tools like brew, ls, cp, dd and so on.
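For what it's worth, the shared setting suggested above could be as small as one environment variable that every tool checks before phoning home. A minimal sketch, assuming a variable named `DO_NOT_TRACK` (the name and the accepted values are my assumptions, nothing in this thread defines them):

```python
import os

# Hypothetical variable name: an assumption for illustration, not a standard
# that this thread or Homebrew defines.
TRACKING_OPT_OUT_VAR = "DO_NOT_TRACK"


def tracking_allowed(environ=None):
    """Return False when the shared opt-out variable is set to a truthy value."""
    env = os.environ if environ is None else environ
    value = env.get(TRACKING_OPT_OUT_VAR, "").strip().lower()
    return value not in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    if tracking_allowed():
        print("analytics would be sent")
    else:
        print("analytics suppressed by the shared setting")
```

Whether an unset variable should mean "allowed" (as in this sketch) or "not allowed" is exactly the opt-out vs opt-in question being argued here.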
This. A thousand times this. To me this whole thing feels like the maintainer is trying to justify something that most people won't ever accept as the status quo. Bad times when software this solid is going in the wrong direction. Time for a Homebrew fork?
No need for a fork: if someone were willing to put in the time and money to maintain a private anonymous analytics server, the Homebrew folks would be happy to use it.
Unless of course you are fundamentally opposed to tracking of any kind, then a fork is required. It is a fairly massive infrastructure to recreate however...
They can call it "anonymous data" as much as they want, but in the end it's still data transmitted to Google's servers. If you are simultaneously logged into your Google account on that computer, the information is not anonymous anymore. Google could with high probability correlate connections.
> They can call it "anonymous data" as much as they want, but in the end it's still data transmitted to Google's servers.
You appear to be saying that it's wrong to send any data to Google - which strikes me as an indefensibly extreme position. For example if they were tracking a simple count "number of times anyone anywhere has run brew" and it was stored as a single global integer then it's hard to imagine what issue you could have with it as it's basically no worse than a simple hit counter on a website.
So what are you saying? Surely the problem has to relate to what data is collected?
I'm saying that if you make your software talk to Google, don't say that it collects "anonymous aggregate data". The data it sends could easily be traced back to you (by Google).
And after all the Snowden revelations it's very naive to assume that Google (or any other company) will protect your data.
Doesn't that depend on what's being sent? Either it's personally identifiable (or at least de-anonymisable) or it's not. Within reason some things are not of any real interest to anyone.
Sure, but then the problem isn't the information they collect but the fact that it goes to Google. Short of them setting up their own analytics service, though, whichever service you send this data to, it'll be traceable to some extent. Google just might have a larger trove of data about you, but that's also through your own doing.
(Also, downloading a movie is a different thing - there you're not stealing nor exfiltrating anything, you're downloading data as intended by uploader, who may or may not have the copyrights for that data.)
(INB4, ripping a DVD and uploading it is not data theft. Exfiltrating a pre-release copy from movie company's servers would be data theft.)
By your definition, there would be no such thing as stealing IP, PII, launch codes or identities. However, those things are well accepted terms in all industries, including tech. You need to reconsider your strange restriction to mutual exclusion if you ever want to engage in meaningful conversations.
All inaccurate to varying degrees - or at the very least metaphorical/rhetorical uses. If you're allowed to push a meaning in one direction then I'm allowed to point it out and attempt to bend it back a little.
>push a meaning in one direction then I'm allowed to point it out and attempt to bend it back a little.
I'm not pushing anything. Your argument is about whether or not copying information is 'stealing'. I'm just pointing out that it's a very commonly agreed-upon vernacular across many industries. Call your credit card company and tell them someone stole your credit card number and see if they understand what you are talking about and/or try to correct you by saying it was merely copied.
Go ahead and try to change the definition, but don't waste time on HN with that crap when the subject isn't even about whether or not it's "stealing" because it makes for boring conversation.
You don't see anything even slightly problematic about referring to opt-out (rather than opt-in) aggregate and presumably anonymous usage statistics as 'stealing'?
You don't think that is possibly somewhere towards the further reaches of consensus on what is a reasonable application of the word?
You're welcome to disagree with my position but you're going a touch further than that. You're also accusing me of arguing in bad faith and being somehow aware that my argument has no merits. Considering I think I'm stating a fairly reasonable view I find that somewhat disingenuous in return.
I think you've tired of me but I want to finish with one last thought: "Choose your battles wisely" - this might not be the best place to make our last stand. There's a danger of tiring out potential allies before the point things really matter.
People here are talking about opt out vs opt in but there's a third option: Forced choice. This is where you present the two options to the user (without either prechecked / default) and the user has to choose one or the other to continue.
This avoids both the problem that the user doesn't consent with opt-out, and the problem that nobody cares to opt-in. The disadvantage is it doesn't work with unattended upgrades.
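To make the forced-choice idea concrete, here is a minimal sketch of such a prompt. The function names and the prompt wording are hypothetical, and this is not how brew or any other tool actually does it. The point is only that empty input is never treated as a default:

```python
def parse_forced_choice(answer):
    """Map an explicit answer to True/False; return None for anything else."""
    normalized = answer.strip().lower()
    if normalized in {"y", "yes"}:
        return True
    if normalized in {"n", "no"}:
        return False
    return None  # empty or unclear input is NOT treated as a default


def ask_forced_choice(read_line=input, write=print, max_attempts=3):
    """Ask until the user explicitly answers yes or no.

    Raises RuntimeError when no explicit answer arrives, which is what
    happens in an unattended upgrade where nobody is there to type.
    """
    for _ in range(max_attempts):
        write("Send anonymous usage statistics? Type 'yes' or 'no':")
        try:
            choice = parse_forced_choice(read_line())
        except EOFError:
            break  # no terminal attached
        if choice is not None:
            return choice
    raise RuntimeError("no explicit choice given")
```

The `RuntimeError` path is the disadvantage mentioned above: with no one at the keyboard the program has to either stop or fall back to some default, and then you are back to arguing about which default.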
Yes, one reason I like Linux compared to Windows is not having to click through "yes, yes, continue, yes, continue" in every installer. I don't want that brought back whenever I install a new machine / program.
Enabling analytics that send whatever information to Google. Yes, I can opt out, but often times I just don't have the time to verify the privacy policies of each and every piece of software I use.
Opt in may be "more honest", but you're going to see a lot less useful data as a result of it, because the pool of people who will opt in is smaller than the pool of people who wouldn't care what the default is.
Let's assume that 90% of people don't bother to turn off telemetry and 10% do. That also means that in your preferred scenario, 90% of people wouldn't bother to turn it on and the other 10% wouldn't turn it on either. If you can't be bothered to turn something off, why would you be bothered to turn it on?
That means that you'll either get something close to 0% telemetry if it's default off, or 90% if it's default on. So, it makes sense to be default on if you want your telemetry dataset to be big enough to be worth it.
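The arithmetic behind that can be written down as a tiny model. This is only a sketch of the assumption in this comment (indifferent users keep whatever the default is, everyone else keeps telemetry off); `opt_in_rate` is an extra knob I added for the case where some indifferent users would switch it on if it defaulted to off:

```python
def share_sending_telemetry(indifferent_share, default_on, opt_in_rate=0.0):
    """Fraction of all users who end up sending telemetry.

    indifferent_share: fraction of users who never touch the setting.
    default_on:        whether telemetry is enabled out of the box.
    opt_in_rate:       fraction of indifferent users who would still turn it
                       on when it defaults to off (assumed close to zero).
    """
    if not 0.0 <= indifferent_share <= 1.0:
        raise ValueError("indifferent_share must be between 0 and 1")
    if not 0.0 <= opt_in_rate <= 1.0:
        raise ValueError("opt_in_rate must be between 0 and 1")
    if default_on:
        # Indifferent users keep the default; the rest switch it off.
        return indifferent_share
    # Default off: only indifferent users who bother to opt in send anything.
    return indifferent_share * opt_in_rate


if __name__ == "__main__":
    print(share_sending_telemetry(0.9, default_on=True))   # default on
    print(share_sending_telemetry(0.9, default_on=False))  # default off
```

With the 90/10 split assumed here, default-on gives 0.9 and default-off gives 0.0, which is the whole argument in two numbers. The 90% figure itself is an assumption, not a measurement.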
But then we get to your question - why bother with an opt out? Well, if the decision is either to use the product and send telemetry or not use the product, that 10% of people aren't going to use your product. They care about not sending telemetry that much.
At that point, as a dev you just have to ask yourself is it really worth losing 10% of your users to an always-on telemetry policy or is it OK to make the concession of allowing an opt out in order to grow your user base?
Personally, I'd rather make the concession and get that 10% of people on board. If they're vocal enough to complain about the shortcomings of my product they're probably vocal enough to also talk about the good things and give me free advertising.
> If you can't be bothered to turn something off, why would you be bothered to turn it on?
If people aren't turning telemetry on, do you think they really want to send you data in the first place?
What you are doing is exploiting users' assumptions of how normal CLI tools behave. They don't assume a tool is going to relay their information back to Google when they use it.
Your entire argument essentially boils down to, "I'm sure I can get away with taking my users' information by making it the default behavior with a buried notification and I can placate people who care about privacy with an opt-out."
Your entire justification doesn't even mention privacy or caring about users; it only mentions dealing with pesky users who care, in order to grow your user base and promote your product. You clearly have very loose ethics when it comes to privacy so I don't think there is much we will agree on. I just hope one day this behavior will be shunned enough that it will stop due to market forces before something like the EU regulates it away.
If people aren't running make with -j, do you think they really want their build to take advantage of all cores in the first place?
My argument is not specific to telemetry, it's a general one. If you have an option to do something that's not a default and it's not part of the software's core functionality, most people aren't going to consider it even if it would be to their advantage. For example, make -j.
It's for that reason that I don't think the "if they aren't turning it on then they don't want it" argument holds as much water as you think it does. That argument groups together three groups of people: people that know about the setting and don't want it on, people that don't know about the setting and don't want it on, and people that don't know about the setting and would be happy if it was on.
Ironically, if you had good telemetry you'd be able to figure out how many people fall into each group and make decisions about settings based on accurate data. Without that, you're forced to work on assumptions.
> Your entire argument essentially boils down to, "I'm sure I can get away with taking my users' information by making it the default behavior with a buried notification and I can placate people who care about privacy with an opt-out."
I think you're making the assumption that telemetry has to violate your privacy in all kinds of heinous ways and therefore only be a bad thing. If that's your mindset, of course you're going to think that I'm the kind of person that's out to trick and fool people and betray their privacy. And in all fairness to you, it's reasonable to be jaded when companies like Microsoft have horrible things like P2P software update distribution enabled by default. It's reasonable to be jaded when you don't get told exactly what kind of data is being sent back as telemetry. There's a fine line between "this is good" and "you're just relying on people not knowing how to change the defaults in order to try and get away with horrible things" and all too often that line is crossed.
But I'm an idealist. I see the good things that can come out of having telemetry. I want to know if my software has started getting popular in locales that I haven't written translations for yet so I can commission a translation to improve the experience for people in that locale. I want to know if there's a setting many people turn off so that I can consider turning it off by default to match user expectations. I want to know if my users are sticking to older OS versions because if they are I need to keep older hardware around in order to test and provide them the best possible experience.
I don't think anyone would have an issue with software sending back that data (and only that data) if you clearly tell them that's happening, and I also think that most people would be perfectly fine with that being a default behaviour. Of course, there is always going to be a group of people that will have an issue with sending back that data and that's why I made the point about keeping it as an option.
That group isn't just "pesky users who I only care about for promotion" (perhaps I was too flippant about saying that in my original comment). They could be trying to harden a machine so that it only uses the network under known circumstances. At the same time, you'd hope that software intended for use by people who need comprehensive privacy like whistleblowers wouldn't have telemetry at all.
Given that OS X isn't 100% free software, no one should be using brew in that kind of comprehensive privacy situation and so having telemetry isn't inherently bad.
It's the "clearly tell them that's happening" bit that's the issue here. If you have software that has been around for a long time that doesn't do something and then in a new release it starts doing that thing, you need to let people know that! It doesn't matter whether that's telemetry or anything else - if it's something that violates existing expectations you need to tell your users loud and clear.
If you listen to Changelog #223 podcast[0], both Mike and the podcast author showed quite a lot of derision and condescension to the people caring about analytics upload, so I don't think it's getting fixed.
Awesome, so not only does he not care about privacy, he feels the need to deride people that do?
Maybe the reason they needed analytics in the first place is because of this myopic perspective. If you assume everyone who disagrees with you is an idiot, you're going to stall really quickly.
If they want data, I will happily fill out their survey form, but please don't take it for granted that my personal data is yours; otherwise I may go to China for their Great Wall.
Little Snitch is excellent for this. I remember running brew at some point and being asked by Little Snitch if it could connect to Google. I said "deny forever" and have never worried about it since.
The way I accomplish this is by setting up a dnsmasq DNS forwarder service. Find a domain list for ad blocking / privacy on the internet and add its entries to dnsmasq's config.
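For anyone wanting to try the dnsmasq approach, a minimal sketch of turning a downloaded domain list into config lines. dnsmasq's `address=/domain/ip` option answers queries for that domain (and its subdomains) with the given IP; the function name and the example domains below are placeholders of mine, not a real blocklist:

```python
def dnsmasq_block_lines(domains, sinkhole="0.0.0.0"):
    """Turn a list of domains into dnsmasq `address=` lines.

    Blank lines, '#' comments and duplicates are skipped. Every remaining
    domain is pointed at the sinkhole address.
    """
    seen = set()
    lines = []
    for raw in domains:
        domain = raw.strip().lower().rstrip(".")
        if not domain or domain.startswith("#") or domain in seen:
            continue
        seen.add(domain)
        lines.append(f"address=/{domain}/{sinkhole}")
    return lines


if __name__ == "__main__":
    # Placeholder domains; substitute the list you actually downloaded.
    for line in dnsmasq_block_lines(["tracker.example", "ads.example"]):
        print(line)
```

The output can be appended to dnsmasq's config (or dropped into a separate file it includes), followed by a restart of the service.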
> As far as we can tell it would be impossible for Google to match the randomly generated Homebrew-only analytics user ID to any other Google Analytics user ID. If Google turned evil the only thing they could do would be to lie about anonymising IP addresses and attempt to match users based on IP addresses.
Look, by now Google knows my whole stack and what I'm doing with each project, and Homebrew using Google Analytics doesn't bother me much. I still disabled it a while ago, and after reading "if Google turned evil" presented as a remote possibility I won't turn it back on. There are two extremes in tech: paranoia about everything and magical optimism about everything. This is an example of the latter.
A corporation of Google's scale and relevance to different industries simply cannot afford to not be evil.
Doesn't mean they're absolutely evil and just waiting for an opportunity to partner with a hypothetical fascist government — just that you can't assume the best.
>Damit gilt in Deutschland der Grundsatz, dass die Erhebung, Verarbeitung und Nutzung personenbezogener Daten und die Auswahl und Gestaltung von Datenverarbeitungssystemen an dem Ziel auszurichten sind, so wenig personenbezogene Daten wie möglich zu erheben, zu verarbeiten oder zu nutzen.
Rough translation: Thus in Germany the principle applies that the collection, processing and use of personal data, and the selection and design of data processing systems, must be guided by the aim of collecting, processing or using as little personal data as possible.
Here, someone now has a list of software installed on various machines and the versions of that software, and whatever other information. It may be enough data to identify me or other users, the computers and the installed software. It may also allow other uses, which I have no idea of.
It's likely that this information spreads to unintended places.
I don't think this is a good idea.
I never liked the idea of using Homebrew, and this makes it even more suspicious.
This is google analytics. It's a bit different than google directly taking the data and using it for google purposes (e.g. streetview cars gathering SSID information for location pinpointing). The data (afaik) is used only by the homebrew team instead of Google.
A more accurate summary would be "Brew is sending usage data to a google owned analytics service".
The data still goes to google. While the expectation would be that the data is only used by homebrew, there's no actual way of knowing that google isn't gratefully helping itself. And given the entire way Google's business works, they probably are.
Given the number of people who work for Google, and Google's internal culture of openness, I find it extremely unlikely that Google is violating its customers' privacy. That would almost certainly get whistleblown.
That culture of openness does not exist the way you think when it comes to projects. People are very secretive about lots of projects for all kinds of internal political reasons. Using data they technically have the right to use in a somewhat scummy way is a perfect example of a project that would be kept out of the spotlight.
Google never cooperated with PRISM. As soon as news came out that the NSA was tapping Google's intra-datacenter dedicated lines, Google announced they would accelerate encrypting all that traffic.
If Google didn't correlate analytics data from multiple sources, it would be about as informative as a local Piwik install, which is to say "not much".
Perhaps they don't "use" it directly, but surely they use it to improve their statistical profile of the specific users? Does anyone know if the GA cookie sent by brew is the same as, or correlated with, the one in the browser?
From the source code it looks like brew uses a UUID generated locally (e.g. from a source like /proc/sys/kernel/random/uuid). I don't see how it could be correlated with anything in the browser.
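To illustrate what a locally generated random ID means, here is a minimal sketch. It is not brew's actual code, and the function name is made up. A version-4 UUID is built from random bits only, so the ID by itself carries no hostname, MAC address or timestamp:

```python
import uuid


def new_analytics_id():
    """Return a random (version 4) UUID as a string.

    The value is derived from random bits only; nothing about the machine
    or the user is encoded in it.
    """
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(new_analytics_id())
```

That only covers the ID. As pointed out elsewhere in this thread, the request that carries it still arrives from your IP address.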
Opting out [0]
Homebrew analytics helps us maintainers and leaving it on is appreciated. However, if you want to opt out of Homebrew's analytics, you can set this variable in your environment:
export HOMEBREW_NO_ANALYTICS=1
Alternatively, this will prevent analytics from ever being sent:
brew analytics off
I've seen this crop up a few times recently and a vocal minority seems to panic every time. I ran into it the other day installing pm2 w/ npm which downloads an optional package that they use for analytics
https://github.com/Unitech/pm2/blob/master/package.json#L184
Is the problem that they are using google's service? Is it the tracking that people don't like? Is it the messaging/copy to the users? I'm sure there's plenty of people on hacker news that build analytics software for a living. I've worked on email click and open tracking. Doesn't seem terrible if it helps them build their product.
You know... you can already get analytics if your mirrors just agree to share logs. It may not be as detailed or complete, for sure, but using another analytics tool does feel a little superfluous.
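As a rough idea of what log-based stats could look like, a minimal sketch that counts successful downloads per path from access-log lines. The log format (a common-log-style request field followed by the status code) and the example paths are assumptions of mine. I have not looked at what brew's mirrors actually log:

```python
import re
from collections import Counter

# Assumed line shape (hypothetical):
#   203.0.113.5 - - [date] "GET /some/path HTTP/1.1" 200 1234
REQUEST = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def count_downloads(log_lines):
    """Count successful (status 200) GET requests per requested path."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "200":
            counts[match.group("path")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [01/Jan/2017:00:00:00 +0000] "GET /pkg/wget.tar.gz HTTP/1.1" 200 1234',
        '203.0.113.9 - - [01/Jan/2017:00:00:05 +0000] "GET /pkg/wget.tar.gz HTTP/1.1" 200 1234',
        '203.0.113.9 - - [01/Jan/2017:00:00:09 +0000] "GET /pkg/missing.tar.gz HTTP/1.1" 404 0',
    ]
    for path, hits in count_downloads(sample).most_common():
        print(hits, path)
```

This gives download counts but nothing about install failures, which is the "not as detailed or complete" part. The mirror logs also still contain IP addresses, so it moves the privacy question rather than removing it.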
> Is the problem that they are using google's service?
This is part of it I think. The brew devs get stats, but so does Google which they can aggregate across everything. Standing up your own analytics server/cluster, especially for a project the size of brew, wouldn't be trivial and I can see why they leverage Google's service.
But I can understand people not wanting to have data sent to Google. I moved everything off Gmail to my own E-mail server back in 2013 and moved search to DuckDuckGo.
Does Google simply have access to your analytics data to do whatever they want? I was under the impression that they could only use it in ways that you, the account holder, define.
It depends on what you mean by "access to your analytics data." Anonymized, aggregated data is being used to inform product decisions, but engineers don't have direct access to the production database to query whatever they want. There are almost no cases where you'll be granted approval to look at real user data, and if you are granted access, 100% of your activity is monitored.
As a side note, this has been the case at every big company I've worked at, not just Google.
Privacy erodes one complacent action at a time. Every time people secretly (if it weren't subversive the default would be opt-in) funnel information off to Google or some other massive aggregation point, we should raise hell rather than bending over and accepting this hostile behavior.
This really doesn't concern me at all. I don't feel I am harmed by Google possibly knowing what software I install, because if anyone wants to look I keep the script that installs it in my public dotfile repo on GitHub.
Can anyone explain to me how this could be used against someone? I'm asking completely seriously as I don't see any harm in this no matter how hard I try.
Edit: I do agree that the notice should be more visible for those that are concerned about this. It should require user input even if it is just waiting for a key press to give you a chance to read it along with info on how to opt out.
> Can anyone explain to me how this could be used against someone? I'm asking completely seriously as I don't see any harm in this no matter how hard I try.
This is called the "nothing to hide" argument. If you currently have no foes and are generally closer to the elite of a society than to the margins, then these few little data points collected on you (pseudo-anonymously or not) cannot really hurt you.
But as your digital shadow grows (and with Google Analytics et al. being used so extensively, it certainly is) and you drift to the margins of society, possibly developing some foes in the elites in the process, the information about you that is now available to people in power becomes more threatening for you.
Say, you're now a black, female, delivery driver in Detroit instead of a Silicon Valley software engineer. Once you give cause for scrutiny, say, a conflict with your employer, the digital shadow of data-points can be searched to find anything that, taken out of context or not, can be used against you.
Or, put differently in famous exaggeration by Cardinal Richelieu:
If you give me six lines written by the hand of the most honest of men, I will find something in them which will hang him.
It harms everyone because it's one more actor trying to make collection of data look normal, as can be seen from the multiple misguided posts wondering why this is a problem.
It's not normal that software is sending data to some server somewhere by itself by default. Devs used to have some decency and left this opt-in. They see there's little push-back and they're making it opt-out.
Almost all mobile apps and a significant number of desktop apps have analytics now.
I can no longer edit my comment, but thanks for the examples given. I'm not sure if it's changed my views but I think I understand where the anger is coming from and I have something to think about when things like this come up.
Quite honestly, I don't remember brew asking me this on any of my 3 macOS machines, I'm not saying it didn't, but the fact that I'm a 'techie' person and didn't notice or forgot about it worries me greatly.
It probably didn't. I run brew regularly on my machine, and I did not get any prompt asking me to enable analytics. From reading other comments, I see lots of users are in the same position, and homebrew sneaked this in with just a two line notice buried in the verbose output. This worries me deeply, to the point where I'm considering removing this software from my machine.
While I obviously feel the same way, I'd take a step back and see what Brew's response is to this thread and the backlash over what people do and don't want, because it has, overall, been a good thing for macOS.
You're right, we probably should be using Little Snitch — just know that it's not a golden ticket! There has been some research about bypassing Little Snitch, for example this one from 2016:
And that has been fixed. It could have been announced in a better way, but it is fixed. Are you suggesting people should not use Little Snitch at all? Has the tool lost all credibility forever?
It's looking a bit hypocritical here, to be honest. If you use websites on the web today, you are sending data for sure unless you use an adblocker that blocks analytics as well. Brew collecting data this way is unethical, but complaining about these things these days is not going to change anything, to be honest.
I would be glad if the people who complain about this didn't use any websites that collect data without giving any control to the user. But we all know the real deal.
No, don't take it the wrong way. But what are you fighting against? The entire web? You can, but you have to understand that the scale is so massive that we can't do anything unless it becomes a massive movement throughout the world. But really? A movement for privacy? No chance. It's simply a first world problem. In my country we have around 22,000 people dying every day of hunger. Privacy won't even be heard here. So it's near impossible.
I don't think we can assume that. My colleague sitting right next to me uses HN often and brew as well. He doesn't have an adblocker and watches youtube with ads. FFS!!
There are many good reasons to run macOS. But if you run Linux instead of macOS, you will get a package manager (one supported by your distro) that performs most (if not all) of the functionality of brew. I find it frustrating to see people use macOS and then rely on, or complain about, brew.
I distinctly remember being asked about this ages ago, and finding it easy to disable, or is this something else? Is it still using google analytics when you've asked it not to?
There is no way that I would want to have this data and this amount of data, recorded and transferred to google. Especially not about software installation on my computers or computers of friends or clients.
Thanks to craplications like NPM etc... I'm sure that certainly has been the case. As I mentioned elsewhere, I can't remember being prompted for this on any of my three machines, but I bet I was and forgot.
Privacy-respecting defaults should be there so that you don't have to spend your whole day reading prompts and ToS looking for that one sentence that fucks you over.
Genuine curiosity question: Are all analytics a violation of privacy? If brew simply sent a message that said "brew was used" with no PII, would that matter? At what point is it an issue?
Honestly, my reaction to that would be "why the fuck is it doing that", possibly followed by a firewall rule or a comment in the source code and rebuild. I'm annoyed when an application connects to the Internet for reasons unrelated to the task it's performing for me.
That said, even a ping to the mothership is identifying information: at minimum, it already contains your IP address. Whatever additional data may be added to that ping will likely also reveal details about the machine you're using.
Personally, I'm not a privacy nut; I mostly don't care - I leave plenty of information about myself around. But the thing is, I'm doing it voluntarily and as a primary function of some service. I don't like when applications exfiltrate data for reasons not related to their purpose.
They are blatantly admitting they are stealing user data then if they realize most users aren't reading the prompts and accept the defaults (data harvesting in this case).
OMG grow up, you use Google every day, you have Facebook, Twitter and Maps on your phone, but suddenly you worry because the developer of brew needs some metrics! The only thing missing now is the Snowden guy to make a blog post about it.
I think you should grow up and realize that not everyone conforms to your worldview. As an example, I don’t use Google every day (I use DuckDuckGo), I don’t use Facebook, Twitter, or Google Maps, nor do I have a phone (can’t justify the expense when I have Wi-Fi access most of the times I need it).
As echoed by many others above, I never gave Homebrew consent to send any information to Google; it should be opt-in. Frankly, I was quite annoyed after updating Homebrew and found it already trying to connect to Google Analytics immediately after. Thankfully, I was able to block the request with Little Snitch and then opt-out of the analytics, but it certainly left a bad taste in my mouth that they’d start collecting data before even letting me opt-out first!
“But the plans were on display…”
“On display? I eventually had to go down to the cellar to find them.”
“That’s the display department.”
“With a flashlight.”
“Ah, well, the lights had probably gone.”
“So had the stairs.”
“But look, you found the notice, didn’t you?”
“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.’”
I'm extremely pissed off by the way they snuck the prompt in between several hundred lines of shell output on update. This is NOT being prompted.
I can live with the analytics. I might even consider leaving it on IF BEING ASKED PROPERLY.
All previous discussion says "oh you get notified". The point is, you don't get properly notified. And this is why this is news again after the old discussion on HN.
The point here is that, like that Ubuntu search we all hated, this is not behaviour we expect from a well-behaved FOSS application, notice or otherwise.
Google collect obscene amounts of info on everything from government interactions to software preferences. So far it's largely been confined to the web and could be blocked by using an ad or tracker blocker.
Now that software may or may not do this stuff too... it ups the game and the threat.
This will make me less likely to use Brew. Likely no great loss to the team I'm sure.
They do prompt for this. And the document that details this is very descriptive. They have also specifically called out the files used for analytics and even have an environment variable to echo the analytics information.
I think this is paranoia.
And thanks a lot to Max for making this great piece of software.
> This is essentially your pathetic argument. You should be ashamed of yourself
This breaches HN's civility rules badly. You can't comment like this here, and we ban users that do it, so please don't do it again.
Arguing this way is an old internet trope, but it's one that we need to be vigilant about not reviving here, if we want to have civil, substantive discourse instead of decadence. If you have a good point to make, it's not only possible to make it civilly, it will be stronger for it. Calling another user a "useful idiot" is also not ok (https://news.ycombinator.com/item?id=13035349), euphemism or no.
It's clear that you feel strongly about this issue, but not allowing strong feelings to trump civil interaction is the essence of civil interaction.
> Your pathetic argument. You should be ashamed of yourself
I'm not giving a free pass simply because it's open source. Rather it's an informed decision that I'm able to make because they're completely transparent about what limited metrics they aggregate.
Personally, I'm broadly in favour of them collecting basic metrics, their motivations for doing so, and the approach they've taken.
> He might be a nice target if you want to raise a stink and litigate.
I read your explanation downthread that you didn't mean anyone should do this, but the sentence as written is beyond the pale. This is not good community behavior. Please do a better job of considering other people when you post here.
> Looking at the website one of homebrew's developers (or at least the website developers) is based over here in Europe. He might be a nice target if you want to raise a stink and litigate.
Homebrew lead maintainer here. We're chronically understaffed and underfunded. We have analytics so we can make Homebrew better by figuring out how to prioritise security, maintenance and bug fixes on packages based on how much they are used and how often they fail. I've been working on making regular dumps of this data and have been trying to find people who will provide us with non-Google hosting but: we're chronically understaffed and underfunded.
One of our CI machines burnt out last night so I'm now trying to plan vacation time to drive for a 14 hour round trip and pick it up. While I'm doing that someone points out there's someone on Hacker News threatening to sue me. This is for working on software (mostly) in my free time that's used by people (mostly) for their employment and obtained for free. After the last thread on this (https://news.ycombinator.com/item?id=11566720) I had people emailing me to tell me I'm a nazi and should kill myself for weeks.
Everyone here who objects to the analytics is well within their rights to do so (and should either opt-out or stop using Homebrew) but we need to start asking ourselves as a community whether we think treating open-source maintainers like this is more or less likely to encourage more people to work on the open-source software you rely on to do your job.
Couldn't agree more: the shoot first and ask questions later treatment of OSS maintainers needs to stop right now. People who make use of OSS projects need to check their sense of entitlement at the door. People don't realise that a lot of these really valuable projects are run on next to zero budget by one, or a very small handful, of core contributors, who give up their own time to do so.
If things you don't like happen with a project, start a dialogue, fork it, or use something else instead. Don't just wade in making threats.
(Sadly, this has been going on for a very long time: I remember the debacle over .NET 2.0 support in NDoc where the author had some difficult personal circumstances coupled with people constantly hurling abuse at him. Net result: he understandably quit, not just NDoc but all OSS involvement, nobody else wanted to pick up the project, and the project died. This must have been 11 years ago now but it seems like, since then, the problem has if anything become worse.)
Treating your users as idiots is one thing. Users treating you as an idiot is another thing. Neither one of these conditions is acceptable.
As a long-term homebrew user, I am very appreciative of your hard work (and have even contributed here and there), but in that light I am also very disappointed that I have to manually disable analytics and now have a lot less trust in the homebrew project.
In the same way that I would complain if I were to see a bug in your code that radically affects my production, I consider this invasion of privacy - whatever the justification - worthy of complaint, too. Whatever the justification for this, it's not an acceptable path to take, and I strongly urge you to consider making Analytics opt-in, and not opt-out. I have no problem with my next 'brew update' pausing, explaining the situation rationally as you have in this post, and giving me a y/n opportunity to proceed with participating in analytics. I do have a serious problem with you assuming that this is okay, not informing me of it in a fashion that makes me aware of the tracking, and just letting it slip in during an update, which I am now even more suspicious of, even though I do them religiously every week, and have done so for years.
You've just made your product less attractive to me as a developer and user. That's a sad state of affairs, and I hope you fix this bug.
> we need to start asking ourselves as a community whether we think treating open-source maintainers like this is more or less likely to encourage more people to work on the open-source software you rely on to do your job.
Exactly this. The homebrew team has done an amazing job for the past 7 years, all free and volunteer work. People threatening to sue the maintainer are poison for the open-source community. But hey, haters gonna hate.
If I understand correctly, the data is anonymized before it is sent to Google [1]. How many people worrying about privacy download from Github using a proxy/vpn solution to hide their IP?
I've listened to the Changelog podcast about Homebrew and first learned about Google Analytics there. I think people do not realize how big homebrew is (in terms of usage and traffic). At this scale, the maintainer opted for collecting data anonymously in order to find bottlenecks and improve the entire software stack. You can disable that feature in the same way you would disable cookies in your web browser.
> I'm now trying to plan vacation time to drive for a 14 hour round trip and pick it up
From a happy and grateful Homebrew user, thank you for all the work you are doing. It's really easy for folks to underestimate how much effort is involved in running a project like this. Details like the above should be a reminder that it's clearly a very significant amount of hard, sometimes thankless, work; work which you are in no way obligated to do. There are large numbers of people who owe you a massive debt of gratitude for all the time and hassle that your effort has saved them. And it's quite sad that we (people) tend not to speak up when we are happy, and only make noise when we're upset about something.
It's surprising to hear that a package manager needs local analytics to figure out which packages are used.
On the other hand it was likewise surprising to hear that homebrew was using github as its CDN.
I don't rely on homebrew to do anything, but I am concerned about the normalisation of surveillance and will point that out in a reasonably polite manner.
As both a GitHub engineer and Homebrew lead maintainer: that's not true. You're perhaps thinking of CocoaPods where there were some problems which have been addressed through repository reorganisation work on CocoaPods' side and a new API that I added to GitHub for Homebrew that makes it easier to avoid no-op `git fetch` runs now.
Just wanted to say thanks for all your hard work, it's appreciated by many. Maybe this wasn't the best choice, but missteps happen. Sadly, we'll never know how it might have been dealt with if people were able to take a deep breath and open a real, constructive dialogue about it rather than immediately freak out and make threats against you, forgetting (or simply not caring) that, by and large, people like you have been doing them huge favors while demanding nothing in return. Pathetic.
It's easy to get lost in all the hate, as those groups tend to be more vocal. So I'm writing this here just to do what I reasonably can to stand up against all the negativity. I feel so helpless just reading all these comments.
Homebrew is excellent software, and the malice expressed in this and similar threads does in no way reflect (my) real world experience of people's perception of Homebrew. I would go so far as to say it's probably the best piece of software that I know, it's really easy to use, always works and is helpful if something goes wrong. Good software is pretty rare to find, especially open source software with no commercial backing.
There is some kind of dispute about privacy, with commercial companies wanting to collect as much data as possible and some people opposing it. Is not your project picking the wrong side here? This can be used as an argument later: "see, even popular open source projects do this".
I think that giving up on users' privacy is not an acceptable solution. If you don't have enough time to fix all the bugs and nobody is willing to help then nothing can be done.
What I expect from open source software is that an application should not collect and send home any data by default (I understand that it is not written in the license but still expect this). Breaking this expectation makes me and maybe not only me a little disappointed.
> There is some kind of dispute about privacy, with commercial companies wanting to collect as much data as possible and some people opposing it. Is not your project picking the wrong side here?
As we're not providing information to any commercial company or selling it I don't think this is a reasonable comparison. If we were collecting information to sell and raise funds for the project I'd object to that, too.
I'm not sure it would be as contentious if you were running your own analytics software. But some more privacy-aware users (myself included) have serious reservations about Google's increasingly pervasive data collection.
The fact that Homebrew uses Google Analytics means that it provides Google with more data, even if you don't hand that data over to other companies after the fact.
We've tried extremely hard to ensure that Google cannot tie this information to other information. It uses a randomly generated UUID that you can change at any time and we request IPs be anonymised using the documented method (and I see no evidence that Google lies about this). When there's an alternative that is provided to us for free, can handle our load and stores similar or less personal information to Google Analytics we'll consider it. So far there's been a few offers of help but none of them have met those requirements.
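For readers unfamiliar with the mechanism described above, here is a minimal sketch of what an anonymised hit could look like. It is not Homebrew's actual code: the function names and the tracking ID are hypothetical, and it assumes the Google Analytics Measurement Protocol parameters `cid` (client ID), `aip` (anonymise IP) and the event fields `ec`/`ea`/`el`. Nothing is sent over the network; the sketch only builds the payload.

```python
import uuid
from urllib.parse import urlencode

# Hypothetical placeholder, not a real tracking ID.
TRACKING_ID = "UA-XXXXXXXX-1"


def new_client_id() -> str:
    """Return a random (version 4) UUID that carries no user information."""
    return str(uuid.uuid4())


def build_hit(client_id: str, category: str, action: str, label: str) -> dict:
    """Build the parameters for one anonymised event hit."""
    return {
        "v": "1",            # Measurement Protocol version
        "tid": TRACKING_ID,  # property the hit is reported to
        "cid": client_id,    # random UUID, can be regenerated at any time
        "aip": "1",          # ask Google to anonymise the sender's IP
        "t": "event",        # hit type
        "ec": category,      # event category, e.g. "install"
        "ea": action,        # event action, e.g. a formula name
        "el": label,         # event label, e.g. an OS version
    }


if __name__ == "__main__":
    hit = build_hit(new_client_id(), "install", "wget", "macOS 10.12")
    print(urlencode(hit))
```

Note that `aip` is only a request to the receiving side: the sender's IP address is still part of any HTTP request, so this relies on Google honouring the flag.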
Do you guys have a general concept of monthly cost for e.g. DigitalOcean or similar, just to handle the hosting and bandwidth for a Piwik backend or something similarly not Google? At your scale I'm sure it is well beyond anything I could afford to subsidize given my present responsibilities, but this might possibly be an easier conversation to have in general were it more widely known what kind of costs we're talking about here - I get the impression that part of the GA-related upset may just have to do with the fact that people don't really understand the magnitude of the problem under discussion. (Or I hope so, anyway.)
Google's business is tying information to other information in ways people don't expect, so your "trying very hard" seems likely to fail. Plus, it seems like you're not trying very hard:
> It uses a randomly generated UUID that you can change at any time and we request IPs be anonymised using the documented method...
You already have a UUID. Why give Google the IP at all? Of course, I don't understand your desperate need for "analytics" unless it's a need for approval. Why can't you take all the blog posts talking about "brew" as approval enough?
I realise it's difficult to remember when faced with messages like this, but there are many of us that use Homebrew, accept the tracking, saw the message, and greatly appreciate all of the work you and the team put in.
Haters will always hate. Keep up the good work guys! People seem to think they are entitled to something when using open-source software, but they're not. It's still the people working and maintaining it who get to take the decisions, and if you guys think you need analytics to provide a better tool, go for it.
I also completely understand the need for default opt-in. I probably wouldn't check the prompt and leave the default, but for a project like this, I honestly don't mind sending usage data.
If you're not happy with the direction taken by the maintainers, find another tool.
A well funded, well staffed project looks very different than an under staffed, under funded one. They have to make harder decisions about what work to prioritize. I think what Mike is saying just emphasizes why they made this choice. What little work is able to be put into homebrew needs to be as targeted and helpful as possible. Analytics help them do that.
So to answer your question, the status of the project and his personal affairs have a huge impact on design decisions.
I'm just saying that you're putting yourself on risky ground, not that I want to do something personally.
In fact I'd even consider keeping GA enabled IF BEING NOTIFIED.
The current notification is the problem, as it is close to invisible. Hiding output in a long log message is a dark pattern, though I think it did not happen out of spite (you just used your normal shell output mechanism to show that).
However: Such a change of policy is more important than the list of files being removed by git. Even opt out is OK IF THE USER IS INFORMED.
Currently it's like Arthur Dent being informed about the highway being built ... yeah we did display the notice.
I know building an open source project is no fun and I can really relate to your desire to know more about installations and deinstallations. Personally I don't even object to using Google Analytics, just to the way it feels like it was snuck in under the radar.
> Looking at the website one of homebrew's developers (or at least the website developers) is based over here in Europe. He might be a nice target if you want to raise a stink and litigate.
This certainly sounds like you're advocating for people to - at the very least - harass him.
> He might be a nice target if you want to raise a stink and litigate.
* That Phone in your pocket? "No problem"
* The browser you're using right now? "This is fine"
* Your internet-enabled TV? "All good"
* Open source project collecting usage metrics? "Woah, hold on there buddy"
Get some perspective.
They're completely transparent about which very limited usage metrics they use. You can inspect exactly which metrics are sent. And you can turn off these metrics if you want.
You're not paying anything for the product, and the analytics aren't being used to target you as a consumer, or to pitch advertising at you.
Simply by reading this page on hacker news you've sent more information to a third party than you would using brew.
If you have a judge who is not completely dumb he would laugh you off for such litigation. If you have a judge who still has his staff print out internet pages for him it might get complicated for you as a contributor. An idiot judge already happened to me, that was an expensive lesson life taught me.
So no, I wouldn't risk sticking my name on something like that.
Just because everybody else does it doesn't mean you'll get away with it. (And over here the big boys Google/Apple can do a lot of things the small ones can't even begin to try - they have the advantage of getting before a "high enough" court with intelligent judges)
I think the valid concern mentioned in the original comment is that the notification of this is not prominent enough.
Also, ideally, I'd suggest you should be looking for informed consent before starting to gather information of this nature where an application previously did not do it.
Something similar to Debian's request to submit data for the package popularity ratings that is provided when you install Debian might make sense.
Debian's popularity contest package gets it right. You're asked once during installation, with the default being to decline. If you keep choosing "next" without reading or thinking you get a working installation with your privacy intact, and it never bothers you again.
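As a rough illustration of the "ask once, default to decline" behaviour described above, here is a minimal sketch. It is not Debian's or Homebrew's code: the function names, the prompt wording and the `stored` dictionary (standing in for a config file) are all hypothetical.

```python
def parse_consent(answer: str) -> bool:
    """Return True only for an explicit yes; anything else declines."""
    return answer.strip().lower() in {"y", "yes"}


def ask_once(stored: dict, read_answer=input) -> bool:
    """Ask at most once and remember the choice in `stored`."""
    if "analytics" not in stored:
        reply = read_answer("Send anonymous usage statistics? [y/N] ")
        stored["analytics"] = parse_consent(reply)
    return stored["analytics"]
```

Pressing Enter without reading gives an empty answer, which is treated as "no", and the stored choice means the question is never repeated.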
It's used for bug triage, so I don't think bias is a problem. The bias is rewarding people who opted in by increasing the chance bugs in their software will be fixed.
I prefer local tools tracking me. That means I use them, that's why they are installed. And I want them to get better. That's why I'm fine with them tracking me.
A random website, though, I honestly don't care, but if I could choose, I would opt out.
Makes sense, right?
You're pretending there is an inconsistency in their argument when there is none. Let me give you a slightly (ever so slightly) exaggerated explanation of why you're incorrect in your analogies.
* A phone is a communications device. We know it transmits information. Unwanted tracking that exceeds the metadata strictly necessary for the execution of its purpose and usage of this metadata for any other purpose needs to be opt-in.
* The browser is a communications application. We know it transmits information. Unwanted tracking that exceeds the metadata strictly necessary for the execution of its purpose and usage of this metadata for any other purpose needs to be opt-in.
* An internet-enabled TV. We know it transmits information. Unwanted tracking that exceeds the metadata strictly necessary for the execution of its purpose and usage of this metadata for any other purpose needs to be opt-in.
* Open source project collecting usage metrics? Downloading software packages generates metadata. We understand that, and use the software on that basis. Any usage of that metadata and tracking my usage habits of said Open Source project needs to be opt-in.
>Looking at the website one of homebrew's developers (or at least the website developers) is based over here in Europe. He might be a nice target if you want to raise a stink and litigate
You're perfectly within your rights to feel conned by the data collection. I myself opted out after I saw the message when I installed Homebrew a few months ago and, yes, I am a bit bummed that I have to make this extra effort to opt out.
But threatening to sue over completely free software that people maintain voluntarily? You're not even just saying that there is a potential for litigation, but you actually said that because they're a "nice target"?? HN mods can ban me for all I care, but I have to say you're one entitled asshole, and this kind of behaviour is the reason why so many open source project maintainers burn out.
How about trying not to assume malice in people's free software from the very first thing they do that you don't agree with?
It's open source. Why not open an issue, contribute a patch to make the process opt-in, start a discussion, start a fork? Oh, of course, why would we do that when we can attack the maintainers, call their software spyware, and threaten to sue instead? While we're at it, let's use words like "spyware" and "stealing" to paint them in an even worse light.
I, for one, love the homebrew project, and I don't want the maintainers to burn out. There isn't much that I can contribute to the project, but calling out this kind of behaviour is the least I could do, and that's exactly what I'll do.
The maintainer already stated it's not a bug, plus the nature of the message implies opt-out was a design choice.
> of course, why would we do
Why start a fork before "attacking" the maintainer? Are you suggesting we need to audit every code base we use? As reasonable as checking every box of morning cereal for glass shards.
The software is spyware. Words like "spyware" and "stealing" are correct, I don't care how it paints the maintainer to describe the situation accurately, they created that situation.
> calling out this kind of behaviour is the least I could do
By "calling out this kind of behaviour" you mean attacking critics with poor arguments? That's not support, it's fan-boyism.
> The software is spyware. Words like "spyware" and "stealing" are correct, I don't care how it paints the maintainer to describe the situation accurately, they created that situation.
No, that is not "correct". They aren't even tracking their users, they are tracking the use of their software. Your use of the words is manipulative and dishonest.
> By "calling out this kind of behaviour" you mean attacking critics with poor arguments? That's not support, it's fan-boyism.
None of the things I called out are "critics". Legal threats because they live in a place that makes them a "good target"? Calling the software "spyware" because they collect usage data? That is not criticism. That is bullying.
If the people doing those things think I'm a fanboy for calling them out, so be it, since clearly logic matters little to them anyway.
spyware : "software that enables a user to obtain covert information about another's computer activities by transmitting data covertly from their hard drive."
Nothing there mentions user tracking. If I don't know information about my computer is being transmitted, it's spyware.
> Legal threats
I didn't comment on this because I didn't say that.
> That is bullying.
I disagree. It is accurate, not disproportionate. Plus you misrepresent my opinion; it is not because "they collect usage data", but because "they collect usage data without your permission"; hence spyware.
> If the people doing those things think I'm a fanboy for calling them out, so be it, since clearly logic matters little to them anyway.
In order to "call someone out" you need to argue the case, so bit early to bring out the ad-homs.
I'm calling you a fanboy because you suggested defending the author is your 'contribution' to his project, which biases your perspective.
There you have it. Stop claiming there is a "prompt".
It's the same as the "3rd party cookies enabled by default" setting on Firefox.
It's the wrong default if you really care about your users' privacy as Firefox claims to do.
It's hard to even find the setting unless you know where it is.
And it's Google who is paying so it stays the way it is and patches that change it are rejected.
It is not Google paying for 3rd party cookies. Firefox tried to change it and it became a publicity disaster for them.
Suddenly they were met with headlines like "Mozilla/Firefox hates small business owners" and they are simply not capable of handling a publicity attack (remember the whole CEO thing?) and they caved in.
It was in nightly 3 years ago. If you followed along on Planet.mozilla you could see the story unfold, but it is hard to find a single source that summarises everything.
But the bottom line is, despite Safari already doing this, Mozilla was met with a lot of hostility when they tried to implement it, and eventually gave up.
Here's an excerpt from Computerworld that'll give you an idea of how things were unfolding back then.
Mozilla has effectively postponed Firefox's controversial third-party cookie-blocking policy for several months.
...
Months ago, when Mozilla's proposal appeared set to debut in Firefox 22, several online advertising groups, including the Interactive Advertising Bureau (IAB) and Association of National Advertisers (ANA), vehemently objected, claiming that the on-by-default blocking would "disenfranchise every single Internet user" and result in the shuttering of small businesses and small websites. An official with the ANA promised that Firefox users would see more ads, not fewer, if the feature was switched on.
> Looking at the website one of homebrew's developers (or at least the website developers) is based over here in Europe. He might be a nice target if you want to raise a stink and litigate.
Incredibly shameful comment. Calling for someone to be sued, and then especially targeting someone who'd be vulnerable. Fucking unbelievable. Flagged this hard.
Man, it would be great if analytics couldn't be opted out of and you insane paranoiacs would leave the rest of us normal people alone. Went straight for the nuclear option: litigation.
This is why I think I missed it on all my machines, it really isn't noticeable but it's a _big deal_.
*Edit: I run brew update; brew upgrade; brew clean on at least two of my three machines at least once a day. While I should be more vigilant, opt-out is really something I'd expect from Microsoft or Google themselves and not an open source project such as Brew.
Data isn't being sent when I use most of my developer tooling. When I visit a website, I expect it's going to be inundated with garbage analytics so I use things like ublock and noscript.
What I don't expect is a package manager siphoning off data to Google.
A bit offtopic: I'm a recent convert to Brew and I was a bit surprised that it is missing two features that I am used to seeing in a package manager. The first is that there is no way to upgrade casks, and the other one is that there's no way to autoremove dependencies when a package is uninstalled. Since Brew is at this point a mature project, I am guessing these features are not in the plans, which is too bad.