I highly recommend using uMatrix[1][2] if you're very privacy-conscious. It's the full-blown everything-at-your-fingertips console.
By default, it blocks third-party scripts/cookies/XHRs/frames (with an additional explicit blacklist). You then manually whitelist on a matrix which types of requests from which domains you want to allow. Your preferences are saved.
It is a bit annoying the first time you visit any new domain, because you need to go through a bootstrapping whitelist process to make it work. After a while I find I do it almost automatically though.
I use it in conjunction with uBlock Origin and Disconnect, and it still catches the vast majority of things. As a nice side-effect, I find I keep pretty up-to-date with new SAAS companies coming out!
Any browser plugin is inferior to using a hosts file. Hosts files blackhole any network request before even attempting to make a connection. These browser plugins only help if you're using the specific browser — they aren't going to help that Electron/desktop app that's phoning home. They won't help block inline media links (Messages on a Mac pre-rendering links) that show up in your chat programs and attempt to resolve to Facebook. They also won't block any software dependency library that you install without properly checking whether it's got some social media tracking engine built in.
I don't even waste time or CPU cycles with browser-based blocking applications. Steven Black's[1] maintained hosts files are the best for blocking adware, malware, fake news, gambling, porn and social media outlets.
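If it helps, a minimal sketch of pulling that list in on Linux/macOS (the raw GitHub URL for the unified list is an assumption on my part; back up your existing file first):

% sudo cp /etc/hosts /etc/hosts.bak
% curl -s https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts | sudo tee -a /etc/hosts > /dev/null

Appending duplicates the localhost boilerplate at the top of the downloaded file, which is harmless; replacing /etc/hosts outright also works if you keep your own entries elsewhere.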
That doesn't stop anyone using IP addresses directly, and I find that a small minority does (a minority that includes, e.g., Microsoft for some Win10 updates).
Depending on your threat model, you might need to go the proxy/firewall route.
The hosts file is a weird middle ground: it has to be installed and maintained on every device, many of which (e.g. iPhone/iPad) won't let you do that. It's better to set up a local DNS server, which will serve every local machine; and as I mentioned, doing this at the firewall level is better yet.
I started using the Brave browser, which I noticed blocks a lot of network requests. Loading the NYT took about 25 seconds to finish all requests using Chrome. With Brave it took 4 seconds. Also, Brave gives you the option to pay the sites you visit with BAT tokens, if you want. Brave in conjunction with Pi-hole looks even more secure and should speed up page loads further.
That's mostly what I use at home too. It works very well.
It doesn't work quite as well while on the road, though. For those cases, I have a Docker container running diginc/pi-hole (with some additional hosts file blocking going on), then I point my laptop's DNS towards that and am good to go.
Running your own DNS server, and intercepting DNS traffic on your router, is better for two reasons:
1) processes and machines that bypass the hosts file are also caught
2) a large hosts file takes time to parse, line by line, slowing every DNS lookup. A DNS server can cache the entries much better.
I have scripts and compiled utilities that transform HOSTS file to tinydns format. I use nsd as well which uses BIND format. I prefer the tinydns format.
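For what it's worth, a rough sketch of that kind of transformation (assuming blocking entries of the form "0.0.0.0 domain"; the output path and the choice to keep the original IP are illustrative):

% awk 'NF == 2 && ($1 == "0.0.0.0" || $1 == "127.0.0.1") { print "+" $2 ":" $1 }' /etc/hosts > data.block

Each output line is a tinydns A record (+fqdn:ip); you would then append data.block to the tinydns data file and rebuild data.cdb as usual.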
Zone files are more flexible than HOSTS files, but I still use HOSTS as well. I have never had a concern about the speed of using HOSTS. It is certainly faster than DNS.
There is a comment in this thread where someone asserts that HOSTS was never designed to handle "millions of entries".
I would be interested in reading about a user who visits "millions" of websites or otherwise needs to do lookups on "millions" of domains.
I maintain a database of every IP address I am likely to use on a repeated basis, in several formats, with kdb+ as the "master" format. I believe most users will never come close to intentionally accessing "millions" of IP addresses in their entire lifetime. [FN1] I could be mistaken and it would be interesting to learn of a user who can dispel this belief.
FN1. If you think about this, it may cause you to question the necessity of DNS for such users. Or not. Times have changed since the advent of HOSTS. They have also changed since the advent of DNS. For example, using "consumer" hardware, I can fit all the major ICANN TLD zones on external storage and the entire com zone (IMO, by far the most important) in memory. This is many, many more domains than I will ever need to look up. Assuming at best I will not live much longer than 100 years, I could not and will not explore them all or even a significant fraction.
If you know which DNS names you will need to know, then yes, there's no need for more than a hosts file.
Until DNS changes.
If I move a server from one IP to another, I change DNS, and in $TTL time everyone's pointing at the new server. Apart from you with a hosts file. How does that work if everyone has a hosts file?
If I say "check out this interesting story on blahblah.com", you don't have it in your hosts file, how do you get it?
I maintain a list of every phone number I am likely to use on a repeated basis, but sometimes I need to look up a phone number I don't know (in the old days this was a phone book locally, and directory inquiries further afield; now it's DDG and the assumption that they have a website, which isn't in my hosts file or DNS cache, and which I've never visited before).
I maintain DNS entries for my home network of a dozen devices -- I host it on my mikrotik, but it's handy to have, when I type "ssh laptop" rather than remembering if it's on .71 or .73. It's one step better than a plain text file, as there's a standard based way to remotely query it. At work I maintain a DNS server with 2000 entries on my network, which is actually hosts file powered, but again I use dnsmasq for the DNS server rather than rsyncing that hosts file to 2000 machines.
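A minimal dnsmasq sketch of that hosts-file-powered setup (the file name and domain are illustrative):

# /etc/dnsmasq.conf
addn-hosts=/etc/hosts.lan   # the centrally maintained hosts file
domain=lan                  # local domain suffix
expand-hosts                # so "laptop" also answers as laptop.lan

Clients then just point their DNS at this box; the hosts file stays in one place instead of being rsynced everywhere.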
"How does that work if everyone has a hosts files?"
In your particular case, I don't know. You have to do what best suits your needs, whatever they are.
Here is how someone else solves the problem of changing IP addresses. For my needs, I actually like this method.
The entire ICANN DNS used to be bootstrapped to a small text file called "root.hints", db.cache, named.root, named.cache, or something else. As far as I know, it still is.
How does one know the IP address from which to retrieve this text file?
Maybe they have it memorized, or written down somewhere, or perhaps it is written into some DNS software default configuration. In all cases, they have this address stored locally.
No remote lookup.
What happens when the administrator of the server that publishes the text file wants to change IP addresses?
This does not happen very often, but it does happen. What do they do? Considering that the entire ICANN DNS was bootstrapped to this one file, and assuming this is truly meant to be a dynamic system, then this is arguably the most important IP address on the internet.
They notify users in advance that the IP address is going to change.
That's it.
As a www user, of course I would have to do a DNS lookup for blahblah.com. However, I do not do lookups for the server with db.cache, for the .com nameservers, and in most cases not for the nameserver for blahblah.com either, and I do not do lookups using recursive caches. If blahblah.com changes its IP address I do not have to wait for changes to propagate through the system via TTLs. I am querying the authoritative nameserver, RD bit unset. If an IP address changes from the one I have stored, I know immediately when I try to access it. (I like being aware of these changes.) If I were relying on a recursive cache I would probably not notice that the IP address had changed.
IME, IP address changes happen less frequently than people writing about DNS on the web would have one believe. Hence this system works well for me. Most domain names I encounter keep the same IP address for long periods.
Ideally, if blahblah.com is not changing IP addresses frequently or unexpectedly but needs to make a change, she could publish a notice somewhere on her web server informing users she will be moving to a new IP address, just like the server that serves db.cache.
That's another viable solution, although slightly more complicated than adding a line to a file. A question I have is whether you're talking about running your own DNS on your machine or on a server you control?
One benefit of the hosts file is that it travels with you everywhere you go. I have my DNS configured at home, but I rely on my hosts file when I'm at a coffee shop, on a plane, on a work trip, or on vacation.
I work around this with an IPsec VPN to home. DNS is set up on my router with Unbound, and I just point to that. When at a coffee shop or on any untrusted network, I VPN into home.
He refers to a setup where egress DNS traffic is routed to a local DNS server. Thus, regardless of machine configuration, all machines use the local DNS server.
A convenient GUI app for managing the hosts file on macOS machines is Gas Mask, btw. You can have a local hosts file to block your pet peeves, then subscribe to a remote hosts file (such as the one linked above), and activate the combined hosts file.
Windows 8-10 users that use Windows Defender will notice that some hosts file entries are ignored (for popular domains like facebook.com). You will need to add an exception for the hosts file in Windows Defender.
I wonder what their rationale for this is. I know in the past malware has modified the hosts file to block malware removal tool domains, but why ignore entries for Facebook?
I heard that blackholing requests to Microsoft telemetry URLs also has no effect. Any way of finding the unblockable list, I wonder.
More likely they just bypass looking at the local hosts file for such names, so the request always goes out to your DNS servers.
Therefore blocking these names by redirecting to 127.0.0.1 will work if done at your DNS server (for instance if you run an instance of https://pi-hole.net/ for that).
Unless of course they make the lookup use specific name servers that they run, instead of the local resolvers that your machine is configured to look at, for those names but that is less likely.
In that last case, you can often redirect those queries if they are standard DNS requests on your router. That's how my local network is configured -- all DNS requests are sent to my Pi-hole instance, except those coming from the Pi-hole itself. Even something like:
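a DNAT rule on the router that forces all outbound port 53 traffic back to the Pi-hole. A minimal sketch, assuming a Linux/iptables router and a Pi-hole at 192.168.1.2 (addresses are illustrative):

iptables -t nat -A PREROUTING ! -s 192.168.1.2 -p udp --dport 53 -j DNAT --to-destination 192.168.1.2
iptables -t nat -A PREROUTING ! -s 192.168.1.2 -p tcp --dport 53 -j DNAT --to-destination 192.168.1.2

Devices that hard-code their own DNS server still end up asking the Pi-hole, unless they use DNS-over-HTTPS or similar.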
See my comment on TAForObvReasons - if you invest in a Safari Content Blocker that allows custom rules, you can effectively do the same thing as a hosts file. Safari Content Blockers prevent network traffic similarly to how a hosts file works. The only thing you won't be able to block is some app that has a Facebook analytics or Facebook login dependency inside it, e.g. Spotify.
This works great for keeping spam off your devices -- off your local network at any rate. Not possible to modify iOS hosts file without jailbreaking it.
On an iPhone without jailbreak you can use 1blocker[1]. Since every browser on the iPhone is basically a UIWebView/SFSafariViewController controlled by iOS, Safari Content Blockers[2] apply globally preventing web visits. Safari Content Blockers also prevent Messages from rendering Facebook content inline.
My 1blocker rule called "Bye Facebook" is:
https?://([a-z\.]*)?facebook\.com.*
I should probably update it to factor in a lot of these other TLD URLs now that I think about it.
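For what it's worth, an untested sketch of an extended pattern covering the other Facebook domains listed elsewhere in this thread (facebook.net, fbcdn, fb.com, fb.me, tfbnw.com):

https?://([a-z0-9.-]*\.)?(facebook\.(com|net)|fbcdn\.(com|net)|fb\.(com|me)|tfbnw\.com).*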
> Since every browser on the iPhone is basically a UIWebView/SFSafariViewController controlled by iOS
Most browsers rely on WKWebView, which uses the Safari Nitro JavaScript engine but allows customization of the user interface as well. UIWebView is pretty much legacy at this point, and SFSafariViewController does not allow any customization beyond basic theming.
> Safari Content Blockers apply globally preventing web visits
Unfortunately, this is not true. They only work in Safari and Safari view controllers.
> Safari Content Blockers also prevent Messages from rendering Facebook content inline
Are you sure about this? As far as I know, Messages uses a web view.
Actually I just double-checked and I was incorrect. I just remembered that I have another app on my phone called AdBlock[1] which is responsible for blocking requests at the network layer. They run their own DNS server and create a custom list to black-hole network requests that match certain formats. If you add Facebook as a custom rule to AdBlock, that will prevent Messages from pre-rendering content and also block any requests to Facebook from any service on your phone, as long as you're connected to their VPN.
Sorry about the confusion... I'm really doing a lot to keep myself off of Facebook's radar.
I do not believe this is true. In Messages, previews need approval the first time a domain loads a preview, and then this setting is stored. AFAIK there is no way to revoke the permission though.
> they aren't going to help that electron/desktop app that's phoning home.
What's your threat model? Mine is third-party tracking cookies, and desktop apps don't share my browser's cookie jar. So while technically I can be tracked by IP from a desktop app, Facebook can't tell if it's me or someone else at the same coffee shop.
In particular, one nice thing about Chrome extensions is that they don't apply to incognito windows. I regularly use HTTPS Everywhere in block-all-HTTP-requests mode + an incognito window on wifi connections I don't trust, because the incognito window will permit plaintext requests, but it doesn't read my cookies or write to my cache, so it's sandboxed from my actual web browsing. I can safely read some random website that doesn't support HTTPS with my only concern being my own eyes reading a compromised page; none of my logged-in sessions are at risk.
> any software dependency library that you install without properly checking if it's got some social media tracking engine built in.
... is this a thing? (I totally believe that it's becoming a thing, I just haven't seen it yet and am morbidly curious.)
That only works with JavaScript active, which uMatrix blocks for third parties. The sites one visits are mostly not known for first-party fingerprinting (that's mainly done by the ad networks). The extra-paranoid (like me) can also block JS for certain first-party sites.
I use uMatrix only experimentally (I rely on NoScript) but it offers a fascinating flexibility of control if one is in the mood. As well, NoScript is near useless when doing stuff with AWS where uMatrix offers the right flexibility (allow from site Y, but only when fetched from site X).
While I acknowledge that your use case may be confined to browsing the internet, I still don't see what prevents a desktop app from reading your cookie jar.
Edit: your browser history (which may contain your profile URI) might be pretty out in the open, too.
Oh, yes, none of it is sandboxed from an actively malicious app—but an actively malicious app can just ignore your hosts file, too.
My threat model is a developer who includes a standard tracking snippet from a third party but is not going out of their way to reliably violate my privacy at all costs (because they have other features to ship, and the tracking snippet works on most computers). If your threat model includes actively malicious developers, stop running native apps from them at all.
Just browsing through the "fake news" section that hosts file is ridiculous. There's a tremendous number of completely legitimate news sources that are blocked and many that, while lurid, are not in any sense "fake news." The list includes both liberal and conservative legitimate news sites.
uBlock has much the same effect. Requests fail in the console with "ERR_BLOCKED_BY_CLIENT" instead of a 404, so technically it is possible to detect this with a first-party JavaScript file.
Block all accesses to non-local (192.168, 10., etc) IP-based URLs. Will break some legit stuff, but, not that much for most users, and they can just whitelist that as it comes up.
IP whitelists could also be aggregated and shared on github similar to the current DNS blacklists.
If you're using uBlock Origin or uMatrix to block third party scripts it won't matter. That's what makes using the dynamic filtering in uBlock Origin so powerful. Easily the best bang for your buck ad blocking that you can do.
HOSTS files are static. They were never designed for blocking ads or tracking. And for all we know, every connection does a linear search through the HOSTS file so the larger it gets, the more wasted time, because it was never designed to have millions of entries.
To add to this, I've seen that stupid hosts file blacklist from SpyBot cause some Windows network service to lock up for 40 seconds every time the laptop was resumed from suspend or booted up in Windows 7. Parsing the hosts file took a relatively extreme amount of time for exactly this reason; massive hosts files are a kludge at best.
I swear by uBlock/uMatrix, but it's amazing how much of the web it breaks and how little of the content of some sites is hosted by the site itself. The web has become very reliant on CDNs.
My public broadcaster (http://www.abc.net.au/news/) for instance is completely reliant on third parties for its "live story" functionality. It loses half its functionality at work, where Twitter is blocked, and uBlock kills the other half. It also kills the live stream when it can't load one of the half a dozen trackers on the page.
I'd love the no-script functionality to be merged in too so that I could turn off javascript by default.
You can turn on Disconnect's blocklists in uBlock Origin rather than run both. uBlock Origin comes with quite a few lists, but most of them are turned off by default.
I default deny all 3rd party scripts and frames, in addition to the blocklists, and I only sparingly noop relevant domains, the bare minimum to make pages work, on a page-by-page basis.
On top of that, I have Privacy Badger, Cookie AutoDelete and Decentraleyes, and I've turned on first-party isolation.
It's mostly unobtrusive once my most important websites have been properly noop'ed, and it's relatively simple to add temporary exceptions if needed.
I'd prefer a solution that does not just work for a specific browser, but instead blocks all traffic regardless of browser, application, virtual machine, ...
This will only protect you while you're on your own network. A lot of the juiciest data is about your public location; for that you need something device/browser specific.
There's nothing (except possibly your ISP) stopping you from opening your firewall and using it remotely. I personally run dnsmasq (manually configured, but otherwise similar to Pi-hole) on a VPS.
> There's nothing (except possibly your ISP) stopping you from opening your firewall and using it remotely.
My ISP won't but there are ways around that. The biggest problem I've faced is on the modem side of things, finding something I'd trust to be open to the internet, ideally something I can install openWRT or similar on and something I know will work in my market. It's an options minefield.
I've got a RaspberryPi Zero (WiFi via USB..ugh). Would that be too slow for DNS, or would having my DNS server be local vs remote negate that slow interface?
I use Little Snitch[1] (and its sibling Micro Snitch[2]) for filtering connections at the system level. I don't interact with it too often though, because I rarely install new apps.
Not to say /etc/hosts doesn't work, these days I just find I prefer things with better UX.
To clarify, I whitelist my browser entirely in Little Snitch and delegate to uMatrix and other extensions.
I also don't pre-emptively load in rules into Little Snitch - I have it running in active/interrupt mode, so it prompts me whenever it tries to make a new connection I haven't signed off on before. Unsurprisingly, not very many apps try to connect to Facebook.
Because it is completely impractical. I used LS, but it's a waste of time to check and block ad servers or malicious domains, which is why most garbage should be blocked from hosts or dnsmasq.
The maintenance aspect of LS is definitely on the high side and only really dedicated folks will stick to it; if it were to come with auto-updated maintained lists it would most likely be used more
Little Snitch is for macOS.
As a Linux user I desperately looked for an equivalent and found none.
Douane was suggested. It's no good.
What a sorry state of affairs. We need a simple app-level filtering solution.
Same story. I have always been dreaming of a Linux equivalent of Little Snitch. More than a decade has passed since I switched to Linux, still nothing...
Even better would be doing it on a device. It's a reason to have an intelligent router on your network where you run a custom dnsmasq or whatever, then you cover your phones and all the hootenanny that comes with a digital life. Like your fridge.
Same here. I have zero desire to do any sort of manual configuration any time I visit a new website. Blocking third-party cookies will eliminate like 90% of tracking and uBlock origin and Privacy Badger handle a significant percentage of the rest.
Using uMatrix together with uBlock Origin is a little bit redundant, as uBlock also offers a matrix functionality (enable advanced options). As a matter of fact, iirc uBlock is developed upon uMatrix's codebase by the same author.
I tried uMatrix several times and quit. It's crazy heavy. It takes forever to whitelist parts of a page until functional.
Scripts then XHR then more cascaded scripts then frames..
A hidden beauty of uMatrix is that with a little training, it doubles as a reminder against low-value search results -- content scrapers, marketers you've already decided never to buy from again. Just red-out the first-party web site. Search engines that can't keep up with the SEO optimizers, e.g. DuckDuckGo, become more usable.
I second this. uMatrix is my go-to add-on to prevent the vast majority of connections I do not want. However, I pretty much always use my browser in privacy mode no matter what (for several reasons), so the bootstrapping process never ends for me, but that's a price I am more than willing to pay.
I don't know enough about how different profiles operate to say with any certainty, unfortunately. So it's possible there is no benefit. However, for mobile there does appear to be at least one benefit. When I recently checked to see what data Google had on me using their free tool, I saw that the only web history they had associated with my device was all from URLs visited while not in privacy mode. These were all links that I clicked from text messages, which will always default to open in the regular, non-private tabs, even if you only have incognito tabs open. That right there is enough for me to always and forever use nothing but privacy mode. For the PC, this is a non-issue as I never log into my browser, although I do use privacy mode there nearly 100% of the time as well.
I'm continuing to have issues with frames not displaying unless I completely disable the addon. It's extremely frustrating. I'm in advanced mode and all options are as they should be.
One minor feature I miss from noscript is that (unless I've missed a setting) umatrix can't block site scripts but allow bookmarklets. Though with the new extension API, I don't know if it's possible at all or not.
Yet again, software freedom fighters got there years ago.
The Free Software Foundation got there earlier, publishing https://www.fsf.org/facebook on Dec 20, 2010. FSF & GNU Project founder Richard Stallman has been rightly objecting to Facebook for years in his talks and on his personal website at https://stallman.org/facebook.html.
Long-time former FSF lawyer Eben Moglen rightly called Facebook "a monstrous surveillance engine" and pointed out the ugliness of Facebook's endless surveillance (at length in http://snowdenandthefuture.info/PartIII.html but in other places in the same lecture series as well). See http://snowdenandthefuture.info/ for the entire series of talks.
Yes, but where do they offer solutions to transition (emphasis on transition) from what people currently use to a more open ecosystem?
At least in the software licensing arena, having personally attended a lecture by Stallman, I was left with the impression that he wasn't offering a solution, just a vision of a Utopia without any guidance on how to transition to it -- more specifically, how we would make money from open source software, when currently proprietary software is the default for making money.
> how would we make money from open source software
There are many existing examples, so this is clearly a solved problem already. You charge for support, or for feature requests, and so on. That's how SUSE and RedHat make their money.
The flaw with looking at proprietary software's monetisation is that it usually just boils down to "pay for the binary". This obviously won't work with free software; you need to charge for development rather than access (though you can also use a seat-based model where you only provide support for machines that have valid licenses).
SUSE or RedHat are excellent examples when it comes to monetizing OSS for large businesses. What about consumer software? I do not think OSS will survive monetization when it comes to dealing with individual consumers.
And then there's the question of SaaS. OSS alternatives exist, but a lot of the high-quality ones are paid. I don't think services like Todoist, Pocket, Evernote etc. would exist on the open source model you described.
This is only half the story. Several business models around open source are working and proven (e.g. hardware vendors writing kernel drivers) but there is also a lot of unpaid volunteer work that everyone seems to take for granted. GIMP is one example. And probably a large number of the packages on NPM. Or think about OpenSSL, everyone was just assuming it must be well-funded, while nobody was actually funding it.
That definitely is one solution, but perhaps it was made possible by restricting distribution of software in the first place to get a leg up (see comments regarding improvements and distribution restrictions on installer scripts in SLS: http://www.linuxjournal.com/article/2750). This suggests that it isn't a solved problem, because the initial conversion to an open source (or free software) model with support on the side may have required a different model to start the venture.
Nonetheless, it's admirable, and hopefully a net benefit for everyone.
My point was more about Stallman and co calling foul with regards to software freedom, codifying their own ideal, but not giving directions to reach that ideal. This feels like a safe pulpit to sit upon, where their view isn't falsifiable, useful when they want to say "I told you so", and eventually taking all credit for everyone else's efforts in between to make the end goal possible.
> perhaps made possible by restricting distribution of software in the first place to get a leg up
This is incorrect; you can download the full ISOs for SLE from the SUSE website, with 30 days' worth of updates. The source code (and the system required to build it) is all publicly available on https://build.opensuse.org/. I believe RedHat has something similar.
I'm also not sure that an interview from 1994 with the creator of Slackware is a good indicator of the current state of distribution business models. Though even in 1994, both RedHat and SUSE were selling enterprise distributions.
Since the past determines the future, the relevant part of the 1994 interview is "... Instead, he claimed distribution rights on the Slackware install scripts since they were derived from ones included in SLS...", which as I understand it, is the restriction of distribution I was referring to (perhaps redistribution is more accurate).
This suggests that the business model benefited from restricting redistribution and modification of the source code, so breaks the assumptions that the business model was purely based on making money from open source, and so doesn't fully support the idea that proprietary software is unnecessary, in the case where we take SUSE as an example of saying it is "already solved".
Given the bang up job we've done so far, I'm not sure people making money off of software has been a net positive for humanity. Maybe it's the idea that money is the only way we can get software that is the problem.
I remember when the internet was mostly people's personal websites that they didn't make money off of, and frankly, the internet was better then. The best websites that exist now started in that era.
Isn't that what the "tragedy of the commons" describes, with the internet being diluted or polluted by content that doesn't add much value?
I think you can probably find a suitable subset of the internet and it still feels like the old days, but then you have to be happy with a much smaller community.
And fair point to regard money from software as a potential net negative, and I just don't have an answer that is objective. There is a lot of software that highlights the creativity of people, and I like it, and am happy to pay for it, and also happy to get it for free. Like paying for books or borrowing them from the library.
> I think you can probably find a suitable subset of the internet and it still feels like the old days, but then you have to be happy with a much smaller community.
I think most people would be happier with a smaller community. Facebook encourages a large number of low-quality connections, which are actually worse than not being connected at all: I don't want or need to interact with my friend's racist cousin I met at a party three years ago or my ex-roommate's mom who always wants to sell me homeopathy supplies. These people are actively detracting from your life.
Those are extreme examples, but even people I might get along with suck up your time. If I am not connected emotionally/socially with someone enough to get their phone number and send them a text occasionally, I probably don't need to give them even a few seconds of my time on a regular basis.
> And fair point to regard money from software as a potential net negative, and I just don't have an answer that is objective. There is a lot of software that highlights the creativity of people, and I like it, and am happy to pay for it, and also happy to get it for free. Like paying for books or borrowing them from the library.
I'd be happy to pay for good internet too, but unfortunately there are few businesses willing to do this. Ad sellers won the race to the bottom on price (a strange game--the only winning move is not to play) by simply being "free" to users. This works because of short-term thinking: users don't think ahead to how ads will affect their lives, and content providers don't plan to grow a business slowly the way a for-pay business grows.
> I was left with the impression that he wasn't offering a solution, just a vision of a Utopia without any guidance on how to transition to it
Stallman quit his job to write an entire free software operating system and essentially dedicated his entire life to it. What more do you want?! I don't even want to imagine a world without Richard Stallman.
That doesn't really offer a solution for those of us that need to get paid. The choice he may have made to quit his job and write free software is anecdotal and simply won't generalise to everyone.
Stallman judges those that write proprietary software, calling this type of software immoral, and yet doesn't offer guidance to get from the current situation to a better one. Without realistic and generalisable guidance, it is simply self-righteousness on his part, the same as any extreme idealist's.
> That doesn't really offer a solution for those of us that need to get paid.
If you have anything at all to do with software you're "getting paid" by using the wealth of free software that makes up GNU Linux, OS X, etc. Free software constitutes a powerful non-scarcity-based model that is extremely good for productivity. Just because he doesn't kowtow to standard capitalist race-to-the-bottom models doesn't mean he's not "realistic".
Why should he come up with a solution to help you get paid? If you're making a living unethically it's not up to me to figure out a way for you to do it ethically. In any case, there are plenty of other people who have figured out how to do that and there isn't the slightest reason to believe that all software being free means nobody would get paid to write it.
He owes me nothing but he's not offering credible proof that his way is applicable to anyone but himself.
And as for calling something immoral which is the result of someone's hard work and doesn't impact them if they choose not to use it: that simply sounds like sour grapes and self-righteousness.
> do they offer solutions to transition (emphasis on transition) from what people currently use to a more open ecosystem?
A comment I made two weeks ago that is pertinent to this discussion:
Niche-market software, used by a limited number of highly specialized professionals, is somewhat incompatible with the open source economic model. When a piece of software is used by very many users, and there is a strong overlap with coders or companies capable of coding, say an operating system or a web server, open source shines: there is adequate development investment by the power users, in their regular course of using and adapting the software, that can be redistributed to regular users for free in an open, functional package.
At the other end of the spectrum, when the target audience is comprised of a small number of professionals who don't code, for example advanced graphics or music editors or an engineering toolbox, open source struggles to keep up with proprietary software because the economic model is less adequate: each professional would gladly pay, say, $200 each to cover the development costs for a fantastic product they could use forever, but there is a prisoner's dilemma in that your personal $200 donation does not make others pay and does not directly improve your experience. Because the user base is small and non-software-oriented, occasional contributions from outside are rare, so the project is largely driven by the core authors, who lack the resources to compete with proprietary software that can charge $200 per seat. And once the proprietary software becomes entrenched, there is a strong tendency for monopolistic behavior (Adobe) because of the large moat and no opportunity to fork, so people will be asked to pay $1000 per seat every year by the market leader simply because it can.
A solution I'm brainstorming could be a hybrid commercial & open source license with a limited, 5-year period where the software, provided with full source, is commercial and not free to copy (for these markets DRM is not necessary; license terms are enough to dissuade most professionals from making or using a rogue compile with license key verification disabled).
After the 5-year period, the software reverts to an open source hybrid, and anyone can fork it as open source, or publish a commercial derivative with the same time-limited protection. The company developing the software gets a chance to cover its initial investment and must continue to invest in it to warrant the price for the latest non-free release, or somebody else might release another free or cheap derivative starting from a 5-year-old release. So the market leader could periodically change, and people would only pay to use the most advanced and innovative branch, ensuring that development investment is paid for and then redistributed to everybody else.
I wonder how Facebook devs feel when they read such posts.
Do they feel rejected? Ashamed? Does their salary really outweigh this collective disapproval of their peers?
I actually just got a job there out of school, starting in a few months. Reading these comments is certainly interesting although it's not news to me that Hacker News hates Facebook.
I've long been skeptical of the effects of social media though, and I'm taking this job mostly just because doing otherwise seems like a really poor career choice. Plus it seems like Facebook is here to stay, and I can dream of helping to fix the problem instead of just enabling it.
EDIT: Is HN's Facebook hate getting so heated I'm getting down-voted for sending some good vibes to a newgrad about to start her/his first job?
I bet you all took your first job at Doctors Without Borders helping children in Angola. FFS, the Waltons are the scourge of this world, but I don't blame the kids going off to work at Walmart. I bet a lot of you pay taxes in the US too -- those taxes financed the war in Afghanistan, but you didn't move to Morocco, did you?
Give me a break. Let this kid come in with a good attitude, eyes open, loud and proud. Who knows, maybe he'll turn some heads. The guy signed and it is a good career move; what's wrong with cheering the guy up? Disappointed in you, HN.
Actually, it is down to you to make a change, and do ethical things. There are ways to influence things which you mentioned from the war in Afghanistan to privacy issues with Facebook. But to do that, you have to care.
That's about all though, isn't it? There's a negligible chance you'll actually fix the problem, unless you manage to leak evidence to the media or similar.
I write software for biologists, and I feel I'm much, much happier doing that than I would be working at Facebook.
I spilled OJ on myself when I read his post. I mean, it's great he is so gullible as to believe he can change something at such a big corp, and he reminds me of when I was 16 with a head full of dreams about how to change the world.
But seriously - do we know of any single example of an intern coming to a big corp and "saving it" - by that I mean steering it out of the dark and deceiving waters and actually bringing it into the light for the good of society and people in general?
Getting a job offer at Facebook is a great achievement. Some people I know just moved to the US to pursue a master's and then apply at Facebook. If you have offers from other tech giants like Google, then you can do your own analysis (SWOT, maybe) to choose the right option.
Some years down the line you can always switch to any company in the world.
I worked for a less-than-stellar online publication in Australia. Think low-budget Daily Mail. I didn't care; I still got paid and got to switch off and do my own thing when I wanted to. I'm not my job.
Examine the present imbroglio over Facebook, Cambridge Analytica, weaponised viral clickbait, fake news, radicalisation, etc., etc.
Media are the information sourcing and feedback loop for societies. The print media went through its crisis of awareness in the early 20th century. See especially Lippmann's Public Opinion.
I heard once that if there is something - whether of a monetary value or not - that allows you to "sidestep your morals", then you didn't have any morals in the first place.
Then I'd argue that I don't have any morals in the first place.
There are differing levels of how strongly I feel about certain moral values I hold. For example, working for a company that dealt in wholesale killing of others is obviously worse than working for an advertising network. Would I work for doubleclick for $1M a year? Hell yeah. Would I spy on citizens of my country for the same amount? No.
That is what I suspect, but still... respect in the eyes of peers is a human need. If I met a developer and learned he works for FB, it would be visible on my face that I feel somewhat put off.
'Peer' is very flexible. This could be a comparison to people the same age in other careers.
Also, keep in mind that Facebook engineers are constantly surrounded by other Facebook engineers so their SE peers probably do approve. They collectively don't think Facebook is a problem so they implicitly approve of each other.
I don't use FB and definitely do not like their business model, but it's also true that their users are there voluntarily and most of them are at least vaguely aware that they're being aggressively data harvested by an amoral, profit-maximizing corporation. So I don't see a legit reason to blame or hate on the devs there. Especially when you consider that most of them work on the "good" rather than the "evil" part of their stack.
Except for the folks who didn't make an account whose friends have contributed to "shadow profiles" for them...
I mean, let's just admit that you don't have to be a Facebook user and you don't have to sign a Facebook TOS for them to accumulate data about you, so it's not quite as cut-and-dry as you make it out to be.
As far as the "good" and "evil" parts of the stack... fair point. I think most devs are somewhat abstracted away from the collectively malicious vision, since most of the constituent parts are relatively benign on their own -- "let's identify faces in photos!", "let's automatically identify faces in photos", etc. It's product folks, or maybe even higher up than that, who connect the powerful pieces produced by devs to actually make Facebook the monster it is today. I'd guess that even the devs who have impact on that vision don't really have the power to dramatically sway that vision, they've got a bit of technical input at best.
Still, have you ever worked on a product you don't believe in? If you're just cashing in a check, I guess it could work, but if you're as idealistic as me you want to work on something that's doing good in the world. Especially when so many tech companies proclaim their intent to "make the world a better place" or "do cool things that matter."
Not really. Many people just feel they have to use Facebook to connect with the society efficiently. Some people even consider those who don't use Facebook weird.
That's their problem. I never use FB and it does not hurt one bit. People have to be responsible for their own choices. No one is forced to use FB. Their weak will is not my problem.
The network effect is strong but it does not make FB use necessary or involuntary. You will not be put in a cage, fined, beaten, fired, etc, for not using FB.
Yep, but you might be left out of many events and slowly become a social outcast amongst your friends. As a first-hand non-facebook user, this aspect really blows.
Would you say the same about gambling software? What about software related to selling heroin?
All of these products are designed to be as addictive as possible (to varying degrees). The whole point of an addiction is that you are there voluntarily. (Not saying Facebook is as bad as the things above, just that they are all designed to addict.)
They certainly do make it as addictive as possible, but IMO that still does not mean users are there involuntarily. Persuasion is not the same as coercion.
Doesn't FB keep track of (possibly profile) people who don't directly use FB? They can possibly profile you from using data brokers and information actual FB users give to them.
That's a fair point but I would put much of the blame for that on the US government for not having proper EU-like privacy laws which make it illegal, or at least mandate an opt-out mechanism.
In capitalism it's mostly the customer's responsibility to switch when a company becomes too evil in some way. Depending on your politics you could, perhaps, argue with some truth that network effects make that extremely difficult for social media sites, so the only solution is to have the government extensively regulate FB, Twitter, etc., but I am not entirely convinced. IIRC, FB engagement with teenagers and early-20s folks is already declining noticeably in the US (though there are probably multiple reasons for that beyond just privacy concerns).
Does their salary really outweigh this collective disapproval of their peers?
Isn't that the going ethos currently?
Something akin to: Pay me as much as possible, don't ask me to be part of your culture, don't ask me to work more than 40 hours a week, don't ask me to take stock, don't have a mission statement, make sure I'm working on something that is engaging mentally.
Depends on what you mean by “going ethos.” Healthy perspective on an employment relationship from your perspective, yes. Anathema to a large number of Silicon Valley startups where you will likely get branded as neither a culture fit nor a team player, as well, yes.
It’s wise in the Valley to hold such an opinion but not make it very prominently known. There are a number of people who want your job and will say something more palatable to your employer, and eat your free meal while shipping a feature at 10pm because they buy into the posters on the wall. Many allegations of ageism (but not all) can probably trace back to something like this, in my uninformed gut opinion, because you will almost certainly get replaced by a new grad when the hammer falls. I don’t even see it as personal, but a demonstration of incentives: they can get a loud 40 from you or a quiet 80 from a newly minted BSc. QED.
Just decline invites and push back on over 40. Expressing that attitude at a typical company within this audience basically paints “please lay me off” on your back. When you’re getting into post-senior titling, or you’re really specialized in a tough req to fill, is when that approach becomes more feasible.
I met some of them. They either don't think about it, or they actively revel in not being one of the proles that aren't in on it. The ones that stress about it eventually leave.
I worked for an AV vendor for years. The online hate it gets is huge and constant (mostly wrong, but some points are valid). And even though I don't like the monetization strategies for example, I believe the product itself was OK, and most importantly helpful to many.
As a Facebook user I obviously don't like what they do with the data, but at the same time I think they provide an OK service that is beneficial to many. I wouldn't mind working as a developer there on what I imagine is overwhelming majority of positions.
I think most of their peers are mature enough to realize that the poor (historic) management decisions of a company (which are now catching up to said company) are not reflective of the personality or character of the hundreds of thousands of employees performing various roles at said company.
Pi-Hole [1] is another nice way to filter domains at the DNS level network wide, if you want a wider reaching solution that supports wildcards. Great way to use an extra Pi if you have one sitting around.
If your router is running pfSense, pfblockerng is the equivalent of pi-hole. You can put the same blocklists into it, even. Though the Steven Black combined list is usually enough on its own.
Debian has had its share of fuck-ups in its package management system. There's very little difference between blindly trusting debian, vs blindly trusting pihole. Don't pretend you check out the contents of all the packages you use.
That is unnecessarily aggressive. Also, besides the snarkiness, it is a bad argument. What you write seems to be an appeal to hypocrisy. To see that, the following analogy might be helpful: one doesn't need to personally comprehend every decision in a democracy to have more trust in it than in a dictatorship, and one can say that without living in a democracy.
Is there a way to redirect to a local HTML file for any blacklisted host file addresses? Something like "You tried to access a site that's blocked in hosts file"? I tend to add blacklists like this then few months later wonder why some site doesn't work.
It's been around since the beginning of time itself I guess. You can try something like dnsmasq. One liner in the conf file: address=/.facebook.com/127.0.0.1
edit:
For Ubuntu this should work (on versions from Trusty and newer):
Bit sloppy because it doesn't pick up the domain names with dashes. But my point was that if you want to blacklist *.facebook.com you shouldn't try to enumerate every single variant of it, that's not durable.
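For reference, a minimal sketch of the dnsmasq approach on a stock Ubuntu desktop (this assumes NetworkManager is configured with dns=dnsmasq; the drop-in file name is illustrative):

% echo 'address=/.facebook.com/127.0.0.1' | sudo tee /etc/NetworkManager/dnsmasq.d/block-facebook.conf
% sudo service network-manager restart

The address=/.../ form covers the domain and its subdomains, dashes included, without enumerating them.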
Adblocking is a non-trivial task, but there are trivial solutions.
1.) Install hosts-gen from http://git.r-36.net/hosts-gen/
% git clone http://git.r-36.net/hosts-gen
% cd hosts-gen
% sudo make install
# Make sure all your custom configuration from your current /etc/hosts is
# preserved in a file in /etc/hosts.d. The files have to begin with a
# number, a minus and then the name.
% sudo hosts-gen
2.) Install the zerohosts script.
# In the above directory.
% sudo cp examples/gethostszero /bin
% sudo chmod 775 /bin/gethostszero
% sudo /bin/gethostszero
% sudo hosts-gen
Add a cron job, and enjoy your faster and ad-free-er internet. Further, you can add your custom (e.g. this FB) block list to the local files in /etc/hosts.d, which will then be concatenated automatically.
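A minimal cron sketch for that (path and schedule are illustrative; remember to make it executable):

#!/bin/sh
# /etc/cron.weekly/refresh-hosts -- adjust the path if "make install" put hosts-gen elsewhere
/bin/gethostszero
hosts-gen

Running from root's cron means no sudo is needed; the two commands are the same ones from step 2 above.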
This is a good thing to enable, but I think that smartphones contribute exponentially more data to Facebook services than laptops and browsers do. Smartphones give easy access to location, background running services, microphone. Even if you block these permissions to the app, Facebook gets the data from their data providers that use Facebook ads.
Looks like a good alternative if you can't get root. It uses VPN to blackhole requests.
If you have root, I'd use AdAway[0] which changes the hosts file directly.
Another great alternative is Blockada[1]; it does the same as DNS66 and AdAway, but in my experience it felt much more reliable. It is available on F-Droid[2].
Well, he could be referring to the relative changes over time of what is contributed by a desktop computer and what is contributed by a smart phone. Antiprivacy features on phones seem to get better at a much faster rate than antiprivacy features on a computer.
I'd say better than 50% chance that the delay increases. But the phrase would be unambiguous if it were called an "exponentially increasing back off algorithm".
This is unnecessarily pedantic. An exponential back off algorithm has a 100% chance of increasing the delay, that's the whole point. Nowhere other than pure mathematics would I see the phrase "exponentially" and even consider a <1 exponent.
My probability was for hearing the phrase "exponential back off algorithm" without knowing anything about the algorithm. I don't work in that field and had never heard of the term before the earlier post.
Experience suggests that most of the time when people say exponentially they mean an exponent greater than 1, but I have been surprised by what people have meant before so I personally wouldn't say that probability is greater than 90%. That's what I meant in more detail.
Can you explain this a little more? Can this be done on a personal phone? I was under the impression that the hosts file was essentially untouchable on an iPhone.
Google around - 'iphone mobile device management'. There's a service that's free for a couple devices[1]. Apple also makes a (terrible) app called Configurator. There are a bunch of others, but most of them are designed for (and priced for) corporate use.
You need to learn a little about what you're doing if you want to go this route, and there is some setup. But basically, you're taking on the role of a corporate IT department, pre-configuring and possibly locking down the phone.
I set up a profile in Configurator a few years ago and am a little afraid of touching it - that application makes iTunes look thoughtfully designed and stable.
It's actually quite annoying to block all of Facebook. There are a lot of innocuous sites that have at least some small reliance on Facebook, and blocking all of Facebook makes using these sites a tad bit difficult / poor UX.
For anyone who's interested, I also maintain a tracking protection list for Internet Explorer. It's based originally on the Ghostery and Disconnect lists, but I now update it independently. It's designed to be concise and speedy, yet also comprehensive. Note, however, that due to the limitations of tracking protection lists in IE, it can't block everything. You may need to supplement it with a small hosts file. Check it out here: https://github.com/amtopel/tpl
If you run your own DNS resolver you can use the wildcard trick.
Something like this in an RPZ zone should do it:
facebook.com IN CNAME .
*.facebook.com IN CNAME .
facebook.net IN CNAME .
*.facebook.net IN CNAME .
fbcdn.com IN CNAME .
*.fbcdn.com IN CNAME .
fbcdn.net IN CNAME .
*.fbcdn.net IN CNAME .
fb.com IN CNAME .
*.fb.com IN CNAME .
fb.me IN CNAME .
*.fb.me IN CNAME .
tfbnw.com IN CNAME .
*.tfbnw.com IN CNAME .
The wildcard entries should be unnecessary, since the DNS zone above them (e.g. facebook.com) is already CNAME'd. Most resolvers will take a CNAME as "any further requests go here", which in my experience usually includes NS servers.
(This is also why you don't CNAME your root domain; a CNAME conflicts with any other record type.)
> What software actually parses /etc/hosts, at least on Linux?
glibc resolver
A good entry point for reading more about it:
$ man nsswitch.conf
If your /etc/nsswitch.conf file's "hosts" line contains the keyword "files", then it potentially uses /etc/hosts. If "files" is first (typical default config), it looks there first, before the other places listed.
This is done under the hood when programs use resolver functions like gethostbyname or getaddrinfo.
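A quick way to check what your system is doing (the nsswitch line shows a typical default, and the domain is just an example):

$ grep '^hosts:' /etc/nsswitch.conf
hosts:          files dns
$ getent ahostsv4 adserver.example.com

If "files" is listed first and the name is in /etc/hosts, getent (which goes through the same NSS path as getaddrinfo) reports the blackholed 0.0.0.0 address.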
I don't use that list; I use Steven Black's [1] list, which has 1004 entries and is more complete than this one. It would be fewer, but more than 16. Even at that, you're right, it would definitely reduce the size.
Provide people that care about privacy with a public DNS server they can use that auto-blocks those domains (and updates its lists). I would pay for it (a few dollars a month).
Feature suggestion: allow people to add their own entries so I can purposely block reddit or hacker news to reduce distractions.
Pretty sure I would set this DNS server on both my phone and desktop.
Can somebody elaborate why this link from 2016 is gaining steam here? Is it because Cambridge Analytica misused FB data? Maybe I am missing something; do we know if Facebook was wittingly complicit?
I'm not a big fan of Facebook but I do find it useful. That being said, this feels to me like a coordinated attack campaign. Take an issue, blow it up, attach various other nebulous bad things and push it to a public that was already primed against that very service. Seems to be working great.
I do not know who is pushing this campaign or why, but it definitely feels organized and calculated.
If Facebook is negligent or complicit, I'd certainly change how I interact with the platform.
I haven't kept up with the details and would love to be pointed to some summation / analysis of the facts.
The whole conversation, without having read everything here in absolute detail, seems to be very tool-oriented. Am I the only one here overwhelmed by the sheer number of domains involved?
It's mostly subdomains, since Windows can't use wildcards (*.domain.com). Setting up such a large hosts file might slow down your computer a bit though.
There are some tools that let you use wildcards in the hosts file, but I can't remember the names at the moment.
I'm not sure what you mean; I see 13 different domains in that list, and the rest are subdomains of those same 13 domains. You can't count that as a "sheer number of domains". Our company probably has 2000 different subdomains on 5 domains. Subdomains we can create as we want to; it's just some letters before the domain part of the address, e.g. subdomain.domain.com. That is what the wildcard is for: *.domain.com catches them all, no matter how many extra we create. A wildcard domain is needed, for example, on an SSL certificate to accept any subdomain of the domain you ordered.
block all of Google's IP addresses: https://support.google.com/a/answer/60764?hl=en (note: your internet (the web) will stop working properly if you do block all of those IPs, which is a big problem)
A lot of sites pull in popular js libraries from google; the idea being that they'll already be in a user's cache and even if they're not, google has a better (cheaper, faster and/or lower latency) CDN than the site author.
I'd like something so I can use the web from China (without a VPN). Right now a lot of the common JS/fonts/etc. from Google break, and webpages go wonky. Is there a way to preload a cache?
In their defense I believe they do tend to host that stuff on domains that do not set or retrieve the regular google tracking cookies. Though there are other tracking methods that they might still be using.
it depends on what sites you are using, but I beg you to try it and I can almost guarantee you that it will break your browsing experience... (even if you aren't using google search or gmail)
Why would you block all the domains but still keep your account that you would no longer be able to access? The account is the problem not the domains. You would have to block the domains on every device you use. Just kill the problem at the source and delete your entire surveillance account with facebook.
A similar solution is blocking things at your local recursive DNS resolver, assuming you have a captive pool of devices (let's say in 10.240.0.0/24) in a LAN, all of which are given DHCP addresses and DHCP-assigned DNS resolvers, and you're in control of a bind9 server that's on the same LAN.
Not going to prevent people with admin rights on their workstations from using another DNS resolver (or VPN, or whatever), but a fairly low effort solution.
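A rough sketch of one way to wire domain blocking into that bind9 box, using an RPZ zone like the one shown earlier in the thread (zone name and file path are illustrative):

// in the options block of named.conf
response-policy { zone "rpz.local"; };

// in named.conf.local
zone "rpz.local" {
    type master;
    file "/etc/bind/db.rpz.local";
};

The DHCP server then hands out the bind9 box as the resolver for the 10.240.0.0/24 pool, so every client gets the filtered view by default.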
Is there a service that, say, subscribes to a live list of this domain set (like Adblock consumes EasyList) and updates my hosts file automatically?
If not, that is a piece of software that I would find useful and worth paying for (with the ability to audit the software's ability to phone home about the rest of my hosts file)
Your host file, hmm. Maybe something based on disconnect.me. If you're mostly worried about the browser (which seems sensible for most users), you can just enable tracking protection in Firefox: https://support.mozilla.org/en-US/kb/tracking-protection
It would be useful to know how to generate this list in the first place, then just adopt that to create the list on our own, instead of coming back to this github repo to sync every so often.
I do not see this in the repository, presumably to get people to come back to his github repo for updates, but that's my cynicism.
I wrote a small tool that translates AdBlock Plus filter lists into hosts file format [1]. It can only translate simple domain-name rules but might be of interest to people in this thread.
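For illustration only (this isn't the linked tool's interface, just a generic shell sketch of the same idea, assuming a downloaded filter list named easylist.txt):

% grep -E '^\|\|[A-Za-z0-9.-]+\^$' easylist.txt \
    | sed -e 's/^||/0.0.0.0 /' -e 's/\^$//' \
    | cat /etc/hosts - | sudo tee /etc/hosts.new > /dev/null

Only plain ||domain^ rules survive the grep; anything with paths, wildcards or options is dropped, which matches the "simple domain-name rules" limitation.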
Thank you, I didn't know about the `cat -` trick to read from stdin (works the same as `echo hi | cat /dev/stdin`). Even after all this time, I still learn something new every day.
A lot of commenters mention dnsmasq. I wrote some scripts a while ago to help minimize a dnsmasq config that had been generated from a hosts file. People in this thread might find them useful.
Your list does basically nothing for Google tracking domains. Here is mine: (note that this blocks reCAPTCHA, which a lot of websites now annoyingly use for login). I add entries for IPv4 and IPv6 (0.0.0.0 and ::1 respectively).
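The format, with an illustrative entry standing in for whichever Google tracking domain you want to block:

0.0.0.0 www.google-analytics.com
::1 www.google-analytics.com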
Minor segue: is there any easy way to geo-block URLs, both by ccTLD and by geolocation of IPs from certain countries?
I have Pi-hole running but it doesn't support that currently; the best it does is wildcards, but even for that it needs a domain and won't work on just the ccTLD.
Interesting to see several domain names/servers with 'mqtt' referenced. Wondering if Facebook interacts with IoT devices routinely, or perhaps they use MQTT for Messenger message transfers etc.?
In general blacklists are a better choice overall for non-technical users. Do you really want an angry text message or phone call every time $FAMILY_MEMBER has some site that's rendering poorly because they haven't properly whitelisted one of the 12 legit domains it hits? And do you really trust them to not whitelist some ad & tracking domains?
It's basically a big lookup table, trading storage for speed.
The most noticeable effect is that your web pages load faster, because a lot of requests for unnecessary data (e.g. Facebook in this example) complete immediately. Occasionally you will miss out on a webpage that depends on it.
Think uBlock Origin, but not for just your browser but your entire system.
I can block domains on my laptop, no problem. But I have not been able to figure out any convenient way to block websites on my Android phone. My Android phone comes with a Chrome browser. Any ideas about how to block websites reliably on an unrooted/jail-not-broken Android phone?
Block at DNS level on a device (router or DNS server) and proxy all Android traffic to said device.
I use a pfsense router running OpenVPN and pfblockerNG. PfblockerNG sinkholes all DNS requests to domains from a list such as this one. Then by using OpenVPN I simultaneously encrypt my connection when roaming remotely and I can specify to use my home DNS server to sinkhole ad/tracking domains.
Thanks for the suggestion. I think this will work fine in a home network that I can control. But this is not going to work when I am traveling and using my carrier's 4G network. Am I right? Is there any nifty solution to address the latter?
I am a little disappointed that I can't do something as simple as install plugins for my phone browser that can block sites.
I'd like to mention a problem with blocklists like this that you put into /etc/hosts. I've noticed that many sites trivially evade the blocklist by adding a redirect. I.e., if example.com is blocked, but it redirects to example.ru or example123.com or example.team, then it still works. The spammers and advertisers don't have to change all the existing links to example.com -- they simply need to add a new redirect every few weeks.
That's not how /etc/hosts works. The domain listed in /etc/hosts (example.com) will point to 0.0.0.0 (or 127.0.0.1)... you'll never even make it to the server, so you won't get the redirect.
Oops, you're right. I discovered that it was my browser that was "helpfully" adding www in front of lots of domains I had blocked in /etc/hosts. For instance, if I blocked example.com, my browser would automatically try www.example.com (which might then redirect to something else entirely).
In my case, I'm using Firefox. I can stop this behavior by setting "browser.fixup.alternate.enabled" to "false" in about:config.
---
[1] https://chrome.google.com/webstore/detail/umatrix/ogfcmafjal...
[2] https://addons.mozilla.org/en-US/firefox/addon/umatrix/