
Squid is just a god-awful piece of development

This is nonsense. I was one of the developers on Squid for many years, which makes me more aware of its flaws than most folks, and it does have many flaws, even in current versions. But, to say it is god-awful development is pretty deeply ill-informed, and insulting to the small handful of dedicated volunteers who currently maintain Squid (which, I might add, is used by an order of magnitude more people on an order of magnitude more servers than Varnish; it is the most popular proxy cache in the world, serving about 80% of the market last time I saw research on the question).

Squid has different priorities and goals from Varnish. One of Squid's priorities is to run on as many architectures and operating systems as possible. Another is to provide an extreme level of functionality, making it a Swiss Army knife of web proxy caching tasks. A third is being compatible with everything; and in the HTTP protocol world, this means dealing with thousands of edge cases and broken implementations. These things are not always consonant with being the fastest proxy, or with being the simplest design. Edge cases have a cost, which Squid pays because that's one of its goals.

Certainly, Squid has vestigial design decisions that cost it performance and efficiency on modern hardware (though the select/poll loop was the best we had until just a few short years ago, and the VM layer on many UNIXes is retarded; calling the way Squid does things 1975 technology is idiotic), but there have been improvements in a number of areas over the years.

Anyway, I would advise taking the flames of a developer on a dramatically less popular and less capable project with a grain of salt. There are interesting ideas to be found in Varnish and in its developer's comments, but his knowledge of Squid is clearly limited, and his opinions are clearly motivated partly by a weird bit of anger toward the much more popular Squid.




I wonder if Squid's priorities are now less useful than they were when Squid was designed, and if that's a source of the derision lately.

Many products designed in the 90s and early 00s took compatibility very, very seriously. Things had to run on everything, and a lot of engineering effort went into patching over the differences between platforms. So you got things like Java, autoconf, Squid, Apache, Linux, various GUI abstraction layers, etc.

Now, it seems like the pendulum is swinging towards performance, simplicity, and ease-of-use, and people are saying "fuck compatibility". Since so much software has moved to the server side, people are just standardizing on one type of server (usually some Linux flavor), saying "We'll develop for this", and ignoring everything else. And as a result, products that used to be essential because things really needed to run on everything are getting a lot of hate for their byzantine config options.

I see this most with autoconf, another project that has had lots of abuse thrown at it, but it seems like it could apply to Squid or Java as well.


I think that is extremely likely, and autoconf is an excellent example. autoconf hate is almost nonsensical when you realize what it replaced (though, I kinda hate it, too, because it's very confusing, but definitely not more than what it replaced...). In a world of people only building for Linux (and where Linux has mostly standardized to the point where building for one is the same as building for any other), it seems needlessly complex, but in the world of Solaris, AIX, SCO, Irix, Linux, FreeBSD, OpenBSD, NetBSD, and a bunch of others, it was simply necessary. If you didn't let autoconf handle some of it, you simply had to write your own crazy shell scripts to figure things out.

I think it comes down to younger folks not having experienced the pre-Linux world of the server-side (or at least the pre-Linux domination). There was a time when there were dramatic differences between the various UNIXen, and only something like autoconf could make an Open Source project build on anything approaching a majority of them.

I think a good example of a weird characteristic of Squid is its cache_dir directory layout. By default, it'll distribute files into a directory/subdirectory/ hierarchy. This doesn't really serve a useful purpose on modern systems (it doesn't really hurt anything, either, it just looks weird to modern users), but on some UNIXes back when Squid was being built there were dramatic performance effects from having huge directories full of thousands of files.
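
To make the layout concrete, here is a minimal Python sketch of how a numbered cache object might map onto a two-level directory tree. The function name cache_path, the example root /var/spool/squid, the 16 x 256 fan-out, and the hex naming are all assumptions for illustration; this is not Squid's actual code.

```python
from pathlib import PurePosixPath

def cache_path(root, file_number, l1=16, l2=256):
    """Map a cache file number to root/XX/YY/NNNNNNNN (hypothetical layout)."""
    # Each consecutive block of l2 file numbers shares one leaf directory.
    second = (file_number // l2) % l2
    # The first-level directory advances once every l2 * l2 file numbers.
    first = (file_number // l2 // l2) % l1
    return (PurePosixPath(root)
            / f"{first:02X}" / f"{second:02X}" / f"{file_number:08X}")

print(cache_path("/var/spool/squid", 65536))
# prints /var/spool/squid/01/00/00010000
```

The point of the fan-out is simply that no single directory ends up holding the whole cache, which mattered on filesystems where lookups in huge directories were slow.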

Likewise, the whole virtual memory argument that the original article talks about is complaining about something that was intentionally added to the Squid design in order to address the horrible characteristics of several major UNIX VM implementations (even now, having a good VM layer is not something you can just assume).

I feel a little bit of unease when I see projects that do choose the "we develop for this" model. I'm not sure it is a useful feeling, but I think we probably should keep our options open. I've made cross-platform a religion for most of my developer life, and it is perhaps less important now...but I don't think it's wise to just say, "OK, it runs on the latest Linux kernel. Fuck it, let's ship it. Screw everybody using anything else."


Now that DVCS has made branching and merging so much easier, maybe a middle ground approach to portability becomes reasonable: the trunk can support only Linux and people who want to run on other OSes can maintain appropriate branches. Like OpenSSH and OpenSSH-portable. (This also prevents minority platforms from externalizing their costs.)


Interestingly, Robert Collins has been one of the core developers on Squid for about a decade and also was one of the original developers on the Bazaar/bzr/arch project, and is currently employed by Canonical to develop Bazaar. And yet, to date, I think Squid still uses CVS. I think that's irony, or something.

In the past there were branches for oddball systems, like Windows. But they eventually merged into HEAD. I'm not entirely sure DVCS solves the problems that having multiple branches introduces. I'd need to see some numbers or something. OpenSSH isn't in a DVCS, is it? So they aren't a data point in either direction, though I wasn't aware of OpenSSH-portable.


OpenSSH is a bit odd, too. It was pretty much started by OpenBSD people, which makes the team more focused on OpenBSD; and the -portable version doesn't just make it compile on Linux/Solaris/..., but also adds PAM support and other stuff. Most teams are more heterogeneous, and most programs don't need to integrate that much with the OS.


I don't know, I always thought xmkmf was the more elegant solution.



>But, to say it is god-awful development is pretty deeply ill-informed, and insulting to the small handful of dedicated volunteers who currently maintain Squid

Not to poke at Squid (I'm far from qualified), but that's a horrible argument. There are plenty of extremely-highly-motivated people working on god-awful projects, believing they're being brilliant and revolutionary, when they're in fact re-doing something solved a decade earlier, built into the operating system they're using, and theirs runs 100x slower than the average naive implementation found by Googling.

Insulting it may be, but wrong it may be not.


Hold up...Which is a horrible argument?

I said it is deeply ill-informed. I said this after explaining that I was a Squid developer for many years. I was making a statement based on experience, not pleading a case. If you believe I'm incapable of determining whether a project is god-awful development or not, that's fine.

And then I said that it is insulting to dedicated volunteers. This was not an argument for why Squid is not awful development. I explained that later. I was simply stating that it's insulting.

And, as for this:

"when they're in fact re-doing something solved a decade earlier"

Yeah, in a lot of cases Squid is the "solved a decade earlier" example here.

"built into the operating system they're using, and theirs runs 100x slower than the average naive implementation found by Googling"

If you believe Squid fits this description you're really not qualified to comment on it.

As I said, Squid has many flaws, but it was not designed or developed by children or idiots. It was built by competent software developers who wrote many of the papers on the topics that it addresses, and invented many of the techniques that are now standard in proxy caching software. Varnish takes more ideas from the Squid developers than Squid developers could ever take from Varnish, whether he knows it or not.

Squid does have legacy and baggage. Squid also has capabilities that Varnish would never have need for (ICP, for example, cache digests, hierarchy features, etc.).

"Insulting it may be, but wrong it may be not."

And I said that it is both.


As an entrenched developer in a project, you likely have emotional baggage about your project. You're more likely to unjustly defend your own project. I probably should've included the used-by-80% portion as well, as it's part of the claim that it's not god-awful.

As I've said, I am not qualified to comment about Squid, nor was I. I was commenting about some / other developers and projects, not pointing a finger in any particular direction, and pretty clearly referring to extreme cases.


I haven't been a Squid developer since 2006. I'm defending it in the same way I would defend Apache, or BIND, or MySQL; and I'm doing so as a Linux/UNIX/Open Source old-timer that remembers what it was like before these projects existed, and realizes how much the developers of these projects have given us over the years. It is an institution for a reason (or a lot of reasons), and the people who built it and currently maintain it are deserving of a modicum of respect. And the software is deserving of an honest assessment of its flaws and strengths, rather than unbacked assertions of being "god-awful" or "1975 technology".


>But, to say it is god-awful development is pretty deeply ill-informed, and insulting to the small handful of dedicated volunteers who currently maintain Squid

PHK was obviously referring to a specific issue; Squid's implementation and usage of caching in its particular area of application. Varnish is obviously better at that. Squid is obviously better when it comes to portability and flexibility.

I'm going to think out loud here, and please take a minute to consider this, because I'll just try to be honest, not mean or elitist: Is it enough for my software to compile and run correctly on multiple platforms for me to call it 'portable', or should I take advantage of the characteristics of each platform?

I know that when it comes to my work, I choose the second, and I'm guessing that PHK would too. The "1975 programming" bunch, well, maybe not.


"Is it enough for my software to compile and run correctly on multiple platforms for me to call it 'portable', or should I take advantage of the characteristics of each platform? I know that when it comes to my work, I choose the second, and I'm guessing that PHK would too. The "1975 programming" bunch, well, maybe not."

Squid supports async IO threads on Linux, has experimental (maybe stable by now) support for epoll on Linux and kqueue on FreeBSD, it automatically chooses the best option for poll/select based on the platform it is building on, uses the best available malloc (and will use alternatives if available and the build system has a retarded one; it used to include dlmalloc in the source tree just in case no acceptable malloc was available but that's probably gone by now), can run as a service on Windows, etc., etc., etc.

So, Squid does both, in many regards. It not only builds and runs reliably on pretty much every platform you can throw at it, it runs better on platforms that provide the mechanisms it needs to run better (Linux and FreeBSD, in particular).
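
As a rough illustration of the "use the best mechanism the platform offers" idea, here is a small Python sketch built on the standard selectors module. Squid makes this choice at build time in C; this sketch probes at run time instead, and the names PREFERRED, best_selector, and wait_for_readable are hypothetical helpers, not anything from Squid.

```python
import selectors
import socket

# Ordered from most to least scalable; only some exist on a given OS.
PREFERRED = ("EpollSelector", "KqueueSelector", "DevpollSelector",
             "PollSelector", "SelectSelector")

def best_selector():
    """Return an instance of the first selector class that works here."""
    for name in PREFERRED:
        cls = getattr(selectors, name, None)
        if cls is None:
            continue  # this platform does not provide the mechanism
        try:
            return cls()
        except OSError:
            continue  # present but unusable, so fall back to the next one
    raise RuntimeError("no I/O multiplexing mechanism available")

def wait_for_readable(timeout=1.0):
    """Wait on one end of a local socket pair using the chosen mechanism."""
    selector = best_selector()
    left, right = socket.socketpair()
    try:
        selector.register(right, selectors.EVENT_READ)
        left.send(b"ping")
        ready = selector.select(timeout)
        return [key.fileobj is right for key, _ in ready]
    finally:
        selector.close()
        left.close()
        right.close()
```

The calling code is the same whichever backend is picked, which is roughly how one event loop can run everywhere and still run better on Linux or FreeBSD. Python's own selectors.DefaultSelector makes a similar choice for you.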

It can't trivially replicate the dumb (but very fast) pipeline from net to memory to net that Varnish uses because Squid has a dozen or more access points into the data passing through. It could certainly be cleaner and simpler, and there was a zero copy patch running around for a long time for Squid 2, which never reached stability (due to the huge amount of code you have to touch to make such an idea work). Maybe Squid 3 has done something about all that, I'm not sure.

Squid is not incredibly fast, but it's definitely able to take advantage of better and more modern platforms.


Fair enough, I was just going on the description given in the article. Yours sounds a lot more reasonable -- I should have checked it out first. That said, everything I said about optimization still stands (and my comments on the VM layer are still correct).


Ya but the major point of the article was to point out that certain things Squid does ostensibly to improve performance actually end up being counterproductive. Do you disagree with that?

According to your logic, since more people use Windows, it is clearly a better engineered piece of software than Linux, right?


The things that Squid does to improve performance do improve performance on many platforms and for many workloads. This is demonstrable, and was supported by evidence at the time the changes were made. Squid didn't spring up from nothing yesterday. It was developed over a decade by a number of very smart people.

Do they improve performance on the latest version of Linux in the particular workloads for which Varnish is designed? Probably not.

"According to your logic, since more people use Windows, it is clearly a better engineered piece of software than Linux, right?"

Where do you see me stating that popularity equals better engineering? My only comments about popularity were about the motivations of the Varnish developer in his criticisms of Squid.

But, I would argue that one of the reasons Squid is more popular than Varnish (and any other proxy cache) is that it works in so many situations and does so many things. It doesn't have to do with engineering, at all, but answering the needs of a broad spectrum of people. It also helps that it has reliably been answering the needs of a broad spectrum of people for over a decade.

Squid is a goddamned institution, and while there's plenty of room for other proxy caching tools, like Varnish, it pisses me off to see folks hurling pointless abuse at people I know to be extremely good software developers (two of whom probably have written an awful lot of code in places that everybody here uses on a regular basis...if you use Ubuntu, for example).


Isn't Squid like, 5 or 6 years older than any other open sourced proxy cache? That would make it more popular for the same reason that Apache is more popular than other open sourced webservers: name recognition? I remember that for a while, Squid was the only proxy cache I knew of, and the only one I had ever attempted to install.


Squid's ancestor (a caching component of the Harvest project) was the first web proxy cache, period. One of its developers (Peter Danzig) went on to NetApp and produced the NetCache, which was the first commercial proxy cache. The Open Source option predated the commercial variant by a couple of years.

There have been numerous proxy caches that have come and gone in those years, and there will likely be numerous that come and go while Squid continues doing its thing.

"That would make it more popular for the same reason that Apache is more popular than other open sourced webservers: name recognition?"

Do you really believe name recognition is the only reason Apache is the most popular web server?

It doesn't have anything to do with the huge array of capabilities that Apache has that no other web server has? Or the broad ancillary tools support? Or the huge pool of knowledge available for it? Or that it is proven reliable? Or that it can safely be expected to exist and still be a viable option in five or ten years?

But, to answer your question of whether I think Squid is popular for the same reason (or reasons, as I believe is the case) Apache is popular: Yes.

Squid is popular for exactly the same reasons Apache is popular. It is a reliable product that serves a wide variety of users very well. It is well-maintained and has a long history of being well-maintained. It is well-understood by a lot of people, so it's not going to be neglected or a source of contention if the IT guy moves to another job. It is fast enough for a lot of uses. It is extensible via a number of scriptable access points, so developers can easily make it do what they need it to do. And, despite the accusations of horrible design, it has proven itself to be quite resilient. The number of security issues in Squid, for example, has been truly minuscule in the past decade.

The funny thing is how many people are acting as though Varnish is going to somehow take the place of Squid as soon as people realize that Varnish is "better". For one small subset of problems for which Varnish is specifically built, it may make sense. But, for a huge array of other proxy and caching problems, Varnish isn't even a contender.

I should maybe explain that my first company was a web caching proxy company, building products based on Squid. I deployed Squid several thousand times over seven years. Varnish would have worked in maybe a few dozen of those deployments. Squid solves a ridiculous array of problems, even though it solves none of them as well as a purpose-built application could solve any one of them. Varnish solves its handful of problems very well...and does nothing for all the rest of the use cases.


"Do you really believe name recognition is the only reason Apache is the most popular web server?"

I think that that's a huge component of it. I know that I've been in a number of shops where they deployed Apache, not because it was the best tool for the job, but because it was the only tool of which the admins knew, and they didn't know where else to look.

I have no opinions about Varnish vs Squid, but I will point out that I had no idea what Varnish was until I read this article, whereas I've known what Squid was for a while, so they're not competing on equal terms.


eh, from a SysAdmin point of view, popularity and familiarity are reasonable arguments. Certainly not the only arguments you should take into account, but the ease with which I can hire people who know the system, and the ease with which I can use a search engine to find answers to my problems is very relevant. I can't say I've been using squid for 10 years constantly, but I've used it on and off for 10 years. Just about any reasonably experienced sysadmin has used it some.

Caching problems can be a little tricky, and having someone experienced with the caching system when something goes wrong can help a lot.

I'd say squid is a bit like NFS. It's not the best system possible in theory, but goddamnit, it's been tested to hell and back in systems much larger than I'm dealing with, and we know where all the problems lie.

(Also note, I believe Windows is only more popular on the desktop; Windows has never been a majority player in terms of webservers and other public Internet infrastructure, unless you count private intranets. in the public Internet server space, windows is fighting with sun and apple for the scraps. Linux is the big player here. )


"(Also note, I believe Windows is only more popular on the desktop; Windows has never been a majority player in terms of webservers and other public Internet infrastructure, unless you count private intranets. in the public Internet server space, windows is fighting with sun and apple for the scraps. Linux is the big player here. )"

No, you'd be surprised. Windows (running IIS) actually has about a quarter to a third of the public webserver market share: http://news.netcraft.com/archives/2010/07/16/july-2010-web-s...

ASP.NET is actually very popular.


huh. so it does have a reasonable lead on sun and apple. Still, according to that graph, IIS has something less than half the market share apache does. (I know you /can/ run apache on windows, I just don't know anyone who does.)



