To be fair, after coming back to C++ after years in the world of Python and Haskell, it's not as bad as I remembered.
C++ is actually a moderately effective functional programming language. With reasonable knowledge of the standard data structures, and using BOOST, one can actually write fairly expressive and effective C++ code.
Most of the time, I feel like all I'm doing is writing a much more verbose version of Python. The rest of the time, I'm hunting nightmarish segfaults and template errors.
The question isn't "Can C++ work?" or "Will C++ be faster?" because we all know it can work and that it will be faster. The question is "At what point does the extra development time become more expensive than just adding another server to my Ruby/Python/C#/whatever app?"
As someone who learned C++ in school but has been in managed environments ever since, I'd be too scared to use C++. I'm not ashamed to admit I get a lot of help from the Internet when I'm stuck on a problem, and that help is available because everyone else is using the same tools I am. Using C++ to code web apps puts you out in the wilderness as far as I'm concerned, and I just don't think I could do it.
And it's not just the dev time: with C++ (and C, and any notoriously unsafe language) you've got to be extra diligent about security, which takes even more time, and requires expertise not just in C++ but also in writing secure software in low-level languages... a skill many C/C++ devs lack, unfortunately.
But then much of that expertise carries over. Your Perl will probably still be faster overall than a beginner's Haskell, not for the same algorithm, but for solutions to the same problem. Knowing which algorithms perform well for a given problem solved in Perl is the most relevant part of expertise.
Chances are that the majority of developers won't find the most efficient algorithm to solve the problem.
(vs. the most efficient implementation of that algorithm... which is clearly going to be in a combination of C and assembly anyway).
Anyway, the guy using the good algorithm ends up with the fastest code. Different compilers will produce different quality of code.
Question is, if I am using C++ vs. a guy using Perl, will I ever reach the efficient algorithm? I might call it quits when I finally get something to compile and not segfault.
(Of course even this is kind of a strawman, because the Perl guy can just reimplement in C++ when he figures out that he wants more speed out of his good algorithm).
I don't get the constant complaints about the GIL. Letting your Python program run on 2 cores will make it 2x faster at best. Rewriting it in, say, Javascript or Lisp or Haskell or Java will make it run 2-50x faster on one core. After you get your 50x speedup, then you can worry about the 4x you'll get from buying 3 more processor cores.
(And oh yeah; it's only shared-memory concurrency that things like the GIL affects. If you have a job to do that wants to use 8 cores, split the job up into 8 parts and invoke your program 8 times. There's your 8x speedup.)
That's if you're CPU-bound. I don't use Python, but I made an image acquisition program in C++ which could be a relevant example. We wanted to save the images to disk in real time (30-60 FPS). Doing this in the acquisition loop would make the software unusable (the goal is video-rate confocal microscope imaging); it takes far too long, and much of that is just due to disk writes being slow, not to the compression time. Using a thread pool was the solution, not because of an actual increase in speed, but because from the loop's POV the write went from blocking to non-blocking, so the CPU stopped wasting time waiting for the disk.
We also wanted shared memory since there can be a lot of image data which is shared between the image compression & saving, display, and possibly statistics or filtering modules.
Dunno about Python, but Perl has a library to do all disk writes in a separate (p)thread, so your main control thread never blocks on IO. (This is in addition to the usual event-loop tricks; I know Python can do nonblocking IO that way.)
I can only tell you why _I_ am constantly complaining about the GIL. It's because I would like to use a Python/C combination for in-memory data analysis. C gives me the speed and memory efficiency and Python gives me the ease of use and the web stuff.
There is no 50x speedup to be had, as it doesn't get any faster than C. The only significant speedup will come from parallelism: 8 cores this year, 16 next year, and probably 100 cores in a few years. Since I'm holding a lot of data in memory I can only run one process, not many, unless I implement each and every data structure on top of shared memory, which I'm not going to do because it's unproductive.
I cannot use Java or JavaScript or any language that doesn't have value types (i.e. structs and arrays of structs) with a well defined memory layout. I don't want to use Haskell because my problem doesn't lend itself to functional programming as it's inherently stateful. I feel I would have to fight the nature of Lisp to make it use as little memory as C. It makes no sense to use Lisp when I need to know how lists are laid out in memory.
The only realistic options right now are pure C, pure C++ or C#. Go does have all the right properties as well. It's very immature at this point though.
I know this is possible. But I'm having trouble imagining a web app design (or other network server) based on that idea. In order to keep a lot of data in memory I would have to have a single Python process that gets called by nginx or Apache. So that would be a bottleneck even before I get a chance to call my extension. And later there wouldn't be much code left that executes in the Python interpreter, which kind of defeats the purpose.
But I have to admit that I haven't fully thought this possibility through. Maybe you're right that it can be made to work.
I guess my point was that the fast languages are generally not the ones that are good for developing the fast algorithms. It is harder to experiment in a language like C than it is in a language like Lisp or Perl.
The other thing to point out is that there is no reason for memory usage, concurrency or speed to be problems in dynamic languages. These are all issues with the implementations of compilers/interpreters that we are using.
It just so happens that dynamic languages have only recently come back into vogue, and we have forgotten (at least in Ruby and Python) all of the work that was done to create efficient implementations of dynamic languages.
Examples include how well Lisp stacked up against C as early as the '80s and '90s, projects like StrongTalk, stack-based languages like Forth... and the multitude of papers on efficient Scheme implementations.
Dynamic languages were declared 'slow' and therefore were dumped in favor of C by most programmers. This has caused a gap in the knowledge that we have about implementing dynamic languages. Which is a shame, because there is a lot out there for us to relearn.
There are some fundamental problems with making dynamic languages run fast. Being able to prove that some variable will never contain anything other than a 32 bit int allows the compiler/JIT to do things that it cannot otherwise do.
The only way to make dynamic languages as fast as statically typed languages is to selectively remove dynamic features. A few type hints can make a huge difference.
I'm now used to first-class functions, currying, etc. C++ actually has them, hidden in the STL and boost::function. Whatever I can't do with those, I can usually do by creating a family of classes that only implement operator() (I've only needed to do this once or twice, and it might have been avoidable).
Due to using a language with native dicts (and syntactic sugar), they are now part of my vocabulary. This means I immediately reach for std::map or boost::unordered_map when it makes sense.
Similarly, Python generators and Haskell lazy lists have made me view sequences as reasonable objects to generate and iterate over. Custom iterators are just the same thing with added verbosity.
If I never used higher level languages, I'd still be treating C++ as C with objects.
In C++, I'm more likely to write a struct/class which implements operator() than an actual function. It lets me write code like this, which does mutate state, but I put it off as long as possible:
I imagine that code would give a lot of people fits. In my current project, I ran with a more functional approach to C++. I don't know if I'll keep everything I've used in this programming style, but I certainly will keep some of it.
One nice thing: in a fast language, tests run quickly too.
Also, Ruby doesn't have the equivalent of an ASSERT, and while Python has an assert statement, leaning on it isn't standard practice (and -O disables it).
Also, C++ has standard hash lists, and complicated data structures are actually going to run quickly.
And while memory management can be a pain, you are doing it yourself so you can track down memory leaks rather than having them be inherent in the interpreter...
C++ doesn't actually have a standard "hash list" (I assume you meant hash map or hash set?). Depending on your implementation, you either have hash_{map,set}, unordered_{map,set}, both, or something entirely different. tr1 specified unordered_{map,set}, but tr1 is only technically a proposal, not a standard.
I think a safe assumption is that tr1::unordered_{map,set} will be available in all implementations by now, but until C++0x is ratified and implemented by major compilers, you will run into different platforms having (possibly) different implementations. And let's be honest - even after C++0x has been implemented in major compilers, you will still have subtly different/buggy implementations.
edit: I had conflated {hash,unordered}_{set,map}. Fixed
Nah, the committee isn't going to change unordered_set or unordered_map; they're struggling with far more urgent things, to finish the std off. Those have already gotten the full treatment, they have the usual std:: collection interface; what's to change?
Ruby on Rails is very gung-ho about testing, but running a hello-world test takes seconds. Anyone know why? This seems very odd. Is this one of those "do what I preach, not what I do" things?
I doubt it takes seconds to test a hello world program in Ruby. However, to test a hello world Rails application is a different story, as you have to start up the web server, which takes seconds (even though it eventually just serves a simple page).
You're right and I agree. But let's be careful not to equate Rails with Ruby. Rails tests may have a slow startup time, but Ruby tests do not and other Ruby web framework tests may not.
Because if "performance doesn't matter" is repeated often enough people actually start to believe it, and if enough people believe it, it becomes false. Is there a name for this logic? :-)
It's perfectly sane to write web services (computationally heavy processes accessed via REST) in C++. But for the algorithm-light, marketing-heavy frontend to such a service, use something easier—the optimizations C++ offers aren't worth it there.
I don't have any experience with it, but I'm under the impression that MSVC / C++.net has all the goodies you'd need to do the easy frontend stuff also?
Sort of - I don't think that the web templates and CodeDom are there like they are for VB/C#. It's also kind of a mess going between managed and unmanaged code, which you'd be doing extensively if you want to use the System.Web namespace. Managing references on the managed heap is an extra hassle that has a fair amount of complexity. To top it all off, the support for the managed C++ extensions has been lackluster.
I thought at first that the article was sarcastic :-| I mean, what's taking time is not whether you use Ruby, Python, or C++ but the HTTP request, requests to the DB, I/O, or JavaScript loading...
And anyway, the part that really needs to be optimized can still be done in C even if you use Python.
There was a time when I was a C++ guy who wanted to control memory and everything... but now I've got other things to do. If I can write one line that is more readable and costs less to type, why should I use C++?
And by the way, C++ isn't a verbose Python. And even if I once thought that Boost was the best thing ever made, I now feel it's a waste of time. Instead of using metaprogramming hacks to get lambdas in a clumsy/ugly way, why not simply use Python or Scheme?
What has not been mentioned at all: how C++ webapps will spell the end of XSS and SQL injection as hackers refocus on the much more interesting but almost-forgotten buffer overflow vulnerabilities.
Such vulnerabilities still exist in PHP, Python and Ruby, because they are written in C. And they are much easier to exploit because almost every web app uses one of these languages.
Nonsense. Just because the Python interpreter is written in C does not mean that you can overrun Python strings and smash the stack like you can w/ C strings.
I believe it does however mean that your application code has to be GPLv2, and thus can't be linked to code using some popular licenses, for example Apache 2.0.
For two to three years I was running a VPS at VPSLink which had about 64MB of RAM - for everything, and without swap space. The target use of this VPS was probably as an email server or something, as most people recommended the more expensive 128MB and 256MB plans for serving (static) pages. And actually there were only about 40MB left, since the OS needed some memory for itself too.
Personally I thought that with better resource management I could do much more, so I wrote a custom HTTP server in FreePascal that could fit in less than one MB of RAM. Most of my pages were generated offline using a custom FreePascal program.
The server could also execute CGI scripts, so I also wrote a forum in FreePascal.
According to my logs, the whole system ran out of memory only once :-). Until the day I decided to give myself a few more features (when I got a much better VPS from Linode) I had about 5-6 sites running (different domains), a Subversion server and a few "dynamic" apps.
The forum can be found here. I still run it on my new VPS, although it gets some spam. The a + b = ? anti-spam feature was new when I wrote the forum, but it seems bots have gotten better :-P.
Interesting to see that you used FreePascal! We are also experimenting with FreePascal and (fast)CGI. My colleague put up some sample pages: http://services.cnoc.nl/lazarus/index/fclweb
Let's say C++ code can be executed OVER 9000% faster than Python.
However, this does not imply that a web app written in C++ will run even 1% faster than one written in Python unless the performance bottleneck is code execution.
If the performance bottleneck is instead the database server (which it almost always is) then choosing C++ for your next webapp would be a _very_ masochistic premature optimization.
Way back in 2000, when we launched Planetarion (http://en.wikipedia.org/wiki/Planetarion), we used C++ with a custom webframework, and CORBA for communicating with the database.
At our peak late in 2002, we served about 320 million dynamic webpages a month using three desktop Pentium 3's for webservers, and a dual CPU P3 for the database. No caching, as that wasn't needed.
Blazing fast, and not all that difficult to work with once the basic framework was solid and in place.
He does have a point about efficiency, or about delivering a single app ... but you also get those advantages with a Java, or a .NET, or even an Erlang or Haskell app which are reasonably efficient ... and still, you won't have to deal with segfaults.
Also, if you want extreme scalability, like being able to serve 10000 requests/sec on a single server ... sorry, but raw performance doesn't cut it ... see this article for instance ... http://www.kegel.com/c10k.html.
Not to mention that the most usual bottleneck is the database (how many apps can you build that don't use one?). So even if you build the fastest web server in the world, if you're using an RDBMS you're going to end up with 100 reqs/sec, unless you're sharding or caching that data.
The bottom line is ... if you want extreme scalability, I don't think C++ is going to cut it, and you're going to invest a whole lot more in optimizations that are already done in more mature web frameworks.
Well, unless you have Google's resources and skill.
I thought so too, until I got to the part where he claims it's running rhymebrain! Still can't figure out whether he's joking or not. Naw, on second reading, definitely sarcasm.
The reason why dynamic scripting languages are more appropriate for web applications than C++ is simply that the bottleneck is somewhere else - namely, the Internet is slow enough to make the performance of the server-side code irrelevant. That can very easily change in the future.
Roundtrip latency is certainly going to add up to a substantial chunk of time, but that's not much excuse to discard performance considerations. Requests that take say 50ms of processing will take even longer if the box is busy; it doesn't take much to add up and becomes noticeable.
And if your implementation on the server side is very fast, you can do more.
My point was indeed that performance can matter, and will matter even more in the future, since no apparent technological limit on network speed has been reached so far.
If this guy is really running his web server on an Acer Aspire One 512, as he says at http://stevehanov.ca/blog/index.php?id=71 C++ sounds like an excellent choice.
Current embedded devices are likely to have at least 256MB of RAM, 4GB of flash, and as many MIPS as a Pentium III. Not so "embedded" any more, don't you think? If portability is an issue, then I am likely to use C with something like Lua. But I can't imagine for a second abandoning garbage collection before I see proof I can't afford it.
It's not exactly Facebook, but I've seen plenty of small network-connected appliances provide web admin interfaces. Routers are a common example, I've also used network console servers that do this.
For the alarm clock radio with iPod dock I just got for Christmas, I think the RESTful API would be easier to use than the array of buttons on that thing. My wife got the old, red-LED clock radio back out because she knew how to set the alarm on it.
Which is a very valid point, I have never understood why organizations that need massive scalability for simple, stateless, distributed apps would not pursue such a course.
I'm not sure if this falls under the robust category or... server :) but it supports Forth and has a total of 360 computers, each running at 700 MIPS, or 250 GIPS combined.
Web apps in Objective C, the second-least safe programming language on the market. Oh please, oh please, build your next huge application in this. College tuition for my kids is freaking me out.
That has got to be the dumbest thing I have seen in quite a while. "Smalltalk is garbage-collected. ObjC deals in raw memory addresses. It's actually less secure than C, as I see it" shows a level of fail that is beyond explaining.
I think he does a good job of humorously pointing out the difficulties involved with writing a C++ Webapp, but I don't doubt that he's serious about the performance characteristics being a good thing.
Why believe it's satire? He's got an actual app already deployed and a full history of blog posts demonstrating his knowledge and interest in this field.
The truth is, my company maintains a C++ web application. It's implemented as an Apache module and employs some very interesting in-memory shared structures.