While I think it's great that more and more developers are being exposed to building network applications using async I/O (which I guess is "evented" now) via Node.js, I think it's worthwhile to point out that the state of the art has moved well beyond this kind of callback framework. The reason is simple: it sucks programming in callback patterns on serious, large projects. You end up with lots of routines that are 6 or 7 callbacks chained together--and don't forget to attach error callbacks as well at each point.
In the Python world, for example, eventlet, gevent, and diesel (disclosure: my project) all use coroutines to achieve very high performance asynchronous I/O that's still written in a "synchronous" style. And before that, CCP Games has been using Stackless Python to do coroutine-based networking for the servers behind the very popular MMORPG EVE Online (http://www.tentonhammer.com/node/10044).
Even slicker, in languages like Erlang and Haskell (via forkIO + the select/epoll I/O manager), support for asynchronous networking using "blocking-style" code is a fundamental language/compiler/vm feature. When you write any code in these languages, you're transparently taking advantage of the same scaling characteristics node.js and its ilk provide.
So, I'd humbly suggest to anyone getting serious about doing this kind of programming that they investigate some of these systems (which are usually built by weary callback-style async I/O veterans) and leapfrog the rest of their peers. Consider Node.js a gateway drug.
Linkage:
http://eventlet.net/ http://github.org/jamwt/diesel http://en.wikipedia.org/wiki/Asynchronous_I/O#Light-weight_p... http://www.galois.com/~dons/slides/a-scalable-io-manager-for...
I've heard this before. "Oh, callbacks are fine for little things like this hello world demo, but for Big Programs, you need (threads/coroutines/etc.) because they're Serious Business."
That might be true. Maybe for Big Programs, something else is better. But I think the problem is that setting out to build a Big Program for Serious Business is Doing It Wrong.
The best frameworks (or, at least, my favorite frameworks) are collections of small tools with consistent interfaces and interchangeable parts.
If you can manage callbacks for a 100-200 LOC program, then you can build a Big Program out of such modules with a little bit of forethought. Instead of setting out to write a Big Program, why not figure out how to express your Big Problem in terms of a bunch of Little Problems, and then come up with a way for Little Programs to assemble?
Then, assemble the Little Programs to solve the Little Problems. There is no solvable Big Problem which cannot be broken down into some finite set of Little Problems.
Callbacks make it natural to solve problems in this manner.
I actively develop a several-thousand-line Node.js project. It's quite nice, actually.
I don't know how to completely address your post b/c it responds to assertions I didn't make. I said nothing about "Serious Business"--and bundling threads and coroutines together is like bundling horseshoes and bicycles. I made no defense of threads.
Callback style, within an I/O handler, is (almost always) a concession to the event loop. When I'm writing a program, I write:
routine():
    result = do_one_thing()
    do_next_thing(result)
    do_whatever()
With callback based I/O, you need the routine to end so the "reactor" up the call stack can get control again to do I/O on your behalf:
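(Presumably, the only way to dodge that and keep the synchronous shape is to have do_one_thing() call the reactor itself, running a nested event loop until its I/O completes -- a sketch, reusing the made-up names from above:)

routine():
    result = do_one_thing()   # internally calls back "down" into the
                              # reactor and loops until the I/O is done
    do_next_thing(result)
    do_whatever()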
.. but, there are (at least) two problems with this:
1. You're making the stack deeper every time you "call over" to the reactor... you will stack overflow eventually
2. In most languages, the reactor doesn't have a way to "call back into" your stack frame and resume it.
Coroutines are a way to "call over" to the reactor, have your stack state frozen (in heap space, without going deeper in the call stack), and then "unfreeze it" later on, when the reactor has completed the IO.
The greenlet package provides this for Python; Haskell and Erlang have native support for this. And, this is ultimately how you want to program. A(); B(); C(). Sure, there are exceptions for UI driven callbacks, and callback patterns aren't all bad, but typically in network IO cases, the callbacks are explicitly a concession to the I/O loop's need to be scheduled, made necessary by a missing language or vm feature.
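To make the freeze/unfreeze concrete, here's a minimal sketch with the greenlet package (the "reactor" and its I/O are stand-ins for illustration, not diesel's or any real framework's API):

from greenlet import greenlet

def handler():
    request = "read from socket 42"
    result = reactor_gl.switch(request)    # stack frozen here, on the heap
    print("resumed with: " + result)       # ...and thawed here later

def reactor():
    request = handler_gl.switch()          # run handler until it "blocks"
    # a real reactor would do the epoll/select work for `request` here
    handler_gl.switch("some bytes")        # resume the frozen frame

handler_gl = greenlet(handler)
reactor_gl = greenlet(reactor)
reactor_gl.switch()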
And: cleaner code benefits 200 LOC apps as much as it does 200k LOC apps. A good design is a good design at any scale.
function routine () {
  doIOForMe(function (resp) {
    doMoreIO(resp, function (nextResp) {
      transform(nextResp)
    })
  })
}
// or with a little helper function:
function routine () {
  chainTogether
    ( doIOForMe
    , doMoreIO
    , transform
    )
}
The "Step" library from Tim Caswell (creationix) lets you do a LOT of very interesting parallel/serial stuff. In npm, I have a few helper functions that make it easy to do serial "this then that" stuff.
It would be nice to have something in the language that does this. Maybe something for the CoffeeScript guys to check out. For me, plain old JavaScript is enough, I guess.
I think he's saying that the main event loop has called a callback, only to have that callback either call the main event loop or something similar that processes events until a certain condition is met, and that in turn may cause other callbacks to be called. Stack depth has increased. And it tends to lead to bugs, because sometimes unconsidered things happen in the midst of the first callback's processing.
Whether you like the model or not, the fact is that JavaScript with its callback/event-based model is what browsers understand. And the internet isn't going anywhere anytime soon.
Node is an attempt at using this successful model on the server too, where it can solve massive scalability issues using the exact paradigm front-end devs already know.
Also, while there are many technical ways to make code look blocking while really running other events under the hood, it's exactly this implicit running of "other stuff" that makes writing threaded code so hard. You have to assume that things can change between every function call, because you don't know if somewhere down the chain something is doing pseudo-blocking.
JavaScript's model is simple: you provide callbacks, and you know exactly at what boundaries things can happen asynchronously.
In summary, node is one way of doing it. We think it's a better model and it's proven itself in the browser. If you think another model is better, then by all means go for it!
But: I don't make my server-side language decisions based on what happens to be in the browser. To think that a language required for use in such a constrained environment is just _coincidentally_ also the best language to use server-side, where people have been doing event-based programming for decades, seems... unlikely. It would be as ridiculous as asserting that we should all use PostScript in the browser because that's the standard we've been using in printers for the last 20 years.
I suppose if you make the argument that browsers dictate that developers _must_ learn javascript, and that therefore, with Node.js, they can also program server-side without ever needing to learn a second language, that is a sound argument. I don't, however, know many good programmers who spend a career knowing exactly one language.
I never said JavaScript is the best language to use server-side. That would be insane. I'm just saying that it worked well for the browser (mostly because it was forced on us, but still) and node is an experiment to try the same thing on the server.
It's a lot better than writing C (which is what Ryan was doing before starting node), since JS has closures, anonymous functions and other functional niceties that make event-based programming much easier.
The fact that you can now code your server-side code in the same language and paradigm as your client-side code is a huge bonus, but was not the reason node was created. V8 is an amazing VM and it's a language that lots of talented developers know. Why not let them loose on the server and see what comes out of this talent.
Given the constraint of using JavaScript on the server, node is the best solution. If you don't want that constraint, then maybe Erlang or something else with no shared state is a better solution.
Javascript might not even be the best language for the browser. The browser programming model was set in stone years before significant applications were built in the browser. If we had a choice about the browser programming model, then it seems likely that better programming models would have emerged based on the experience.
I agree this is currently the best way of using similar models on both the client and server side. It'll be interesting to see if anything changes, on either side or both, when webworkers (browser threads) get more widespread support. At least it's likely that more direct comparisons of the programming models will be possible in the future.
I think developers get attracted to frameworks because of fun examples like the one linked in the posting. Until there are multitudes of interesting examples using other tech, I think it's going to be an uphill battle to attract significant numbers of developers to them.
Since many of the Node.js demos like this one are open-source it's much easier for developers to download working code and make modifications to see how everything works.
The technically superior solutions often get overlooked for the most accessible ones.
He's not talking about technically superior server-side Javascript environments, he's talking about languages and frameworks that provide more than just a callback model.
Coroutines bring their own headaches and baggage. To say that callbacks are behind the "state of the art" is a little misplaced; it's just a different way of doing things. With coroutines you have to worry about IO all over the place and have to make your functions coroutine-safe. And Ryan Dahl, the node.js creator, will argue with you all day long about coroutines vs. callbacks.
I do agree, though, that developers should not ignore other ways of doing things, just don't discredit the callback way of doing things as it can be useful.
"With coroutines you have to worry about IO all over the place and have to make your functions coroutine safe and Ryan Dahl,"
No, no, a thousand times no. Node.js partisans really need to start actually using Erlang or Haskell for a little while before spouting this canned line off. You do not have to jump through enormous hoops to deal with IO in Haskell or Erlang, it just works.
Callbacks are behind the state of the art. Coroutines, generators, and any other cooperative-multitasking primitives retrofitted onto languages that didn't support them from day one--languages with enormous sets of libraries that aren't "cooperative-aware"--are behind the state of the art, too.
This is all just "cooperative multithreading" again, and I have yet to see anyone explain why this time is going to turn out any different than the last time we tried cooperative multithreading.
It's really a question of how much state you're storing, and how you're dealing with that. Many languages and runtimes allocate stacks in megabytes and store every local variable in them.
With a callback-based system, stacks are short, and you explicitly carry that state. You KNOW what state you're storing, and you can see it easily.
There's some benefit to that.
And the pattern for aggregating functions is different: Receive a starting event, emit a done event -- you aggregate processes into sets of events, not into function calls. So yeah, it's not going to follow some of the same patterns that non-event-driven code follows, and event-driven code is going to look rather different than callback-passing code.
I think Ryan's right about coroutines, too: Coming back to vastly different state after a simple function call is basically going back to the days of using global variables.
Unless I'm misunderstanding you, you seem to be conflating purity with callback vs. coroutine--I don't know why the second would have any bearing one way or the other on the approach to the first.
If you don't want global variables, don't use them! If you don't want side effects, don't use them either. Those are the same decisions you make, callback or not. Both are great ideas.
function do_request() {
  var database_rows = get_from_database(argument);
  return make_html_table_from_rows(database_rows);
}
What about the second requires a vastly different approach to state? What about the io loop "up the stack" has fewer effects or global variables than the loop called into at the coroutine?
I can do the first two calls serially without either passing through the first, or using a global variable or some kind of catch all "context". The local frame is the context, which is the oldest, most straightforward, most tried-and-true context in the book.
You need to try Erlang. (Or Haskell, but Erlang is more approachable.) You and every other Node.js partisan keep making criticisms that simply make it clear that you have no clue what you are criticizing and that doesn't make it terribly likely that you're going to sway me to your point of view.
"With a callback-based system, stacks are short, and you explicitly carry that state. You KNOW what state you're storing, and you can see it easily."
Actually, no. You have implicit state carried around in the function closures and you will discover that it is very easy to have a leak in there that will be very hard to diagnose. I say this because I am speaking from experience.
(Remember, Node.js isn't a blindingly new architecture. The architecture has been around for over a decade, and I don't even know when the idea started. The only new aspect is that this time, it's Javascript. I've got a lot of experience with that architecture, and what that experience tells me is: never again!)
On the other hand, it is very easy to examine an entire running Erlang system, see every process and the exact contents of its stack at that point in time, and the exact amount of memory currently allocated to it, and since Erlang doesn't have any sharing between processes, that state is everything about that process. It isn't always the best about giving back memory if you have long running processes, but I was able to diagnose which processes were consuming my RAM, determine why they were consuming my RAM, and test out a fix for the excessive RAM consumption (since it came in the form of sending a particular message sooner rather than later), all without shutting down my server.
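(For the curious, that kind of spelunking is just standard BIFs. From the shell, something like this, where Pid is whichever process you suspect:)

%% every process in the VM, with its current memory footprint
[{P, process_info(P, memory)} || P <- processes()].
%% then drill into one suspect
process_info(Pid, [memory, stack_size, current_function, message_queue_len]).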
You do not have that level of introspection and visibility in Node.js. I don't even have to ask.
"Many languages and many runtimes allocate stacks in megabytes, and store every local variable in it."
I'm not talking about "many languages". I don't care about "many languages". I'm talking about good languages. Erlang can very easily allocate a couple hundred bytes to a process, or less. I'm not actually sure what the minimum allocation is, but it is certainly going to be competitive with a minimal Node.js handler.
"Receive a starting event, emit a done event -- you aggregate processes into sets of events, not into function calls. So yeah, it's not going to follow some of the same patterns that non-event-driven code follows, and event-driven code is going to look rather different than callback-passing code."
None of that appears to have any relationship to coding in Erlang, from what I can see. Better technologies don't have to have events. They just code right through things. A loop for a simple proxy might look like:
proxy(SenderSocket, DestSocket) ->
    case socket_read(SenderSocket) of
        done ->
            done; %% return to the original caller
        {ok, Data} ->
            socket_write(DestSocket, Data),
            %% go back for more
            proxy(SenderSocket, DestSocket);
        {error, Error} ->
            handle_error(Error)
    end.
I don't need to "aggregate events"; I just tell the system what I want it to do, and it does it, and I don't sit here and specify how to wire functions together. In Erlang, the above will not block any other process. If you don't want it to block your current process, that's easy:
spawn(fun () -> proxy(SomeSender, SomeDest) end)
Bam. Separate process and the current process can move on with life. No hooking up events. No code blathering on about how to interleave the events in that process with the events in this process. It's just happening. (There is standard library code to make things even more reliable, but going into the built-in supervisor stuff would take too much time. Also, it's hitting below the belt, no other language has anything quite like OTP.)
Erlang doesn't actually use coroutines, and Haskell does only upon request; coroutines for concurrency are just cooperative multitasking, and I mock them as well, albeit for a different reason.
You need to try Erlang. If only to know how to argue against it without arguing against some fictional language that doesn't exist.
The problem is that they do not even understand that it is ridiculous to compare someone's hobby project (actually a bunch of hacks - just read the source) with a well-designed (all the papers are available), battle-tested solution that is widely used in telecoms (not in browsers). ^_^
So, you're right - "It is Javascript". Same as for Clojure "It is JVM!"
Really? I recently wrote a framework to do networking with Lua. The network code itself is event based, but each TCP connection is handled by a Lua coroutine which makes it easy to write straightforward code such as:
function main(socket)
    io.stdout:write("connection from " .. tostring(socket))
    while true do
        local line = socket:read()
        local cmd = line and string.upper(line)
        if cmd == "SHOW" then
            socket:write("show me some stuff\n")
        elseif cmd == "PING" then
            for i = 1, 15 do
                socket:write(".")
                socket:sleep(1)
            end
            socket:write("\n")
        elseif cmd == "QUIT" then
            socket:write("Good bye\n")
            return
        elseif cmd == nil then
            return
        end
    end
end
Rather than socket:sleep(), you probably want to use socket.select to multiplex IO. (I'm assuming you're using LuaSocket, though that's not entirely clear.)
As with select(2) in general, this doesn't scale up past 100ish idle sockets - it has to do a full scan over all sockets to check which are ready for IO, and the latency eventually dominates. (Not a big deal for most uses, but problematic for web applications.) If that's not an issue, though, it's quite easy. Lua is very underrated, IMHO.
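(For reference, the select pattern looks something like this with LuaSocket; the port and the echo logic are made up, and I've skipped removing dead sockets from the list:)

local socket = require("socket")
local server = assert(socket.bind("*", 12345))
local conns = { server }

while true do
    -- select scans the whole list every pass, hence the O(n) cost
    local readable = socket.select(conns, nil)
    for _, s in ipairs(readable) do
        if s == server then
            table.insert(conns, (server:accept()))
        else
            local line = s:receive("*l")
            if line then s:send(line .. "\n") else s:close() end
        end
    end
end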
I'm working on a Lua library (octafish) for doing libev + coroutine and/or callback-based servers. It's been on the back burner for a bit, though - a couple other projects have been crowding it out. I'll put it on github once it's further along.
I'm not using LuaSocket but my own homegrown code based off epoll() (it was a learning experience in embedding Lua). It could probably be adapted to libev if I had the interest in doing so.
Ok. What you wrote would adapt to select and nonblocking sockets in LuaSocket pretty easily, FWIW. Using epoll / kqueue (directly or via libev/libevent) scales better, but you're blocking on the sleep, so it wouldn't matter.
Programmer mentality is somewhat different from other mentalities, but it's still bound by the laws of simplicity.
The technologies you are describing were already there, but it's not just a problem of technology, feature set or execution power. It's also a matter of simplicity.
Node.js is working because it gives so much power with a very simple approach. And by "approach" I also mean how simple it is to understand and to start producing something useful.
That's exactly the reason why we use more abstract languages, and that's the reason why Node.js is getting a lot of attention recently. If you don't know any language, do you think it's simpler to start with Erlang or Haskell, or with JavaScript? I don't have any proof, but my bet is surely on JavaScript. :)
And even the most obvious things are important. Maybe an uber-developer can ignore the small details, but most programmers aren't uber; they just want to develop easily and happily. Every little detail matters. Just see how many steps you need to install Erlang on your machine, and how many steps you need to do the same with node. It seems stupid, I know. But when you sum every detail... it matters. :)
At the same time, this gives power to the uber-programmers out there to bring on more cutting-edge solutions when they need them, and feed the "simple" level with their discoveries and experience, making the language and frameworks evolve.
If you're right, one day we are going to have simple coroutines - in the complex, environmental and social sense expressed above. Maybe even in Node.js, because in the end JavaScript 1.7 afaik supports yield and V8 could implement it in the future. ;)
First off, is everyone running your Linux distro? ("sudo pkg_add erlang" works for me, though.)
Besides, are you seriously implying that the hard part of getting started with Erlang is installing it? A lot of developers aren't willing to sink time into learning a language that actually has new ideas, you know? Hell, it's not even OO. :)
I definitely agree when it comes to the code scalability of something like Node.js (although I haven't built gargantuan javascript applications before, I'm going to assume it's not necessarily a pleasant experience).
Diesel looks great. When looking into alternatives in Erlang/Haskell/Scala, or XMPP/BOSH, either it's not easy to find quick how-to examples, or I'm looking in the wrong places. Any fingers in the right direction would be welcome.
The Glasgow Haskell Compiler (GHC) uses async I/O to implement all I/O actions (e.g. printing to stdout, opening a socket) and it's completely invisible to the programmer. Bryan O'Sullivan and I have rewritten the implementation used in GHC, and GHC 6.14 will offer much better scalability (using epoll/kqueue) than previous versions.
Just fork off a thread per connection (using forkIO) and call normal I/O actions. The implementation will multiplex those forkIO threads onto a single OS thread that calls epoll/kqueue.
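For instance, a toy echo server in that style might look like this (a sketch against the old Network module's simple API; error handling omitted):

import Control.Concurrent (forkIO)
import Control.Monad (forever)
import Network (PortID (PortNumber), accept, listenOn)
import System.IO

main :: IO ()
main = do
  sock <- listenOn (PortNumber 8080)
  forever $ do
    (h, _host, _port) <- accept sock
    -- one lightweight thread per connection, written in blocking style;
    -- the runtime multiplexes them all over epoll/kqueue
    forkIO $ do
      hSetBuffering h LineBuffering
      line <- hGetLine h
      hPutStrLn h line
      hClose h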
2. Provide sub-20ms interface switching response rates.
3. Protect the interface against extended network or server failure.
4. Buffer against poor network performance.
So I would agree with you that they have "managed to do without it" (code-sharing) so far. But they have done so only by avoiding responsibility for these goals, by considering them a convenience, or by not considering them at all.
Secondly, I can tell you from experience that a non-Javascript server-side framework can do the above but only with difficulty. Existing popular frameworks do not provide "convention-over-configuration" assistance in terms of straddling the network or exposing Model definitions to Javascript etc. They render interfaces on the server-side, rather than letting controllers and views interact on the client for example. Indeed it would not even be fair to expect these frameworks to meet the above requirements, simply because they were not designed to do so.
Eventually one must arrive at a realization that web applications (as distinct from "web sites") will mostly be written in the same language AND framework on both client and server. Fear not, this will probably be more fun.
I simply do not see why your goals 1-4 require a) the same language on the server as the client and b) the use of a non-blocking I/O model at the server. It appears to be a leap of faith on your part.
While marshalling model objects to JSON is obviously going to be easier if your model objects are Javascript objects, marshalling to JSON is trivially achievable in almost all programming languages.
To suggest that developers should - to achieve this microscopic payoff - choose to build servers using a language with a crippled concurrency model (i.e. none) is laughable.
Re: "I simply do not see why your goals 1-4 require a) the same language on the server as the client"
Have you ever done 1-4?
Re: "marshalling to JSON is trivially achieveable in almost all programming languages"
Code sharing and JSON serialization/deserialization are not the same thing.
Re: "To suggest that developers should - to achieve this microscopic payoff - choose to build servers using a language with a crippled concurrency model (i.e none) is laughable."
Ships will sail around the Earth but the Flat Earth Society will continue to flourish.
Make it funnier, show us a maze and people trying to solve it in real time. Replace the mouse pointer with a colored token and constrain its movement to the walls of the maze.
This was a startup idea I had when I started Mibbit (I decided on balance webchat had more potential than multiplayer web games).
The idea was 'mouse games'. I had a similar setup where you could see everyone's mouse cursor. I planned a whole load of multiplayer games where you could chase each other, complete puzzles, click on other people to kill them, draw circles round other players to trap them, etc etc. You could have messages show up like "First to draw a square gets a point", "Move to the left, get a point", "Move as far away from everyone else as you can. Most isolated player gets a point". You could have objects being thrown at cursors, and you have to dodge them. Tons of potential to be a fun way for people to waste time :)
* Come back regularly to play it?
* Pay to play it?!! :/
* Click on ads? (If the creator were evil, they'd confuse the users into clicking on ads, making them think it's part of the game.)
As with everything, it depends on the implementation ;)
I think it's probably tougher to monetize a web game, but flash game creators seem to be doing moderately well with advertisements that run before you can play the game.
There's an interesting hook in what you're talking about: I could IM a friend and say "hey jump on here, let's play together". It's much more social than most web games out there which is intriguing.
In general most web pages feel like you're there by yourself. The model of seeing many other people there at the same time is novel enough right now that it should be a little easier to get people to tell each other about it.
everybodyedits.com is very close to this; anyone can create a level and then you can play while you watch everyone else play at the same time. He's got $500 in donations so far I think.
Extremely cool demo. It would be interesting to see this applied to analytics, so you could see exactly what people are doing on your site in real time. Although that would also be a bit creepy.
I am thoroughly impressed, and excited for the prospect of unimaginably awesome web applications. I do have a few thoughts:
- How easy would this be to implement in another event-driven server, such as Tornado?
- I have an identifiable "mouse ID" in this app. Is this secure? Surely the security aspects of web sockets have been key in their design, but I am left with a bit of uneasiness with my browser opening a TCP connection to another server. I believe WebSockets will pave the way for a whole new class of web-based attacks, and possibly even highly mobile worms.
I'm wondering if Javascript will start being heavily used server-side? Would it be wise to start learning JS and how to use Node.js or is it absolutely not suitable for production use?
Hi, author here. While building this thing, the Node server had the tendency to drop out. I'm monitoring it with God (http://god.rubyforge.org/) now, so it restarts whenever anything goes wrong.
I'm genuinely impressed by how it's holding up now. :)
It's a strange mix of disappointment and excitement to find out someone has just released a better, finished version of what you were working on. Oh well, I guess I'll finish it anyways.
It has mutated quite a lot since I started... I'm taking the positions of everyone's mouse and feeding them to Fortune's sweep line algorithm to make Voronoi diagrams. I've made it look like you are a cell squeezing through layers of tissue.
http://uncc.ath.cx/applets/Voronoi_Standalone/
The problem here is that to bind the websocket to port 80 they would need to code their whole site in node.js (with no reverse proxying from nginx and the likes), or at least write a reverse proxy in node.js and use it as the front end.
It's not hard to serve both regular http requests and web socket requests on port 80 on the same host. Just use a special context path for web socket requests, for example /websocket/*
Some of the node.js websocket impls let you do this easily while others don't. Don't use the ones that require a separate port. Nobody behind a corporate firewall will be able to use your app if you do.
I think it is possible to have a ws server listening on the same port (and even path) as an http server, hence it shouldn't really matter.
And if you really care, you could make an HTTP forwarding proxy with node to forward normal HTTP requests through to your app if it's written in something else, but deal with websockets itself.
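(A sketch of the single-port idea with plain node: the http server emits an 'upgrade' event for WebSocket handshakes, and everything else flows through the normal handler. The /websocket path is just an example, and the handshake details are omitted.)

var http = require('http');

var server = http.createServer(function (req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('regular HTTP response\n');
});

// WebSocket handshakes arrive as HTTP Upgrade requests on the same port
server.on('upgrade', function (req, socket, head) {
  if (req.url === '/websocket') {
    // hand the socket off to your WebSocket library's handshake here
  } else {
    socket.destroy();
  }
});

server.listen(80);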