The Way of the Gopher: Making the Switch from Node.js to Golang

forgettableuser · on March 27, 2016

I'm disappointed by the overall tone here. It's as if this were some kind of failure that everybody is piling up on.

- She made three passes to fix the existing code base.

- After that didn't work, with the CTO's involvement, this Go solution was devised.

- It took only 2 weeks to ramp up with Go and build a replacement that solved the problem.

- The number of instances needed dropped from 4 to 2

This sounds like a resounding success.

So congratulations Alexandra and Digg. And thank you for sharing. The article was well written and I learned some things about how the Node.js event system works under the hood.

rgbrgb · on March 27, 2016

I think the disappointment here is that the OP seemed not to understand the root cause of the problem and implemented a solution in a different language (load balanced worker pool) that could have also worked in the original language without a total rewrite. Then the new language is trotted out as the savior. It sounds like the file fetch workers and request handlers were running on the same process, so the longer-running workers ended up blocking requests. The way it's discussed, it seems like pure luck that they stumbled on a solution that solved this (running the workers on a separate queue/process).

I totally agree that it's impressive that it only took 2 weeks to build the Go version, it just seems like it would have taken 2 days to try the worker pool implementation with Node.js.

jksmith · on March 27, 2016

Good point. Let's consider another side to that with just some general observational comments. Golang seems to have a habit so far for revealing answers for devs with varying skills for internet applications that other languages don't.

I've moved from writing code (including golang) to managing large projects. My biggest concern now is meeting the three metrics of project ecstacy: 1) correct solution, 2) on time, 3) within budget. If one language appears to get me better performance on these metrics over another, then I'm interested, whether that language has generics or not.

Another concern from the business perspective is whether or not a language is easy to hire for, and gets devs more productive in less amount of time without creating a lot of technical debt in the process. I have a gruesome time dealing with this very issue, and if golang was the basis for my toolchain, my guess is my hiring concerns would probably be minimized to enough degree that it would have a positive impact on my business -- looping back around to metrics I mentioned.

Maybe the solution should have been kept in js, but I guess it wouldn't surprise me if the golang effort these folks just went through will probably continue to pay off in a substantive way.

conetwo · on March 27, 2016

Yes, the root cause analysis feels off. (Though the cause might be as simple as the cost of walking through the message queue.)

Something is definitely blocking or resource constrained and causing "thrashing": the (uncontrolled) number of requests allowed to spin up at a time (which creates resource contention) combined with the fan out (1 request = 100+ S3 requests/callbacks) seem like a likely causal factor. As you said, a worker approach (with limits on the number of concurrent requests) is going to be similar to the golang approach used.

The golang approach makes the average execution time of a given request more consistent but the overall wait time may still increase (dramatically) if the arrival rate grows too high. (Classic problem).

Say "easily" fixable by adding servers? Partly true. What happens if the S3 calls slows down dramatically?

daveroberts · on April 5, 2016

The first result from Google for nodejs worker pool leads to this project: https://github.com/jeffmo/node-worker-pool

The project has only 9 stars on GitHub. While it might be a perfectly fine solution that fixes OP's problem, it certainly doesn't inspire confidence as a battle-tested production-ready library. On the other hand, worker pools and Golang go together very naturally.

pacapaca · on March 28, 2016

This.

The author just doesn't know how to approach the problem and copied a solution, which happened to be in go.

Nothing wrong with that. Just don't try to pass it a success story or as an excuse to bash a language.

zaphar · on March 27, 2016

But using a process pool is far more heavyweight than threads. There is a cost there that you pay in hardware. Using Go allowed them to do this with less hardware cost than Node. That's a win that continuing to use Node would not have been able to provide.

ilaksh · on March 27, 2016

Node does actually have threads with modules like webworker-threads.

ysolo · on March 27, 2016

I've been coding in Node for a couple of years. I find it interesting. But I'm aware that the async model of Node inverts the flow of control, turning code inside out and makes application logic difficult to scrutinize and reason about. Stack traces in traditional multithreaded languages are easy to understand, whereas in Node a stack trace is necessarily filled with unrelated calls - or perhaps no stack at all in the case of next tick deferred callbacks - due to the single-threaded nature of the model. This necessitates passing around context via closures (or god forbid global variables!). The "context" in a threaded language is typically just the stack. With Node's async Promises and equivalents we can trick ourselves into simulating linear program flow, but it's a pale imitation of the real thing.

This got me thinking - why did the single-threaded "green" threading programming model like how the Java JVM was initially implemented all those years ago fall out of favor? It would be as performant as Node presently is but without the difficult async logic for the end user programmer. Async programming and green threads are duals - both use event queues behind the scenes to drive the scheduling. The complexity of the actual I/O and timer event callbacks is hidden from the programmer in green threads. The pseudo "thread" context switches would only be done upon blocking I/O. No need for actual mutexes as the program is still actually single threaded. Stack traces of green threads within a program would be easy to decipher. Computation on a green thread would also block all I/O as it would in Node, but we'd know that going in - no different from Node in that regard. Just the programming model would be simpler.

Vendan · on March 27, 2016

The programming model you are describing is almost exactly the model Go has, it just takes it a step further and allows N:M "green threading", so you can use all your cores.

_ak · on March 27, 2016

It is exactly Go's model. Go, since the very beginning, has a N:M threading model. runtime.GOMAXPROCS() resp. $GOMAXPROCS is your friend to control the amount of OS threads, and since 1.5 or so, it is by default equivalent to the number of CPUs visible to the Go runtime, anyway.

Vendan · on March 28, 2016

Well, he was describing a N:1, not N:M: "No need for actual mutexes as the program is still actually single threaded", but yes, Go has been N:M from the start.

diroussel · on March 27, 2016

I would guess it was the immaturity of the java IO layer that didn't play well with green threads. You need a good asycn IO layer on all the platforms that java was supporting (solaris and windows, plus others) for the green threads to work well.

tankenmate · on March 27, 2016

Node's javascript engine v8 can only run in one thread, if you want to access it from a different thread you need to take a lock (for example in Chrome / Chromium you need to take an isolate lock; each page runs it's own isolated v8 engine (read only code pages are shared, all other javascript pages (data plus read/write code pages) are private); in this sense it is very similar to the GIL in Python. So although node/v8 allows for concurrency it does not allow for multi-threaded parallel execution.[0]

Because of the way tasks are scheduled in node.js you can have head of line blocking; i.e. a task that is scheduled to run first can block all the subsequent tasks, the issue that the OP suffered from. You can sometimes reduce this by calling process.nextTick(), but you need to remember that it is only IO is non-blocking in node.js.[1]

[0] https://stackoverflow.com/questions/14409609/does-the-v8-jav...

[1] http://greenash.net.au/thoughts/2012/11/nodejs-itself-is-blo...

ysolo · on March 28, 2016

Since Google changed the V8 API to mandate use of Isolate pointers passed in to V8 functions as a parameter rather than implicitly using thread-local data for engine storage it is now possible to run separate V8 instances within the same thread (one at a time), or from different threads as they do not share state. One of the many issues Node has to contend with in using the V8 engine is a perpetually changing API surface.

ilaksh · on March 27, 2016

See webworker-threads

cunac · on March 27, 2016

I don't think java threading fall out of favor. It is just that people jump to new shiny things without understanding what they are trading for. I mean 200/req per minute is not something high to start with --> 3req/sec is now days high load ?

ilaksh · on March 27, 2016

    let result = await call();
    nextCall(result)

Not really a pale imitation, works just as well. The async and await keywords could be shorter. Toffeescript and Livescript are more concise than the new ECMAScript with their operators for async but it's just a little more text to read.

firasd · on March 27, 2016

The crux of the difference seems to be, in the Node.js service:

"when any request timeouts happened, the event and its associated callback was put on an already overloaded message queue. While the timeout event might occur at 1 second, the callback wasn’t getting processed until all other messages currently on the queue, and their corresponding callback code, were finished executing (potentially seconds later)."

And in Go,

"the service has a worker pool and a delegator which passes off incoming jobs to idle workers. Each worker runs on it’s own goroutine, and returns to the pool once the job is done."

awjr · on March 27, 2016

Node.js has cluster to solve this issue https://nodejs.org/api/cluster.html#cluster_cluster

Sanddancer · on March 27, 2016

Which brings us back to 2001 and Apache's prefork model. Threads are pretty battle tested at this point, and allow for different reasoning about the concurrency of a problem. The problem with node is that it pretty much only allows for one type of scaling if you start running into performance problems. Go gives you a much bigger toolbox to work with if, much like the post's author discovered, your problem set does not match up with the answers node provides.

qaq · on March 27, 2016

All is well until you have to deal with SEO for SPAs :) and then there is little choice other then node. https://github.com/audreyt/node-webworker-threads spawn real threads with web workers compatible API.

lenkite · on March 28, 2016

This is the first solution that a boring 'enterprise' Java programmer would have used and got on with other work.

maleck13 · on March 27, 2016

Just wanted to add my thanks to the author for writing this enjoyable piece. After working with node for 4 yrs & golang for 1yr, I think they both have their place. If forced to choose one, I would currently pick golang due to its excellent tooling, strict typing, excellent std lib and simple deployment.

kjksf · on March 27, 2016

For other stories of X => Go rewrites, I started keeping a list at https://quicknotes.io/n/vzb7-rewritten-tales-of-rewriting-s

feylikurds · on March 27, 2016

That list can easily be made longer. Off of the top of my head I know that there are many more projects which have switch languages to Go. What they say most of the time is (from the article):

Since our transition to Golang we haven’t looked back. ... we’ve begun the process of modularizing our code base and spinning up microservices to handle specific roles in our system.

kkowalczyk · on March 27, 2016

I agree, it's a work in progress.

If you have more links, I'll be happy to add them.

I have a much longer list of companies that use Go (https://quicknotes.io/n/1XB0-companies-using-go) but it's harder to find descriptions of rewrites than find out who's using Go.

iMiiTH · on March 27, 2016

You can add The New York Times to that list, although they rewrote from PHP, Python, Groovy, Java and C++ so no idea how it'll fit in that list.

https://www.youtube.com/watch?v=bAQ9ShmXYLY

BogusIKnow · on March 27, 2016

When you switch from Node to Go, two languages that are totally different in style, library support, performance, compile/web turnaround, client/server reusability, I always wonder why you chose Node in this case in the first place.

pron · on March 27, 2016

For the exact same reason they chose Go now, and the reason they'll switch to something else (much better!) in two-three years: because that's the current fashion.

Nothing says "I'm keeping up with my professional development" than relearning IO libraries and build tools, and rewriting in-house code, over and over again, every time in the new langue-du-jour.

zappo2938 · on March 27, 2016

Meanwhile Facebook is enjoying working with a large PHP code base. When asked they say they can write code that takes time to compile that runs fast in C++ or write code fast in PHP that runs slower. Writing features faster gives them they edge. They have an engineering team that does nothing but optimize PHP. For them an optimization is an acre of servers saved. They if something needs to run blazing fast can write code in assembly language.

What I'm told about Node is the JavaScript is running in a single thread but all I/O is being passed to native code which is written in C++ and running on multiple threads. It does use all the CPUs.

partycoder · on March 27, 2016

Blocking the event loop in node can cause severe issues. If you are juggling with 5 sticks but one stays in your hand for 1 second, your juggling will certainly fail.

Instead of flamegraphs I use nodegrind, which I can use with KCachegrind/QCachegrind.

KirinDave · on March 27, 2016

I think this is great engineering, and it shows that the Node runtime needs to consider how to actually address the needs of developers rather than smugly pointing to really frustrating solutions like Cluster that just don't provide equivalent functionality to what Go (and its predecessors in this style such as Erlang and Haskell) are providing.

There are actually a whole host of languages that would let you approach the problem with this technique and get there, so the only thing special about Go is that it is syntactically simple (although it's conceptually quite hard to build reusable code because of poor generic support the resulting bad error handling), so it makes for an easy migration.

We could see similar stories about going from Node.js to Erlang, Nim, Rust, F#, C#, Haskell, or even the newer Python & Ocaml runtimes. So it's not like Go is special here. But it's timing is special, because it's a harbinger of the industry's slow and reluctant admission that pretending everything is single threaded is okay. Even I/O multiplexing with coroutines simply cannot keep pace with even casual demands on modern infrastructure.

nissimk · on March 27, 2016

Originally, the web server was intended to be a very simple program with most of the complications in the browser. Most web app backbends are similar to this where all they do is query a database, cache or other web servers. This type of thing is perfect for node. If you are doing more complex stuff in your backend then go may be more appropriate.

cryptica · on March 27, 2016

I have read so many articles which incentivize switching from Node.js to Go and every single one of them (this one included) blabber on about the Node.js event loop becoming congested - This is completely misguided - It only shows that the engineer didn't understand the problem.

A single Node.js instance runs its business logic in a single process. A single Go instance can run its business logic in multiple processes. Of course if you're comparing a single instance of Go running on 4 CPU cores, it's going to be faster than a single instance of Node.js running on 1 CPU core.

But the thing the author fails to mention is that there are a lot of Node.js modules which help you easily scale across multiple CPU cores. E.g. PM2 https://www.npmjs.com/package/pm2

Had the author of the article actually understood the real cause of the problem, he would have saved himself and his company a lot of time.

Nodejs is still outpacing Golang according to Google Trends and yet I have never seen a single article about switching from X language to Node.js. My guess is that the Go community probably wouldn't exist if it wasn't for this constant form of aggressive propaganda/marketing. Node.js on the other hand doesn't need any marketing/propaganda; it just sells itself.

If you have a scalability problem with Node.js (or any language for that matter) and you can't find a simple solution that doesn't require rewriting your whole app, you're probably not qualified to manage this project at scale - Because scaling across multiple CPU cores is kids' play compared to scaling across multiple machines. Heck, my grandma could scale this thing across multiple CPU cores.

Personally I think Node.js is great because it encourages you to think about these problems sooner rather than later.

spriggan3 · on March 27, 2016

> Node.js on the other hand doesn't need any marketing/propaganda; it just sells itself.

If Node.js used language X,Y,Z instead of Javascript, it would certainly not have been the success it was. "Javascript on the server" was the marketing/propaganda . Nodejs came at the right moment, Js was exploding, websockets were exploding, coffeescript and co were exploding in popularity.

What Go does better than Nodejs is a better standard library, true concurrency, static typing. With Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code. Writing callbacks is tedious, promises are tedious, co routines with yield need plumbing and async isn't in the spec.

> you're probably not qualified to manage this project at scale - Because scaling across multiple CPU cores is kids' play compared to scaling across multiple machines

With Node.js any heavy computation needs to be launched in a separate process, which is not the case with Go. sure you can have one/2/4/8 process dedicated to heavy computation, but it will still be the bottle neck. In the end you'll have to use a different language in order to solve the single threaded problem.

Finally there is no mystery as to why some businesses are moving from Node.js to Go: They didn't need anything Node.js provides at first place. They didn't need to use javascript, they didn't need to write "universal" applications with the same logic running in the client and in the server, they didn't need single threaded async programming.

Go is not a silver-bullet, but I was a bit sick of having to write a callback every time I wanted to make an http request, just to prototype something.

Lukasa · on March 27, 2016

I agree with almost all of your post except this:

> With Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code.

No, it can't, and pretending that it can is misleading in a way that allows large teams of developers to cause themselves real problems.

The only sense in which this is true is that you can write a Go function with a simple signature like this:

    func DoSomeStuff (error) {}

And the caller has no idea whether this involves concurrency under the hood (e.g. the function can spawn its own goroutines and channels as it sees fit).

But, make no bones about it: this function blocks until it completes. This means that while this function may do concurrent work it is absolutely a blocking function. This is fine: sometimes blocking functions are good. But you cannot write a non-blocking Go function in the same way.

To make a function non-blocking you can return a channel out of it for the return value to appear on, like this:

    func DoSomeStuff (chan error) {}

In this model the caller really doesn't have to care whether the function is synchronous or not (though the fact that it was written this way strongly suggests that it is going to return asynchronously, or at least that the developer believes it will have to in the future).

Except...that return value just there? That's a Future. It's a terrible, half-implemented version of a Future, but that's exactly what it is. It's a promise to return some kind of result at some point when the underlying process has returned.

And if you don't want to block your current goroutine, you cannot block on that channel receive either. That means that you need a callback. There are two patterns for doing that: you could have some kind of central loop that selects over all channels like this and calls the callback functions (boy that looks a lot like Node's event loop, doesn't it!), or you can manually spawn your own callback functions in their own goroutines. Either way, you have callbacks and futures here: you're just building them yourself and calling them something different.

There are lots of good reasons to switch to Go: it's a language that makes lots of developers remarkably productive, it has an ingrained philosophy of building concurrent programs, it's pretty damn fast, and it runs on all kinds of awesome platforms. But claiming that Go has learned something magic and new about how to write concurrent software in such a way that you don't have to care whether your code is async or not is just not true: you always have to care.

agentS · on March 27, 2016

> But you cannot write a non-blocking Go function in the same way.

The caller can make function "non-blocking" by wrapping the call in a goroutine themselves. (There's some subtle differences, but they are mostly irrelevant here). For this reason, I'd say there is (almost) no reason to introduce asynchrony in your API in the way you suggest. The rest of your post built on this example seems shaky to me, since it seems built on an example API that doesn't need to exist.

I'd say that "you don't have to care whether your code is async or not" is a overstating the case. I would append the qualifier "unless you're introducing concurrency". Considering that almost no low-level APIs are asynchronous, this usually happens rarely (or happens in low-level code like the HTTP server). Examples that have come up for me: making N parallel RPCs, writing a TCP server. In those situations, you care about async vs not.

In event-loop based systems, it seems like async is in my face all the time, even when doing things that are entirely sequential.

Lukasa · on March 27, 2016

> The caller can make function "non-blocking" by wrapping the call in a goroutine themselves.

Sure, but if they want the return value then either they need to construct the Future-y wrapper I just described or they need to assemble it together in a collection of other function calls wrapped inside a function that itself is either Future-y or uses a long-lived channel to communicate results.

It is not novel to build up a non-blocking system from purely blocking method invocations. We've been doing that for years: it's called threading. Doing things this way has many advantages when written with appropriate diligence, and I'm not pretending otherwise. However, if you actually care about communicating between these arbitrary threads of execution than you either need Futures or queues (both of which are essentially just channels in Go), and at this point you've got the exact same problems as you get in NodeJS or any other asynchronous programming environment.

> The rest of your post built on this example seems shaky to me, since it seems built on an example API that doesn't need to exist.

I don't think that's fair: as I mentioned above, the fact that you as library author would not write the Future-y extension doesn't mean that the Future-y extension isn't built: you just force your caller to build it. That's fine, it's a perfectly good architectural decision (probably you should't be making those decisions for your user), but it doesn't remove the problem.

> I'd say that "you don't have to care whether your code is async or not" is a overstating the case. I would append the qualifier "unless you're introducing concurrency".

Sure. The thing that matters here is that Node is always introducing concurrency, because Node is concurrent. This is why all Node programs have to care about concurrency: they are all concurrent because their system is concurrent.

This is desperately inconvenient for many one-off programs, which is why I personally don't use Node for anything like that: I'd much rather use Python or Rust or Go. But that was never my argument. My argument was about OP's assertion that "with Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code. Writing callbacks is tedious, promises are tedious, co routines with yield need plumbing and async isn't in the spec."

The first sentence is dangerously misleading (while technically true, any system that does that is usable only in that one context), and the second one misses the point, which is that those things get effectively built anyway in any moderate-scale concurrent system in Go.

But my biggest point is this: Go isn't magic in regard to concurrency, and there is a weird amount of magical thinking around Go. Go is a very good language with a lot to like, and I like it quite a lot. But when boiled down to it, Go's concurrency model is threads with a couple of really useful primitives. And that's great, and it works really well. But it's not new or novel.

The sentence "with Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code" is equally true if you replace "Go" with "C", or "Python", or "Java", or any language with a threaded concurrency model. There's no magic here. It's the same building blocks everyone else is using.

vishbar · on March 27, 2016

Go isn't just a threaded concurrency model, it uses an M:N greenthreads pattern. Also, when you say that Go I/O operations are blocking, it is true that they'll logically block a goroutine. However, under the hood, it uses the same libuv-style async IO (or IOCP on Windows) that Node does. An operating system thread doesn't get blocked; the goroutine is "shelved" and woken up again when the I/O is complete. It accomplishes the same kind of thing as Nodejs does, it just abstracts the async nature of the IO away from the programmer. I have to say I like it: procedural execution is easier to reason about.

Lukasa · on March 27, 2016

Honestly, I think the distinction between M:N threading and straight OS threading is pretty minor. It grants some advantages to the language runtime: it can control the stack size, for example. But in terms of how it affects the development style and what kinds of bugs it encourages/discourages I don't think it dramatically differs from the OS threading model.

sagichmal · on March 27, 2016

It is categorically different. OS threads are orders of magnitude more expensive, which makes them a nonstarter for most problems that are a good fit for lightweight conceptual concurrency.

Lukasa · on March 27, 2016

It is not categorically different.

As I said above, green threading has advantages over OS threading, but they behave exactly the same in terms of design patterns and potential bugs.

This is what I was getting at when I said "not that different": compared to the difference between event-loop concurrency and threaded concurrency, M:N green threading is basically just a subcategory of threading.

sagichmal · on March 27, 2016

No. Green threads enable entirely new classes of design patterns that are categorically infeasible with OS threads.

Rapzid · on March 27, 2016

I believe that depending on the system call the thread handling the call could block, but it's not the same thread developer Goroutines are running on. Yeah though, same as nodejs.

jerf · on March 27, 2016

The core problem here is that the Node community invented/popularized a connotation of "blocking" and "non-blocking" that is excessively event-loop-specific. The important difference in their connotation is that code that blocks blocks the whole OS process. The conventional meaning of the term referred just blocking the running thread.

In normal Go, nothing is blocking in the Node sense. (Oh, if you put your mind to it you can manage it, but I've never once encountered this as an practical problem, either in Go or the equivalents you can do in Erlang if you put your mind to it.)

This has profound changes on how you write code.

It's true, Go is not magic. It's just another threaded language in most ways, with the "real" magic in the community best practices around sharing by communicating instead of communicating by sharing. In theory, you could write an equivalent set of C libraries and get most of the same things, but you'd have a lot of library to write. (This is why things like porting C goroutines to C have a hard time getting traction. It can be done, but it's actually the easy part. Also, you'd still be in C, which is its own discussion. But you can get the concurrency.)

The real issue here isn't that Go is necessarily exceptionally strong at concurrency, the real issue is that Node is exceptionally weak. It introduces this new concept of "blocking" that only exists in the first place because it is weak, and then makes you worry about it continuously, to the point that many people seem to internalize the concept as what concurrency is, when it isn't. It's really just something Node laid on you. So when you step out of Node, and you see a community that isn't visibly as worried about "blocking" as the Node community, someone trained by Node thinks they are seeing a community that "isn't good at concurrency". My gosh! Look how cavalier they are about "blocking"! Look how they tell people not to worry about it, and how casual they are about having users wrapping library code in goroutines and explicitly telling library writers not to do the concurrency themselves. But what you're seeing is what happens when you simply no longer have the problems Node and "event-based code" brings to the table. Go is not magic in the general case, but, honestly, when someone coming from the Node world picks up Go, I can see why they might go through a period where they sort of think it is. There are really differences in code style, and how easy it is to write correct code.

You have to make sure you're not letting the limitations of one connotation of "blocking" spill over into the other, or you will have problems. (True in both directions.)

To speak to someone else's point, "futures" in Go don't "suck", they basically don't exist. If you're writing in a recognizably "futures" fashion, you are not writing idiomatic or even particularly good Go. You don't need futures, because (what are today called) futures are basically an embedding of a concurrency-aware language into a non-concurrency-aware language, and you don't need them when the language you're working in is already concurrency-aware. That's why you don't see futures in Haskell or Erlang either. (I have to qualify with "what are today called" because the term has drifted; for instance, Haskell does have explicit support for an older academic definition of the term with MVars, but modern software engineers are not using the term that way.)

kasey_junk · on March 27, 2016

I've never in my life programmed in Node. When I say futures I'm talking about the logical concurrency primitive written about by Friedman/Wise in the 70s.

Lots & lots of idiomatic go code exists in that form (anytime you wrap a select that times out in a function you have a future).

Channels, what we are really discussing here, have 2 problems: the first is in abstraction, they don't provide basic primitives that other similar structures provide, like timeouts & cancellation. The second is in implementation. As futures you have to worry about all the edge cases around nil & closed channels. As queues they are highly contended.

Lukasa · on March 27, 2016

I agree with this entirely. As I said many times before, I like Go and I like its approach to concurrency.

All I'm trying to do is to make sure that people who make bold claims about abstracting away blocking code aren't misleading others: when calling other code you should always be aware of how it interacts with the flow control of your program.

philipov · on March 27, 2016

In your opinion, how does Python 3.5's async/await syntax compare to Go for writing concurrent programs? I work primarily with Python3 these days and have no Go experience.

tptacek · on March 27, 2016

Can you please explain why, exactly, I would want to take a simple function like "DoSomeStuff" and make it non-blocking with futures in Go?

Are you sure you're not just explaining how to write a Node program in Go? Write Node programs in Node.

kasey_junk · on March 27, 2016

He is speaking to the idea that you don't need to care about whether a function blocks or not. It's simply untrue (I'd go further and say if its untrue in all languages but that's a digression).

To have an abstraction where you really don't care about blocking or not you need promises/futures. Go's futures are bad. Real bad.

If you don't want the function to be non-blocking then you are fine with either method signature.

ashearer · on March 27, 2016

I took it to mean that with Node, some functions have a traditional return value, like functions in most common languages. But if there's even the possibility that they may indirectly invoke an async function, they need to have a different signature using callbacks or promises. For example, here's a simple function:

  function isValidFoo(val) {
      ...return boolean
  }

This function is synchronous now, but its contract needs to change if it internally calls something async (say, it's updated to check a value from a cache, which may trigger a DB lookup). The new signature has to add a callback parameter and drop the return value, or return a promise. Then its callers have to change how they invoke the function, and if they were previously synchronous their signatures have to change as well. Every synchronous function up the call stack needs to care about this internal change.

To avoid a ripple effect, all functions have to be designed to be asynchronous from the beginning, but this increases program complexity significantly. It means avoiding built-in language features such as direct return values, and often means substituting async library functions for built-in looping constructs.

Other languages including Go aren't like that. You don't need to wrap a simple return value in a promise or callback just because retrieving it might someday involve I/O. Execution of the thread/goroutine simply resumes whenever the value is ready, and the function returns normally. In those languages, Node's distinction between sync and async is artificial and unnecessary.

_ak · on March 27, 2016

...what? The point is that when any function in Go is blocking, it will obviously block the current flow of your own program until it returns. But it only ever blocks the flow of the current goroutine. Other goroutines will continue running just fine, and the runtime will schedule all the ready-to-run goroutines over the OS threads it has available, and that's nothing you need to take care of in your Go program. It essentially abstracts away what Node.js does in its event loop and the programmer does by manually splitting up the program into a series of function callbacks. Just use that knowledge to structure your program accordingly, and don't tried to badly reinvent futures.

kasey_junk · on March 27, 2016

How is that different than any language that supports parallel concurrency?

You can literally say that about every labguafe that has concurrency abstractions.

The whole point of the argument is that golangs abstractions just push the callbacks (or blocking sync) to the application layer.

tptacek · on March 27, 2016

I think I follow. But why do I want general-purpose functions to ever be non-blocking in Go?

To me, idiomatic Go suggests that libraries, data structures, business logic, &c is encapsulated in blocking functions, and that concurrency is expressed in glue code.

kasey_junk · on March 27, 2016

That doesn't seem to be the argument to me (though I'm totally ignorant about the Node aspect of this).

The argument to me is that Go advertises itself as being easy to write that concurrent glue code & then fails to provide basic abstractions around it.

Everything you mention about things being expressed in glue code would be easier if Golang provided a modern promises library & holds just as reasonably for other modern languages that do.

The "node requires callback hell" is a red herring because so does Golang (that or blocking code), just it does it at the app layer instead of the framework layer.

wtbob · on March 28, 2016

> The argument to me is that Go advertises itself as being easy to write that concurrent glue code & then fails to provide basic abstractions around it.

What‽ It provides channels, which are a really nice abstraction for concurrency.

> Everything you mention about things being expressed in glue code would be easier if Golang provided a modern promises library & holds just as reasonably for other modern languages that do.

Have you actually written any Go? Using channels in Go is much nicer than using callbacks. Callbacks basically force one to write one's code in continuation-passing style, while callbacks let one work about sync points only where you need.

kasey_junk · on March 28, 2016

> What‽ It provides channels, which are a really nice abstraction for concurrency.

Channels are a concurrent safe queue, nothing more. They are not even particularly well implemented concurrent queues as they are highly contended.

What go provides for abstractions around those queues are range and select. Range is a nice way to make simple concurrency cases look like traditional for loops (and is very rarely used in practice as simple concurrency cases are rare, at least for me).

Select is a nice addition to the language, yet the edge cases around channels make it very difficult to reason about without keeping http://dave.cheney.net/2013/04/30/curious-channels open.

> Have you actually written any Go? Using channels in Go is much nicer than using callbacks.

I have, for 2 years, full time. I have not used node. Have you ever used a language that supports concurrency besides Go? Go does not provide, as part of its concurrency abstractions, things that are deemed bare necessities by other languages such as timeouts, cancellation, supervision, lock free structures, etc.

The entirety of the Golang concurrency story can be wrapped up with 3 things a) the culture of the language prefers message passing concurrency b) the scheduler is a very good example of when simplicity gets you great performance in the main cases c) select as a language keyword is interesting.

By leaving the basics out you push all of that work to the application level, which is why I think the callbacks argument is a red herring. At the application level you are either going to encounter callbacks (as in the stdlib http handler) or blocking code (subject to races, deadlocks, etc).

Lukasa · on March 27, 2016

You wouldn't.

What I was respond to was this specific sentiment: "With Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code." My argument was that that is simply not true. Client code must know the difference between blocking and non-blocking code because it affects the flow of information around the program.

adrusi · on March 27, 2016

I haven't looked too deeply into Haskell's io model, but it think it might offer an exception to your "untrue in all languages" claim.

fooster · on March 27, 2016

> To have an abstraction where you really don't care about blocking or not you need promises/futures. Go's futures are bad. Real bad.

Please explain.

kasey_junk · on March 28, 2016

Well it's possible I over stated the necessity for futures as their might be other abstractions that allow you to compose concurrent/non-concurrent systems in a way that is transparent to the api. Its just one of the easier ones to reason about. That said, I don't believe concurrency should be transparent to the api, rather it should be front and center to it.

As for the go futures being bad, channels generally are a bad concurrent queue (contended, lack basic abstractions, etc) and using a single item queue as a future isn't in and of itself bad, but does mean you can't optimize for different usages that futures might have over message passing queues.

erlich · on March 27, 2016

> What Go does better than Nodejs is a better standard library

What is so hard about `require('lodash')`, or `require('moment')`?

> true concurrency

The trade-off is you get great concurrency without thinking about it. If you are working in Go, you have to start thinking about it as you go. You can always think about it as you go with Node.js if you want, but you don't have to and you will still get quite far.

> static typing

See http://flowtype.org or http://www.typescriptlang.org. With more advanced type systems than Go.

> With Go it doesn't matter if an operation is blocking or non blocking, that fact can totally be abstracted from the client code. Writing callbacks is tedious, promises are tedious, co routines with yield need plumbing and async isn't in the spec.

You can use ES7 async/await and promises today. That it "isn't in the spec" probably won't be for too much longer.

  async function() {
    const {data} = axios.get('foo')
    const results = []
    const arr = ['foo', 'bar', 'baz']
    for (const i in arr) {
      results[i] = await axis.get(arr[i])
    } 
    return results
  }

skybrian · on March 27, 2016

A big advantage for Go is that the standard library has been stable since 1.0 and there are no plans for breaking changes.

This appears not to be true for lodash.

jdd · on March 28, 2016

Lodash introduces breaking changes in major version bumps (following semver).

So upgrades are optional and not pulling the rug out from underneath the feet of devs.

Major releases allows the project to make corrections and avoid carrying around years of baggage.

skybrian · on March 29, 2016

Yes, that's usually the best strategy for most of us.

But a standard library is a little different due to the sheer number of dependencies and level of coordination needed to upgrade them all. Go got it mostly right the first time, and the lack of churn is a major advantage.

gotchange · on March 27, 2016

    async function() {
      const {data} = axis.get('foo'),
            arr = ['foo', 'bar', 'baz'];
      return arr.map(ele => await axis.get(ele));
  }

I rewrote your async function making it more concise and compact. Isn't advisable not to use «for in» with arrays?

Arnavion · on March 27, 2016

    return arr.map(ele => await axis.get(ele));

is not legal - the await has no corresponding async. You may have wanted to write

    return arr.map(async ele => await axis.get(ele));

but that's the same as

    return arr.map(ele => axis.get(ele));

(Assuming axis.get returns a promise.)

That is, it returns an array of promises of values, not a promise of an array of values like erlich's function.

erlich's example is correct, with the replacement of for-in with for-of.

Edit:

Also, a solution using Array.map() is not impossible:

    return Promise.all(arr.map(ele => axis.get(ele)));

but this has different characteristics than the original function. The original ran the tasks in series whereas this runs them in parallel.

hamburglar · on March 27, 2016

> Nodejs is still outpacing Golang according to Google Trends and yet I have never seen a single article about switching from X language to Node.js.

This is not the endorsement you seem to think it is.

> My guess is that the Go community probably wouldn't exist if it wasn't for this constant form of aggressive propaganda/marketing. Node.js on the other hand doesn't need any marketing/propaganda; it just sells itself.

Your guess? This is pure emotion arguing here. I have rolled my eyes at the golang converts for literally years and recently I decided to give it a try and it truly sold itself, sans marketing. I also really like node.js, for what that's worth. This comparison sounds like the hurt feelings of the True Believer.

Different tools have different applications. Don't ever forget it.

e12e · on March 27, 2016

>> Octo was also running across 2 medium EC2 instances which we bumped up to 4.

> Of course if you're comparing a single instance of Go running on 4 CPU cores, it's going to be faster than a single instance of Node.js running on 1 CPU core.

I'm not intimately familiar with Amazon, so to me it's unclear what "2 medium ec2 instances" mean. It might mean, 2 m3.medium general purpose instances, in which case they actually expose only one cpu core. They also seem a terrible fit for an IO/CPU-bound work load?

While I can sympathise with the author that it'd be nice to spend as little as possible on a (micro) service, it strikes me as odd that they first tried going 2x medium -> 4x medium, rather than 2x medium -> 2x x.large (possibly a different intstance type, like c4 -- and run 8 (or whatever) node instances per vm.

It does sound like this is a typical area which golang would be a very good fit (Reminds me of: https://talks.golang.org/2013/oscon-dl.slide ) -- But it also strikes me as odd that rewriting the (apparently) simple service in Go ended up looking cheaper than a) scaling up (rather than horizontally), and b) rewriting the service in nodejs (but with an architecture more suited to the problem).

TheOtherHobbes · on March 27, 2016

Go's speed is comparable to Java and nearly as fast as raw C. The only thing faster is raw C++ - and good luck getting a web project finished in a reasonable time using that.

Node is not a speed demon. It's not as bad as Python, which can be insanely slow. But if you use Node, your server costs are always going to be some multiple of what they could be. (Also applies to RoR.)

Of course the usual argument is that servers are cheap and devs are expensive, so you'll be quicker to market with Node and that's worth the extra cost.

Which is fine until you need to run at speed and/or scale. If you start having to pile on the instances, you will be wasting money. If you're big enough to need hundreds of instances, you will be wasting a lot of money.

IME it's debatable whether Go can't also be faster for development. There's a learning bump at the start, but I found it easier to write code that just works, easier to write good tests, and easier to deal usefully with errors.

BogusIKnow · on March 27, 2016

This is always the problem when you're hype driven. I'm sure the decision to use Node was also hype driven. And they will rewrite again with the new hype, right until they are bankrupt because they didn't invest enough in new features.

cryptica · on March 27, 2016

Totally agree! I would say the same thing if someone had a whole app written on Python and were having performance issues (I definitely wouldn't advise them to switch to Node.js or Golang) - If your ecosystem offers the tools to solve the problem, it's better to use these.

Fluda3 · on March 27, 2016

Why on earth are both of you getting downvoted so hard?

I am so tired of reading hacker news articles with comments that are so pushed down because of subjective opinionated voting that is so obviously skewed towards one personality type.

I have so many thoughts and opinions I want to contribute to these discussions but I have zero confidence it will fit within this type of environment that's been created.

It really saddens me to see some of the comments that get downvoted here.

jwdunne · on March 27, 2016

I agree. My voting policy reflects: I reserve down vote for comments that worsen the quality of discussion. Examples are rude or dismissive comments that fail to address any point in the OPs comment or article.

At its worst, I've seen people get down voted for asking beginner level questions. The calibre of folk who post here can, by its nature, make commenting a nervous activity. Getting down voted for trying to expand your knowledge is both harmful and wrong.

If I see a comment down voted without the above properties, I up vote to neutralise whether I agree or not. If the comment contributes to debate or discussion, it should be valued. I know I value alternate perspectives.

skewart · on March 27, 2016

I completely agree. I do the same thing for subjectively downvoted posts. I want HN to be a place where people share all kinds of ideas and ask questions at any level.

madgar · on March 27, 2016

pg believes downvoting is an appropriate expression of disagreement: https://news.ycombinator.com/item?id=117171

jwdunne · on March 27, 2016

I'm not sure there's an awareness the effect a down vote can have on someone new here. I remember when I made a comment after joining and it got a down vote. I didn't comment for about 6 months.

The fact that down votes are silent makes things worse. Great, you've been down voted. Why? Ask why, down vote again. It's almost a recipe for gradual shaping to conformity of group opinion for anyone who wants to participate here. It can then come across as say these things, don't say these things and you'll be sailing.

I don't know, perhaps an explanatory comment should be encouraged so that real education happens. Otherwise its almost Pavlovian in nature, especially for the less confident and highly anxious.

skewart · on March 27, 2016

That's fine. I think most of the HN community disagrees with him on that.

Given the way HN is designed (comments are ranked, low scores get grayed out, user karma scores are visible) if people downvote to express disagreement it would worsen the experience of reading comments, and, I suspect, the overall conversation.

lucio · on March 27, 2016

https://news.ycombinator.com/item?id=11273354

finalight · on March 27, 2016

this is why i hardly commment on hackernews even when I give constructive feedback; simply people here are mostly childish

BogusIKnow · on March 27, 2016

Yes, HN is an intolerant place. Have an opinion and you're downvoted. I have some sock puppets because of this and don't care for any downvote.

dandanisaur · on March 27, 2016

> he would have saved himself and his company a lot of time

she

cryptica · on March 27, 2016

Whoops :p

KirinDave · on March 27, 2016

> But the thing the author fails to mention is that there are a lot of Node.js modules which help you easily scale across multiple CPU cores. E.g. PM2 https://www.npmjs.com/package/pm2

There are lots of bad things to say about Golang, but it's pretty wrong to suggest that PM2 makes it as easy to write performant systems as Go does. Go even has a really good race detector to help catch the few cases where you can get around channels.

The idea that a language cannot directly and transparently support scheduling across CPU cores is an antiquated one. Java is abandoning it. C# abandoned it long ago. Erlang showed long ago that we can do it reliably and with great performance.

Heck, even browser-level Javascript is moving away from this idea with async support, although everyone seems rather reluctant to make the plunge towards a stackless architecture.

> If you have a scalability problem with Node.js (or any language for that matter) and you can't find a simple solution that doesn't require rewriting your whole app, you're probably not qualified to manage this project at scale

I've got 13 years experience, 10 of which are as a distributed systems engineer. I've definitely hit the absolute edges of what off the shelf opensource software can manage, and more than once butted up against bugs and limitations of language runtimes. I'm extremely qualified to say that this is wrong.

Sometimes re-architecture and rewriting is the only way forward. That isn't necessarily a failure of the thing to be replaced. It's not even a failure of the architect if it is anticipated and managed correctly.

jasonwatkinspdx · on March 27, 2016

> My guess is that the Go community probably wouldn't exist if it wasn't for this constant form of aggressive propaganda/marketing. Node.js on the other hand doesn't need any marketing/propaganda; it just sells itself.

This is one of the most absurd and myopic statements I've seen in a while. Go has a vibrant community because many people chose it, for reasons that are entirely valid. Just as people chose Node.

You're posting with an enormously superior tone for someone that didn't even read the original post (newsflash: they were using a 4 host service cluster).

mnutt · on March 27, 2016

I generally agree with the sentiment about scaling across cores and nodes being the same; after all, you're eventually going to need to scale across nodes so you may as well go ahead and get that working.

But there are times when it can be a pain. For instance, a node app I run at scale maintains an in-memory read cache that rarely gets updated. When we scale with processes, we end up having to duplicate the cache across every process. We could set up a helper process and have it manage the cache over a socket, or possibly write a c++ module that shares some memory but at that point it's no longer simple and/or there are more parts to potentially fail.

anton-107 · on March 27, 2016

Or... just use the battle-tested tool that solves this exact problem well for you. Put your cache into Redis or other in-memory cache.

regecks · on March 27, 2016

A network-addressable store such as Redis is not the same as just being able to pull stuff out of memory. A map guarded by an RWMutex is simpler in code, probably more performant for simple access patterns, simpler operationally and effectively can't fail, as the parent points out.

Yes, Redis can be the right tool. But, note how quickly we went from "I just want to run on multiple cores" to "I need to share data between my cluster workers so let's install Redis". It's just a total blowup of moving parts.

Great-grandparent argument reeks of elitism. I know how to deal with concurrency, parallelism, avoiding data races. Doesn't mean I need to waste my time implementing the primitives every project. To say that the language forcing you into multi-process architecture is a benefit, is a pitifully desperate remark. I wonder what single digit percentage of projects actually go into that territory? Maybe even less than 1%? Maybe with a runtime suitable for the task (e.g. Go or Java or hand-woven C++), less than 0.1%, because you do not run into pointless runtime bottlenecks to begin with?

I have implemented such projects, because we needed the workers to be geographically distributed, not because the language runtime fucked us over. This was a decision made in DESIGN not in DEFECT REMEDIATION.

I can think of a handful of "single-purposed" reasons to use Node on the server, such as DOM rendering (and other stuff that is only available in Javascript-land). But a serious project, no way. Not sure how people can stomach the inherent lack of safety in the language, the huge (relative to comparable substitutes) number of limitations in the runtime and the ecosystem churn.

ptrincr · on March 27, 2016

Yep, redis would solve this. Shareable cache across a distributed cluster of node.js instances.

fooster · on March 27, 2016

... and now you have two problems?

mnutt · on March 27, 2016

"helper process and have it manage the cache over a socket" is essentially redis, and another thing that introduces latency and can fail.

awjr · on March 27, 2016

I'm also finding the amount of open source "solved this problem already for you" solutions available for Node is phenomenal.

powerbook5300CS · on March 27, 2016

In my experience, critical problems required by successful language ecosystems that are solved by nodejs modules are either solved poorly or abandoned by the original author. The ones solved by go packages are solved elegantly and maintained.

Your mileage may vary but the JS ecosystem is mostly garbage from where I'm standing.

Want examples?

- Distributed worker queue

- Minimal web framework

- etc...

woah · on March 27, 2016

Left pad

jen20 · on March 28, 2016

"A single Node.js instance runs its business logic in a single process. A single Go instance can run its business logic in multiple processes."

Ummm, both of them run in exactly one process, this is how operating systems work. Go can do it's n-m scheduling across multiple _threads_ sure, but threads are not processes (unless as an implementation detail of an OS). If you are running lots of instances of Node, that is a different thing.

ldehaan · on March 27, 2016

what about just using the built in cluster functionality already in nodejs? I've used it many times with great results and it is really easy to implement.

ignoramous · on March 27, 2016

The clustering module on NodeJs cannot compete with speed that threading, or STM, or Actors offer. The thing with NodeJs is, it is not suited for CPU bound workloads. It's a great fit for I/O bound workloads (where most of the hardwork is done by, say, a DB).

Process to process communication has tremendous cost, although shared-memory does make that faster, but at the cost of complicating the interaction. Message-queue (ZeroMQ) based interaction also an option but, by definition, it cannot be as fast as languages that have concurrency support built-in.

Reasoning about concurrency is hard, and languages like Go (via CSP), Erlang and Scala (via Actors), Java (via Threads), Rust, and Clojure (via STM) take different approaches to make this less of a pain.

CSP and Actors are what seem the most 'natural' of solutions. And that's a major reason why Go and Erlang are frontrunners in the concurrency race.

pka · on March 27, 2016

I would say the most natural is STM - you don't need to change much, just wrap your concurrent, shared memory accessing code in a STM transaction, be careful with IO (or use a language which prohibits IO in a STM block like Haskell) and you are done.

Actors are pretty nice when it comes to distributed services, and Erlang/OTP is pretty much uncontested here.

Threads with manual locking are only useful when speed is really important, but this is somewhat of a moot point when it comes to concurrency. Both STM and Actors perform slower (starvation, message copying overhead).

Here [0] is a nice, in-depth discussion about some of the different approaches to concurrency.

[0] https://plus.google.com/109566665911385859313/posts/FAmNTExS...

ignoramous · on March 27, 2016

Interesting point there. Latency vs Throughput. Using green-threads (Quasar on JVM) might be the best of both worlds then?

Akka.io on JVM could pretty much do what Erlang can on BEAM, functionality wise. Although some people have raised concerns over it being a "pseudo" solution and that it is a no match for Erlang.

askyourmother · on March 27, 2016

On the plus side, all the devs running to Go from Node, maybe we can finally get all those great libs from npmjs written in go, like Kik, isArray, isThirteen, and so many more classics of robust software engineering that the js community excels at.

rubber_duck · on March 27, 2016

Don't forget mature and standardized tools like build, dependency management and distribution

planetjones · on March 27, 2016

> for each request, Octo typically fetches somewhere between 10–100 keys from S3

I'm really curious as to why this is neccessary - what are the 10-100 keys you are fetching per request ? Just reading the high level article it sounds like there is some issue with what this service is doing, irrespective of the language it's doing it in.

meonkeys · on March 27, 2016

Another idea: stress-test a Node.js process then hard-limit the number of requests each Node.js process handles based on that stress test. Add more Node.js processes (one per core on the same or other servers) until there is sufficient capacity to handle all requests. This assumes the Node.js code is already distributable across cores and servers, which seems to be the case based on the article.

Might have been a day-long workaround rather than a weeks-long rewrite.

zaphar · on March 27, 2016

Part of the goal was fixing this without having to just throw hardware at it. Every Node process has a non-trivial amount of overhead that requires additional hardware resources. Overhead that the Go app doesn't have. They could have thrown more machines at this but that costs money. The better solution technology wise was Go. It was better suited to their workload.

jongleberry · on March 27, 2016

My same thought as well. S3 has a rate limit, which is where a lot of these timeouts or errors are probably happening. Seems like they should be using a database instead.

zimpenfish · on March 27, 2016

If it's anything like the websites I've worked, 10s (sometimes even 100s) of database requests to render a page isn't all that strange.

CydeWeys · on March 27, 2016

Well, it's Digg, so maybe each key is a comment? That's entirely reasonable. I can think of any number of sample web applications that would require loading dozens of objects from a DB per page load.

star-trek-fleet · on March 27, 2016

Another evidence that the language that is narrowly focused and force "more" correct practices tends to work better in practice.

Or, coders are getting worse at master the more powerful language...

XorNot · on March 27, 2016

99% of the time what is called "mastering" is actually a whole bunch of stuff you shouldn't do if you want a maintainable codebase.

Tricks are just that: tricks. No one sets up a business based on how you pull a rabbit out of hat, nor should they.

star-trek-fleet · on March 27, 2016

No, mastering a complex and powerful language means use the minimal feature for the majority of your needs, and knows how and when to use the "dangerous" part for a small fraction of your needs.

I do not know much of JS, but people write C++ with loads of patterns are prominent examples of people who do not master the language. That's not trick. That's real engineering with (often) bitter experience.

0xmohit · on March 27, 2016

Some comments prompt adding another link here: <https://medium.com/@tjholowaychuk/farewell-node-js-4ba9e7f3e...

sah2ed · on March 27, 2016

https://medium.com/@tjholowaychuk/farewell-node-js-4ba9e7f3e...

FTFY

0xmohit · on March 27, 2016

Thanks. Didn't realize that HN screws up links within angle brackets.

takno · on March 27, 2016

On the face of it this sounds like a problem where the optimal solution would have been to throw more hardware at it while you looked at the underlying choice of datastore.

diegorbaquero · on March 27, 2016

The EC2 medium instance has 2 cores. Were you running 2 node processes in cluster mode? The only thing I asked myself when reading this. I'm keeping my vote for node.

dminor · on March 27, 2016

m3.medium only has 1 core.

dalyons · on March 27, 2016

im obviously missing something here, hopefully someone can explain?

I understand the eventloop/nexttick/etc architecture of node, but i dont understand how in this case he was blocking the loop? Shouldnt all the operations to S3 be async (and thus non-blocking, even if waiting for a timeout)? What was the specific part in this scenario that was causing the loop to stall?

amalantony06 · on March 27, 2016

If I understand correctly, the problem was that they were making several thousands of requests to S3. While the requests to S3 themselves were asynchronous, the callbacks for these requests were queued up for (synchronous) execution in the event loop. Due the large number of callbacks in the queue already, new callbacks for the incoming requests were queued up for execution behind the previous callbacks, leading to latencies in serving up responses.

dalyons · on March 27, 2016

Ah ok! Perfect, thanks, that's what I was missing. Makes total sense now. In any lang (even in ruby!) you could have separate thread pools(or EM loops, whatever) for the s3 requests & web handlers. But because node only has one EL, and node interprocess comms is awkward, its tricky. Gotcha, cheers.

I wonder if using something like async.eachLimit would have helped; it might prevent the s3 batches from flooding the loop & give a chance for web reqs to interleave, but probably at a cost to the median resp time.

romanovcode · on March 27, 2016

2010 - Making switch from PHP to Ruby

2013 - Making switch from Ruby to Node

2016 - Making switch from Node to Go

2019 - Making switch from Go to {hype}

cloverich · on March 27, 2016

And as long as new ideas come about (in the form of languages / platforms), people will keep trying, learning, and sharing. Thankfully.

collyw · on March 27, 2016

The problem is most of the ideas are not new, they are rehashes of old ideas. Many people aren't learning, they are just jumping on the hype wagon without much reasoning.

tptacek · on March 27, 2016

"So what I'm saying is, we should all still be using PHP."

mwcampbell · on March 27, 2016

Or Perl, or C. After all, even PHP was once a new hipster language.

Of course, it's good to be skeptical of the current hyped flavor of the moment. But some languages really are better than others for specific tasks, and new languages do sometimes improve on their predecessors.

collyw · on March 27, 2016

C is still the choice language for many things. Perl is still cool to me, though seems to have been replaced by Ruby for web development.

brightball · on March 27, 2016

Ironically enough, given the recent improvements, performance and productivity...that's not really out of the question...

bliti · on March 27, 2016

What is star fighter written in? Curious, no bad intentions.

tptacek · on March 27, 2016

Go, Ruby, and React.

anp · on March 27, 2016

IIRC, it's got some Go infrastructure and the trading chapters have bots written in Ruby.

noir_lord · on March 27, 2016

Still using PHP (and Python and some other stuff) in 2016.

I make it a rule to only jump on the new bandwagon when I know it's not going to explode in 3 months and currently isn't on fire.

touristtam · on March 29, 2016

Go has been out for quite a while now, so I'd say it is pretty safe to try it out if you feel like it. I wouldn't swap PHP for Go personally.

ciokan · on March 27, 2016

Results talk. We make the switch to whatever works. I don't take the hype, I'm just grateful we can consider options.

ungzd · on March 27, 2016

2013 - switch from sane, stable, mature language to weirdest scripting engine extracted from web browser. Because lol that's cool to tear out javascript from browser and put to server lol rofl. 2016 - switch to low-level language, much lower-level than PHP, Ruby and Javascript. Because computers became slow lol.

2019 - switch to PL/1 and raw disk sector addressing because bigdata smoothie silicon googley disrupt the world

0xmohit · on March 27, 2016

This is inevitable. However, not everything is {hype}. Consider rust, for example. People have managed good stuff using that. Dropbox, Servo are widely known examples.

AsyncAwait · on March 27, 2016

Spot on, while I see problems with many of the latest technologies myself, not EVERYTHING is hype and I am getting tired of people labeling it as such just because they don't want to learn it/make the switch but want to mask it like they would if it wasn't just hype.

I've seen this very recently personally, where I was accused of adopting hype for switching to Swift for a major rewrite of my app, where I'd pretty much start over in Objective-c anyways so I may just as well do it in a safer, much nicer language that's going to be the future of the platform eventually anyway...

It turns out that you don't have to switch - if your current stack works well for you stick with it. It doesn't make you "less cool". But it also doesn't mean that somebody else doesn't have a legitimate reason to adopt the new technology, and this is a pretty textbook example of what I'd call a justified switch.

dustingetz · on March 27, 2016

Back to Node - all new languages will target node and browsers via compile to js, and many modern languages are gaining js as a target platform as we speak.

bitwize · on March 27, 2016

I lol'd at the bit about blocking Node's event loop.

If Unix weren't so committed to doing I/O the stupid way around, maybe Node wouldn't have to provide fragile, ersatz asynchronicity the way that it does?

fooyc · on March 27, 2016

Care to explain?

bitwize · on March 27, 2016

Two iron-clad rules of data processing:

* All data has a type

* All I/O is asynchronous and interrupt (event) driven

At its inception Unix doubled down on stupid, first by encouraging data to be stored and migrated in the form of untyped text streams, which require ad-hoc parsing and unparsing logic at the endpoints; and secondly by failing to provide asynchronous I/O primitives to user space. POSIX later provided an AIO standard, but it was cumbersome to use, bolted on rather than integrated into the OS, and virtually nobody used it. What you want to be able to do is submit I/O requests to the kernel, and have the kernel notify you when they are completed while you wait or do other processing; you could even put the process to sleep until pending I/O is completed and have it take up no CPU during this time. Instead what people do is burn CPU cycles in select/poll loops, which are the I/O equivalent of asking "Are we there yet? Are we there yet? Are we there yet?" over and over. You can fake asynchronicity by doing a bit of processing before asking "are we there yet?" again, but you have to write your processing code in such a manner as to be done piecewise in a loop, and the lower latency you want the more frequently you have to poll (and the more CPU you have to burn polling). This is how Node does "asynchronous" I/O; the VM stops every few instructions to poll for ready FDs, dispatches to callbacks as necessary, and performs parts of pending large I/O operations which can be done non-blocking. (That's the other sucky bit: you can't just call read(2) and write(2) for large chunks of data on an fd that's been opened O_NONBLOCK and expect it all to work; you have to keep reading or writing in a loop when the fd is ready, subtracting the number of bytes successfully written until the entire buffer is processed.)

But if something -- a call into a C library for instance -- stops the VM from doing the poll and I/O bits of its main loop, all I/O simply... stops. And your throughput goes into the toilet. Whereas if the runtime had been based on an OS that natively supports interrupt-driven AIO -- like VMS, Windows, or AmigaOS -- it would be extremely difficult to stall the I/O pipeline completely this way. You might be able to stall further I/O calls with a really long operation, but any pending calls already initiated would complete, and the runtime would be properly notified of their completion.

And all this stems from the fact that Unix was the Node.js of its day -- an environment designed to make things easy for casual programmers, not to do things correctly.

wtbob · on March 28, 2016

> At its inception Unix doubled down on stupid, first by encouraging data to be stored and migrated in the form of untyped text streams, which require ad-hoc parsing and unparsing logic at the endpoints

I tend to agree with you here: it'd have been better had Unix offered a typed abstraction over byte streams, but OTOH its approach really was a Worse is Better deal. Building a large set of tools which can deal with typed streams (or even trees, or graphs …) takes a long time, and is error-prone: building a few simple tools which treat everything as text is quick & easy. Of course, it does mean that the next four and a half decades are spent dealing with a lack of types, but it means one can ship quickly.

> Instead what people do is burn CPU cycles in select/poll loops

I thought that select/poll blocked, but it's been a long time since I did anything low-level like that, so I could be wrong.

> And all this stems from the fact that Unix was the Node.js of its day -- an environment designed to make things easy for casual programmers, not to do things correctly.

Well, Worse is Better.

bitwize · on March 28, 2016

> I thought that select/poll blocked, but it's been a long time since I did anything low-level like that, so I could be wrong.

Well, yes. That's just the problem. ALL I/O ops in Unix block. Even the ones on O_NONBLOCK fds, which simply read or write -- blockingly -- until the underlying kernel buffer is full/empty, then return.

You don't have the option to request I/O, do some processing, and then wait until the I/O completes. If you want to interleave processing with I/O you have to buzz in a select loop, doing your I/O and processing in tiny explicit chunks. Or spawn subprocesses to do I/O, leadibg to more complexity and confusion.

The facilities in the Windows kernel for I/O are strictly more robust and flexible than what Unix (eve Linux and BSD with kqueue/epoll) provides. The designer of the Windows kernel, Dave Cutler, also designed VMS, knew what real-world I/O requirements were, and famously mocked the way Unix handles I/O.

crzwdjk · on March 27, 2016

> you could even put the process to sleep until pending I/O is completed and have it take up no CPU during this time. Instead what people do is burn CPU cycles in select/poll loops

That's what select() does though: you give it a timeout, and it goes to sleep until either an fd becomes ready or the timeout expires. As an example, the X server on you typical linux laptop uses select() on a whole bunch of sockets, one for each client, and it both has good enough latency to be interactive, and doesn't burn a lot of CPU doing so. Is this the most convenient programming model for this? Probably not. And of course it still has the issue that a long-running function can delay your getting back into the select loop.

bitwize · on March 27, 2016

Sit down in front of an Amiga from the 1980s or early 1990s sometime and you will get a whole new perspective on what constitutes "good enough latency to be interactive". In particular, the UI was always perfectly responsive, even with multiple processes and disk accesses going on in the background -- on a 680x0 machine running tens of megahertz. Once you start spawning enough background tasks and pegging the poor kernel's I/O scheduler, your multi-gigahertz Linux box can't match that sort of claim.

Select is still not nearly as flexible as Windows I/O completion ports, for instance. You cannot kick off an I/O operation, do some other processing, and then wait for the I/O to complete before you use its results with a select loop. You have to do a little bit of processing, pump and poll, do a little bit more processing, pump and poll, etc. You are wasting cpu time compared to the interrupt-driven way, and if you have many processes going at once this can add up in terms of battery power used and latency suffered. Remember that select(2) and poll(2) are system calls -- and all the overhead that goes with that.

The X server, in its effort to implement a GUI which is by nature event driven, historically contained all sorts of hacks in order to simulate the asynchronicity which should come from the OS. And it has always been slow and laggy compared to its counterparts on Windows and Mac OS -- let alone the Amiga where interrupt-driven programming is the norm and the GUI has always been flawlessly responsive even when other tasks are going on.

There's also the fact that nonblocking I/O IS NOT ASYNCHRONOUS I/O. If the kernel needs time to perform an I/O operation -- for example, reads and writes from/to disk -- it WILL block your process doing so. All "nonblocking" really means is "read until read buffer is drained, write until write buffer is full, then return". Your process is still stopped during the read/write! With better primitives such as Windows I/O completion ports that map more closely to the asynchronous I/O subsystems of the underlying architecture -- DMA, interrupts, etc. -- that becomes a non-issue. I/O happens transparently in background from an application (and maybe even kernel) perspective, and nothing is blocked.

In short, nearly anything at all is better than the Unix model in terms of throughput, latency, and more natural programming style. Unix is actively regressive.

prlin · on March 27, 2016

For 1) how do other systems handle data being stored and passed between processes?

bitwize · on March 28, 2016

At a very basic level -- separate handling of text vs. binary. In some operating systems, some files were record-oriented and some were not, allowing the kernel to optimize record access in a database file, for instance. The Burroughs large systems used the disk as a backing store for objects in main memory, which were typed (in proto-OO fashion) and some of the type constraints were enforced at the hardware level. For instance it was impossible to execute a word tagged as "data".

One feature that was implemented in the Amiga OS, BeOS, and just about nowhere else is the concept of datatypes. Third-party software could register OS-wide readers and writers for their file formats, allowing files of that type to be consumed or produced by any application.

PowerShell lets you construct pipelines of typed objects with fields and methods, not just untyped dumb text.

bliti · on March 27, 2016

Wait until they discover Elixir and Phoenix. It's the power of Go (a bit more I think) without the ugly syntax.

15155 · on March 27, 2016

"Power of Go" for me means shared-memory parallelism.

What's the answer for that in Erlang or Elixir?

akshatpradhan · on March 27, 2016

What sets Elixir apart from Go?

bliti · on March 27, 2016

IMO, it's simpler to write and easier to read. It's been a bit faster in t he tests I've run but not by a lot. Probably due to my skills. I just find its nicer all around. It does not carry over anything from C.

kseistrup · on March 27, 2016

For old farts like me gopher is a protocol.

illumen · on March 27, 2016

I think the author made a pretty good decision to choose a solution which was proven to meet the requirements by another author. However Amazon does provide a tool to batch download multiple things from S3 though. Let's pretend for a moment those tools don't exist already... Let's also pretend Amazon doesn't suggest another solution for when there are lots of GET requests [0]. Let's pretend that using something like nginx which is a very well tuned web proxy might be way better because of SPDY, SSL connection pooling and such [1]. Let's also pretend that node.js can not use multiple processes with cluster cluster [2].

Why wouldn't one use python for this job? Since the company already does a lot of python, and this job is actually easy to do in python (I've done something similar).

400 processes with python3.5 is way less than the 4gig on one medium instance (less than 2.5MB each). Just farm the work out to a ProcessPoolExecutor, and have a timeout on the S3 get requests. That would let you have enough resources to match the spike of 20000 requests per minute (334 per second).

A lot of an S3 GET request could be the SSL, AWS auth and such. All quite CPU intensive. So using an async framework that doesn't do SSL+AWS auth async is obviously not going to work well once the requests go up.

There's even an example in the concurrent.futures of downloading urls [3].

Made with the beautiful python3.5

  import concurrent.futures
  import requests
  from awsauth import S3Auth
  ACCESS_KEY = 'ACCESSKEYXXXXXXXXXXXX'
  SECRET_KEY = 'AWSSECRETKEYXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
  
  URLS = ['http://www.foxnews.com/', 'http://www.cnn.com/', 'http://europe.wsj.com/',
          'http://www.bbc.co.uk/', 'http://some-made-up-domain.com/']
  
  def load_url(url, timeout):
      return requests.get(url, timeout=timeout).data
      # return requests.get(url, timeout=timeout, auth=S3Auth(ACCESS_KEY, SECRET_KEY)).data
  
  with concurrent.futures.ProcessPoolExecutor(max_workers=400) as executor:
      # Start the load operations and mark each future with its URL
      future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
      for future in concurrent.futures.as_completed(future_to_url):
          url = future_to_url[future]
          try:
              data = future.result()
          except Exception as exc:
              print('%r generated an exception: %s' % (url, exc))
          else:
              print('%r page is %d bytes' % (url, len(data)))

[0] http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-...

[1] https://coderwall.com/p/rlguog/nginx-as-proxy-for-amazon-s3-...

[2] https://nodejs.org/api/cluster.html#cluster_cluster

[3] https://docs.python.org/dev/library/concurrent.futures.html

mh- · on March 27, 2016

>However Amazon does provide a tool to batch download multiple things from S3 though.

out of curiosity, what tool are you referring to here?

trimbo · on March 27, 2016

Another idea is to use lua in Nginx itself, and keep it all within nginx.

Here's an upload example: https://github.com/jamesmarlowe/lua-resty-s3

DannyBee · on March 27, 2016

How does one live without is-positive-integer and isArray?

blitzprog · on March 27, 2016

The real question is: How does one live without is-thirteen? https://github.com/jezen/is-thirteen

ysolo · on March 27, 2016

Best post here.

tptacek · on March 27, 2016

Wait what why

https://www.npmjs.com/package/is-positive-integer

Ericson2314 · on March 27, 2016

[flagged]

erlich · on March 27, 2016

What do you use?

chridal · on March 27, 2016

Must be Erlang.

Ericson2314 · on March 27, 2016

[Cause I can't comment on my own post directly] I should give some some credit as Go is strictly better than Javascript.

Ericson2314 · on March 27, 2016

But to answer the question, languages with (at least) type parameters and no data races (without using escape hatches). (Hint: that includes, but is not limited to, Rust).

kelkes · on March 27, 2016

How many instances of node octo where running to handle the load?

biot · on March 27, 2016

  > And those 4 load-balanced instances we were running Octo
  > on initially? We’re now doing it with 2.

erlich · on March 27, 2016

Long live Node.js!

As everyone has mentioned this seems to be a simple case where the rewrite is a completely different architecture, which could have been done in Node.js.

I'm interested in why this continues to happen.

Seems like Node.js is a victim of its own simplicity. The baked-in non-blocking concepts of Node.js makes everyone feel like single-threaded is the only way to go, and think that Node.js is inherently flawed when it comes to parallelization.

Because Go doesn't give you a simple and relatively scalable solution without even thinking about it like Node.js does, instead you have to choose what to parallelize and think about the problem more. With PM2 scaling Node.js over CPUs becomes incredibly trivial.

Also, I think programmers naturally like rewrites/starting from scratch, and a new programming language provides an opportunity to rewrite everything (which the OP from this article and every other "switcher" article seem to do).

fauigerzigerk · on March 27, 2016

>I'm interested in why this continues to happen.

The psychology is easy to explain. The nodejs.org home page says "Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient". And the about page goes on to say "As an asynchronous event driven framework, Node.js is designed to build scalable network applications".

But for that supposed efficiency and scalability you pay a price: You must structure your code according to a particular optimization technique instead of structuring it according to semantic coherence. You lose modularity and clarity to gain scalability. That's the deal.

And then one day you realize that the whole thing doesn't scale as well as it could on a given hardware and it isn't very efficient unless you bring back some of the supposedly heavyweight architectural principles that you left behind.

At that point doubts start to creep in: Did we use the right tool for the job? Did we turn our code base into callback spaghetti for nothing?

And it's not that surprising that sometimes people opt not to keep paying the price in terms of code quality when they lose the supposed benefits they were paying that price for in the first place.

zongitsrinzler · on March 27, 2016

Recently I rewrote a medium sized Websocket app from Node to Go. Hoping for better performance and having heard a lot of good about Go.

Since the application relies heavily on sharing resources between sockets I found out the hard way that concurrent programming in Go is not much easier than in Java. Sure, it has a better syntax and channels but compared to Node where you never have to think about things like these the complexity is way higher.

Long live Node.js!

59nadir · on March 27, 2016

> Since the application relies heavily on sharing resources between sockets...

Which resources?

zongitsrinzler · on March 27, 2016

Connections are grouped in various ways. A lot of maps, etc. You can't even use a regular int safely, need to use an atomic for that.

I'm not bashing Go here, the memory usage is about 3x lower (which was the most important for me). And for a static language it's pretty cool.

59nadir · on March 27, 2016

Am I understanding you correctly when I assume that you mean you share data across threads?

douche · on March 27, 2016

> Seems like Node.js is a victim of its own simplicity

NodeJS is a victim of basing itself on a terrible scripting language that was thrown together to wire up basic event handlers to HTML buttons.

zappo2938 · on March 27, 2016

I've been learning Node.js for the past several months and building all my pet projects exclusively with it. A couple weeks ago I was offered a job, they knew I was mostly interested in JavaScript, but they dropped one on me in the interview. The condition of me being hired was I had to learn to Golang and would be developing in Go.

I just felt like I was switching rafts midstream if I went that route so I had to say no. When it comes to deciding which is better, being able to find Node.js developers is a huge plus. Also, there is a lot more resources for Node.js which might make development faster in an industry where the first out the gate is often the winner.

tonyedgecombe · on March 27, 2016

Interesting, for me the language I use would be near the bottom of the list of priorities in a job search.

Vendan · on March 28, 2016

True, though, TBH, if a programmer can't say "Sure, I can learn that", I don't want to work beside him.

spullara · on March 27, 2016

Switching from one essentially single threaded language to another wasn't likely to solve your issues. Instead of Go you could have chosen any JVM language or even C++ and had similar success.

sagichmal · on March 27, 2016

Go is not a single threaded language.

spullara · on March 27, 2016

Obviously. I was just pointing out that switching to Go wasn't the interesting bit. Maybe everyone misread my comment.

_ak · on March 27, 2016

If you didn't want your comment to be misread, you should have been more specific after the "another".