Hacker News
Node.js 0.2.0 released (groups.google.com)
91 points by shykes on Aug 20, 2010 | 24 comments



So the API with ASCII as the default encoding is frozen :(


I got the impression that was a performance trade-off. UTF-8 decoding/encoding isn't free.


I know it's a trade-off, but IMHO it's a very poor one.

That's premature optimisation. An API is forever. This decision sacrifices easy internationalisation and data correctness for a minor performance benefit in the current implementation.

It's a big deal, because node.js isn't merely encoding-ignorant (like PHP), it actually removes higher bits. If you forget to specify encoding somewhere, your text will be malformed.
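The bit-stripping is easy to demonstrate. This sketch uses the modern `Buffer` API (`Buffer.from` didn't exist in 0.2.0, and 0.2-era behavior may differ in detail), where decoding as `'ascii'` unsets the high bit of every byte:

```javascript
// UTF-8 bytes for 'é' are 0xC3 0xA9. Decoding them as 'ascii'
// unsets the high bit of each byte first, mangling the text.
const bytes = Buffer.from([0xC3, 0xA9]);

console.log(bytes.toString('utf8'));  // 'é'
console.log(bytes.toString('ascii')); // 'C)' — high bits stripped, data destroyed
```

The damage is silent: no error is thrown, the bytes are just gone.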


UTF8 seems to be the default encoding. I think he had just forgotten to update the docs for .write:

http://github.com/ry/node/commit/7fc794a8ec55bd9d137c4888404...


That sounds like a good reason to just leave the data in UTF-8 rather than converting to ASCII embedded in UTF-16, which is broken by design.


Let's see... ASCII is 4 bit. UTF8 is 8 bit. Is this really an issue on today's computers?


ASCII is 7-bit (encoded in 8 bits - the high bit is ignored) and UTF-8 takes 8 bits for most characters, but can take 16+ bits for some characters.
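The variable width is easy to see with `Buffer.byteLength` (shown here on a modern Node; the example characters are my own):

```javascript
// UTF-8 is variable-width: 1 byte for ASCII, 2-4 bytes for everything else.
console.log(Buffer.byteLength('A', 'utf8'));  // 1 — plain ASCII
console.log(Buffer.byteLength('é', 'utf8'));  // 2 — Latin-1 supplement
console.log(Buffer.byteLength('€', 'utf8'));  // 3 — Basic Multilingual Plane
console.log(Buffer.byteLength('😀', 'utf8')); // 4 — outside the BMP
```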

Node is built for massive scalability on applications that (mostly) pass text from one source to another. Thus, having to convert the encoding of every string that passes through node can be a bottleneck.

Felixge has a good writeup of this: http://debuggable.com/posts/streaming-utf-8-with-node-js:4bf...


UTF-8 takes 8 bits for most characters

It should be noted that "most" here presumably means "most characters in an average English or western/central European language text" as out of the ~2^21 (~2 million) Unicode code points, only 128 are represented using 8 bits in UTF-8.


It doesn't matter. Whenever ASCII is an option, UTF-8 is optimal too.

ASCII is not an option for languages other than English, and even then only with poor typography and no ability to handle foreign names and addresses (e.g. LinkedIn made the horrible mistake of using Latin-1 initially; I still have contacts with &xxxx; visible in their names).

I think node.js should use UTF-8 by default, and require users to consciously switch bottleneck parts of their apps to ASCII.


I wasn't stating my opinion in my last post, just facts/clarifications.

But yes, I agree that UTF-8 would be a better default than ASCII unless someone provides hard evidence that encoding/decoding is a severe performance bottleneck in most real applications. (Even then, I'd default to the "correct" one, not the fastest.)


Felix doesn't seem to realize that JavaScript already has native functions for this. All of his code can be simplified to decodeURIComponent(escape(utf8ByteString)).
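For reference, here's the trick the parent means. `escape`/`unescape` are long-deprecated but still available as globals in Node; a "UTF-8 byte string" here means a JS string whose char codes are raw bytes:

```javascript
// unescape(encodeURIComponent(s)) yields a byte string of s's UTF-8
// encoding; decodeURIComponent(escape(b)) reverses it back to Unicode.
const utf8ByteString = unescape(encodeURIComponent('café')); // 'cafÃ©'
const decoded = decodeURIComponent(escape(utf8ByteString));
console.log(decoded); // 'café'
```

It works because `escape` percent-encodes each char code as a single byte, which `decodeURIComponent` then interprets as UTF-8.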


It's not the size, it's the work needed to decode and encode data.

Node.js is really, really good at shunting I/O around - it's ideal for writing things like proxies and file upload handlers. With ASCII, the bytes that come in are the bytes that go out again. If you're dealing with UTF-8 and unicode strings every time some data comes in you need to decode it as UTF-8, then pass the unicode string around within Node, then encode it back to bytes before you send it off again.

That makes a lot of sense for a web framework like Django (in fact it's what Django does) but Node is more of an I/O toolkit, so that performance overhead isn't welcome unless it's explicitly needed.


Modern CPUs are constrained by the speed of memory, and the amount of calculation you do on each byte doesn't matter that much. Node.js already takes the hit by copying memory to convert UTF-16 to ASCII.


The uses you mention sound like something for which you'd use byte buffers, not strings.


Which part of the API has ASCII as the default encoding? From the v0.2.0 docs it seems like Buffer objects default to utf8.


Request, response and streams default to ASCII, e.g. response.write(chunk, encoding='ascii')


That should really be in big red text in the docs, considering it actually destroys bits. The API also seems inconsistent: net.Stream writes are encoded as ASCII by default, but plain writable streams default to utf8: stream.write(string, encoding='utf8', [fd])


docs are wrong, it defaults to utf8


Finally, a (hopefully) frozen API for 0.2.x!

I hope Ubuntu 10.10 Maverick gets a stable 0.2.x version of Node.


If you want to keep track of hot, fresh node-y goodness independently of the Ubuntu release cycle (as I do), then please enjoy my nodejs PPA builds.

They're built for lucid, but run fine on maverick, and like everything else in that PPA, are used in production (thus I have an incentive to maintain them well).

  https://launchpad.net/~jdub/+archive/ppa
Enjoy!

(Note: I build a static version of node built against the internal copy of libraries it ships, rather than the dynamic build used by the main Debian and Ubuntu node packages. I really only do this to avoid maintaining those libraries in my PPA as well, and ryah keeps up with their updates anyway.)


Thanks for doing this. Any idea if they'll ever make it into the official repos?

I've avoided installing much from PPA (never sure about security / stability there), but this may break my habit.


Nothing like a big release days before Node Knockout! I can't wait, either way.


What types of projects have you used node.js with?


A lot of projects where I would have used python twisted (http://twistedmatrix.com) in the past, I now use node.js.

It's all projects where I need to connect different protocols together. Like amqp & websockets for example.



