Pigz - multicore version of gzip. (zlib.net)
48 points by vluft on March 31, 2010 | 22 comments



I am just curious, why not implement this within gzip instead of forking it out as a completely new project?


What's the benefit of that? I see downsides: have to deal with the FSF, have to work with the existing codebase, have to fight to get the changes accepted, etc. It can always be ported into gzip if that becomes desirable.


The FSF is not involved in the development of zlib. Moreover, pigz and zlib are maintained/developed by the same person.


Non-sequitur? gzip != zlib


gzip ~= zlib. It's the same authors and the same algorithms.


"have to fight to get the changes accepted"

One of the main reasons I stay the hell away from OSS unless I can fork it.


That was exactly my thought when seeing the website.


This is REALLY useful when you are bundling AMI images.

When we changed NASA Nebula from Xen to KVM, we had to tweak a few files in every image; this sped up the conversion by minutes per image.



The next parallelized compressor I'd like to see is LZMA. It's basically Lempel-Ziv, like gzip, but with a much wider window (tens of megabytes, vs. 32 KB for gzip), and because it's dictionary-based compression it decompresses much faster than bzip2.
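To make the window-size point concrete, here is a minimal sketch using only Python's standard `zlib` and `lzma` modules. It builds a random 64 KB block followed by an exact copy of itself, so the only redundancy sits 64 KB back, beyond DEFLATE's 32 KB window but well inside LZMA's default dictionary. The helper name `repeated_block` and the sizes are my own choices for illustration, not anything from gzip or xz.

```python
import lzma
import random
import zlib


def repeated_block(block_size: int = 64 * 1024, seed: int = 0) -> bytes:
    """Return an incompressible random block followed by an exact copy of itself."""
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(block_size))
    return block + block


data = repeated_block()

# DEFLATE can only reference matches up to 32 KB back, so it cannot see the copy.
deflate_size = len(zlib.compress(data, 9))

# LZMA's default dictionary is several megabytes, so the copy is found.
lzma_size = len(lzma.compress(data))

print(f"input: {len(data)}  deflate: {deflate_size}  lzma: {lzma_size}")
```

On this contrived input, DEFLATE should barely shrink the data at all, while LZMA should get close to half the input size. Real files will show a much less dramatic gap.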


http://jnovy.fedorapeople.org/pxz/

It's definitely needed since XZ is fairly slow.


Sounds good. If it works well, I'd love to see it become the new default on all *nix distros. It's about time some of these basic tools get updated for multi-core hardware.


We used this in SUSE Studio to decrease build time. Worked well.

There's also MPIBZIP2 which uses (IIRC) libmpi to do bzip2 compression/decompression across several machines. We messed with that too, but it wasn't worth the added complexity.


It's in FreeBSD's ports:

  /usr/ports/archivers/pigz            Parallel GZIP


Just out of interest, does anyone have any numbers on whether gzip is more I/O-bound or CPU-bound? I would've thought I/O would be the bigger factor with gzip, but am happy to be proven wrong.
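One rough way to check this on your own machine is to time DEFLATE in memory, with no disk involved, and compare the result with your disk's sequential read rate. The sketch below uses Python's standard `zlib` module. The function name `deflate_throughput_mb_s` and the sample text are hypothetical, and highly repetitive text like this compresses faster than typical files, so treat the number as an upper-ish bound and rerun it on your real data.

```python
import time
import zlib


def deflate_throughput_mb_s(data: bytes, level: int = 6) -> float:
    """Compress data in memory and return input throughput in MB/s."""
    start = time.perf_counter()
    zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / elapsed / 1e6


# Hypothetical sample input, about 9 MB of repetitive text.
sample = b"The quick brown fox jumps over the lazy dog. " * 200_000

rate = deflate_throughput_mb_s(sample)
print(f"single-core DEFLATE throughput: {rate:.0f} MB/s")
```

If the printed rate is well below what your storage can deliver, compression is CPU-bound on that machine and spreading it across cores should help. If it is well above, I/O is the limit.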


Anyone looked at the source yet? Is it just multi-threaded, or is it actually using a parallel framework like Intel TBB or OpenMP?


It uses pthreads. The source is very approachable and small, if you want to dig a bit more.
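For anyone who wants the general idea without reading the C, here is a minimal sketch of block-parallel compression in Python. It is not pigz's actual implementation: it splits the input into fixed-size blocks, compresses each one on a thread pool as a standalone gzip member, and concatenates the members, which is a valid gzip file. The names `parallel_gzip`, `_gzip_member`, and `BLOCK_SIZE` are made up for this example, and it assumes CPython's `zlib` releases the GIL while compressing, which as far as I know it does.

```python
import gzip
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 128 * 1024  # hypothetical block size chosen for this sketch


def _gzip_member(block: bytes) -> bytes:
    """Compress one block as a complete, standalone gzip member."""
    # wbits = 16 + MAX_WBITS asks zlib to emit a gzip header and trailer.
    compressor = zlib.compressobj(6, zlib.DEFLATED, 16 + zlib.MAX_WBITS)
    return compressor.compress(block) + compressor.flush()


def parallel_gzip(data: bytes, workers: int = 4) -> bytes:
    """Compress blocks concurrently and concatenate the gzip members in order."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the output stays in sequence.
        return b"".join(pool.map(_gzip_member, blocks))


payload = b"adding cores makes pigz fly\n" * 100_000
compressed = parallel_gzip(payload)
print(len(payload), "->", len(compressed))
```

Because each block is compressed independently, the ratio is a little worse than a single stream would give, and a decompressor still has to walk the members one after another. `gzip.decompress(compressed)` returns the original payload, since multi-member gzip data is handled by the standard decoder.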


"How is it pronounced?"

Australians would pronounce it as "pig-zed". Just one of those novelties I guess... :)


According to the site, it's actually pronounced "pig-zee."

(I quite dislike the name, myself. And I think everyone's actually going to pronounce it "pigs".)


I'm Australian, and I'd pronounce it "pigs".


Unless, perhaps, they have memories of watching Monkey during their childhood:

http://en.wikipedia.org/wiki/Monkey_(TV_series)


I would guess "pigs", as in "adding cores makes pigz fly" ;)



