Pigz - multicore version of gzip.

janitha · on March 31, 2010

I am just curious, why not implement this within gzip instead of forking it out as a completely new project?

jey · on April 1, 2010

What's the benefit of that? I see downsides: have to deal with the FSF, have to work with the existing codebase, have to fight to get the changes accepted, etc. It can always be ported into gzip if that becomes desirable.

Rexxar · on April 1, 2010

The FSF is not involved in the development of zlib. Moreover, pigz and zlib are maintained/developed by the same person.

jey · on April 1, 2010

Non-sequitur? gzip != zlib

Rexxar · on April 2, 2010

gzip ~= zlib. It's the same authors and the same algorithm.

krainboltgreene · on April 1, 2010

"have to fight to get the changes accepted"

One of the main reasons I stay the hell away from OSS unless I can fork it.

ulrich · on April 1, 2010

That was exactly my thought when seeing the website.

anotherjesse · on March 31, 2010

This is REALLY useful when you are bundling AMI images.

When we changed NASA Nebula from Xen to KVM, we had to tweak a few files in every image; pigz sped up the conversion by minutes per image.

jey · on March 31, 2010

Parallel bzip2: http://compression.ca/pbzip2/

lsb · on March 31, 2010

The next parallelized compressor I'd like to see is LZMA. It's basically Lempel-Ziv, like gzip, but with a much wider window (tens of megs, vs. gzip's 32 KB), and because it's dictionary-based compression it decompresses much faster than bzip2.

wmf · on April 1, 2010

http://jnovy.fedorapeople.org/pxz/

It's definitely needed since XZ is fairly slow.

MikeCapone · on March 31, 2010

Sounds good. If it works well, I'd love to see it become the new default on all *nix distros. It's about time some of these basic tools get updated for multi-core hardware.

natfriedman · on April 1, 2010

We used this in SUSE Studio to decrease build time. Worked well.

There's also MPIBZIP2, which uses (IIRC) libmpi to do bzip2 compression/decompression across several machines. We messed with that too, but it wasn't worth the added complexity.

Luyt · on March 31, 2010

It's in FreeBSD's ports:

  /usr/ports/archivers/pigz            Parallel GZIP

_b8r0 · on April 1, 2010

Just out of interest, does anyone have any numbers on whether gzip is more I/O-bound or CPU-bound? I would've thought I/O would be the bigger factor with gzip, but I'm happy to be proven wrong.
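
One rough way to check this on your own machine is to time compression entirely in memory, which takes the disk out of the picture, and then compare the resulting MB/s with what your storage can sustain. A minimal sketch, assuming Python's standard zlib module as a stand-in for gzip's deflate; the helper name `deflate_throughput_mb_s` and the synthetic payload are made up for illustration, and real files will compress at different speeds:

```python
import os
import time
import zlib


def deflate_throughput_mb_s(data: bytes, level: int = 6) -> float:
    """Return in-memory deflate throughput in MB/s for `data` (no disk I/O)."""
    start = time.perf_counter()
    zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very small inputs.
    return len(data) / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Synthetic payload: part random bytes, part repetitive text.
    # This is an assumption for the sketch, not a representative corpus.
    payload = os.urandom(4 * 1024 * 1024) + b"pigz " * (800 * 1024)
    rate = deflate_throughput_mb_s(payload)
    print(f"deflate level 6: {rate:.1f} MB/s")
```

If that figure comes out well below the sequential read speed of your disk, compression is the bottleneck on that machine and extra cores have room to help.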

kvs · on April 1, 2010

Anyone looked at the source yet? Is it just multi-threaded, or is it actually using a parallel framework like Intel TBB or OpenMP?

sparky · on April 1, 2010

It uses pthreads. The source is very approachable and small, if you want to dig a bit more.
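
For a feel of the general idea, here is a minimal sketch in Python rather than C with pthreads. It is not pigz's actual design: it simply compresses fixed-size chunks as independent gzip members on a thread pool and concatenates them, which the gzip format permits. The function name `parallel_gzip`, the chunk size, and the worker count are assumptions for illustration.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data: bytes, chunk_size: int = 128 * 1024, workers: int = 4) -> bytes:
    """Compress `data` as a sequence of independent gzip members, one per chunk."""
    # Fall back to a single empty chunk so empty input still yields a valid stream.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the members line up with the chunks.
        members = list(pool.map(gzip.compress, chunks))
    return b"".join(members)


if __name__ == "__main__":
    original = b"adding cores makes pigz fly\n" * 100_000
    packed = parallel_gzip(original)
    # A multi-member stream decompresses to the concatenation of its members.
    assert gzip.decompress(packed) == original
    print(len(original), "->", len(packed), "bytes")
```

The output can be read by any gzip-compatible decompressor. Because each chunk starts with an empty dictionary, this approach gives up a little compression ratio compared with a single stream.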

pan69 · on March 31, 2010

"How is it pronounced?"

Australians would pronounce it as "pig-zed". Just one of those novelties I guess... :)

javert · on April 1, 2010

According to the site, it's actually pronounced "pig-zee."

(I quite dislike the name, myself. And I think everyone's actually going to pronounce it "pigs".)

jlangenauer · on April 1, 2010

I'm Australian, and I'd pronounce it "pigs".

jsankey · on March 31, 2010

Unless, perhaps, they have memories of watching Monkey during their childhood:

http://en.wikipedia.org/wiki/Monkey_(TV_series)

abstractbill · on March 31, 2010

I would guess "pigs", as in "adding cores makes pigz fly" ;)