There's quite a lot of retro modern LZ activity too! LZ turns out to be amazing on old machines, often only a couple times slower than a block copy. Optimal compressors and control over the algorithm have led to some very tight demos.
https://github.com/emmanuel-marty/lzsa - LZSA - a LZ4-like modern LZ that's more efficient than LZ4 in both speed and compression (at least on the 8-bitters it targets) - includes a neat chart of speed/compression trade-offs on a ZX Spectrum with a variety of algorithms
"Managing Gigabytes" by Witten et al is _the_ book that got me into the field of compression back in the day. Granted, it’s a bit dated now as there’s been a lot of progress in the field, but I’d still heartily recommend it to newcomers.
Only partly related, since it deals with a specific type of data (English language Wikipedia), some of you might enjoy reading about the Hutter Prize: https://en.wikipedia.org/wiki/Hutter_Prize
> First, we just shorten any symbols that are longer than our maximum code length — 11 — to that value. This means that our tree will no longer be a Huffman tree, so we need to do a fixup pass to redistribute the error we've introduced.
I've spent hours trying to understand how you apply this theory. The internet is surprisingly short on actual examples showing how it's done. The best I've found that explains package-merge is this page: https://create.stephan-brumme.com/length-limited-prefix-code...
I just made a Huffman encoder on the C64, for fun. I need to understand how you go from the variable-length codes of Huffman to suddenly fixed-length codes, because you don't want the codes to be above a certain length. Hmm...
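For anyone else stuck on this: the "shorten then fix up" idea from the article can be sketched directly in terms of the Kraft inequality. Here's a toy greedy repair in Python (my own illustration of the clamp-and-fixup approach, not package-merge, which is what you'd use for truly optimal length-limited codes):

```python
def limit_code_lengths(lengths, max_len):
    """Clamp Huffman code lengths to max_len, then repair the Kraft sum.

    A prefix code is valid iff sum(2^-l) <= 1 (Kraft inequality).
    Clamping long codes can push the sum over 1; we fix that by
    lengthening other codes, cheapest (deepest) first.
    Assumes len(lengths) <= 2^max_len, or no valid code exists at all.
    """
    lengths = [min(l, max_len) for l in lengths]
    budget = 1 << max_len  # Kraft sum scaled by 2^max_len to stay in integers
    kraft = sum(1 << (max_len - l) for l in lengths)
    while kraft > budget:  # tree is over-subscribed
        # Lengthen the deepest code that can still grow: it carries the
        # least probability mass, so this costs the least compression.
        i = max((j for j, l in enumerate(lengths) if l < max_len),
                key=lambda j: lengths[j])
        kraft -= 1 << (max_len - lengths[i] - 1)  # 2^-l becomes 2^-(l+1)
        lengths[i] += 1
    return lengths
```

E.g. clamping the (valid) lengths [1, 2, 3, 3] to a maximum of 2 gives [2, 2, 2, 2]: the clamp alone would leave an over-full tree, so the 1-bit code gets lengthened too.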
This might not apply to all LZ based programs/algorithms but compress (an old Unix utility based on the Lempel-Ziv-Welch algorithm) fell out of favour after it ran afoul of patent issues (you can read more about it on the compress Wikipedia page [0] and a section of the GIF Wikipedia page [1] which covers more about the enforcement of the patent). From what I can gather though, it enjoyed considerable success and was pretty much the standard utility for compressing data on Unix. I think eventually though it would have been replaced because other algorithms (bzip2, gzip, etc.) have slightly better compression ratios (even if they are more computationally expensive).
LZ is popular because it's computationally efficient. When random memory access is slow but sequential access is fast, you want to encode substrings instead of individual symbols. Explicit higher-order probabilistic models require a lot of memory and generate too many cache misses. Implicit models based on the Burrows-Wheeler transform also have poor memory access patterns.
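To make "encode substrings instead of individual symbols" concrete, here's a toy greedy LZ77 in Python. This is purely illustrative (my own sketch): real matchers use hash chains or suffix structures instead of this quadratic scan, and real formats don't emit naive triples.

```python
def lz77_compress(data, window=255):
    """Greedy LZ77: emit (offset, length, next_char) triples."""
    out = []
    i = 0
    while i < len(data):
        best_off, best_len = 0, 0
        for j in range(max(0, i - window), i):
            l = 0
            # Matches may overlap past i; the decompressor copies
            # byte by byte, so overlapping copies work (classic LZ77).
            while i + l < len(data) - 1 and data[j + l] == data[i + l]:
                l += 1
            if l > best_len:
                best_off, best_len = i - j, l
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

def lz77_decompress(triples):
    """Rebuild the string: copy `length` bytes from `offset` back, then
    append the literal. Sequential access only, tiny state."""
    out = []
    for off, length, nxt in triples:
        for _ in range(length):
            out.append(out[-off])
        out.append(nxt)
    return ''.join(out)
```

Even this toy version shows the appeal: decompression is a pure sequential copy loop with no model state to speak of.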
I think that's misidentifying where credit is due: LZ might not be the most efficient compression format when it comes to size, but it's very efficient computation-wise when compressing and decompressing, and this still holds true today even when you consider newer algorithms like Zstandard (which uses a derivative of LZ). Sure, if you need to archive data in its most compact representation at any cost, LZ and derivatives won't deliver it, but unless there's a significant change to how computing works, the trade-off is still in LZ's favour. At the time DEFLATE was designed, LZ simply beat all the competition, since realistically nothing else could run decompression in real time using minimal memory.
lzma is slow to decompress. Unless the bandwidth / storage costs are the primary concern, zstd has good enough compression ratio and the very fast decompression makes everything much nicer. Built-in threaded compression, long range compression, binary diff, dictionary compression, etc. are a bonus.
I never understood zstd. It's basically lzma+gzip+lz4 packed together. Show me any zstd level with a speed/size trade-off that none of the levels of those three can match.
> Show me any zstd level with a speed/size trade-off that none of the levels of those three can match.
In case you are not just trolling: I had a very large JSON file with a large amount of text on my SSD, and the compression times were as follows [1]:
Zstandard is not just lzma+gzip+lz4. It is better than everything you've mentioned at least for my test case. In fact I first compressed with zstd and tried to match the compression time for others, because zstd is very fast and using anything else as a reference point would have taken me much more time. It does have enough reason to claim itself to be "standard".
[1] Tested with an i7-7700 @ 3.60GHz and 48 GiB of RAM (but no algorithm tested uses more than several MBs of memory anyway). I'm using a pretty fast SSD here, so I/O speed is not much of a concern. Also note that every algorithm except for lzma is single-threaded.
EDIT: axiolite wanted to see --single-thread, and while my Linux box only has an older zstd (1.3.3) without that option, I realized I do have a Windows executable for a recent zstd (1.5.0). Both executables ran on the same machine, but I can't guarantee anything about the 1.5.0 binary.
--single-thread probably doesn't do what you think: it forces zstd to serialize I/O with the compression job, and it doesn't mean compression itself is multi-threaded with plain -T1 [EDIT]. I can't try it anyway because my zstd version is slightly older (1.3.3), but I can try pigz:
3,664,854,169 0:42 pigz -1 -p4
3,664,854,169 0:28 pigz -1 -p8 (default, also probably the max possible with my box)
Now I'm very curious where you got your copy of zstdmt. (I'm using stock Ubuntu packages for all of them.)
[EDIT] Was "it doesn't make compression itself multi-threaded", which can falsely imply that --single-thread seemingly enables multi-threaded compression but it doesn't ;)
Very interesting. It might be the case that zstd is optimized more for recent machines; zstd famously uses four interleaved compression streams to maximize instruction-level parallelism, and that might not work as well on older machines. I haven't seen any machine where zstd is significantly slower than it should be, but the machines I could test date from 2013 or later. Alternatively, the RHEL package might have been built for recent machines. It would be interesting to test a binary optimized for the current machine (-march=native -mtune=native).
Isn't that exactly the point, and why it's so great? It's a single algorithm that's tune-able across that wide range of speed/ratio trade-offs using a single parameter. So you can just use one thing for almost every application instead of picking between three different things.
(Yes, I know that one parameter controls multiple different settings internally, so there are multiple dimensions if you're willing to dig that deep.)
Anyway, my recollection from looking at benchmarks a while ago: zstd used at similar ratios to lzma compresses in similar time but decompresses much faster, and it's also faster than gzip at comparable ratios. lz4 is still faster than the fastest zstd modes, and lzma at its most extreme settings still gets better ratios than the best zstd can do. But there's a huge swath in the middle where zstd beats them all, and that's quite valuable.
> So you can just use one thing for almost every application instead of picking between three different things.
I honestly can't see how remembering several ranges of zstd levels and their speed/ratio trade-offs is any easier than remembering lz4, pigz, xz, which all happen to have good default settings.
I use zstd once a year, and it is very easy:
1. the default (3 on my machine) is fast and compresses well.
2. if you need to get the best compression level for your use case, use the benchmark "zstd -b1 -e15 yourfile" and after a few minutes you have your answer.
I can't see why I should ever in my life use another program for compressing things than zstd.
It's really not lzma+gzip+lz4, except if you mean "it's an LZ-alike". It's a modern LZ with repcodes (kinda LZMA-ish, but honestly more like LZX), and it uses a different entropy-coding scheme (Huffman and FSE, a form of ANS) instead of a range coder (LZMA) or no entropy coder at all (LZ4).
Here are some DB dump compression tests done years ago, look at the pareto frontier and see that zstd wins between LZ4 and LZMA (xz): https://zeevt.github.io/compress_pareto.html
> For our goal of a high compression LZ variant, we will want to Huffman code our symbols (zstd actually uses FSE, a variant of arithmetic coding to do this, but we will cover that in a future post).
For high entropy data (=somewhat random data), FSE is quite comparable to Huffman in compression speed.
For low entropy (lots of high probability symbols, like zeroes), Huffman is about 2-3x faster. But on the flipside, FSE achieves markedly higher compression ratio.
I think FSE is worth the speed tradeoff vs Huffman in most cases.
"FSE achieves markedly higher compression ratio". I do not thin it is true, FSE/ANS achieves slightly better ratios in general.
Zstd uses both Huffman (for large alphabets) and FSE (for small alphabets).
"Arithmetic coding is much, much simpler." Let us agree to disagree.
"And decompression speed is not a limiting factor in most applications of data compression like this". It depends on the application. Zstd and Brotli are certainly aiming at the fastest decompression speed possible.
The Huffman encoding loop is 2 lines and decoding loop is 4 lines of branchless code.
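To illustrate that claim (my own sketch, in Python for readability; the real thing is a handful of lines of C): a table-driven canonical Huffman decode avoids per-symbol branching by mapping every possible MAX_BITS-wide bit window to a (symbol, length) pair, so the inner loop is one lookup plus a shift.

```python
MAX_BITS = 4  # maximum code length; table has 2^MAX_BITS entries

def build_table(codes):
    """codes: {symbol: (code_value, code_length)} for a canonical code.
    Every bit pattern that starts with a code maps to that code's symbol."""
    table = [None] * (1 << MAX_BITS)
    for sym, (code, length) in codes.items():
        for fill in range(1 << (MAX_BITS - length)):
            table[(code << (MAX_BITS - length)) | fill] = (sym, length)
    return table

def decode(bits, nbits, nsyms, table):
    """Decode nsyms symbols from the integer `bits` (MSB-first, nbits wide)."""
    padded = bits << MAX_BITS  # pad so the final window never runs short
    out, pos = [], 0
    for _ in range(nsyms):
        window = (padded >> (nbits - pos)) & ((1 << MAX_BITS) - 1)
        sym, length = table[window]  # one lookup, no data-dependent branch
        out.append(sym)
        pos += length                # consume only the bits actually used
    return out
```

The per-symbol work has no data-dependent branch at all; the only "branch" left is the loop counter, which is exactly the point of the parent comment.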
Do you have an example of branchless arithmetic encoder or decoder ?
https://www.brutaldeluxe.fr/products/crossdevtools/lz4/index... LZ4 Data Compression - a rather long and in-depth article looking at LZ4 on the 65816 for the Apple IIgs with a decompressor that exploits that processor's block copy.