There exists an array-based, in-place algorithm for this, reducing the need to allocate trees and chase pointers.
I mention this only because, when I learned the tree-based approach at uni, I simply wasn't aware that there was another way to do it, and I'm wondering how many of you that's true for as well.
While the tree approach is intuitive and illuminating, it probably makes more sense to work with in-place arrays, since the situations where you care most about compression are usually the ones where you have a lot of data and need it to run fast.
In-Place Calculation of Minimum-Redundancy Codes
Moffat and Katajainen, 1995.
http://hjemmesider.diku.dk/~jyrki/Paper/WADS95.pdf
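In case it helps picture what "in place" means here, a minimal C sketch, roughly following the three-pass pseudocode in that paper: A[] holds the symbol frequencies sorted in ascending order, and on return A[i] is the codeword length of the i-th smallest symbol (the same array is reused for combined weights, parent indices and node depths along the way).

    /* Compute minimum-redundancy (Huffman) codeword lengths in place.
       Precondition: A[0..n-1] are frequencies sorted in ascending order.
       Postcondition: A[i] is the codeword length of the i-th smallest symbol. */
    void calculate_minimum_redundancy(int A[], int n)
    {
        int root, leaf, next, avbl, used, dpth;
        if (n == 0) return;
        if (n == 1) { A[0] = 0; return; }

        /* Phase 1: left to right, pair the two smallest weights at each step,
           storing the combined weight in A[next] and leaving a parent index
           behind in the slot of each internal node that gets merged. */
        A[0] += A[1];
        root = 0;   /* next unconsumed internal node */
        leaf = 2;   /* next unconsumed leaf */
        for (next = 1; next < n - 1; next++) {
            /* pick the smaller of the next internal node and the next leaf */
            if (leaf >= n || A[root] < A[leaf]) {
                A[next] = A[root];
                A[root++] = next;      /* record parent of that internal node */
            } else {
                A[next] = A[leaf++];
            }
            /* and add on the second smallest item */
            if (leaf >= n || (root < next && A[root] < A[leaf])) {
                A[next] += A[root];
                A[root++] = next;
            } else {
                A[next] += A[leaf++];
            }
        }

        /* Phase 2: right to left, convert parent indices into node depths */
        A[n - 2] = 0;
        for (next = n - 3; next >= 0; next--)
            A[next] = A[A[next]] + 1;

        /* Phase 3: right to left, convert internal-node depths into the
           codeword lengths of the leaves, deepest codes assigned first */
        avbl = 1; used = 0; dpth = 0;
        root = n - 2; next = n - 1;
        while (avbl > 0) {
            while (root >= 0 && A[root] == dpth) { used++; root--; }
            while (avbl > used) { A[next--] = dpth; avbl--; }
            avbl = 2 * used; dpth++; used = 0;
        }
    }

All three passes are a single scan over an already-sorted array, which is where the speed and memory win over the pointer-based tree comes from.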
> In-Place Calculation of Minimum-Redundancy Codes
Or in general, refer to "On the Implementation of Minimum Redundancy Prefix Codes" by Moffat and Turpin (1997), as strongly recommended and later explained by Charles Bloom [1].
The JPEG standard ITU T.81 (1992) describes the algorithm in flowcharts, so knowledge of array-based Huffman coding was probably already somewhat common in the 80s.
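And once you have the lengths, handing out the actual codewords is the usual canonical-code trick (my own sketch below, not the exact T.81 flowchart or the Moffat/Turpin tables): sort symbols by length, then assign consecutive codes, shifting left whenever the length increases.

    #include <stdlib.h>

    /* One entry per symbol: its original index, its codeword length
       (e.g. from calculate_minimum_redundancy above), and the code we assign. */
    struct sym { int index; int len; unsigned code; };

    static int by_len_then_index(const void *a, const void *b)
    {
        const struct sym *x = a, *y = b;
        if (x->len != y->len) return x->len - y->len;
        return x->index - y->index;
    }

    void assign_canonical_codes(struct sym s[], int n)
    {
        qsort(s, n, sizeof s[0], by_len_then_index);
        unsigned code = 0;
        for (int i = 0; i < n; i++) {
            s[i].code = code;
            if (i + 1 < n)
                code = (code + 1) << (s[i + 1].len - s[i].len);
        }
    }

For lengths {1, 2, 3, 3} this gives 0, 10, 110, 111, and because the codes are canonical you only need to store the lengths to reconstruct them on the decode side.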