
How much of this could have been solved by simply using compression? That is, instead of making up a new delta language, just store the full files, and let compression tools do their job.

I thought this was the definitive reason why git tracks full files, and not diffs. Turns out, that is just a better way to do things in most cases.



Compression as in zlib works well on 1-dimensional data such as text. Fonts are vectors, described by geometric generator functions, e.g.: render "O" as a circle with its center at 50% x 50% and a line strength of 1.2%.

That description is already an (excellent) compression: a bitmap of the 1000px x 1000px "O" for your poster would be 1MB at one byte per pixel.

Whereas fonts previously only had rules for adapting to size changes, this standard defines weight as another dimension.

It's quite similar to how jpeg, mpeg, and mp3 are better compression methods for their respective domains than WinZip could ever be, because they incorporate knowledge about the data being encoded.


You are still looking at it as "per character" compression. I'd imagine full charset methods could do better.

Additionally, since building the fonts from the source isn't time consuming anymore, you could just focus on compressing the representations that say "circle with center blah". (Which, again, takes this back into METAFONT territory. Not a bad place to be, just bemusing.)


You're right in general, but you gave the example of jpeg, mpeg and mp3 which are all lossy compression. LZW/ZIP compressed images are lossless. I suspect that applying Zip compression to font files might not reduce file size enough to be worth doing for the reason you initially stated.


I guess by interpolating glyphs you will also lose information, so the comparison to jpeg makes more sense.


I suppose the strategy is that, if you know your expected data types and data structures pretty well, it's beneficial to apply delta encoding before general purpose compression, especially for simple RLE compressors.
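To make that concrete, here is a minimal sketch of delta encoding ahead of a general purpose compressor. The coordinate stream is synthetic and deliberately favorable to deltas (a steady ramp with a small repeating wobble), so treat the sizes as an illustration of the idea, not a benchmark of real font or tile data. The helper name pack_int32 is made up for this example.

```python
import struct
import zlib
from itertools import accumulate

# Synthetic coordinates (an assumption for illustration): a ramp with a
# small repeating wobble. Every absolute value is unique, but the
# differences between neighbours repeat as 6, 6, 3, 6, 6, 3, ...
coords = [5 * i + (i % 3) for i in range(2000)]

# Delta encoding: keep the first value, then store each step from the
# previous point instead of the absolute position.
deltas = [coords[0]] + [b - a for a, b in zip(coords, coords[1:])]

def pack_int32(values):
    """Serialize integers as little-endian signed 32-bit words."""
    return struct.pack(f"<{len(values)}i", *values)

raw_compressed = zlib.compress(pack_int32(coords), 9)
delta_compressed = zlib.compress(pack_int32(deltas), 9)

# Decoding is a running sum over the deltas.
restored = list(accumulate(deltas))

print("absolute + zlib:", len(raw_compressed), "bytes")
print("delta + zlib:   ", len(delta_compressed), "bytes")
```

On this input the delta stream is a short repeating pattern, which is exactly what an LZ77-style compressor is good at, while the absolute values never repeat. Real outlines are noisier, so the gap would be smaller.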

In graphics, delta might be better, since the glyph is offset on the canvas anyway: once the last point is known, the renderer can use the diff information as-is, which reduces each step to a simple addition of short numbers instead of absolute positioning.

It might also keep the extracted memory footprint low, since you might be able to get away with fewer bits per encoded change in a typed array data structure.

I still wonder whether Mapbox ever did some size and speed benchmarking of general purpose compressed raw signed integers vs. their zig-zag and delta encoded vector tile geometries. https://github.com/mapbox/vector-tile-spec/issues/35#issue-1...
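For anyone unfamiliar with the zig-zag part, here is a rough sketch of zig-zag plus base-128 varint encoding in the style Protocol Buffers uses for signed integers. The function names and the sample deltas are made up for illustration, and this is not taken from the Mapbox code; it only shows why small signed deltas end up as one or two bytes each.

```python
def zigzag_encode(n):
    """Map signed to unsigned so small magnitudes stay small:
    0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ..."""
    return 2 * n if n >= 0 else -2 * n - 1

def zigzag_decode(z):
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)

def varint_encode(u):
    """Encode a non-negative integer as a base-128 varint:
    7 payload bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = u & 0x7F
        u >>= 7
        if u:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def varint_decode_all(data):
    """Decode a byte string holding back-to-back varints."""
    values, current, shift = [], 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            values.append(current)
            current, shift = 0, 0
    return values

# Hypothetical coordinate deltas: mostly tiny, with two larger jumps.
sample_deltas = [3, -2, 0, -1, 150, -150]

encoded = b"".join(varint_encode(zigzag_encode(d)) for d in sample_deltas)
decoded = [zigzag_decode(u) for u in varint_decode_all(encoded)]

print(len(encoded), "bytes as zig-zag varints vs",
      4 * len(sample_deltas), "bytes as fixed int32")
```

Without the zig-zag step, a negative delta such as -1 would have all of its high bits set and would need the maximum number of varint bytes, which is why the two techniques are usually paired.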


There was a ~70% saving between using variable fonts and regular instances. Compression is already applied to web fonts, and you can compress variable fonts as well, so it is not like they are mutually exclusive.



