Show HN: An interactive guide to compression basics (unwttng.com)
256 points by unwttng on Aug 8, 2017 | 66 comments



I find it funny that GIFs are still being referred to as GIFs [1] while most places serve them as h264 yuv420p MP4s [2].

[1] https://i.giphy.com/l0Iy1ZcHArR9aAQta.gif (2.7MB)

[2] https://i.giphy.com/media/l0Iy1ZcHArR9aAQta/giphy.mp4 (307KB)

[3] https://media.giphy.com/media/l1J3OGcUiw8NeXuM0/source.mp4 (146KB) <-- optimized with ffmpeg


It reminds me of how viral images are referred to as memes, often by people who are ignorant of the original and fuller meaning of the term as a viral idea (as opposed to and deliberately reminiscent of the view of a gene being a viral biological entity, as in "The Selfish Gene" by Richard Dawkins, who coined the term "meme" itself).

Of course, the term "viral" is itself used metaphorically here, and is also used as an analogy to a biological entity. But in this case, I think more people are aware of the existence of viruses than they are of the original meaning of the term "meme", though they might not make the conscious connection between a "viral" idea, video, or image and that of a biological virus or how it spreads "virally".


...because MP4phy doesn't have quite the same ring to it?


I caved and added a "toggle pointless gifs" button, history will judge me well


I actually came here to commend the author for letting me opt out of the nonsense. It may not seem like a big deal for you but the images are distracting to me, require more scrolling in order to reference something that was mentioned previously and do not add anything to your content.

BTW I think your content is great and I imagine you think so too. You commented on the fact that most of the comments were about the GIFs and not the content. I think that should be a not-so-subtle clue about the value of some lady waving in front of a green screen.


I read 90% of my content in emacs-w3m, in the terminal, where no images load unless I deliberately want them to. It also has the advantage of getting rid of javascript and most page design nonsense, flash, non-text ads, and various "Web 2.0" annoyances.

The result is beautiful, consistently formatted, lightweight, and perfectly readable plain text (fully integrated with my editor/OS, incidentally).


the most hacker news comment ever


This comes close, but I think there was one where one user talked about how they use Spacemacs (Emacs with Vim style editing), and use Org files[1] to basically plan their life. Meetings, Calendar events, reminders, notes, TODOs, you name it.

[1]: http://orgmode.org/


Same here.

They also make me see the author as a bit immature (not necessarily true, but still).


Too late. I finally added media.giphy.com to my UBlock filter. This one put me over the edge, since I couldn't focus on the content.

However, otherwise an excellent introductory article. Got me reading about Shannon, Kolmogorov, and information theory now.


Nooooo!


I highly recommend Stephen King's "On Writing." The basic lesson - if it isn't needed, take it out. Kill your babies.

The GIFs are the "your babies" of this article.


Great book recommendation btw. On Writing Well by William Zinsser is a great one too. Less autobiographical (but the Stephen King bio IS great and inspirational - still recommend)


Oh, I forgot to do what I always do when I recommend King's book - skip the autobiography at the beginning :P

Maybe if someone really likes King's works it'd be worth reading, but I didn't find it that interesting. I think perhaps if someone wanted to make a career out of writing, it'd be good to get a perspective on what it took him to "make it." Otherwise, eh.


I quite liked it. The struggle was interesting. Reminded me of The War of Art by Pressfield, but less or more extreme depending on how you look at it.


Thank you. Normally I'd enjoy the "pointless gifs" but they make it hard for me to read your article at work.


You got it.

Spoilsport :P


This is great content! Do you mind if I ask how you made it? Specifically the interactive demo? I've been working myself on a project[1] to explore tools for interactive demos, but so far I've just done one with a toy piece of content.

1: https://akud.github.io/visualization-blogs/posts/01_content_...


He made it with React.


I was able to actually read the article because of this. Thanks.


I found this quite hard to read despite the interesting content, mainly due to the animated gifs inserted throughout the article. It's very hard to focus on a line of text when there's an image darting around on the page. I wonder why the author decided to include them?


Same here, they're useless and look very unprofessional. Keep only the ones that are actually useful.


Quick solution: add the element to your adblocker. Most let you quickly select elements on the page.

Make sure to get the actual container (might need to do the image first) so you aren't left with giant gaps.


The fun images may have seemed like a way to lighten things up, but here they're a distraction from the content. Even as still images, every section break doesn't need a happy monkey or Poison Ivy.


Damn though they are _fun_


Gary Bernhardt has a great live screencast where he writes a compressor - for anyone that found this hard to follow: https://www.youtube.com/watch?v=3Eu9ZVZEZ3I


Awesome, I love Gary's videos, and this is one I haven't seen. Thanks!


Lose the gifs, and you have a very good article. I think they distract, rather than enhance.


Or at least a way to toggle them all off at the beginning


Done, please enjoy your (mostly) gif-free reading


OP here - I'm seeing 50/50 support for the gifs. I'm keepin my gifs. I like the idea of adding, and will probably implement, a toggle for all extraneous gifs. I love that most of the commentary about this article is about the gifs. Gifs.


I wouldn't mind the GIFs if they were A. smaller (most of them take up a good 2/3rds of the height of my viewport at 1600x900) and B. could be paused. They're humorous at first, but then they're distracting as I'm trying to read the stuff around them.


Personally I didn't mind the content of the gifs, it was the fact that they made it difficult to keep track of where you were in the text around them as I tried to read it.

Perhaps some people are less susceptible to this than others.


the gifs are funny. hacker news is just a bunch of anoraks.


The pixel art doesn't show up for me in Firefox or Edge. (Looks like they're there, just with a height of 0px?) Also, my motion-sensitive lizard brain thanks you for the gif toggle button.


On it, just me being a shitty web dev :P


Fixed at least on Firefox, please enjoy your pixelly goodness


Unless I misread or misunderstood, you seem to invert the meaning of compression ratio halfway through?

> (100 / 200) = 0.5. Protip: compression ratios less than 1 are frowned upon.

> Unfortunately, it's not that simple. Say we had an algorithm (let's call it A) that, given any input whatsoever, was capable of achieving a compression ratio of strictly less than 1.
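
A minimal sketch of the two quoted ideas, assuming the first quote's convention (ratio = uncompressed size / compressed size, so values above 1 mean the data shrank). The function names are illustrative and not taken from the article. The second function counts how many bit strings are shorter than n bits, which is why no lossless algorithm can shrink every possible input: there are fewer short outputs than inputs.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Uncompressed size divided by compressed size; above 1 means the data shrank."""
    return original_size / compressed_size


def count_strings_shorter_than(n_bits: int) -> int:
    """Number of distinct bit strings of length 0 through n_bits - 1."""
    return sum(2 ** k for k in range(n_bits))


# The quoted case: a 100-unit input that "compresses" to 200 units.
print(compression_ratio(100, 200))  # 0.5

# There are 2**8 = 256 inputs of exactly 8 bits but only 255 shorter strings,
# so at least two inputs would have to share an output if all of them shrank.
n = 8
print(2 ** n, count_strings_shorter_than(n))  # 256 255
```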


I did indeed, thanks for the catch - fix incoming


Yeah came here to point this out, that's definitely an error.


Once you understand RLE, LZ is only one step away --- instead of repetitions of individual characters/bytes/etc., you encode repetitions of longer strings.

But starting with RLE is IMHO definitely a good choice --- far better than Huffman, as a lot of introductory material seems to do. A minimal LZ12/4 (4KB window, 18B max length; an old favourite of the demoscene intro packers) compressor/decompressor pair is literally a few hours' worth of work, and yields surprisingly good compression for its simplicity, much better than simple order-0 Huffman.
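
To make the "one step away" point concrete, here is a rough sketch of both ideas, not anyone's production packer. It assumes a 4096-byte window and a maximum match length of 18 as described above, plus a minimum match length of 3 (my assumption, so that a 4-bit field can cover lengths 3 to 18). The brute-force match search and the token format (plain ints for literals, `(offset, length)` tuples for matches) are purely illustrative, and no bit packing is done.

```python
from itertools import groupby

WINDOW = 4096    # how far back a match may start (12-bit offset)
MIN_MATCH = 3    # assumed: shorter matches are emitted as literals
MAX_MATCH = 18   # longest match a 4-bit length field can express here


def rle_encode(text):
    """Run-length encoding: collapse each run into (symbol, run length)."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    return "".join(ch * count for ch, count in pairs)


def lz_encode(data: bytes):
    """LZ77-style encoding: repeat earlier strings instead of single symbols."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Brute-force search of the window for the longest earlier match.
        for j in range(max(0, i - WINDOW), i):
            length = 0
            while (length < MAX_MATCH and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= MIN_MATCH:
            tokens.append((best_off, best_len))  # back-reference
            i += best_len
        else:
            tokens.append(data[i])               # literal byte
            i += 1
    return tokens


def lz_decode(tokens) -> bytes:
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            offset, length = tok
            # Copy byte by byte so a match may overlap the bytes it produces.
            for _ in range(length):
                out.append(out[-offset])
        else:
            out.append(tok)
    return bytes(out)


print(rle_encode("aaab"))            # [('a', 3), ('b', 1)]
print(lz_encode(b"abcabcabcabc"))    # [97, 98, 99, (3, 9)]
```

RLE can only say "this symbol, n times"; the LZ token says "the string that started `offset` bytes ago, `length` bytes long", which covers RLE runs as the special case offset 1.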


This is nice! One minor error - 本 does not mean a tree, but a book or a root. 木 would be appropriate for the word 'tree'.


Contrary to the rest of the comments here, I enjoyed the gifs.


Same here.


Didn't bother at all here either. Great article.


I'm probably getting old because I can't stand this new trend of putting useless gifs (or mp4's actually) every two paragraphs in every article on the web. Oh, and of course all of them have autoplay enabled because the sole purpose of my laptop's fan is to happily spin at full speed. </random rant>


Awesome! Thanks!

Any chance you want to do one on compression of an integer time series? How about variable-length integers? Such an article would be very appreciated in IoT circles since data (timed voltage values) transfer and storage can get quite expensive in dollars and latency.

Cheers!
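
For what it's worth, one common approach to this (a sketch under my own assumptions, not something from the article) is to store the difference between consecutive samples and write each difference as a variable-length integer. The version below assumes integer samples, uses zigzag mapping so small negative deltas stay small, and uses a 7-bits-per-byte varint; the sample readings and function names are made up for illustration.

```python
def zigzag(n: int) -> int:
    """Map signed to unsigned: 0, -1, 1, -2, 2 ... become 0, 1, 2, 3, 4 ..."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(z: int) -> int:
    return z // 2 if z % 2 == 0 else -((z + 1) // 2)


def encode_varint(value: int) -> bytes:
    """7 payload bits per byte; the high bit says another byte follows."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def compress_series(samples) -> bytes:
    """Delta-encode the samples, then write each delta as a zigzag varint."""
    out = bytearray()
    previous = 0
    for sample in samples:
        out += encode_varint(zigzag(sample - previous))
        previous = sample
    return bytes(out)


def decompress_series(blob: bytes):
    samples, previous = [], 0
    value, shift = 0, 0
    for byte in blob:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7              # more bytes belong to this integer
        else:
            previous += unzigzag(value)
            samples.append(previous)
            value, shift = 0, 0
    return samples


# Hypothetical voltage readings in millivolts.
readings = [3300, 3301, 3299, 3299, 3305]
packed = compress_series(readings)
print(len(packed))  # 6 bytes, versus 20 bytes as five 32-bit integers
```

This only pays off when neighbouring samples are close together; a noisy signal with large jumps gets little benefit from the delta step.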


The first example (tree represented in Japanese) seemed a bit misleading, because the "alphabet" has not been kept constant. Since the Japanese alphabet is much larger, it may be argued that the number of bits actually occupied in storage by "本" and "tree" is about the same. Could someone clarify if this is correct reasoning?


One could argue that in information theory terms, there is more information encoded in a single "本" than a single "T".

However, this article is dealing with the concept of compression in terms of a simple symbolic representation of data.
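
On the storage question, a quick check, assuming the text is stored as UTF-8 or UTF-16 (the article does not say which encoding it has in mind, and `encoded_size` is just a helper name for this example):

```python
def encoded_size(text: str, encoding: str = "utf-8") -> int:
    """Number of bytes the text occupies in the given encoding."""
    return len(text.encode(encoding))


for text in ("本", "木", "tree"):
    print(
        text,
        len(text), "symbol(s),",
        encoded_size(text, "utf-8"), "bytes in UTF-8,",
        encoded_size(text, "utf-16-le"), "bytes in UTF-16",
    )
```

In UTF-8 the single kanji takes 3 bytes against 4 for "tree", so the byte counts are indeed much closer than the symbol counts suggest; in UTF-16 it is 2 bytes against 8.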


Certainly, and I deliberately didn't get into bytes and encoding until after this - I was trying to get across the softer idea that in terms of space-on-a-page-using-a-pen, you've saved.


I really like the article. One point I'd like to see: you skipped over the fact that you're using palettes without mentioning it, even though that gets you down from using 3 (or 4) bytes a pixel to 1/4. It's a compression ratio of 8 that you're completely ignoring!
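
A rough sketch of the palette idea, under my own assumptions: pixels are RGB tuples (3 bytes each when stored raw), and each pixel is replaced by the smallest fixed-width index into a table of the distinct colours. The exact saving depends on the palette size and the raw pixel format, and the palette table itself also has to be stored, so treat the numbers here as illustrative only. The function names and the test image are hypothetical.

```python
def palettize(pixels):
    """Replace each pixel with a fixed-width index into a colour table."""
    palette = sorted(set(pixels))
    index_of = {color: i for i, color in enumerate(palette)}
    bits = max(1, (len(palette) - 1).bit_length())   # bits per index
    packed = 0
    for n, pixel in enumerate(pixels):
        packed |= index_of[pixel] << (n * bits)
    total_bits = bits * len(pixels)
    return palette, bits, packed.to_bytes((total_bits + 7) // 8, "little")


def unpalettize(palette, bits, count, data):
    packed = int.from_bytes(data, "little")
    mask = (1 << bits) - 1
    return [palette[(packed >> (n * bits)) & mask] for n in range(count)]


# Hypothetical 16-pixel image that uses only four colours.
image = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)] * 4
palette, bits, data = palettize(image)

raw_bytes = 3 * len(image)            # 48 bytes as raw RGB
table_bytes = 3 * len(palette)        # 12 bytes for the colour table
print(bits, len(data), raw_bytes, table_bytes)  # 2 4 48 12
```

With four colours each pixel needs 2 bits, i.e. a quarter of a byte, before any run-length step is applied on top.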


> This might eek out a little more compression

It's "eke": http://www.dictionary.com/browse/eke


Woah, TIL


This is a fantastic explanation. Thanks! (Also, I love the gifs)


Add a way to toggle the gifs off at the beginning for people that don't like memes distracting them from the content and you've got yourself a winner.


Done and done


Great article, would be interested in more in depth articles on say, JPEG compression in the future.

Really liked the better/worse thing too, added some nice comparison


I came here to ask the same. Can anyone recommend more in depth articles about compression?


Fantastic article. I really enjoyed the inclusion of a live, clickable demo with example output. Absolutely well done (after I disabled the GIFs).


Great article, exquisite gifs. :^)

Including the size of each Obama JPEG could be pretty interesting too.


I toggled the gifs off since I read the comments before the article. I must say, the article is excellent and you could even share it with a non-technical manager or spouse*

*goes and talks to wife about compression basics :)


write a follow-up. it's interesting


Any thoughts on what you'd like to see in particular? Likely to be more of the same interactivity


I'd like to see some examples of how lossy compression algorithms work


> TODO - proper formatting for maths-y notation

Hmm


Ewwww cached web



