Stippling and Blue Noise (2011) (joesfer.com)
52 points by uoaei on May 22, 2023 | 8 comments



I wonder how this is done in inkjet printers. They basically do "stippling" too, even with multiple colors, as noted at the end. I somehow doubt they use similarly advanced algorithms, though: partly because those printers have existed for quite a long time, and partly because they (probably?) can't control individual drops that precisely.


A lot of the research into stippling (dithering) algorithms was funded by printer companies in the 80s and 90s, and before then by Kodak.

http://hajim.rochester.edu/ece/sites/parker/assets/pdf/44%20...

Editing to add:

A reference to Robert Ulichney's author page on IEEE Xplore:

https://ieeexplore.ieee.org/author/37325326600

I have a copy of his book "Digital Halftoning" and recommend it.


Nice that you posted Ulichney's author page. I am a co-author of the second paper in the list, which describes a parallelizable error diffusion algorithm with amazing image quality, developed by the first author (not me), as part of her PhD thesis.

This has not been deployed in a product, however, mostly because new products take the approach described in the third paper in the list, which is (part of what is) marketed as "HP pixel control". That approach is inherently a method for color imaging, and it also opens up many new possibilities.


hmmm... the author pages for the associated co-authors also seem quite interesting :)


Oh that's interesting, and it actually makes a lot of sense that printer companies were pioneers of these algorithms. Looking a bit into the first paper, we see (pp. 1924, 1928) that the blue noise mask/pattern doesn't quite reach the quality of error diffusion, which seems to be the gold standard.

It's generally interesting that these algorithms exist in versions both for fixed pixel grids and for variable dot distances of printers.
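
For anyone who hasn't seen error diffusion: here's a minimal single-channel Floyd-Steinberg sketch in Python (the classic serial form on a fixed pixel grid, not the parallelizable variant from the paper above):

    import numpy as np

    def floyd_steinberg(gray: np.ndarray) -> np.ndarray:
        """1-bit Floyd-Steinberg error diffusion of a grayscale image
        with values in [0, 1]. Each pixel is thresholded to 0 or 1 and
        the quantization error is pushed onto not-yet-visited neighbors,
        producing the blue-noise-like dot distribution discussed above."""
        img = gray.astype(np.float64).copy()
        h, w = img.shape
        for y in range(h):
            for x in range(w):
                old = img[y, x]
                new = 1.0 if old >= 0.5 else 0.0
                img[y, x] = new
                err = old - new
                # Standard weights: 7/16 right, 3/16 down-left,
                # 5/16 down, 1/16 down-right.
                if x + 1 < w:
                    img[y, x + 1] += err * 7 / 16
                if y + 1 < h:
                    if x > 0:
                        img[y + 1, x - 1] += err * 3 / 16
                    img[y + 1, x] += err * 5 / 16
                    if x + 1 < w:
                        img[y + 1, x + 1] += err * 1 / 16
        return img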


Dithering is very, very freaking cool.

You can do it with any discretely-binned parameter/value/thingie-ma-bobber that must represent a continuous value.

This includes machine learning parameters!

This is something that I've been trying to get the word out about. A rule of thumb that's worked really well for me is "Almost never use a fully discrete approximation of a continuous process if you can get as close to the continuous process as possible."

One very pertinent case is virtually continuous batch sizes. When growing the batch size, which often happens during LLM training, you can trivially dither back and forth between the two nearest rounded microbatch (or full minibatch) sizes using a simple Bernoulli distribution (i.e. a weighted 0-1 coin flip). Run for a long time, this averages out temporally (which you'd see in, say, a running exponentially-weighted mean of the value), and it seems strongly superior to staying hard-locked at the nearest quantized bin, which sort of makes sense to me.
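
As a concrete illustration, here is a minimal Python sketch of that Bernoulli trick (the function name, the `step` granularity, and the growth schedule are mine, made up for the example):

    import random

    def dithered_size(target: float, step: int = 1) -> int:
        """Stochastically round a continuous target to a multiple of step.

        A Bernoulli coin flip weighted by the fractional remainder picks
        between the two nearest bins, so the expected value over many
        calls equals target exactly."""
        lo = int(target // step) * step   # nearest multiple at or below
        frac = (target - lo) / step       # fractional remainder in [0, 1)
        return lo + step if random.random() < frac else lo

    # e.g. a batch-size growth schedule during training:
    for train_step in range(1000):
        target = 64 + 0.05 * train_step        # continuous schedule
        batch = dithered_size(target, step=8)  # dithered to a usable size
        # ...assemble `batch` samples and run one optimizer step...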

If you look at it from an information-theoretic perspective, the dithering process communicates more information about the underlying continuous variable through the discretely-emitted values, so it should have a higher inherent performance ceiling.
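
A quick toy check of that (numbers purely illustrative): hard rounding a target of 37.3 to the nearest multiple of 8 is stuck at 40, a persistent bias of 2.7, while the running mean of the dithered values recovers 37.3:

    import random

    target, step, n = 37.3, 8, 100_000
    lo = int(target // step) * step               # 32
    frac = (target - lo) / step                   # 0.6625
    mean = sum(lo + step if random.random() < frac else lo
               for _ in range(n)) / n
    print(mean)                         # ~37.3: dithering is unbiased
    print(step * round(target / step))  # 40: hard rounding is off by 2.7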

I use this in one of my projects that's out in public, but I really need to tighten it up, as dynamic batch-size growing is a subdiscipline still very much in its infancy and, IMO, largely overlooked. Take a look into this method if you're interested, and ping me if you ever have any questions!

Happy to answer any questions and talk in more detail; I'm hoping to get more people to use dithering in more places (not just in machine learning, I feel/hope!). This post just sort of reminded me of it.

I'm also very interested in the implications of structured dithering for discrete approximations of an inherently continuous parameter in an ML setting, since the effects are autoregressive. I have this fear that randomness is perhaps the only clean way to avoid stacking "echo effects", where (presumably high-dimensional, in this case) oscillations arise very unintentionally; that happens surprisingly often when noise or truly random sampling is not used appropriately.

In any case, I'm curious to hear people's thoughts.


Sometimes pictures of the universe look like dithering to me: I see a similar pattern arise when looking at globular clusters vs. dithered noise.

In fact, I say this because I was looking at this today:

- https://en.wikipedia.org/wiki/Omega_Centauri

- https://en.wikipedia.org/wiki/Alpha_Centauri

Very similar shapes to me!


The Cohen paper is great: simple, accessible, pleasant results. I implemented it as an option for Dungeon Crawl, which, to my knowledge, is used in Crypt.



