I wrote some similar code in python recently -- I think color visualizations like this are both fun and potentially useful for certain image manipulations.
This includes a readable 36-line implementation of k-means clustering that could be shorter if one wanted to play some code golf :) I used a pie chart layout, with pie slices proportional to their corresponding cluster sizes.
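For anyone who wants the gist without opening the repo, here's a minimal sketch of Lloyd's-algorithm k-means over RGB tuples. This is not the linked implementation, just the general idea in plain Python:

```python
import random

def kmeans(points, k, iters=20):
    """Cluster tuples (e.g. RGB colors) into k groups with Lloyd's algorithm."""
    centroids = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            best = min(range(k),
                       key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[best].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centroids, clusters
```

A fixed iteration count keeps it short; a real version would stop when the assignments no longer change.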
I also did something similar recently, but in PHP. I grouped similar colors (although I rendered mine as tiled squares), and tried a few other things like grouping by brightness. Some neat results, for sure. I can put the code up for it if anyone is interested.
Here's the quick k-means implementation I threw together, if anyone wants to play with it (my whole library is licensed under the GPL).
It could definitely use some serious cleaning up (and I will probably OO-ize it when I get a chance -- or I'll take pull requests), but it definitely works.
In general, you can toss any set of numbers into a clustering algorithm, and it's kind of interesting to puzzle over the structures that come out. The more you know about the domain, the more interesting it tends to be.
PCA can be the same way. You toss images or whatever in, and out come either eigenvectors or principal components of the images. Either way it's often interesting to domain experts.
Yes, k-means clustering is well known for its use in color quantization (for instance, reducing a 24-bit image to an 8-bit paletted representation that most faithfully captures the original). Another popular algorithm is median cut, which uses a k-d tree to recursively subdivide the color space based on the median color values of the pixels in the source image. Just about any image manipulation program that can output paletted images probably uses one of these algorithms.
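The median-cut idea fits in a few lines. This toy sketch splits along the channel with the widest range rather than building an explicit k-d tree, but the recursive subdivision is the same spirit:

```python
def median_cut(pixels, depth):
    """Recursively split a list of RGB tuples into 2**depth boxes; each box
    is represented by its average color, giving a palette of up to 2**depth entries."""
    if depth == 0 or len(pixels) <= 1:
        return [tuple(sum(ch) // len(pixels) for ch in zip(*pixels))]
    # Split along the channel (R, G, or B) with the greatest range.
    ranges = [max(p[c] for p in pixels) - min(p[c] for p in pixels) for c in range(3)]
    ch = ranges.index(max(ranges))
    pixels = sorted(pixels, key=lambda p: p[ch])
    mid = len(pixels) // 2  # the median pixel along that channel
    return median_cut(pixels[:mid], depth - 1) + median_cut(pixels[mid:], depth - 1)
```

Quantizing an image is then a matter of mapping each pixel to the nearest palette entry.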
OK, so I don't have much problem domain knowledge here, but couldn't you optimize the cluster size based on algorithmic bounds on variation within the cluster?
I could be wrong on this (we only covered k-means clustering this year in college), but as far as I know, using K random exemplars for the k-means clusters reduces the number of colors in the image to some N (N <= K). That would mean any live manipulation of the K colors would actually only be modifying an image that consists of just N colors.
The net result would be a live image like you suggest, but one with much less detail. Still very interesting though.
K-means only groups all of the pixels into K disjoint clusters; it doesn't actually alter the value of any pixel. Taking an entire cluster of pixels and shifting their values in RGB space could produce some very interesting images. For example, you could easily make all of the almost-white pixels slightly blue. There would be no loss of variety among the pixels; they would all just have their blue values bumped up.
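As a concrete sketch: given per-pixel cluster labels from some earlier k-means pass (the `labels` list here is assumed, not computed), shifting one cluster is just adding a constant offset to its members, which preserves all the within-cluster variation:

```python
def shift_cluster(pixels, labels, cluster_id, offset):
    """Add `offset` (dr, dg, db) to every pixel assigned to `cluster_id`,
    clamping to 0..255. Relative differences inside the cluster are kept,
    so no detail is lost."""
    clamp = lambda v: max(0, min(255, v))
    return [tuple(clamp(ch + d) for ch, d in zip(p, offset)) if lab == cluster_id else p
            for p, lab in zip(pixels, labels)]
```

Making the near-white cluster slightly blue would then be `shift_cluster(pixels, labels, white_cluster, (0, 0, 25))`, assuming you know which cluster id the near-whites landed in.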
You compute 'centroids' which denote the centers of the clusters, but you don't have to change the values of all points to the centers of the clusters.
In other words, you can maintain the detail in RGB space (as this author has) while reorganizing things in location space by their k-means clusters.
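A quick sketch of that distinction: assigning pixels to their nearest centroid is purely a labeling operation you can reorganize or count by (e.g. for proportional pie slices), while the pixel values themselves are never touched:

```python
def assign_to_centroids(pixels, centroids):
    """Label each pixel with the index of its nearest centroid (squared
    Euclidean distance in RGB). The pixels themselves are left untouched."""
    def nearest(p):
        return min(range(len(centroids)),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
    return [nearest(p) for p in pixels]
```

The labels give you cluster sizes and groupings for the layout; the original RGB detail survives intact.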
I understand this, but does that mean it's bi-directional, i.e. that a change to one of the palette colors would be reflected in the image? That's what I understood from the comment. If so, how does that work? Sorry for any misinformation in my comment.
Code: https://github.com/tylerneylon/imghist/blob/master/imghist.p...
Sample images: http://blog.zillabyte.com/post/11193458776/color-as-data http://blog.zillabyte.com/post/13141231882/hue-histograms
If anyone else is interested in this stuff, Austin A made a great suggestion on the original post to use the Lab colorspace.