There is an improvement that could be made here: there is no attempt to find the brightest colour. If you look at the dark image, the brightest colours (the reds) stand out over and above the dark ones, and exploiting that is a common illustration technique. Notice how the red stands out in one image, and a small blue line in the other?
This particular algorithm, and others I've seen, don't take this into account. Depending on the image, you could also weigh lightness and darkness to find a dominant colour even when it only covers a small part of the image.
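Something like this rough sketch is what I have in mind (assuming Pillow is available; the saturation-times-value score and the file name are just placeholders, not anything from the article):

    # Score every pixel by how much it "stands out" (here: HSV saturation
    # times value) and report the strongest accent colour, even if it only
    # covers a few pixels. Purely illustrative weighting.
    import colorsys
    from PIL import Image

    def accent_color(path):
        img = Image.open(path).convert('RGB')
        best_score, best_rgb = -1.0, None
        for r, g, b in img.getdata():
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            score = s * v  # bright, saturated pixels win even if they are rare
            if score > best_score:
                best_score, best_rgb = score, (r, g, b)
        return best_rgb

    print(accent_color('dark_image.jpg'))  # e.g. the small red detail

You could then show that accent colour alongside the area-based palette.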
HSV is a terrible color model for any human purpose. (To be honest, for any purpose whatsoever, unless it’s required for backwards-compatibility with legacy systems.) HSV, like the RGB model it is a trivial derivative of, has dimensions which are not closely correlated with any color attributes relevant to human perception.
Instead, use a model such as CIECAM02, IPT, CIELAB (from the 70s but not too bad), or Munsell (basically a big lookup table, from experiments done in the 40s).
While I support all statements in your post, it misses the point completely.
This is not at all about "human perception". It's the opposite. The computer is selecting some colors from the picture that it thinks are representative of the color palette used.
When you have a look at the code, what it does is take the RGB values as coordinates, randomly pick 3 (or N) cluster centers, and then calculate the Euclidean distance of every point to its nearest center. It keeps reassigning points and recomputing the centers until the result stops improving, at which point it stops and prints out the coordinates (colors) of the clusters.
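To make that concrete, a minimal sketch of the same idea (my own code, not the article's) looks something like this:

    # Minimal k-means over pixel colours. Pixels are points in the colour
    # cube; distance is plain Euclidean distance.
    import random
    from math import dist  # Python 3.8+

    def kmeans(points, k=3, min_shift=1.0, max_iter=100):
        # points: list of (r, g, b) tuples
        centers = random.sample(points, k)
        for _ in range(max_iter):
            # assign every point to its nearest center
            clusters = [[] for _ in range(k)]
            for p in points:
                i = min(range(k), key=lambda i: dist(p, centers[i]))
                clusters[i].append(p)
            # move each center to the mean of its members
            new_centers = [
                tuple(sum(ch) / len(cl) for ch in zip(*cl)) if cl else centers[i]
                for i, cl in enumerate(clusters)
            ]
            if max(dist(a, b) for a, b in zip(centers, new_centers)) < min_shift:
                return new_centers
            centers = new_centers
        return centers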
It should be immediately clear that selecting another color model, such as HSL or HSV, has a direct impact on the calculated Euclidean distances, since these color models are not just simple rotations of the RGB cube. Thus it should lead to different (possibly better) results. I am fairly sure that this is what the parent post was suggesting. The color model used to print out the final colors is (or could be) independent of the one used internally.
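A quick illustration with nothing but the standard library: the same three colours can have a different nearest neighbour depending on the model, so the clusters will come out differently.

    # Which pair counts as "closest" flips between RGB and HSV, so a
    # Euclidean-distance clustering groups them differently.
    import colorsys
    from math import dist

    def to_unit_rgb(rgb):
        return tuple(c / 255.0 for c in rgb)

    def to_hsv(rgb):
        return colorsys.rgb_to_hsv(*to_unit_rgb(rgb))

    red, yellow, dark_red = (255, 0, 0), (255, 255, 0), (128, 0, 0)

    # In RGB, dark red is the nearer neighbour of red ...
    print(dist(to_unit_rgb(red), to_unit_rgb(dark_red)),   # ~0.50
          dist(to_unit_rgb(red), to_unit_rgb(yellow)))     # ~1.00
    # ... in HSV, yellow is.
    print(dist(to_hsv(red), to_hsv(dark_red)),             # ~0.50
          dist(to_hsv(red), to_hsv(yellow)))               # ~0.17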
Euclidean distance in RGB or HSV or whatever is a terrible metric for judging color differences in any context. Not only are they poorly matched to human perception, but these models are also useless from the perspective of judging chemical details of surface pigments, or whatever physical thing someone might care about.
But really, since we’re talking about “dominant color”, and “color” is defined with reference to human vision, what we likely care about is the color “clusters” apparent to a human observer. Which means we should work in a color model which accurately represents features relevant to human vision.
(The only time anyone should ever care about RGB is when something is going wrong with their physical display or camera and the technical details of the I/O medium become important to study, or if they are writing software to convert colors represented by a better model <-> RGB for input or output.)
"But really, since we’re talking about “dominant color”, and “color” is defined with reference to human vision, what we likely care about is the color “clusters” apparent to a human observer. Which means we should work in a color model which accurately represents features relevant to human vision."
That was my interpretation. What I'm interested in is the difference between two colours, delta E (dE); somewhere in the dE literature there is a rule set that will do what I'm describing. cf. http://zschuessler.github.io/DeltaE/learn/
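The simplest of those rules, CIE76, is just Euclidean distance between CIELAB coordinates; the later formulas covered on that page (CIE94, CIEDE2000) add correction terms. A minimal sketch:

    # CIE76 delta E: Euclidean distance between two Lab colours.
    # A value of roughly 2.3 is often quoted as the just-noticeable difference.
    from math import dist

    def delta_e_76(lab1, lab2):
        return dist(lab1, lab2)

    # illustrative Lab values; the first is roughly sRGB red
    print(delta_e_76((53.2, 80.1, 67.2), (50.0, 76.0, 62.0)))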
The models you propose are not what most (if any) designers use, as the purpose is not to manually select the best color combinations (there are other ways to do that) but to approach colors a little more structurally.
Whether you calibrate your perception to understand one model or the other doesn't seem to be relevant.
I am glad to be taught something I didn't know, but for now it sounds more like a theoretical claim than an actually useful one.
I think the parent was referring to the lack of properties like perceptual uniformity in RGB or HSV colour spaces. Check out the "Advantages" section on this Wikipedia page:
https://en.m.wikipedia.org/wiki/Lab_color_space
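For a feel of what that transform involves, here is (roughly) the standard sRGB-to-CIELAB conversion for a D65 white point; in practice a colour library would do this for you, but it shows that Lab is a non-linear remapping of RGB, not a simple rotation of the cube:

    # sRGB (0-255) -> CIELAB, D65 white point.
    def srgb_to_lab(r, g, b):
        # 1. Undo the sRGB gamma curve
        def lin(c):
            c /= 255.0
            return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        r, g, b = lin(r), lin(g), lin(b)

        # 2. Linear RGB -> CIE XYZ (sRGB primaries)
        x = 0.4124 * r + 0.3576 * g + 0.1805 * b
        y = 0.2126 * r + 0.7152 * g + 0.0722 * b
        z = 0.0193 * r + 0.1192 * g + 0.9505 * b

        # 3. XYZ -> Lab relative to the D65 white point
        def f(t):
            return t ** (1 / 3) if t > (6 / 29) ** 3 else t / (3 * (6 / 29) ** 2) + 4 / 29
        fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
        return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

    print(srgb_to_lab(255, 0, 0))  # ~ (53.2, 80.1, 67.2)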
This is because tool builders originally created their software tools for graphics workstations of the 1970s (at PARC, NYIT, MIT, etc.) / desktop computers of ~1990 (e.g. Adobe Photoshop), and on such slow hardware it was impractical to use better models. Also, the early tool builders were not color scientists, but computer programmers with an amateur understanding of human vision.
It’s inexcusable that designers are still using such poor tools today. Hooray for historical inertia!
I personally use my own custom software for many design tasks, and I can tell you from personal experience that better color models make a dramatic improvement. But my methods are currently highly technical and idiosyncratic and closer to experiments than polished tools for non-programmers.
Since the “proof” here involves low-level changes to design tools that have had millions of man-hours of work put into them and that don’t provide end-user-accessible hooks into their low-level guts, it isn’t easy for me to “prove” anything to you via a Hacker News comment: the “proof” would basically have to be a rewrite of these complex tools from scratch, and I haven’t yet put in the several years of implementation work that would require.
It is a long-term goal of mine though to make a series of products. Knock on wood.
If you want to see real-world examples, some professional color-grading software aimed at film production (for instance Apple’s Final Cut Pro) uses the IPT model. Even in Adobe tools, there is certain limited support for CIELAB, which is not perfect but far better than RGB. Here’s a book about it http://www.amazon.com/dp/0321356780 .... unfortunately these all still have shitty user interfaces for interacting with colors, but it’s better than nothing.
In the physical world, artists and designers have been using the Munsell Book of Color in various incarnations for >100 years with great success. You can buy your own for $1000: http://www.pantone.com/munsell-book-of-color-matte-edition
You might just as well say “Great you folks decided to use positional notation, but I’m happy with my Roman numerals”, or “Great you folks found bézier curves, but I’m happy with my straight-line-segment paths”, or “Great you discovered electric motors, but I’m happy with my windmills and draft animals”, or whatever.
Sure, there’s nothing ”wrong” with using shitty archaic tools. It’s just ineffective.
I take it HSV is HSB (Wikipedia redirected me). What's bad about it? I find it's the only model in color pickers that lets me find the correct tones.
- It thinks #00ff00 (full-intensity green) and #0000ff (full-intensity blue) are the same brightness, when the green will be much, much brighter than the blue to human eyes (see the sketch after this list)
- A large portion of the hue space is dedicated to nearly-indistinguishable green
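Here is the first point in numbers (relative luminance computed with the Rec. 709 weights on linearised sRGB):

    # HSV's "value" treats full green and full blue as equally bright,
    # while relative luminance says green is roughly ten times brighter.
    import colorsys

    def luminance(r, g, b):
        def lin(c):
            c /= 255.0
            return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b)

    print(colorsys.rgb_to_hsv(0, 1, 0)[2], colorsys.rgb_to_hsv(0, 0, 1)[2])  # V: 1.0 vs 1.0
    print(luminance(0, 255, 0), luminance(0, 0, 255))                        # Y: ~0.72 vs ~0.07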
Does the while (1) {...} loop always terminate? I am not entirely sure, and would like to hear some opinions. What if the first guess is already the best possible solution? Could there be any pictures that cause trouble, e.g. ones that consist of only 4 (equally spaced) colors?
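For what it's worth, a defensive variant of the loop sidesteps the worry: keep the convergence test but cap the iteration count, so an unlucky input can at worst waste some iterations rather than hang the script. (A sketch only; assign_points, recalculate_centers and distance are hypothetical stand-ins for whatever the implementation calls them.)

    MAX_ITERATIONS = 100
    MIN_DIFF = 1.0

    for iteration in range(MAX_ITERATIONS):
        clusters = assign_points(points, centers)       # hypothetical helpers
        new_centers = recalculate_centers(clusters)
        diff = max(distance(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if diff < MIN_DIFF:
            break  # converged; a perfect first guess just means one pass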
I read the same article as that guy and was also put off by the use of PIL. The Python 2 vs. Python 3 split just makes life hard for people not used to Python. Maybe someone should do something about that.
I just went ahead and implemented the code in C++ instead.
> EDIT: If the script uses a well-formed #! this isn't too much of a problem either.
He does have a point, though: Dependency management in Python is a pain in the ass. Virtualenvs aren't really a solution, distribution packages are usually horribly outdated, …
Don't know if this is the best way to do it, but reading a bit[1] as well as looking at the comments above about color spaces, I came up with:
#shell:
for space in sRGB RGB HSV LAB
do
# I think it should be possible to do this without writing
# tiff-images to disk in-between -- but having a look at the
# resulting images next to the original is actually quite nice and
# gives some idea of the difference the colorspace makes:
convert akira_800x800.jpg -quantize $space +dither \
-colors 4 akira_lab_$space.tiff
echo "Histogram in $space colorspace:"
convert akira_lab_$space.tiff -format %c histogram:info:-
echo
done
Maybe someone with more knowledge of ImageMagick can improve on the pipeline etc.
[ed: Just noticed that the sRGB output for "RGB" values is different enough that my script doesn't consider them to be RGB values (well, they're not) - so the line for sRGB is blank in the html. Still think it's interesting to see the difference between RGB/HSV/LAB.]
[ed2: Changing the number of colors to 3 to better compare with the OP: the OP's algorithm clearly chooses different colors. Not sure which is "best", but just FYI.]
http://i.imgur.com/M6Oo6dp.jpg
Will open source it soon, perhaps.