Mona Lisa in 50 polygons, using a genetic algorithm (2008) (rogeralsing.com)
209 points by liamk on Dec 12, 2012 | 53 comments



Source code? I really, really want to try something like this but I've never done GP and I'm not sure where to start.

In lieu of the source code, can anyone point me to a reference on GP and maybe something about image generation in C? In particular, what's an efficient graphics pipeline for converting all of these polygons to pixels? Something like Bresenham for the edges and then additive coloring in the middle? And then how do I convert an RGB pixel array into some reasonable image format? I apologize for my ignorance; I don't even know what to start googling.

I think it would be a good exercise for me to write something like this from scratch on my own; I just want some pointers to start.



Here's how I did it: https://bitbucket.org/teoryn/image-approximation

For converting the polys to pixels I render them with OpenGL and then extract the resulting image. I use PPM for both input and output; they're just plain text files, and any decent image editor can convert to or from them.
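If you want to skip image libraries entirely, an ASCII (P3) PPM writer is only a few lines of C. This is a generic sketch (the function name and buffer layout are mine, not from the repo above):

    #include <stdio.h>

    /* Write an 8-bit-per-channel RGB buffer (row-major, 3 bytes per pixel)
       as an ASCII PPM (P3) file. */
    void write_ppm(const char *path, const unsigned char *rgb, int w, int h) {
        FILE *f = fopen(path, "w");
        if (!f) return;
        fprintf(f, "P3\n%d %d\n255\n", w, h);
        for (int i = 0; i < w * h; i++)
            fprintf(f, "%d %d %d\n", rgb[3*i], rgb[3*i+1], rgb[3*i+2]);
        fclose(f);
    }

Reading one back is roughly the reverse: parse the header, then read three integers per pixel.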


Thanks!


Here's a similar algorithm you can play around with, implemented in Javascript + Canvas: http://alteredqualia.com/visualization/evolve/

HN post here: http://news.ycombinator.com/item?id=392036


Also of interest: MonaTweeta, using genetic algorithms to fit Mona Lisa into a 140 "character" tweet. http://www.flickr.com/photos/quasimondo/3518306770/


I did something similar around 2008 (but in Lisp!): https://github.com/smanek/ga

Had to write my own bitmap processing library, since I couldn't find anything fast enough off the shelf :-D Handled alpha blending, file I/O, etc. (check out the bitmap.lisp and color.lisp files in the repo).

Here's a video of it 'evolving' a picture of John McCarthy (best individual from each generation): http://www.youtube.com/watch?v=-_VFZ_ON0A8

And here it is doing the Mona Lisa: http://www.youtube.com/watch?v=S1ZPSbImvFE


It is not quite a genetic algorithm: there is no population of individuals which get mated and possibly mutated. It's more of a dynamic programming polygon match. Still, the result is impressive and amusing.


Exactly, the 'genetic' part of the term implies some sort of breeding, not just that you can transform your search space into a vector which you call a genome. In fact, the algorithm used seems to be exactly what is described on the Wikipedia page for 'Random Optimization' [1].

I would expect that a true GA might work better, but it would still not be the best choice. In my semi-related experience, Particle Swarm Optimization [2] works much better for continuous-valued problems.

[1] http://en.wikipedia.org/wiki/Random_optimization

[2] http://en.wikipedia.org/wiki/Particle_swarm_optimization


He answers this in his FAQ (http://rogeralsing.com/2008/12/09/genetic-programming-mona-l...):

Q) Is this Genetic Programming? I think it is a GA or even a hill climbing algorithm.

A) I will claim that this is a GP due to the fact that the application clones and mutates an executable Abstract Syntax Tree (AST).

Even if the population is small, there is still competition between the parent and the child; the best fit of the two will survive.
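In code, that parent-vs-child competition boils down to a clone/mutate/keep-the-better-one loop. Here is a toy, self-contained C version that evolves a plain vector toward a random target instead of polygons toward an image (everything here is illustrative, not EvoLisa's actual code):

    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N 8  /* length of the toy "DNA" vector */

    /* Fitness = sum of absolute differences to the target; lower is better. */
    static double fitness(const double *x, const double *target) {
        double d = 0;
        for (int i = 0; i < N; i++) d += fabs(x[i] - target[i]);
        return d;
    }

    int main(void) {
        double target[N], parent[N], child[N];
        for (int i = 0; i < N; i++) {
            target[i] = (double)rand() / RAND_MAX;
            parent[i] = (double)rand() / RAND_MAX;
        }
        double parent_fit = fitness(parent, target);

        for (long gen = 0; gen < 100000; gen++) {
            for (int i = 0; i < N; i++) child[i] = parent[i];               /* clone  */
            child[rand() % N] += ((double)rand() / RAND_MAX - 0.5) * 0.1;   /* mutate */
            double child_fit = fitness(child, target);
            if (child_fit <= parent_fit) {     /* best of parent and child survives */
                for (int i = 0; i < N; i++) parent[i] = child[i];
                parent_fit = child_fit;
            }
        }
        printf("final error: %f\n", parent_fit);
        return 0;
    }

Swap the vector for a list of polygons and the toy fitness for a pixel-wise image comparison and you have the gist of what the blog post describes.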


No matter how he tries to twist things, it's still just regular hill climbing though.

The whole point of GAs is giving you a very simple heuristic for avoiding local optima. How are you going to do that if your population size is only 2?


Roger Alsing, the author of that old gimmick, here :-)

Some clarifications on my part here, 4 years after the post was released:

1) No, this does not qualify as a true GA/GP. By definition, GP needs to have a computational AST; EvoLisa has a declarative AST. There is also no crossover in play here. (See point 3 for an explanation of this.)

2) Hill climbing or not? According to Wikipedia, hill climbing only changes _one_ value of the problem-solving vector per generation: "At each iteration, hill climbing will adjust a single element in X and determine whether the change improves the value of f(X)."

So it doesn't quite fit the hill climbing definition either. Also, the DNA/vector is of dynamic size in EvoLisa, while hill climbing AFAIK uses fixed-size vectors (?)

3) Why wasn't a larger population used, and why no crossover?

This is the part that most of you get wrong: increasing the population size and adding crossover will NOT benefit this specific problem.

The polygons are semi-transparent and overlap; thus, adding/removing polygons will have a high impact on the fitness level, in pretty much every case in the wrong direction.

Let's use words as an example here:

organism1: "The Mona Lisa"

organism2: "La Gioconda"

Both may have a similar fitness level, but completely different sets of polygons (letters in this naive example).

Combining those will very, very rarely yield an improvement.

E.g. the child (result of org1 and org2) might be "Lae Mocondisa", which is complete nonsense, and the fitness level falls back to pretty much random levels.

Thus, you can just as well use pure mutation instead of crossover here.
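To make the word analogy concrete, here is what a throwaway one-point crossover does to those two strings (purely illustrative, nothing to do with EvoLisa's actual code):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* One-point crossover: a prefix of one parent spliced onto the suffix of
       the other. With unrelated "genomes" the child is almost always nonsense. */
    int main(void) {
        const char *a = "The Mona Lisa";
        const char *b = "La Gioconda";
        char child[64];

        int cut = rand() % (int)strlen(b);                  /* crossover point    */
        snprintf(child, sizeof child, "%.*s%s", cut, a, b + cut);
        printf("%s\n", child);                              /* e.g. "TheGioconda" */
        return 0;
    }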

If the problem instead had been based on genes that paint individual parts, e.g. a gene for the face, a gene for the background, a gene for the body, etc.,

THEN it would have made sense to use crossover. In such a case, it would be possible to combine a good face gene with a good background gene, and the fitness level would improve.

However, due to the nature of this specific problem where the polygons span the entire image, this is not effective.

And if crossover is not beneficial, then a larger population gets less interesting too, since you cannot combine individuals.

Increasing the population will only make more separate individuals compete against each other, with no additive effect in any way.

See it like this:

If we have one sprinter running 100 meters, he might complete the run in about 10 sec.

If we add 1000 sprinters to the population, each of them might also complete the run in about 10 sec.

Thus, the problem is not solved any faster by adding more individuals here. Also, by increasing the population size, there will be much more data to evaluate for each generation, so even if we can bring down the number of generations needed to solve the problem, the actual real-world time to complete it would increase due to the evaluation cost.

Anyway, nice to see that people still find this somewhat interesting. It was pretty much a single-evening hack back then, 4 years ago.

//Roger


And for those interested in real GP, see http://rogeralsing.com/2010/02/14/genetic-programming-code-s...

That code uses real crossover and a large population in order to crack black-box formulas.


Lots of fun to write, I imagine. A cool demo. But computationally, it only does pretty well after nearly a million generations. How does it compare to any other directional computational algorithm, e.g. decimating an accurate tessellation? I've tried genetic algorithms for industrial modeling, and they didn't seem anywhere close to efficient. If you have a computer model you don't need genetic algorithms (you can use better modeling techniques); if you have a real-world model, e.g. a starch conversion process, you can't afford a large number of experiments, or even ANY experiments that aren't going to yield pretty-good results.


Here is a guy who used the EvoLisa concept to create image compression. He did beat the efficiency of JPG; it did, however, not outperform JPEG 2000. Still, it is a good start. http://www.screamingduck.com/Article.php?ArticleID=46&Sh...

But I do agree, GAs have limited use.


Thanks for this great demo. I first saw it a few years ago and bookmarked it. I've gone back to it several times for inspiration since then.

Much appreciated.


To add to the cacophony of "I built one of these too!", here's mine:

http://mattdw.github.com/experiments/image-evo/

It's more strictly a genetic algorithm than the OP, too, as it's mutating a population, and instances age and eventually die.


Only 50 semi-transparent polygons? I would have thought a lot more would have been needed.

Also worth checking out is the gallery with more paintings: http://rogeralsing.com/2008/12/11/genetic-gallery/


The polygons overlap with transparency, so 50 polygons could in theory create up to 2^50 unique regions. Think of a Venn diagram with every possible combination of overlapping areas: any combination of the 50 sets of points. (This is possible if the Venn areas are not convex.)

Realistically, it looks like the final result has somewhere upwards of 200 unique areas created by various overlaps of the 50 polygons.


Thanks for the link. Looking at the images, I find that the polygonal representation also yields a very cool visual effect.


I wrote my own a long time ago in C (inspired by Roger) to create a Christmas postcard: http://www.mostlymaths.net/2010/04/approximating-images-with...


Ah that'll be where I got my Santa image. I made a 1k version of that.


Awesome! I hope it was a cool postcard ;)


I wonder if this could be used to create a 3D sculpture using semi-reflective film which, if you looked at it from a certain angle with light coming from a certain angle, would recreate the image.


Not to be outdone, I wrote my own genetic algorithm in Perl to do the same. It starts from randomly generated triangles, and evolves them to match the given image.

Here is the result of my first trial after 5000 generations:

http://imgur.com/L9Odx

For this run, I used 50 triangles, each at 50% alpha (fixed), a GA population size of 200, a crossover rate of 0.91 and a mutation rate of 0.01. It took around 12 hours to run, but that's mainly because I opted to do it in Perl and didn't spend any time optimizing it.


Given that it took more than 900k generations, I couldn't help but think that this means it would take way more than 22 million years in human evolution.

A human generation is said to be roughly 25 years.

;-)


Since this has popped up again, I might as well point to where I took this. http://screamingduck.com/Article.php?ArticleID=46&Show=A...

And some reconstructions from data. http://screamingduck.com/Lerc/showit.html http://screamingduck.com/Lerc/showit2.html


The goal here is to use polygons (a set of 2D points, a color, and an opacity) to best reproduce the original image.

A given number of n-sided polygons represents a choice of basis set. This can be viewed as an optimization problem, where you try to minimize the difference between the rendered polygon image and the original image.

I wonder if this basis set is ideal. That is, is there a basis set you can choose that represents the original image equally well but uses less information?

Each n-sided polygon uses 2n+4 numbers (2n for the points and 4 for the color (RGB) and opacity). What is the ideal number of points in the polygon basis?
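As a concrete strawman of that parameter count (names and types are mine, purely illustrative):

    /* One semi-transparent n-gon: 2n coordinates plus RGB and opacity,
       i.e. 2n+4 numbers. */
    #define MAX_VERTS 10

    typedef struct {
        int   n;              /* vertices actually used (<= MAX_VERTS) */
        float x[MAX_VERTS];   /* 2n numbers for the points ...         */
        float y[MAX_VERTS];
        float r, g, b, a;     /* ... plus 4 for color and opacity      */
    } Polygon;

    typedef struct {
        Polygon polys[50];    /* the whole genome: 50 such polygons    */
    } Genome;

Fifty triangles would then cost 50 * (2*3 + 4) = 500 numbers; more vertices per polygon buys expressiveness at the price of more parameters for the optimizer to search over.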

One could imagine using a set of orthogonal functions to represent the image. Coming up with a good set that isn't overfit to a training set might be a challenge. Perhaps one can make use of features of the human eye to come up with a good basis (maybe similar to how MP3 does this for audio).


That's sort of the conclusion that I came to as well.

For evolving, the best basis is some representation that covers the widest range of perceived images while keeping some similarity between images with similar data sets.

The range of perceived images is a tricky problem in itself. Many images of noise can be perceived to be the same, whereas images of a face will look significantly different with a small change to the nose.

The polygon approach is obviously not good at expressing fine textures. It would be interesting to construct the image by allowing rendering into different representations of the same frame buffer; allow drawing directly into a frequency domain, for instance.


What you've described is basically what JPEG does using DCT (JPEG-1) or wavelet coefficients (JPEG 2000) as the basis. The advantage with JPEG is that the forward transform is very easy.

You can use whatever basis you want, but I wouldn't call it ideal in any practical sense if you have to run a GA for several hours to encode an image.


I read these things and wish so desperately that I was this creative. Mind. Blown.


It all starts with a silly little thing.


I totally agree with this.

Children play without self-consciousness, i.e., they don't think about what other people think or what they're trying to accomplish.

Children play for fun. This is where creativity starts.


Just start.


NASA has used genetic programming to make more efficient antennas. But would it be possible to use genetic algorithms to produce more efficient aircraft wings? And how about more difficult problems like CPU design: say you want a 3D stackable CPU consisting of different layers and cooling, but you let the computer design it itself. How about using it to construct more efficient solar cells and wind turbines? Is that possible? How do you make it general enough that the constraints do not constrain the possible solutions too much?



Holy crap, this would make a good image compression algorithm. Might be complex to encode, but super small and efficient to decode.

Can we have this, please? Someone?


I suspect that there would be a lot of edge cases where the algorithm wouldn't yield very satisfactory results. Think about images without broad, similarly coloured areas, like white noise and such. Maybe further research will alleviate this. I'm thinking that maybe dividing the image into tiles, like JPG does, could help.

Another advantage is that the compressed format would be vector-based instead of raster, so it would provide smooth scaling.


The compression aspect was what I looked into. Aiming for 0.125 bits per pixel on 256x256 blocks (1 KB blocks), it does seem to be competitive. The lossiness at that level is quite high, but other formats fare much worse. It might be reasonable to use it to store higher-resolution images for a given data size.

Storing 4 times the X and Y resolution of a 2-bit-per-pixel JPEG would yield the same compressed data size (say a 4096x4096 image compared to a 1024x1024 JPEG; both 256 KB). That could also be thought of as storing it as scalable 64x64 blocks.
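Spelling out the arithmetic:

    4096 x 4096 px x 0.125 bits/px = 2,097,152 bits = 256 KB
    1024 x 1024 px x 2.000 bits/px = 2,097,152 bits = 256 KB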


The algorithm requires that you compare the current iteration to the source image; how does that constitute good compression? Not to mention that the final image required 904,314 generations to reach.


When compressing, you are always using the source image in some sense.

What you end up with could be a fairly minimal way to represent the image.

Though yes, clearly, the amount of processing required to reach that compression is absurd. But then, most ultra-efficient compression algorithms have this problem at least initially.


Is there a limit on the number of vertices for each polygon? If not, I think there is always a way to emulate anything with a single polygon.

Edit: except for the colors.


I was thinking the exact same thing. Also, it would have been nice to see what each additional polygon was at each stage, besides the merged results.


From the looks of it, the average polygon in the system has about 6 vertices, so at 4 bytes a vertex and 4 bytes for RGBA color, that's a total of 28 bytes per poly, or 1400 bytes total.

And that's overestimating vertex positioning (at that size, 1 or maybe 2 bytes would suffice). Encoding an image like that would be very slow though.


Original discussion:

https://news.ycombinator.com/item?id=389727

Discussion about bd's javascript reimplementation:

https://news.ycombinator.com/item?id=392036


My favorite is 005456 ... it's obvious that it's her if you look at it, but it's still abstract.


How do you determine how similar two images are?


The easy way: the average difference in pixel values.

Slightly harder: PSNR. http://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio

Harder again: SSIM. http://en.wikipedia.org/wiki/Structural_similarity

There isn't any real solution to automating similarity as a human sees it. Humans are tricky.



He answers this in the FAQ linked at the top: http://rogeralsing.com/2008/12/09/genetic-programming-mona-l...

He just loops over the whole image and does a pixel-by-pixel comparison, taking the difference between each of the R, G, and B channels and summing them all.
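In C that fitness function is roughly this (a sketch assuming packed 8-bit RGB buffers; the FAQ's version may square the differences instead of taking absolute values, but the idea is the same):

    #include <stdlib.h>

    /* Sum of per-channel differences between two w*h RGB images
       (3 bytes per pixel). Lower means a closer match. */
    long image_error(const unsigned char *a, const unsigned char *b, int w, int h) {
        long err = 0;
        for (int i = 0; i < w * h * 3; i++)
            err += labs((long)a[i] - (long)b[i]);
        return err;
    }

Even for a modest 200x200 target that's 120,000 byte comparisons per candidate, which is why the fitness evaluation dominates the running time.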


Check the FAQ. It's pixel by pixel comparison.


How about an iOS (and Android) app that does this for any selected picture?


This is cool! :)




