Recurrent Neural Network Tutorial for Artists (2017)

If you are planning to train a generative model on vector graphics data in a sequence-to-sequence style (like sketch-rnn), you might find it difficult for the encoder to capture all of the spatial elements coherently. One option, as the other commenter pointed out, is to also rasterize the input vector image, feed it into a convnet to extract features, and have the decoder use those features as well. This was attempted in this paper [1], where the authors extended sketch-rnn with a convnet encoder and showed better results.
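To make the rasterization step concrete, here is a minimal sketch in plain Python. It assumes the sketch is stored as a list of (dx, dy, pen_lifted) offsets, similar to the stroke-3 format used with sketch-rnn; the function name, the grid size, and the simple line-sampling approach are illustrative assumptions, not the method from [1].

```python
def rasterize_strokes(strokes, size=28):
    """Render (dx, dy, pen_lifted) offsets onto a size x size grid of 0/1 values.

    Assumption: pen_lifted == 1 means the pen leaves the paper after this
    point, so the next offset is a move rather than a drawn line.
    """
    # Turn relative offsets into absolute points, remembering whether each
    # point is connected to the previous one by a drawn segment.
    x = y = 0.0
    connected = False
    points = []
    for dx, dy, pen_lifted in strokes:
        x, y = x + dx, y + dy
        points.append((x, y, connected))
        connected = not pen_lifted

    grid = [[0] * size for _ in range(size)]
    if not points:
        return grid

    # Scale the drawing uniformly so that it fits inside the grid.
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    span = max(max(p[0] for p in points) - min_x,
               max(p[1] for p in points) - min_y) or 1.0
    scale = (size - 1) / span

    prev = None
    for px, py, is_connected in points:
        gx, gy = (px - min_x) * scale, (py - min_y) * scale
        if is_connected and prev is not None:
            # Sample enough points along the segment to touch every cell.
            steps = int(max(abs(gx - prev[0]), abs(gy - prev[1]))) + 1
            for i in range(steps + 1):
                t = i / steps
                cx = round(prev[0] + t * (gx - prev[0]))
                cy = round(prev[1] + t * (gy - prev[1]))
                grid[cy][cx] = 1
        prev = (gx, gy)
    return grid


# A closed square drawn as one stroke.
square = [(0, 0, 0), (10, 0, 0), (0, 10, 0), (-10, 0, 0), (0, -10, 1)]
bitmap = rasterize_strokes(square, size=11)
```

The resulting grid could then be fed to a convnet as a single-channel image; a real pipeline would more likely use an anti-aliased renderer.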

But if you just want to have a lot of fun, try training a plain vanilla "char-rnn" model directly on SVG text files and see what it generates. The results might look more interesting than you would have initially imagined. Kyle McDonald has tried this before on a dataset of raw Twitter SVG files [2].
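For anyone trying the char-rnn route, the data preparation is roughly the following. This is only a sketch of the usual character-level setup (each target sequence is the input shifted by one character); the helper names and the tiny SVG string are made up for illustration and are not taken from char-rnn itself.

```python
def build_vocab(text):
    """Map every distinct character in the corpus to an integer id."""
    return {ch: i for i, ch in enumerate(sorted(set(text)))}


def make_training_pairs(text, vocab, seq_len):
    """Cut the corpus into (input, target) id sequences of length seq_len.

    The target is the input shifted one character to the right, so the
    model learns to predict the next character at every position.
    """
    ids = [vocab[ch] for ch in text]
    pairs = []
    for start in range(0, len(ids) - seq_len, seq_len):
        inputs = ids[start:start + seq_len]
        targets = ids[start + 1:start + seq_len + 1]
        pairs.append((inputs, targets))
    return pairs


# Hypothetical one-file "corpus"; in practice you would concatenate many SVG files.
corpus = '<svg><circle r="1"/></svg>'
vocab = build_vocab(corpus)
pairs = make_training_pairs(corpus, vocab, seq_len=8)
```

Because the model only ever sees characters, nothing forces its samples to be well-formed XML, which is part of why the outputs can be surprising.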

Good luck with your blog post, and please share when it is out!

[1] Sketch-pix2seq: a Model to Generate Sketches of Multiple Categories https://arxiv.org/abs/1709.04121

[2] Emoji generated with char-rnn and the Twitter twemoji svg files. https://www.flickr.com/photos/kylemcdonald/24308833950

jefft255 · on April 16, 2018

I'm not OP, but my take on this is that it would be quite difficult to design a DNN working directly with vector graphics. I'd say you should rasterize your image before inputting it to the network. Same with outputs. CNNs would definitely struggle with continuous images. This is from what I personally know, and I did not bother to google before writing this, so I might also be very wrong.

sabertoothed · on April 16, 2018

Hi jefft255,

I think you are totally right. As far as I understand, CNNs rely on the correlation of neighbouring pixels. And so directly applying them to vector graphics won't work.

However, vectorisation has massive downsides as well.

I will try to summarise it in an article. This has nerd-sniped me, so I need to write it down, even if I have no solution for the problem yet.

Happy to send you a link to the draft once I have it so you can comment.

kevinwang · on April 16, 2018

In my pretty untrained opinion, it feels like something that outputs vector images might well be trained using reinforcement learning, since it seems like the problem can be well modeled with a policy: "given what I want to produce and what I've produced so far, the next vector object I want to draw is x"

amelius · on April 16, 2018

You could train the network to convert pixel images into vector graphics; this is a useful operation. As training data you could generate random SVG shapes and the associated rasterized images.
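A minimal sketch of generating such pairs, assuming circles only and a crude "pixel inside the circle" test instead of a real SVG renderer (the function name and parameter ranges are arbitrary choices for illustration):

```python
import random


def random_circle_example(rng, size=32):
    """Return (raster, svg, params) for one random filled circle.

    The raster is a size x size grid of 0/1 values. A pixel is set when its
    integer coordinate lies inside the circle, which only approximates what
    an actual SVG renderer would draw.
    """
    r = rng.randint(3, size // 4)
    cx = rng.randint(r, size - 1 - r)  # keep the circle fully inside the canvas
    cy = rng.randint(r, size - 1 - r)
    svg = (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<circle cx="{cx}" cy="{cy}" r="{r}"/></svg>'
    )
    raster = [
        [1 if (x - cx) ** 2 + (y - cy) ** 2 <= r * r else 0 for x in range(size)]
        for y in range(size)
    ]
    return raster, svg, (cx, cy, r)


rng = random.Random(0)  # fixed seed so the dataset is reproducible
dataset = [random_circle_example(rng) for _ in range(100)]
```

The raster would be the network input and the SVG text (or just the parameters) the target. Extending this to rectangles, paths, or multiple shapes per image follows the same pattern.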