Hacker News

I'm from the project team. This is a very interesting point.

While it is easy to crawl many songs from the internet, it is a little harder to gather the same amount with proper genre/style/etc. labels, although not impossible.

For now there's only one genre, which we call "the genre of whatever is on the internet". So whatever music files were on there, many of them quite "crappy", were used to train the model. There are also many other open problems in how to better structure and flavor the compositions.

This is just a very early-stage attempt, a CS student's fun side project. We are now working with people with real musical talent and hoping to make better songs in the next version.




Great job! What sort of resources (time, processing power, etc.) did you use to train this?

Edit: Found it in the article.


I mean, where does the Christmas element come from? The image alone, the music it was trained on, or is it somehow hardcoded in the algorithm?


The Christmas element comes from 1. the image, and 2. a 4800-dimensional RNN sentence encoding bias generated from ~30 Christmas songs.

Not sure how to hardcode this.


I'm interested that you used some Christmas songs as training (which wasn't obvious from what I read of the paper). Were they pop songs, traditional, or a mix?

Further to my comment up there[0] - and I don't wish to sound like a grinch, because this is a really cool project - but would I be right in thinking you spent more time on the image description than on the music?

I saw that you specify a scale for the melody. Would it be possible either to generate the accompaniment around a mode, so that the melody can move diatonically without risking too many clashes, or to allow the melody to follow the chord sequence somehow?

Again, sorry if I sound too critical. It's a really awesome thing you've done, and I'm just a guy that listens to the music instead of the lyrics.

[0] https://news.ycombinator.com/item?id=13079355


Thanks for the comments! Are you asking about the lyrics or the music generation?

For the lyrics, we actually didn't train on Christmas songs. The training data was a large collection of romance novels (see neural-storyteller by Jamie Kiros). The "Christmas trick" was to apply a "style shift" after image captioning and before lyrics generation, where the shifting vector was obtained from ~30 Christmas songs.
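Roughly, the style shift is just vector arithmetic on sentence encodings: subtract the mean "caption" encoding and add the mean "Christmas song" encoding. A minimal sketch of the idea (the random arrays below are stand-ins for real 4800-dimensional skip-thought encodings, and all names are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4800  # skip-thought encoding dimension

# Hypothetical stand-ins for real sentence encodings:
caption_vecs = rng.normal(size=(100, DIM))   # encodings of image captions
christmas_vecs = rng.normal(size=(30, DIM))  # encodings of ~30 Christmas songs

# A "style" is approximated by the mean encoding of its corpus.
caption_style = caption_vecs.mean(axis=0)
christmas_style = christmas_vecs.mean(axis=0)

def shift_style(encoding):
    """Remove the caption style, add the Christmas style."""
    return encoding - caption_style + christmas_style

# The shifted vector is what gets decoded into lyrics instead of the raw encoding.
shifted = shift_style(caption_vecs[0])
```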

For the music generation: although we are aware of some basic rules of musical practice, such as the melody following the chords, we didn't actually add rules of this kind.

For the blues scale, here's the thing: I didn't really know much about music, so I spent several hours reading sites like basicmusictheory.com. They happened to introduce the blues scale, so we just used it. But you're right about the relevance of blues to pop: after we ran the scale-checking code, only a very small percentage of our pop music collection turned out to be blues.
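The scale check itself is simple arithmetic on pitch classes. A simplified sketch of that kind of check, not our actual code (the riff and MIDI numbers are just examples):

```python
# Minor blues scale as semitone offsets from the root.
BLUES_SCALE = {0, 3, 5, 6, 7, 10}

def in_scale_fraction(midi_notes, root):
    """Fraction of notes whose pitch class lies in the blues scale on `root`."""
    hits = sum(1 for n in midi_notes if (n - root) % 12 in BLUES_SCALE)
    return hits / len(midi_notes)

# C blues riff: C, Eb, F, F#, G, Bb, C as MIDI numbers, root C = 60.
print(in_scale_fraction([60, 63, 65, 66, 67, 70, 72], 60))  # → 1.0
```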


Thanks for the reply! I was concentrating on the music specifically. I thought the lyrics generation was really enjoyable.

I was asking more whether you'd used any traditional carols, as they can have a more definitively "Christmassy" sound than a pop song with sleigh bells laid over the top.

Overall I meant that I think the music would be more convincing either with the melody following the chords, or with both melody and accompaniment sticking to a single mode.
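For instance, one crude way to do the chord-following version would be to snap each generated melody note to the nearest chord tone (purely illustrative MIDI arithmetic, nothing from the project):

```python
def snap_to_chord(note, chord_pcs):
    """Move a MIDI note to the nearest pitch whose class is in the chord."""
    candidates = [note + d for d in range(-6, 7) if (note + d) % 12 in chord_pcs]
    return min(candidates, key=lambda c: abs(c - note))

C_MAJOR = {0, 4, 7}  # C, E, G as pitch classes
print(snap_to_chord(61, C_MAJOR))  # C#4 snaps to C4 (60)
```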




