>> Welcome to the new 2018 edition of fast.ai's second 7 week course, Cutting Edge Deep Learning For Coders, Part 2, where you'll learn the latest developments in deep learning, how to read and implement new academic papers, and how to solve challenging end-to-end problems such as natural language translation.
I would really like to know how to solve natural language translation. I think everyone would. Many people have been trying to solve this devilishly hard problem for several decades and failed. So I'm really curious how fast.ai has finally managed to do it.
Well, I looked at the summary, and they're implementing a Seq2Seq model for this. It's what I think of as the archetypal model for machine translation and chatbot tasks.
Quite a few newer network architectures in this space are refinements of this model, which pairs an RNN encoder with a decoder, adds attention between them, and uses beam search for better results.
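For anyone who hasn't seen one before, here's roughly what that encoder/decoder-with-attention setup looks like. This is a minimal, single-layer PyTorch sketch with made-up sizes, not the course's actual code:

```python
# Minimal sketch of an RNN encoder plus a decoder with dot-product attention.
# Module names and dimensions are illustrative, not from the course notebooks.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                       # src: (batch, src_len)
        out, hidden = self.rnn(self.emb(src))     # out: (batch, src_len, hid)
        return out, hidden

class AttnDecoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim + hid_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, prev_token, hidden, enc_out):
        # Dot-product attention: score every encoder state against the
        # decoder's current hidden state, then take a weighted sum.
        query = hidden[-1].unsqueeze(1)                            # (batch, 1, hid)
        scores = torch.bmm(query, enc_out.transpose(1, 2))         # (batch, 1, src_len)
        context = torch.bmm(F.softmax(scores, dim=-1), enc_out)    # (batch, 1, hid)
        emb = self.emb(prev_token).unsqueeze(1)                    # (batch, 1, emb)
        out, hidden = self.rnn(torch.cat([emb, context], dim=-1), hidden)
        return self.out(out.squeeze(1)), hidden    # logits for the next target token
```

Decoding one token at a time like this is also where beam search slots in: instead of greedily taking the argmax at each step, you keep the few highest-scoring partial translations and expand them.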
I wouldn't call this model a solution for natural language translation, nor would anyone else. But I think fast.ai meant that they're going to explain and go through this model, and how it's helped bring a new generation of models with good performance in this particular space.
Yup, it's a multi-layer bidirectional seq2seq with attention, plus a few tricks like teacher forcing. Same as Google Translate. Their version takes a long time to train on a lot of GPUs, so we simplify it by using fewer layers and a smaller, simplified corpus (it only contains questions, and limits them to 30 words long).
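Roughly, teacher forcing means that during training the decoder is sometimes fed the ground-truth previous token instead of its own prediction, which keeps early training from going off the rails. A simplified sketch of one training step (not our actual notebook code, and reusing the hypothetical encoder/decoder modules sketched above):

```python
# Illustrative training step with teacher forcing; the 0.5 mixing ratio
# is just an example value, not a recommendation from the course.
import random
import torch.nn.functional as F

def train_step(encoder, decoder, src, trg, teacher_forcing_ratio=0.5):
    enc_out, hidden = encoder(src)
    prev_token = trg[:, 0]                      # usually the <sos> token
    loss = 0.0
    for t in range(1, trg.size(1)):
        logits, hidden = decoder(prev_token, hidden, enc_out)
        loss = loss + F.cross_entropy(logits, trg[:, t])
        if random.random() < teacher_forcing_ratio:
            prev_token = trg[:, t]              # feed the ground truth
        else:
            prev_token = logits.argmax(dim=-1)  # feed the model's own guess
    return loss / (trg.size(1) - 1)
```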
By "solve end-to-end problems" I only mean that we show how to do the whole process from beginning to end - I didn't mean to imply that the final model would be human-equivalent or perfect or anything like that.
Yeah, I understood the intention behind that statement. Great work with the course!
While you're here, what do you think about using temporal convolution for sequence tasks? I've read a few articles, this particular one by my professor comes to mind now [0], which say CNNs could work extremely well for the tasks traditionally done with RNNs. A recent paper by the people at Google Brain [1] mentioned that their CNN with attention network beats traditional RNN approaches. More surprising is that the network is 130+ layers deep, and yet trains faster than RNNs. Do you think we can potentially switch most machine translation tasks to CNNs?
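To make concrete what I mean by temporal convolution: the basic building block is a stack of dilated, causal 1-D convolutions, so each output position only sees the past and the receptive field grows exponentially with depth. A generic illustration (not the architecture from either paper):

```python
# Generic causal/dilated convolution block for sequence modelling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConvBlock(nn.Module):
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        # Left-pad so the convolution never looks at future timesteps.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                        # x: (batch, channels, time)
        out = self.conv(F.pad(x, (self.pad, 0)))
        return self.relu(out) + x                # residual connection

# Doubling the dilation per block grows the receptive field exponentially.
tcn = nn.Sequential(*[CausalConvBlock(128, dilation=2 ** i) for i in range(6)])
x = torch.randn(4, 128, 30)                      # e.g. a batch of 30-step sequences
y = tcn(x)                                       # same shape, causally encoded
```

Unlike an RNN, every timestep in every layer can be computed in parallel, which is presumably why such deep stacks still train quickly.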
>> By "solve end-to-end problems" I only mean that we show how to do the whole process from beginning to end (...)
Then why not write just that? What is the point of using language that implies you can teach people how to solve a very hard problem that nobody knows how to solve yet?
I find it extremely disreputable to claim to be able to accomplish feats that go far beyond the limits of current technology. That is the tactic of charlatans and snake oil salesmen, not of scientists and technologists.
Machine translation is not solved, but its accuracy benchmarks have improved surprisingly quickly, so while it's a little presumptuous to call it solved, it's not the most egregious exaggeration I've heard about machine learning this week.