Wide and Deep Learning: Better Together with TensorFlow (googleblog.com)
231 points by hurrycane on June 29, 2016 | 13 comments



I'm curious: is it possible to train a model and send that model to a Raspberry Pi or something, to make something like this realistic to use? Then take the results from the Raspberry Pi, send them elsewhere to be trained into the model, and repeat?


Yes, this is very frequently done. Training is far more computationally expensive than evaluation in most cases.

I don't know of any specific cases of this being done with a Raspberry Pi, but many phone apps, for example, have this sort of architecture. Train a model on a powerful server/cluster, send the model to phones for use, phones collect more training data that is sent back to the server/cluster, repeat.


I have successfully run a TensorFlow model on my Android phone. I haven't tried the Raspberry Pi bits, but TensorFlow does have examples for running models on it: https://github.com/tensorflow/tensorflow/tree/master/tensorf...

You can certainly send data from your Raspberry Pi to a central server, and then periodically retrain and update the model on your Raspberry Pi through whatever update mechanism you create, but that will require your own infrastructure.
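
To make that loop concrete, here's a minimal sketch using the TensorFlow 1.x-style API that was current around when this thread was written. The toy graph, paths, and training steps are my own placeholder assumptions, not code from the linked examples:

    import tensorflow as tf  # 1.x-era API

    # --- on the server/cluster: build a toy graph, train, write a checkpoint ---
    x = tf.placeholder(tf.float32, [None, 4], name="x")
    w = tf.Variable(tf.zeros([4, 1]))
    b = tf.Variable(tf.zeros([1]))
    y = tf.sigmoid(tf.matmul(x, w) + b, name="y")

    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # ... run training ops here on data collected from the devices ...
        saver.save(sess, "/tmp/model.ckpt")  # hypothetical path

    # --- on the Pi/phone (same graph definition): restore weights, infer only ---
    with tf.Session() as sess:
        saver.restore(sess, "/tmp/model.ckpt")
        print(sess.run(y, feed_dict={x: [[0.1, 0.2, 0.3, 0.4]]}))

The "update mechanism" is then just shipping the new checkpoint file to the device on whatever schedule you like.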


Yes, I train my models on my GPU cluster, and then transfer them to my iOS app, which does image recognition.

Some of the models can be quite large to download over, say, 3G (225 MB), but I'm working on various compression techniques now too.
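
For anyone curious what "compression" can mean here: one common trick is 8-bit linear quantization of the weights, which alone cuts a float32 model to roughly a quarter of its size. A toy NumPy sketch of the idea (not necessarily the parent's actual pipeline):

    import numpy as np

    def quantize(w, bits=8):
        # linear min/max quantization: float32 -> uint8, ~4x smaller
        lo, hi = float(w.min()), float(w.max())
        scale = max((hi - lo) / (2 ** bits - 1), 1e-12)
        q = np.round((w - lo) / scale).astype(np.uint8)
        return q, lo, scale

    def dequantize(q, lo, scale):
        return q.astype(np.float32) * scale + lo

    w = np.random.randn(256, 256).astype(np.float32)  # toy weight matrix
    q, lo, scale = quantize(w)
    print(w.nbytes, q.nbytes, np.abs(w - dequantize(q, lo, scale)).max())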


I don't know if I missed it, but I would have liked to see a performance/accuracy comparison between wide+deep learning and a simple ensemble of separate wide and deep models. The advantage of having two separate models is that you could use just one or the other if something went wrong, or if you needed a faster prediction (i.e., when the escalator breaks, you still get stairs).


Yeah, this is a pretty good point. IMO it's a major flaw in the paper that they didn't empirically compare their new model to other models (like just an ensemble, as you mention), though they do discuss it.

I wouldn't be surprised if a simple ensemble performs better!


In many cases the simple ensemble is more fragile because you have to keep track of multiple things at once. The winning solution for the Netflix Prize competition was an unwieldy ensemble that never got used in production. Also, when you update your models, you'll have to tune them individually and then manually tune the ensemble weights.

Another advantage of joint learning (which the authors mention) is that the individual components need not be as big as they would be if trained independently, since they complement each other. Though the joint model will surely be bigger than each of the individual models.
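
The structural difference is small but matters for training: an ensemble averages the probabilities of two independently trained models, while the joint wide & deep model sums the two logits inside a single sigmoid and backpropagates one loss through both halves. A toy illustration with made-up logit values:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # made-up logits from the wide (linear) part and the deep part
    wide_logit, deep_logit = 1.2, -0.4

    # ensemble: each model must be accurate on its own; combine after the fact
    p_ensemble = 0.5 * (sigmoid(wide_logit) + sigmoid(deep_logit))

    # joint wide & deep: one sigmoid over the summed logits, trained with a
    # single loss, so each half only has to cover the other's weaknesses
    p_joint = sigmoid(wide_logit + deep_logit)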


They (reasonably) claim the joint model doesn't have to be as big, but, for example, it would be interesting to see an ensemble of two models: a wide model the same size as the wide half of the joint model, and a deep model the same size as the deep half of the joint model.


I am about to catch a flight, so I am unable to do anything better than skim the post and paper. But isn't this just good old feature embeddings coupled with learnt features, which have been around for several years now?


I think the change here is that they're learning the embeddings alongside the feature weights (i.e., they're part of the same loss function).
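
Roughly, in a TF 1.x-style sketch (my own, not the paper's code): the embedding table, the deep layers, and the wide linear weights all feed one logit and one loss, so a single optimizer step updates them all together.

    import tensorflow as tf  # 1.x-era API

    ids = tf.placeholder(tf.int32, [None])           # sparse categorical feature
    wide_x = tf.placeholder(tf.float32, [None, 10])  # linear/cross features
    labels = tf.placeholder(tf.float32, [None, 1])

    emb_table = tf.Variable(tf.random_uniform([1000, 8], -0.1, 0.1))
    emb = tf.nn.embedding_lookup(emb_table, ids)     # deep side input
    hidden = tf.layers.dense(emb, 16, tf.nn.relu)
    deep_logit = tf.layers.dense(hidden, 1)

    wide_logit = tf.layers.dense(wide_x, 1, use_bias=False)

    logits = wide_logit + deep_logit                 # summed before the sigmoid
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=labels, logits=logits))
    # one step updates the embeddings, deep weights, and wide weights together
    train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

(IIRC the paper actually uses FTRL for the wide part and AdaGrad for the deep part rather than a single Adam optimizer, but still within one joint backward pass.)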


Can anyone give more examples of specific problems this could solve?


The corresponding paper linked in the blog post describes the recommendation system behind the Google Play store. The recommendations generated by this model led to a significant increase in app downloads.


Nice.

It strikes me that the example in the blog post is just a general search problem; e.g., Google search could use this: if you type "Brexit", you want a general overview of a lot of different things, vs. typing a specific query and looking for a specific page.



