Open sourcing Sonnet – a new library for constructing neural networks (deepmind.com)
262 points by lopespm on April 7, 2017 | 38 comments



This isn't "yet another completely different neural network library." This library just has some new layer types for TensorFlow.

Looks like there are some new layers for special kinds of attention RNNs, word embeddings, alternate implementations of spatial transformers, and so on. They also have another Batch Norm implementation that of course requires tons of fiddling to work properly, a classic tf staple :-)
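
To give a flavour, the basic pattern is declare-then-connect, roughly like this (my sketch based on the announcement; snt.Linear is from their docs, so treat the details as illustrative):

    import sonnet as snt
    import tensorflow as tf

    # Modules are declared once, then "connected" into the graph by
    # calling them; reconnecting the same module reuses its variables.
    inputs = tf.placeholder(tf.float32, shape=[None, 784])
    linear = snt.Linear(output_size=10)   # declare
    logits = linear(inputs)               # connect into the graph
    logits_again = linear(inputs)         # same weights, shared automatically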

As a machine learning environment, tf is so complex that different research groups have to define their own best practices for it. Using tf is like learning C++ where everyone learns a slightly different, mutually incompatible, but broadly overlapping subset of the language. We're just seeing a glimpse into DeepMind's specialized tooling along with reference implementations of the operations they use in their work.

This will be really useful for researchers who want to mess with DeepMind's ideas/papers, but I'm a bit relieved that there isn't anything claimed to be fundamentally paradigm-shifting about this release.


It's nice to see people building on top of TF rather than continuing to develop completely new frameworks.

I've been working with TF in detail for several months now and it is a very nicely designed framework. It is quite low-level, as you pointed out (which is why it can be a challenge to get used to, and why you see helper wrappers/libs like this and Keras popping up), but ML models and methods are becoming increasingly complex, and TF provides exactly the foundation and flexibility needed to implement this stuff while still letting you slap together some basic layers when you need to. It also helps that the engineering is very solid (e.g., distributed models and datasets) and most of the performance kinks have by now been worked out.
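
For a concrete sense of "quite low-level": a hand-rolled linear layer in plain TF 1.x looks something like this:

    import tensorflow as tf

    # You wire the graph up yourself, then run pieces of it in a session.
    x = tf.placeholder(tf.float32, shape=[None, 784])
    w = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1))
    b = tf.Variable(tf.zeros([10]))
    logits = tf.matmul(x, w) + b

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # sess.run(logits, feed_dict={x: some_numpy_batch})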

For those maybe putting off working with TF because of its steepish learning curve, I'd strongly suggest you dive in. I learned last year the usual way (docs + Google), but the new O'Reilly books are the first good ones I've seen, if that's your style.

The analogy with C++ is a good one, and historically appropriate given how scarce GPU resources are now.


If you're already using TF, have you looked into keras? Does it not have the complexity or control you are looking for?


Keras is a wrapper around TF that makes some common model-building operations easier (constructing linear stacks of layers, like a basic CNN) and makes common training operations easier.

So you can still do whatever you want with Keras by combining it with lower-level TF operations; it's not really keeping you from doing anything. But more complex layer types and training loops may have to be rolled by hand, at which point it can be easier to just do it all with the lower-level TF ops. This comes up a lot when implementing newer papers that propose new training methods or operations, where you really need the flexibility to get in there and manipulate the ops directly.
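
For example, Keras layers are callable on plain TF tensors, so you can mix the two levels freely (a sketch, not a complete model):

    import tensorflow as tf
    from keras.layers import Dense

    x = tf.placeholder(tf.float32, shape=[None, 128])
    h = Dense(64, activation='relu')(x)   # Keras for the routine parts
    h = tf.clip_by_value(h, 0.0, 6.0)     # a hand-rolled TF op in the middle
    out = Dense(10)(h)                    # back to Keras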


TensorFlow is a dataflow computation system. Keras is for building neural networks. Each exists at a different level of abstraction.
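
Concretely, nothing at the TF level even mentions neural nets; it will run any dataflow graph:

    import tensorflow as tf

    a = tf.constant(3.0)
    b = tf.constant(4.0)
    c = tf.sqrt(a * a + b * b)   # plain dataflow, no neural nets in sight

    with tf.Session() as sess:
        print(sess.run(c))  # 5.0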


That's extremely well said. While it certainly beats the alternative of rolling something like it yourself, I've struggled a bit trying to get in the flow of tf, pardon the expression.

My impression is that it's an ecosystem developing incredibly rapidly, with all the associated growing pains. Even for such a simple-on-the-surface task as feeding very large datasets into the graphs, there are many different ways to do it, some already deprecated, some about to be deprecated by functionality that isn't even in 1.0 yet, etc.
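
Two of the coexisting styles, just to illustrate (a sketch; the filename is made up):

    import tensorflow as tf

    # Style 1: feed_dict -- simple, but every batch goes through Python.
    x = tf.placeholder(tf.float32, shape=[None, 784])
    # sess.run(train_op, feed_dict={x: next_numpy_batch()})

    # Style 2: queue runners -- the data pipeline lives inside the graph.
    filename_queue = tf.train.string_input_producer(['data-0.tfrecords'])
    reader = tf.TFRecordReader()
    _, serialized = reader.read(filename_queue)
    batch = tf.train.shuffle_batch([serialized], batch_size=32,
                                   capacity=2000, min_after_dequeue=1000)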

It's also quite new, but the distributed execution, while awesome, still requires a huge amount of hand coding and tuning of the machinery, reminding me quite a lot of just rolling the damn thing in MPI yourself. I'm very excited to see where distributed tf will be in a year or two, but it's a chore today.
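
To be concrete about the hand coding: every process has to be told the full cluster layout and its own role in it (a sketch with made-up host names):

    import tensorflow as tf

    cluster = tf.train.ClusterSpec({
        'ps':     ['ps0.example.com:2222'],
        'worker': ['worker0.example.com:2222', 'worker1.example.com:2222'],
    })
    server = tf.train.Server(cluster, job_name='worker', task_index=0)

    # Even variable placement is largely manual.
    with tf.device(tf.train.replica_device_setter(cluster=cluster)):
        w = tf.Variable(tf.zeros([784, 10]))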

I'm really bullish on their Cloud ML product, a tool to auto-distribute the execution graphs and associated data ingestion, but the documentation and examples for that are a bit all over the place at the moment, and require writing models to conform to the newish Experiment API, which is also a bit underdocumented and underexampled.
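
For anyone curious, the rough shape Experiment wants is something like this (hedged sketch; the make_* helpers are placeholders for your own code, not real APIs):

    from tensorflow.contrib.learn import Experiment

    def experiment_fn(output_dir):
        # estimator and input functions are your own code (placeholders here)
        return Experiment(estimator=make_estimator(output_dir),
                          train_input_fn=make_train_input_fn(),
                          eval_input_fn=make_eval_input_fn())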

My frustration is likely only because of how awesome tensorflow actually is. It's new, but so promising that I want to use it for everything, regardless of the development roadmap!


There are definitely some examples of useful ops here:

    # 1. Define our computation as some op
    def useful_op(input_a, input_b,
                  use_clipping=True, remove_nans=False, solve_agi='maybe'):
        ...  # implementation elided


Looks cool, but: "This installation is compatible with Linux/Mac OS X and Python 2.7."

Sad face.


Python 2.7 only is a no-go for me.


I am not a Python developer, but I do use Python occasionally.

Is a Python 2.7 requirement really a big deal if you're mostly using other people's code (plus a bit of your own application-layer code)?

I use Anaconda and just keep different environments for 2.7 and 3.x, for TensorFlow, spaCy, NLTK, etc.

Really easy to switch to the environment I need.


It's more about ideology.

3.x adoption is catching up, but still not as fast as the community wants. But 2.7 is around for a reason, and shaming people who use 2.7 in order to push 3.x adoption is, in my opinion, bad practice.


Another possibility is this: https://docs.python.org/3/howto/cporting.html

If their library (or one of its dependencies) depends on something that binds to C, it could take some effort to make it Python 3 compatible.

(FWIW, I use 2 or 3 depending on the OS or libraries that I am dealing with)


Which is fine. It's a DL library, not a string-processing one.


If this is text only, maybe it will work on Bash on Windows?


Are you suggesting they add compatibility for Win?


My guess is Python 3. Not supporting Python 3.5+ in a newly released project is disrespectful at this point.


The problem could be that one of their dependencies only supports 2.7. I know that's been an issue for me in the past.


Looking at the setup.py, they only depend on tensorflow and some testing thing, both of which seem to support Python 3 just fine.


Possible, but this is the first year that I have more 3-only dependencies than 2-only dependencies. I'd be curious what they are.


I agree, but the right thing to do is to update the dependency to support python 3.


In my case it was CMU Sphinx, which is a little too big for a simple update.


Sorry, I kinda meant "for a large company that is releasing a library" (e.g. deepmind), different if it's a single dev.


For us NN newbies, could someone do a comparison to Keras?


Not really, but from a look at the documentation, this seems to be geared towards recurrent nets (and their internal organization), while Keras is a little more focused on layer abstraction.

Hopefully they improve the docs as they go; I could follow them because I work in the area, but they're pretty researchy, and I'd imagine most people would have difficulty parsing them unless they regularly read papers in this area.
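
To illustrate the recurrent slant (my sketch from a skim of the docs; snt.LSTM is theirs, the rest is standard TF):

    import sonnet as snt
    import tensorflow as tf

    # Sonnet's recurrent cores plug into TF's standard unrolling ops.
    inputs = tf.placeholder(tf.float32, shape=[None, 20, 128])  # batch, time, features
    lstm = snt.LSTM(hidden_size=256)
    outputs, final_state = tf.nn.dynamic_rnn(lstm, inputs, dtype=tf.float32)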


> Making Sonnet public allows other models created within DeepMind to be easily shared with the community

I'm looking forward to this - DeepMind has a frustrating track record of not open sourcing their code. While that's their prerogative, and there are many valid reasons for keeping code closed, the many other researchers who publish their (sometimes terribly hacked together) models have been incredibly helpful in verifying that their work, well, works.


Where does this leave Keras? Wasn't it supposed to be the new high level interface to TF?

EDIT: Also, what about TF-Slim?


Keras did make it into tf.contrib (with the release candidate for 1.1). I don't see it going away any time soon. TF-Slim, though, I'm not sure about. Never really used it myself.


That's the big long-term question. The nice thing about this one is really just having Google's backing.

Keras, though, is still really incredible for usability.


Generally all of these high level interfaces are somewhat interoperable because of the graph nature of Tensorflow. It might make your code harder to read but you can use some layers from slim (for inception-like nets this is a decent reference https://github.com/tensorflow/tensorflow/tree/master/tensorf...), and some layers from Keras. I don't know as much about sonnet, but it looks like a similar story.
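
Concretely, mixing them can look something like this (my sketch; everything just becomes ops on the same graph):

    import tensorflow as tf
    import tensorflow.contrib.slim as slim
    from keras.layers import Dense

    images = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
    net = slim.conv2d(images, 64, [3, 3])   # a slim layer
    net = slim.flatten(net)
    logits = Dense(10)(net)                 # a Keras layer on the same tensor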


Not sure. I think Keras is meant more for education and off-the-shelf models, not so much for research, because of its abstraction. In other words, it's less customizable. I think the best trade-off between low-level code and coding productivity is probably tf.layers, which is now in core TF.
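
A sketch of what that looks like, staying in plain TF without an extra framework on top:

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None, 784])
    h = tf.layers.dense(x, 256, activation=tf.nn.relu)
    logits = tf.layers.dense(h, 10)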


Keras is heavily used in research.


Tip for the project lead: A quick way to lose half of the early adopters is to release a library without Python 3 support. Luckily, fixing this is easy and shouldn't take more than a day or two. And the ROI is high if your goal is wider adoption.


It's also an easy way for an outsider to contribute.


It looks nice and I will certainly try it. But no Python 3 support? C'mon, guys, it's 2k17.


Anyone knowledgeable about Sonnet here mind detailing how it differs from Keras?


As a startup that's in the midst of hiring, I can't wait to see how quickly CVs get updated to include this.


Do you often find a significant number of CVs listing technologies that are too new for one to have gained any mastery over?


No, but we've only been interviewing for a few weeks. However we see an unbelievable number of CVs that have a full paragraph dedicated to buzzwords and python libraries.



