We're a small (~50 people) agency doing design and development of digital products for startups and enterprises.
Most recently we helped Coinbase build their new Ethereum platform, https://toshi.org/
Bakken & Bæck are looking for a range of people, from machine learning engineers, to front-end devs and designers.
Bakken & Bæck is a digital product development studio with offices in Oslo, Bonn and Amsterdam. We work with exciting startups and great companies in turning big ideas into the next generation of digital products and services. We also build products and spin off ventures of our own.
Among many other things, we've built and launched wake.io and orbit.ai.
Location: Oslo, Norway; Bonn, Germany; and Amsterdam, Netherlands.
Work is mainly ONSITE.
And on the review pages they stuff a lot of high-star reviews with the same text such as:
> Since the announcement of Google Reader being discontinued, I have enjoyed this easy app. It also has a simple, attractive interface. Highly recommend.
and:
> Feedly has managed to create a visually appealing RSS reader that also focuses on a clean, simple visual style that is intuitive to use.
This looks fun and certainly simple - but I would guess that for many, the actual training of the model is not the show-stopper on the way to "automated, data-driven decisions and data-driven applications are going to change the world."
If you already have clean data in tabular form, a single target class to predict, and enough training data, the last step was always sort of easy. Much harder is the fact that people expect Big Data and ML to be fairy dust: just give it my DB password and MAGIC comes out. And instead of a clean two-class classification problem, you have some ill-defined problem that is part clustering, part visualisation, and part pure guessing.
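To illustrate why the last step is the easy one: once you have clean tabular features X and a single target y, even a hand-rolled nearest-centroid classifier fits in a few lines. Everything before this point - getting to a clean X and y - is where the time goes. A pure-stdlib sketch with made-up toy data:

```python
def fit_centroids(X, y):
    """Compute the mean feature vector for each class label."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2
                                 for a, b in zip(row, centroids[c])))

# Toy, already-clean tabular data: two features, one target class.
X = [[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [7.5, 8.2]]
y = ["small", "small", "big", "big"]
model = fit_centroids(X, y)
print(predict(model, [7.9, 7.7]))  # → big
```

In practice you'd reach for scikit-learn instead, but the point stands either way: the modelling call is the short part of the pipeline.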
<quote>This looks fun and certainly simple - but I would guess that for many, the actual training of the model is not the show-stopper on the way to "automated, data-driven decisions and data-driven applications are going to change the world."</quote>
Totally agree. Whenever I train a machine learning model (for a ranker or a classifier), I spend most of the time building the workflow that generates the datasets and extracts and computes the features. I haven't yet found a good open source product that takes care of that; the last time I worked on something ML-related, I relied on Makefiles and a few Python scripts to distribute the computation over a small cluster. I needed a more powerful tool, so in my spare time I've tried to build something like what I have in mind. I came up with a prototype here: https://bitbucket.org/duilio/streamr . The code was mostly written on the first day; after that I did a few commits to try out how it could work in a distributed environment. It's at a very early stage and needs a massive refactoring; it is just a proof of concept. I'd like my workflows to look like https://bitbucket.org/duilio/streamr/src/26937b99e083/tests/... . The tool should take care of distributing the workflow nodes and caching the results, so that you can slightly change your script and avoid recomputing all the data. I hadn't used celery before; maybe much of what I did for this prototype could have been avoided (e.g. the storage system could have been implemented as a celery cache).
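The "change your script slightly and avoid recomputing everything" part can be sketched as a per-stage disk cache keyed on each stage's code and inputs. This is only an illustration of the idea, not code from the streamr prototype; all names (`cached_stage`, `extract_features`) are made up here:

```python
import hashlib
import os
import pickle
import tempfile

CACHE_DIR = os.path.join(tempfile.gettempdir(), "pipeline_cache")

def cached_stage(fn):
    """Re-run fn only when its compiled code or its arguments change."""
    def wrapper(*args):
        # Key the cache on the stage's bytecode plus its inputs: editing the
        # stage, or feeding it new data, invalidates only this stage's entry,
        # while untouched stages keep serving cached results.
        key = hashlib.sha1(fn.__code__.co_code
                           + repr(args).encode()).hexdigest()
        path = os.path.join(CACHE_DIR, key + ".pkl")
        if os.path.exists(path):
            with open(path, "rb") as f:
                return pickle.load(f)
        result = fn(*args)
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(result, f)
        return result
    return wrapper

@cached_stage
def extract_features(rows):
    # Toy feature extraction: (length, separator count) per row.
    return [(len(r), r.count(" ")) for r in rows]

features = extract_features(["a b c", "d e"])  # computed once, cached after
```

A real tool would also hash a stage's upstream dependencies (and handle keyword arguments and non-picklable values), which is where distributing the workflow nodes gets interesting.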
Cool project on your side of things. I'm spending a bit of time myself trying to put together a compositional tool chain for machine learning tasks. Are there any major design choices you've thought through for your own project that you'd care to expound upon?
It's only about a year and a bit old; things don't change that quickly. Come back and ask again in 2020. With some luck, auto-completion in Eclipse will have reached some semi-strong-AI point where it can glue libraries together for you.