Deep learning startup Skymind (YC W16) raises $3M, launches enterprise AI distro (venturebeat.com)
85 points by vonnik on Sept 28, 2016 | 36 comments



Hey folks - one of Skymind's co-founders here. I see my co-founder Adam Gibson has been answering some questions. If you want to know anything about deep learning in production, please let us know and we'll share what we've seen.


What are some typical use cases that you have worked on with your clients, and can you give us an example of the impact it has had for a client?


Typical use cases are:

* fraud and anomaly detection

* recommender systems

* predictive analytics (churn, forecasting)

* image recognition

With image recognition, we hit 98% accuracy on a recent project. Until a few years ago, that was unheard of, and it's simply not possible with other algorithms, so for many companies, deep neural nets can make a significant difference.

Here are two news stories about work we've done for clients:

Making deep learning accessible on Openstack https://insights.ubuntu.com/2016/04/25/making-deep-learning-...

For Canonical, we built a solution that predicts server breakdowns.

For France Telecom's mobile unit, Orange, we built a fraud detection solution using anomaly detection:

http://www.orangesv.com/blog/orange-deep-learning-work-featu...


Really cool and interesting. Looks like deep learning and AI are going to make some significant strides in the tech industry. I read somewhere that there is still a lack of professionals to expedite adoption in the wild. What additional skills does a regular full stack software engineer need to be useful at a deep learning company? I know there are courses out there but I am not sure if they are too academic. Are there specific technologies that one should be playing with?


It varies. Sometimes it's just understanding enough about data to build data products (e.g. a website with a recommendation engine component).

Knowing data visualization can also be useful.

Most deep learning companies focus on a particular application.

FWIW I'm actually self taught. I did client projects and learned machine learning on my own.

You could start by branching into data engineering and understanding how data pipelines work. That's closer to the skills a full stack developer is likely to have.


Very interesting. Thanks for the insight.


Are there some documented and testable examples of fraud and anomaly detection via deep learning? I'm asking from being able to play around / learn more about this domain.


A lot of it is actually our customers. We've been covering that kind of R&D ourselves. That's the fun of being able to look at applications outside of the Google/FB hype.


That's interesting. Would you recommend any learning material for fraud/anomaly detection using deep learning? Any available datasets one can play with?


That's actually a big problem. Very few datasets out there, because the data tends to be sensitive.

Here's one we use for demos: https://www.unsw.adfa.edu.au/australian-centre-for-cyber-sec...
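For anyone who wants something to poke at before downloading a dataset, here is a minimal sketch of the general anomaly-detection idea. It is not the deep learning approach discussed in this thread (the thread doesn't describe the model); it swaps in a plain z-score baseline using only the Python standard library. The function names, the threshold of 3.0, and the transaction amounts are all made up for illustration.

```python
import statistics

def fit_baseline(values):
    """Estimate mean and sample standard deviation from 'normal' history."""
    return statistics.mean(values), statistics.stdev(values)

def anomaly_score(value, mean, stdev):
    """Absolute z-score: distance from the baseline mean in standard deviations."""
    return abs(value - mean) / stdev

def flag_anomalies(values, mean, stdev, threshold=3.0):
    """Return the values whose score exceeds the (hypothetical) threshold."""
    return [v for v in values if anomaly_score(v, mean, stdev) > threshold]

# Made-up transaction amounts, for illustration only.
history = [20.0, 22.5, 19.0, 21.0, 23.5, 20.5, 18.5, 22.0]
incoming = [21.5, 250.0, 19.5]

mean, stdev = fit_baseline(history)
flagged = flag_anomalies(incoming, mean, stdev)
print(flagged)  # [250.0]
```

A neural approach would typically replace the z-score with something like a learned reconstruction error, but the overall shape (fit on mostly normal data, score new records, threshold the score) stays similar.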


Can you go into more detail on the image recognition? Do you mean AR-style image recognition ("find exactly this image"), or more image /classification/ (here's what's in this image)?


Both. To give an example: sentiment analysis on a car show floor, detecting the sentiment of buyers. It can also just be binary: "Is this object present or not?"


Run that by me again.. So you have an algorithm that attaches to a camera in a car shop and can determine the likelihood that a buyer is interested in the car?

That's pretty awesome and scary in equal measure..


A neural net that recognizes sentiment of facial expressions based on camera footage.


Silly question ahead but I am honestly curious:

How did you come up with the name? Has anyone seen your name and associated it with SkyNet / The Terminator Franchise?


It's a pun on purpose actually. SKIL, our enterprise distro, is also a play on words referencing gaming (I used to do competitive gaming)



--- From the slide on list of Open Source tools and Algorithms

* Kaggle (www.kaggle.com) is a good start for this - start with “somewhat real” problems

* Use higher level tools - Keras (https://keras.io/), otherwise easy to get lost in weeds

* Consider having a real world goal - eg: if you’re in real estate figure out how to use a simple CNN (not the latest algorithm) for image search

* Depending on need consider integration with hadoop/Spark (http://spark.apache.org/)


Don't you think that many companies (with big budgets) will nurture the deep learning/data science projects inside the company instead of outsourcing them? How do you compete with those internal initiatives?

I am not only talking about Skymind since this issue arises in many other specialties.


Banks and telcos tend to buy software. So does healthcare. You're thinking of the Googles and Facebooks of the world. Many companies outsource their non-core competencies.


> After launching in 2014, Skymind now has half a dozen customers...

Umm, is that a misprint?


I'm not familiar with the company but there is serious money in enterprise AI, speaking from experience. These could be very large customers.


Hi,

Yes, the deal sizes are mid six figures or larger.

We have a larger deal pipeline than that though. A lot more to come :).

Red Hat/Oracle-style on-premise (non-SaaS) business model.

We usually target non-computer-vision applications like fraud, preventative maintenance in data centers (predicting broken machines) and other mission-critical applications.

One example:

http://insights.ubuntu.com/2016/04/25/making-deep-learning-a...

This kind of stuff is a swear word on Hacker News but there's actually money in it. Fire away if you have specific questions though :).


What's your strategy to source relatively large deals like that? In general terms. Are you cold calling? conferences? Is it from online advertising? offline? referrals?


Channel-based sales and lead gen from conferences.


What sort of value add do you provide on top of what the open source frameworks provide?


Main thing is production deployment software and a service level agreement for models we build.

In machine learning in production there are 2 phases: training and inference (usage).

In training we have Spark Docker images where you can run CUDA right from spark-submit.

In inference mode we sit on top of DC/OS by Mesosphere, embedding Lightbend's (they created Scala) microservices technology ConductR to scale out automatically on a Mesos-based cluster: http://www.slideshare.net/agibsonccc/deep-learning-in-produc...
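To make the training/inference split concrete, here is a toy sketch in plain Python. It does not use or represent the Spark, CUDA, DC/OS or ConductR stack described above; it only illustrates the idea that training produces an artifact and inference loads and applies it. The model (a one-feature logistic regression), the function names, and the data are all assumptions chosen for illustration.

```python
import json
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, epochs=2000, lr=0.1):
    """Training phase: fit a 1-feature logistic regression by gradient descent."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return {"w": w, "b": b}

def save_model(params):
    """Serialize trained parameters; this artifact is what would be deployed."""
    return json.dumps(params)

def load_model(blob):
    """Deserialize the artifact on the serving side."""
    return json.loads(blob)

def predict(params, x):
    """Inference phase: no learning, just apply the stored parameters."""
    return sigmoid(params["w"] * x + params["b"])

# Toy data: small values labeled 0, large values labeled 1.
samples = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
labels = [0, 0, 0, 1, 1, 1]

artifact = save_model(train(samples, labels))  # done once, offline
model = load_model(artifact)                   # done in the serving process
print(predict(model, 0.0) < 0.5, predict(model, 5.0) > 0.5)
```

In a real deployment the two halves usually run on different machines with different scaling needs, which is why they tend to get separate tooling.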

Here is more on our enterprise distribution SKIL: http://www.slideshare.net/agibsonccc/skil-dl4j-in-the-wild-m...

If you're curious where the talent is, I co-wrote the flagship O'Reilly book on deep learning: http://shop.oreilly.com/product/0636920035343.do

We also employ deep learning PhDs doing everything from deep learning research in health care, as well as ex-NVIDIA and ex-Cloudera people, among others.


Enterprise sales are hard.


Indeed, but that's what our team can do well. I've never been able to build a consumer product. Enterprise is just about understanding the incentives of the other party and knowing how to navigate the corporate ladder.


Probably not a misprint. They could be big enterprise customers.


We don't work with startups :).


I presume you mean "we don't currently work with startups". This is more of a financial issue than an ideological, enterprise only issue, right?

I'm assuming if Grail (http://www.grailbio.com/) who launched in 2016 with $100,000,000 in funding were knocking at your door you would be more than happy to work with them?


Right. Startups typically use Python/Ruby or commodity Hadoop clusters on EMR. It's not an audience match and they don't have a budget anyway.

For anyone else we have a very active open source community: https://gitter.im/deeplearning4j/deeplearning4j


:)


FWIW we spent much of the time building the product and open source community.

Many companies with infrastructure products like ours tend to "incubate" inside a big company first. We chose not to do that. So we spent much of our time just growing the user base first.


As someone who has worked at an open source company before, I'd love to hear your thoughts on building a sustainable business around open source software. Most use cases I know are basically customization/consulting in enterprise settings built around a free base system.

From reading about the company it seems that you want to monetize on ease of use with the distro? Do you have customers that specifically pick you because deeplearning4j is open or do you find that it's more of a nice to have or even don't care as long as it works type of situation?

I'd also love to hear some reasoning for picking Apache 2.0 (i.e. a non-copyleft license). I've talked to at least one FLOSS founder who would have picked a different license in retrospect, but I feel like the license mostly matters less than most people think.



