A Deep Learning USB Stick (movidius.com)
194 points by zzzzFrog on April 28, 2016 | 53 comments



Lots of misunderstanding in this comments section.

Movidius makes low-power neural network processors for mobile applications. The Myriad V1 is used in Google Tango, and the V2 (what the USB stick has) is used in the new DJI Phantom 4.

http://www.theverge.com/2016/3/16/11242578/movidius-myriad-2...

The Myriad chips are interesting because they combine MIPI camera interface lanes and a general-purpose NN/CV processor on the same chip, and come with an SDK of hardware-accelerated computer vision functions (edge detection, Gaussian blur, etc.).

Here's the white paper for the chip: http://uploads.movidius.com/1441734401-Myriad-2-product-brie...

Because programming these chips essentially requires having the hardware, and because the hardware was very hard to come by, programming these chips was mostly limited to Google, DJI, and other big partners.

With this release the everyday developer has access to these vision processing chips, and the barrier to entry for development is considerably lower.

This is not meant to replace your Titan X GPU.


This is their own press release. What does that kind of hardware for CV primitives have to do with deep learning?

(Also, of course, this stick doesn't seem to have any connectivity besides the USB to the host computer. How do I connect my camera? Having to shuffle the data from a camera to the stick through the host computer somewhat defeats the point.)


>What does that kind of hardware for CV primitives have to do with deep learning?

They have hardware convolutions on 12 SHAVE cores (kind of a DSP core). It means the chip can run a useful subset of convolutional neural networks very fast and very energy-efficiently.
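
To make the cost concrete, here's a rough sketch (my own illustrative numbers, not Movidius's) of how many multiply-accumulates a single convolutional layer burns, which is why a dedicated convolution unit matters:

    def conv_macs(h, w, c_in, c_out, k):
        # Multiply-accumulates for one conv layer, stride 1, no padding.
        return (h - k + 1) * (w - k + 1) * c_in * c_out * k * k

    # e.g. a single 3x3 conv over a 224x224 RGB image producing 64 feature maps:
    print(conv_macs(224, 224, 3, 64, 3))  # ~85 million MACs for just one layer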

They also have two general-purpose SPARC cores, which lets you run a "normal" program on there. Not sure how locked down the USB stick is going to be, or whether running your own custom program will be an option.

>How do I connect my camera?

The chip itself has a couple of MIPI lanes. The USB stick likely does not expose them. And I agree, that's suboptimal.


It doesn't defeat the point. You can still do development on it just fine.


I don't fully get the negativity of the HN comments toward this product based solely on the fact that it's currently a USB stick.

I think it can make sense to push this kind of chip to market early rather than waiting for it to be bundled with another device. Kudos for the Caffe support.


This is intended for development. I think HN commenters are missing that.

Given the limited availability of phones and products with the Myriad 2, and the fact that you might not be targeting a phone anyway, being able to buy and develop on a cheap USB stick is an incredibly smart move.


I don't see why you would need a deep learning NN on a phone.

If this is for CV, why not use the myriad DSPs from every silicon company ever?

Seems like a buzzword fueled product.


Tell that to Google and DJI, which use them in their own products.

They aren't just for NNs; they are general-purpose image processing chips with NNs as one option.


Because phones are ubiquitous?

Likewise, I can imagine people using this with an embedded Raspberry Pi.


Do you guys know if the Myriad V2 processor is locked down (just neural networks) or whether it will be programmable?


Well, it would be suited to other computer vision tasks as well, given the hardwired accelerator cores.



Haven't found any info on availability or price. Interesting concept, anyway.



Wouldn't this be limited to something like 10 Gb/s throughput? That's not much compared to a GPU bus. Cool idea, though. I wish this press release wasn't so light on specs.


It looks like they have developed an ASIC for ML. The obvious use case would be mobile devices, where performance/power and latency (versus going to the cloud) are concerns.


Can you connect it to a phone with a micro-B USB adapter, and then use it to run pre-trained networks with your phone's camera, like I imagine it's used in the DJI Phantom 4? I know the USB connection to the camera and onboard CPU won't be the same as the DJI's bus, but it would make for an interesting mobile testing platform for me to learn on.


Use a Surface tablet?


This page absolutely wrecks Firefox 45.0.2. UI thread gets blocked indefinitely and spins a core. :-(


This collab with FLIR on thermal cameras is interesting too. The video gives a bit more info about the Myriad 2:

https://www.youtube.com/watch?v=hsopAM8FexE


How about this kind of extra processor coming in a mobile phone, improving regular camera vision, all the health sensors, and everything else we can do with mobile phones?


It does.

The Google Tango uses the Myriad V1. The new Tango, to be released in May, comes with this new V2 chip.


This is not at all the case. Deep learning can do some great things, but it's not magic.


You don't necessarily need it; Qualcomm has already demonstrated live image classification on tablets, and it runs on a Snapdragon CPU:

https://www.youtube.com/watch?v=jkDSACXkErY


Someday they will be, I'm sure. Probably as part of the SoC.

But not yet; the software isn't ready for it yet.


Seems like the perfect opportunity to insert a plug for my project: http://liblikely.org/


Betting on local instead of cloud is always an interesting gamble.

Pros:

- Security

- Control

Cons:

- Resource limitations

- (...)


It looks like this is for running already-trained networks, and I think that's really the only practical way to do things right now if you want to make sense of something like video in real time because of bandwidth constraints. It looks like training would happen in 'the cloud' or similar.


Local is also useful in situations with limited or no internet access. Say you're trying to do deep recognition on a live video feed: many places where this would be useful simply do not have the bandwidth available to stream video.
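
For a rough sense of scale (my own back-of-the-envelope numbers, not from the article):

    # Raw bandwidth needed to stream uncompressed 720p video at 30 fps:
    width, height, bytes_per_pixel, fps = 1280, 720, 3, 30
    mbits_per_second = width * height * bytes_per_pixel * 8 * fps / 1e6
    print(mbits_per_second)  # ~660 Mbit/s before any compression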


Also useful in latency-sensitive applications, such as the drone flight demo they use in their video.


Pro: Reliability/longevity (your product still works even six months later when the cloud API provider has been acqui-hired.)


How powerful is it?


Not very...

At 15 inferences per second in fp16 for GoogLeNet, I'd guesstimate 50-60 GFLOP/s. That would give it very roughly 2x perf/W over the Titan X.
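
For what it's worth, a back-of-the-envelope version of that guess, assuming GoogLeNet's commonly cited ~1.5 billion multiply-adds (~3 GFLOPs) per forward pass:

    googlenet_flops = 3.0e9      # assumed cost of one GoogLeNet forward pass
    inferences_per_sec = 15
    print(googlenet_flops * inferences_per_sec / 1e9)  # ~45 GFLOP/s sustained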


Given that figure and the Titan X's 250 W TDP, that would put it at somewhere around 1/125th of the performance. That's probably more than good enough for inference on a given network, but you still need something beefy for the training.

It's still pretty interesting, though, since you only need to do the training once.


I almost closed the tab with this page, the classifier in my head marked it as an ad.


Looks like fraud.


Hey, I'll bite. Why do you think that?


Their product is based on a buzzword and uses unknown hardware, and the specs seem underwhelming (1 W + USB = ?). Their page is filled with videos of models and nature shots, plus a bunch of news categories that have only a few articles from 2013/14. There's no way to order anything, yet they put out some wild claims:

'With Fathom, every robot, big and small, can now have state-of-the-art vision capabilities'

'It means the same level of surprise and delight we saw at the beginning of the smartphone revolution'

'With more than 1 million units of Myriad 2 already ordered'

tl;dr: because, to an outsider, it sure does look like fraud.


Their hardware is integrated into many devices, like Google's Tango.


To clarify: I'm no expert (I only wrote one thesis and designed a nonlinear manipulator motion control system based on an NN). But watching their video, even as a noob, feels like drinking a stream of bullshit from a fire hose. If it really works and does what they claim, I beg them: fire your marketing/media/whatever guy. It looks like an ad for self-sharpening knives.



Just saying - I'm digging the idea of a USB stick that turns my laptop into an artificial intelligence.


I thought TensorFlow already did this; how is this USB stick different?

Seems like the more logical approach would be a widget that app developers could use to easily deploy embedded TensorFlow builds on Android and iPhone. Has anyone looked into doing this, or found someone already doing it?


My understanding is that it allows you to take your TensorFlow model, transfer it to the USB stick, and plug it into (interface with) your third-party hardware (other than your phone), which isn't connected to the cloud.
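
As a sketch of the kind of artifact you'd transfer: this is just the generic TensorFlow "freeze the graph" step with a toy model and 2016-era TF APIs, not anything specific to the Fathom workflow (which isn't documented here):

    import tensorflow as tf
    from tensorflow.python.framework import graph_util

    # Toy stand-in model, just to show the export mechanics: y = softmax(Wx + b)
    x = tf.placeholder(tf.float32, [None, 4], name="input")
    w = tf.Variable(tf.ones([4, 2]))
    b = tf.Variable(tf.zeros([2]))
    y = tf.nn.softmax(tf.matmul(x, w) + b, name="output")

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        # Bake the (trained) variable values into constants so the whole model
        # becomes one portable .pb file you could hand to an embedded runtime.
        frozen = graph_util.convert_variables_to_constants(
            sess, sess.graph_def, ["output"])
        tf.train.write_graph(frozen, "export", "model_frozen.pb", as_text=False)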


This is a dev board, not a consumer product. And (contrary to the title) the press release explicitly says that it is not intended for "deep learning."

Acceleration is needed for training -- not for running the models themselves. A quick glimpse at the power used (1 watt) lets you know exactly how much "acceleration" is going on here. This is meant for tiny devices.

EDIT: My point is that this is a small-run dev board for a chip for some future $19 nannycam. It's not an "accelerator" you install on your PC to put your graphics card to shame running TensorFlow.

EDIT #2: This is another one of those HN threads that's overrun by enthusiasts. Jamming a chip onto a stick is simply how they sell embedded crap now.

Here's a crypto chip that'll really get you guys going: http://www.atmel.com/tools/AT88CK590.aspx


>Acceleration is needed for training -- not running the models themselves

This isn't true. Running neural networks (including CNNs) can be computationally and power intensive, and lends itself to the vector operations of GPUs, FPGAs, and ASICs. Putting the computations on devoted hardware could enable embedded applications that simply aren't possible otherwise.
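
A minimal sketch of why conv nets map so well onto vector hardware: the standard im2col trick turns a convolution into one big matrix product (illustrative NumPy, not any vendor's API):

    import numpy as np

    def im2col(x, k):
        # Unroll every k x k patch of a 2-D image into one row of a matrix.
        h, w = x.shape
        return np.array([x[i:i + k, j:j + k].ravel()
                         for i in range(h - k + 1)
                         for j in range(w - k + 1)])

    img = np.random.rand(8, 8)
    kernel = np.random.rand(3, 3)
    # The convolution collapses to a single matrix-vector product, exactly the
    # regular, dense arithmetic that GPUs, FPGAs, and ASICs are built for.
    out = (im2col(img, 3) @ kernel.ravel()).reshape(6, 6)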

Here's a whitepaper by Microsoft about using FPGAs to speed up CNNs: http://research.microsoft.com/apps/pubs/?id=240715

Article by Google explaining the importance of optimizing neural networks to run on mobile phones: http://googleresearch.blogspot.com/2015/07/how-google-transl...


The Google article specifically says they were able to solve the problem on the (tiny) CPU in the phone by using vector operations on the CPU. Again, this is a tiny doodad for power-starved devices that lack any micro.

Have you played with Apple Accelerate? They've been baking this stuff into their chips for quite some time. Apple's FFT outperforms FFTW. https://developer.apple.com/library/mac/navigation/#section=...


Are there any good examples of FPGA implementations of CNNs?

I see one example in Verilog on GitHub: https://github.com/ziyan/altera-de2-ann/blob/master/src/ann/...



Acceleration is actually useful for both. If you're running in a constrained, mobile or sensor environment, then you really do want acceleration that improves your power efficiency. It's also useful if you do high volume serving, but that's obviously not the case here.

(For example, Google's voice recognition on Android can run when offline.)


Sure, maybe a nannycam. Maybe whatever you want to run on your Arduino / Raspberry Pi. Maybe bolted on to a small lightweight drone swarm that communicates to itself while offline to learn about its environment.

What happens when 1000 of these are plugged in to 1000 drones and they all communicate? I don't know enough to even guess but perhaps something interesting and dangerous?


At first glance, I think they would take off and be airborne. Other than that, they would do whatever they were programmed to do. I know, my comment is stupid, but I thought I should fit in.


>> Acceleration is needed for training -- not running the models themselves.

Sometimes training is done on the customer's site. For example, a noise cancellation algorithm may learn the audio characteristics of the user's environment to offer better performance.
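
A minimal sketch of that idea: a classic LMS adaptive filter that "trains" on-device against a reference noise signal (illustrative NumPy, not a claim about any shipping product):

    import numpy as np

    def lms_denoise(reference_noise, noisy_signal, mu=0.01, taps=16):
        # Learn how the reference noise leaks into the signal, then subtract it.
        w = np.zeros(taps)
        cleaned = np.zeros_like(noisy_signal)
        for n in range(taps, len(noisy_signal)):
            x = reference_noise[n - taps:n][::-1]  # most recent samples first
            e = noisy_signal[n] - w @ x            # error doubles as the cleaned output
            w += 2 * mu * e * x                    # gradient step toward this room's noise profile
            cleaned[n] = e
        return cleaned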



