Launch HN: Piccolo (YC W18) – Camera for controlling your home with gestures
98 points by marlonmisra on March 15, 2018 | 55 comments
Hi HN — we’re Marlon and Neil, founders of Piccolo (https://www.piccololabs.com/). Piccolo is a smart camera that lets you control your TV, lamps, fans, speakers, and other devices with simple gestures. For example, you can point at your lamps with your hand to turn them on or off.

The two of us have had an interest in computer vision for a long time and were in Udacity’s first self-driving car nanodegree cohort in 2016. We started this as a side project to control one lamp and soon had our entire house connected. For some actions, we found gestures to be much faster and more intuitive. For example, pointing at a lamp to turn it on is way more natural than saying “Hey Alexa, can you turn on my left living room lamp?”

To set up Piccolo, you can place it anywhere (near the TV is usually best), and then on the app you can indicate with bounding boxes where the devices are. After that, you connect those same devices (Chromecast, Hue lights, smart plugs, etc.), and you’re good to go. Some processing happens on-device, but the more complicated models are run in the cloud. Since we’re not a security camera, there’s no need to store video and so no image/video data is ever stored.

We’re excited about the experiences you can build when you have a camera and apply computer vision techniques. With recent progress in human pose estimation, object classification, and object tracking, there’s really a lot you can do. We’re starting out with gestures, but our goal is to build a platform that lets anyone create and deploy vision apps. Here are a few things we're excited about:

- New apps. For example an app that detects medical emergencies (like an elderly person falling). We'd also love an app that can tell you where you left your phone and keys.

- App integrations. For example, letting Netflix know which people are in the room to get tailored recommendations for everyone vs. just the person signed in.

- Smarter hardware. For example, an espresso machine that, with one click, makes your favorite drink because it knows who pressed the button.

- Voice-vision fusion. You should be able to trigger Alexa just by gazing at the Alexa device instead of saying "Alexa". You should also be able to hold something and say "Order 5 more of these".

We're giving away 20 pre-release units next month to anyone that joins the waitlist. We’re happy to answer any questions and look forward to your feedback. If you want to follow up, our emails are marlon@piccololabs.com and neil@piccololabs.com.




From the Hitchhiker's Guide:

A loud clatter of gunk music flooded through the Heart of Gold cabin as Zaphod searched the sub-etha radio wave bands for news of himself. The machine was rather difficult to operate. For years radios had been operated by means of pressing buttons and turning dials; then as the technology became more sophisticated the controls were made touch-sensitive--you merely had to brush the panels with your fingers; now all you had to do was wave your hand in the general direction of the components and hope. It saved a lot of muscular expenditure, of course, but meant that you had to sit infuriatingly still if you wanted to keep listening to the same program.

Zaphod waved a hand and the channel switched again.


I wanted to post the exact same quote.

To understand the problem faced here it's worth trying to just watch people. Guess what people are saying to each other.


Definitely cool technology, only not for me. I'll never place a camera in my home that is connected to some cloud service or company database. That data is simply too valuable for me to send out to a "trusted" party.


I'd also like to hear more about the tech to know how seriously they're taking privacy... I was already against an Amazon Echo in my home, and that only records audio rather than video.

(For the convenience, I ended up using a Google Home, and I'll likely switch to an Apple HomePod next as they seem the most trustworthy with your data.)

If Piccolo will make money through data (like Amazon or Google), there's virtually no way for it to be non-intrusive unlike with audio, as we'd just be using it for turning off our lights rather than searching for the nearest French restaurant. Data would have to come from visual recognition. Hopefully the lack of connection needed to a cloud in this situation will result in more security? Anyway, curious to see how it works.


That's a fair concern. We are very serious about privacy. Existing cameras today focus on security so they have to store your data. Piccolo on the other hand is for real-time interactions so images are never stored.


What do you plan to do with other sorts of data?

Like, does the cloud just trigger events and that's it, or does it keep a log of where people are located in the scene and whether they have their arms crossed, etc.?


The demo on the front page looks super cool. Signing up and looking forward to more progress and updates in the future. I wonder how accurate the finger recognition is? Is it accurate enough to pick up numbers so your fingers can be used to change the channel for example?

I did my final year engineering project with the Kinect (years ago!) and controlled a robotic arm to pick up an egg. Didn't know how else to apply this, but it seems like you guys really thought outside the box.

Only thing is the name...not sure about naming after the green alien from DBZ.


As much as I love the anime reference (I see it on the TV on your homepage!), I would also be concerned about search result placement. It's unlikely that "Piccolo, home automation gadget," would ever be as popular as "Piccolo, DBZ character."

PS, please make this gesture an easter egg: https://qph.fs.quoracdn.net/main-qimg-3671587cdc2afad44da412...


It's accurate enough to pick up number of fingers from a fairly long distance (~5m), and it works well in most lighting conditions, including complete darkness.


Actually the lighting was the other question I had in mind. Impressive. When I worked on the Kinect code, lighting was never really an issue we had or wanted to tackle.

I think something that people don't realize until they start using motion-sensing technology is how intuitive it can really be. It might look awkward at first, but once it's seamlessly integrated (i.e., stepping into the sensor's view), it's really just like magic. A good enough sensor can pick up and understand very specific motions.

Anyway, good luck guys.


Very cool. I shudder to think of the GPU costs to run these models though. Perhaps they're using TPUs to be as efficient as possible. If you imagine a room occupied in the evening by people for several hours and you have any decent framerate, you're running your pose estimation network on each frame for several hours. And these models are big as far as I have seen. So that pretty much means you have one cloud GPU per camera allocated every evening. I suppose another option is they are running pieces of the network on the device but I think that's unlikely.
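A rough back-of-envelope version of that estimate, as a sketch only. The frame rate and hours below are made-up assumptions for illustration, not figures from Piccolo, and `inferences_per_evening` is a hypothetical helper name.

```python
def inferences_per_evening(fps: float, hours: float) -> int:
    """Number of frames (and so pose-model runs) if every frame is processed."""
    seconds = hours * 3600
    return int(fps * seconds)


# Assumed values: 10 frames per second, room occupied for 4 hours.
print(inferences_per_evening(fps=10, hours=4))  # 144000
```

Any on-device filtering that skips frames with nobody gesturing would cut this count, so treat it as an upper bound under these assumptions.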

Of course these models are getting smaller over time and I'm incredibly impressed that these guys have put together the hardware, computer vision, and cloud setup. I also think they've nailed the MVP - not too easy but not too complicated either assuming they have decent models.

I'm signing up!


I think you're overestimating the processing requirements. The original 2010 Kinect did fundamentally similar processing (multi-person tracking and skeletal mapping) on the Xbox 360 which had a PowerPC CPU from 2005.


Running neural networks is usually much easier than training them (computationally).

Was the Kinect even a neural network? I don't think it was.


First off, signed up. This looks like a pretty sweet tool for home automation.

I'm slowly adding home automation to my house, but I'm a little more privacy oriented. I totally recognize I might not be your target audience with some of my constraints. A couple of questions: if the service is paid (monthly or a one-time app purchase), how much gets published to your servers? Because I like having control over something as intrusive as cameras in my house, I'd prefer to self-host as much as I can.

Does it have to be the camera you provide/sell or could a high enough quality camera do that?


> but the more complicated models are run in the cloud

This is a massive problem with home automation tools today. I need to rely on Comcast to stay alive to be able to turn lights on and off in my house? To me, it's an absolute dealbreaker. I've spent more time than I feel like I should have had to in order to control devices I own without any data leaving my LAN. Am I the only one who feels like it's a totally absurd situation?


Our first prototype ran locally end-to-end. To make that work we used a Jetson TX2 ($600 computer), but the performance was abysmal. In a few years, it might be possible to do what we're doing locally and at a reasonable price.


I just think it's unfortunate that every player in the space has to ship their own hardware (and, obviously, force the user into a walled garden so that competitors' products are inconvenient to use).

Give me (a power user) something I can run in a docker container locally, or on any of my local Macs or PCs.

Or, assume as a constraint that smart homes need to be able to run themselves. I would appreciate if everyone in the space made an effort at detente while the technology matured to a point where a massive privacy and security hole wasn't required in order to have the thing work at all.


I was looking for this. Cloud dependencies are a dealbreaker for me when it comes to home automation. I guess the alternative is to have hardware capable of doing the cloud work on your internal network.


I have a refurbished HP tower ("only" 2 GHz and 16GB ram) running Home Assistant, an MQTT bridge for my SmartThings hub (so I can use it as a dumb z-wave radio), Plex, and whatever else I feel like using it for.

It cost $300 I think. That's about the same price as the Google WiFi 3-pack. /shrug


I think it's okay as long as it fails to a usable state when the internet connection is interrupted.

To me, the bigger worry is having cloud-connected video cameras (or microphones) in my home at all. Obviously many people are happy to trade privacy for convenience, but this is a red line for me. There are too many ways for it to get deeply dystopic really quickly.


> as long as it fails to a usable state when the internet connection is interrupted

It depends on your definition of "usable".

For instance, if I want to tie 3 bulbs together (e.g. treat them as a single bulb for on/off/brightness/color) in SmartThings, that has to go to the cloud. That was when I stopped trying to make things work in SmartThings.

Comparing ST (which costs $$) to HA (which is free) is a no-brainer; HA will run Python scripts (or shell out) on hardware I own, on my LAN.

We, the users, should refuse to pay for anything that doesn't actually let us control our homes. The case for FOSS is stronger here than it's ever been.


Super cool idea, although the privacy implications are a little bit discomforting.

Anyway, I'd be interested to know how you're dealing with the depth problem. For example, if there's a light behind you and a light above you, pointing up looks like you could be pointing at either light.
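To make the ambiguity concrete, here is a minimal sketch assuming a purely 2D setup: the pointing direction is a ray from elbow to wrist in image pixels, and each device is the bounding box drawn in the app. The names `Box`, `ray_hits_box`, and `pointed_devices`, and all coordinates, are hypothetical; this is not Piccolo's actual method.

```python
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Box(NamedTuple):
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def ray_hits_box(origin: Point, direction: Point, box: Box) -> bool:
    """Slab test: does the 2D ray origin + t * direction (t >= 0) cross the box?"""
    t_near, t_far = 0.0, float("inf")
    axes = (
        (origin[0], direction[0], box.x_min, box.x_max),
        (origin[1], direction[1], box.y_min, box.y_max),
    )
    for o, d, lo, hi in axes:
        if d == 0:
            # Ray is parallel to this axis: it must already lie inside the slab.
            if o < lo or o > hi:
                return False
            continue
        t1, t2 = (lo - o) / d, (hi - o) / d
        if t1 > t2:
            t1, t2 = t2, t1
        t_near, t_far = max(t_near, t1), min(t_far, t2)
        if t_near > t_far:
            return False
    return True


def pointed_devices(elbow: Point, wrist: Point, boxes: List[Box]) -> List[str]:
    """Return every device whose box the elbow-to-wrist ray passes through."""
    direction = (wrist[0] - elbow[0], wrist[1] - elbow[1])
    if direction == (0, 0):
        return []
    return [b.name for b in boxes if ray_hits_box(elbow, direction, b)]


# Image coordinates, y grows downward. Both lights project above the person.
boxes = [
    Box("ceiling_light", 290, 0, 350, 40),
    Box("back_light", 300, 100, 340, 140),
    Box("side_lamp", 500, 280, 560, 340),
]
print(pointed_devices((320.0, 300.0), (320.0, 240.0), boxes))
# ['ceiling_light', 'back_light']
```

With only 2D keypoints, pointing straight up hits both boxes, so something extra (depth, or a 3D pose) is needed to pick one.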


This is one of the most challenging parts of this problem. We’re in the middle of transitioning to a 3D model that will be able to tell the difference really well.


Wow, that short clip on your homepage is crazy effective. I understood it immediately. Super cool product. I would buy this in a heartbeat if it came in a self-contained hardware package that included the camera and a module that does the CV processing in a way such that no video of my home touches a 3rd-party server.


Thanks. I think in a few years we'll see devices that can do everything locally, and cost a reasonable amount. For now, like others have said, some of the models for pose are too complex.


Is it basically a smaller Kinect without the Xbox platform tie-in?

Looks very neat. If you open the API, I’d love to play with it!


That's right, Piccolo is not tied to any console or platform. We're also very excited about opening this up. We think there is potential to build really powerful applications with the information available here. Stay tuned!


SmartThings integration would be nice. I mean, I guess it's possible to do it yourself, but out of the box would be sweet because all my lights and motion sensors run off it.


That’s a great suggestion. We are working on connecting with a few different smart home hubs, including SmartThings.


Brilliant home page video. Explains the premise quickly and clearly. I suspect the power and convenience will evaporate privacy concerns for many.

It could recognise a couple in a dancing pose and randomly play something appropriate. It could recognise heroic arms aloft and play applause. So much potential.

There are countless things we do daily that have loads of room to be simplified. Even getting an Apple TV where the remote can switch on the TV and the device, and auto-change the TV's source to the device input felt like a game-changer, but this could ramp that up significantly.


I had a similar thought on the OpenPose library. How did your team end up with such a robust model without using OpenPose? As far as I know, OpenPose is only licensed for Academic use and was created using the Panoptic dataset which I believe is open. The problem is, creating a system that reaches the parity level of OpenPose with simple 2D image rec could be a startup in itself. I'm a bit cynical...

EDIT: you answered my question with DensePose. Had no idea that exists!


Your demo on the front page is super spot on. I get what you do instantly.

Wishing you guys all the best. This is so cool, I am on the waiting list!


This looks interesting... No pricing listed, but I'm guessing it will be between $200 and $400. My main issue with "smart" devices is the price vs. the longevity.

(Cameras + bulbs + switches + doors + sensors) * how many rooms you have = this automated home thing gets expensive really fast.

And then you end up with a not-so-smart home. Not to mention the struggle of figuring out which devices talk to which system, and the many apps... everyone wants to make their own app, and then you need a middleman app to talk to the other apps. All of this work, and in 5 years you will probably need to replace everything. Just thinking about it, a smart home today will probably lower the value of your home in 10 years.


I think this is a great idea; it reminds me a lot of the OpenPose library. I found it very interesting how you found each other via Udacity's course. Could you expand on how you went about actually teaming up and starting out? Thanks and good luck with the final pitch!


Yes, OpenPose gave us a lot of inspiration. DensePose is similarly impressive and came out recently.

We did the course together but met many years before, in high school in Canada in '07.


Do you see the entire video and process it, or do you only see the skeleton + target areas?


We do some preprocessing on-device and run our more complex models in the cloud. It’s similar to Alexa, which always listens for the trigger word but only streams to servers for the few seconds afterwards. While some image data is processed in the cloud, it’s never stored.


I've wanted to do this but with the Nest cameras I already have in my house. It's too bad that Nest has made the decision to completely discourage anyone using the video from these devices.


I love home automation and I love this idea but I'm deadset against having a cloud service connected for my home automation. Would it be possible to run a server locally, using my GPU for the processing?


Awesome! Can't wait to no longer have to yell "ALEXA. HEY ALEXA" -> "sorry I could not find birdroom lights".


Very cool!

I got a 502 and "Oops! Something went wrong while submitting the form" initially, but it worked on the second try.


> We're giving away 20 pre-release units next month to anyone that joins the waitlist.

Are you sure that wording is accurate?


I wonder if this could be an iPhone app, so we can just use an old iPhone. Do we need a special camera at all?


You could but the downsides are: 1/ It doesn't have the ideal sensors (stereo cameras, or other depth sensor), so you'd have limited functionality 2/ Most people wouldn't want to run their iPhone consistently day after day.


Looks really cool. Your landing page does a really great job at demonstrating what it does. Great work!


Congrats, this is very impressive! I'm curious: are you computing the 2D or 3D pose of the person?


The demo video on our website was done with 2D pose. We’re in the middle of transitioning to 3D.


Few things come to mind:

1: Amazement 2: DBZ's character 3: Google's Project Soli


As an aside, what was your background before doing the self-driving nanodegree?


I taught myself programming a few years ago and did the ML nanodegree before. My cofounder Neil studied software at Waterloo and was an engineer at Pinterest.


This looks AWESOME! Can't wait to run around the house like Iron Man.


Does it work with Apple HomeKit?


congrats guys!


Will Piccolo help you not hit me with your car again Neil?


It’s funny, because Y Combinator loudly boasts that, ideally, it seeks to fund billion-dollar ideas, and yet I so very much do not want a thing like this.

Computer vision is this incredible thing, until you realize it means you’ll be in front of a camera you cannot control, and that camera will require broadband internet, if not for a one-time activation, then perhaps continuously and indefinitely.

I’ve never seen a high-end electronic device produced after 2010 restrict itself from the internet. No company seems to have the self-control necessary to consider even the idea. No internet? Impossible!

What kind of echo chamber would shout down such an idea? Do people understand why I might worry about a camera with unrestricted internet access? Why must I trust anything so deeply?

I don’t want to hear about how I should just give up this fight, because I’ve already lost anyway. I don’t care about what other devices already do. I don’t care about how different this time is.



