How did Roomba-recorded photos end up on Facebook? (technologyreview.com)
253 points by DamnInteresting on Dec 19, 2022 | 140 comments



Key paragraph for our friends who don’t RTFA

> iRobot … confirmed that these images were captured by its Roombas in 2020. All of them came from “special development robots with hardware and software modifications that are not and never were present on iRobot consumer products for purchase,” the company said in a statement. They were given to “paid collectors and employees” who signed written agreements acknowledging that they were sending data streams, including video, back to the company for training purposes. According to iRobot, the devices were labeled with a bright green sticker that read “video recording in progress,” and it was up to those paid data collectors to “remove anything they deem sensitive from any space the robot operates in, including children.”

Seems like the real story is that training data was leaked, rather than the attention getting “they’re watching you” narrative the title suggests


Here's a funny bit.

The Roomba iOS app refuses to go past its welcome screen unless it's granted access to location info.

This is unreasonable; they don't need this info for their app to function.

However their devices are all but unusable without an app, so they ultimately blackmail people into giving location data to them.

Meaning they don't really give a sh#t about users' privacy, so it's not that "they are watching you", but that they won't think twice about hooking up to a random Roomba and shooting a video with it. Consent or not.


And, so, did you do the right thing and return it or did you just accept and therefore effectively indicate to Roomba by your actions that you're ok with it?

I've returned several "smart" devices that required an app and an account. One example is Govee smart bulbs. As soon as they refused to work without an account I sent them back. Another is some Meross smart outlets. I returned them for Eve Energy outlets.


> And, so, did you do the right thing and return it or did you just accept and therefore effectively indicate to Roomba by your actions that you're ok with it?

The burden is on them to not perform those actions, not on me.

I should be fully entitled to make use of the product without agreeing to any additional contracts or permissions.


You should be, but you’re not.


Many consumer protection laws would disagree. Also check out the Andino v. Apple ongoing lawsuit


An approach that has worked for me is to ensure that any smart devices I buy are supported by Tasmota. I just flash the Tasmota firmware as step-1, then pair the devices with HomeAssistant. Doesn't necessarily send the direct message that returning it does, but at least I can avoid the unnecessary telemetry/tracking.
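For what it's worth, once a device is running Tasmota, controlling it is just local MQTT with no vendor cloud in the loop. A minimal sketch (the device topic and broker address are assumptions you'd set in the Tasmota web UI; the `cmnd/<topic>/<command>` scheme itself is standard Tasmota):

```python
# Minimal sketch of driving a Tasmota-flashed device over local MQTT.
# The device topic ("tasmota_plug") and broker IP are assumptions;
# set them to whatever you configured in the Tasmota web UI.

def tasmota_topic(device: str, command: str) -> str:
    """Tasmota subscribes to cmnd/<device-topic>/<command>."""
    return f"cmnd/{device}/{command}"

topic = tasmota_topic("tasmota_plug", "POWER")
print(topic)  # cmnd/tasmota_plug/POWER

# With paho-mqtt installed, a single local publish flips the relay:
#   import paho.mqtt.publish as publish
#   publish.single(topic, payload="ON", hostname="192.168.1.10")
```

HomeAssistant's Tasmota integration speaks the same topics for you, so after flashing there's nothing phoning home.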


I returned my $4000 LG TV after 2 years of use, as they suddenly tried to force me to agree to a new set of T&Cs and privacy policy. I called them and they told me that there's no way for me to continue using the TV without agreeing. Incredibly illegal (in my country), but I doubt most people notice or care.


In my experience, the app always tries to explain that location is "required" for Bluetooth/WiFi pairing. Can someone more knowledgeable confirm whether this is an iOS/Android limitation, or are app developers blowing smoke to get this data from users while setting up networked devices?


How I deal with this: I don't install the app. They still work pretty well without it. I did recently buy one that has great Valetudo support, but for now the flashing instructions are a bit too involved, given that it works well without setting up apps or an internet connection.


I don't have a Roomba (or have used their app) but when something annoyingly insists on my location to setup/install, I just hit "Allow Once" with "Precise: Off" just to get past that screen. Then you can turn off the Location Services to "Never" for that app in Settings straight after. Not perfect but good enough?


This workaround can be defeated by an app update, as they may check each time you open the app whether they have the required permissions.


Didn't Android have a feature at some point where you could opt to supply it with a fake location? Maybe that was 'hacked' Android phones.

Either way, Android (and iOS) should be stricter about these things; apps should work without any permissions. I mean a navigation app without location access won't work very well and permission can be denied by accident, but that can be resolved.


Pair the Roomba with Home Assistant and you won't need the app, at least for day-to-day activities. Home Assistant is more responsible because it's talking directly to the device's IP address instead of using cloud retrocombulators.


Turning off “Precise location” can help with those apps, they’ll just get a vague location


I was talking about the parent comment's workaround to disallowing the location access permission after allowing it the first time.


The dishwasher I got was the same, the app wanted access to location at all times (precise not approximate).

If I disabled the permission it would stop working.

So I switched to the home assistant integration for it and removed the app from my phone.


Why on earth would a dishwasher need an app? You have to load it before you start it... later, you walk by, it's done, you unload it.


My friend's stove has an app... "Oh shit the milk is boiling over, quick get the phone, unlock it, tap the app icon, wait until it's loaded, navigate to the proper submenu and turn it off!"

Best thing is, he didn't even know at first. He bought a new washing machine that had app support, so out of curiosity he installed the app and scanned for devices, but the app only found his stove.

Technology was a mistake.


The main use cases are around notifications - it's done, it needs X/Y consumable, A/B maintenance, etc.


And with location data, turning it on when leaving the house.

Another smart scenario I can think of, link it with energy prices and a real-time dynamic energy contract, and/or solar panel output or battery charge levels, to only run it when it's cheapest to do so.

Of course, that means having it stand-by to run at all times, meaning it may run half-empty and you need to keep the door closed which will cause mold etc.
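The price-aware scheduling part is simple to sketch: given hourly prices from a dynamic contract, pick the cheapest contiguous window long enough for one cycle. (The prices below are made-up numbers, not real tariff data.)

```python
# Hypothetical sketch: find the cheapest contiguous window of hourly
# electricity prices long enough for one dishwasher cycle.
# The price list is invented for illustration.

def cheapest_start(prices: list[float], cycle_hours: int) -> int:
    """Index of the hour where the cheapest contiguous window begins."""
    window_costs = [sum(prices[i:i + cycle_hours])
                    for i in range(len(prices) - cycle_hours + 1)]
    return window_costs.index(min(window_costs))

hourly = [0.30, 0.28, 0.25, 0.12, 0.10, 0.11, 0.22, 0.35]  # EUR/kWh
print(cheapest_start(hourly, cycle_hours=3))  # 3 (hours 3-5 are cheapest)
```

None of this needs location data or a vendor cloud, which is rather the point.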


I'm pretty sure it's to enable a feature where it can turn on when you leave the house. However, that should be an optional feature, with them only asking for location tracking when you actually enable the option.

Should be, but here's the thing: they don't want to build the best app or customer experience through convenient features, they want your (location) data, because aggregated, that shit is worth more than the stupid dishwasher.

They don't make money on selling dishwashers anymore; components, shipping, marketing and middle men will swallow up any profit. But location data and subscriptions will bring in the real money, over a long and continuous period of time.

See also this story from a Twitter engineer who talked to mobile phone companies that were prepared to pay big money for location data; I can't find the source anymore, but it's been reposted on various outlets: https://hindupost.in/media/an-ex-twitter-engineer-reveals-ho...


> The Roomba iOS app refuses to go past its welcome screen unless it's granted access to location info.

Does anyone know, is this like Android where some bluetooth functionality is behind the location permission?


Not Bluetooth, but you need(ed?) location permission to send/receive some broadcasts. I have a remote app that needs location permission to send a broadcast searching for TVs, so I have it set to “ask me every time” and imprecise location. (Which on iOS is about a whole city wide)

There’s a specific “local network” permission nowadays so maybe it was the older permission API because you can deduce location from MAC addresses?


You need location to scan for Bluetooth in an app, because scanning for Bluetooth was how advertisers got your location. You don't need this to interact with a Bluetooth device; the user can connect to it from the settings.


And if it's a Bluetooth Low Energy device that doesn't use connections?


Especially if it's BLE. Commercial sites are full of tags that advertise their MACs. Phone apps read these and use them to deduce where you are. You can try something similar yourself with WiFi using the Google geolocation API.
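To illustrate the idea (the MAC addresses below are fabricated, and the actual request is left commented out since it needs an API key): the Google Geolocation API takes a list of nearby BSSIDs and returns an estimated position, which is exactly why scanning permissions are gated behind location.

```python
# Sketch of WiFi-based geolocation: a handful of nearby router MAC
# addresses (BSSIDs) is enough for a lookup service to place you.
# The MACs below are fabricated for illustration.
import json

def geolocate_payload(bssids):
    """Build the JSON body Google's Geolocation API expects."""
    return json.dumps({
        "considerIp": False,
        "wifiAccessPoints": [{"macAddress": mac} for mac in bssids],
    })

body = geolocate_payload(["00:25:9c:cf:1c:ac", "00:25:9c:cf:1c:ad"])
print(body)

# With an API key, POSTing this returns an estimated lat/lng:
#   import urllib.request
#   req = urllib.request.Request(
#       "https://www.googleapis.com/geolocation/v1/geolocate?key=KEY",
#       data=body.encode(), headers={"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read())
```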


Not sure about iPhone, but on Android, the permission to let apps pair things in surroundings using BT/WiFi is included in the location permission.

You can't pair anything with an app unless you turn location on. The main feature of the Roomba app is to pair the vacuuming robot using the phone's WiFi and a special WiFi network between the phone and the vacuum.

So while blocking you from going further isn't a perfect approach, you can't really be that paranoid about them asking for that access. And of course, I'm sure your location flew to iRobot's servers along with the other details that could be scraped from starting the app on your phone.

edit: added paragraphs for readability.


You can pair Bluetooth with location off, but not Bluetooth Low Energy devices. But it's even more complex in reality, depending on the Android version, see here:

https://source.android.com/docs/core/connect/bluetooth/ble


I have a few Roombas and no app. I just hit the button and the robot does its thing.


More often than not just one or two rooms need cleaning, but not the whole place. Can't do that with a button press.


I see. I don't live in big enough place, perhaps. I just fire it off and don't think.


I close a door or put a box in its way. No app.


Had that happen to me with a gifted watch. I returned it and got a basic Casio.


This sucks, and it would normally be Apple's App Store duty to remove an app that requests unreasonable access to the device's details.


Is iOS like Android where Bluetooth permissions are a part of the location info permissions?


No.

Out of spite I took the iPad to a friend's house, enabled location, clicked through the welcome screen and disabled it again. Then came back home and hooked up the device.

So, no, it's not linked to its BT needs at all.


How did you connect your device to your wireless network?


You connect to it via BT first and then configure WiFi details.


Friend clearly didn't live very far away


And therefore the reported location was within range of the WiFi network anyway, netting little if any privacy gain? Especially since SSID databases exist and are used for more accurate location detection.


iPad with mobile data?


That actually changed in Android 12. Now there is a nearby devices permission that doesn't require GPS access


Meanwhile, privacy watchdogs are sleeping. They have responsibility too.


That’s really annoying, but can’t you use the “while using the app” permission? Or do they insist on “Always”?

And I am not sure I buy the “anyone who does one bad thing will necessarily do all of the bad things” view.


Why would I want them to know where I live? It's just a thing that sucks dirt off my floor.

And that's without getting into why it must have an Internet connection (to enable a large chunk of its functionality). It has Bluetooth; it's perfectly capable of talking to the smartphone over it, but it can't be controlled that way.

Re: bad thing - not a "bad thing" per se, just a very cavalier attitude, which is as troubling.


Bluetooth won’t cover my whole house, but wifi can. It’s a heck of a lot easier and cheaper to route through the internet.


> our friends who don’t RTFA

I opened the article (on a phone) and no fewer than 3 separate popovers appeared over the content. “Hey! This is our cookie policy” “Happy holidays! We have a special subscription price!” And something else that was covered by the first two before I had a chance to read it.

Thank you for summarising. I noped right out of there in disgust.


I read the article and I don’t think your interpretation of the “suggested” narrative is there. The title is “A Roomba recorded a woman on the toilet. How did screenshots end up on Facebook?”. That’s not really implying a “they’re watching you” narrative - who’s “they”, and if it’s a global syndicate, why did they do something as innocuous as putting it on Facebook?

But yes, the real story is that training data (and “real” data) does leak all the time and that most companies don’t take insider risk as seriously as they should.


The part that's missing is that it isn't just "a Roomba", it's "a Roomba labelled with “video recording in progress”". Saying 'a Roomba' implies that it could've been done from any Roomba.


The article addresses this head on. Even though those users once consented to share data “It’s not expected that human beings are going to be reviewing the raw footage.”

The meat of the article is that what technology and tech companies are doing is divorced from the expectations that we have as a society.

It couldn’t have been done from any roomba, but it could happen to almost everyone who didn’t understand the exact ramifications (which we click through several of every year to try to get the vacuum up and running). That’s why a lot of ppl on HN put masking tape over their laptop webcam. Or are you calling those people paranoid?


How did we get to the point where people have Internet-connected cameras and microphones everywhere in their houses, yet implicitly trust that anything recorded will never, under any circumstances, be viewed by another human being?


I think the second part of your sentence is at least a partial answer to the first part.


Couldn't it have been? After all, they can remotely update the firmware. They just weren't in this case.


It's great that you personally are thorough enough to dig into the article and see for yourself - my favorite: a data set of size 1 and conclusions drawn from it - but half the people even here, let alone outside of HN, will just grab the headline and run with it, saying "I heard Roombas spy on people", and that's a problem. And the people who wrote this article know this full well.


What's the worst that happens, someone buys an older roomba without a camera, or gets some exercise using a vacuum?

Is it a bad thing if people err on the side of being too private, for a change, instead of blindly trusting big tech tracking everything they do, or being completely uninformed about this sort of thing?


> "I heard roombas spy on people"

Precisely how the "Zoom is Chinese spyware!" stuff spread around.


AND if you were doing testing for them through CenterCode/Betabound, they made it abundantly clear what you were being asked to test before you agreed to an NDA or completed the Study Application.

I participated in an iRobot Study in 2019, since then I have received multiple invitations that I have declined specifically because they mentioned streaming video or cameras.


"Seems like the real story is that training data was leaked, rather than the attention getting "they're watching you" narrative the title suggests"

Was the HN title changed? I am having trouble seeing how someone who did not read the article would interpret the HN title as suggesting a "they're watching you" narrative. For example, one could interpret it as posing a question about leaked photos, e.g., who leaked them and/or why.

The idea that a "they're watching you" narrative is "attention-getting" is interesting seeing that many HN commenters repeatedly tell us that people outside of HN do not care about privacy or surveillance. If that is true, then why would a journalist/publication seeking the largest possible audience try to advance such a narrative. No one cares. That's what they tell us, anyway.

NB. I am not suggesting that I believe the HN comments arguing that no one outside HN cares. They could be correct. However I believe the commenters making them could be biased if they are invested in the survival of the so-called "tech" industry, i.e., targeting internet users with data collection and surveillance and selling online advertising services as a "business model".

The actual title is: "A Roomba recorded a woman on the toilet. How did screenshots end up on Facebook?"

This website is a publication of MIT. A strange source for "they're watching you" narratives.


Do you truly believe someone would see the headline "How did Roomba-recorded photos end up on Facebook?" and not ask themselves "wait...is MY Roomba recording photos that might end up on Facebook???". This is clearly clickbait-y, the question is intentionally vague to encourage people to read more. A less clickbaity and more clear question would be "how did photos from research Roombas end up on Facebook?" but that's spoiling the story's punchline, now isn't it.


I had a beta unit before they released the first AI-powered model, and my beta unit was set to upload photos. The original goal was to use AI to recognize and avoid things that the vacuum can get caught up in, such as cords under desks and dog poop.


But then they also refused to let MIT review the consent agreement or contact the users to review their understanding of the agreement. At this point it's just iRobot's word that they were explicit about what people were getting into.

Here's the relevant paragraph:

"In other words, by iRobot’s estimation, anyone whose photos or video appeared in the streams had agreed to let their Roombas monitor them. iRobot declined to let MIT Technology Review view the consent agreements and did not make any of its paid collectors or employees available to discuss their understanding of the terms."


Seems like the real story here is that iRobot can't protect the data it records from its own internal reviews and R&D process, so the people who were paranoid about "they're watching you" are actually MORE justified, because not only do we need to worry about deliberate watching, but also the company's general incompetence with data security.

Seems like the narrative's just as accurate as it was prior to reading the title.


Everyone RTFA in this case!

I did not expect to see actual photos of the woman sitting on the toilet in this article. But damn, they're real and published dead center. It's awful and voyeuristic to feature, but in a way it brings to life the freakishly perverse Orwellian horror of all of this.

This piece hits hard, as it should.

How did neither Roomba nor ScaleAI have safeguards against PII of this nature? This is inside people's intimate spaces. It could have been sex. Or children. How did they not think of this?

This sort of disregard for privacy should be punished, and this woman should be able to sue Roomba and ScaleAI for a handsome sum.

Maybe they did have some kind of internal data privacy policy or 3rd party policy, but it was wholly inadequate.

My team once had a certain perennial Billboard chart topper's login credentials due to suspected mishandling by one of their team (I'm still afraid to say whom), but you'd better believe we treated it - and all of our customer data - as sacred taboo. Mishandling PII was fireable at minimum, and could probably land us in litigation with a permanent mark against our careers.

We need GDPR/CCPA++ protections here. As an added bonus, the companies that play nice will get a comfortable moat in the form of their compliance.


If you read the article you'll also find the owners were specifically aware that these units are special in that they upload all data - commercial devices do not share any images or video without the user's consent [0].

It's also absurd to think they didn't have safeguards. We can only speculate whether the individual was fired, or whether stronger policies have been put in place since 2020, but it's naive to expect that whatever policy is put in place will stop a human data labeler from smuggling PII for personal reasons.

[0]: https://homesupport.irobot.com/s/article/964


> If you read the article you'll also find the owners were specifically aware that these units are special in that they upload all data

Well, you'll find a Roomba spokesperson saying that, anyway.

I don't mean to imply that the spokesperson was lying and these were plain old Roombas instead of special R&D Roombas.

I do mean to imply that I think a roomba spokesperson would describe "included in paragraph 12 of a 23 paragraph terms of service that they clicked through" as "specifically aware", but I would not consider that any kind of "proof" that the people actually were "specifically aware". At best it is an argument they ought to have been, which is a different thing (and it's an argument).


As well as the spokesperson from Scale. The evidence provided to the author also strongly corroborates this fact. Seeing reviews of their commercial devices, they're pretty clear about what is and is not uploaded, with nothing visual uploaded by default.

Therefore I don't agree with your implication as I just don't see any evidence to support it. Even the article's author, with the evidence they were given doesn't push this point.


I don't understand what you're suggesting or pointing me to about what evidence there is about what the people who had these Roombas knew.

You are making claims about what the owners "specifically knew". What is the evidence about what the roomba owners (or, uh, holders) did or did not know, "specifically" or otherwise?


Re-quoting the GP's quote,

> They were given to “paid collectors and employees” who signed written agreements acknowledging that they were sending data streams, including video, back to the company for training purposes.

So it hardly seems likely that they would not be aware of this.


The owner who said ok probably was not the girl with her pants down.

But as a guy, I don't know if girls pull them down farther if they think they're not being watched. She might be the one who agreed to it.

Still, the owner of a space is making the decision for all people who come into that space.


> James Baussmann, iRobot’s spokesperson, said in an email the company had “taken every precaution to ensure that personal data is processed securely and in accordance with applicable law,” and that the images shared with MIT Technology Review were “shared in violation of a written non-disclosure agreement between iRobot and an image annotation service provider.” In an emailed statement a few weeks after we shared the images with the company, iRobot CEO Colin Angle said that “iRobot is terminating its relationship with the service provider who leaked the images, is actively investigating the matter, and [is] taking measures to help prevent a similar leak by any service provider in the future.”


Oh a spokesperson said that? Must be true then. No spokesperson has ever pulled something like that out of their arse.


The question I want to ask is: how did Roomba-recorded photos end up in a major publication on the Internet?

And the sequence seems to be:

1. iRobot hires people to use special development versions of the Roomba in their homes to collect training data. These are clearly labeled, and the participants are informed that the images are being sent to iRobot for training. This seems fine - if you want to exchange some degree of privacy for money, that should be your right as long as you're clearly informed about it.

2. A contractor posts some of these photos to a private Facebook group used by other contractors on the project. This is obviously bad, but at the same time, it's limited in scope to people who would have had access to these photos or similar ones.

3. The MIT Technology Review gets a hold of these images and decides to publish them on the Internet for everyone to see, just to get more clicks on their article. This feels like the most egregious privacy violation in the sequence.


> A contractor posts some of these photos to a private Facebook group used by other contractors on the project.

That's where it went wrong. Everything else seems reasonable for a visual AI training project, well signalled to the participating users and the data securely communicated.

Thereafter, the data was mismanaged.

There is clearly no such thing as a "private" Facebook group. So-called "contractors" [1] using a disservice like Facebook to communicate beggars belief.

[1] people with the unremarkable skill of being able to spot ordinary household objects and label them - so someone probably had the bright idea of creating a CAPTCHA "Find all the women on toilets".


Private Facebook group, free tier Slack, a Zoom meeting, free Google Drive. I fail to see the difference.

All of these are reputable, popular SaaS applications, widely used to collaborate at work. All of these are equally trustworthy (read: not that much), and give you the same privacy guarantees. Using either in a company setting without a contract in hand is unwise, IMO, but that ship has sailed long ago.

GP is correct. #2 is bad, but not significant. #3 is the bigger violation here.


> Private Facebook group, free tier Slack, a Zoom meeting, free Google Drive. I fail to see the difference.

You didn't fail. Because there isn't one.

> All of these are reputable, popular SaaS applications, widely used to collaborate at work.

As you admit yourself;

  "these are equally trustworthy (read: not that much)"
If you fail, it is to believe that careful reputation management and widespread use of substandard tools makes them acceptable.


> Thereafter, the data was mismanaged.

And this is the problem with most of the technology spying on us. Once the data is in someone else's hands, there is nothing you can do to prevent it from being "mismanaged". It can exist forever and you aren't allowed to know who has it or what they're doing with it.

Bugging your own home with any device designed to spy on you is just a terrible idea and I'm amazed at how many people are oblivious to the harm they risk bringing on themselves and everyone else around them.


>1. iRobot hires people to use special development versions of the Roomba in their homes to collect training data. These are clearly labeled, and the participants are informed that the images are being sent to iRobot for training. This seems fine - if you want to exchange some degree of privacy for money, that should be your right as long as you're clearly informed about it.

No one would ever sign up for their pictures being taken in a bathroom.


[flagged]


Can you elaborate? I disagree with OP that 3 was the most egregious event, but I agree that it's questionable for the article to use the two specific images containing people so plainly and without further censorship, perpetuating the invasion of privacy for the sake of a more captivating article.

> 3. The MIT Technology Review gets a hold of these images and decides to publish them on the Internet for everyone to see, just to get more clicks on their article. This feels like the most egregious privacy violation in the sequence.


The two pictures that include people have their faces removed. Yes they contain people, they don't contain a useful way to connect the photo to any real person. I suppose if you are that person you could recognize yourself. Appears to be in Asia from the TV set contents.

"Just to get more clicks on their article" reads to me as a flame on media, I don't detect that motivation.

If that was me on the toilet I'd be pissed at Roomba and their trust in bogus contractors, but since you can't tell it's me I don't really care. Maybe that's just me.


Roomba is the company that planned on using their vacuum cleaners to map people's houses so they could sell that data to third parties.

https://nypost.com/2017/07/25/roomba-maker-wants-to-sell-you...

Considering that this is how poorly they protect the sensitive data their devices collect (even those used for development purposes) I guess it's a good thing that public backlash over their spying plans forced them to reconsider.


There was a huge backlash to this. There was the potential for thieves to find out the layout of someone’s home and when it was unoccupied. Add cameras to that and you have a disaster waiting to happen.


>There was the potential for thieves to find out the layout of someone’s home and when it was unoccupied. Add cameras to that and you have a disaster waiting to happen.

This feels like pearl clutching for no reason.

It's not hard to find out the layout of a house. In a given neighborhood there are probably dozens of houses for sale, which would have their floor plans publicly available as part of their listing. At the same time, there is a high chance that multiple homes in a given neighborhood share a floor plan. Therefore the layout of any given house isn't exactly some sort of a secret.

As for figuring out whether a house is occupied, there are far easier ways. Any internet-connected laptop/phone has built-in cameras and microphones, which also allow you to determine occupancy.

Of course, none of this matters. The typical robber isn't going to do some Ocean's 11 heist shit where they're scoping out your house's layout in advance and hacking into megacorp's servers so they can learn your daily routine. They're far more likely to make an educated guess based on whether your car is parked and whether the lights are on. If they're really prepared they'll maybe drive through your neighborhood several times a day to get a better sense of occupancy.


Given the long history of IoT devices becoming compromised, it’s common sense to be cautious.


Like how Google Street View became a disaster? I dislike mapping like this as much as the next privacy-conscious guy, but the world hasn't really changed since then, even though it was anticipated to.


Oh please - If someone wants to know the layout of my house they could just look at public property records which are published online.


Can anyone also look online to see what types of furniture you have, how often you replace it, how you have it arranged, if you have children in the house, how clean/cluttered your home is kept, how often/quickly you put things away, whether or not you are a hoarder, if you have pets, etc.

Continuously updated maps of your home and its contents over time is a lot more revealing than just a basic floorplan.


While Scale is very good at marketing, I wonder if companies using Scale AI genuinely understand how their data is being labeled. Scale has been completely obfuscating the fact that it primarily does crowdsourcing through https://remotasks.com/. The Remotasks brand is kept entirely separate from Scale AI for obvious reasons. I don't think major companies would opt in to crowdsourcing if they knew about it. There are plenty of managed labeling service providers who guarantee that labelers come to a facility without phones to label data. There's a reason they exist.


For what it's worth, the company offers pricing tiers for secure facility vs crowdsourcing and customers get to choose.


The answer: special robots used during development for training ML image classification. Presumably leaked by human gig workers in Venezuela who were hired to perform image classification.

> All of them came from “special development robots with hardware and software modifications that are not and never were present on iRobot consumer products for purchase,” the company said in a statement. They were given to “paid collectors and employees” who signed written agreements acknowledging that they were sending data streams, including video, back to the company for training purposes. According to iRobot, the devices were labeled with a bright green sticker that read “video recording in progress,” and it was up to those paid data collectors to “remove anything they deem sensitive from any space the robot operates in, including children.”


> Milagros Miceli, a sociologist and computer scientist who has been interviewing distributed workers contracted by data annotation companies for years. Miceli has spoken to multiple labelers who have seen similar images, taken from the same low vantage points and sometimes showing people in various stages of undress.

> The data labelers found this work “really uncomfortable,” she adds.

This is an interesting point: the article seems to suggest that this was not done out of malice (as the woman's face was obscured beforehand).

> Labelers discussed Project IO [another assignment by Scale] in Facebook, Discord, and other groups that they had set up to share advice on handling delayed payments, talk about the best-paying assignments, or request assistance in labeling tricky objects.

It's clearly against policy,

> But such actions are nearly impossible to police on crowdsourcing platforms.

> When I ask Kevin Guo, the CEO of Hive, a Scale competitor that also depends on contract workers, if he is aware of data labelers sharing content on social media, he is blunt. “These are distributed workers,” he says. “You have to assume that people … ask each other for help. The policy always says that you’re not supposed to, but it’s very hard to control.”

I'm honestly not that surprised that something like this happened; similar things happen on MTurk.

1. Society demands this kind of automation

2. Companies reacting to this demand have to hire humans to perform manual labelling

3. Humans that perform labelling don't always follow the rules and policies in place

4. Data leaks occur

5. Articles like this are written

6. Demands for automation don't really change

(repeat)


Scale AI contractors likely leaked these. Regardless of how they came to be posted onto Facebook, it still seems like iRobot's responsibility to keep this data under wraps.

We see similar stories all the time: data collected with consent, data collected without consent, and data collected without anyone even knowing about it all end up leaked.

Even Apple has been caught recording via HomePods without consent.


Valetudo [0] supports local only operation of various supported robot vacuums.

Even apart from the privacy stuff, the fast local web interface and open standards integration support (mqtt, homeassistant etc) are brilliant.

0: http://valetudo.cloud/


Highly encourage folks to take a look at Valetudo. I recently installed it on a Dreame Z10 Pro and it works well. Even without various integrations, I added a DNS record for it and can happily go to http://vacuum/ to control it locally.


As someone trying to figure out how to do this with mine, thanks for saving me the headache of finding one.


Since Scale.ai provides labeling services to the US Department of Defense, how do they address the issues presented in this article? Have their labelers go through government background checks? Provide labeling software but not the labor?


They hire cleared labelers in St. Louis. The software likely needs to run on premises or in government networks.


I always knew Scale AI was complete BS and overvalued. The lack of controls and oversight is embarrassing; I can't believe they have government contracts.


Their gov business is questionable for long-term sustainment. Labeling services are akin to the transcription services provided by others like Leidos... it's not a technology business. They got contracts through political connections...

there's a major turnover in their federal team...


Yeah, I’m not sure I buy their explanation about special development roombas since they offered zero proof. I have an ancient Roomba that’s still going and dread having to replace it one day.


>Yeah, I’m not sure I buy their explanation about special development roombas since they offered zero proof.

It is simple for any person with even basic knowledge of networking to independently come to the conclusion that Roombas are not uploading video streams (or photographs) to the internet.

I know the IP (10.0.0.11) and MAC (50:14:79:1E:AB:6B) address of my Roomba and using the Insight Netflow Analyzer for OPNsense I can see how much data it has sent to the internet. In the last six months it has sent approximately 72MB of data outside my network. That's about 600KB per day.

It has received much more, presumably firmware downloads.

This is consistent with firmware update checks, notification traffic, and me periodically adjusting its schedule remotely.

That's just me clicking on some tabs in my router's web UI. Hundreds if not thousands of people globally are constantly reviewing and monitoring Roomba network traffic in fine detail in order to understand and/or reverse engineer it for research and other purposes.
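As a rough sanity check on those numbers (the 182-day window and the 100 kbps "low-quality video" bitrate below are assumptions, not measurements), the observed traffic volume is orders of magnitude too small to hide a video stream:

```python
# Could ~72 MB of upload over six months conceal a video stream?
observed_bytes = 72 * 1024**2                   # total upload seen over ~6 months
per_day = observed_bytes / 182                  # average bytes uploaded per day

low_bitrate_bps = 100_000                       # assumed floor for even poor video
one_hour_of_video = low_bitrate_bps / 8 * 3600  # bytes for a single hour of footage

print(f"observed: {per_day / 1024:.0f} KB/day")
print(f"one hour of 100 kbps video: {one_hour_of_video / 1024**2:.0f} MB, "
      f"~{one_hour_of_video / per_day:.0f}x the entire daily upload")
```

Even a single hour per day of the lowest-quality video would exceed the entire observed daily upload by roughly two orders of magnitude.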

So one of three things is happening:

1. All Roombas send photo and video streams to iRobot and they have thus far managed to hide this from the public and the thousands of eyeballs constantly monitoring the network traffic of their products, or

2. A subset of Roombas send photo and video streams to iRobot and they have thus far managed to hide this from the public and the subset of eyeballs monitoring the network traffic of their products, or

3. These are development devices like they claim.

Based on my own experience we can eliminate 1; based on the images accompanying the article, option 3 is highly likely.


You really think Roomba owners know how to do any of that, or even would?


Yes.

A non-trivial number of them. Thousands, out of millions, at least.


You could buy a lidar-based robot vacuum instead of one with a visible-light camera. The only benefit I know of that visible light provides is so-called "poop detection" and avoidance, but that's somewhat unreliable anyway.

Lidar, in theory, could create a photo-like image, but that resolution costs money and none of these robot vacuums are anywhere near it. Plus lidar maps depth, not texture, so anything it does create is somewhat abstract.


Apart from the most recent top of the line models, lidar sensors are mounted at the top of the device and cannot detect short items (slippers, cables). They are not much more capable with obsticalr avoidance than the front bumper system on roombas.

The few robot vacuums that have only a front lidar sensors perform poorly compared to the camera only or lidar augmented camera in the latest Roomba and Roborocks.


There is no possible way they could prove anything about this in a form adequate for a short article. You either trust that the writers are responsible journalists representing the truth of the matter to the best of their ability or you don't.

That goes for the device manufacturer as well. They couldn't possibly prove the fidelity of their statements except on a witness stand under the penalty of perjury. So unless you think they are conducting the type of conspiracy that could see some of them sent to prison, we might just have to trust that they don't defraud the public on a regular basis.


Their policy (https://homesupport.irobot.com/s/article/964) seems pretty clear, and from what I've seen of the app it explicitly asks before uploading photos. If you're concerned you can turn the recognition feature off, or block it from the Internet and trigger it manually.


> In iRobot’s case, over 95% of its image data set comes from real homes, whose residents are either iRobot employees or volunteers recruited by third-party data vendors (which iRobot declined to identify). People using development devices agree to allow iRobot to collect data, including video streams, as the devices are running, often in exchange for “incentives for participation,” according to a statement from iRobot. The company declined to specify what these incentives were, saying only that they varied “based on the length and complexity of the data collection.”

> The remaining training data comes from what iRobot calls “staged data collection,” in which the company builds models that it then records.

> iRobot has also begun offering regular consumers the opportunity to opt in to contributing training data through its app, where people can choose to send specific images of obstacles to company servers to improve its algorithms. iRobot says that if a customer participates in this “user-in-the-loop” training, as it is known, the company receives only these specific images, and no others. Baussmann, the company representative, said in an email that such images have not yet been used to train any algorithms.


There are some cheap Chinese ones that can be spoofed into talking to a local, open source clone of the official cloud service.

As a bonus, they seem to be far ahead of iRobot at vacuuming/mopping.

It's not a great option, but might be the best if you are looking for something that will map out a house and clean designated rooms, and that will function when not connected to the internet.


You're operating in a framework where you already think they are lying. What proof would you even deem acceptable?


‘iRobot declined to let MIT Technology Review view the consent agreements and did not make any of its paid collectors or employees available to discuss their understanding of the terms.’

The reporters tried to validate and were blocked. That’s where my suspicion lies.


If it were my company I would not let journalists anywhere near it or my employees either. All downside.


You don't need to replace it, really.


> They were the sorts of scenes that internet-connected devices regularly capture and send back to the cloud

Most home devices like robot vacuums really have no valid reason to connect to any cloud, let alone send any pictures there. Such a robot can run all the necessary code [normally run in the cloud] on itself. If only consumers were not so naive as to accept the cloud bullshit as the norm.


Real data from robots getting stuck is a valid reason: it helps improve performance and handle common real-world scenarios that are impossible to replicate in test cases.


It can make sense to upload volunteers' data (and let them view it before uploading), yet this is not a valid reason to make a device refuse to function without the cloud.


These are internet-connected devices with cameras. A firmware update could be pushed to record everything you do and stream it to YouTube all day, and another applied later to remove any remnant of what happened. Any privacy you have with a device like this comes from the benevolence of, the lack of a profitable opportunity for, or the fear of being caught on the part of the company that has root on that device.


Why does a Roomba need a camera looking UP?

Why are they labeling furniture in home that Roomba can't possibly reach from the floor?


That's for navigation. You want to be able to tell it to 'clean the living room', so it needs to know what the living room is (or some of its landmarks). The robots are low to the ground, so tilting the camera up helps.

That's not the only approach, though. You can look forward (or just use lidar), but this navigation approach seems to be less sensitive to, say, furniture being moved around.


Say a couch has shiny metallic legs that mess with the depth estimation. An estimate of where the corners of the couch are could give a better estimate of the legs and weigh one possibility more heavily than another.


I mean, the Roomba drives itself under the couch and gets stuck because of a lack of clearance. If you're working at iRobot trying to make better Roombas, and you need more data on why your height sensors aren't preventing that condition, isn't a camera looking up the obvious way to collect that data?


Reminds me of a conversation I had with an engineer from an internet-connected thermostat company. When designing their device, they disassembled many of their competitors' thermostats. They found that some had microphones on their circuit boards. Scary stuff.


The Roombas in the article were testing units, and people here seem to think this capability is only available on dev hardware. However, it has been available as part of the iRobot beta program for a long time:

https://www.reddit.com/r/roomba/comments/qrs5e5/roomba_beta_...

and, of course, iRobot is in the middle of being acquired by Amazon, who have a long history of giving police access to customers' camera feeds without proper opt-in processes.


Roomba's claim is that only users enrolled in/with special units were part of the data being stored & labeled by contractors.

The sort of live-view functionality described in that post doesn't necessarily require Roomba to store or transmit video, or mean it was part of the set of images leaked by Scale AI contractors.


Yeah, but owners of roombas can do the same thing.

So can people with credentials of compromised iRobot accounts.

So can Amazon, post acquisition.

People that purchased these devices weren't signing up for a system with such terrible privacy implications.


Let's all remember that iRobot, the company behind Roomba, is now owned by Amazon.com, and not everyone likes it. [0]

[0]: https://www.theatlantic.com/ideas/archive/2022/08/amazon-roo...


Is there any verification that there actually was consent? E.g. has anyone come forward with evidence that there really were devices which did this?

Even then, who gave the consent? Did they verify that the person who did so legally had the ability to? What happens if the person on the toilet is a child?

I hate lawyers and I’m not one myself but even I can tell you this is a bad fucking idea.


Hell with Roomba. Vacuuming floors is nothing. I want a robot that cleans and sanitizes toilets and bathrooms, another that does dishes and the kitchen sink, and one for the car.


Importantly:

> All of them came from “special development robots with hardware and software modifications that are not and never were present on iRobot consumer products for purchase,” the company said in a statement. They were given to “paid collectors and employees” who signed written agreements acknowledging that they were sending data streams, including video, back to the company for training purposes. According to iRobot, the devices were labeled with a bright green sticker that read “video recording in progress,” and it was up to those paid data collectors to “remove anything they deem sensitive from any space the robot operates in, including children.”


Misleading title. They should try to have the title give some indication it was a development roomba in a special opt-in data collection program.


The silver lining is that now I can show this article to anyone who accuses me of being overly paranoid.


I just wish the companies making these devices had the same level of care with other people’s privacy as I would if I was making these devices. It’s not right.

At the very least, companies should have to sign an oath to protect their customers and employees, not abuse them, similar to how health professionals take an oath to do no harm. Is that too much to ask in this world?


Clickbait. There is no news here. No privacy was infringed. This was a private dev version.


Even those who opted-into this data collection certainly didn't opt-in to it being posted publicly on Facebook?


Well the picture of the woman on the toilet is in the article, so it's fairly safe to say her privacy has been infringed.


It was infringed long before it was added to an article.


Pretty sure none of the users consented to photos of them on the toilet being posted on Facebook (and now a news article; wtf was the Tech Review thinking).


Face blurred, no way to see who it is (though I haven't checked the picture for EXIF data; that would be fubar). IMHO, and I don't expect anyone else to share this view, the picture drives the surveillance aspect home better than words would, though perhaps that makes more of a difference for people who don't ordinarily read Tech Review.


It is still mortifying to have a photo of yourself sitting on the toilet with your shorts down published, even if your face is blurred out. Yes, the damage has already been done and this made the article more compelling, but I don't think it was necessary or appropriate.


>...and I don't expect anyone else to share it,

...aha!



