Mycroft – An open-source voice assistant (mycroft.ai)
431 points by doener on March 27, 2020 | 113 comments



As far as I know Mycroft does its speech recognition in the cloud, so your voice has to leave your network unfortunately. This is the reason why I don't have a voice assistant yet. There was snips.ai which tried to solve this problem locally but they were acquired by Sonos.

edit: spelling


I've been having great success building homebrew voice assistants with Rhasspy [0] and voice2json [1] (they're sister projects from the same maintainer). The code's not ready to share yet, but I have a voice-controlled Raspberry Pi music server in my car now, working 100% offline (got it ready just in time to never have to drive anywhere, lol).

[0] https://github.com/synesthesiam/rhasspy

[1] http://voice2json.org/


Please share! The community can help with ‘readiness’ (:


Haha, it's a completely garbage hodge-podge of Node and Bash right now, but I'm cleaning it up and will be sharing soon!


The pieces of Snips that were OSS before Sonos bought it and ditched the community are being leveraged, along with new work, in Project Alice: https://github.com/project-Alice-assistant It continues to strive to be modular and offline. By design, the choice of online/offline elements, including Google ASR and Amazon TTS, along with the corresponding quality and privacy tradeoffs, is yours. Come give a hand.


This is really nice, are there any plans for Android deployment?


Pretty cool but weird choice of logo... are they run by The Umbrella Corporation?



Probably referencing Program Alice from the films.


The default configuration is cloud based. However, you do get to choose where your voice data goes. They also have a method to keep the TTS local. I haven't checked in on the project in a while, but they were either working on or already had a local home server, so you can make the whole system internal if you wish. I'd provide reference links but, as noted, the site is currently down.

Edit: There are also several skills available that can point at full, local, downloads of wikipedia and the like. So if you prefer faster results and keeping some of your queries internal as a sort of hybrid thing that's an option as well.


Snips were acquired? That's bad news, they were the only ones doing offline voice recognition...


They were definitely not the only ones, just the ones with the most aggressively spammy marketing team. HN mods actually stepped in once or twice about it.

Rhasspy is a really cool project that is compatible with a bunch of fully offline speech frameworks. PicoVoice is a super neat company too.


Thanks for the tip! Both Rhasspy and PicoVoice look great. Can anyone recommend a microphone array that works with the Raspberry Pi?


Look for the Keyestudio ReSpeaker HAT for the Raspberry Pi. I got one for this exact purpose but still haven't had the chance to test it.


Very cool. I see that ReSpeaker even has a 6-mic array for the Pi: https://respeaker.io/6_mic_array/


I have the (older, discontinued) 6+1 USB mic array, but I'm in general wary of the ReSpeaker stuff. I have the original ReSpeaker Core v1 hardware as well, and it's incredibly unstable. It comes with a tiny amount of flash that you're supposed to augment with a SD card (which then gets overlayfs'd onto the root filesystem), but the hardware is so buggy that running things from the SD card causes random SIGSEGV and SIGILL all over the place (and yes, I've painstakingly verified that this is not a software issue).

They essentially abandoned development of the v1 line (but appear to still be selling it!) for the v2 line, and I have a hard time trusting them.

Admittedly, I haven't done that much with the mic array I have, mainly because I got burned out dumping tens of hours into trying to get ReSpeaker Core v1 working (which I ended up trashing; it was that bad). I'd like to use it with a Raspberry Pi; hopefully that works out.


Seconding both ReSpeaker and Rhasspy/voice2json, I've been having great luck with that combo. The docs and example code for ReSpeaker are great, very easy to work with.


Oh, thanks for that, I've been looking for a good offline speech framework. Unfortunately, the hardest part for me is the microphone array, hopefully I can find something good there.


I haven't had time to play with Rhasspy, but the fact that you can plug and play components from different speech platforms is really promising, particularly when I want to interface it with my own assistant at some point.

Wake word detection, speech to text, intent recognition, etc. is all split out and able to be plugged into separately.
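To make the "split out and pluggable" idea concrete, here is a minimal Python sketch of a three-stage pipeline. It is not Rhasspy's actual API: `VoicePipeline`, the type aliases, and the three stand-in stages are hypothetical names invented for illustration, and the stand-ins operate on text bytes rather than real audio.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Each stage is a plain callable, so one engine can be swapped
# without touching the others.
WakeWordDetector = Callable[[bytes], bool]
SpeechToText = Callable[[bytes], str]
IntentRecognizer = Callable[[str], Optional[str]]


@dataclass
class VoicePipeline:
    detect_wake_word: WakeWordDetector
    transcribe: SpeechToText
    recognize_intent: IntentRecognizer

    def handle(self, audio: bytes) -> Optional[str]:
        """Return an intent name, or None if the audio should be ignored."""
        if not self.detect_wake_word(audio):
            return None
        return self.recognize_intent(self.transcribe(audio))


# Hypothetical stand-ins: real engines would process audio samples,
# not text bytes with a "WAKE " prefix.
def fake_wake_word(audio: bytes) -> bool:
    return audio.startswith(b"WAKE ")


def fake_stt(audio: bytes) -> str:
    return audio[len(b"WAKE "):].decode("utf-8")


def keyword_intents(text: str) -> Optional[str]:
    if "lights" in text:
        return "DimLights"
    if "podcast" in text:
        return "ResumePodcast"
    return None


pipeline = VoicePipeline(fake_wake_word, fake_stt, keyword_intents)
print(pipeline.handle(b"WAKE dim the lights"))
print(pipeline.handle(b"dim the lights"))
```

Replacing any one stage (say, a different speech-to-text engine) only means passing a different callable with the same signature.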



For n=1, I got DeepSpeech 0.6 working on a Raspberry Pi, and the recognition accuracy was atrocious. Haven't tried the new 0.7 branch, though.


Keep in mind DeepSpeech is an implementation of a speech recognition algorithm... the results will largely depend on the model used, which typically requires lots of manual labor.


That is one of the toolkits used by Mycroft. It isn't everything needed for an assistant, but if you want to make one it is probably the best starting point.


I found out about Mycroft through their Kickstarter project, Mycroft Mark II. What convinced me to back them was that they said that, theoretically, I should be able to self-host their server on my own hardware at home.

Unfortunately, they mismanaged Mark II so badly that I lost all faith in Mycroft in general.


Same.

I, too, saw the Indiegogo campaign and gladly handed over nearly $200 for the Mycroft Mark II.

Two and a half years later, there has been little progress, with no telling when they'll actually deliver the device to backers.

It's one thing for a company to delay production; it's another to have lost money on this, with no help from Indiegogo and a refusal from the company to refund backers. I have no hope of this product seeing the light of day any time soon.

Shame on Mycroft for this scam.


That is the nature of crowdfunding. This is not a scam unless there is deliberate malice involved. A mere inability to deliver a product is not a scam. They cannot give you a refund as the money had been spent on development. If they had held your money so that they could now give you a refund, that would be extremely problematic.


I can't wait for a community-driven, open-source data model that we can take offline and plug into any software adapter. With a little version control and some voluntary data samples, a large enough community could get it going.


Last time this came up, the options were the Mozilla TTS service or doing it locally with TensorFlow Lite and DeepSpeech.


Well, there is some current research, in Europe at least, on how to make voice assistant technology more respectful of its users' privacy: https://www.compriseh2020.eu/ It's still far from market-ready, though...



"As far as I know Mycroft does its speech recognition in the cloud..."

How practical would it be to recognize my most common commands, eg "resume podcast", offline?


The fact that they are online is stopping a bunch of people from shouting at nobody in particular to "dim the lights", "play this (local) playlist", "turn ac on", "play two random episodes of paw patrol"... Eg. me.

Many of those who are concerned with apps phoning home would be the ones to keep their data and media locally too!


Very practical, there are even dictation apps with quite good accuracy (and that is dictation, which is much harder because the sentences are free-form, not predefined) which work offline (like some offline Nuance Dragon apps).
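As a rough illustration of why a fixed command set is the easier case: once any offline engine has produced a transcript, mapping it onto a handful of predefined phrases can be done with the standard library alone. This is a minimal sketch, not how any particular assistant does it; `COMMANDS`, the intent names, `match_command`, and the 0.75 cutoff are all assumptions made for the example.

```python
import difflib
from typing import Optional

# Hypothetical command set, using phrases mentioned in this thread.
COMMANDS = {
    "resume podcast": "ResumePodcast",
    "dim the lights": "DimLights",
    "turn ac on": "TurnAcOn",
}


def match_command(transcript: str, cutoff: float = 0.75) -> Optional[str]:
    """Map a (possibly slightly misrecognized) transcript to a known intent."""
    # Lowercase and collapse whitespace so matching is not thrown off by formatting.
    normalized = " ".join(transcript.lower().split())
    best = difflib.get_close_matches(normalized, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[best[0]] if best else None


print(match_command("Resume  podcasts"))
print(match_command("what is the weather in paris"))
```

Because the target phrases are known in advance, small recognition errors can be absorbed by fuzzy matching, which is not possible with free-form dictation.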


This is a major drawback of Mycroft and I really hope they fix it soon.


Your voice is already compromised if you use a smartphone or Windows/Mac with a mic.


One theory of security is that you can't ever be truly secure; but you can make it difficult/expensive for your security to be violated. If you lock the door, someone can always break a window, but that creates more noise and therefore more risk, making an intruder more likely to seek a softer target. A ten-foot wall can be breached, but it will likely dissuade anyone without an eleven-foot ladder.

The problem with mass surveillance is not that the NSA/etc can breach anyone they want; that's effectively always been true. It's that it's cheap to breach everyone, by default, all the time. Taking reasonable precautions such that you can't have your data swept up cheaply, instead requiring targeted human effort and/or a court order, does create a deterrent, and helps counteract the current imbalance of power between TLAs and We The People.


If you are a target of the NSA or some other state actor, then perhaps. If not, and you take a bit of care to disable the voice assistant on your devices (Siri, Google, etc.), then you should be fine.


It's not the NSA that does it, it's Microsoft, Apple, Huawei, and so on. And no, disabling the voice apps does not opt you out of telemetry.


Do you have any citation/proof? Or is this just a suspicion?


That just disables the voice assistant. It does nothing to ensure that your mic isn't on, listening, and recording at any time. I've had too many targeted ads start popping up after conversations with people to be convinced otherwise.


My pet theory is that this is actually a geo-location type of tracking, combined with the recency illusion.

Let's say person A was doing research on a product they're interested in buying. That product category is now associated with person A. Person A goes and visits person B. Because our phones track where we go by default for most people, now there's an association between Person A and Person B, and a potential likelihood that person B is also interested in that product. Thus the occasional ad shows up for the product Person A was researching. Since Person A and Person B are friends, it's possible Person A will talk about the product they're interested in. Person B now notices that they're getting ads for the product that Person A talked about.


It's observer bias, nothing more. Test it out. Pick a phrase; my spouse and I chose "snowmobile". For the next week, we had random conversations about how cool snowmobiles are, how badly we wanted one, what price we were willing to pay, financing, etc. We peppered our conversations with click bait, honestly. Neither of us particularly ever had any interest whatsoever in snowmobiles, so we figured this would be a pretty OK test for anecdata.

At the end of the week, we both saw no increase or even a mention in our targeted adverts regarding snowmobiles. Likely what is happening is that you searched something somewhere on your phone, then brought it up in conversation, then saw an increase in ads related to that subject.


I think people just underestimate how good targeted ads are. They're probably predicting something we might want before we even realize we want it.


> At the end of the week, we both saw no increase or even a mention in our targeted adverts regarding snowmobiles.

While I do respect you and your spouse for trying this experiment, it's entirely too small a sample size to really prove that this doesn't happen. There are a ton of variables involved, the biggest being which advertisers may or may not be listening in at any given time. It's hardly a controlled experiment, and I can't say I can put any stock in it.

It also does not prove that it is observer bias. I have had things pop up that I am 100% absolutely positive I never in any way, shape, or form looked into, yet there the ads were after short conversations with others who did have an interest.


One thing I thought of and wanted to test: what if you yourself don't do any internet queries but someone on the same network as you does? I've had conversations with people; one time we were talking about a concert venue which I did not look up, but I did start to receive ads for that concert venue. It creeped me out initially, but I believe the other person on the same home network was googling or looking for information on that venue on their phone. I believe we were just associated together for advertisement purposes.


I’m pretty sure this happens. When I was doing work with NetSuite, my roommate (who has no interest in anything remotely related to enterprise software, and doesn’t have any other devices that could have been listening in) started getting ads for them on his laptop.

It makes sense for them to target by IP, so if one person in an office is researching the product they can start targeting the entire office and maybe get their ad in front of an exec who can make a purchasing decision.


Do you live somewhere snowmobile advertisers are likely to target?


Even if he doesn't there are snowmobile resorts. People pay good money to go to places with a lot of snow, rent a snowmobile and ride in the wilderness for a week.

Also a fair number of people live where a snowmobile isn't practical but have a cabin where they are and so they will be making weekend trips to where they are practical.

Your point stands though: anyone who lives in the wrong area is unlikely to be targeted if the only indication they might want a snowmobile is home conversation. If you are serious you will do other searches (tracked for sure) and have location history in a target area (which might not be trackable; these areas often have poor cell coverage).


That experiment is meaningless my friend. Aside from it not being applicable, ads are only a fraction of the issue.


Your defeatism is making things worse not better.

Your message is basically "You're already screwed, why bother trying?"

Why act that way? Who are you helping?

If you support privacy, then people taking steps to try to share less data ought to be encouraged, right?

Maybe it's not perfect - But it's worth trying to do better.


In what way did my message suggest defeatism?

I take active steps to minimize when and how the microphones around me listen in. In no way do I subscribe to the idea that "we are already screwed".

Please do not take the worst possible interpretation of what people say on HN. It's in the guidelines and generally looked down upon. This is not reddit.


Android doesn't listen. You can check for yourself using `adb logcat`.


No, stop wasting your energy on change you won't make happen. Imagine if every innovator spent their time appealing to others.


Good thing it runs on a RPi then.


Worth a mention they are fighting a patent troll right now https://voicebot.ai/2020/03/04/mycroft-ais-legal-war-against...


Wow. The patent is for "Using voice commands from a mobile device to remotely access and control a computer". That's so obvious it's laughable. How was this patent granted?


Once at a meeting to discuss the patent application the presenter mentioned how Jeff Bezos patented "one click buy" - i.e. Buying something with a single click. I asked why we didn't patent "2-n click buy" to prevent any competitors from selling anything.

Everyone laughed but I meant it seriously. I still don't get how "1 click buy" is a thing you can patent but you can't patent "2 click buy". The patent process is incredibly stupid and arbitrary as far as I can tell - at least when it comes to software.


FWIW, there was a patent troll trying to get people for the idea of "online shopping carts" which is pretty much "two-click buy" imo.

https://arstechnica.com/tech-policy/2013/01/how-newegg-crush...


Patents are not granted for what they achieve, but instead how they achieve it. So the patent is for a specific methodology of "using voice commands [...]". Not that the content of the patent is novel -- I have not looked deep into it personally. It is likely a general BS methodology that the troll never used for a real product.


Mycroft as software is used by a small group of users and seems pretty stable. More features are continuously added and the design principles look promising (open source, as private as possible).

The biggest problem is their hardware: they have a Mycroft v1, which (to me personally) is a prototype-like piece of hardware. There have been successful campaigns for a v2 release, with new hardware and an improved design.

However, they fail to work with reliable partners and there's still no working device which resembles the final production level. I have been a backer of the Indiegogo campaign, but it's frustrating that they keep postponing their Mycroft v2. I really hope they can deliver the device at some point, but they keep rewriting software, and if they ship, the hardware will probably be pretty outdated.


It's true v1 was more of a proof-of-concept device. The core hardware target has always been the Raspberry Pi family of devices since they are pretty well ubiquitous. 2-3 is the current set. Base software is Linux, so if your audio devices, mic and speaker, work with Linux then you're pretty well good to go. Most microphone HATs for the Pi are supported. I use the Google Voicekit AIY v1 with a Raspberry Pi 3 B+. Works a treat.

As with most open source efforts (especially early on) there's a lot of tinkering and DIY at the get-go, and they've designed their product to be supportive of this. Their "retail" devices are, much like Google's intent with Nexus, intended to be a best possible target for other vendors, including the DIY crowd. Whether that's the right way to come at it is open to debate, but the premise that they don't have a set hardware target is at best misleading.

Edit: Link to hardware spec https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/get-mycr...


I think perhaps it’s hard to hit a hardware target when the software is still in a pretty big state of change. It’s hard to say what hardware will be needed and what is the best compromise between hardware performance and the DeepSpeech NN for STT. DeepSpeech is still in development as well.

I think the priority needs to be getting the STT to a good, neural net backed, open source engine. Once the software is stabilized I think there’s room for a whole ecosystem of hardware interfaces.


This is already a step up from Google Home devices in that you can trust it's not sending audio to Google outside of what you intend, but I'll be properly excited when the open-source speech-to-text component is working[1] and I don't have to send my voice to Google at all.

[1]: https://mycroft-ai.gitbook.io/docs/mycroft-technologies/over...


I might be wrong so grain of salt. The main website is down. I believe the open source STT was working and in. As with many community projects it relies on folks contributing time to update documentation and unfortunately Mycroft hasn't received the love of other projects so I believe, again I may be wrong, that this documentation is outdated.


I get a Database Error on the homepage. Hug of death?


Error establishing a database connection. And it didn't say that in clear voice.


I mean, I would say that's consistent behaviour compared to other voice assistants I've tried.


WordPress without cache plugin + HN traffic = downtime =)


The dreaded hackernews effect!


Mycroft can also be used on Raspberry Pi-like developer hardware which is awesome :D

Check out this example with the MATRIX Creator and MATRIX Voice: https://www.hackster.io/matrix-labs/matrix-devices-running-m...


I have used it! The MATRIX devices are very good choices for deploying this kind of voice processing solution.


Not really the first one though :p

https://github.com/rcbyron/hey-athena-client/commit/50b37628... (2015)

Vs

https://github.com/MycroftAI/mycroft-core/commit/8e470ce7c15... (2016)

Jokes aside, it's great work, I'll probably try it in my spare time.



You beat me on this :)


Home Assistant is a home automation tool, not a voice assistant. You can even hook up Mycroft to Home Assistant.


Home Assistant has nothing to do with being a voice assistant.


I’m getting “error establishing database connection”


I made a fully offline voice assistant on a Raspberry Pi with Pocket Sphinx and Festival Lite glued together with Python. Performance wasn’t great but it was a fun project nonetheless.



Bummer their site is still down. Not sure why a marketing site needs to maintain connectivity to a database to render the homepage?


Most web pages need to be able to be updated. I think the two most popular options for that are a CMS or a static site generator. In general it is easier for "less technical" users to update the content in a CMS. Hence you get a database for rendering.

Alternate analysis: Personally, it is not surprising to me at all that a marketing website would be run on WordPress, Drupal, etc. And therefore it is quite clear that a database connection would be needed.


Even with WordPress you don't necessarily need the database connection. Either a colocated reverse proxy can cache the result, or you can configure CloudFront or some similar third-party service which will not hit the origin unless it has to. Or you can even use one of the WordPress plugins which cache the resources/pages as static files.

It requires an extra half hour of work, but... it's probably worth it.
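The caching idea above can be sketched in a few lines of Python. This is only an illustration of the principle (serve a stored copy, touch the database-backed renderer only when the copy is missing or expired), not how WordPress plugins or CloudFront are actually implemented. `PageCache`, its parameters, and `render_from_database` are hypothetical names; the serve-stale-on-error behavior is an added assumption.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Serve rendered pages from memory; call the origin only when needed."""

    def __init__(self, render: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._render = render          # expensive, database-backed renderer
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, path: str) -> str:
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # fresh copy: origin is not touched
        try:
            html = self._render(path)
        except Exception:
            if entry is not None:
                return entry[1]        # origin failing: fall back to stale copy
            raise
        self._entries[path] = (now, html)
        return html


render_calls = []


def render_from_database(path: str) -> str:
    # Stand-in for a page render that needs a database connection.
    render_calls.append(path)
    return "<h1>page for " + path + "</h1>"


cache = PageCache(render_from_database, ttl_seconds=60.0)
for _ in range(1000):
    cache.get("/")
print(len(render_calls))
```

With a cache like this in front, a burst of requests for the homepage results in one render per TTL window instead of one per visitor.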


The website is down but a lot of work happens at the forums and the forums are still up. If you're interested in some insight in to the community around the project have a look over there. https://community.mycroft.ai/


Stupid question: has anyone successfully hooked up a voice assistant to UPnP renderers and media servers? I'd be willing to invest some effort if I could use them to browse music. Seems quite tricky, particularly for non-native speakers :)


Interesting you should mention this need, as there's something that integrates Mycroft that just came up in one of my feeds:

Plasma Bigscreen from people involved with KDE:

https://www.forbes.com/sites/jasonevangelho/2020/03/26/plasm...

https://youtu.be/yylFiE4QtUE


And it got posted here too: https://news.ycombinator.com/item?id=22693172 (which has a more authoritative link than my original one)


Someone has a Plex-based music skill. https://community.mycroft.ai/t/plexmusic-skill-test-and-feed... Doesn't seem to be quite what you're after, but I know very little about UPnP renderers so this might work for you.


I wonder if you can change its name now.

Last time I asked, the dev asked me back how much I'd be willing to pay for it...


Like change the trigger word? Looks like you can: https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customiz...


Awesome.

I'll have a look at it again.


What was your answer?


I stopped following the project after that.


You can absolutely do this. I haven't played with it in a while so I don't recall if that got put behind the service paywall. The wake word is processed locally though, and the software is open source, so even without the online management dashboard I'm certain you can hard-code it. They were also working on, or had, a home server for managing your devices entirely internal to your network, and it seems unlikely they didn't include a way to manage the wake word.


Site is down; any WordPress and PHP-based stack will crash after 30-50 simultaneous requests ...

I am amazed that this doesn't seem to bother anyone ... And adding a cache layer is not a real solution ...

And in other parts of the internet there are a few stacks delivering 5k to 150k rps per core without cache ...

But still everyone uses WordPress ...


This is flat out false, WP is able to handle much more than this. I've scaled WordPress personally to 10M+ users per day


Because of the way PHP works (1 process per request), it can't handle a lot of simultaneous requests, and anyone can crash ANY WordPress website with a single `wrk` command ... If you have all the content cached by your CDN it might be OK, but if you have a dynamic website with API/DB, not just static content ... WordPress is slow as hell in my experience


I've been using WordPress for 4 years with good scaling and no downtime. It's only "slow as hell" if you don't actually know WordPress (which most WordPress users don't). ;)


I relied on WordPress for several years (converted sites over from Rails):

https://battlepenguin.com/tech/a-history-of-personal-and-pro...

Quite often I'd get issues with the database failing due to load. I'm very glad I moved everything to a static content generator.


Would be very cool of them to at least alias the mycroft.xxx domain


Is there any good terminal assistant? Something that accepts these kinds of queries but within a CLI, preferably without the need for a constant internet connection.


Mycroft has a curses CLI: `start-mycroft.sh cli` or `mycroft-start cli` if services are running, `start-mycroft.sh debug` or `mycroft-start debug` to start services and go straight into the CLI.


So maybe Mycroft and Mozilla should get together?


Hug of death.


Their home page has gone all errory.

Error establishing a database connection


Named after Alan ?


I'd guess it's named after the computer in Robert A. Heinlein's The Moon is a Harsh Mistress.


No, they're both named after Mycroft Holmes, Sherlock Holmes' equally brilliant brother from the series by Sir Arthur Conan Doyle.


I guess we'll find out once the website is up again.

https://mycroft.ai/blog/why-name-it-mycroft/

Edit: Google cache has a copy[1]:

> Mycroft is named in honor of Mike, the supercomputer in Robert A. Heinlein’s classic novel “The Moon is a Harsh Mistress”. Heinlein’s Mycroft was a “High-Optional, Logical, Multi-Evaluating Supervisor, Mark IV, Mod. L” – a HOLMES FOUR. Mycroft’s friend Manuel named him “Mycroft” after Sherlock’s elder brother Mycroft Holmes. This was later shortened to Mike.

[1] http://webcache.googleusercontent.com/search?ei=mwx-XrKxLauV...


Mycroft was actually more brilliant than Sherlock, though also lazy.

"My dear Watson," said he, "I cannot agree with those who rank modesty among the virtues. To the logician all things should be seen exactly as they are, and to underestimate one's self is as much a departure from truth as to exaggerate one's own powers. When I say, therefore, that Mycroft has better powers of observation than I, you may take it that I am speaking the exact and literal truth."

... "I said that he was my superior in observation and deduction. If the art of the detective began and ended in reasoning from an arm-chair, my brother would be the greatest criminal agent that ever lived. But he has no ambition and no energy. He will not even go out of his way to verify his own solution, and would rather be considered wrong than take the trouble to prove himself right. Again and again I have taken a problem to him, and have received an explanation which has afterwards proved to be the correct one. And yet he was absolutely incapable of working out the practical points which must be gone into before a case could be laid before a judge or jury."


Perhaps Holmes


I see "error establishing database connection".


How does it work?


db error :p



