Hacker News
Show HN: Squawk – Walkie Talkie for Teams (squawk.to)
153 points by zumachase on June 3, 2020 | 114 comments



This is cool. I can't wait to try it out!

Crazy feature request: One of the things I've wanted for years out of a software-based walkie-talkie like this is manual audio spatial diversity control (a.k.a. 3D audio). That way I can have multiple simultaneous channels, but be able to change the position (left/right/forward/45deg/etc) and the volume of each feed so I can have situational awareness with all channels while being able to use my brain to separate and focus on a particular channel.

These kinds of systems are often marketed to dispatch agencies, mission control, government/military customers, etc, and are super expensive[0]. I've never seen a consumer-grade version of 3D audio like this, but that would be super beneficial to ad-hoc communicators in disasters, scanner enthusiasts, public event coordinators, county/city-level EOCs that can't afford expensive systems.

[0]:https://www.thedrive.com/the-war-zone/24741/new-3d-audio-wil...


Aha! I'm a huge fan of positional voice audio in video games. Mumble has a positional audio feature[0] that is relatively straightforward to integrate via mods. I built a mod[1] recently that adds support to Raft, and I attempted to list some of the benefits in the readme. In my experience, it completely changes the feeling of talking with people online; they become their avatars. I'd love to see it come to discord or zoom, and be lifted out of the video game space.

I haven't looked at Mumble's actual implementation, but I imagine it's possible with the web audio APIs.

[0]: https://wiki.mumble.info/wiki/Positional-Audio [1]: https://www.raftmodding.com/mods/mumble-link


I love that Mumble has that, but I wish it were possible/easy to enable it without a video game and manually position participants.

As for webapps, I think Mozilla Hubs does this: https://hubs.mozilla.com/#/

...as does Freeswitch I believe, though I've had a heck of a time trying to get that up and running: https://freeswitch.org/confluence/display/FREESWITCH/mod_con...


You might be interested in https://www.calla.chat/ ( https://github.com/capnmidnight/Calla ) which supports spatialized audio based on participant location on a top-down map.

Edit: in some ways, this might be totally off-base since it does involve a game and participant self-arrangement. Hopefully it's somewhat relevant, though!


I could be wrong, but based on the README this looks like it just sets speaker volume based on position - which is neat, but is not quite positional audio.


I wondered about that too, and found https://github.com/capnmidnight/Calla/commit/abc851b49bd1801... before commenting which seemed to indicate that there's at least stereo positioning. It's possible I'm mistaken though.


Oh interesting, it looks like you're right! They do appear to be using a StereoPannerNode rather than the full 3D PannerNode[0], but that is more than just speaker attenuation.

[0] https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_A...
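
For anyone following along, here is a rough sketch of the difference between plain volume scaling and stereo panning. It is plain Python rather than Web Audio, and it is not Calla's code: the function names `distance_gain` and `equal_power_pan` are made up for illustration. The pan law shown is the equal-power curve that, as far as I recall, the Web Audio spec describes for a StereoPannerNode with a mono input.

```python
import math
from typing import Tuple


def distance_gain(distance: float, ref_distance: float = 1.0) -> float:
    """Volume scaling only: quieter with distance, no sense of direction."""
    return ref_distance / max(distance, ref_distance)


def equal_power_pan(pan: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a mono source.

    pan is clamped to [-1, 1]: -1 is hard left, 0 is centre, 1 is hard right.
    """
    pan = max(-1.0, min(1.0, pan))
    x = (pan + 1.0) / 2.0       # map [-1, 1] onto [0, 1]
    angle = x * math.pi / 2.0   # 0 .. 90 degrees
    return math.cos(angle), math.sin(angle)
```

With volume scaling both ears always get the same signal, while the panner splits the gain between ears and keeps the total power constant (left squared plus right squared is 1). A full 3D panner goes further than either, which this sketch does not attempt.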


That's an old commit. It's had spatialized audio for the last month, at least.

I'm actually going to put a feature back in to let you select the audio "spatialization" model, because some folks have had trouble with PannerNode not working right. I used to have a selection between volume scaling, StereoPannerNode, and PannerNode. Been a little too busy at the day job to do it lately.


The now defunct Dolby Axon had this feature for up to 50 people. It was great, you could drag people in a grid around you (see picture in [1]). I’m still sad they discontinued it, it’s one of the few voice chat services that I would gladly pay money for.

[1]: https://venturebeat.com/2009/09/04/dolby-launches-axon-surro...


I made a proof of concept chrome extension a bit ago: https://github.com/anateus/amphisonic

Currently re-writing it to be actually decent code. Basically, it takes up streams on the page and arranges speakers in a semi-circle in front of you, letting you adjust the radius (I've found for particular numbers of speakers there's an optimal radius for speaker separation).
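
To give an idea of the geometry, here is a minimal sketch of the semi-circle layout. It is not the extension's actual code: `semicircle_positions`, the axis convention (x to the right, y straight ahead) and the even spacing are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def semicircle_positions(n: int, radius: float) -> List[Tuple[float, float]]:
    """Place n speakers evenly on a half-circle in front of a listener at (0, 0).

    Returns (x, y) pairs ordered from hard left to hard right.
    """
    if n <= 0:
        return []
    if n == 1:
        return [(0.0, radius)]  # a lone speaker sits straight ahead
    positions = []
    for i in range(n):
        # Azimuth sweeps from -90 degrees (left) to +90 degrees (right).
        azimuth = -math.pi / 2 + math.pi * i / (n - 1)
        positions.append((radius * math.sin(azimuth), radius * math.cos(azimuth)))
    return positions
```

Each position could then be handed to whatever panner is in use, with the radius left as the user-adjustable knob.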


I would love for Mozilla Hubs (or any similar app for that matter) to allow one to set a webcam feed as your avatar's "head".

If anyone knows of such a thing please let me know. Would be perfect for socialising during quarantine and beyond.


It's at the other end of the spectrum, but Remotion is really cool and does what you describe. We wanted to get away from always on video, but I know lots of people like that. I cannot say enough good things about what the Remotion team are doing.


Kosmo.io has a poker app where your webcam feed sits at your seat at the table.


I have some multi-tenant PBXs based on Freeswitch; however, I have not seen a positional audio feature in Freeswitch. Can you give more insight? Your link is the conference module and unrelated, I believe.


Aureal A3D. Technology was developed in the nineties for NASA. Stolen by Creative. Creative simultaneously stole their patented tech and sued Aureal for bogus patent infringement. Aureal "won" the Lawsuit, but ran out of money due to legal fees. Creative bought their assets, IP and funny declaration releasing them from any damages from lost patent lawsuit.

Patents long expired, nowadays smallest $2 microcontroller can run it in real time.


Check out https://www.highfidelity.com/

I know it's not really what you are asking for, but it's some very cool spatial audio work, letting you separate out where folks are in a virtual space.

[disclaimer, I know some of the folks working on it]


I tried out highfidelity a couple weeks ago and didn't get any of the directional audio to work. There was a radial but not an angular effect if that makes sense.


interesting. It was working for me on Chrome, I could differentiate left/right


Virtual Airwaves is working in this space.

https://virtualairwaves.com/

and see their iOS app: https://appsto.re/us/jrj3hb.i and the web app: https://cb.virtualairwaves.com/channel/1

They have several patents on "3D Audio" style techniques to give distance cues via the signal:

See:

https://patents.google.com/patent/US9730023


I wish podcast producers would do this (or more of them at least, in case anyone knows of one). I’ve done it manually as an experiment and it definitely improves clarity in cases where there are 4 or more drunk folks talking over each other.


Thanks! The tech in the link is awesome...might be above our pay grade.


It's definitely doable. Dolby Axon (which shut down a few years ago) had positional voice chat.


It should be easy to integrate game audio engines, such as OpenALSoft.


Hooray! I have been holding out hope for more push-to-talk apps on desktop platforms. It's such a step up from always-on listening. As an audio engineer, I grit my teeth every time there is microphone feedback, echo, or other audio issues on group calls, and these generally don't occur on PTT apps because everyone tends to be muted most of the time. And if it does start happening, the user more easily realizes what they did wrong because they correlate it with the button press. In fact, even if the user does not realize what they did wrong, others on the call can identify the source because "it's only happening when Jane speaks." I can't tell you how many times I've been on a call with feedback where the person who is causing it is complaining about it to the group without realizing it's their own fault and others on the call don't necessarily know, either. This never happens on PTT calls. There are also the obvious privacy benefits, as it avoids the issue where people speak without realizing they are unmuted. PTT is better in nearly every respect. I'm very excited about this.

I do have some questions that the website doesn't answer for me:

1. What about screen sharing? Seeing the word "collaboration" implies to me that I should be able to do so, but it's unclear. In the screenshot of the app, I see an icon or two in the right-hand sidebar that might be relevant to this, but they seem kind of generic.

2. What about mobile devices? I routinely do Slack calls where one or two people are on their phone for various reasons. It would be useful to know if that is supported or will be at some point.

3. I want to know more about the encryption. As you're probably aware, there has been a lot of controversy over the security of Zoom. In particular, there is an ongoing lawsuit related to their false claims of end-to-end encryption. [1] I think any new product, especially a chat app, that claims to be end-to-end encrypted really needs to show us the details of its protocol and stack, and ideally open source as many parts of that as possible. Does it use the Signal protocol? The site says "Squawk groups are invite-only and end-to-end encrypted." But which parts are E2E encrypted? The group membership? The message content? The message metadata? Everything?

Lastly, it would be great if you could add the app to Homebrew Cask, as it's my preferred way to download and manage apps on macOS.

1: https://gizmodo.com/zoom-accused-of-misrepresenting-security...


Glad to see we're not the only PTT fans out there.

1. Screen sharing is next on our list, it's the thing that we want most ourselves.

2. This is definitely not mobile optimized. It does work-ish on mobile phones, but it maintains long-lived webrtc connections so it's not ideal (we ensure these are not transmitting when muted, but we have a keep alive protocol which ensures they don't die, and would be harsh on mobile batteries).

3. Squawk uses webrtc, which is e2e encrypted by default. Additionally, we don't use any SFUs so we never have the audio unencrypted. All link negotiation and audio transmission are done entirely p2p and thus completely e2e encrypted.


Is there a plan for (at least self-hosted) SFUs for firewall traversal? I imagine in some corporate environments that'll be necessary.


We have TURN servers set up for NAT traversal...but they don't terminate ssl like an SFU does.


That's great to hear. Having battled webrtc in the past it sounds like your team is doing a great job!


I don't think anyone ever truly wins the war against webrtc


I feel in the thread from feross lie some synergies. https://news.ycombinator.com/item?id=23408831


We actually use simple-peer. feross is great, suggest having a look at his repos.


So I'd have this running and then anyone from work can just start talking to me and I'd hear it straight away?

That sounds like a nightmare to be honest. How will you ever reach any level of concentration this way? Even the thought that someone could just start talking to you would probably ruin your chances at getting in any sort of flow.

Also, it feels like only a small step away from a form of work surveillance. "Where were you? I was talking to you on Squawk".

Unless my assumptions of what this is are completely wrong. There isn't much to go on on the page.


Think of it this way: if you are sitting in the office, then anyone can come talk to you in your cube. I think it can be a great app if you keep it running during fixed hours, like 3 hrs in the afternoon when anyone can come and ping. The rest of the time you don't have to keep it on.


Think of it this way: most knowledge workers are finally free from being constantly interrupted by people who just feel like stopping by to chat, or don't really know or care that you're focusing on something. Now, there's a tool to bring that back!

<sarcasm>It's only like 15 minutes to get back on task after every single interruption. That won't hamper productivity at all. Much more important for me to get instant answers and boredom reduction whenever I feel like it. </sarcasm>


Think of it this way, some enjoy chatting with people to take a break. Some enjoy posting comments on HN with <sarcasm> :)


I get that. I have plenty of time where I zone out of Squawk. But there are definitely projects/periods where I need to constantly be checking in with a dozen people and various subsets of that dozen.

If that's not a problem you face, then Squawk is definitely the wrong tool for you.


> If that's not a problem you face, then Squawk is definitely the wrong tool for you.

Either that, or you are solving 'that problem' the wrong way.


Yeah don't use it if it's not right for you. I could see how this could be extremely useful in the right scenario. ( I will not be showing this to my boss).


Bingo.


Hey - the landing page is definitely in need of love.

Part of this is cultural (you wouldn't constantly shout at your colleague in person across the desk) but there are per-group and global mute buttons when you need peace and quiet. Everyone else can see that you're muted.


Hi all - open beta of a tool we created for ourselves during lockdown. We got sick of trying to replicate the effortless comms we had in the office, and hated managing a half dozen always-on Zoom/Slack/etc calls. So we built Squawk.


Are you planning to open-source this? Will it become a paid product? Is it P2P or are you hosting it?


We're not planning to open source it. Some features down the road will become paid but the free version will always contain at the very least what you see today.

The heavy lifting is done P2P falling back to our TURN servers if NAT traversal is necessary.


That's good to know; you should put that information on the web site. Especially the part about the business model.

Another thing I'd like to know before considering downloading the app is system requirements. A lot of people aren't on the latest version of Windows/MacOS yet.


This seems cool. I'm curious -- did you try out TeamSpeak before building a custom tool?


I looked into TeamSpeak when searching for a multi-channel PTT tool a few weeks ago and couldn't find a way to listen to multiple channels and talk/reply to each channel individually. Does TeamSpeak support that?

I ended up having to compile from master what was a very recently merged feature to Mumble: https://github.com/mumble-voip/mumble/pull/4011

EDIT: typo


When I saw the capitalized Teams in the heading, I was excited for a minute, because this would be a great add-on for Microsoft Teams. Alas it appears to be 'yet another thing to install and run in the background'.

Any plans to provide the functionality as an add-on for other communications apps?


Hey - we're looking at integrations with other apps, but the walled gardens make that difficult unfortunately.

I hear the frustration of downloading another thing - I've changed the landing page to make clear you can access in your browser at https://app.squawk.to


Suggestion: volume normalization, compression (as in audio effect, not as in zip). A lot of people have shitty mics or change their distance.
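
To make the two terms concrete, here is a toy sketch in Python. It is an illustration only, not anything Squawk does: the names `peak_normalize` and `compress` and the default values are invented, and a real compressor would work on levels in dB with attack and release smoothing, which this leaves out.

```python
from typing import List


def peak_normalize(samples: List[float], target_peak: float = 0.9) -> List[float]:
    """Scale the whole clip so its loudest sample lands at target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence stays silence
    scale = target_peak / peak
    return [s * scale for s in samples]


def compress(samples: List[float], threshold: float = 0.5, ratio: float = 4.0) -> List[float]:
    """Static hard-knee compressor on linear amplitude.

    Anything above the threshold is reduced so that the overshoot is
    divided by the ratio; quieter samples pass through untouched.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            level = threshold + (level - threshold) / ratio
        out.append(level if s >= 0 else -level)
    return out
```

Normalization fixes a mic that is quiet all the time, while compression evens out someone who keeps moving closer to and further from the mic.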


See also Zello.

It doesn't have the e2e stuff afaik, but works on all major platforms, including mobile. They also have a native desktop app (not Electron). As far as I know, it's used a lot, mostly by drivers for Uber or similar services. I haven't used it for a couple of years now, but it's worth giving it a go.


FYI Squawk team, there seems to be an issue with your main email. I tried to contact you and got this back:

> Hello,

> We're writing to let you know that the group you tried to contact (hello) may not exist, or you may not have permission to post messages to the group. A few more details on why you weren't able to post:

> * You might have spelled or formatted the group name incorrectly.

> * The owner of the group may have removed this group.

> * You may need to join the group before receiving permission to post.

> * This group may not be open to posting.

Looks like a Google Apps group configuration issue.

Related support article:

https://support.google.com/a/zumaltd.com/bin/topic.py?topic=...


Thanks for that. G Suite groups are a pain, I'll look into it. Feel free to email me directly at chase@zumaltd.com


I wonder if a "dispatcher" model might be interesting for this, like an old style minicab/police service. Then people could use squawk-scanners for the channels they want to listen into.

It could get annoying if it's on all day. SOMA FM provides an interesting police scanner with audio mix, which is an oddly relaxing soundtrack: https://somafm.com/scanner/

Are you going to add some jargon, the stuff that made CB radio kooky back in the day?


See also http://websdr.org for ham radio listening.


Cool! Might give this a spin for our live-ops response team.

You should fix your meta description tag on the landing page :) I just pasted it into slack and got a...less than useful preview.


On it! Thanks


I really like the idea. Push to talk is a proven concept when a lot of people work together and should stay informed (e.g. fire fighters). I see a huge potential for it...


Pricing?

Also: similar (from yesterday I think) https://www.walkie.chat/


Free


Thanks!


This looks really good.

Is it possible to mark yourself as away? I could see it might encourage the expectation to always be present on the receiving end.


If you mute yourself, everyone else can see you muted. But statuses are on one of our upcoming sprints.


> mute yourself

The homepage describes it as push-to-talk; is muting yourself distinct from not pushing?


Yes, muting is on the receiving side: when you mute a group (or global mute) you won't hear anyone, and they can all see that you've muted and aren't there.

The push-to-talk is on the sending side. So when you're not pushing, your mic is muted. And when you click the mute button, your speakers are muted.
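
As a rough mental model of the two sides, here is a small sketch in Python. It is not Squawk's code: the `Client` class and its field and method names are hypothetical, and it only models the behaviour described above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Client:
    global_mute: bool = False                               # receiving side, all groups
    muted_groups: Set[str] = field(default_factory=set)     # receiving side, per group
    talking_to: Optional[str] = None                        # sending side: group whose button is held

    def press(self, group: str) -> None:
        """Hold the push-to-talk button for one group."""
        self.talking_to = group

    def release(self) -> None:
        """Let go of the button; the mic is muted again."""
        self.talking_to = None

    def sends_audio_to(self, group: str) -> bool:
        return self.talking_to == group

    def hears(self, group: str) -> bool:
        return not self.global_mute and group not in self.muted_groups
```

In this model, not pushing only affects what you send, and the mute buttons only affect what you hear.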


Oh I see, muting a group rather than yourself.


Reminds me of Pragli [1] but without the animated avatars.

How sustainable is this project? I see it's not open sourced, but there is no pricing page either?

Also, how are the connections made, P2P or through a central intermediate server?

[1]: https://news.ycombinator.com/item?id=22134329


Good question: there's no pricing page at the moment because we'll always offer the current functionality for free. We have some enterprise features (freemium) that we're planning, but we also won't charge for those until lockdowns are lifted.

Squawk is webrtc based (audio and data) so it's mostly p2p (except for tracking the swarm).


Feedback: Where's the video demo? I don't want to install it to see it in action and decide if I want to install it.


Much appreciated. That's highest priority for me. Misstep on our part.


What happens when you have incoming audio from two different groups?

Can you select a particular group to use PTT?


This is a tricky problem, and we have a couple of ideas. But currently if we're having a longer chat (double-clicking will latch the mic open) then we mute the other groups. Anyone else in the other groups will be able to see that you're muted.

We're contemplating auto-muting other groups when you have incoming audio. Would love to hear your thoughts as well.


When I played milsim games like Arma, a useful feature was to have my "squad channel" in my left ear, "command channel" in my right ear, and separate keys to PTT for each channel.
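
For what it's worth, the split-ear part is simple to sketch. This is a toy Python illustration, not how Arma or any radio mod does it: `split_ear_mix` is a made-up name and the channels are assumed to be plain lists of mono samples.

```python
from itertools import zip_longest
from typing import Iterable, List, Tuple


def split_ear_mix(left_channel: Iterable[float],
                  right_channel: Iterable[float]) -> List[Tuple[float, float]]:
    """Route one mono feed to the left ear and another to the right.

    Returns stereo frames as (left, right) pairs; the shorter feed is
    padded with silence.
    """
    return [(l, r) for l, r in zip_longest(left_channel, right_channel, fillvalue=0.0)]
```

For example, `split_ear_mix(squad, command)` would put the squad feed entirely in the left ear and the command feed entirely in the right.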


Did you find anything with those features already? I have a very similar scenario and I'm currently exploring what is available.


Just a heads up, but your header description appears to be from the template you used.

"Aptonic app landing page template helps you easily create websites for mobile apps, product landing, promotion and many more."

I linked to your landing page in Slack and that is what showed up


Is there a video of it in action?


There's not at the moment. I will sort one out in the coming days (too late for our HN moment unfortunately). Apparently we're not the best marketers.

But it's just like a walkie talkie: press the group you want to talk to, and everyone hears you.


Awesome! Great idea - if you are looking for feature requests, it would be great if you could also create only bi-directional channels on demand. Basically like a Star Trek communicator...


Thanks! Definitely looking for feature requests. I'm not sure what you mean. Do you mean quick/throwaway groups?


Not necessarily a feature request, but a request regarding the website, possibly a nitpicky request at that.

At the top of the page, under "Push-to-Talk Collaboration", it says "Squawk delivers instant team chat".

That line specifically confused me heavily. That line makes it sound like this is the feature your product is delivering. But MS Teams already has instant team chat functionality. What does this mean then?

Aside from this, I love the idea and the implementation of your product. Haven't had a chance to try it out yet, but I definitely have a strong urge to do so now.

EDIT: Looks like I was misled by the title of the HN submission. I thought this was a walkie-talkie plug-in for MS Teams (mostly because of "Teams" in the title being capitalized), not a standalone product (which, it turns out, it is). Please ignore my original request regarding the website. Still excited to give your product a try, probably even more now, after finding out it is a standalone product.


I just meant more - 1<>1 channels. So I could say: "Squawk Dave" and it instantly creates me a 1<>1 channel with Dave if he’s available.

Also just noticed the second download link lower down the page doesn’t work.


Ah yeah that's on our todo list. At the moment, we tend to create groups like "Person A / Person B" but it's not ideal. The plan is that you can also Squawk anyone inside a group 1-on-1. Screen-sharing is next on the list.

Download link fixed...thanks!


Cool, yep that also works. Another awesome thing would be the option to run a speech to text algo on the conversation and have it transcribed into the channel log so if I miss the conversation I can catch up on things. But also you would need an ‘off-the-record’ Mode if you wanted to complain in private.


Another thought: maybe you could also make this a Slack plugin. Would reduce the onboarding friction.


Great idea! Switching the model from idling in one channel to being in multiple groups at the same time is a good idea.

Do you have plans on releasing a binary for Linux?


Thanks!

Yeah we definitely will release linux (it's electron based so no reason to leave linux out). It works better as an app but you can also access it in browser at https://app.squawk.to


+1 for Linux support. This is a great idea we would like to test as a confined/work-from-home team, but most of my colleagues (including myself) are using Linux.


How is this any different from a local Mumble server?


How does it compare against Mumble or TeamSpeak?


Mumble is entirely Free Software and self-hostable.


TeamSpeak and Discord are oriented towards staying in one channel for longer periods of time which we found cumbersome. Squawk is more conducive to having many groups that you can switch between seamlessly: just push-and-hold to talk to a different group.

Technically speaking, probably very similar: Squawk is webrtc based and fully end-to-end encrypted.


Hey @zumachase - was hoping to get in touch but the contact email on your homepage is bouncing. (hello@zumaltd.com)


Interesting app for the "corridor conversations" that have been lost with everyone working from home. It definitely needs integration with other communications apps, as it's important to know when to switch context from another medium to PTT. Seems like it becomes a feature for Slack, for Teams.


Typically, it's switching from PTT into another context: we often start short conversations, and then realize there's more work to be done and we slack the results of that to each other.


Why not mumble?


I look forward to an AppImage for Linux.


Definitely on the todo list. In the meantime - https://app.squawk.to


Isn't Flatpak preferable?


I prefer AppImage because it has no requirements on the host other than basic libraries like libc.


>... end-to-end encrypted...

How do you verify that you are connected to the person you think you are connected to?


There's a handshake before you accept a connection to anyone. Each peer generates a keypair and sends the public key to our servers (which they're authed with). On connection, peers receive the public key from the Squawk servers, and perform a handshake to verify their identity. This all happens p2p.


What happens on failure?


A failure would indicate some sort of malicious actor, so the connection is logged and rejected.


How does this compare against Zello?


Website needs javascript to even display something -> ignored.


How does this compare to something like teamspeak?


This is like being in many teamspeak channels at once. It's also built on open standards like webrtc so you could easily integrate into a squawk swarm.


Seems more like Discord voice channels. Which I guess is analogous to being in multiple Teamspeak servers at once.


I love it! Any plans to integrate it with Slack?


Definitely in our plans


Great, I'll stay tuned!


Finally, something better than Zoom.


We built this specifically because of Zoom fatigue.


I hate seeing the faces of my coworkers on my screen all day. This will make a huge difference.


looks good!



