Crazy Feature request: One of the things I've wanted for years out of a software-based walkie-talkie like this is manual audio spatial diversity control (a.k.a. 3D audio). That way I can have multiple simultaneous channels, but be able to change the position (left/right/forward/45deg/etc) and the volume of each feed so I can have situational awareness with all channels while being able to use my brain to separate and focus on a particular channel.
These kinds of systems are often marketed to dispatch agencies, mission control, government/military customers, etc, and are super expensive[0]. I've never seen a consumer-grade version of 3D audio like this, but that would be super beneficial to ad-hoc communicators in disasters, scanner enthusiasts, public event coordinators, county/city-level EOCs that can't afford expensive systems.
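For anyone who wants to experiment, the per-feed position and volume controls described above come down to one pair of gains per channel. Here is a minimal Python sketch, assuming plain equal-power stereo panning (left/right only, no front/back or HRTF cues) and mono feeds given as lists of float samples. The names `pan_gains` and `mix`, and the feed names in the usage note, are made up for illustration and are not from any of the products mentioned.

```python
import math


def pan_gains(azimuth_deg: float, volume: float = 1.0) -> tuple[float, float]:
    """Return (left, right) gains for a feed placed at azimuth_deg.

    -90 is hard left, 0 is straight ahead, +90 is hard right.
    Equal-power law: left**2 + right**2 == volume**2 at every angle.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    theta = (clamped + 90.0) / 180.0 * (math.pi / 2)  # maps to 0..pi/2
    return volume * math.cos(theta), volume * math.sin(theta)


def mix(
    feeds: dict[str, list[float]],
    layout: dict[str, tuple[float, float]],
) -> list[tuple[float, float]]:
    """Mix mono feeds into stereo frames.

    layout maps each feed name to (azimuth_deg, volume).
    """
    length = max((len(samples) for samples in feeds.values()), default=0)
    frames = [[0.0, 0.0] for _ in range(length)]
    for name, samples in feeds.items():
        left, right = pan_gains(*layout[name])
        for i, sample in enumerate(samples):
            frames[i][0] += sample * left
            frames[i][1] += sample * right
    return [(left, right) for left, right in frames]
```

With a layout like `{"dispatch": (-90, 1.0), "fire": (90, 0.5)}`, the first feed lands entirely in the left ear at full volume and the second entirely in the right ear at half volume.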
Aha! I'm a huge fan of positional voice audio in video games. Mumble has a positional audio feature[0] that is relatively straightforward to integrate via mods. I built a mod[1] recently that adds support to Raft, and I attempted to list some of the benefits in the readme. In my experience, it completely changes the feeling of talking with people online; they become their avatars. I'd love to see it come to discord or zoom, and be lifted out of the video game space.
I haven't looked at Mumble's actual implementation, but I imagine it's possible with the web audio APIs.
Edit: in some ways, this might be totally off-base since it does involve a game and participant self-arrangement. Hopefully it's somewhat relevant, though!
I could be wrong, but based on the README this looks like it just sets speaker volume based on position - which is neat, but is not quite positional audio.
Oh interesting, it looks like you're right! They do appear to be using a StereoPannerNode rather than the full 3D PannerNode[0], but that is more than just speaker attenuation.
That's an old commit. It's had spatialized audio for the last month, at least.
I'm actually going to put a feature back in to let you select the audio "spatialization" model, because some folks have had trouble with PannerNode not working right. I used to have a selection between volume scaling, StereoPannerNode, and PannerNode. Been a little too busy at the day job to do it lately.
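To make the difference between those models concrete, here is a rough Python sketch of two of them. It assumes the inverse distance law in the form the Web Audio spec gives for PannerNode, and the equal-power formula StereoPannerNode uses for mono input. The `spatialize` selector and the mapping from position to pan value are my own assumptions, not the project's code, and the full HRTF PannerNode is left out because it can't be reduced to two gains.

```python
import math


def inverse_distance_gain(distance: float, ref_distance: float = 1.0,
                          rolloff: float = 1.0) -> float:
    """Inverse distance attenuation: full volume up to ref_distance, then falling off."""
    d = max(distance, ref_distance)
    return ref_distance / (ref_distance + rolloff * (d - ref_distance))


def stereo_pan_gains(pan: float) -> tuple[float, float]:
    """Equal-power (left, right) gains for pan in [-1, 1], mono input."""
    x = (max(-1.0, min(1.0, pan)) + 1.0) / 2.0
    return math.cos(x * math.pi / 2), math.sin(x * math.pi / 2)


def spatialize(model: str, dx: float, dz: float) -> tuple[float, float]:
    """(left, right) gains for a source dx to the right and dz in front of the listener."""
    distance = math.hypot(dx, dz)
    if model == "volume":
        # Radial effect only: both ears get the same distance-based gain.
        gain = inverse_distance_gain(distance)
        return gain, gain
    if model == "stereo":
        # Angular effect only. Assumption: pan is the sine of the azimuth.
        pan = dx / distance if distance else 0.0
        return stereo_pan_gains(pan)
    raise ValueError(f"unknown model: {model}")
```

The "volume" branch is the case where you get a radial effect but no angular one: a source 5 units away is quieter, but sounds the same whether it is to your left or your right.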
The now defunct Dolby Axon had this feature for up to 50 people. It was great, you could drag people in a grid around you (see picture in [1]). I’m still sad they discontinued it, it’s one of the few voice chat services that I would gladly pay money for.
Currently re-writing it to be actually decent code. Basically, it takes up streams on the page and arranges speakers in a semi-circle in front of you, letting you adjust the radius (I've found for particular numbers of speakers there's an optimal radius for speaker separation).
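In case it helps picture the layout, here is a small Python sketch of the semi-circle idea. It assumes the listener sits at the origin facing +z and that speakers are spread evenly across the front half-plane, with nobody at hard left or hard right. The function names are hypothetical and this is not the rewritten code itself.

```python
import math


def semicircle_positions(count: int, radius: float = 1.0) -> list[tuple[float, float]]:
    """Place `count` speakers evenly on a half circle in front of the listener.

    Returns (x, z) pairs: x is right (+) or left (-), z is distance in front.
    """
    step = math.pi / (count + 1)  # angular gap between neighbours
    positions = []
    for i in range(count):
        azimuth = -math.pi / 2 + step * (i + 1)  # strictly between -90 and +90 degrees
        positions.append((radius * math.sin(azimuth), radius * math.cos(azimuth)))
    return positions


def neighbour_spacing(count: int, radius: float = 1.0) -> float:
    """Straight-line distance between two adjacent speakers on the arc."""
    step = math.pi / (count + 1)
    return 2 * radius * math.sin(step / 2)
```

`neighbour_spacing` grows linearly with the radius and shrinks as more speakers join, which is one way to see why the radius that separates voices well would change with the number of speakers.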
It's at the other end of the spectrum, but Remotion is really cool and does what you describe. We wanted to get away from always on video, but I know lots of people like that. I cannot say enough good things about what the Remotion team are doing.
I have some multi-tenant PBXs based on Freeswitch; however, I have not seen a positional audio feature in Freeswitch. Can you give more insight? Your link is the conference module and unrelated, I believe.
Aureal A3D. Technology was developed in the nineties for NASA. Stolen by Creative. Creative simultaneously stole their patented tech and sued Aureal for bogus patent infringement. Aureal "won" the Lawsuit, but ran out of money due to legal fees. Creative bought their assets, IP and funny declaration releasing them from any damages from lost patent lawsuit.
Patents long expired, nowadays smallest $2 microcontroller can run it in real time.
I know it's not really what you are asking for, but it's some very cool spatial audio work, letting you separate out where folks are in a virtual space.
[disclaimer, I know some of the folks working on it]
I tried out highfidelity a couple weeks ago and didn't get any of the directional audio to work. There was a radial but not an angular effect if that makes sense.
I wish podcast producers would do this (or more of them at least in case anyone knows of one). I’ve done it manually as an experiment and it definitely improves clarity in cases where there are 4 or more drunk folks talking over each other.
Hooray! I have been holding out hope for more push-to-talk apps on desktop platforms. It's such a step up from always-on listening. As an audio engineer, I grit my teeth every time there is microphone feedback, echo, or other audio issues on group calls, and these generally don't occur on PTT apps because everyone tends to be muted most of the time. And if it does start happening, the user more easily realizes what they did wrong because they correlate it with the button press. In fact, even if the user does not realize what they did wrong, others on the call can identify the source because "it's only happening when Jane speaks." I can't tell you how many times I've been on a call with feedback where the person who is causing it is complaining about it to the group without realizing it's their own fault and others on the call don't necessarily know, either. This never happens on PTT calls. There's also the obvious privacy benefits, as it avoids the issue where people speak without realizing they are unmuted. PTT is better in nearly every respect. I'm very excited about this.
I do have some questions that the website doesn't answer for me:
1. What about screen sharing? Seeing the word "collaboration" implies to me that I should be able to do so, but it's unclear. In the screenshot of the app, I see an icon or two in the right-hand sidebar that might be relevant to this, but they seem kind of generic.
2. What about mobile devices? I routinely do Slack calls where one or two people are on their phone for various reasons. It would be useful to know if that is supported or will be at some point.
3. I want to know more about the encryption. As you're probably aware, there has been a lot of controversy over the security of Zoom. In particular, there is an ongoing lawsuit related to their false claims of end-to-end encryption. [1] I think any new product, especially a chat app, that claims to be end-to-end encrypted really needs to show us the details of its protocol and stack, and ideally open source as many parts of that as possible. Does it use the Signal protocol? The site says "Squawk groups are invite-only and end-to-end encrypted." But which parts are E2E encrypted? The group membership? The message content? The message metadata? Everything?
Lastly, it would be great if you could add the app to Homebrew Cask, as it's my preferred way to download and manage apps on macOS.
Glad to see we're not the only PTT fans out there.
1. Screen sharing is next on our list, it's the thing that we want most ourselves.
2. This is definitely not mobile optimized. It does work-ish on mobile phones, but it maintains long-lived webrtc connections so it's not ideal (we ensure these are not transmitting when muted, but we have a keep alive protocol which ensures they don't die, and would be harsh on mobile batteries).
3. Squawk uses webrtc, which is e2e encrypted by default. Additionally, we don't use any SFUs so we never have the audio unencrypted. All link negotiation and audio transmission are done entirely p2p and thus completely e2e encrypted.
So I'd have this running and then anyone from work can just start talking to me and I'd hear it straight away?
That sounds like a nightmare to be honest. How will you ever reach any level of concentration this way? Even the thought that someone could just start talking to you would probably ruin your chances at getting in any sort of flow.
Also, it feels like only a small step away from a form of work surveillance. "Where were you? I was talking to you on Squawk".
Unless my assumptions of what this is are completely wrong. There isn't much to go on on the page.
Think of it this way: if you are sitting in the office, then anyone can come talk to you in your cube. I think it can be a great app if you keep it running during fixed hours, like 3 hrs in the afternoon when anyone can come and ping. The rest of the time you don't have to keep it on.
think of it this way, most knowledge workers are finally free from being constantly interrupted by people who just feel like stopping by to chat, or don't really know or care that you're focusing on something. Now, there's a tool to bring that back!
<sarcasm>It's only like 15 minutes to get back on task after every single interruption. That won't hamper productivity at all. Much more important for me to get instant answers and boredom reduction whenever I feel like it.</sarcasm>
I get that. I have plenty of time where I zone out of Squawk. But there are definitely projects/periods where I need to constantly be checking in with a dozen people and various subsets of that dozen.
If that's not a problem you face, then Squawk is definitely the wrong tool for you.
Yeah don't use it if it's not right for you. I could see how this could be extremely useful in the right scenario. ( I will not be showing this to my boss).
Hey - the landing page is definitely in need of love.
Part of this is cultural (you wouldn't constantly shout at your colleague in person across the desk) but there are per-group and global mute buttons when you need peace and quiet. Everyone else can see that you're muted.
Hi all - open beta of a tool we created for ourselves during lockdown. We got sick of trying to replicate the effortless comms we had in the office, and hated managing a half dozen always-on Zoom/Slack/etc calls. So we built Squawk.
We're not planning to open source it. Some features down the road will become paid but the free version will always contain at the very least what you see today.
The heavy lifting is done P2P falling back to our TURN servers if NAT traversal is necessary.
That's good to know; you should put that information on the web site. Especially the part about the business model.
Another thing I'd like to know before considering downloading the app is system requirements. A lot of people aren't on the latest version of Windows/MacOS yet.
I looked into TeamSpeak when searching for a multi-channel PTT tool a few weeks ago and couldn't find a way to listen to multiple channels and talk/reply to each channel individually. Does TeamSpeak support that?
When I saw the capitalized Teams in the heading, I was excited for a minute, because this would be a great add-on for Microsoft Teams. Alas it appears to be 'yet another thing to install and run in the background'.
Any plans to provide the functionality as an add-on for other communications apps?
Hey - we're looking at integrations with other apps, but the walled gardens make that difficult unfortunately.
I hear the frustration of downloading another thing - I've changed the landing page to make clear you can access in your browser at https://app.squawk.to
It doesn't have the e2e stuff afaik, but works on all major platforms, including mobile. They also have a native desktop app (not Electron). As far as I know, it's used a lot, mostly by drivers for Uber or similar services. I haven't used it for a couple of years now, but it's worth giving it a go.
FYI Squawk team, there seems to be an issue with your main email. I tried to contact you and got this back:
> Hello,
> We're writing to let you know that the group you tried to contact (hello) may not exist, or you may not have permission to post messages to the group. A few more details on why you weren't able to post:
> * You might have spelled or formatted the group name incorrectly.
> * The owner of the group may have removed this group.
> * You may need to join the group before receiving permission to post.
I wonder if a "dispatcher" model might be interesting for this, like an old style minicab/police service. Then people could use squawk-scanners for the channels they want to listen into.
It could get annoying if it's on all day. SOMA FM provides an interesting police scanner with an audio mix, which is an oddly relaxing soundtrack.
https://somafm.com/scanner/
Are you going to add some jargon, the stuff that made CB radio kooky back in the day?
I really like the idea. Push to talk is a proven concept when a lot of people work together and should stay informed (e.g. fire fighters). I see a huge potential for it...
Yes, muting is on the receiving side: when you mute a group (or global mute) you won't hear anyone, and they can all see that you've muted and aren't there.
The push-to-talk is on the sending side. So when you're not pushing, your mic is muted. And when you click the mute button, your speakers are muted.
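A tiny Python model of the two sides described above, as I read them: push-to-talk gates what you send, while mute gates what you hear and is visible to others. The class and method names are invented for illustration, and it assumes mute and push-to-talk are fully independent, which the description does not confirm.

```python
from dataclasses import dataclass, field


@dataclass
class ClientState:
    ptt_held: set[str] = field(default_factory=set)      # groups whose talk button is down
    muted_groups: set[str] = field(default_factory=set)  # per-group mute
    global_mute: bool = False

    def sends_audio_to(self, group: str) -> bool:
        """Sending side: the mic is only live while the group's button is held."""
        return group in self.ptt_held

    def plays_audio_from(self, group: str) -> bool:
        """Receiving side: a group mute or the global mute silences the speakers."""
        return not self.global_mute and group not in self.muted_groups

    def shown_as_muted_in(self, group: str) -> bool:
        """What other members see: muted means you are not listening."""
        return not self.plays_audio_from(group)
```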
Good question: there's no pricing page at the moment because we'll always offer the current functionality for free. We have some enterprise features (freemium) that we're planning, but we also won't charge for those until lockdowns are lifted.
Squawk is webrtc based (audio and data) so it's mostly p2p (except for tracking the swarm).
This is a tricky problem, and we have a couple of ideas. Currently, if we're having a longer chat (double-clicking latches the mic open), we mute the other groups. Anyone in the other groups will be able to see that you're muted.
We're contemplating auto-muting other groups when you have incoming audio. Would love to hear your thoughts as well.
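Here is how the latch behaviour might look as a small state machine, sketched in Python. Everything specific is an assumption on my part: the 0.3 second double-click window, that a single click on a latched group releases it, and that releasing un-mutes every group (which ignores any groups you had muted by hand). `TalkController` is a made-up name, not Squawk's code.

```python
class TalkController:
    def __init__(self, groups, double_click_window: float = 0.3):
        self.groups = set(groups)
        self.window = double_click_window  # seconds, assumed value
        self.latched = None                # group whose mic is latched open
        self.muted = set()                 # groups muted because of the latch
        self._last_click = {}

    def click(self, group: str, now: float) -> None:
        """Register a click on a group's talk button at time `now` (seconds)."""
        if self.latched == group:
            # Clicking the latched group releases the mic and un-mutes the rest.
            self.latched = None
            self.muted.clear()
            self._last_click.pop(group, None)
            return
        previous = self._last_click.get(group)
        if previous is not None and now - previous <= self.window:
            # Double click: latch this group open and mute every other group.
            self.latched = group
            self.muted = self.groups - {group}
            self._last_click.pop(group, None)
        else:
            self._last_click[group] = now
```

The auto-mute-on-incoming-audio idea would presumably fill the same `muted` set from a different trigger, but that is not modelled here.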
When I played milsim games like Arma, a useful feature was to have my "squad channel" in my left ear, "command channel" in my right ear, and separate keys to PTT for each channel.
Awesome! Great idea - if you are looking for feature requests, it would be great if you could also create only bi-directional channels on demand. Basically like a Star Trek communicator...
Not necessarily a feature request, but a request regarding the website, possibly a nitpicky request at that.
At the top of the page, under "Push-to-Talk Collaboration", it says "Squawk delivers instant team chat".
That line specifically confused me heavily. That line makes it sound like this is the feature your product is delivering. But MS Teams already has instant team chat functionality. What does this mean then?
Aside from this, I love the idea and the implementation of your product. Haven't had a chance to try it out yet, but I definitely have a strong urge to do so now.
EDIT: Looks like I was misled by the title of the HN submission. I thought this was a walkie-talkie plug-in for MS Teams (mostly because of "Teams" in the title being capitalized), not a standalone product (which, it turns out, it is). Please ignore my original request regarding the website. Still excited to give your product a try, probably even more now, after finding out it is a standalone product.
Ah yeah that's on our todo list. At the moment, we tend to create groups like "Person A / Person B" but it's not ideal. The plan is that you can also Squawk anyone inside a group 1-on-1. Screen-sharing is next on the list.
Cool, yep that also works. Another awesome thing would be the option to run a speech to text algo on the conversation and have it transcribed into the channel log so if I miss the conversation I can catch up on things. But also you would need an ‘off-the-record’ Mode if you wanted to complain in private.
Yeah we definitely will release linux (it's electron based so no reason to leave linux out). It works better as an app but you can also access it in browser at https://app.squawk.to
+1 for Linux support. This is a great idea that we would like to test as a confined/work-from-home team, but most of my colleagues (including myself) are using Linux.
TeamSpeak and Discord are oriented towards staying in one channel for longer periods of time which we found cumbersome. Squawk is more conducive to having many groups that you can switch between seamlessly: just push-and-hold to talk to a different group.
Technically speaking, probably very similar: Squawk is webrtc based and fully end-to-end encrypted.
Interesting app for the "corridor conversations" that have been lost with everyone working from home. It definitely needs integration with other communications apps, as it's important to know when to switch context from another medium to PTT. Seems like it becomes a feature for Slack, for Teams.
Typically, it's switching from PTT into another context: we often start short conversations, and then realize there's more work to be done and we slack the results of that to each other.
There's a handshake before you accept a connection to anyone. Each peer generates a keypair and sends the public key to our servers (which they're authed with). On connection, peers receive the public key from the Squawk servers, and perform a handshake to verify their identity. This all happens p2p.
This is like being in many teamspeak channels at once. It's also built on open standards like webrtc so you could easily integrate into a squawk swarm.
[0]:https://www.thedrive.com/the-war-zone/24741/new-3d-audio-wil...