"On top of standard-grade performance anxiety, the "big face" image that Zoom uses by default in its "speaker view" can trigger a "fight-or-flight" surge of adrenaline, writes Jeremy Bailenson, founding director of Stanford's Human Computer Interaction Lab."
From the hyperlink in the article. This for me is the kicker. From the start to the end of a Zoom call I feel like I'm on a stage. When I am in an in-person meeting I feel like I am "hanging out". One drains my mental reserves, the other fills them up. Zoom just feels draining. They noted everything from the fact that you see yourself, with the real-time self-critique that invites, to the hissy sound and pixelation. After a year of it... I'm close to done.
This is interesting to hear. I have the opposite experience, I'm generally an anxious person and feel stressed going into meetings, especially with groups of new people. I've found that I'm much more relaxed on zoom calls than in person, I think because it feels like there is much more of a buffer between me and the people I'm talking to. Seeing faces on the screen does not provoke the same feeling as a real life meeting.
(Incidentally, I really don't like working remotely and want to get back to real meetings as soon as I can, even if it will make me more anxious in the moment.)
What this really means is that different people are different and good collaboration tools will allow people to customize their experience while providing some kind of shared infrastructure. Even in an in-person meeting, you have lots of different behaviors, the people who chat to each other, the people who are super focused, the people who take notes on their laptop, the people who sit farther back in the room, the people who arrive late or leave early, the people who distribute the agenda, etc.
I agree, seeing myself during a Zoom call is like looking in a mirror at myself during a meeting. I've never done that in person, and it would be very odd to see someone do it during a meeting. I've been trying out new approaches to tackling the "looking in the mirror" tendency I get while in a virtual call. So far, the simplest and most effective is just minimizing the Zoom call altogether and listening to everything. I've never needed to see myself during a meeting before, why do I need to now?
Interesting! I have noticed this is a thing for me as well. I have found that by disabling the display of my own video, I feel much more relaxed. I highly recommend you give this a try.
I suggested to the folks behind pop.com, which does not show your face to you, to implement some form of indirect feedback mechanism when your face is aligned and centered. I thought maybe a small face+camera icon that turns red/yellow/green depending on whether your face is not visible, sideways/non-centered, or centered and in focus. That way even if you don't see yourself, at least you will know that others can see you fine.
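A minimal sketch of that traffic-light idea, assuming some face detector already hands back a normalized bounding box. The `FaceBox` type, the `framing_status` function, and both thresholds are hypothetical names and values made up for illustration; this only covers "visible" and "centered", not focus or a sideways head pose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaceBox:
    """Hypothetical detector output, all values as fractions of the frame (0..1)."""
    cx: float     # horizontal centre of the face
    cy: float     # vertical centre of the face
    width: float  # face width relative to frame width


def framing_status(face: Optional[FaceBox],
                   tolerance: float = 0.15,
                   min_width: float = 0.2) -> str:
    """Return "red", "yellow" or "green" for the indicator icon."""
    if face is None:
        # No face detected at all: others cannot see you.
        return "red"
    off_centre = abs(face.cx - 0.5) > tolerance or abs(face.cy - 0.5) > tolerance
    too_small = face.width < min_width
    # Visible but badly framed is yellow; centred and large enough is green.
    return "yellow" if off_centre or too_small else "green"
```

The point of returning only three states is that the UI never has to show you your own video; the icon colour is the whole feedback channel.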
This should be trivial for the people at zoom to add. They already split the background and foreground in the video for displaying images that aren't the wall/messy room behind you.
One trick with some videoconferencing software is to make the window smaller and put it at the top of your screen (or wherever the camera is). The other face will be smaller and you will be more directly looking at the camera and hence making eye contact which makes you look more natural to whomever you’re calling.
I've done a couple of interviews via Codepen where you're watched over video while you come up with a solution, and it's much more difficult than a typical whiteboard interview.
I don't want to invalidate your experience at all, but I find exactly the opposite.
I do a lot of video pair programming, and I feel supremely comfortable screen/video sharing. Getting to work on my own machine with my own IDE and shell set-up, my own hardware in my comfortable home office, 10000% better than awkwardly scratching at a whiteboard speculating about what may or may not work if committed to code.
It doesn't make me anxious but it gives me exactly the same feeling I get when someone looks over my shoulder when I'm on LSD.
I haven't tried it for a big call yet, but Jitsi seems better. For just a chat Discord still reigns supreme for call-quality of the services I've tried, although it may be bad data because I'm usually playing a game or helping someone solve a problem rather than solely conversing.
This is why I rarely go on video to begin with (well, besides having my laptop in clamshell mode and not wanting a webcam literally pointing at my bed 24/7).
I'm a firm believer that video adds nothing to the call other than add distractions.
> I'm a firm believer that video adds nothing to the call other than add distractions.
I agree. And I would add that voice does not add anything either. It's better to do everything by chat or by mail, that at least leaves a usable trace.
I wonder if we can compare our Myers-Briggs personality categories ;-)
I’m the opposite. In-person meetings are extremely taxing while zoom meetings are very relaxing. I’d love to never have an in-person meeting ever again.
My experience with video conferencing is that it's mostly draining people's attention away, while being completely unnecessary. We switched from seeing each other's distorted, pixelated, and badly lit faces in a video conference back to plain old phone calls. Nobody missed the video, so problem solved.
But also, the problem is by no means new. Here's a great quote from 1996:
"And the videophonic stress was even worse if you were at all vain. I.e. if you worried at all about how you looked. As in to other people. Which all kidding aside who doesn’t."
- David Foster Wallace, Infinite Jest
Read his funny take on the rise and fall of video conferencing, and you'll be cured of the hype :)
I don't understand why people think of video conferencing as "video streaming yourself"? We almost always keep our cameras off during the meetings, but still video conference allows us to share the screen easily, which is immensely helpful (for IT people at least). Over a phone you can't copy&paste, you can't share screens, you can't draw on the screen, it's way more limiting
If I'm not seeing you I feel a disconnect. I have some colleagues who insist on keeping their camera off, and indeed it has a "focus on business" effect. I also have groups of people who all have their cameras on; with WFH and lockdown we see each other's kids and each other's emotions, and it creates a much more personal bond, which I highly appreciate. With camera-on now being the default I even have a much better and more relaxed relationship with my colleagues in the US (I'm in the Netherlands). I guess it depends on what you want to get out of your digital forms of contact.
Seconding this. So many people are adamant on video being a necessity or suitable substitute for in-person meetings, specifically the aspect of visually seeing people.
Yet, I find most of the things people visually do in person to be fluff that can be done without. Meanwhile, video-conferencing either requires an expensive setup (not everyone here works in the US, let alone FAANG) to really start emulating "in-person", or it forces people to self-police to a much larger degree, sitting in front of a webcam with no video or audio privacy (I guess this at least emulates the open office disaster). And it still misses the forest for the trees. Additionally, trying to shoehorn fun into work rarely works.
(Also, as far as I know, there is no way to stop looking at your own face in Teams besides turning off the cam. Looking at a mirror 24/7 isn't even close to reality).
For those of you who, like me, don't immediately get the reference, DFW probably refers to the novelist David Foster Wallace - at least the 12 years reference fits: https://en.wikipedia.org/wiki/DFW
Well, it's not just a matter of vanity; showing up at a board meeting in your pajamas is still not a thing, unfortunately. It's especially harsh on women because of all the ridiculous social expectations. Even when others don't really care, we're mentally conditioned to presume that the world is expecting us to look a certain way, and that creates a lot of stress - plus it's a total waste of time if you work from home.
Most faces of my colleagues are also not particularly pretty under the TL lighting of our office. My own face even seems to benefit quite poorly from proper lighting. Yet, somehow, I'm not really expecting #insta quality selfie images but (call me crazy) I like how real it is and the fact that I can read their emotions.
Sure, I check that I'm not showing overly dominant nostrils due to camera angles and don't have food anywhere on my face, but other than that, it's just how I look.
Super agree that new modes of conferencing software are needed -- I see similar ideas, "drifting locational chats" come up a lot, but it's never seemed to 'fix' things in my experience.
I'm personally an advocate for more asynchronous communication, for less video (I find video very tiring and time-consuming), and for more audio at work. I feel audio is the sweet spot of informal, low effort, and still more emotionally engaging than pure text. I believe this so much that I'm actually building a chat application, called heysync[1].
It's somewhere between slack and discord in terms of features, but intended to allow for asynchronous, conversational audio to fit directly into the way teams already chat. You speak, and it can play live to any teammate who's listening right then, but it also records the audio, and transcribes it inline with regular chat, so the message can be easily read or listened to in-context later, allowing for true asynchronous audio conversations without dedicated audio rooms. Of course, you can also just chat via text like you would with slack or IRC.
There is an early info site up at https://heysync.chat/ if anyone's interested in this idea :)
I was contemplating async audio earlier today, and thought it would be nice to be able to attach audio responses directly to different team conversations. That way you can compose and conduct different discussions in a branching tree structure like manner, but still have the warmth of voice.
Anyone who prefers text, could view a transcript or submit text.
To make desktop-work more collaborative, and fit different personalities / attitudes / work-modes, I'd like a multiuser desktop environment where everyone controls widgets, and each widget has one "host" user who can delegate.
"Inverted Slack / Zoom / IDE" ... turned inside out so they support attaching multiple simultaneous editors, and attaching apps to each other to form ad-hoc "desktops".
This has been done over the years, in different forms. And it is a lot of fun, speaking from personal experience. And the market need was not there, for a business to be built on this, or a standard, until perhaps now because we're forced to be separate for at least four more months.
Video conversations for the talkative, in a mutable / auto-transcribed, resizable / mutable widget. Polls for those who just "need a decision", where additional "poll options" can be added by others. Text chats for asynchronous discussions. App-scoped shared copy-paste-buffers. Live people Directories showing which apps your colleagues are "in" now, groups of which can be collected into ad-hoc dynamic work-desktops, to focus on a task.
All we need is better designers. Video chat apps today are extremely naive and try to get away with "what if we just dump everyone in the same room and hope it works". Zoom added "reactions", but for some reason decided to only show reactions on your own tile. Push-to-talk is barely supported. None of them have implemented the bare minimum feature of showing participants in the same order. None of them have support for dynamic breakout rooms.
I swear, the best video chat experience I had was from The Go Game. At no point were you dumped into a massive shared room. Instead, people can join others at-will into small "rooms" of up to six people, which is much more manageable. Then you can implement whatever games you want and let people decide what they want to do. No need for next-gen hardware, no fancy "VR spaces", just a different interface paradigm that actually scales.
> None of them have implemented the bare minimum feature of showing participants in the same order.
This is one of the things that bugs me the most. It makes it really difficult to have a shared conversation, such as going around the room answering a question.
Every. Time. I don't think they realize what a big deal that one small feature would be. How can video chat platforms boast about "shared spaces" when not even the participant order is shared?!
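For what it's worth, a shared order does not need much: if every client sorts the roster by the same key, everyone sees the same grid no matter in which order the join events reached them. A minimal sketch, assuming the server stamps each participant with a join time (the `Participant` type and `shared_order` function are hypothetical, not any vendor's API):

```python
from typing import List, NamedTuple


class Participant(NamedTuple):
    user_id: str
    name: str
    joined_at: float  # assumed: a server-assigned join timestamp


def shared_order(participants: List[Participant]) -> List[Participant]:
    """Sort by join time, breaking ties by user_id, so every client agrees."""
    return sorted(participants, key=lambda p: (p.joined_at, p.user_id))
```

With that in place, "let's go around the room" can simply mean reading the tiles left to right, top to bottom.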
Because it's not a feature that will show in a quick demo used to sell the product, which then locks in everyone who is supposed to cooperate using it, ensuring a high MAU metric that will drive business KPIs and company validation up.
It's because none of the video chat apps are competing on features, only on user lock-in. Their KPI is how many users they have, and that only needs enough features/price to convince the corporate buyer, who rarely accepts feedback from a larger set of employees.
As such, outside of certain older very expensive corporate videoconferencing setups that remember targeting inter-corporate meetings and the like, the actual feature set ossified. You get some things that are easy to demo in 1:1 call, you get locked down room systems that leave me weeping for Webex ffs, and you get omnipresent WebRTC in-browser setup that often murders your CPU unless you get lucky.
I find it really bad that the best video conference experience I ever had involved Cisco Webex hardware and Bluejeans, all interconnected with so-maligned SIP. And before that, the glorious days of wider ITU-T based video conferencing, like NetMeeting with its whiteboards and active screensharing.
Imagine if video conferencing solutions competed on top of common protocols, and you'd separately invest in infrastructure and clients? Instead of the rare enterprise cases where Lync/Teams/Webex/other SIP is set up on dedicated lines in a large corporation, make similar setups common everywhere (WebRTC/SIP bridges/transcoders on the edge, with dedicated bandwidth bought by the company, etc.?) with vendors competing on something more than "you have to use our stack because we don't interoperate with anyone else".
Sincerely, someone who has had enough keeping 3-5 video conferencing apps around.
>"what if we just dump everyone in the same room and hope it works"
Yep, it's pretty abysmal for class settings.
I know it's partly down to configuration/instructors but being shoved into a room by an instructor with 40 other people forced to stare into a webcam while watching everyone else doing the same is not an ideal environment.
There needs to be a lot more attention on how and when to select additional participants in the general case where there's only one or two priority speakers and the rest are basically just sitting around listening.
Now of course me personally I'd just rather not use a webcam at all (Or only when specifically called on), but often that's not a choice unfortunately.
Most new companies in this space have been using "AI" marketing to pitch a better world of video conferencing and it's been largely a failure. (unless you call getting bought and then shut down by Cisco a win)
But I 100% agree with you that this is a design problem to be solved first with better interfaces and workflows, not AI. Innovation is the sum of behavior change and AI is not going to be doing our knowledge work for us anytime soon. Given we're doing the work, we deserve better tools to do it ourselves.
I'm currently building grain.co that layers a new kind of video conferencing on top of Zoom, focused on brokering the recorded information gathered during sync calls into the async tools where knowledge lives and actual work is done after the convo ends.
Our approach is to make it easy to annotate in real-time, clip out those parts you want to save/share, and then push them into async tools where anyone can view them like it were text in a document or a message in Slack.
> [...] this is a design problem to be solved first with better interfaces and workflows, not AI
> [...] I'm currently building grain.co [...]
> (From website)
> Instantly record, transcribe, and highlight the best parts of your Zoom video calls.
EDIT: not a criticism, just made me think of a tangentially related thing:
This is so emblematic of how AI is a moving target. Once something becomes an everyday tool it's no longer AI?
Not so long ago, being able to transcribe speech from multiple people in natural conversation would be the hallmark of AI research.
You can on zoom (at least if you’re the host and the other is presenter). I use this all the time. It might even work if you’re not the host. I think it’s called Annotations- but the friction is that it’s a meta mode and IIRC you can’t do that while living in the main menu.
The advantage of push-to-talk is that you are muted by default (necessary for large groups), and the app knows when you want to talk because you have to push the talk button. That is extremely valuable information that you can use to build other features on top of. For example, there's no need for a "raise hand" button anymore, since pushing the talk button clearly signals your intent to speak. The UI can highlight your square, make it larger, whatever, which can make the whole experience a lot smoother.
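A rough sketch of that idea: the talk button is the only input, and everything else (mute state, who to highlight, who is next) falls out of it. The `PushToTalkRoom` class and its method names are hypothetical, just to show how little state is needed.

```python
from typing import Iterable, List, Optional


class PushToTalkRoom:
    """Everyone is muted unless they are currently holding the talk button."""

    def __init__(self, participants: Iterable[str]) -> None:
        self.participants = set(participants)
        self.talking: List[str] = []  # users holding the button, in press order

    def press(self, user: str) -> None:
        # Pressing the button doubles as "raise hand": it records intent to speak.
        if user in self.participants and user not in self.talking:
            self.talking.append(user)

    def release(self, user: str) -> None:
        if user in self.talking:
            self.talking.remove(user)

    def is_muted(self, user: str) -> bool:
        return user not in self.talking

    def highlighted(self) -> Optional[str]:
        # The UI could enlarge the tile of whoever pressed first.
        return self.talking[0] if self.talking else None
```

Because presses are kept in order, the same list could also drive a visible "who is next" queue without any extra button.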
My team has had success with using Gather.Town as well! The only downside is that it is a little unwieldy, in that you have to be careful where you are positioned so that you don't accidentally lose contact. The "rules" of what constitutes a shared space and when you have to rely on proximity, are also a little unclear.
They're configurable. In our Xmas party house the entire kitchen is one broadcast domain, like the kitchen at a party. But in the local pub we built for Friday evenings we have "booths" and you must sit down in the booth to talk to people in that booth. By default it's just proximity, so like a real pub there were always two or three girls in the toilets having presumably a private chat.
So you do need somebody technical to make the space work how you need, and if your users are all non-technical you probably only want one type of rule everywhere or it'll confuse them.
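To make the two kinds of rule concrete, here is a small sketch of how I think about them, not Gather.Town's actual code: a named zone (the kitchen, a booth) connects everyone inside it and nobody else, and outside any zone it falls back to plain proximity. The `Avatar` type, `can_hear`, and the radius value are all made up for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Avatar:
    name: str
    x: float
    y: float
    zone: Optional[str] = None  # e.g. "kitchen" or "booth-1"; None means open floor


def can_hear(a: Avatar, b: Avatar, radius: float = 3.0) -> bool:
    """Zone rule if either avatar is in a zone, otherwise the proximity rule."""
    if a.zone is not None or b.zone is not None:
        # A zone behaves like one broadcast domain: same zone or nothing.
        return a.zone == b.zone
    return math.hypot(a.x - b.x, a.y - b.y) <= radius
```

Mixing both rules in one map is exactly where non-technical users can get confused: standing right next to a booth does not connect you to the people in it.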
It's down to the design of the space. We made our own, and "Private Spaces" were visually distinguished by "carpets" ... it made a huge difference. The rules for shared space vs proximity are very clear, but the design of the graphics need to be clear and consistent.
Gather.Town is interesting, but it has its own issues. We just used it last Friday for a company offsite, and these issues killed the experience for myself and a few others:
1. You don’t see who is speaking when there are many people present— you have to scroll through participants to find the person talking. Lots of new employees, I didn’t know who was talking about half the time
2. Connection issues. I couldn’t hear the speaker but others could, for extended periods of time. Missed entire presentations due to this unfortunately, as I couldn’t hear or have any means of connecting to the speaker.
Don’t get me wrong it’s fun software, but it’s not ready to be your rock solid all-hands replacement quite yet. Had the pleasure of experiencing Welcome at one of the YC networking events, and that worked really well, curious to hear if they’re interested in filling this gap
> Connection issues. I couldn’t hear the speaker but others could, for extended periods of time. Missed entire presentations due to this ...
Was the speaker at a podium? Were they spotlighted? Were you close to them?
There are several reasons why you might not have been able to connect that are part of the design. We found initially people were confused by the connection rules, but once they got the hang of it, it all "just worked".
I agree completely that it takes time to get used to, and perhaps you did have genuine connection issues, but we ran with 150 to 200 people for 48 hours without any significant[0] identified connection issues, which is why I wonder if it was mis-identification of a "symptom".
I'll check out "Welcome".
[0] There were occasions when people needed to reload the tab, but doing so always fixed the problem.
As for the connection issues, we were all next to each other at a conference room table, not an issue of proximity. I was able to hear the person at first, then they cut off and didn’t come back, even after reloading the page it seemed but I only tried once and then gave up to be honest :p
FWIW the person I could no longer hear was located across the globe, so maybe that’s related.
I apologize for not having a more thorough write up of what happened, but it was certainly a real issue.
There's no need to apologise ... it's perfectly reasonable that you should say something short and clear, but not in complete depth. Your elaboration gives me useful information ... thank you.
It does sound like it was a real issue ... have you provided feedback to Gather.Town? Perhaps it is related to the speaker being across the globe, and they could find that useful, even if anecdotal. It's still a data point, pointing at a potential underlying problem.
Hey Cory! CEO of Gather.Town here. Connection issues are always our top priority, and at this point, we're chasing down and squashing the last 1% of bugs. If you run into this in the future, reporting a bug will help us solve the issue! (Grape icon > send feedback/bug)
As for the other issue, active speaker detection is on our near-term roadmap. It's become more pressing for us too, as our team has grown and no longer fits in one page :P
I’ll keep the feedback method in mind for future usages!
Also, I get it, having started to work with the WebRTC stack myself. While the service is free to use I think I can stomach a few issues while it’s actively being worked on :)
> Other startups ... are trying to use VR to make digital gatherings more immersive, though doing so takes away one of the few benefits of virtual meetings — being able to easily multitask
I used to multitask during meetings fairly frequently. I’m trying really hard to stop.
If there’s the chance that someone can multitask at a meeting easily and very effectively, maybe they don’t need to be at the meeting?
> If there’s the chance that someone can multitask at a meeting easily and very effectively, maybe they don’t need to be at the meeting?
Speaking from personal experience, you often don’t have control over this. If I don’t attend meetings I get invited to, it’s seen as a lack of interest and being disrespectful. So I attend them to show my face, and then just do my regular work.
And if I don’t, then people stop inviting me all together and assume I’m not on the project anymore. That leaves me with little choice.
The whole thing about “leave meetings where you’re not needed”, only really works when you’re in a position of power.
It’s not so much “I can multitask therefore I don’t need to attend”, but sometimes I’m there for 1-2 topics and it’s a status meeting where those may come up at any given point during the call, while my involvement in other topics is limited and I can just passively monitor. This is most successful when I have some rote task like filing TPS reports, replying to team slacks with confirmations and the like.
I live 30ms away from my boss. What do I have to do to get low latency 4k (or 1080 since 4k cams are expensive) conferencing for one on ones? Serious question. This was a problem before wfh and now it's an even bigger issue for me. I'd rather work somewhere for half the income where I have to mask up and deal with people if it means I can stop spending hours on terrible video calls every day.
I've come to two conclusions about video conferences:
I really like cameras on so I can see people's faces. It feels much more like seeing people in person and the fact that my new team doesn't do that is making it harder to feel part of a group.
While I really like cameras on for calls, video quality isn't important (to me), but audio quality really is. Bad audio whether it's background noise, bad internet, echo etc. makes vc very tiring. The video could be 640x480 and 1 FPS and that would probably be fine - helps me know there's a real person on the other end of the call.
Audio carries 80% of the intellectual content of most telepresence. That's why when there's a live cross on TV News and the video works but the audio is missing, they give up and return to the studio until the audio is up.
I don't know if there's a realistic way to get low latency video[1]. But the way to get low latency audio is to get as close to POTS as you can. All of the little delays here and there on modern internet audio paths add up. More so if you've got wifi on either side.
[1] I mean, ISDN video calling is probably low latency, but good luck getting an ISDN line installed these days, and video quality was trash.
The UI for every video chat app I've ever used is complete garbage. As for why, my best guess is that there is no money in selling video chat apps directly to consumers. Instead, they sell to businesses, in which case things like having a quality end-user experience is not a very competitive feature.
It seems like every single application had bits and pieces of what is needed but no one has still been able to offer the entire package.
- Zoom definitely has mastered the ease of logging in whether you have the application or not as well as the ability to see multiple people, but obviously they have had their privacy issues.
- Teams seems to have good screen sharing and chat integrated, as well as some other small features here and there, but if you don't have the application, it takes some time, and the browser-based version isn't bad but could be better.
- Google Meet- All sorts of improvements are needed...
- Jitsi - Definitely has its positives but I have always had connection issues for whatever reason.
Would love to see Apple do something more with Facetime...
You can add all kinds of fancy stuff, you can improve the design and usability of the UIs, but here's a big important problem waiting to be solved: audio quality and graceful degradation of the quality when needed.
At the end of the day we want to be heard in the first place. Nothing is more irritating than someone's (or your own) poor connection that makes people ask each other to repeat, watch participants freeze, wait, waste time trying to reconnect etc. etc.
Can it be solved? Degrade the quality, compress more but deliver the human speech as nicely as possible, on time. I feel not all options have been tried in this area yet.
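As a toy illustration of "speech first, everything else second": give audio its share of whatever bandwidth is available before video gets anything, and drop video entirely rather than starve the voice. The function name and every number below are illustrative assumptions, not measurements or any product's real policy.

```python
from typing import Dict


def allocate_kbps(available: float,
                  audio_target: float = 32,
                  audio_floor: float = 12,
                  video_floor: float = 150) -> Dict[str, object]:
    """Split the available bandwidth, always serving audio before video."""
    # Audio takes what it wants, up to whatever is actually available.
    audio = min(audio_target, max(available, 0))
    video = available - audio
    if video < video_floor:
        # Not enough left for usable video: switch it off instead of stuttering.
        video = 0
    return {"audio": audio, "video": video, "speech_ok": audio >= audio_floor}
```

For example, with 100 kbps available this sketch keeps full-rate audio and turns video off, which is the behaviour I would want on a bad connection.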
There has to be something these companies are doing wrong. When I use Discord, even while on mobile data and walking around, the audio quality is flawless (or as good as the mic). Why is it that MS Teams and the rest end up sounding worse than AM radio?
Audio is hard, then audio streaming adds another layer of complexity. Even a relatively trivial problem of audio feedback hasn't been solved in all software yet - occasionally you hear echoes in some of them (Skype, WhatsApp?), or otherwise the peer is interrupted when you speak, making talking simultaneously impossible and the whole experience irritatingly poor. It all seems like a downgrade from the classical telephony where you didn't have those problems.
Ah! I thought this was going to be about video and audio codecs. I was especially hoping to hear some advocacy for royalty free AV1!
My only real qualm with video conferencing is the camera placement issue that makes it difficult to have eye contact with the remote side. I consciously try to look up at my camera so that it appears I'm looking directly at the other side, but that's a bit unnatural.
I'm not sure what the point is. I'm sure not interested in any sort of VR teleconferencing and I don't really want Second Life 2.0 either. I do think easier sharing of sketches and the like might be interesting--although that tends to require some commonality of hardware and whiteboards don't necessarily scale down to iPad size very well. (Furthermore, just shared docs actually serve a lot of the functions we thought we needed whiteboards for pretty well.)
We used Gather.Town for a virtual conference, a substitute for an event I've been running every November since 2010. It went really well, capturing the unique aspect of my event quite well, and mostly "just working". I was particularly pleased at how it mostly "just worked" for people who aren't into gaming or similar environments.
Not perfect, but it was pretty good, and people liked it.
I went to a social gathering of former co-workers using Gather.Town. Each person has an avatar which they walk around a 2-dimensional map. You see video (and hear audio) of everyone within a certain distance, and apparently if you’re in a “conference room” you see everyone in there. It did replicate the behavior of in-person parties where groups would form, and people would join them or break off to find a new group. It was far better for the purpose than trying a Zoom call with 50 people, where there’s a single awkward conversation.
Some of the user interface was confusing as a first-time user, but overall it was effective and people wanted to do it again.
Here's what I want: an always-on group audio chat app with:
* A virtual 2D 'office' with positional audio
* Super low latency (~10ms)
* Minimal data/power/CPU usage
This would simulate what it feels like to be in an open office (when they work well). I can hear interesting things going on around me but because the audio is in 3D space I can pretty much filter it without thinking. If I say "hey has anyone gotten this error before?" the context clues of my volume and position make it clear to others who I am talking to.
I actually started building this last year but then the pandemic hit and I figured someone would beat me to it now that there's infinite VC money flowing into team chat. Still nothing!
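The positional part is the easy bit. A minimal sketch of how a 2D position could turn into left/right gains, assuming the listener faces "up" the map (+y), volume falls off with inverse distance, and panning is equal-power. The `stereo_gains` function and `ref_distance` parameter are made up for illustration; a real app would more likely lean on an existing audio engine.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def stereo_gains(listener: Point, source: Point,
                 ref_distance: float = 1.0) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a voice at `source` heard at `listener`."""
    dx = source[0] - listener[0]
    dy = source[1] - listener[1]
    distance = math.hypot(dx, dy)
    # Inverse-distance attenuation, clamped so nearby voices never exceed gain 1.
    gain = ref_distance / max(distance, ref_distance)
    # Pan from -1 (fully left) to +1 (fully right) based on the horizontal offset.
    pan = 0.0 if distance == 0 else dx / distance
    # Equal-power panning: map pan to an angle between 0 and pi/2.
    angle = (pan + 1.0) * math.pi / 4.0
    return gain * math.cos(angle), gain * math.sin(angle)
```

A colleague two desks to your right comes out quieter and almost entirely in the right ear, which is what lets you filter the background chatter without thinking about it.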
One thing I miss very much is some kind of spatial voice conference software with virtual characters. I have seen a videogame-like attempt, but nothing for real business or real events. I'm thinking of something like full 3D models of the people freely moving around in a virtual office or something.
The current solutions where it shows everybody's face is not the best solution for bigger events or conferences, where you can go from conversation to conversation freely.
It would be very minimal compared to a game, like one room or something, but the most added value would be to be present, more actively participate and not just listen somebody else talking.
My team have been developing Remotely (https://meetings.remotelyhq.com/) with this very thing in mind - customizable avatars, voice-driven animation and expressions, animated emoticons, cinematic rooms, interactibles.
It's still very early stages, but the main observation is that the scope of problems people are having with communication is too large.
Some people want to feel less anxious in meetings, some want accountability, some productivity, some want "fun" and customize stuff, others want video, screensharing etc. It's hard to pinpoint "the problem" that's tangible and solvable outside of the base must-have features. "feeling more like part of the team while working remotely" is not an easy problem to tackle.
Gather.Town provides avatars moving around in a 2D world that you can design, or you can clone from a library of templates. It's not full 3D characters walking around a space, but when we used it for a 48 hour weekend conference we found it a great balance between simplicity, usability, flexibility, and power.
We could go from room to room, conversation to conversation.
I'm guessing that's what they mean with "I have seen a videogame-like attempt". I can see that a lot of companies wouldn't consider it stylistically appropriate.
It's possible to create much more "sophisticated" settings, although choices of avatars are still limited.
I agree, I can see how some companies would dismiss it on those grounds. I'm glad I've never been involved with companies like that, because it feels like a good balance between system demands, usability, effectiveness, and ease of use.
So far I've found attempts at full 3D "proper" environments cringe-making. Perhaps they'll improve, and my fear is that when someone gets the 3D environment glossy and slick, the suits will make people use that, even if the video and audio components are vile, simply because it "looks better".
It will be interesting to see how all this plays out over the next 5 years.
Exactly! Not "corporate" looking/feeling enough. A more formal kind of software would be more attractive for even serious events like Customer events or business meetings.
I have tried VR chat, which sounds like a massive fail on paper:
o there is no video feed
o You only have an avatar
o You need to wear VR goggles
However in practice it feels far more natural than "good" VC. The biggest reason is spatial audio. In VR VC it's perfectly possible to turn to your neighbour and have a conversation without disturbing the rest of the meeting.
Even better, it's possible to sustain a normal social speaking flow. There are no awkward pauses, just free flow of talking like you are in the room.
Different people like different things. Last year I had a team of ~8 people who hadn't worked together before working closely together. Cameras on most of the time helped us build relationships. I've just joined a different team now (who have been working together for 3 months already) and it's really hard to feel like I'm part of the group.
This is the sort of thing I've been thinking about in building Bucket Brigade (https://echo.jefftk.com). It's video chat, but you can switch into a mode for singing. Each person is in a bucket, and you can hear the people in earlier buckets and be heard by the people in later buckets.
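A rough sketch of the bucket rule described above, in Python. The function name, the participant names and the integer bucket numbering are made up for illustration; this is not Bucket Brigade's actual code, and it assumes people sharing a bucket do not hear each other, which the description above doesn't settle.

```python
def audible_sources(buckets: dict) -> dict:
    """Map each participant to the list of participants they can hear.

    `buckets` maps a participant name to a bucket index (0 = earliest).
    Assumption: a participant hears only people in strictly earlier buckets.
    """
    heard = {}
    for listener, listener_bucket in buckets.items():
        heard[listener] = sorted(
            name
            for name, bucket in buckets.items()
            if bucket < listener_bucket
        )
    return heard


# Hypothetical assignment of four singers to three buckets.
assignment = {"ana": 0, "ben": 1, "cy": 1, "dee": 2}
for person, sources in audible_sources(assignment).items():
    print(person, "hears", sources)
```

The point of the one-directional rule is that nobody ever has to hear someone who is also waiting to hear them, so network delay only pushes later buckets further back in time instead of causing everyone to drift apart.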
So hard to believe, or maybe it is totally likely this type of interactivity was lost in time: I was director of research at a dotcom era live video broadcaster named Rotor Communications. We were doing one to many live broadcasts with interactivity back in '99. Beyond chats independent of the video, there were interactive elements (surveys, choice selections) presented and their responses could be used by hosts, often the purpose of the shows. We had universities using it for online lectures back then, we had game shows and pop music talk shows on the general internet too. Then, of course the dotcom bust blew it all away.
Why after all these years is something like this not in every teleconference application? Why can't one give an interactive PowerPoint presentation with questions the participants privately answer, tabulated on the fly for the speaker?
I was a dev at a live widget company called decentral tv, and we did this using Adobe Flex. We pivoted to live and prerecorded video with customizable presentations, and the occasional poll.
This will return. We just need a "what is a client-host" / "what is a distributed server" + structured data standard for sharing the interaction part, like IDK torrent, and a way to describe dynamic groups / sets of apps which are relevant to a workflow or an experience.
The level of interactivity, the cognitive / attention demands on a user also should be modeled. Asynchronous texting, or live chatting, or even most active: producing an experience moment-to-moment.
IDK - it's possible and clearly this thread shows the interaction / synergy potential. To help keep us together.
To host these experiments. For instance, if you want to chat with people around tables organized by pubs in the golden mile (from the movie The World's End), go to:
Two things I really miss compared to meatspace for zoom (thinking both about social gatherings and meetings):
1) the ability to direct-channel voice call someone in the room. Can mute (or dim) the rest of the conversation that is happening, but allow for focused discussion aside. This would be a huge advantage over actual meetings where we must use furtive whispering. Obviously not great for small meetings (distraction maybe) but excellent for large ones and really excellent for social gatherings.
2) when using breakout rooms, the ability to still see what is happening in the other breakout rooms. Just like when we are separated in groups in a physical room and we can still peek over at what may be happening in a different group (in case we may want to bounce to a new conversation).
I'm interviewing for my first employee, and I just used voice calls.
We've been in lockdown for a month. We're all a month or more from our last haircut. We live in the homes we had when it all started. We don't have control over that and it shouldn't affect my perception of candidates.
Besides, it's a remote job. You don't need to look nice for it. You can work naked for all I care.
It made me realise how superfluous pictures on resumes (Germany), LinkedIn profile photos and video calls are in a neutral hiring process. Even without them, the WhatsApp or Skype profile photo betrayed their looks.
I just want a client that auto connects me a set number of minutes before the meeting. And other “smart client” basic features.
I have zoom meetings that prompt me for my name every time. I don’t want to create a zoom profile, I don’t want to log in. I just want it to remember my name, on the client.
I think stuff like this will get ironed out now that folks are using it day in and day out. Although, it’s still funny how it hasn’t hit 100% yet. My org has been 100% teleworking since April and last week someone said “this is the first time I’ve done a Teams meeting.” I was so curious and wanted to figure out how she’s avoided any video calls (my org is 100% Teams) for so long.
You want webex. It's not auto, but it pops up with a countdown letting you know when it's possible to join a call, typically 5 minutes prior to the meeting. This fixes the problem of everyone joining in 2 minutes after their last Outlook alert.
I have been thinking about this. You know how computer nerds have been using IRC, email, source control, github etc. to work online for decades?
And you know how highly competitive e-gaming teams compete at things like Rainbow Six and other things without being in the same room? (Not for a tournament, but competitive enough to beat any casual team online).
I feel like, yes there is lots of room for improvement, but part of this is actually being overblown. By people not willing to type in a Discord or who don't own a comfortable headset and know how to use it. Or people who have not learned how to use an online whiteboarding system or Google Docs etc.
There are huge potentials for disruption in the VC market. If Facebook would give free Oculus Quest to all its employees I think we would have VC in VR by now. The big problem is presence, and organic discussions in large groups. The grid view of Zoom et al. still sucks. There needs to be a 3D, or AR, or VR experience that allows us to move freely and talk more naturally to people in the room.
Latency is always going to be an issue, unfortunately. Maybe the camera could be better at recognizing facial expression and body language to indicate when someone is about to speak, or wants to speak.
I think VR could be a really cool solution to some of the problems with video calls, but the headsets are just too heavy at this point. I have an oculus quest 1 and can't stand wearing it more than maybe 30 or 40 minutes, it just hurts my face. I've tried adding weight to the back, etc., but haven't found a good solution. I have a 20th percentile head size (a really big noggin) so maybe I'm in the minority that wouldn't like this... yet. Maybe some VC could help push to lighter tech.
I haven't tried the Quest 2, but I hear it's lighter. Also have yet to try a Valve Index.
At this point in my experience with video calls, I'm just happy when everybody's wearing headphones so they don't get their audio dropped when somebody talks over them.
Anybody here use BlueJeans? Or can explain why it feels so much better to me? Whatever they do for audio is magic. I spend half my day on Teams and 4 to 5 hours a week in Zoom, and both are totally exhausting. I bought my own BlueJeans account and I push it on family/friends just because the audio ...makes sense to my brain. I /enjoy/ talking to people there. I even convinced my D&D group to switch from hangouts/meet/duo/allo/??? And they seem to agree, BlueJeans+roll20 is great.
Yes! We used it extensively in my previous industry (film VFX) and it was fantastic. I moved somewhere that's all zoom and it honestly feels like a step back in time. BlueJeans is great, with a cisco web phone camera thing it was the pinnacle of VC in my opinion.
I'm a hard no on this. I work in education, where everything went from in-person to online in the course of a few weeks and everyone (expectedly) hated it. Synchronous online destroyed the previous advantage of online (asynchronous) and hasn't found a way to provide any benefit over synchronous in-person.
There is a clear opportunity for better video experience online but I expect that 2021 will be the year that everyone runs screaming for the classroom and office.
It will evolve. Video meetings are meetings, hence synchronous. Email and messages are asynchronous. Each have their uses, both are abused and misused, and over time, teams and groups start to evolve their work processes based around the capabilities that exist.
The best managers guide that evolution, but recognise that different people work in different ways.
But video conferencing/meetings is a capability to be used, and I look forward to seeing how it improves. Rejecting it out-of-hand simply because it's currently sub-optimal and badly used seems short-sighted.
I would be happy if people would use good microphones (wireless earbuds tend to have the worst mics) and laptops would come with waaaaay better cameras.
Unfortunately I can't recall the names of the startups or the related sub-industry, but there are multiple categories of software that are in this area, and many startups.
I found one called "onespace" in my search now but there are quite a lot of them.
The creator of Nim was involved with one called "3dicc".
There are multiple recent top-down localized audio ones.
I am excited about the optical waveguide AR/VR systems getting cheaper and consumer-oriented because they are so lightweight. I actually personally find that VR is often more direct interaction than I want. Things like Spatial and VR Chat are mainly missing eye contact but that eye tracking is coming within a few years.
I’d like wireless webcam tech to evolve in the same way that we now have robust, ubiquitous protocols and hardware for wireless audio. It should be as easy to switch source to one of your wireless/Bluetooth webcams as it is to switch audio output to a Bluetooth speaker.
A bit ho hum - the author is right that higher quality video and audio helps a lot.
I have never understood the problem people have having video active - I'd describe myself as fairly introverted and have no problem with this, in fact I do between 4-6 hours a week streaming DnD.
I think teleconferencing needs more spatial flavoring. I have always thought that what virtual meetings lack is 3D awareness. Maybe do some 3D representation visually and audibly of each person sitting in a chair across a table.
getmibo.com is also working on a more interactive conferencing system. It uses a combination of video calling and a 3D world to spatially separate participants and adds new content like games while keeping the value of seeing someone's actual realtime face.
Pick a popular video chat application, have you and someone else on the same wireless network both join the same chat, go into separate rooms to avoid crosstalk on the mics.
Have one person talk, then realize there is a half a second lag.
It's possible, but it's not likely to come from established players. And it's going to have to sacrifice visual quality or use a ton of bandwidth (or both).
I don't know if video calling tends to multiplex audio and video, or send separate streams; multiplexing audio onto the video could help reduce audio delay though. Standard voip uses a packet rate of 50 pps for audio, which means at least 20 ms between sampling and sending. Reducing that would be nice.
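To put numbers on the packet-rate point, here is a quick back-of-the-envelope sketch in Python. The 50 pps figure is from the comment above; the 8 kHz sample rate and the other packet rates are my own assumptions for comparison.

```python
def packet_interval_ms(packets_per_second: float) -> float:
    """Time between packets, i.e. the minimum wait before a sample can be sent."""
    return 1000.0 / packets_per_second


def samples_per_packet(sample_rate_hz: int, packets_per_second: int) -> float:
    """How many audio samples get bundled into each packet."""
    return sample_rate_hz / packets_per_second


SAMPLE_RATE_HZ = 8000  # assumption: narrowband telephony-style audio

for pps in (50, 100, 200):  # 50 is the figure quoted above, the rest are hypothetical
    print(
        f"{pps} pps -> {packet_interval_ms(pps):.1f} ms per packet, "
        f"{samples_per_packet(SAMPLE_RATE_HZ, pps):.0f} samples each"
    )
```

Going to a higher packet rate cuts the wait per packet, but every packet carries its own headers, so the overhead goes up as the packets get smaller.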
> It's possible, but it's not likely to come from established players. And it's going to have to sacrifice visual quality or use a ton of bandwidth (or both).
Switching to Peer to Peer for 1:1 or small group chats would be nice.
Low latency requirements being part of the BT spec would also be nice[1].
OS's fixing their drivers and entire end to end audio stack, that'd also be nice. Get that end to end latency locally to less than the e2e latency of the network packets!
[1] Qualcomm has a low latency version of apt-x which is actually low latency, and Apple does fine with Airpods connected to Mac hardware.
AirPods Pro connected to Mac hardware still run 144ms. That’s an eternity in live audio world, latency for a genuinely “pro” digital wireless microphone system is closer to 1.4ms.
That is... so bad. How? Did they let the same people who designed Bluetooth LE's ANCS[1] go hog wild with the desktop stack?
Edit: Also I wish that site tested latency for phone calls, since that is often a different protocol.
[1] ANCS is, or at least last time I looked at it, a protocol designed by people who had obviously NEVER worked in the embedded or low power space. No limit on message size and no headers indicating the size of the incoming payload are two giant tells.
AFAIK the killer is (video) encoding latency, which in turn comes from a tradeoff between encoding efficiency (bandwidth use) and number of frames that need to be "buffered" inside the encoder. This is exhibited inside the videoconferencing software and the webcam itself (AFAIK webcams don't send an uncompressed stream to the computer).
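A rough way to see the size of that tradeoff, as a Python sketch. The frame rate and the buffered-frame counts are illustrative assumptions, not measurements of any particular encoder or webcam.

```python
def buffering_latency_ms(frames_buffered: int, fps: float) -> float:
    """Delay added just by holding frames inside the encoder before output."""
    return frames_buffered * 1000.0 / fps


FPS = 30.0  # assumption: a typical webcam frame rate

for frames in (0, 1, 3, 8):  # hypothetical encoder lookahead settings
    print(f"{frames} buffered frames at {FPS:.0f} fps -> "
          f"{buffering_latency_ms(frames, FPS):.0f} ms added")
```

Even a handful of buffered frames adds up to something you can hear in a conversation, and that is before the network or the decoder on the other side has done anything.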
Because packet based networks are the worst basis for real time communication. There is a good reason there's a whole separate international connection oriented network for making phone calls (it's a shame it's no good for video - unsurprisingly the internet wins on open platforms and innovation).
There are two important differences with VC compared to that ping:
You're not trying to exchange data with a CDN built for exactly that purpose, you're trying to exchange it with someone else's (probably crappy) internet connection. The analogy I used the other day is "your supermarket has better roads to it than your friend's house".
You're probably pinging 32 or 64 bytes. Now try that again at a more realistic video data rate (not that the CDN endpoint will allow that). Keeping latency down is a lot easier on small packets than on full pipes.
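One piece of the small-packet point can be sketched with simple arithmetic: the time it takes just to push the bytes onto the link. This is only serialization delay and ignores queuing, which is likely the bigger problem on a full pipe. The uplink speed and the frame size below are assumptions picked for illustration.

```python
def serialization_delay_ms(payload_bytes: int, link_bits_per_second: float) -> float:
    """Time needed to clock a payload onto a link of the given speed."""
    return payload_bytes * 8 / link_bits_per_second * 1000.0


UPLINK_BPS = 5_000_000   # assumption: a 5 Mbit/s home uplink
PING_BYTES = 64          # ping payload size mentioned above
KEYFRAME_BYTES = 50_000  # assumption: one large compressed video frame

print(f"ping:     {serialization_delay_ms(PING_BYTES, UPLINK_BPS):.3f} ms")
print(f"keyframe: {serialization_delay_ms(KEYFRAME_BYTES, UPLINK_BPS):.1f} ms")
```

Under these assumptions the ping barely registers while the video frame occupies the uplink for tens of milliseconds, and anything queued behind it has to wait.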
Gather.Town has the ability to move around an entire 2D world designed by the meeting conveners. Multiple rooms with inter-connections, and the teleport portals provide some interesting capabilities for a creative designer to exploit.
Ultra-charitable interpretation: Perhaps they meant to use gaming voice communications instead of video conferences, because those voice comms work pretty well and video is 99.8 % unnecessary anyway.
> because those voice comms work pretty well and video is 99.8 % unnecessary anyway.
<sarcasm>
Imagine if there were a publicly available system that could switch connections directly between the end users, without the need for computers, or anything wireless? It could send μ-law companded audio at 8,000 samples per second, to cover the 3 kHz bandwidth of human speech, with almost no latency. You could even hook circuits directly up to the end user equipment to avoid any need to share the audio channel. Because the bandwidth was dedicated on a switched channel, there would be no drop-outs or packet loss!
What an amazing improvement over cellphones and Bluetooth that would be.
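For anyone who hasn't met μ-law companding, here is a minimal Python sketch of the idea. It uses the continuous textbook curve with μ = 255, not the bit-exact segmented encoding that real telephone codecs use, and the function names are my own.

```python
import math

MU = 255  # the mu value used by the mu-law variant of G.711


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Squash a sample in [-1, 1] so quiet sounds get more of the code range."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


SAMPLE_RATE_HZ = 8000  # from the comment above
BITS_PER_SAMPLE = 8    # one byte per companded sample
print("bitrate:", SAMPLE_RATE_HZ * BITS_PER_SAMPLE, "bit/s")

for sample in (0.01, 0.1, 0.5, 1.0):
    print(sample, "->", round(mu_law_compress(sample), 3))
```

The curve is steep near zero, so a quiet sample like 0.01 lands well up the compressed scale. That is how eight bits per sample manage to sound acceptable for speech.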
I feel like the problem needs different questions. 2021 will demand better modes of virtual collaboration. I'm putting my chips behind it. Imagine the lovechild of a spreadsheet and a persistent, distributed MMO Sandbox. We need to level up as a species in one area: collaboration. Humans are able to thrive due to our ability to collaborate flexibly, in large numbers, usually by way of shared myths[1]. Yet, we've seemed to hit a threshold to how well our species can solve large problems that transcend ideas like national borders and currencies. There is a well-known collaboration threshold in software engineering that gave us the 2 Pizza Team.
I believe we can transcend that threshold by adopting a shared canvas for structured information and systems.
Suppose my Fermi math is somewhat plausible when I say that the annual aggregate of all human food consumption is about 34 peta-calories and that we use 24 zettajoules of energy. What if we the people could collaborate to make that 50 PCal and 19 ZJ 5 years from now, while flattening the per capita distribution? Religion, privately-controlled social networks, and democrapitalism aren't likely to carry us there alone. Humans are good at using, creating, and distributing tools and that makes us highly adaptable. They are good at pooling resources to tackle large objectives.
Here are components I believe are necessary to build a machine for transformative virtual collaboration:
- Simple primitives that transcend literacy and align with human behavior and cognition [2]
- Tactile consumption and manipulation of data and business logic across many surfaces [3]
- Simultaneous replay of media and event data from disparate sources through a multitude of lenses [4]
- Immutability + Time Travel for all data/transactions
- Arbitrary auctions against time, resources, data access, and more
- Ubiquitous simulation, modeling, and learning systems
- Rapid, contextual, stake-driven agreement systems for contract formation and decision-making.
- Zero-trust, edge-first privacy model
- Containerized knowledge work - like each task you pull from your queue gets its own OCI container and X session with appropriate tools, files, keys, and roles. Pausable, snapshottable, sharable, replayable, work.
- Sortable task queues that align with personally defined values, priorities, and constraints.
- Personal Optimization aligned with Global Optimization
- ... better looking video :P
Such a system would work to remove friction related to misunderstanding, misinformation, and misalignment. It would empower and enliven individuals to do work that is truly meaningful and impactful where needed most. It would scale our ability as a team to sense, think, react, and observe. It puts humans and facts in a place of primacy, and allows for async work to... work.
Thank you for coming to my TED Talk.
I've been trying to unpack and realize this vision for the past 5 years. I've got mountains of notes and some meaningful progress on the data and compute infrastructure, business and growth models, some lenses, and an obscene amount of bunny trails into geometric UI paradigms (isomorphic hexagonal tiling, aperiodic tiling and armand bars, ZUIs, and inevitably ur patterns). I've also got a family, day job, and a needy old house. I want to make deeper progress, faster, and in collaboration with others, but I have no idea what the right venue is to pursue this work. Where are the people who are working on this sort of thing? I don't have much of a pedigree for academia, and I'm not in a position to take much personal financial risk. If you have any ideas please let me know.
[1] https://fs.blog/2019/01/yuval-noah-harari-dominate-earth/
[2] https://www.8ways.online/
[3] https://worrydream.com
[4] I define lenses as views, filters, renderers for any device, where the audio/visual output streams of a lens are a function of the data itself. Those could range from 2D rectangle screens to 3D worlds, AR/VR, watches, voice assistants, raymarching projectors, haptic holograms, auric interfaces, smart dresses, whatevs.
Over the past year I've been invited to, and become involved with, several organisations that are not within commuting distance. Video conferencing allows this, but it's still not great. Better video conferencing makes it a better experience, and more conferencing apps gives more choice, and makes providers compete to get better.
We don't need more video conferencing apps, but the end of the pandemic is mostly irrelevant.
There's a mutant strain in South Africa that could potentially be vaccine resistant. Let's hope either it isn't or that turnaround time for a modified vaccine is pretty fast.
Because many people will remain WFH even after the vaccine rollout. Video conferencing will be essential to connect those that go into the office with those who don't, especially during mixed in person/remote meetings.