I used to support production at an automotive assembly plant. My IQ went up 10 points at least when I went to the plant floor to see the problem. People assume you don't know what you're talking about over the phone. When you show up, they believe you.
Now I'm a tech lead on an enterprise application. Daily I experience people talking past each other at all levels of expertise - architects not understanding architects, junior developers not understanding junior developers, so on forever. Nothing beats a quick demo of the issue.
I do think that this article will help people communicate better, but it's also up to people who can understand the article to take the lessons in it to their particular environment. I have a document ready for quick sharing titled "The System is Down" because I've heard that particular phrase in one flavor or another WAAYYY too often with WAAYY too little context.
Oh, and calibrating the level of detail in technical conversation is very hard too, because it can lead to people feeling like they are being talked down to, or it can lead to people assuming the other person knows more than they do. It's a Hard Problem.
When someone reports to me "The website is down" I roll my eyes and immediately seek more information. I wrote up information below, imagining that one day I'd share it with someone reporting that "The website is down". Of course, I probably never will because I try not to be a dick. But I'll share it here:
FYI, "down" has a lot of meanings which most people don't differentiate between when reporting the problem:
1) Site can't be connected to at all (connection timeout), due to one of the following:
a) DNS doesn't resolve to any IP
b) DNS resolves to the wrong IP
c) DNS resolves to the right IP, but the server is powered off or otherwise unreachable over the network
2) Site responds with a connection reset
3) Site responds with "connection refused"
4) Site responds, but is slow to load
5) Site responds but has invalid or misconfigured TLS settings, resulting in the browser showing a security error.
6) Site responds with a server error (HTTP code in the 5xx range)
7) Site responds with a 404 Object Not Found error
8) Site responds with a redirect loop, causing the browser to give a "too many redirects" error page
9) Site responds with little or no content. (Empty page)
10) Site responds with the right content, but it is abnormal:
a) Missing Images
b) Odd formatting (e.g. missing CSS)
c) Some dynamic behavior doesn't work (e.g. missing JavaScript)
11) Site responds with other, unexpected content.
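Most of the cases in the list above can be told apart mechanically. Here is a rough Python sketch using only the standard library. The function names (`classify_status`, `classify_exception`, `describe`) are mine, and the mapping is my assumption about how `urllib` surfaces each failure, not a complete diagnostic tool. It does not cover slow loads (4) or the content problems in 10 and 11.

```python
import socket
import ssl
import urllib.error
import urllib.request


def classify_status(code: int) -> str:
    """Name the symptom for an HTTP status code."""
    if 500 <= code <= 599:
        return f"server error (HTTP {code})"
    if code == 404:
        return "not found (HTTP 404)"
    if 300 <= code <= 399:
        # urllib reports an unfollowed redirect, including a loop, as a 3xx error.
        return f"unresolved redirect, possibly a redirect loop (HTTP {code})"
    if 200 <= code <= 299:
        return f"responded normally (HTTP {code})"
    return f"unexpected response (HTTP {code})"


def classify_exception(exc: BaseException) -> str:
    """Name the symptom for an exception raised while fetching a URL."""
    if isinstance(exc, urllib.error.HTTPError):
        return classify_status(exc.code)
    # URLError usually wraps the low-level error in .reason; unwrap it.
    cause = exc
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        cause = exc.reason
    if isinstance(cause, socket.gaierror):
        return "DNS lookup failed"
    if isinstance(cause, ssl.SSLError):
        return "TLS error"
    if isinstance(cause, ConnectionRefusedError):
        return "connection refused"
    if isinstance(cause, ConnectionResetError):
        return "connection reset"
    if isinstance(cause, (TimeoutError, socket.timeout)):
        return "connection timed out"
    return f"unclassified failure: {cause!r}"


def describe(url: str, timeout: float = 10.0) -> str:
    """Fetch the URL once and describe what happened in plain words."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read()
            if not body.strip():
                return "responded with an empty page"
            return classify_status(response.status)
    except Exception as exc:
        return classify_exception(exc)
```

Pasting the output of something like `describe("https://example.com/")` into a report already says far more than "down" does.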
Saying a website is "down" is WAY too vague to be useful. Heck, at my old job I was told "Our dev site is down" and I'd have to ask "Which dev site?!" because we had several sites. Ugh.
I may have gone a bit into the weeds with some of the possible causes, but nearly every point has a distinct, visible difference. Sure, the user may look at a website/browser error and not have any idea what it means, but that doesn't mean they can't relay the message.
For example, "The website is down" vs "When I go to the website, I see the message 'This site can't be reached. some-invalid-website.com's server IP address could not be found.'"
“Go and see” is great. I have been in a lot of discussions where people talked about something for hours that they had never seen first hand. It’s so valuable to see directly how your users are using your product or to see people’s workplace. With offshore people we had problems lingering for weeks until somebody went there, saw what they were actually doing and what equipment they had, and solved the problem quickly. Same for setting up equipment at customer sites. Struggle for months because the local sales guy doesn’t give you the correct information or go there and solve the problem in an hour with their IT guys.
All the data we are collecting is useful but seeing things directly is extremely useful. Seems the modern workplace is often becoming very abstract where everything is compartmentalized and people see only their little box.
One of the nice things about where I work is that QA is a very short distance down the hall. When they tell me "It's doing X", I can walk over and say "Show me". And then I see that they didn't mention that it was in mode Y when they made it happen, and they did step Z on the way to get there. They may not have thought that those pieces mattered, but to the developer, sometimes they do.
Because of this, screen-sharing capabilities are critical to resolving computer-related problems that people report.
I once got a report from a coworker that he couldn't import a text-based configuration file into an application. I was unable to reproduce the problem. It wasn't until I had a screen-sharing session with him that I saw the problem. The configuration file extension wasn't correct. It looked right at first, but then I realized that it had an extra .txt at the end of it that was being hidden by Windows. The application saw the .txt at the end of the filename and rejected the file.
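The hidden-extension trap is easy to check for once you know to look. A small Python sketch, assuming the application wants a `.cfg` file (the story above doesn't say which extension it was) and using a helper name I made up:

```python
from pathlib import PurePath


def check_extension(filename: str, expected: str) -> str:
    """Compare the real (last) extension of a filename with the expected one."""
    suffixes = PurePath(filename).suffixes  # e.g. ['.cfg', '.txt']
    if not suffixes:
        return f"expected '{expected}', but the file has no extension"
    actual = suffixes[-1]
    if actual.lower() == expected.lower():
        return "ok"
    if expected.lower() in (s.lower() for s in suffixes):
        # The expected extension is there, but something was appended after it.
        return f"'{expected}' is present but the real extension is '{actual}'"
    return f"expected '{expected}', found '{actual}'"


print(check_extension("settings.cfg.txt", ".cfg"))
```

This works on the full filename, so it sees the trailing `.txt` even when the file manager hides it.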
Nothing wastes time like emailing/texting/chatting back and forth when a voice call and screen share are available.
“And then I see that they didn't mention that it was in mode Y when they made it happen, and they did step Z on the way to get there. They may not have thought that those pieces mattered, but to the developer, sometimes they do.”
Exactly. You take things for granted and suddenly all your analysis work is based on incomplete or wrong assumptions.
> Oh, and calibrating the level of detail in technical conversation is very hard too, because it can lead to people feeling like they are being talked down to, or it can lead to people assuming the other person knows more than they do. It's a Hard Problem.
In the end, I think, people want their problem fixed ASAP. To that end, I make zero assumptions about what others know, especially over asynchronous communication (and it's frustrating when you're working with someone halfway around the world and only get to leave them one email per day, for example). I'm aware that it might come off a certain way, but in all honesty, more than feelings, I'm interested in getting issues fixed as quickly as possible and giving people all the info, techniques, tools, etc. they could possibly need to fix their problem now, fix a future problem, and maybe prevent this problem from happening again. I've gotten negative feedback once, and when I explained why, it seemed like they understood and there were no hard feelings.
I think of it like this: if the shoe were on the other foot, it's how I would want it to be handled. My feelings aren't hurt if someone is working from a script or dumps a set of five steps on me, four of which I already know, etc.
Demoing fixes (or research) is pretty key in the SRE world. Even SREs across different domains (SRE-SE <> SRE-SWE <> SRE-DB) will inadvertently question each other, especially since a lot of the problems you're solving look complex through the lens of abstraction, but once you get them down to the primitives they're quite simple.
The important part is tailoring it to your environment.
Also ask yourself if your system makes issues observable and testable. If you're spending a lot of resources on support or testing, those may be features worth adding. This is hard to quantify and defend to management, so sometimes you may just wind up working on it for your own sanity.
---
What to include in your question:
ENVIRONMENT
EXCEPTION
WHAT YOU'VE DONE to localize or investigate the problem
SERVICE NAME
URL
Before you ask "Is X down?", please take a moment to ask yourself these questions:
Is the service up? (Check Cloud Provider X, Cloud Provider Y)
Is the server up?
Which environment am I asking about?
Have I checked the logs?
Have I googled the exception?
Have I looked for code related to the exception in the codebase?
BAD: The service is down.
GOOD: When I run service ms-random locally, I get the following error.
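If you want the checklist to enforce itself, the fields above can go into a tiny structure that refuses to render an incomplete report. A Python sketch only: the class and field names are my own adaptation of the list, and the sample values are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class ProblemReport:
    """One field per item in the 'what to include' list."""
    environment: str = ""
    service_name: str = ""
    url: str = ""
    exception: str = ""
    what_i_tried: str = ""

    def missing(self) -> list[str]:
        """Return the names of fields that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Format the report, or fail loudly if anything is blank."""
        gaps = self.missing()
        if gaps:
            raise ValueError("report is missing: " + ", ".join(gaps))
        return "\n".join(
            f"{f.name.upper()}: {getattr(self, f.name)}" for f in fields(self)
        )


# Hypothetical example values, for illustration only.
report = ProblemReport(
    environment="local",
    service_name="ms-random",
    url="http://localhost:8080/health",
    exception="ConnectionRefusedError",
    what_i_tried="checked the service logs, restarted the service",
)
print(report.render())
```

A half-filled report raises a `ValueError` that names the blanks, which is the "BAD" case caught before anyone else has to ask.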
> after being told by a user that the page fails: "The user said that when they submit the form, they get a server error page." (Importantly noting the source of information)
Apparently, there are languages (such as Mongolian or Eastern Pomo) that "automatically" include that information into the sentence by grammatical markers of "evidentiality" (just as English always marks the tense (does/did/will do)).
Evidential type | Example verb | Gloss
nonvisual sensory | pʰa·békʰ-ink’e | "burned" [speaker felt the sensation]
inferential | pʰa·bék-ine | "must have burned" [speaker saw circumstantial evidence]
hearsay (reportative) | pʰa·békʰ-·le | "burned, they say" [speaker is reporting what was told]
direct knowledge | pʰa·bék-a | "burned" [speaker has direct evidence, probably visual]
Everything he says makes perfect sense, and is a conclusion I came to a long time ago - but I found that it didn't help much. In fact, I can't find the study now but I remember reading a (very depressing) psychological study that explained why trying to be precise seemed to have the opposite of the intended effect - something along the lines of "people interpret detail and precision as hostility". I've learned the hard way, through trial and error, that unless you want to make an enemy of everybody, it's better to describe a problem at a high level and then wait for somebody to ask for more detail before you provide it. It's not the way the world ought to be, but it's the way the world is.
Yes, the article is great from a rational standpoint, but much of it fails in practice. I've definitely found providing more detail rarely helps with the situation, unless the other party is willing to receive it. In problematic conversations, the latter is often not the case.
The article doesn't address the emotional problem at all, which is likely over 50% of miscommunications I've seen at work. Things like:
Junior person talking to senior person: Senior person assuming the junior person doesn't know his stuff, and simply not processing what the junior person is saying. But if a fellow senior person uses the exact same words as the junior person, he listens. Nothing in this article will help.
Person A is unsure person B will agree with him, so speaks very defensively (e.g. repeating the same stuff over and over, even though it's clear from B's responses that B understands). As a lot of communications books have pointed out, the way out of this is to have B repeat back what A is saying and ask A if his understanding is correct. This problem is so common that I default to doing this regardless of who I'm talking to.
Person A doesn't trust person B or doesn't really like person B's ideas, and will act stupid intentionally (not too common but happens often enough to need to be aware of it).
And there are others.
Being precise is the least effective technique I've found.
I have had too many discussions/arguments that were resolved as soon as it was deduced that the words over which the discussion revolved meant different things to the interlocutors. And yes, I worded it that way to make a point: using large words with precise meaning requires a level of education that, sadly, is lacking these days. Too often I need to dumb down my vocabulary, which results either in a lack of precision or in a wall of text to describe in detail something that a handful of multi-syllabic words could have conveyed more concisely.
You pretty much always have to make sure that everybody associates the same meaning with words. In politics this is often used to be deceptive. Somebody who says he is a “socialist” may mean something different than what some people are hearing.
I wish developers would put as much effort into honing their technical communication skills as they do pursuing the latest and greatest new framework or language. This is really good stuff and I wish it was a more common skill set. Whenever a new grad / junior developer asks for advice on what languages/tools they should learn because they are worried they don't know enough, I tell them the biggest lacking skill is project management and communication skills. In the long list of skills that they think they are lacking, these are never included!
You don't need communication skills to climb the ranks. You can be horrible at communicating information; as long as you are great at playing politics and telling people what they want to hear, you will be much better off than the people who are good at communicating. Sadly, most people mix up the two and think that politicians are great at communicating, but what they are great at is manipulating people, not communicating information to people. Trump is a good example of this: he has horrible communication skills but he is great at politics.
This is my experience at least. People are equally bad at communicating no matter what role they have, since communication skills don't get much appreciation anywhere in any role.
Start a blog and start explaining things you’ve learned to the world.
This isn’t enough on its own, but reframing the knowledge you’ve gained in your own words is something that is worthy of practice, and will help you translate things in a work environment.
But here’s the key: pick someone you know who doesn’t have the same knowledge you do. Preferably someone smart but not technical.
Write for that person. As you write, try to imagine the questions they’d be forced to ask about the sentence / paragraph you just wrote. Is there additional context needed, or a useful abstraction to help explain the concept?
I gradually earned a reputation as an explainer after spending a lot of time blogging for my employer. I wasn’t good at it when I started. This really boils down to: empathy for the intended audience, and a willingness to write not for your fellow developers, but for those hoping to understand you and your fellow developers.
I transitioned into a PM role once I could translate things well. It’s challenging but rewarding. I don’t blog very often these days, but credit much of my current success to the things I learned by explaining hard things to a non technical audience.
Teaching and talks. You can do talks at local groups (or even conferences!). Even if you don't actually get to do the talk, writing the submission and preparing the talk help a lot.
Writing the talk lets you practice. Giving the talk lets you get feedback on how effective it was.
In the book/movie "The Giver", I remember the main character's mother saying "Precision of language." It was an instruction/request to be clearer and use less vague terms.
I wish we could say that, or something similar, to people without them getting upset. Implying that someone is communicating poorly and needs to improve is often dangerous. Instead you have to do a whole song and dance to say "I'm sorry, I'm not understanding. When you say X did you mean Y?"
I have a bad habit of using overly complex sentences to avoid irrelevant semantic edge cases, even when a bit of hand waving would get the job done just as well. I'd love it if the people I talked with were less worried about being polite and more worried about tuning the conversation to the right level of precision.
If we're going to have an article about precision, especially when discussing misattribution of actions to actors, perhaps the example "When the pod uses too much memory, kubernetes will shut it down" should be omitted. In virtually all cases it is the kernel which terminates tasks in a control group that has exceeded its memory limit, not the kubelet.
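One way to keep the actor straight in a report is to quote what the API recorded instead of paraphrasing it. A Python sketch, assuming a container status dict shaped like one entry of `status.containerStatuses` in the pod JSON. The helper name and the sample data are made up.

```python
def describe_last_termination(container_status: dict) -> str:
    """Summarize the last recorded termination of a container, if any."""
    terminated = (container_status.get("lastState") or {}).get("terminated")
    if not terminated:
        return "no previous termination recorded"
    reason = terminated.get("reason", "unknown")
    exit_code = terminated.get("exitCode")
    if reason == "OOMKilled":
        # Exit code 137 is 128 + 9, i.e. the process received SIGKILL.
        return (
            f"container was OOM-killed (exit code {exit_code}): "
            "it exceeded its memory limit and was terminated"
        )
    return f"container terminated: reason={reason}, exit code {exit_code}"


# Made-up sample, for illustration only.
sample = {
    "name": "app",
    "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
}
print(describe_last_termination(sample))
```

Note that the summary says what was recorded and leaves out who did the killing, which is the part people tend to guess at.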
A lot of truth in this but most of us are not really taught how to discuss or describe. Philosophy I guess.
How many times have I had to ask a Support Tech to go back and get basic details? "Not working"? How? What are they expecting and what are they seeing? What time did it happen? Can they screen grab the browser console? All basic stuff that can help most of our fault-finding.
The fundamental problem is that words are an approximation, a low resolution model of the actual thing. The actual thing is the code, the schema, the configuration, etc... but because we need to get on with our lives we use low resolution models.
Then the problem is, how to make sure that when I say "user", the people I talk to understand that I mean the "User domain object" and not "a human using the system" or the "user_table".
There really is no way around the need for calibration, making sure that everyone has a more or less similar mapping of word => actual thing.
How much calibration is needed, and how to go about it without wasting time or offending people, is tricky.