I have a degree in computer science, but at the end of the day it's just a piece of paper - I learned programming mainly through the internet. I talked to people who cared about programming enough to hang out on IRC, or write technical blog posts, or write their own libraries, etc.
What has shocked me since entering the "real world" is how little most other programmers I meet care about programming, their craft, code quality, or learning new things. I was legitimately shocked at how good I was compared to the real world population, as opposed to the "internet programmer" population where I was mediocre at best.
This isn't a value judgment - not everyone is passionate about their day job. But it might resonate with someone out there.
One time, when leaving a company, I helped interview candidates for my old job, since I could best tell what skills are needed and to what extent.
I reckoned at the time, from most interactions on the Internet, that I was (globally) a mediocre, if not below-average, developer - in every discussion I participated in online, there were always people above my level, from the ones having a slightly better understanding of the subject to the people whose depth of knowledge appeared somewhat otherworldly.
This belief was particularly strengthened by the fact that I didn't have much contact with real-world developers in real life, as my friends tended to come from different backgrounds.
When the interviews came, I was astonished how many of them couldn't code their way out of a paper bag. The ones who did, wrote code with messed up indentation, inconsistent style, ugly and hackish approaches, and the like. I couldn't imagine how someone could leave their code like that and say "there, this is my finished product." But here they were, and it seemed like they didn't really see anything particularly wrong, or just not important enough to justify the added effort of cleaning the mess up. It was a very strange moment for me.
I believe this to be an example of the classical selection bias: people inclined to participate in online discussions, answer questions on StackOverflow, write blogs and type out detailed technical comments on HN are on average a lot more skilled than the average just-earning-my-living developer fresh out of some half-assed "be a software engineer and make tons of money!" three-month course.
There is another very strong bias that goes on when hiring, though. Take 9 competent developers who tend to get offers and 1 that can't code their way out of a box. That 1 could very easily do more interviews than the other 9 combined before landing a job (and will be back on the market sooner to boot) and interviewers might conclude them typical and the other 9 atypical even though the reverse is true.
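A toy back-of-the-envelope version of that argument (every number below is invented purely for illustration):

```python
# Toy arithmetic: 9 competent developers who each need ~2 interviews to land a job and
# then stay ~3 years, versus 1 weak developer who needs ~20 interviews per search and
# is back on the market every year.
competent_interviews_per_year = 9 * (2 / 3)   # ~6 interview slots per year in total
weak_interviews_per_year = 1 * (20 / 1)       # 20 interview slots per year

weak_share = weak_interviews_per_year / (
    competent_interviews_per_year + weak_interviews_per_year)
print(f"{weak_share:.0%} of interviews come from the one weak candidate")
# -> roughly 77%, even though weak developers are only 10% of this toy population
```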
I worked with someone once who had a really sloppy/inconsistent code style, but the code he wrote was rock solid.
His subsequent (and current) job is as a very well-known expert and developer advocate for one of the big four tech companies.
You'd be surprised at what you find developers out there doing. The only good metric of whether someone is a good developer is whether they write fundamentally sound code. Literally nothing else matters.
In Zen and the Art of Motorcycle Maintenance the author puts forward two types of welders. One is the type that excels at really complex and difficult welds, but isn't very good at making simple welds in repetition. The other might get tunnel vision when trying to figure out an unusual weld, but has the discipline to produce consistent, flawless welds in great quantity.
It might be explained by what is known as the Yerkes–Dodson law[1], which describes how performance increases with mental arousal up to a certain point and then decreases. If different types of tasks correspond with different arousal for different people, then the type of welder or developer one is is orthogonal to their fundamental skill. Instead it would mean that the first type needs to cast the problem as something novel in order to gin up the necessary level of concentration to get the problem done, hence the rock solid code that has sloppy/inconsistent style.
It makes sense that a good advocate would be highly skilled, but the type who has difficulty with necessary work that seems mundane. In other workplaces this might be the person who seems to have low output, but gets their coworkers past their roadblocks.
The term Big Four originates in accounting where there actually are a specific set of four large, established companies.
For tech, I usually remember it as AmaGooBookSoft - Amazon, Google, Facebook and Microsoft. Apple is sort of a hardware company so sometimes it is included and sometimes it is not.
I've always heard it as Google, Facebook, Apple and Amazon. Microsoft isn't very sexy to the typical developers that participate in the startup/Silicon Valley "scene" and its accompanying communities (HNews included).
Big Four almost always includes Google and Facebook, but the other two slots seem to vary a bit based on who is saying it. I've even seen companies like Yahoo and Twitter thrown in there.
Honestly, I feel like we should have gone further and decoupled display and serialisation formats of code a long time ago. (Although I understand why we haven't.)
A lot of people without much programming experience seem to really struggle with nesting. We take it for granted, but I suppose it actually involves some reasoning about evaluation order of arguments.
For those people, persuading them to indent can be the difference between them having a chance of learning the knack or just flailing randomly, getting frustrated, and giving up.
(The other lesson is: if you're building a DSL for non-programmers, try to avoid putting in nesting and make everything really flat instead.)
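As a tiny illustration of why nesting trips people up (plain Python, nothing DSL-specific), here is the same computation written nested and then flattened into one step per line:

```python
# Nested: the reader has to reason inside-out about what gets evaluated first.
print(sorted(set(len(word) for word in "the quick brown fox".split())))

# Flat: one named step per line, read top to bottom.
words = "the quick brown fox".split()
lengths = {len(word) for word in words}
ordered = sorted(lengths)
print(ordered)   # both print [3, 5]
```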
Emacs does the same with C-x h TAB, or you can mark a region manually and reindent its contents with TAB.
As with vim, this depends on being in the right major mode for the language you're working with, which should be automatic for most languages (except Perl, which is screwy, because of course it is), and depends also on that major mode knowing how to cope with the code and being configured the way you want, which should be reasonably trivial for most languages (except Perl, which is screwy, because of course it is).
Please don't do that sort of thing -- unless you give the candidate a piece of code and specifically ask what could be improved.
But don't expect candidates to spontaneously criticize the interviewer's code. Yeah, I know, you want to work with people who can think critically etc, but at the time of interview you aren't coworkers and aren't on equal footing.
Personally, if I as a candidate see you writing code with bad indentation I will conclude either (a) you were making a point about e.g. correctness or efficiency and didn't bother to indent because it's literally throwaway code, so mentioning the indentation would be a distraction; or (b) you and your team don't care about indentation at all.
i just create an editorconfig file <http://editorconfig.org> in the root directory of projects; saves having to remember to switch tab settings across them (there might be another plugin with more functionality but it does what i need it to)
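for reference, a minimal .editorconfig might look like this (the values below are just an example, not a recommendation):

```ini
root = true

[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
indent_style = space
indent_size = 4

[Makefile]
indent_style = tab
```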
yeah, i meant for the tab issues specifically since others aren't as easy to miss
hopefully someone will get around to adding more functionality to the plugin so i don't have to edit my visual studio formatting settings for parentheses/comma spacing whenever i switch projects :p
Yeah, I don't even have a CS degree. I was considering myself below average as long as I was working mostly alone or on specific domains, for I could not compare with others IRL. I could only read people on Usenet or on the Web and they often looked 10 times as knowledgeable as I was. Also, I could see that other professionals listed impressive job titles and definitions, so I was, yes, really impressed and couldn't see how I could ever match those advertised skills.
Then I got a job in a bigger company, where because of the nature of the job, I could easily compare with other people in a large team, in the rest of the company, and in other companies; and I quickly noticed I was in the top 10% or perhaps even 5%. That was at the same time very surprising and... very appalling, because that meant that all the big titles of those guys were just kind of fake, empty. It felt a bit as if I had been deceived. Even though they had CS degrees, they had vast holes in basic programming know-how and in computer knowledge, and they didn't give a fuck about it, as little as they cared about doing clean work.
So the people I used to read on Usenet or the Web were indeed exceptions and not the norm, and sometimes pretty much gurus in their domain. Anyway, they were not at all representative of the majority of developers that populate companies.
> I quickly noticed I was in the top 10% or perhaps even 5%. That was at the same time very surprising and... very appalling, because that meant that all the big titles of those guys were just kind of fake, empty.
Keep in mind that following that reasoning, people in the top 1% would feel that your title is fake and empty. Would you then say that it's a fair perspective?
If that supposed 1%er (all else being equal like negotiating skill, personality, etc.) is making less money or receiving less respect, then yes, that's a fair perspective.
In theory yes, but the world isn't strictly meritocratic. Age, school (or lack thereof), social connections, height, attractiveness, clothing style, voice, accent, social class, birthplace, medical conditions, and other non-work-related attributes can still affect a person's achievements.
I agree with everything you say, but would also like to add that the level of passion and 'pride' which programmers have seems to vary by specialization. I work in embedded systems, where having well-factored code, with well thought-out scope is the anomaly, not the rule.
> I work in embedded systems, where having well-factored code, with well thought-out scope is the anomaly, not the rule.
I had always romanticised embedded programming as being the domain of tightly written, well thought out code. I figured there was much less "room for error", so there'd be more scrutiny per functionality. That's been your experience across a few companies?
Embedded software can be very tightly written, with large sections of code written in assembly to save a few instructions, and other portions of code loaded into RAM for faster execution.
There are also quite a few programmers who couldn't 'cut it' in web programming, and went over to embedded, where they wreak havoc on small low-cost devices, and another group of programmers who somehow 'fell into' embedded programming many years ago, and have not updated their skills since.
I have honestly seen the exact same thing. I was rating myself quite average or just below average. I found actually that as a web developer, if you have a general knowledge of lower-level computing and can produce readable and maintainable code, you are already in the top 10%.
This agrees with my own experience. I learned programming with people that would do things like write their own graphics library, in Assembly language, or device drivers, just because they wanted to see if they could. I thought every programmer was like that in the professional world - with rare exceptions, they were not.
When I came out of college, some real-life examples of people who did not seem to care at all shocked me:
- CTO of a company I worked for (brother in law of the company director) typed with 2 fingers. No Joke
- Senior software developer who, to nullify a list, would just create a "new list" instead of setting it to "null" (see the sketch just after this list).
The end result, from a practical point of view, is the same, but the lack of computer science insight baffled me.
- Team lead implements lines of code as a tool to measure productivity.
Being a new kid out of school, I felt too young to call bullshit but intuitively, I felt the metric was flawed.
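On the "new list" vs "null" point above, here's a rough sketch of the difference (in Python, since the original language wasn't stated):

```python
items = ["a", "b", "c"]

items = None   # drops the reference entirely; anything that later iterates items will blow up
# versus
items = []     # allocates a fresh empty list; len(items) == 0 and iteration still works
```

Both "clear out" the variable, but one keeps a usable list object around while the other says "there is no list at all" - which is the kind of distinction the parent means by computer science insight.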
One thing to keep in mind is that there is a difference between a metric and a measure. A measure is basically exactly what it says on the box: it's something that you have measured. A metric is a measure that you use to drive process changes.
LOC is a good measure, but a truly terrible metric. The reason it is a terrible metric is that it is easily gamed (both consciously and unconsciously). The same functionality can be implemented in 1 line or 100 lines. The amount of time it takes to write 1 line can be more than the amount of time it takes to write 100 lines. You get the point.
LOC can, however, be a good measure. You can test this yourself if you like. Go through an old project (something with a year's worth of code). Make a list of all the features that were implemented in that year. Estimate the amount of work for each feature (or use whatever estimates you had before). Make a rolling average of "amount of work" over each week. Now go into your code repository and make a rolling average for the number of lines of code changed each week (so basically just add the lines added to the lines deleted).
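A rough sketch of how one might pull the weekly LOC numbers out of a repository for that comparison (the `git log --numstat` parsing and the window size here are my own assumptions, not the parent's exact method):

```python
import collections
import datetime
import subprocess

def weekly_loc_changed(repo="."):
    """Sum lines added + lines deleted per ISO week, parsed from `git log --numstat`."""
    out = subprocess.run(
        ["git", "log", "--numstat", "--pretty=format:commit %ad", "--date=short"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    totals = collections.Counter()
    week = None
    for line in out.splitlines():
        if line.startswith("commit "):
            date = datetime.date.fromisoformat(line.split()[1])
            week = date.isocalendar()[:2]              # (year, week number)
        elif line and week is not None:
            added, deleted, _path = line.split("\t", 2)
            if added.isdigit() and deleted.isdigit():  # binary files show "-" here
                totals[week] += int(added) + int(deleted)
    return [count for _, count in sorted(totals.items())]

def rolling_average(values, window=4):
    """Trailing rolling average, to smooth week-to-week noise."""
    return [
        sum(values[max(0, i - window + 1): i + 1]) / (i - max(0, i - window + 1) + 1)
        for i in range(len(values))
    ]

# Compare rolling_average(weekly_loc_changed()) against the rolling average of your
# per-week effort estimates and look for points where the relationship shifts.
```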
I think you will find that there is a correlation between the average estimated effort and the average change in LOC every week. This is because over the life of a project, developers tend to use the same kinds of techniques, the same idioms, etc.
Occasionally, you will find that the relationship between estimated effort and LOC changes dramatically. In my experience, you will discover that this corresponds with a change in development mentality. Something has happened on the team that changes the way they write code.
So it's a good measure for that kind of thing. For example, I once had to deal with political problems from upper management where they were convinced that the programmers were not focusing on their job. I could show them very easily from graphs of estimated effort and lines of code that it was very much business as usual.
All this to say that if your team lead is encouraging you to increase your number of lines of code out the door, then it is indeed a flawed metric. But if the team lead is measuring productivity with lines of code, it may very well be a decent measure. You can't use it to affect productivity, but you can use it for other purposes.
"Being a new kid out of school, I felt too young to call bullshit but intuitively, I felt the metric was flawed."
You should have said that to the team. Especially since, if anything goes wrong in the project, it's usually the new person on board / the junior programmer who gets the blame.
Besides, not saying anything was just not honest towards the team. I think we should all learn from each other, regardless of job title.
This is interesting. I do think there can be a problem around the concept of "competent" vs "average". For instance, consider a poll like this: On a scale of 1-10, where 1 is inexperienced, 2 is competent, 3 is above average, and 4-10 are various levels of good-brilliant, where do you rank yourself?
My guess is that even with these explicit instructions, people are reluctant to rate themselves a 2. I can see a few reasons why higher scores would actually indicate that people see themselves as average. For instance, some people might view anything under 5 as incompetent. In this case, a 7.5 would indicate the middle of the range, which is actually 5-10. Alternatively, people might think of grades in US-based schools, where 75% is a C, technically "satisfactory", but essentially failing in an era of grade inflation. In many PhD programs, an A- (91%) is a shot across the bow, and a B+ (88%) is meant as a vote of low confidence. This could lead many people to rate themselves as an 8 or 9 simply to indicate their averageness.
Lastly, programming is still a young field with a wide variance of talent, but I think we need to be more of a "paper hat" distribution than a "pyramid". To explain, I don't think we're a field that works well with a large, unskilled base that slowly narrows to a tiny elite at the top. I think the baseline to be a competent developer is quite high. I know, I read about these "CRUD" apps, and what can I say? The framework churn, the challenge of communicating with clients who understand business but aren't familiar with the unforgiving logic of a program, the need to deal with deadline pressures under uncertainty, the challenges of data integrity and testing coverage... honestly, the effort involved in simply and reliably deploying a relatively simple web app is considerable, in my opinion. This is why I go for the "paper hat" distribution - devs are largely on a long, flat curve, with a small little mini-pyramid in the middle, where the true innovators live. In short, maybe we are a field represented by a wide, short rectangle of 1's-3's, with a minuscule, steep and thin little triangle of 4-10s popping up at the top. However, the 1's really are smart people who are learning, and 2-3 represents quite a valuable and talented developer.
This seems like a really significant issue. First, because "5 as median" isn't a universal grading scheme, and 50% is often failing. Second, because this poll didn't specify what the numbers mean.
The bounds of "programming ability" are basically undefined. Does 1, being the lowest, mean "cannot program"? Does 10, pain-scale style, mean "best programmer imaginable"?
Or do the numbers represent deciles of skill? If so, do they include everyone who has written code? Only people who have completed a substantive project? Only people who have been "programmers" in a non-hobby setting? Only in a non-hobby, non-academic setting? Only people taking this quiz?
And worse, it's a meta question. We aren't guessing what the survey meant here, we're guessing what the respondents thought it meant - which means that the skew may be totally accurate under their chosen boundaries! Keynesian beauty contest and all that.
So this is an interesting result, but I can't really use it as data about self-reported skill, only about interpretation of surveys.
My own perceived rating significantly lowered with the rise of social networks. As you start to follow more closely some really great programmers out there, you stop comparing yourself with your peers and start comparing against the best in the game... and that is a self-confidence killer :)
Move jobs a few times, inherit some crappy application and look at the code that others write, that has done the opposite for me.
Though a lot depends on what you define as being a good programmer. I used to be a lot better at clever algorithmic stuff. Now my focus is on clean maintainable code and getting the higher level architecture right, so that I don't need clever algorithmic stuff.
"I used to be a lot better at clever algorithmic stuff. Now my focus is on clean maintainable code and getting the higher level architecture right, so that I don't need clever algorithmic stuff."
This.
'Clever' is BS.
There's way too much 'clever' going on in high tech, and way too many people interviewing for the wrong skills.
I used to be very clever as well; today, I don't think I could get a job at Google (not that I care) - but I think those interview styles are upside down.
These days I look at Eng. more like construction or plumbing: most of it is not rocket science, really. You don't want new or fancy anything unless you have to.
Good code should be boring and read like English. There should be nothing special about it, unless the problem space really calls for it.
The paradox is - simple code is not psychologically impressive, at least to some people.
I'm actually trying to develop an interview method that will allow me to measure this.
Clever is expensive, dangerous, and usually not worth it.
In fact - if you're trying to do something clever, it might be a sign that something is wrong. There is probably a library for that.
One caveat: it's good to have done a bunch of clever but not-so-useful things in the past, because when I use a library for something, I know what's going on - that's worth something.
Sometimes I feel that first 2 years in dev is basically 'training' - and I wonder if someone with < 2 years experience should even be writing production code.
I must admit I am wary of this attitude. I've worked at places where lambdas are clever, or generics are clever, or writing constructors that take arguments is clever, using LINQ on collections is clever, refactoring copy-pasted code into methods is super clever, etc etc.
One man's clever is another man's blindingly obvious. Should we always work to a lowest common denominator?
Yes. I have worked in similar places and it is maddening. "Sorry, you can't use that highly idiomatic construct because an inexperienced person might not understand it straight away. Ignore the fact that someone experienced is going to wonder why the hell it is done the naive way"
"One mans clever is another mans blindingly obvious. "
I think that when, in hindsight, something looks really obvious, it was probably the right thing to do.
I don't think lambdas or anything that specific are clever.
Every tool has its use.
Here's an example of 'bad clever' - writing an optimized 'find' function which may very well be a little faster than the lib 'find' - but which will have absolutely no impact on the software.
I find that 'performance' is one of those areas that tends to be way over-engineered right off the bat.
Because of the historical (and current) performance constraints of systems, we software engineers have a "built-in" impetus to want to make things perform better.
There are very, very few things I will allow myself to optimize from the get-go. Even now I have to urge myself not to do this. Code for clarity, then do the tweaking where necessary, because it's nigh impossible to tell where the real bottleneck will be.
Does the choice have to be about the lowEST common denominator? Maybe a good criterion is to code for a lowER common denominator, just to account for the Bus Factor, maintainability, and so on. The lowest common denominator that makes sense, rather than playing denominator golf or disregarding the fact that you're coding for the benefit of your colleagues, too.
I disagree. Rather than coding for the benefit of someone who doesn't know the language very well - people should learn the language. If the company is hiring maintenance programmers who don't know what lambdas are - again, their problem.
I feel like we should have higher standards, and use the abstractions available to us to grow the code towards the problem. Instead I see people writing long, sprawling applications that use the same low-level library routines everywhere.
Maybe it's a quantity vs quality thing. And quantity is much harder to hire for, so my ideas lose out.
It's about common sense. I'm not saying your former place of employment is full of idiots, but some people hold fast to the "that's just the way it's done" paradigm. None of what you listed is clever. Those are basics, and I think you'd be hard pressed to find people who disagree that those are basics.
It will vary across languages. Someone who only knows Java 7 (or 8 without the use of lambdas) is going to find 4 chained anonymous functions in JavaScript clever simply because it's unfamiliar. Language-specific constructs, or something from an unfamiliar paradigm (think functional programming for many new grads), will seem clever. Breaking 500 lines of continuous code into manageable methods is what everyone should be striving for.
There's the type of cleverness that shows off too complex a solution or intricate knowledge of a particular system in a less portable way. It's fanciful and surprising and is a mess to maintain later. People who employ that cleverness for anything other than toy example code meant as a curiosity are employing the bad sort of clever programming.
Then there's the sort of cleverness that makes a leap that becomes a new idiom over time. It's not immediately apparent before you've seen it, but it's deceptively effective and looks really obvious in hindsight. Some of those idioms are local to an organization, or vertical to an industry or language community. Some are so elegant that they jump from one community to another. The "Orcish Maneuver" or using !! to force a variable into a 1 or 0 to approximate a boolean are some examples that come to mind. Yes, someone was doing something clever the first time something like those was done. When an example is seen it's simple enough to recognize and understand, though. It turns out to be useful and becomes a common, well-understood, almost straightforward way to do things. That's the sort of cleverness to which one should aspire. That sort of obvious-in-hindsight elegance is what's truly clever.
I think familiarity is a key factor here, and I think your examples highlight that.
!! seems slightly arcane to me, and I'd rather write some function like to_bool() to make the intent clearer, but I imagine if you're used to it !! is immediately obvious, and to_bool() would be confusing because it's not !!. (And conversely, if to_bool was part of a standard library and widely used, !! would seem arcane.)
I don't think either technique has any real practical advantage, and the real trick is writing your code for the people who need to read it.
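For what it's worth, the same trade-off can be written out in Python terms (just an analogue for illustration, since !! itself lives in C/Perl/JavaScript rather than Python):

```python
value = "non-empty string"

terse = not not value    # the closest Python analogue to the !! idiom
explicit = bool(value)   # same result, with the intent spelled out

assert terse is explicit is True
```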
This! So much this! (Unfortunately my single up-vote is not enough to help you sir.)
Unnecessary complexity is the godmother of all evil (premature optimization included). The goal is to use means, code, idioms, etc. that are as simple (to understand and to maintain) as the problem you are solving allows. I know fancy stuff and I know when not to use it (i.e. almost always), and I value simple, robust code much more than fancy, fragile crap.
It seems like more people are trying to solve easy problems by complex means than people trying to solve difficult problems by any means. The sad thing is that by using unnecessarily complex means, even the simplest problem can be obscured beyond the mental capacity of anyone. Welcome, legacy spaghetti mess.
I agree in principle, but "boring" is in the eye of the beholder. Some developers don't understand "map", that doesn't mean you should replace every `.map` with a for-loop.
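A tiny illustration of that point (Python, arbitrary numbers): the same transformation with map and with an explicit loop - neither is "clever", familiarity is the only difference.

```python
prices = [9.99, 3.50, 12.00]

# With map: compact, idiomatic for anyone used to it.
with_tax_map = list(map(lambda p: round(p * 1.2, 2), prices))

# With an explicit loop: longer, but familiar to everyone.
with_tax_loop = []
for p in prices:
    with_tax_loop.append(round(p * 1.2, 2))

assert with_tax_map == with_tax_loop
```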
Down-vote me if you like, but I'm speaking from a lot of experience, many companies, many projects, many languages, from 'real time / embedded' C w/machine code + custom ASICS, all the way up the stack to the highest level of abstractions + UX.
Clear and boring code written by people who know what they are doing is worth its weight in gold.
There's different dimensions of clear and boring, however. I've had to wade through a lot of clear and boring code that was mostly fashionable OO ceremony; the verbosity of it obscured what was going on.
New abstractions need to be introduced sparingly. Building new kinds of abstractions is one of the easiest ways that "cleverness" gets into a codebase, and if the abstractions don't pay their way, they just obscure. Often they aren't even necessary at all (indirections that only ever point to one thing).
Yeah, I've definitely developed an aesthetic taste for code cleverness. Like, one time I was writing some python and using third-party code that expects a list of user-provided classes. It was really annoying and I kept passing it a bare class which made things fail. I ended up making a class that behaves like a list containing itself, and it completely "fixed" the problem. That really shouldn't be production code, though.
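A hedged reconstruction of what that hack might have looked like (the third-party library isn't named above, so the names here are invented):

```python
class _ListOfSelf(type):
    """Metaclass that makes the *class object* behave like a one-element list of itself."""
    def __iter__(cls):
        return iter((cls,))
    def __len__(cls):
        return 1
    def __getitem__(cls, index):
        return (cls,)[index]

class MyPlugin(metaclass=_ListOfSelf):
    """The user-provided class; the bare class can now go where a list of classes is expected."""

# Code that expects a list of classes happily iterates the bare class:
assert list(MyPlugin) == [MyPlugin] and len(MyPlugin) == 1 and MyPlugin[0] is MyPlugin
```

Which, as the parent says, is exactly the sort of thing that shouldn't end up in production.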
Although I'm not a fan of it, if you are doing "clever" coding, I urge fellow coders to please write some test cases for us "boring" programmers. It helps the rest of us get up to speed and maintain it.
I think the crux of the problem is: brilliant developers exist (and we tend to follow + see their work more often), but sub-mediocre developers also exist and are way more abundant. If you're looking at a pre-existing code base at work, it's more likely it was created and maintained by the latter rather than the former.
I follow great developers via social media/news and always tend to compare myself to them too much. That leaves me with the impression I'm a failure. But then I just have to go to a local tech meetup (in one of the biggest cities in the US no less) to be facing people with an appalling lack of knowledge, who like to discuss platitudes as if it's the second coming, and who really barely know what the hell they're doing and yet think they're the most clever people on earth. The feeling of wasted time is often an infuriating experience. Many times I've had to excuse myself during Q&A rounds after a presentation, given the outright ignorance displayed by people asking questions. Still, that's one experience that serves to show me I'm probably doing something right.
I know I sound very negative and judgmental, so apologies. My point is, that's one thing I always recommend to people: go to local tech meetups. You can see how you "compare" there.
I think this is a selection bias. In my experience most of the people attending meet ups are people trying to learn about the topic. They view it as a free class, way to network into a job using that technology, or something similar.
"Move jobs a few times, inherit some crappy application and look at the code that others write, that has done the opposite for me."
I remember when some of Microsoft's Windows code leaked I took a look and realized that these guys are just regular programmers like you and me. There was really no magic there.
I am sure there are brilliant programmers out there but often they work in a nicely defined environment they can actually control instead of having to deal for weeks with incompatibilities of several systems like many of us.
I'm always worried when I feel too good at something, I can't help but wonder whether there's an entire world of knowledge out there I'm not aware of (there usually is).
I also noticed my peers tend to consistently rate me 1-2 points higher than I do rate myself. I'm unsure if this is a good thing, if my peers are overestimating me, or if I'm underestimating myself.
I approach your last point differently however; I find it inspiring to look at people being at the top of their fields; makes it easier to plan a route to ramp up your knowledge to that level as the work has already been done.
That 1-2 point gap crops up in a lot of settings - not just programming, but writing and public speaking and many others.
The common element seems to be hidden process followed by a finished product. Your peers see the code you write (or essays, or speeches, etc). You see the writing process, with all its inevitable inefficiencies and foolish mistakes. No one sees Obama stumbling over his speeches as he gets dressed in the morning, they just see a skilled final delivery.
So without more context, I would guess that you and your peers are rating different things. You're rating every stage of your development, and they're rating the results, or at least the commits.
It's likely that being a better programmer shifts your rating scale down in general. Becoming better at anything doesn't just make you better at doing the thing - it also makes you better at seeing mistakes and areas for improvement. Like, I remember watching jazz education videos by Hal Galper, and one of the things he mentions is that there's a moment where you start feeling like everything you do is terrible, and that this means that you're getting way better because you're now hearing the things you've always disliked about your playing.
Social media is deceiving in both directions: I have worked with a lot of nobodies and with big names with 10K+ twitter followers. Some of those well known programmers know a lot of trivia about obscure parts of computer science, but that doesn't mean that they are really all that productive when you look at their job contributions as a whole.
Other times the fame comes from being a big external communicator: giving a lot of talks about programming doesn't mean that you can code quickly or accurately, or that you can come up with good designs.
When we look at what it is for someone to be a great programmer and a good teammate, there's a whole lot of characteristics that are extremely hard to measure in a job interview. Do we really think it's easier to measure them by just hearing someone give talks or reading some articles? Do you really know that your favorite speaker writes good tests, cares about code readability, or will be there for you whenever you have a technical problem? Will they even work as much as you do, or do they have a contract that says that their speaking is their job, therefore leaving your team behind to do things less famous people have to do, like being on call for cyber monday, or making sure the new shiny system is actually operable in production?
Therefore, my personal experience working with a lot of the socially famous has given me a lot of respect for the great developers I have met who spend less time on self-promotion and more time working on their craft and taking care of their job and family obligations. I've worked with many who are as good as, if not better than, the average famous developer I've worked with. You just don't know about them.
If you're a little self-aware you can use the comparison as a source of inspiration and drive to better yourself instead of turning it into a self-confidence killer.
Even more important than that... is that you are comparing yourself (presumably new/newish) to those who have spent years? decades? getting where they are.
Behind every successful person is years of hard work. Behind every success is an untold number of failures and learning experiences.
You can't compare yourself to someone who has been making games for 20 years when you're learning.
> I normally dislike working with survey data since there is a high possibility of selection bias among the respondents. [..] For this reason, I will show confidence intervals whenever possible to reflect the proportionate uncertainty for groupings with insufficient data [..]
That... is not how statistics work(?). I mean – confidence intervals help with small sample sizes, but they do nothing for systematic errors such as those introduced by selection bias.
> [continued from above] and to also account for possibility that a minority of respondents may be dishonest and nudge their programming ability a few points higher than the truth.
There's a surprising amount of assumptions that went into this sentence. I'd question the assertions that:
- people are "dishonest" (My intuition would point to a subconscious bias more than actual dishonesty)
- It's a minority (The second chart shows that >50% of respondents with one year of experience or less rate themselves as better than average).
- That subconscious biases only work in one direction
... and once again I have no idea how confidence intervals can help. A large interval may indicate bad measurement. It may also indicate high variability in the actual data.
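To make the confidence-interval point concrete, here is a quick synthetic example (all numbers invented): with a self-selected sample, more data just buys you a tighter interval around the wrong answer.

```python
import math
import random

random.seed(1)
true_mean = 5.5
population = [random.gauss(true_mean, 2.0) for _ in range(100_000)]
respondents = [x for x in population if x >= 5.0]   # crude stand-in for self-selection

n = len(respondents)
mean = sum(respondents) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in respondents) / (n - 1))
se = sd / math.sqrt(n)

print(f"biased sample mean = {mean:.2f}, "
      f"95% CI ~ ({mean - 1.96 * se:.2f}, {mean + 1.96 * se:.2f})")
# The interval is razor thin, yet nowhere near the true mean of 5.5.
```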
> Also, keep in mind that these groupings alone do not imply a causal relationship between the two variables.
... someone paid attention in his middle-school statistics club...
> Employing traditional regression analysis to build a model for predicting programming ability would be tricky: does having more experience cause programming skill to improve, or does having strong innate technical skill cause developers to remain in the industry and grow?
... but failed statistics 101. A regression analysis doesn't care about causality. If "Mac users are more likely to be college-educated" it doesn't matter that "buying a Mac may not actually make you smarter". I can still make the prediction "a given Mac user is more likely to have a college degree".
Microaggressions aside, these are fair counterpoints. I spent far less time editing the body of the post than optimizing the visualizations/Jupyter Notebook (especially in this particular post). I've taken more care in future posts since.
If you are looking for feedback, here is another suggestion for future posts: I found the use of violin plots for discrete data to be confusing. To be honest, I still am not sure how to interpret the unlabeled Y axis. I think a histogram would have been easier for me (and others) to interpret.
But suggestions aside, I found your article to be interesting. Thank you for it.
I second the criticism of using violin plots here. A violin plot is a kernel density plot. It is designed to show a distribution at a scale much larger than variations in the data. Its raison d'etre is to smooth out and aggregate these small-scale variations.
But in this survey data, the values in the distribution are spaced far apart. The discretization is so large that the violin plots show meaningless and weirdly inconsistent curves between x-values which actually have data. A bar plot would be much clearer.
OP, I think you have done a fine job of styling your plots tastefully. But I recommend taking another look at the visual language you have chosen to communicate the data.
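For what it's worth, a minimal sketch of the suggested alternative (matplotlib, with placeholder counts rather than the survey's actual data):

```python
import matplotlib.pyplot as plt

ratings = list(range(1, 11))
counts = [2, 4, 10, 25, 60, 90, 120, 95, 40, 12]   # placeholder counts, not real data

plt.bar(ratings, counts)
plt.xticks(ratings)
plt.xlabel("Self-rated programming ability (1-10)")
plt.ylabel("Number of respondents")
plt.title("Discrete ratings as a bar chart instead of a violin plot")
plt.show()
```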
There are many domains of expertise: networking, graphics, sound, threads, filesystems, embedded platforms, compilers, cryptography, operating systems (which are in themselves another world), distributed systems, databases, drivers, high performance computing, machine learning, etc.
Ask a senior network programmer to write you a shader and ask a graphics programmer to write you a production-strength multithreaded socket server...
Ask a senior web designer to write you a compiler from scratch, and ask a senior compiler engineer to write you a responsive website that renders correctly on all browsers.
Then, some programming abilities are rarer. There are more web developers than compiler engineers; the number of jobs and the requirements for them are different.
What I am trying to get at is that software development comes in many flavors and it's really hard to touch all those surfaces in your career.
Looks exactly like driving competency. Over 93% of US drivers self-rate their driving as above average (even significantly above average). It's very frightening to see that 36% rate themselves as above average drivers whilst sending text messages.
I get asked this a lot, and if I tell people "If Carmack and Torvalds are a 10 and someone who can write a hello world is a 1, then I would rate myself 5 max", most just think I must be a bad programmer because of that rating.
Well even with those bounds, is that scale in terms of the population as a whole, some measurable notion of productivity, or in terms of some abstract concept of absolute skill? Depending on the definition, 5 could mean many different things.
Not to mention that there are different aspects of skill. Is Torvalds a good functional programmer? (I actually have no idea, so this might be a bad example, but hopefully I make my point.)
The consistent trend across unemployed / part time / full time is a little strange, considering that many programmers value time to work on their own projects and can gain new skills that are valuable to an employer. It seems like HR and cultural attitudes about employment are large factors in programmers' view of skill.
The number of people that picked "never check in or commit" or only do it "once or twice a month" but that also rate themselves as a 7+ makes me a combination of sad and irritated.
Interesting analysis. I do wonder what a self rating for developers actually means: how do you define the difference between a high rating and a low rating? What is a great developer vs an average one?
The definition of competence wildly varies based upon who you ask. There are many experienced developers in the work force, whom I am sure would rate themselves high, who cannot do their jobs (at all) without their favorite frameworks and abstractions. Epic fail. They might rate themselves with an 8, but I would give them a 3.
I am sure they likely do, which only suggests developers' opinions of stylistic qualities and tool choice are an extraordinarily weak metric of competence. A better and more objective metric is an evaluation of the product: speed of writing the code, execution speed, fewer dependencies, fewer lines of code, and so forth. From that you can form an opinion that better software is written by better developers. Asking developers to rate each other, or themselves, is next to worthless.
Just the hiccup in the salary/ability plot at 160k-170k makes me wonder what's going on there. It might just be inaccuracy due to small sample size, as this range is probably an unusual salary. I guess people tend to gravitate more towards round numbers like 150k and 180k when negotiating salaries.
i'd like to see the same group of people asked to pick a random integer between 1 and 10 inclusive, and compare that distribution to what we see in the first graph.
Nice data analysis. Definitely a selection bias with respect to self-assessment - I am sure many are comparing to 'programmers' they know in the workplace who are not particularly strong and would seldom use a resource like S.O. (and consequently not respond to a survey like this).
This was the first thing that I thought of also - surveying Stack Overflow survey-takers about their place in the larger order of programming threatens sample bias in several ways (experience, community engagement, etc).
> As it turns out, there is no correlation between programming ability and the frequency of Stack Overflow visits, as the averages and distributions are virtually identical across all groups.
As an infrequent SO participator, this is somewhat comforting to hear.
> As an infrequent SO participator, this is somewhat comforting to hear.
I'd assume that people who visit infrequently are better on average than those who visit often because it means you don't have to look things up. Why would you assume the opposite?
Uhm, I feel your reply is a little bit of a non-sequitur, or at least I'm having trouble following your line of logic.
The linked article says that people's self-reported skill doesn't seem to depend on how often they "visit" SO, which includes answering questions and commenting. Given that this data is from people who actually filled out a survey on the site, that automatically excludes a big portion of the people who are "just looking".
The reason I say it's comforting is because it weakly contradicts a worldview (one I dislike) where "good programmers are visible programmers", and people often get judged by their blog posts and their tweet-followers and their SO answers.