A healthy chunk of that is probably folks who work somewhere where you can't/don't have time to write tests, and keep up with it on their own so they are current when job searching.
Or to scratch their "do it right" urge, to have a repo that they can show potential employers, to learn TDD, to learn a new language they can't use at work but hope to use professionally at some point, to have a repo that they can eventually turn into something revenue generating, or because they contribute to open source projects that require it.
I've written tests on personal projects for all of those reasons at one point or another (except the last).
My current workload is mostly a bunch of data-collection scripts (hitting APIs, building a CSV, loading it to cloud storage and databases), plus prototyping small web apps for internal, limited-lifespan projects. Testing doesn't make much sense for our current tasks, but I also really don't want to be left in the dust.
I need to start doing this. Any recommendations on where to start? I'd consider myself a pretty experienced Python developer who has not only never done unit testing, but has actively avoided it.
On the web side, I normally delete any unit tests that generators create... I know it's terrible.
I’ll second that. I owe a great amount to those willing to publicly denounce TDD.
If I had been a working developer as this nonsense ramped up, I would have avoided it too. Nobody really does TDD. It’s a mythical concept.
Some sanity checks are fine, and some integration tests. Unit tests are often a pit of time waste, depending on the unit. I aim for my unit tests to look like integration tests.
If there's any consistency to your work, you can always write sanity check "tests". My friend works in a Python shop, and for a lot of their infrastructure work, they'll write stuff like "assert that the database has at least 20_000 records" in it. Many of their tests were reactions to little things that went wrong at some point in the past, but he's said it really helped improve the quality of their stuff over time!
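A sketch of what such a sanity check might look like, assuming a SQLite database; the table name and threshold here are invented for illustration:

```python
# Hypothetical sanity-check "test": fail loudly if a table has fewer rows
# than we know it should in practice. Names and threshold are made up.
import sqlite3


def check_minimum_records(db_path: str, table: str, minimum: int) -> None:
    """Assert that `table` in the database at `db_path` has at least `minimum` rows."""
    conn = sqlite3.connect(db_path)
    try:
        # Note: `table` is interpolated directly, so it must come from trusted code.
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    finally:
        conn.close()
    assert count >= minimum, f"{table} has {count} rows, expected at least {minimum}"
```

Run it before or after a pipeline step; it's not a unit test in the classic sense, but it catches exactly the class of regressions described above.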
> 7% of respondents do "software testing / writing automated tests" as a hobby? Bravo to those hardy souls.
All of my hobby code is unit tested. Writing code is easier with unit tests ... why would I make things harder on myself just because I'm doing it on the weekend?
One of my side projects that's also work related, but done on my own time, is a cli-focused wrapper around the aws boto3 library. I ended up writing my own test library once it reached that point where, well, you want tests before every commit ... I'd hardly call it "hobby" testing though.
Agreed. You don't really need a testing framework in Python. It's nice, and I use one, but a script with asserts (potentially importing other scripts) goes a long way.
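For example, a bare-asserts test script can be nothing more than this (`slugify` is a made-up stand-in for whatever function you'd actually import from your own code):

```python
# A minimal "test framework": plain asserts in a script, run with `python tests.py`.
# The slugify function below is a hypothetical example target, not a real library.

def slugify(title: str) -> str:
    """Lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


assert slugify("Hello World") == "hello-world"
assert slugify("  extra   spaces ") == "extra-spaces"
assert slugify("") == ""
print("all checks passed")
```

If any assert fails, the script exits with a traceback and a non-zero status, which is enough to wire into a pre-commit hook or CI.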
I guess you ignored their disclaimer beside the editor statistics?
From their methodology section:
> The data include responses only from the official Python Software Foundation channels. After filtering out duplicate and non-reliable responses, the data-set includes more than 18,000 responses collected in October and November of 2018 via promoting the survey on python.org, the PSF blog, the PSF’s Twitter and LinkedIn accounts, official Python mailing lists, and Python-related subreddits. No product-, service-, or vendor-related channels were used, in order to prevent the survey from being slanted in favor of any specific tool or technology.
If I remember correctly, I believe the JetBrains / IntelliJ / PyCharm brands of the sponsor were very visible when taking the survey.
That might have significantly influenced the selection of the Python users taking the survey: why answer a survey set up and sponsored by a private company you are not really familiar with and that you don't necessarily trust as a result?
There is also a second selection bias: the people who follow the PSF and the official python.org communication channels are probably nerdier than the average Python programmer. This is reflected in the OS statistics. The survey results suggest that 53% of Python users do not use Windows at all.
On the scikit-learn online documentation, after removing the mobile traffic we have: Windows at 61%, macOS at 23% and Linux at 15%.
On stackoverflow, the primary OS statistics are: Windows at 47%, macOS at 27% and Linux at 23%.
This is the JS survey results all over again: no. Unless you can statistically prove the results are biased, you don't get to ignore the results because you don't like them.
Finding data points with no methodology that contradict the survey results does not invalidate them.
That's. not. how statistics work.
A great deal of effort was put into this survey, and the stats you’re looking at are more likely biased than the ones in this survey.
The stats and the methodology here are clearly documented; if you want to argue with them, be specific and provide concrete statistical proof for your assertions.
Specifically, why do the stats you have prove anything, and what confidence do you have that they are representative?
I think the point here is that people tend to reject survey data because the tiny subset of the data they can see doesn't match the aggregate.
In this situation often the smaller dataset is wrong.
...not always. But often.
Human intuition based on limited data can seem compelling, but it’s always worth acknowledging you might be the outlier.
18000 respondents is a lot, especially when a specific effort has been made to sample from various sources.
The parent post didn’t even bother to check their own biases.
Another thing I've noticed is that what we humans feel makes intuitive sense (and therefore must be correct) is, when you actually look at the data, often very wrong. Just because it makes sense to us doesn't mean it's correct. Humans are naturally biased not only towards certain held beliefs, but also towards things that just "sound" correct, regardless of the reality. It makes sense, though: if an explanation of something is understandable to us, we can think about it logically (or otherwise) and come to conclusions, but unfortunately things that are understandable are often based on flawed assumptions.
I don't know if that's the case here, but I do often think that people gravitate towards the first thing that they can understand that seems to, at a glance, check out, without investigating.
Your statistics don't necessarily contradict this survey. Scikit-learn users are a subset of all Python users. Maybe data scientists are more likely to be using Windows than other Python developers? It's impossible to know from either set of stats.
Similarly, stackoverflow has a lot more users than just Python users, so it says little about how many Python programmers use Windows.
Not really. Even though I use IntelliJ for JVM langs, I haven't visited the JetBrains website in years. I think I saw the ad for the survey on Stack Overflow or some Python library website (Requests?).
Read the Docs hosts a lot of Python module docs and we ran community ads (meaning free) promoting this survey. There was no mention in our ads of JetBrains - only the PSF.
The ads wouldn't have appeared on the Requests docs but it could have on many others.
No surprise there. But it can be good to get an actual number, to help your data analysis. Presenting this as representative for the python community would be inappropriate, of course.
> Surprisingly almost two-thirds of respondents selected Linux as their development environment OS. Please note, for this question we allowed multiple answers. We’re not drawing primary OS popularity conclusions here.
What makes that surprising? A past result? A preconceived idea?
The survey's 9th takeaway is the same:
> Surprisingly, almost two-thirds of Python developers choose Linux as their development environment OS.
... Why is it surprising? Did you expect an equal split? Did you expect Windows to dominate?
There's no answer here. I have no idea why the result is surprising: it fits what I would expect.
It is surprising to me, probably for reasons similar to the authors'. OSX has, for a long time, arguably been the OS of choice for the majority of developers who aren't developing specifically for Windows (.NET etc). OSX has been obviously popular, though I recognise that in the last year there has been a trend of people complaining about OSX and threatening to leave.
I took the survey and I marked both OSX and Linux, because I deploy (local/docker remote/vm) to linux and spend quite some time in linux shells. I think it likely that a lot of windows users would do the same.
To be crystal clear, I wouldn't think that the majority of python developers are using linux as their primary development machine. I bet next year the survey will focus in on selecting primary OS as a choice.
> been the OS of choice for arguably the majority of developers that aren't developing specifically for windows (.NET etc)
Not necessarily. In absolutely every enterprise Java development environment I've seen in Europe, they're using Windows as the developer OS. Easier to manage using AD.
Sure, you've added another segment that is not linux. The point I was trying to make is that I don't think many people would expect linux (of any flavour) to be the dominant development OS.
JetBrains is so nervous that PyCharm ended up as the most popular Python IDE in a survey that they conducted themselves :)
No worries guys. As a happy CE user, I find it's not only a phenomenal contribution to the overall Python experience but also a huge gesture of goodwill on your part.
I completely dropped PyCharm in favor of VSCode + Kite; PyCharm became unacceptably slow and VSCode fixed its missing pieces. All my Deep Learning/ML workflow is now done either in Jupyter or in VSCode.
I think there's a mismatch between their classification and that of the people surveyed.
The survey characterizes "Scientific development" as "Data analysis + Machine learning", with 28% of the people selecting one of those two latter categories as the answer for "What do you use Python for the most?"
However, only 6% of the users said they were in a company which did science, and only 2% develop for a science industry.
Now, it's certainly true that a scientist can work for a company which neither does science nor targets science research. As an example, an ice cream company may employ food scientists.
It's also possible that people who do, for example, actuarial science might group themselves as working in "insurance" rather than "science".
But it seems wrong to infer that "Scientific development" is equivalent to "Data analysis + Machine learning" without stronger support.
After all, an engineer uses data analysis to evaluate a design, and while engineering is an applied field of science, with a great amount of engineering science to back it up, I don't think many engineers consider themselves as a scientist or as someone doing scientific development.
I think your expectations of what makes something "scientific development" are a little too high. An analyst using statistical methods - ANOVA, market basket analysis, association rule learning, outlier detection - is doing "scientific development". A network engineer working on a congestion algorithm is, too. Postgres? RoR? PHP?
These are all in the realm of scientific development, given their non-trivial impacts both socially and commercially.
By "mismatch" I believe that an analyst using statistical methods would typically not consider themself to be a scientist or doing scientific development.
I agree that "science" can be a broad term - I gave the example of engineering as a branch of science. But what use is there to describe 28% of the respondents as doing scientific development if only (say) 6 percent of the people would describe themselves as doing scientific development?
Also, is there nothing else to scientific development besides data analysis or machine learning?
How about this then - if science is a broad term which includes research and development, what sort of software development isn't "scientific development"?
What we have today is the result of scientific development, and software development is among the fields at the very forefront of modern growth. In that context, all of it.
AirBnB? Yup. Uber? Definitely. Facebook? Beyond a doubt. Flickr? Twitter? MySpace? How about open source projects like Apache httpd? Kafka? Linux? What about the guidance software used in the Apollo missions? Think about what the MP3 coding format did for digital media.
It's very easy to take these things for granted, boiled frog and all that.
Then you also disagree with the survey's definition that "Scientific development" = "Data analysis + Machine learning" = 28% of the users.
You agree that there is a mismatch between their classification and yours, and believe it should be 100%.
This supports my argument that their definition is not useful. It could have been "foobar programmers" and been more useful, as it wouldn't have come with a large amount of existing associations with different meanings.
Yes, especially considering that a lot of the respondents are doing data analytics and therefore probably running multiple different types of analyses and cross checking numbers and results, so testing is sort of baked into their workflow.
It's a bit strange that they think data analysis and machine learning should be put together, yet web development, dev ops, and automated testing shouldn't. I guess my job falls mainly into web development, but I still have to get data out of the database every so often with custom queries, so should data analysis be included in my job as well?
The Experience in IT bar chart is underwhelming, it would be nice to see the breakdown by incremental years of experience in IT. NONE of the year ranges span the same number of years. Making a basic inference from the percentages and bins it would seem that Python use is negatively correlated with years of experience.
Also, you can compare it with the age breakdown below. Most people are 21-29 years old, but have 11+ years of experience? I think the dynamics of the job market make people write bigger numbers than would truthfully apply.
A smarter way might be to ask for the first month/year one got paid for Python dev work, then calculate the experience figure from that.
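A rough sketch of that calculation, assuming the survey collected a start date (the dates and field names here are invented):

```python
# Hypothetical: derive years of experience from a self-reported "first paid
# Python work" date, instead of trusting a self-reported year count.
from datetime import date


def years_of_experience(first_paid: date, today: date) -> float:
    """Elapsed years between the first paid work and the survey date."""
    return (today - first_paid).days / 365.25  # 365.25 averages out leap years


# Example: first paid in June 2010, surveyed November 2018 -> roughly 8.4 years.
exp = years_of_experience(date(2010, 6, 1), date(2018, 11, 1))
```

This also lets the survey pick its own bucket boundaries after the fact, instead of hard-coding uneven year ranges into the question.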
The pool of respondents aged 30-39 was 31% and all 30+ was 49%. So 25% of all respondents having 11+ years in IT shouldn't be farfetched. I myself am in my late twenties and I'm not far from 11 years of professional experience.
Kinda surprised (pleasantly?) that Anaconda is the most popular non-official distribution. Though I'm surprised Enthought has almost zero representation; I was considering a move, and thought they would be worth a try.
How is it that the largest portion of people responding had over 10 years of experience in software development, but the largest portion of people responding were between 21-29 years of age?
The largest bucket may have been ages 21-29 at 39%. However, the next largest was ages 30-39 at 31%, and counting all ages 30+ you have 49% of respondents. Assuming only people aged 30+ had 11+ years of experience, that would only need to account for half of them (25% of the 49%). That doesn't sound implausible.
All my co-workers' kids code in Python. My 8-year-old coded Python stuff on a Kano for Minecraft. So it seems reasonable that folks would work on more and more projects unpaid/OSS before entering the workforce.
I like that python is both super easy and super professional and can be used for both.
Hmm, I spent >10 years with Python, and on large/important apps it's hard to refactor and slow. (PS: they're still using Python 2 as well.) I want to pivot to C# on Linux / .NET Core.
I started as a professional developer at 18 (I didn't go to college because I didn't have enough money, since I was already living by myself), and I'm 30 now. So 12 years as a developer (about 9 of them mainly with Python).
I know at least another 5-6 people with the same career path as me (bar the Python experience).
I use Python as a prototyping tool only, and I mean prototyping in the sense that the code will always be thrown away and rewritten in a more suitable language.
Python sacrifices so much in the way of efficiency, safety, and maintainability that it's hard to justify the wins in expressiveness. Easily checked errors that should have never made it to production manifest at runtime even after rigorous testing. Increasing performance requires moving to asyncio which drastically reduces readability and limits the code reuse from non-async libraries. Eventually code bases grow so large mypy becomes a must, and we have to contend with interfacing untyped libraries and refactoring all of our existing code with annotations.
Python has its place, but in line with how scary modern software practices are to me, it's crazy that Python has become such a mainstay in production software.
Maybe you have the luxury of rewriting that others don't have. Do you rewrite to assembly? That's going to be really fast you know!
With a proper test framework, people can and are building large scale applications. There are plenty of ways to optimize performance without a complete rewrite. Python is really Python+C to a large extent.
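A toy illustration of that "Python+C" point: built-ins like sum() already run their loop in C, so the same logic often gets much faster without anyone writing a C extension. (Timings vary by machine, so none are claimed here.)

```python
# Toy example: the hand-written loop runs in the Python interpreter,
# while the built-in sum() runs its loop in C with identical results.
import timeit


def total_python(values):
    """Sum values with an explicit interpreter-level loop."""
    acc = 0
    for v in values:
        acc += v
    return acc


values = list(range(100_000))
assert total_python(values) == sum(values)  # same answer either way

slow = timeit.timeit(lambda: total_python(values), number=20)
fast = timeit.timeit(lambda: sum(values), number=20)
# On most machines, `fast` comes out several times smaller than `slow`.
```

The same idea scales up: swapping hot loops for numpy, C-backed stdlib functions, or an occasional Cython module is usually far cheaper than a full rewrite.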
Prototyping is something that should be time boxed and usually costs me a day to a week of time at the most. Experience tells me that most applications do not survive the first iteration very long without prototyping, and Python lets me maximize my time spent prototyping.
As for your jibe, my original point was that there is something between Python and Assembly. There are many languages that provide all or many of the nice things that Python gives us that are staggeringly more efficient and safe.
Python is really Python+C only if you're willing to write C extensions, which largely negates the point of using Python. C extensions also don't maintain safety, or solve Python's shift of compile-time bugs to runtime. They really only address the efficiency losses, by making you write code in another language.
People can and will build successful applications in all sorts of crazy ways. It's a fallacy to argue that just because something is done a certain way, there must be no better ways that could save us time, money, and sanity.
Python is not perfect and languages do evolve. Maybe Rust or Go or something new will take over the world. But there's a trade-off between developer time and execution time and you can't focus on just one. There's also the benefit of a large ecosystem of libraries (and talent, if you are a large enterprise) when you embrace Python that may not yet be available in some other newer languages.
Prototyping usually involves working with existing code base and the ecosystem, although if you are building a brand new product or application, your context may be different.
It might be scary to you, but who runs Python at scale?
YouTube ran it for quite a long time.
Reddit
Yelp
Those sites seem mainstays to me, with good load. Perhaps performance is a problem for you for the server load and costs that you want, but it does allow you to iterate quickly to get your product out the door.
Last thing though: Checked errors. I think that's largely solved with type hinting (along the lines of TypeScript) where you get some guarantees.
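For example, with type hints a checker like mypy flags the None case before the code ever runs (find_user below is a made-up example, not a real API):

```python
# Hypothetical example: Optional return types make the "missing value" case
# visible to a static checker like mypy, instead of surfacing at runtime.
from typing import Optional


def find_user(user_id: int) -> Optional[str]:
    """Look up a username, or return None if the id is unknown."""
    users = {1: "alice", 2: "bob"}
    return users.get(user_id)


name = find_user(1)
# mypy would reject `name.upper()` here: `name` may be None.
if name is not None:
    shouting = name.upper()  # narrowed to str, so this checks cleanly
```

It's not the same guarantee a compiled language gives you, but it moves a whole class of AttributeError/TypeError bugs from production back to the editor.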
My stacks of choice, though, are really Python and Go. I would use Go when I truly need raw speed and a lot fewer servers when scaling horizontally.