> And the majority of Python code written does not run under any of the 3.x interpreters. This makes it harder for its users to be productive.
What a load of bollocks. For new projects this only matters if libraries aren't ported, which they are for the most part. For old projects, either you're in a situation where you can spend time porting your code to Python 3, or you don't; but as TFA mentioned pep-404, the writing has been officially on the wall ever since 2011 so at that point you have to admit you did choose to incur tech debt and do nothing about it, so the claimed loss of productivity is on you.
> Unlike 2.7 code, Python 2.8 wouldn't be able to guarantee exact 3.x compatibility, since there are some python scripts that will run under both Python 2.7 and Python 3.x but produce different output, and Python 2.8 chooses the 2.7 behavior in these cases.
What a terrible, terrible situation. Now you'll have "python" code that will neither run on 2.7 nor run compliantly on 3.x. As for the latter, please explain how that will alleviate anything on the following point, since behaviour at runtime will be subtly different:
> adding these remaining Python 3 features would greatly simplify running code targeting Python 3, and allow people to use Python 2.8 to run a mix of Python 2 and 3 code.
I don't know what recourse the PSF has but maybe they should even go all in and defend the "Python" name so as to prevent confusion and stop a potential community fracture. Just call it anything else but "Python 2.8" is not Python.
> I don't know what recourse the PSF has but maybe they should even go all in and defend the "Python" name so as to prevent confusion and stop a potential community fracture. Just call it anything else but "Python 2.8" is not Python.
This! A thousand times! I love open source and free software. I absolutely love the fact that you can fork the code and adapt it to your needs. If you find others who like it great! But please, don't use the Python name! It will create more confusion than help. This fork with a different name is 100% fair in my opinion.
(Replying to the top-ranked comment so that as many people as possible see it)
While I wish Naftali well in his efforts - I have a private Python-derived language myself! - this is not "Python 2.8." For trademark purposes, "Python" is only what is released or endorsed by the PSF.
We have already reached out to Naftali and asked him to change the name of his project and update this blog post accordingly.
Obviously, though, this is someone who cares a lot about Python, so let's be sure not to rain down on him with a lot of scorn; I admire that he was willing to sit down and 'scratch his own itch.'
"I don't mind renaming this project. Any other suggestions for good names? I personally like "Pythonesque (/usr/bin/pesque)" the best so far, thanks @dbohdan! :-)"
- The Author (who isn't me)
https://github.com/naftaliharris/python2.8/issues/47#issuecomment-266240525
>What a load of bollocks. For new projects this only matters if libraries aren't ported, which they are for the most part. For old projects, either you're in a situation where you can spend time porting your code to Python 3, or you don't; but as TFA mentioned pep-404, the writing has been officially on the wall ever since 2011 so at that point you have to admit you did choose to incur tech debt and do nothing about it, so the claimed loss of productivity is on you.
I call BS (to counter your "bollocks").
Whether the "writing was on the wall" or not, doesn't change the fact that people had to actively port their old code if they wanted it to run on 3.
Sometimes that code could run into the tens of thousands (or even millions for large companies) of lines.
And why would they do it (and at a great cost and time effort)? For the marginal improvements Python 3 brings?
The "writing has been on the wall" is not an excuse, it's mostly blackmail ("port or else you wont run on 3, and we'll stop the 2.x line"). And most people didn't (and shouldn't) fall for that.
This argument is a reasonable one and is why we all support IE6 for web dev.
However, at a certain point it is worth your time to move forward instead of doing nothing. By doing nothing you gain a little time now, but you run into a few moments of "Oh @#$%^!!" later. Real-world example: you don't bother updating SSL to deal with weak DHE and suddenly Chrome users can't see your payment site.
My approach has always been to try to front load the work instead of doing it in crisis mode later. It sorta sucks but that's just how software is right now.
All changes are controversial. In a large enough group, there's no way to make everybody happy. The question is whether the new arrangement makes more people happy long term. Judging by the rate of Python 3 adoption, it took a long time indeed, but it got there.
As for Python 2... well, there are still people signing petitions for Microsoft to bring back VB6. Last one was this year, I think.
There's no controversial changes in python3? Except if you consider print() controversial but that's so silly it's laughable.
There are, however, non-backwards-compatible changes, like Unicode by default. IE7 was also non-backwards-compatible, so the comparison still holds (with the exception that IE had a compatibility mode if you sent some magic HTTP headers).
It can be silly, but that was one of the reason I picked Ruby over Python 5 years ago for a project. I felt at the time, Python is awesome, however they are taking a weird path.
> it's mostly blackmail ("port or else you wont run on 3, and we'll stop the 2.x line").
Would you also call the RHEL life cycle a blackmail? I'm using version 5 now and the normal support ends in March 2017. My options now are "port or pay extra for extended life cycle or else my RHEL will be without security fixes". And like Python, major RHEL versions break backwards compatibility.
Yes, and that's why you pay for it. Yet here is somebody complaining about that.
If you e.g. can't be bothered to do continuous integration or automated testing, then you might consider RHEL with its life cycle to be an acceptable alternative. Which is fine. Just be ready to pay for that service.
Similarly, if you wanted continued Python 2 support, you could have donated time or money towards that goal. I would be surprised if anybody complaining did that. There's just not that much business value in dragging legacy Python further along.
We have several large projects that are written in Python. Most of these are production applications that are critical to what we do, and the others are libraries and tools for internal work. We haven't even started thinking about porting these to Python 3. We have so many other things to worry about (bug fixes, new features, etc.) that it's hard to justify the time investment to port these now. I can't imagine we're the only ones in this situation.
You're not, of course. And there are similarly many people running production critical code written in Perl 5 on RedHat 9 or something like that. "If it's not broken, don't touch it" is a wise rule to follow for that kind of stuff.
But to keep it running, you don't really need Python 2.8 with new features, right? You need extended support for Python 2.7 - basically, making sure that it keeps working with updated versions of other software (like OSes), and that bugs are fixed.
>But to keep it running, you don't really need Python 2.8 with new features, right? You need extended support for Python 2.7 - basically, making sure that it keeps working with updated versions of other software (like OSes), and that bugs are fixed.
Those systems are not just sitting there untouched.
Heck, not even 70s COBOL systems are "just sitting there" (they are hooked to newer systems, get new forms, have alterations, etc. all the time), and those Python 2.7 systems have been written 10-15 years before or less.
And they continue to get new subsystems, new features, alterations, etc. In 2.7.
So, yes, people would very much like to get not just "extended support for 2.7" but also the ability to keep running it in newer versions, and be able to take piecemeal adoption of new features to make their life better and eventually organically refactor in their own timeline.
> The "writing has been on the wall" is not an excuse, it's mostly blackmail ("port or else you wont run on 3, and we'll stop the 2.x line"). And most people didn't (and shouldn't) fall for that.
So let me get this straight.
1. A bunch of people you've never met and probably have never paid or financially supported,
2. Gave you a high-quality programming language, for free, to use for any purpose you liked,
3. And then when you and they disagreed about the best way forward in a new version, you claimed their refusal to continue supporting and adding new features to the old version for you, for free, essentially forever, constitutes "blackmail" on their part.
1) You frame this as some single random individual on HN is the only one that is concerned with the switch.
2) You seem to have missed that companies and individuals that do dislike the switch have contributed to the Python ecosystem, from employing core developers in the past, to creating frameworks, libraries etc that helped Python succeed.
3) You have missed the fact that some (a lot? most?) of the concerned people have actually donated to the PSF through its PayPal donate link (as I've done in the past, and I've used Python since 1998).
4) You seem to think that an open source community project is pretty much "anything goes" and end users be damned. And then the team can complain about "lack of adoption" for the new version.
So, do you still think it's "blackmail" when something you were getting for free decides to no longer support the version you like?
Python 3 adoption has been rising for a couple years now as people realize that A) Python 3 is a quite nice language, B) porting to Python 3 is not as hard as people keep claiming it is, and C) Python 2 is going to run out of zero-dollar-cost support one day as the number of people willing to support it without being paid for their trouble diminishes.
If someone does want to commit to supporting Python 2 + backported Python 3 features, they are of course welcome to do so provided they observe the license and trademark terms (not terribly hard to do). But I suspect it won't last very long, at least not as a small-team zero-dollar-cost project. Between Python 3 gaining steam and people staying on 2 in order to avoid work and expense, I just don't think it's going to work out on the kind of decades-long horizon the Python 2 die-hards seem to want.
And this captures the good and the bad of open source all so succinctly.
Here we have a person (the author) who has rejected the path that an open source project has taken, and invested the time and energy to move the source along a path they prefer.
In this particular case, there is a natural constituency of people who share that desire but are unable or unwilling to put in the effort to push the source down the path.
When there is critical mass, that group forks off and begins to bring other people along to the alternate path.
At that point the people who endorsed the change in direction come out in force to yell at these people who aren't doing what they are supposed to and threaten them and implore a higher power to emasculate their effort.
Sometimes that works, sometimes it doesn't. But it always results in massive amounts of confusion when someone new comes to the community and sees these two different paths for the same thing and can't really figure out why they are different.
Further, because there is no mechanism for "righting" the ship as it were, the diverging paths lead to a lot of wasted time and effort on everyone's part. This happens to be a Python fork but it's happened to window systems, video codecs, graphics libraries, databases, hell even C compilers.
The nice thing about a Cathedral is that the Pope keeps the Cardinals toeing the one and only line.
I came here to say exactly this ... you're better off upgrading your systems as you can. There's one point missing above - do you really want to use a version of Python that's maintained by one person and of unknown quality? You're better off staying on 2.7 if you can't afford to upgrade.
Yes totally agree. In the end this won't matter because the momentum of the ecosystem is so great at this point but it just baffles the mind that someone would think this is a good idea, especially with the "can't run on 2.7 or 3.x" situation.
> the writing has been officially on the wall ever since 2011 so at that point you have to admit you did choose to incur tech debt and do nothing about it, so the claimed loss of productivity is on you.
When Python 3 was released, it offered Python users a trade: in exchange for a productivity loss (porting your Python 2 code), you'd get a productivity gain (new features in Python 3 and removed cruft). Some projects and companies thought this was a good trade and have upgraded over the years; many did not, and haven't. The interpreter I've been working on tries to improve on the terms of that deal for people who have not switched to Python 3.
> What a terrible, terrible situation. Now you'll have "python" code that will neither run on 2.7 nor run compliantly on 3.x.
That's the point, yes. Obviously any interpreter that's backwards compatible with 2.7 but includes new features from 3.x is going to let people write code that doesn't run under 2.7 or 3.x. But what does it matter if your code doesn't run under interpreters that you aren't using and don't intend to use?
There's a lot of good names based on Monty Python properties, but I like "Cobra", starting at version 2.8. Ignoring the MP stuff and going for something on the snake theme.
> What a load of bollocks. For new projects this only matters if libraries aren't ported, which they are for the most part.
Except when they aren't. And then what?
I've run into this multiple times. Sometimes there's a branch of the project for 3 that's underway, and I sit and wait. Other times it means dropping the project or committing to reimplementing a library.
Except it kinda is the current version to lots of us. I moved to Python as a hobbyist from .net languages and loved the freedom of not having an IDE and working with Linux. The first decent book I read was on Python 3 so I learned Python 3. Lots of us 'newcomers' (not so new in my case) learnt on Python 3 and find perfectly good library support in Python 3. In fact the 'old guard' sound a bit like my Dad talking to me about old cars or pre-decimal currency. I just don't find myself hitting problems I can't solve on Python 3 that I could have solved under Python 2. To be fair I have a pretty minimal amount of code in production, but in each case it is not so monolithic that I couldn't have some of it using Python 2 and some using Python 3 or even some other language for that matter.
Python 3 is not only the future but the current version of Python. It is the version kids learn in School (in the UK kids do some CS from the age of 6 or 7, starting on scratch and then normally Python), it is the version colleges teach.
However there are lots of reasons enterprise users may want to use a legacy codebase. It is not like Python 2.7 is about to stop working! When a section needs a major re-write, then consider porting it. I don't see how this is different to any obsolescence problem. I know an enterprise software company that wrote a lot of stuff in VB6. Some of it is still in VB6 and they have to manage everything that means (especially around 64 bit architecture problems); when they do major updates they use .net. How can we be in the technology game and not just accept that life moves on!
The fact that there's a huge split in the community over the issue shows just how divisive Python 3 is. That said, 3 > 2 in version number doesn't make it better or more "current" (and I've seen a number of projects where the "latest" version wasn't even the latest - it was often an experimental one). Yes, you could do just about everything you need to in Python 3 that you can do in Python 2, except that it can be much more difficult depending on what you're doing.
e.g. this guy -> http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/
I found it's easy enough to add the line:
# -*- coding: utf-8 -*-
to the top of my Python 2 files so I can get UTF-8. That, and Python 2 has the bindings for GTK (which I like to use).
Both versions of the language have their usage, and to each his own. There's no sense in bickering about it.
I do think the split is in use-cases. I've worked in film at both small and very large places. You can get by very easily ignoring things like color management and frame rates when you're doing small, homogenous work. As soon as you need to take it seriously, the only real way to handle these things is to tag and/or convert these things at the perimeter so you can reliably handle things internally in a consistent way, then convert on the way back out. Even knowing this upfront and being highly motivated, it can take companies years to transition with pain in the meantime.
Text handling is the same. The thing is, many people just deal with ASCII-compatible English so they don't realize this is a problem for other people and aren't motivated to change. The reason both sides can't just do their own thing (i.e. Python 2) is that libraries and shared code make it miserable for people using other character sets (most of the world, or any company growing bigger than a certain size).
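To make the "convert at the perimeter" idea concrete for text, here is a minimal sketch in Python 3. The function names are made up for illustration, and UTF-8 is just an assumed default; a real system would pick the encoding from whatever the input actually declares.

```python
# Sketch of "convert at the perimeter": bytes come in, text is used
# internally, bytes go back out. All names here are hypothetical.

def read_name(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode once, at the input boundary."""
    return raw.decode(encoding)


def shout(name: str) -> str:
    """Internal logic only ever sees text (str), never raw bytes."""
    return name.upper() + "!"


def write_name(text: str, encoding: str = "utf-8") -> bytes:
    """Encode once, at the output boundary."""
    return text.encode(encoding)


# Stand-in for bytes arriving from a file or socket.
incoming = "Zo\u00eb".encode("utf-8")
outgoing = write_name(shout(read_name(incoming)))
print(outgoing.decode("utf-8"))
```

The point is only that the decode and the encode each happen in exactly one place, so everything in between can assume it is handling text.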
Indeed. One has to wonder, then, about the controversial nature of Python breaking backwards compatibility for the sake of improvement. People are down on "Python 2.8" for changing the name, but the fact that there is no standard set-in-stone led to these problems to begin with. People want a reliable standard, so are the authors at fault for breaking that implied standard and carrying the name with them? Should they have changed the name to PythonU? Or do we always defer to the author's right to change both the VM and the implied standard at will? It's a human-party-pleasing problem because the changes obviously hurt some people (those having to rewrite libraries, which may be easy or very hard and time consuming) for the sake of helping others (who would have time-bomb buggy programs in 2). shrug
It's worth noting that Microsoft is still keeping VB6 on life support, in a sense. The tooling is not guaranteed to work on modern OSes (and there is a bunch of actual breakage, although community has found workarounds so far).
But the runtime is still supported - in fact, it ships with the OS! If you have any non-ARM version of Windows, up to and including Win10, around, check the file named msvbvm60.dll in C:\Windows\SysWOW64 - that's it ("MS VB VM").
And because it ships as an OS component, the official support policy is the same as the rest of the OS, which is at least 5 years of mainstream support (longer if there's no successor release), and then at least 5 years of extended support. This is even clarified specifically for VB6:
Since VB6 was first released in 1998, this marks 18 years of continued support to date; and if it's not dropped from the OS within the next 2 years, it has a chance of hitting 30 years...
For what it's worth, PSF also has a fairly generous (especially for a non-commercial OSS project) support policy for Python 2.7 - it had already extended the end-of-life date for it once to 2020:
That's the gamble you took when you chose to go all in on Python3. I personally didn't buy the arguments made and chose to stick with Python2. If I migrate anywhere, it would be to a Python2 compatible fork or something like Go.
> What a terrible, terrible situation. Now you'll have "python" code that will neither run on 2.7 nor run compliantly on 3.x.
I don't say you are wrong, but I am afraid this situation looks "terrible" only to people who do care about 3. If someone doesn't care about it and thinks that he can survive without ever porting to 3 or starting to use it, then for him the situation isn't that terrible... from that perspective, his 2.7 language evolved to the next step, and he knows that new features can be used if he upgrades from 2.7 to 2.8.
> What a load of bollocks. [Snip] For old projects, either you're in a situation where you can spend time porting your code to Python 3
The majority of Python code is old projects, just like every other established language. This might change in the future as I now see people starting new projects in Python 3, but if your company is older than 5 years old, then there is a good chance that you started with Python 2 simply because at the time of creating your codebase a whole lot of libraries weren't ported to Python 3.
I currently work for a client who has decided to shift away from PHP and towards Python. They had a monolithic PHP app with perhaps 250,000 lines of code. Now we are developing a series of Python apps in the microservices style. We've decided to develop everything as Python 2.7. We are not looking at Python 3.x. There are a few reasons. Some libraries that we want are in Python 2.7. And Amazon only supports 2.7. And we are not wild about Python 3.x's attempt to imitate a classical object oriented style.
We would look very closely at a Python 2.8, if it existed.
If I was your client I'd be pissed that you decided to rewrite my code into a legacy version of Python. Make no mistake: Python 3 is the future of Python. There will be no version 2.8 and there is no going back to 2.7.
Also, I don't know what you mean by "Amazon only supports 2.7" because boto (the main client for Python) has supported Python 3 for 2 years now. Perhaps you mean Lambda?
+1. Also it's easy to write code that supports both, so if you really need python 2 support right now (surely AWS Lambda python 3 is coming soon, you can already use it unofficially) that's a much better option than being entrenched in python 2 (mainly the string handling is the issue for buggy code that'll run in 2 but not 3).
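For anyone wondering what "code that supports both" looks like, here is a rough sketch of one common pattern: `__future__` imports plus a small text helper. The `to_text` helper and the `PY2`/`text_type` names are illustrative, not from any particular library (packages like six offer similar shims).

```python
# -*- coding: utf-8 -*-
# Intended to behave the same on Python 2.7 and Python 3.x.
from __future__ import division, print_function, unicode_literals

import sys

PY2 = sys.version_info[0] == 2
# `unicode` only exists on 2.x; the conditional never evaluates it on 3.x.
text_type = unicode if PY2 else str  # noqa: F821


def to_text(value, encoding="utf-8"):
    """Return text on either version, decoding bytes if needed."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return text_type(value)


print(7 / 2)                      # true division on both versions
print(to_text(b"caf\xc3\xa9"))    # bytes are decoded explicitly
```

Being explicit about bytes versus text like this is most of the work; the string handling is exactly where code that "happens to run" on 2 tends to fall over on 3.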
Breaking it down to microservices, could you not have some parts in 2.7, like any that need a specific library that does not have Python 3 support, and some in 3? Or for that matter Go, or Rust?
Software typically evolves and grows and changes over decades. There very often isn't any point where you can go "for our next project we choose X". Each new project is a feature using 90% of some existing huge codebase.
"Starting fresh" is something many (most?) companies simply never do, over decades. (And if they attempt it, it is often an all-out disaster..)
2011 is fairly recent in this context, and many popular libraries were not available on Py3 until much more recent than that, even if you have the rare luxury of starting fresh.
That has been official since 2011, but even today you may need some lib only available in Python 2.x and thus may need to start a new project in 2.7. I started several Python projects since 2011 knowing that it was a dead end, but my hands were tied. At the time there was not even a working WSGI spec, and no web frameworks for 3.x, and the first ones had awful performance (2.7 is bad enough).
Only recently 3.x has become a viable alternative. I for one welcome this 2.8 fork.
As others have said, maybe this project fixes some actual problems and backports some features from 3, but this isn't "Python". Beyond the fact that Python is a trademark of the Python Software Foundation, Python is more than the language, it's the community and the tools (as with every programming language). So while there are some vocal people that really dislike Python 3 (either in part or wholly), my understanding is that with the planned phase out of Python 2 and Python 2 only receiving bug fixes at this point, much of the industry is transitioning to Python 3 (either currently doing so or planning to) and so it seems relatively fruitless to attempt to build upon Python 2. I personally think the effort put into this would be much better spent making tooling around Python 2 to 3 transformations.
I also think it's pretty irresponsible of the author to call this Python 2.8, because it may cause confusion to developers unfamiliar with the history and come from a tutorial that is still in Python 2 (it does show up on the first page of Google for me). It's also especially irresponsible and hubristic to attempt to make a language that is seemingly compatible with both Python 2 and 3, because 1) I trust that if it was possible Guido and the other developers would have made it, and 2) it can cause significant confusion when code doesn't work when it hits an edge case, and then the whole tooling around it can't be guaranteed to work. The last thing I'd want in my programming language is unaccounted for ambiguity.
> It's also especially irresponsible and hubristic to attempt to make a language that is seemingly compatible with both Python 2 and 3, because 1) I trust that if it was possible Guido and the other developers would have made it
It is possible actually, that's kind of the point! The interpreter I've been working on passes the 2.7 unit tests (i.e. those in Lib/test/), as well as unit tests for the new features that have been backported from Python 3.
Even if you don't believe me, it's interesting to note that, e.g., while Python 3.0 was being developed, function annotations and keyword-only arguments coexisted with tuple unpacking. I built the code and ran it myself, in fact: https://twitter.com/naftaliharris/status/784421498291310592. Tuple unpacking was actually removed later, introducing the backwards incompatibility after the new functionality had been added. Timeline:
Oct 2006, keyword-only arguments.
Dec 2006, function annotations.
Mar 2007, removing tuple unpacking.
There was also a promising backport of keyword only arguments to CPython 2.6 (!) that was never merged, (http://bugs.python.org/issue1745), due to lack of follow-through.
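For readers who haven't met the three features in that timeline, here is a small Python 3 sketch of what each one looks like. The `scale` function is invented for illustration; the PEP numbers in the comments are the ones that introduced (or removed) each feature.

```python
def scale(point: tuple, *, factor: float = 2.0) -> tuple:
    """`factor` is keyword-only (PEP 3102); the annotations are PEP 3107."""
    x, y = point  # Python 3 needs an explicit unpack in the body
    return (x * factor, y * factor)


# Python 2 allowed unpacking in the signature itself, which PEP 3113 removed:
#     def scale((x, y), factor=2.0): ...

print(scale((1, 2), factor=3.0))
print(scale.__annotations__)

try:
    scale((1, 2), 3.0)  # positional use of a keyword-only argument
except TypeError as exc:
    print("TypeError:", exc)
```

Nothing about the first two features depends on the third being gone, which is consistent with the point above that they coexisted for a few months during 3.0 development.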
I don't know about that. It forked a Python compiler, and is fully interoperable with 100% of Python 2 code, and much of Python 3 code. It's even compatible with Python C extensions.
Why isn't it a valid Python compiler?
To me, the whole morass about trying to end-of-life Python 2 is a bit silly. People have gotten emotional about the situation.
On one side, people like Zed Shaw are calling the Python maintainers 'evil' and claiming conspiracy.
On the other side, people are calling companies using Python 2, 'lazy' and claim they're a threat to the ecosystem.
Yet elsewhere, C is still being written in all of its various year-specific formats, and people end up using 'old' versions simply because they join pre-existing projects or need to totally interface with something that's written in an 'old' version.
Python is an established language; it's likely that 10 years from now there will still be Python 2 codebases going strong.
That's the thing - people expect core developers (some of them working on the project for free) to keep backporting crap out of the kindness of their heart. It's like asking Microsoft to backport security fixes and .Net features into VB6.
There are some fair points, but unlike C or other languages that have "old" versions, most of the newer versions are compatible with this old code (as in, if you have some C89 code, you can compile it in the newest C compiler. Same with Fortran). This isn't the case with Python 3 (there are breaking changes), and I think it's fair that the Python core developers who, besides Guido, probably work on this for free decide that it's time to end the older, non-compatible version and give ample time for developers to move their codebases, adding bug fixes and security fixes in the mean time until EOL is reached.
My biggest gripe with this project, besides calling it Python, is that it's seemingly ambiguous about its code compatibility. I don't mind ambiguity in programming languages, but generally the ambiguous cases are explicitly defined with cases to explain them, and I don't see anything of that nature here, only something that vaguely says that if something works in both, the Python 2.7 way will be the default. Without defining those it's hard to know what could happen in an edge case, and this could introduce specific bugs that don't present themselves immediately but introduce data weirdness because the cases where something may be ambiguous weren't defined.
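To illustrate the kind of edge case meant here, these are a few expressions that are valid on both 2.7 and 3.x but evaluate differently. This is a sketch of well-known differences, not a list taken from the project; the comments describe standard CPython behaviour, and the values printed are whatever the interpreter running it produces.

```python
def observed():
    """Evaluate expressions that are legal on both 2.7 and 3.x."""
    return {
        # 2.7: 3 (int / int floors); 3.x: 3.5 (true division)
        "7 / 2": 7 / 2,
        # 2.7: 3.0 (half away from zero); 3.x: 2 (half to even)
        "round(2.5)": round(2.5),
        # 2.7: 'a' (a one-character str); 3.x: 97 (an int)
        'b"abc"[0]': b"abc"[0],
        # 2.7: 'list'; 3.x: 'map' (a lazy iterator)
        "type(map(str, [1, 2]))": type(map(str, [1, 2])).__name__,
    }


for expr, value in sorted(observed().items()):
    print(expr, "->", repr(value))
```

None of these raise an error on either version, which is exactly why they are the sort of thing that shows up later as odd data rather than as a crash.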
In any case, I think that if a company has a really big, maintained code base in Python 2, it's their fault for supporting an older, in-2020-unsupported version of a programming language and the money/developer time spent supporting the codebase could be spent transitioning it to Python 3. I can understand a little more with an open source project because time is more precious and generally that time is donated, but even then most bigger projects (Numpy, Scipy, Django) have moved to Python 2/3 compatibility so unless the project is gargantuan there's no real excuse besides the project is not maintained.
They transitioned with a lot of community support and did so over the course of years. It wasn't easy to support 2/3 out of the box. For a normal company, staying on 2 is like keeping technical debt, and we all know how companies are loath to allow weeks for major refactoring when the gains aren't immediately visible. My company switched to using PyPy and eschewing C extensions before we supported Python 3.
As I said, it's my understanding working with Python as well as seeing what others work with in the community and so it could be wrong. Do you have data to back up that people aren't?
I make software that people can write plugins for in Python. After months, years of struggle we finally dropped support for Python 2 because our small team could not bear the overhead of maintaining two bindings. We work a lot with researchers in the signal processing domain and we have a hard time as it is getting people to use Python 3. Please, do not put obsolete software on life support.
If you have several large software products rolled out and churning away at hundreds of customer sites, moving from Python 2.x all the way to 2.7 alone is a slow and tedious process of tests and deliberations. And we're still not talking about going all the way to 3.x which breaks things in even more new and exciting ways.
So scoff all you want, but Python 2.x isn't going away that soon.
If moving to a 2010 version of your programming language is slow and tedious you're doing something very wrong.
In the Java world (conservative and slow-moving) JRE 7 (2011) is considered the absolute minimum, and if you're not targeting JRE 8 (2014) you have to have a very good reason.
I think this is an inherent problem with dynamically typed languages. There are no reliable refactoring tools, so even the slightest non-compatible change can be lurking anywhere for years. And with things like meta programming, monkey patching and relying on private members, even changes that are supposed to be backwards compatible might end up not being so.
Try to move with half a million LoC of dynamically typed code with a handful of developers with not a single update breaking for any customer and I will be impressed.
Obsolete is a funny word to use. In this case, it would mean that Python 2 is in good working order, but is no longer wanted. That's bound for a flame war, because:
- There is a community that wants it (largely enterprise).
- The Python team does not want it.
A less controversial word is deprecated - the Python team is discouraging use of Python 2, but not prohibiting its use or development. That's fair, and if you read this page:
they are not very opinionated about it, largely saying "Use 3, unless you can't, then use 2 and start trying to migrate, unless you can't, then just use 2."
I will say, not to give somebody a bad day but, 2.8 seems like a bad idea. Currently Python's development has still largely been a straight line, which is good for transitioning, but 2.8 would cause a fork. It would give a lot of people a short-term win for a long-term loss. Better not to tempt people.
Obsolete was a wrong word to use, I admit that. But from an integrator's perspective supporting both versions is a mess. The problem is that the interpreter has the same name (python), the libraries export the same symbols (well, same names, different signatures for extra fun) etc.
Like you said, Python team sees the 3.x series as the successor AND as a replacement for Python 2.x. They were never meant to exist one beside the other (or, there was no thought put into this before the release).
From my perspective, giving people the choice between 2 or 3 will only give us problems down the road, which is why I vehemently discourage it.
I wonder what about this made this difficult. Was it because it's a language interpreter? Libraries have this problem sometimes, but not as much. (I never hear of issues with gstreamer between 0.10 and 1.0, for example.) Maybe it was just that a binary called python existed? Maybe we should have just said "screw it, python means python2, end of story."
Well, in my ideal world maintainers would have put all possible effort to porting libraries to python 3 and put python 2 versions into legacy mode (e.g.: security updates, fork it if you want to continue on the 2-branch).
In my field what seemed to keep people on python 2 for a long time was numpy or scipy (or both, I do not remember which) which did not get a 3 upgrade for a long time.
Either that, or just call it something different, kind of like perl6. There is no perl6 distribution shipping a perl library or some perl.dll that clashes with perl5.
>Please, do not put obsolete software on life support.
Regarding adoption, it's 3.0 that's obsolete, and 2.7 that's vibrant. Even for new code (they conveniently only count totally greenfield projects, but most new code is written in fact to work with established 2.x codebases under Python 2, not as a totally greenfield project).
You keep saying things like this but numbers don't bear that out. Since we switched to Python 3.5 for new code, I can barely tolerate working in 2.7 now. It went from feeling "vibrant" to feeling "OMG this is legacy" in about a week. I would never voluntarily go back.
There is a clear selection bias here, which is revealed in the first response (unless that was what you're pointing people towards). IDEs are a lot more common on Windows, which has the best adoption rate. The data from the two sources in that comment point towards a large majority of users still using 2.7, which agrees with my experience (which is in the scientific community). The number one reason is that there is no incentive. The attitude of "if it isn't broken, don't fix it" is common. At this point we need to recognize that python2.7 isn't going to die anytime soon, unless there is a drastic change.
While the claim of selection bias in JetBrains' survey may be argued, it is still a valid data source, which is what the gp was asking for.
Also, your anecdotal data is arguably biased as well.
My take is that many sources, including the ones linked to in the tweet's replies point to solid growth in Python 3 adoption. Python3 might not have overtaken Python 2 overall, but it's very far from being "dead".
>My take is that many sources, including the ones linked to in the tweet's replies point to solid growth in Python 3 adoption. Python3 might not have overtaken Python 2 overall, but it's very far from being "dead".
I wasn't making the point that Python 2 is dead, in fact quite the opposite. I'm saying that with this many users the adoption rate is too slow. There is no real reason for people to switch. Until there is that incentive Python 2 will not die.
I am glad that you explain why you downvote, but I disagree.
The fact that people love and use something doesn't mean it cannot be obsolete.
At work, I take care of software that is more than 35 years old. It is obsolete (it's written in mainframe SAS with 3270 green screens and some assembly), but people still love using it, mainly because there is no good alternative and it does the job very well.
What makes something obsolete in your eyes then? Just because some people want A to replace B, that makes B obsolete?
For reference, Oxford dictionaries define (..."define"? are multiple dictionaries involved here?) "obsolete" as:
1. no longer produced or used; out of date.
Clearly Python 2.7 is in widespread use, and version 2.7.12 came out just a few months ago, so it's neither "no longer produced" nor "no longer used" nor "out of date"...
Python 2.7 is outdated by Python 3.5. The fact that there is a bugfix release doesn't change that.
I mean look at other things. You can still program in C 89 or FORTRAN 77 or COBOL 74 (and no doubt there is somebody still supporting compilers and runtimes for those), but they are all obsolete standards.
Addendum: I think for standards like programming language semantics (which in case of Python is directly embodied in the C implementation), "obsolete" means there is a new standard by some official body (say, the developer of the old standard) that addresses shortcomings of the old standard. So "out of date" is the fitting equivalent of "obsolete" from the Oxford definition.
>Python 2.7 is outdated by Python 3.5. The fact that there is a bugfix release doesn't change that.
That's just what the lead project team declared. Not what the user base asked for or wants.
>You can still program in C 89 or FORTRAN 77 or COBOL 74 (and no doubt there is somebody still supporting compilers and runtimes for those), but they are all obsolete standards.
That's because people stopped using them organically. That's not the case with Python 2 -- Python 3 was declared "the new hotness" with a decree from above.
It's as if the W3C comes out with some incompatible HTML NG on their own and says that HTML 5 is "end of line", giving billions of webpages the middle finger.
Even worse, it's also as if HTML NG only had some marginal improvements over HTML 5, and was otherwise the same.
> That's just what the lead project team declared. Not what the user base asked for or wants.
You think they are doing it just for kicks? There are no issues with Python 2? They are also part of the user base, and they did it for a reason.
> That's because people stopped using them organically. That's not the case with Python 2 -- Python 3 was declared "the new hotness" with a decree from above.
Well, I for instance stopped using Python 2 when Python 3 came out, if it was possible for me to do so (I had all the libraries I needed). I understand that many people can't do it, but people are organically moving from Python 2 to Python 3, not the other way around.
Nobody is really forcing you to not use Python 2, just as nobody is forcing you not to use FORTRAN 77 or COBOL 74. It's just that the language will not evolve anymore, and as far as runtime goes, you will be on your own eventually.
> It's as if the W3C comes out with some incompatible HTML NG on their own and says that HTML 5 is "end of line", giving billions of webpages the middle finger.
I think I already addressed this in my other comments.
Yes. From a false sense of "we know better than you what's good for you". And also from not being connected to actual business and end user needs.
>There are no issues with Python 2?
That's irrelevant. There are issues with Python 3. Besides, the issues that Python 3 fixed over 2 are marginal at best and most could be retrofitted to 2.x (as this 2.8 release proves). Nothing earth shattering to make the transition worth it.
>Nobody is really forcing you to not use Python 2, just as nobody is forcing you not to use FORTRAN 77 or COBOL 74. It's just that the language will not evolve anymore, and as far as runtime goes, you will be on your own eventually.
It's more likely that Python will suffer from people moving to other languages (and already, first Rails and then JS have won the server side over Python big time, and JS looks poised to be more general use too), than that anything good will come out of this "you're free to use 2.x, it just won't be updated anymore".
> And also from not being connected to actual business and end user needs.
I am not really personally bothered by Python 3 being incompatible (with that one Jython exception that I already mentioned). But I would like to point out the comment https://news.ycombinator.com/item?id=13146127, I think you're the one who is wrong here.
> It's more likely that Python will suffer from people moving to other languages
Unlikely. Rails is probably going out of fashion. Javascript is a terrible language, whose only saving grace is decent support in browsers. I am not sure for what other reason, choosing a language today, I would choose Javascript over Python 3.
So people moving from Python 2 are most likely to end up with Python 3, I don't really see compelling alternative for them (unless they are going to something more functional like Clojure or Haskell or Scala, but that's entirely different discussion; for example I like Python a lot but I feel pure functional is where the future is, I find the imperative programming quite annoying these days, I would prefer Haskell, but frankly, I am not nearly as productive in it as I am in Python, because Python's focus on usability is very hard to match by any language).
On the other hand, I think Python 3 will actually gain in science and data analysis thanks to things like @ operator for matrix multiplication.
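For anyone who hasn't seen the `@` operator mentioned above (it arrived in Python 3.5 via PEP 465), here is a minimal sketch of how it dispatches. The `Matrix` class is hypothetical and exists only for this illustration; in real scientific code one would use a library such as NumPy, whose arrays support `@` directly.

```python
class Matrix:
    """Tiny hypothetical matrix type, only to show how `@` dispatches."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __matmul__(self, other):
        # `a @ b` calls a.__matmul__(b) on Python 3.5 and later
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(x * y for x, y in zip(row, col)) for col in cols]
             for row in self.rows]
        )


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])
product = a @ b
print(product.rows)
```

On Python 2 the same expression is a syntax error, which is part of why it counts as a Python 3 selling point for numerical work.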
I see no future for Python be that 2 or 3. It's not great at anything, but projects a veneer of friendliness (that one should quickly outgrow) on top of a pile of bad implementation decisions and terrible design.
Its popularity is based on superficial attributes rather than solid foundations. Eventually, the entire ecosystem will collapse and the masses will flood to the next attractor.
Julia can be a good competitor to Python, but far in the future. Now it's just not there yet.
Go is interesting, but really a different (and perhaps smaller) use case. What, for instance, do I do in Python? That little one-off script that converts one thing to another or calculates something - I am not sure why I would even bother thinking about Go.
I have no doubt that at some point, Python will be replaced by something. But I don't think it will be any of the languages that are currently in widespread use. Heck, C is also not based on solid foundations (I mean like type theory or something), and it wasn't fully replaced yet.
> You can still program in C 89 or FORTRAN 77 or COBOL 74 (and no doubt there is somebody still supporting compilers and runtimes for those), but they are all obsolete standards.
The situation with C89 and Fortran 77 is completely different than what you see today with Python 2 vs Python 3. For 99.9999% of C89 and Fortran 77 code you can build the old code with new C and Fortran compilers and use it from today's standards. You can take a piece of code written 30 years ago, recompile it and it usually works.
I don't disagree it is different. My argument was that a new release of compiler for an obsolete standard doesn't cause the standard not to be obsolete.
I am pretty sure there is a theoretical way to run Python 2 code alongside Python 3 code, but they simply decided it's not worth the effort.
Edit: I think historically the Python 2/3 divide is more akin to the Maclisp/Common Lisp divide, but in the latter, the situation was even more complicated. But I doubt you can just take Maclisp code and run it on a Common Lisp implementation, despite the fact that it was meant as a successor.
Thanks for bringing up Fortran 77. Lots of F77 code is in use today through R and SciPy bindings etc. -- millions of users every day, thousands of compilations every day. Just because new code is not written in it, it is very much in use, nobody wants to rewrite solid code that has stood the test of time.
The word you are looking for is "deprecated", not obsolete.
> Addendum: I think for standards like programming language semantics (which in case of Python is directly embodied in the C implementation), "obsolete" means there is a new standard by some official body (say, the developer of the old standard) that addresses shortcomings of the old standard.
Really? So even if no one ever uses it, it still renders the old one obsolete?!
> So even if no one ever uses it, it still renders the old one obsolete?!
This is a strawman, because I don't think this ever happens (feel free to give an example). There will always be people who try to use new standard; they may abandon it later, but they will at least try to use it.
In any case, this is not really relevant to Python 3, which is used plenty and more and more every day.
>> So even if no one ever uses it, it still renders the old one obsolete?!
> This is a strawman, because I don't think this ever happens (feel free to give an example). There will always be people who try to use new standard; they may abandon it later, but they will at least try to use it.
...I thought it was obvious I didn't mean the case where literally NO ONE was using it, but apparently it wasn't. Sorry. My point was, if it doesn't catch on, then does it still render what came before it obsolete? Is it only about time and whether it fixes some things from before? Not about whether it's actually used, or whether it introduces other problems, or whether the previous technology is still in widespread use, or a million other factors? Really?
As for Python 3 being used and more every day, yes, I never claimed it was obsolete or dead or anything else. I'm just saying Python 2.7 is being used too, and hence it's not obsolete either as you seem to think. You claimed it was, so I asked for your definition of the term. You're rejecting the standard one and you still haven't given me one that you're willing to apply to things other than Python. Not to mention I don't see why software deserves special treatment for the word's definition here.
I don't think this is going to be a productive discussion, so this is my last comment on the matter.
I already gave you a definition of what it means to be obsolete for standards (such as specifications of programming languages).
In general, an old standard will become obsolete once the new standard (that is supposed to replace it) is finalized (for example, in RFCs, they explicitly say that). At that point, there are probably no serious users of the new standard yet, so the actual usage doesn't matter.
Of course a standard can be de facto rejected by people abandoning it instead of accepting it. Then usually there will either be another standard that obsoletes the old one again (as was the case with XHTML and HTML 5), or people will move on entirely to something else; in either case, the old standard will remain obsolete.
You should be aware that trying to badger people with strict adherence to an arbitrarily-chosen definition of a term as a way to avoid countering their arguments does not make you look intelligent, does not make you look well-qualified to argue the topic, and does not make you look like you're winning the argument. Resorting to technical haranguing about the definition of a term typically, in fact, gives the appearance of someone who does not have an argument to make and is searching for any way to try to salvage a declaration of victory.
I'm not trying to make anything or anyone "look" any particular way. I'm trying to present an argument against the undeserved misuse of a label that carries a negative connotation and that is carelessly thrown around in this industry far too much in order to dismiss technologies people don't consider smoking hot enough for their tastes.
How in the world was my definition "arbitrarily-chosen"? I literally chose definition #1 on Oxford English dictionaries. That's arbitrary?!
I disagree. He uses complicated words where simple ones would do and other words that are redundant. For example, read his sentence after removing "typically, in fact,". It conveys the same meaning but sounds less pretentious.
I'd argue Python 2.7 counts as no longer produced.
The 2.7.x releases with their bugfixes are akin to an electronics company still honoring the warranty of tape recorders and still repairing them.
That doesn't mean tape recorders are not obsolete, especially since the company is not making them anymore. I consider the parallel 'making software' to be the process of
In your eyes is anything that isn't getting more and more features added every few months necessarily obsolete? Can't something just become mature and fulfill its goals at some point? Do you consider T-shirts to be obsolete too? If they kept adding more and more attachments ("features") to your clothes every few months to prevent them from becoming "obsolete" you'd be walking around in really heavy clothing...
Your comment is particularly apt as the programming language du jour is driven by fashion, not technology. We could still be using COBOL and be just as productive churning our CRUD apps as we are with the latest JS frameworks today... But one's hot and one's not.
The reason COBOL is obsolete is that there are no useful programs which are easiest to express in COBOL anymore, unless you are already using COBOL. It is truly obsolete in a way unrelated to fashion. Even if somebody were to supply the tooling, writing apps in COBOL would not be as productive as using a more modern language. It just lacks the expressiveness.
For comparison, C is not obsolete because a lot of useful programs are easiest to express in C still. I'd argue that C is obsolete for App development too, but there will be people who disagree with that.
This is just wrong. For example, the "best" recording microphone (to many artists), the U67, can no longer be made because the parts aren't available anymore. Yet it is the most popular mic, and 100% not obsolete.
Similarly, Python 2 might not be made anymore, but it is used everywhere, and people are making new things with it. So...it's also not obsolete.
That's not for the manufacturer to decide. Python 3 is like Coca Cola declaring that the New Coke is all people should drink, and stopping production of classic coke.
Python 2 is still being "worn" by millions of programmers, and is what runs in the biggest installations. This includes new code written for those installations, that it's written to run in the same 2.x environment.
> That's not for the manufacturer to decide. Python 3 is like Coca Cola declaring that the New Coke is all people should drink, and stopping production of classic coke.
That is within the rights of a manufacturer though. Coca Cola can continue to produce classic coke because it isn't any more or less complicated to produce than New Coke. The PSF's opinion is the new features they develop are best developed on top of the core changes in Python3, and that adding new features to Python2 is too expensive to maintain in addition to Python3.
I feel like the cases are too different to work.
Python 2 is still being "made" in the sense you consider T-shirts are still being made, though. It's still provided for download, and people are downloading it and using it, and even using it for new things. Heck, it's even getting bugfixes which is a plus. It's just not getting features added, and it happens to be software so reproducing it happens to be trivial compared to "hardware" like clothing.
So, do your comparisons correctly. No matter how much you insist, Python 2 just isn't dead (or obsolete, etc.). Lack of new features doesn't imply obsolete.
You can still download the operating system for an Amiga, yet the Amiga is obsolete (despite diehards still using 31 year old computers).
Python 2 is no longer being actively developed. It receives bug fixes, and even those are scheduled to stop before too much longer. The fact that people are using Python 2 means that it's not dead, but that doesn't mean it isn't obsolete.
Think of it this way. You need a library to solve some problem, and you find one on Github. The project hasn't received any major updates in 3 years. Some people have submitted pull requests, and a few of them have even been merged, but it's clear that the maintainers are focused on other projects these days. Is that an indication that this project is simply mature and no further work is required? Or could it indicate that the maintainers want to work on other things, and this is not a priority for them any longer?
Saying that Python 2 is obsolete isn't an insult. Python 2 is popular, and loved by many, many people. It has been adopted as a teaching language by many schools, has inspired multitudes of people to learn to code, and has achieved prominence in data science, machine learning, and scientific computing. It also happens to be at the end of its lifecycle, development has moved on to Python 3, and the developers don't have much interest in maintaining Python 2 any longer.
That doesn't diminish the accomplishments of Python 2 or the people who love it. However, it does mean that the label fits.
(No, it's not related to the matter at hand, and certainly should not be taken as an attempt to further complicate this not especially fruitful wrangle. But it's a goodie and you reminded me of it so I figured I'd share.)
Well, in some respects T-shirts made in the 80s are obsolete, even though functionally they still work. Fashion changes, materials change, cuts change, etc.
You could still wear them today but you'd be working against today's "protocols".
Yeah, so for them it wasn't obsolete. For other communities it was. I don't see the problem. Obviously obsoleteness (word?) isn't a property of the product; it depends on the context (how much it's used, what else is available, how much the alternatives are used, etc... however you want to weight them).
I agree, I also wish he would give actual numbers, rather than points on a log chart, which are very hard to estimate by eye.
But to be fair, even with that, I doubt you would get more than 30% of Python 3 users. Which is kinda in line with other surveys, such as the one from JetBrains. (It's probably a good guess that users of Python applications are even more conservative in upgrading than developers of Python applications.)
The worst thing about open source is that people can do stupid stuff with your software.
If you're going to create this abomination, at least do us all a favour and DON'T call it Python. Call it Retardython or something. I don't want to imagine people coming into the official support channels and claiming they are using "Python 2.8", then other people lecturing them about what that software really is, etc. Sounds like a horrible waste of time. (Source: I spend many hours a week helping fellow Python users.)
Python programmers and companies with python code should spend the time and effort to move to python 3 instead of spending that time and effort to backport stuff to python 2, because python 2 is deprecated and the future is python 3. I think people and businesses with python 2 code would be better off moving their code bases to python 3 instead of doing things like this.
Companies with Python code are probably better off keeping their working, tested code than switching to an incompatible interpreter and set of libraries which among other things will print "b'Hello',b'World'" into their mission critical CSV files.
Yes, the built in csv module really does that in Python 3.
> Yes, the built in csv module really does that in Python 3.
If you pass it bytes, yes, it does. If you pass it strings, no, it doesn't.
If what you pass to the built-in CSV writer is not a string, the CSV writer will call str() to get a string representation it can write out. The string representation of a bytes object includes the 'b' prefix.
Meanwhile, you discovered your bug: you were treating bytes as text, which is likely to blow up on you sooner or later, and thanks to how Python now handles text, it blew up on you immediately as a way to remind you not to treat bytes as text.
What you probably think you want is for the CSV writer to realize it got a bytes object and, instead of calling str(), call its decode() method to get text it can write. But that is once again a dangerous operation, and sort of the whole point of Python 3's text changes is it won't let you get away with that stuff anymore.
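The behaviour described above is easy to reproduce. A minimal sketch, using an in-memory buffer instead of a real file; the sample values and the choice of `"ascii"` for decoding are assumptions made for the example, and in real code the encoding has to be whatever the data was actually encoded with.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

raw_fields = [b"Hello", b"World"]   # bytes, e.g. read from a binary source

# Non-string fields are converted with str(), so the bytes repr is written
writer.writerow(raw_fields)

# Decoding explicitly first gives the text that was presumably intended
writer.writerow([field.decode("ascii") for field in raw_fields])

csv_text = buf.getvalue()
print(csv_text)
```

The first row comes out with the `b'...'` wrappers, the second as plain text.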
It doesn't "blow up". If it blew up and retired, I would have seen the problem. The problem, like several python 2/3 incompatibilities, is that Python 3 merrily did something different, without telling anyone, until eventually we track down what has changed. I spent quite a while on this very bug myself, and it, along with others, persuaded me to switch to a different language (serious I know, but I was just getting annoyed with python's general loose dynamic nature, in combination with the python 2/3 changes.)
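For what it's worth, the interpreter does have an opt-in switch for this class of bug: the `-b` command-line flag emits a `BytesWarning` when `str()` is called on a bytes object, and `-bb` turns that warning into an error. A minimal sketch, assuming the csv writer reaches `str()` for each non-string field; the one-line `SNIPPET` is hypothetical and only stands in for real application code.

```python
import subprocess
import sys

# Hypothetical one-liner that pushes a bytes field through csv.writer
SNIPPET = "import csv, io; csv.writer(io.StringIO()).writerow([b'Hello'])"


def run_snippet(*flags):
    """Run SNIPPET in a fresh interpreter with the given command-line flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


default_run = run_snippet()       # no flag: the repr is written silently
strict_run = run_snippet("-bb")   # -bb: str() on bytes becomes an error
print(default_run.returncode, strict_run.returncode)
print(strict_run.stderr)
```

This does not change the default, which is the complaint here, but it can help when hunting for such spots during a port.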
It's unreasonable to say "merrily did something different, without telling anyone" when fixing the string implementation was a significant reason for creating backwards-incompatible Python3 in the first place.
But the fact that the same code now silently does the always wrong thing in Py3 wrt CSV is clearly a bug.
Actually, the design defect here is calling str() on everything, and assuming that the output is sensible for CSV. It may be a decent rule of thumb, but it clearly does not apply to bytes. Given the likelihood that someone might mistakenly use bytes as a string (for example, because they're porting a legacy Py2 codebase), this should be a hard error, immediately reported as such, and not just a silent behavior change.
Except if you make an exception for bytes, what about other types that might get passed into a CSV writer, whose __str__ is something "wrong" for CSV purposes? Do they also get auto-detected? Do we add a new __csv__() method just for when outputting to CSV (since it might not be "wrong" for other output formats)? Or do we ditch str()-ifying altogether, but then add back in a bunch of special cases for numeric types and other things where str() is "the right thing"?
Or do we say "CSV outputs strings, whatever is the string representation of what you passed in is what gets written out", and trust people to figure out when they're working with something that has a "wrong" string representation for their use case?
Because remember: the whole underlying cause of this was treating a dangerously non-string value as a string. Those bytes objects should have been decoded to strings long before reaching the CSV writer. Python 3 does raise more and louder exceptions when you pass bytes to things that expect strings, but the CSV writer isn't a thing that expects strings; it expects things that have a string representation, and several common use cases get much more difficult if you change that to force every user to explicitly do throwaway casts to string in the name of protecting people who keep insisting on writing dangerous "I'll treat bytes as string until it breaks, and then complain that the language did the wrong thing, not me" code.
`bytes` is plainly special case for historical reasons here - it's something that is not a string, but that so many people assume to be a string.
So yeah, I would be fine with making an exception for it (and providing some kind of option to disable that exception, for that incredibly rare case where someone really does need b"foo" in their CSV output).
And then in 5 years, flip the default of that switch, and deprecate it. In another 5, remove it entirely.
Also, note that raising an error in this case is not placating the people who insist on using bytes as strings. Quite the opposite - it very loudly and unambiguously tells them that they're wrong, and how exactly they're wrong.
The entire problem, though, is people assuming bytes and strings are interchangeable. Anything which allows that assumption to go unquestioned, or without program-wrecking consequences, leads right back to where we were. And the "phase it out" model doesn't work; you proposed a ten-year phase-out, but in ten years people are just going to say "we never updated our code, we're not ready, keep it this way another ten years and we'll think about fixing our code". The only thing that works is actively breaking people's programs when they try to intermix bytes and strings.
> The only thing that works is actively breaking people's programs when they try to intermix bytes and strings.
Um, this is exactly what I proposed above!
"this should be a hard error, immediately reported as such, and not just a silent behavior change."
What I'm asking for is that csv writer raises an exception if it sees bytes anywhere by default. The problem is that right now, it doesn't! It just gives you "incorrect" output, that might go undetected for a long time.
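Until something like that exists in the standard library, the same effect can be had in user code. A minimal sketch: `StrictWriter` is a hypothetical wrapper invented for this example, not part of the csv module, and the error message wording is made up.

```python
import csv
import io


class StrictWriter:
    """Hypothetical wrapper: refuses bytes instead of writing their repr."""

    def __init__(self, fileobj, **kwargs):
        self._writer = csv.writer(fileobj, **kwargs)

    def writerow(self, row):
        row = list(row)
        for index, field in enumerate(row):
            if isinstance(field, (bytes, bytearray)):
                raise TypeError(
                    "field %d is %s, not str; decode it first"
                    % (index, type(field).__name__)
                )
        return self._writer.writerow(row)


buf = io.StringIO()
strict = StrictWriter(buf)
strict.writerow(["id", 1, 2.5])          # strings and numbers pass through

try:
    strict.writerow(["id", b"Hello"])    # bytes are rejected loudly
except TypeError as exc:
    error_message = str(exc)

print(buf.getvalue())
print(error_message)
```

Numbers still go through `str()` as before; only bytes-like fields are treated as an error.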
Oh we are perfectly well aware of our encoding issues. In NumPy, one of the most important libraries in Python 2 or 3, strings always take one byte per character. And it's not going to change. So the built in csv module needs to support a reasonable behavior. Which it does not.
If a company tries Python 3 and discovers basic things like CSV produce utter gibberish, they would do well to opt out. And they do--in droves.
My data is not misencoded, you see. It's just misunderstood.
Numpy isn't really used like that though. It's for numerical computation. There might be cases for putting text in there but you can always keep it locally and map it to an int that you use in numpy for mapping (I do that in places).
> So the built in csv module needs to support a reasonable behavior. Which it does not.
It could be argued that the csv module's behaviour is reasonable, and NumPy's isn't. (I'm not 100% sure about all the details of this issue.) Hopefully, NumPy will change its behaviour to match Python 3, but if not you could still use the NumPy CSV routines like `loadtxt` or `genfromtxt` [0]. So then this becomes a documentation change to add some warnings to both modules.
> they would do well to opt out. And they do--in droves.
This is simply not true. They would do well to handle strings properly and so avoid bugs in future - something Python 3 actively encourages, and Python 2 obscures. And while I can't speak for every company, our metrics show that our Python 3 code has far fewer customer issues than Python 2, Perl, or Ruby. Now that's business value. (Edit: I mean it's hard to make the comparison - the Perl code is e.g. older - but we're writing code now, and when the interns add new stuff to the Python 3 codebase, it breaks less. All of them are still actively developed, and the Ruby one is about as old as the Python 3 one).
I say this as someone who uses the latest version of Python available in every new project or script.
Text encoding issues are absolute garbage in Python 3.x
I fucking hate the way that csv module works with text encodings.
As soon as I can figure out a reliable way to take latin-1 and save it as UTF-8 without breaking everything, I will try to shoehorn in a PR.
Right now, it's fucking awful. My ETL pipeline hates it, I hate it, my boss hates it, and my internal constituents hate it. Because it sucks.
A file I can read in one encoding and write as another should be readable with the encoding I wrote it in. That is not currently the case with the latest version of Python.
You're definitely making a mistake somewhere, because I just tested it for myself and it worked perfectly fine. I made a latin-1 file, applied the above code with it, and got a correct utf-8 file out. Are you reading the final file back as latin-1? You have to read it as utf-8 of course.
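A minimal sketch of the kind of round trip being discussed (this is not the code referred to above, which isn't reproduced in the thread). The helper name `transcode_csv`, the file names and the sample row are all made up for the example.

```python
import csv
import os
import tempfile


def transcode_csv(src_path, dst_path, src_encoding="latin-1", dst_encoding="utf-8"):
    """Copy a CSV file, decoding with one encoding and encoding with another."""
    # newline="" is what the csv documentation recommends for file objects
    with open(src_path, newline="", encoding=src_encoding) as src, \
            open(dst_path, "w", newline="", encoding=dst_encoding) as dst:
        csv.writer(dst).writerows(csv.reader(src))


workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "in_latin1.csv")
dst_path = os.path.join(workdir, "out_utf8.csv")

# Create a small latin-1 input file containing non-ASCII characters
with open(src_path, "wb") as f:
    f.write("café,naïve\r\n".encode("latin-1"))

transcode_csv(src_path, dst_path)

# The result has to be read back as UTF-8, not as latin-1
with open(dst_path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))
print(rows)
```

If the output looks mangled, the usual culprits are opening the result with the old encoding or guessing the input encoding wrong.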
To be perfectly clear: bytes (b'') is not a string. Again: bytes is NOT a string. It is an array of octets, aka bytes, aka unsigned 8 bit integers. NOT characters. NOT a string.
If you are dealing with bytes that are encoded representations of a string, then you have to know what encoding they use to decode them and treat them as strings.
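To make the distinction concrete, a small sketch; the sample value is arbitrary and chosen only because the octet 0xE9 means "é" in latin-1 but is not valid on its own in UTF-8.

```python
data = b"caf\xe9"               # four octets

octets = list(data)             # iterating bytes yields integers, not characters
text = data.decode("latin-1")   # decoding with the right encoding gives text

try:
    data.decode("utf-8")        # the same octets are not valid UTF-8
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(octets, text, utf8_ok)
```

The same four octets are either "café" or an error depending on the encoding, which is why the encoding has to be known rather than assumed.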
I'm not sure what you mean. If you don't know what the encoding of the input file is you have a problem. As far as I know there are libraries to guess the encoding, but it cannot be determined with complete accuracy.
> It could be argued that the csv module's behaviour is reasonable
I don't see how silently printing a binary literal, if that is indeed what it does, is reasonable. Simply put, b"foo" is not meaningful CSV.
What it should do is 1) raise an exception by default, informing the user that they need to be supplying strings and not bytes, and 2) provide an explicit switch to treat binary data as pass-thru, which would be useful in scenarios where you're just reading a file and dumping it elsewhere, and don't want to spend time decoding and then encoding everything.
The docs say that "[a] row must be an iterable of strings or numbers" [0]. So I guess an exception could be raised. However, the docs do tell a lie; non-strings are accepted and get converted to strings. You can pass any object in which has a string representation - including a byte array. It actually wouldn't be too hard to introduce a check for a bytes field, https://hg.python.org/cpython/file/3.6/Modules/_csv.c#l1227
+ if (PyBytes_Check(field)) {
+ append_ok = FALSE;
+ Py_DECREF(field);
+ PyErr_SetString(PyExc_TypeError, "Field is bytes");
+ }
else {
This would then raise a TypeError.
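To make the current behaviour concrete without patching `_csv.c`, here is a rough Python-level sketch. `StrictWriter` is a hypothetical wrapper invented for this example, not something the csv module provides; it only mimics the check proposed in the C snippet above.

```python
import csv
import io

# What the stock writer does today: a bytes field is passed through str().
buf = io.StringIO()
csv.writer(buf).writerow([b"foo", "bar"])
silent_output = buf.getvalue()


class StrictWriter:
    """Hypothetical wrapper that refuses bytes fields instead of writing their repr."""

    def __init__(self, fileobj, **kwargs):
        self._writer = csv.writer(fileobj, **kwargs)

    def writerow(self, row):
        row = list(row)
        for field in row:
            if isinstance(field, (bytes, bytearray)):
                raise TypeError("Field is bytes")
        return self._writer.writerow(row)
```

With the stock writer the row above lands in the file as `b'foo',bar`, while `StrictWriter(buf).writerow([b"foo"])` raises `TypeError` before anything is written.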
I don't think this is the right solution. It seems weird to have a special case because people aren't watching what they're putting in. Garbage in, garbage out, consenting adults and all that.
Yes, "special cases aren't special enough to break the rules".
But "practicality beats purity".
And "errors should never pass silently"!
It's the same kind of thing that leads to safety warning stickers put on products. You may read it and think that it's something so obvious that consenting adults should know better. But then you look at the statistics about how many people did not, and realize that, yeah, a sticker along the lines of "don't stick your finger into a food processor" is actually a good idea. Especially given how cheap it is, and how expensive reattaching fingers is...
Basically, products should be designed around known human weaknesses, and that includes entrenched modes of thinking by past products. It doesn't mean that new products should accommodate those entrenched modes, especially when they lead to other problems. But they should try to detect them, and issue clear and explicit warnings, to guide the person to the proper way of doing things.
I moved to Python and got back into coding specifically to work with csv files. And from the first time I used code copy and pasted from SO or wherever to import and export CSV's with Python 3 I have not found that I am plagued by unreliable behavior and mischievous encoding. Yes there seems to be a type conversion necessary with NumPy, but it simply is not true that Python 3 has an endemic problem with CSV files!
In fact, the reason I chose Python was because I was able to dive so quickly into real problems like this with no problems whatsoever.
This is a minor issue for someone porting from 2 to 3, it is not a problem with 3
Companies using the built in csv module would be better off moving to something like csvparser.
Python's a superlative language but it has a pretty terrible set of included libraries. urllib2 isn't the only library with a superior alternative on pypi. Pretty much all of them do.
> Python programmers and companies with python code should spend the time and effort to move to python 3 instead of spending that time and effort to backport stuff to python 2 because python 2 is deprecated and the future is python 3.
I hate Python 3's removal of the (lambda (key, value): blah) tuple unpacking syntax, and the forcing of parentheses for print statements. They might seem minor but they aren't for me. So I'm not at all eager to move to version 3 and don't really see any benefit. Not sure if those who aren't migrating feel the same way, but I wouldn't be surprised if some of them do.
(Edit to address comment below: There are more issues I have with Python 3. It allows more bugs to slip through, for instance. I actually particularly like a comment I just wrote, so I'll link to it here: https://news.ycombinator.com/item?id=13145299 Do note that this was added after the reply below.)
Benefits? Unicode. Async. Extended library. Required keywords. Syntax improvements (lots of them, many more than just the removal of the print statement). Type hinting.
I meant I don't see any benefits for me, not benefits for other people. I assumed that was clear; sorry if it wasn't.
> Unicode.
Yeah, but some people have still been living without the changes, and it's hardly enough of a reason on its own (for me anyway) when there's other things I hate about the language.
> Async.
It's a nice feature, yeah. I can live without it, as people have for many years. Maybe if I was used to having it around I wouldn't want to go back, but I'm not.
> Extended library.
Cool! I'm not sure what exactly falls under this that I'm supposed to be missing, but pip install has sure been taking care of everything in the blink of an eye in version 2.
> Required keywords.
Cool! I need it about as much as I need a donut.
> Syntax improvements (lots of them, many more than just the removal of the print statement).
nonlocal is literally the only positive one I can think of right now that I'd actually care about. But then again, it comes up maybe 50x less often than the parentheses I have to write for print, or the tuple unpacking that I have to do. So yeah, it's hardly a reason to migrate.
> Type hinting.
Nice to have. I'm living just fine without it. Maybe I'd have migrated if it actually optimized things or did something more useful.
Well, things change, especially in tech. For better or worse, but most of the time for the better.
You should read some changelogs of past Python 3 releases. 3.6, for example, has ordered dicts by default, which is quite convenient when you need to write a test for a small dict with two items, for example.
I like driving an old muscle car; most of them look beautiful and take me everywhere I want. But damn, those new cars changed a lot and are much more comfortable. (But they do break as much ;))
The intention is to make the order guaranteed in 3.7 or 3.8, AIUI. There was some desire to prove the new implementation before guaranteeing its behaviours (i.e., in the worst case, if it turned out to be broken, they could revert to the 3.5 code and it would be valid).
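A small sketch of what that ordering means in practice. In CPython 3.6 it is an implementation detail; the language guarantees it from 3.7 on. The variable names are made up for illustration.

```python
from collections import OrderedDict

d = {}
d["zebra"] = 1
d["apple"] = 2

# Iteration follows insertion order, not key order.
keys_in_order = list(d)

# Plain dict equality still ignores order...
reordered = {"apple": 2, "zebra": 1}
plain_equal = d == reordered

# ...whereas OrderedDict equality is order-sensitive.
ordered_equal = OrderedDict(d) == OrderedDict(reordered)
```

So tests that compare printed or serialized dicts become deterministic, while code that needs order-sensitive comparison still wants `OrderedDict`.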
So you _hate_ Python 3 because of two changes in syntactic sugar? I would understand if you hated it because of the real breaking changes, but no...
I think you just don't comprehend the multitude of problems that Python 3 fixes by handling strings correctly... Maybe you've never handled Unicode before.
> So you _hate_ Python 3 because of two changes in syntactic sugar?
First of all, the tuple unpacking one is a HUGE readability AND maintainability issue; it's not just syntactic sugar. var[0][1][2] is not only far less readable than the unpacking notation, it doesn't even have the same semantics (doesn't enforce the structure of the tuple).
That means Python 2.7 helped me catch more bugs. Think about that!
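The structural point can be shown in Python 3 as well, since unpacking in the function body performs the same length check that the old parameter syntax did. A minimal sketch with made-up function names:

```python
def add_by_index(pair):
    # Indexing never checks how long the tuple really is.
    return pair[0] + pair[1]


def add_by_unpacking(pair):
    # Unpacking insists on exactly two elements.
    first, second = pair
    return first + second


malformed = (1, 2, 3)

silently_accepted = add_by_index(malformed)

try:
    add_by_unpacking(malformed)
    unpacking_rejected = False
except ValueError:
    unpacking_rejected = True
```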
Second, no, I just listed the two that irritated me the most every time I tried to switch, because they were the first things that came up by far the earliest and most frequently. Small inconveniences can be amplified through their frequencies. There are lots of things I don't like about it though... division becoming floating point division, having to say list(foo.items()) or list(map(...)) instead of just foo.items(), etc... again, more verbosity and typing for common cases where I really didn't mind the old way. If I wanted imap(), I could've just used imap; they could've just moved that to __builtin__ and made my life easier that way.
By the way -- the lazy nature of map(), etc. also means you catch fewer bugs now. Again, think about that! Just because it looks more efficient, that doesn't mean it's actually better. If there's anything I've learned, it's that even the smallest things tend to come with non-obvious tradeoffs.
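To illustrate the trade-off being described: with a lazy `map()`, a bad element is only noticed when the result is consumed, which may be far from where the `map()` call was written. A minimal sketch with made-up data:

```python
def parse(value):
    return int(value)


inputs = ["1", "2", "oops"]

# In Python 3 this line returns immediately and raises nothing.
lazy = map(parse, inputs)

# The failure only surfaces once something iterates over the result.
errors = []
try:
    results = list(lazy)
except ValueError as exc:
    errors.append(str(exc))
```

In Python 2 the `map()` call itself would have raised, at the line where the mistake was made.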
Finally, regarding strings: if you look at my earlier comments, yes, I already acknowledged the Unicode changes were for the better. Awesome. I agree. Cool? OK, but there are other things in the language besides Unicode though, and they're not as awesome. I don't spend my entire programming life dealing with Unicode strings, so I care about other things too, and they make my life harder. Simple as that.
Regarding tuple unpacking, instead of writing, say:
def foo((birthname, surname)):
    ...
you can write in Python 3:
def foo(name):
    birthname, surname = name
    ...
It's not less readable. I also missed it in the beginning, but it's not really a big deal. (I think they couldn't keep the feature because of how '*' is used, but I am not sure.)
Edit: If you have a problem with this in a lambda expression, just create a named inner function. It's a feature/shortcoming (depends on POV) of Python that you cannot bind variables in an expression. I hope you understand that they couldn't keep the feature in lambdas if they didn't keep it in proper functions.
I am sure if you think about other things, there are good reasons to do it the way Python 3 does it, usually there is a hidden case where things need to be disambiguated (like your list() examples).
I wish I could downvote. I specifically said I hate the tuple unpacking syntax change in lambdas. That's where I used it so much to begin with, not in defs! I obviously can't put statements like that in lambdas, and until now I didn't need to do that to make my code readable. Now I have to name all of my lambdas just to make this syntax work, which is nonsense. It used to be there and it worked perfectly fine.
Why would you want to downvote somebody trying to help you?
In any case, I think I see your problem. You are not the sole user of Python language. There are features that other people like (such as using '*' in unpacking), and so features you like are weighted against their use cases, and a reasonable compromise is made.
And frankly, I think if you like to use lambdas that much, you really want to program in a language where everything is an expression, such as Lisp or Haskell.
> In any case, I think I see your problem. You are not the sole user of Python language.
I'm glad I'm not. Otherwise I probably wouldn't be using it either. Not sure how that is my "problem".
> There are features that other people like (such as using [asterisk] in unpacking), and so features you like are weighted against their use cases, and a reasonable compromise is made.
I like that [asterisk] syntax too.
> And frankly, I think if you like to use lambdas that much, you really want to program in a language where everything is an expression, such as Lisp or Haskell.
Or I could just keep using Python 2.7 which works just fine, and not move to version 3 where I'm not welcome.
It's your problem in e.g. cases where you want a list returned by default and Python 3 returns an iterator by default. Why that is useful for many people was already explained.
That's why I used the word "trying". Maybe you should read more carefully before you accuse others of misreading something. ;-)
I think it's unfair to say that I misread his comment - he doesn't explicitly mention he is aware of the workaround I outlined for the functions, and that he is bothered with lack of tuple unpacking in lambda expressions only, not in ordinary functions.
Regardless, I still think it's quite impolite to downvote somebody who wants to help you and misunderstands you, if they are not e.g. factually incorrect. If you don't actually tell me where I am wrong, I cannot improve my answer. Also, this is not Stackoverflow, where that could be marginally acceptable (I am very strongly against downvoting without explanation).
> Regardless, I still think it's quite impolite to downvote somebody who wants to help you and misunderstands you
Well, I don't consider it a "misunderstanding" when there are literally just 2 things to note in my comment that you're replying to ("lambda" and "tuple unpacking") and you still somehow miss 1 of them. I think it totally deserves a downvote, because it makes me look stupid when you present a reasonable solution to a non-problem and make readers assume I was saying something other than I was, and on top of that I have to waste some 5-10 minutes of my time replying. That's not something I appreciate.
That said, like I said, I never actually downvoted that comment (because I obviously couldn't). So you don't need to worry about the internet points.
If you weren't busy taking things so personally, you could note that I already hinted in my comment on why I made it - I missed the feature of unpacking within function arguments myself at first too, until I realized that unpacking within the body isn't really less readable. (And please - do not waste time replying.)
I admit I don't use lambdas that much; since generator expressions arrived (which was around Python 2.4), they aren't really needed too frequently. And in most cases you're better off using a function anyway, because in Python statements are not expressions, as I already also stated. For example, I use print() for debugging frequently and this is tough to insert into a lambda. (And even in Haskell I prefer naming subexpressions to lambda syntax.)
> That said, like I said, I never actually downvoted that comment (because I obviously couldn't). So you don't need to worry about the internet points.
I am not worried about internet points (I actually got about 80 of them on this discussion alone, which is frankly ridiculously too much, and in practice, I find that comments I personally find to be the most insightful only rarely get most points), I am just really annoyed when somebody downvotes my comments without any explanation, because I am a very curious person and in most cases it's just a honest misunderstanding, which could be cleared up with, I don't know, actual communication?
And at least two or three other people actually downvoted my original comment, so I would like to use this opportunity to invite them to come forward with an explanation what they found so wrong about it.
list(map(lambda (some, thing): some + thing, everything))
# better
list(some + thing for (some, thing) in everything)
Or, as the parent suggests, just create a helper function, preferably one your Python environment doesn't need to set up every time your outer function is called:
# okish, "verbose lambda"
def compute(everything):
    def magic(elem):
        some, thing = elem
        return some + thing
    return list(map(magic, everything))
# probably better
def magic(some, thing):
    return some + thing
def compute(everything):
    return list(magic(some, thing) for (some, thing) in everything)
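One more option that nobody in the thread mentions, offered only as a sketch: `itertools.starmap` spreads each tuple over the function's parameters, so a lambda can keep named arguments in Python 3. The sample data is made up.

```python
from itertools import starmap
from operator import add

everything = [(1, 2), (3, 4), (5, 6)]

# Each tuple is unpacked into the lambda's two named parameters.
sums_with_lambda = list(starmap(lambda some, thing: some + thing, everything))

# For simple cases an operator function avoids the lambda altogether.
sums_with_operator = list(starmap(add, everything))
```

Like the old syntax, this checks arity: a three-element tuple makes the two-parameter lambda raise `TypeError` when that element is consumed.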
> Lambdas are not really pythonic these days anyway
Too much nonsense in your comment. Really now? How about you give a realistic example where syntactic sugar doesn't substitute for it? Like what am I supposed to pass to sort(key)? And incidentally, this tuple unpacking problem comes up when sorting frequently... and that's the prime example on their web page (and a realistic one at that) for where you're supposed to use lambdas: https://docs.python.org/3/tutorial/controlflow.html#lambda-e...
If you're telling me this isn't Pythonic, you're really just not being sensible.
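For the sort-key case specifically, here is a sketch of the usual Python 3 spellings, using the same pairs as the tutorial example linked above. Whether `itemgetter` reads better than the lambda is a matter of taste; the variable names are invented.

```python
from operator import itemgetter

pairs = [(1, "one"), (2, "two"), (3, "three"), (4, "four")]

# Lambda with indexing, as in the tutorial.
by_name_lambda = sorted(pairs, key=lambda pair: pair[1])

# Same ordering without a lambda.
by_name_getter = sorted(pairs, key=itemgetter(1))

# Several fields at once: sort by name, then by number.
by_name_then_number = sorted(pairs, key=itemgetter(1, 0))
```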
Yes, Really Now!
Your disdain for the answers people are giving you, and your abrasive demeanor tell me I can spend my time better than to discuss this further with you.
Regarding laziness of map() - lazy is a good default, because you can always make eager out of lazy, but not the other way around. Lazy is also more general, because it can handle both lazy and eager inputs, while eager will always force a lazy input.
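A quick sketch of the "lazy is more general" argument: a lazy `map()` can sit on top of an endless input, and you decide later how much of it to force. The `square` helper is made up for the example.

```python
from itertools import count, islice


def square(n):
    return n * n


# count(1) never ends; an eager map over it would never return.
lazy_squares = map(square, count(1))

# Force only as much as is needed.
first_five = list(islice(lazy_squares, 5))

# Making a finite input eager is a single call.
eager_squares = list(map(square, range(1, 6)))
```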
This has been the general trend in mainstream languages lately, not just in Python. E.g. in C#, all LINQ operations are lazy. In Java, the new stream API, to be used with lambdas, is lazy.
Notice in said languages they added lazy APIs. They did not remove eager APIs. Python already had imap, ifilter, izip, etc... I already said this and I'll repeat: I would've been just fine if they made those easier to use (e.g. no import). There was no need to change the behavior of existing APIs.
There were generally no map/filter/fold APIs in those languages, eager or lazy.
In cases where the APIs were there, they were generally not as easily accessible (i.e. they were the equivalent of imap etc, with some hoops to jump before you could use them). The new APIs are more straightforward to use.
The reason to change the behavior of an existing API is because the default (i.e. most obvious) API should also be the most flexible, and do the right thing in as many cases as possible. This was not the case with map etc in Py2.
The disadvantage of changing an existing API like that is that it breaks code. But Py3 broke code anyway, so it was a good time to introduce breaks like that for the sake of better defaults.
> There were generally no map/filter/fold APIs in those languages, eager or lazy.
Array.FindAll, Array.Convert, etc. all existed in C# beforehand. Though maybe this is what you meant in the next sentence.
> In cases where the APIs were there, they were generally not as easily accessible (i.e. they were the equivalent of imap etc, with some hoops to jump before you could use them).
This is going on a tangent but LINQ still has hoops to jump through. You have to say "using System.Linq;" at the top if you want to use the new syntax. That's like saying "from itertools import *" and then using imap, which you could've always done.
> Array.FindAll, Array.Convert, etc. all existed in C# beforehand. Though maybe this is what you meant in the next sentence.
Yes, it's what I had in mind. I have to admit that I completely forgot about ConvertAll (and so assumed there was no map).
I think the biggest reason why those weren't all that commonly used in practice, is because in .NET you often deal with opaque collection types (like ICollection<T>, or ReadOnlyCollection<T>, or even custom-made collections pre-generics) that are usually exposed on properties of objects. Since the concrete type is not known, you can't do List.ConvertAll etc.
This, by the way, is another point in favor of lazy implementations - they don't care about input type, because the output type is always "lazy sequence". Of course, you can have an eager map similarly not care about input, but then what should be the type of its output collection by default? No matter what type you choose, someone will complain that they wanted something else. Given Python's preference for explicitness, such a design would warrant several functions like map_to_list, map_to_tuple, map_to_set etc. But, of course, if you have a lazy map, you might as well just write list(map(...)) etc.
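Spelled out, the "just wrap it" approach looks roughly like this. The input values are made up, and `map_to_list` and friends remain hypothetical names from the comment above, not real functions.

```python
words = ["b", "a", "b"]

# One lazy map, three explicit choices of output container.
as_list = list(map(str.upper, words))
as_tuple = tuple(map(str.upper, words))
as_set = set(map(str.upper, words))

# The input container type does not matter either.
from_tuple_input = list(map(str.upper, ("x", "y")))
```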
> You have to say "using System.Linq;" at the top if you want to use the new syntax. That's like saying "from itertools import *" and then using imap, which you could've always done.
It's a bit different, though. When you import itertools, it brings all those functions into your global namespace. But when you import System.Linq, it only brings one static class into your global namespace; the actual functions are extension methods that only show up on the types to which they are applicable. So the resulting namespace pollution is far less in C#.
There's also the issue of "import *" being generally frowned upon in idiomatic Python, largely because of the way conflict resolution works there (silent override). In C#, if you happen to have clashing identifiers from usings, it'll prevent you from using them unqualified, so there's no good reason to avoid it.
That seems kinda petty. Surely you can work around that. If they really were killer features that many devs used I doubt they'd have been deprecated and removed in python 3. I do agree, though, that I wish lambdas could be more useful than one-liners.
Python programmers and companies with python code should spend the time and effort delivering new features for their customers instead of spending that time and effort forward porting stuff to python 3 because python 3 isn't delivering enough additional value to justify switching. I think people and businesses with python 2 code would be better off continuing to improve their code bases with new features instead of doing things like this.
This is how you keep using those efficient, numerically-stable subroutines written by a smart guy who retired 20 years ago in your new code. Unlike Python, Fortran has managed to add significant new features without breaking old code.
Unlike Python, Fortran hasn't seen an increase in usage and hasn't brought the joy(?) of programming to thousands of new programmers in the last 10 years. So sticking to Fortran or backwards compatibility blindly doesn't solve all the problems, either.
Maintenance is key. Most people don't stick around for 20 years anymore either. I know I'm going to have an easier time finding a new hire for a Python codebase. And he's going to have a far better chance at understanding said codebase. Code which nobody knows how to maintain will hurt us either with a fiendish bug, or limit our growth. So for me, slowly moving away from legacy stuff is good business value in the long run.
Remember, you can never be sure that Fortran code is 100% bug free. The test of time is as good as any other test, but not perfect.
Fair enough. But having worked with Fortran code that was designed in pen on yellow legal pads, typed in, and run on a suite of standard test problems chosen to expose subtle bugs, I have learned to respect an archaic engineering style of which few people are capable nowadays. Sometimes code is essentially "finished;" you'd be amazed at how rarely it has bugs. It should be changed only with care and for good reasons, not because the programming language changed to conform to some new fad.
(FWIW, I'd probably write most new numerical code in Julia rather than Fortran 20xx, and either call into existing Fortran via FFI, or drive it from the command line with some scripting language.)
> Unlike Python, Fortran hasn't seen an increase in usage and hasn't brought the joy(?) of programming to thousands of new programmers in the last 10 years.
I should add that in my mind this is akin to working around a design flaw instead of refactoring and modernizing and upgrading. The Python devs have provided numerous tools, one of which is the 2to3 tool for simple things.
People don't seem to be considering the possibility that a stagnant python 2.7 may actually be a reason to like that version of the language. I must admit it is nice to not have those oh-so-keen python developers messing with my favorite language.
However, I recently had a gig working in py3. Apart from screwing up every single print statement for a long time, it was entirely drama-free, and actually pretty great. There really is a difference between living and dead languages I think. 2.7 is the latin of python.
Oh, people are considering that possibility very much. But the notion of Python 2.8, with new features backported from 3.x, is kinda the opposite of that approach, and is very much "developers messing with my favorite language".
Author here. Imagine my surprise when I got back from a day of sightseeing (I'm on vacation in Spain) and saw that this had blown up. I had intended to "release" this project after New Years, after I'd gotten back and a week or two after 3.6 is released [1], and didn't expect this to get picked up since the project has been on Github for over a year (although inactive for much of that time) and since my blog usually doesn't get much traffic.
A lot of people here have strong opinions about the name "Python 2.8". I don't mind changing it, and intend to do so, (https://github.com/naftaliharris/python2.8/issues/47). I picked it initially since when talking with friends about this project it conveyed pretty darn immediately what the project is and does. I'd be very keen to hear people's suggestions for alternate names!
For those of you with 2.7 codebases or projects, I'd be extremely interested in hearing about whether you were able to get this interpreter to run your code. Personally, the biggest challenges I've had so far are with dependencies that check for `sys.version_info[:2] == (2, 7)` as opposed to something like `sys.version_info[0] < 3`. But I'd be very interested in other people's experiences, particularly with larger codebases.
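To show why the first style of check trips over an interpreter that reports itself as 2.8, here is a small sketch. The two helper names are invented for illustration, and plain tuples stand in for `sys.version_info`.

```python
def is_exactly_27(version_info):
    # Brittle: only true for 2.7.x, so a "2.8" interpreter is rejected.
    return tuple(version_info[:2]) == (2, 7)


def is_python2_line(version_info):
    # Looser: true for anything in the 2.x series.
    return version_info[0] < 3


hypothetical_28 = (2, 8, 0)  # stand-in for sys.version_info on the fork
real_27 = (2, 7, 13)
real_36 = (3, 6, 0)
```

A dependency using the first check would take its "not Python 2" branch on the fork, which is presumably where the breakage comes from.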
[1] A minor and somewhat pedantic point: The interpreter I've been working on includes PEP 515 (underscores in numeric literals), which is new in 3.6. I didn't think it was right for me to "take credit" for this new feature before it was even out in Python 3.6. Obviously, the real credit for this feature existing (in 3.6 or in any interpreter) goes to the CPython core devs, and especially Georg Brandl.
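For anyone who has not seen PEP 515 yet, a minimal example of what it allows (this needs Python 3.6 or later; the variable names are arbitrary):

```python
# Underscores are purely visual separators in numeric literals.
million = 1_000_000
mask = 0xFF_FF
ratio = 1_000.000_5

# int() accepts the same grouping in strings, and format() can produce it.
parsed = int("1_000")
formatted = format(1000000, "_d")
```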
Regardless of the detractors, I think this is brilliant work. All of the negativity around your project is over politics or subjective opinions based off of individual experience where they are lucky enough to work in environments on the bleeding edge.
Keep up the good work, this could help a lot of people!
Reading a thread from the last time something like that was proposed (although the only change in that proposal was that 2.8 would, on Windows, use the C runtime from a more recent MSVC compiler), I found a relevant argument (https://mail.python.org/pipermail/python-dev/2013-November/1...) against this "Python 2.8" idea:
What if more than one person did that? (paraphrasing Raymond Chen's "What if two programs did this?")
There can be only one official Python 2.8, but since it will never exist (see PEP 404), there can be many unofficial mutually incompatible "Python" 2.8 implementations. Therefore, calling it "Python 2.8" is a bad idea.
I see many people vigorously defending Py3 but I wonder how many of these have a paying-the-bills kind of job. Where you would look at the cost of porting a large project to Py3 and get an answer like half a million USD (easily). Do you go "of course we do that, that money is easily recouped with the added programmer productivity of Py3"? No chance.
So the question is do you want to basically light that money on fire, or just keep your perfectly fine Py2.7 code running and maintained another few years.
More discussions of the monetary value of programming languages please. What is "correct" or "right" isn't all that interesting to many, for good reasons.
Kudos to this project and hope it can set us on a saner migration path to Py3. (Should totally change the name though.)
I don't want the CPython core devs to do anything different and am not angry with them at all. It is their pet and they can do what they want. I fully agree with what you say in that sense.
But I do mind people saying in these discussions "you should all move to Py3 now or you are stupid/evil".
No. There are legitimate reasons for staying with Py2.7 and embracing it.
So I hope "Python 2.8" gets a cool name, perhaps even some funding from a company who wants to keep their Py2.7 code alive and invigorated, and the community part as friends.
Is that self-entitlement in any sense?
All I want is for people who I think have a huge blind spot to stop calling me ignorant for decisions I make about MY code. I totally don't expect core CPython devs to help me out though.
> I don't want the CPython core devs to do anything different
It's not just CPython, it's the Python specification of which CPython is the reference implementation. The specification couldn't move forward in significant ways without making some of the changes that came with Python 3.
> All I want is for people who I think have a huge blind spot to stop calling me ignorant
The level of vitriol in 2vs3 threads has been way too high from the start, because people always hate change. You are obviously free to do what you want, you always were. Don't mind the haters, but please don't be one either.
> So I hope "Python 2.8" gets a cool name, perhaps even some funding from a company who wants to keep their Py2.7 code alive and invigorated, and the community part as friends.
It has already been explained elsewhere in this discussion by other people, I would strongly advise against that. If you for whatever reason have to stay on 2.7, then make sure your new code is 2.7 only (and best if it works with 3 without changes if you decide to change your mind later).
Consider what's more likely in the far future (after the PSF gives up on 2.7 support in 2020): that somebody will support 2.7 as it is, or that this guy will support his Python 2.8 hybrid?
Also consider what happens (in the far future) when some library you use will drop Python 2 support. It's not likely it will be easy to run on this Python 2.8 hybrid, either.
And if you for any reason must use Python 3 features in your code base, just bite the bullet and port it.
What I was hoping for was a sane migration path to prevent the split you talk about in 2020...somebody made a superset of both py2 and py3 that lets one move gradually. If support ends/project dies one would bite the bullet and move all the way to py3 I guess.
My py2 code uses unicode properly which may color my view a bit...
Clearly, "sane" is up for argument. My understanding is that 2.6 was the last version to get major features. Python 2.7's goal was to be a bridge between 2x and 3x[1]--it was that superset where code can run in both. It back ported many of the popular Python3 features (at the time). But that was 6 years ago and Python 3 has had new features since then.
(Looking at this from the perspective of the "Python Community" or someone whose goal is to adopt Python3) His focus is to back port newer Python 3 features developed since then. Does this help people move to Python 3?
Good on him for digging into cpython. While there's __future__ and the backports module, he seems to have focused on features that aren't just new libraries (which is cool). A few years ago I was trying to backport Python 3's Namespace Packages for my company since our internal import tools effectively do the same thing (except ours had bugs).
It's too bad the python 3.x fans can't see this as feedback about how difficult it is for users of 2.x to upgrade to the latest and greatest. Many 2.x folks have sprawling code bases and complex operational needs. The 3.x advocates seem to consistently ignore that.
Shoot the 2.8 messenger all you want for choosing to call it Python 2.8, but don't dismiss the issue that drives thoughtful people to get value out of this strategy.
The python maintainers and the 3.x fans do see this, and know this, and decided to do it. They knew there would be a cost in community, and decided it was worth it. They haven't been oblivious, and they've made several large concessions in 3.4 and 3.5 to increase the ease of migrating. Stop denigrating them.
Christian Tismer tried this a couple of years ago [1].
I guess the intention was a bit different: he wanted to have stackless features in python. It's not clear to me the reason he decided to back down, whether because of licensing issues or just because the other python developers didn't like it.
What another load of crap. Call it something else but this isn't Python. I'd never use this because it's not official. Who knows if or how long it'd be supported for or if any backdoors would/could be introduced.
I've scheduled time this year for my team's project to update to Python 3. It's expensive in the short term but in the long term we get continued support and new features, which is a huge win.
> This should have been the approach to modernising python all along.
The core driver for the Python 3 break was the fix to the text model; this is what allowed literally everything else, as it completely broke existing code.
And I, for one, think it's one of the most important improvements of Python 3. The text model of Python 2 is a giant mess and makes it very hard to correctly deal with non-ascii text for any non-trivial software, especially in large teams where not everybody will carefully evaluate the text-ness of their code.
> the text model of Python 2 is a giant mess and makes it very hard to correctly deal with non-ascii text for any non-trivial software
There are counter-arguments to this. Armin Ronacher, author of (among other software) the excellent Flask web framework, thinks that Python 2's system of codecs and byte streams is better in practice [1][2]. Reasons include: You can do byte -> byte conversions with codecs that are no longer possible. You can better handle text encodings besides UTF-8 (and here he describes several embarrassing failures of Python 3 to handle OS paths correctly). You can write single APIs that handle byte streams like gzip and text encodings like UTF-8.
> Armin Ronacher, author of (among other software) the excellent Flask web framework, thinks that Python 2's system of codecs and byte streams is better in practice.
Armin Ronacher works in a very specific context of having to deal with byte/text interfaces in pretty much all his projects, and while I can see where he comes from I work at a different level and at the level at which I work the P2 model is a giant pain in the ass.
Armin is no foe of Python 3. And as noted in the essay, Python 3 has undergone several improvements or feature reintroductions, e.g. PEP 461 reintroduced C-style formatting to bytestrings, making generating binary data (especially ascii-based formats) significantly more convenient than it was between 3.0 and 3.4.
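For anyone who hasn't used it, here is a minimal sketch of what PEP 461 gives you on Python 3.5 and later. The helper name `build_header` is made up for illustration; only the `%` operator on bytes comes from the PEP.

```python
# PEP 461 (Python 3.5+): %-formatting works directly on bytes objects.
def build_header(name: bytes, length: int) -> bytes:
    # %b inserts a bytes value as-is, %d renders an int as ASCII digits.
    return b"%b: %d\r\n" % (name, length)


header = build_header(b"Content-Length", 42)
print(header)
```

Between 3.0 and 3.4 the same line raises a TypeError, so you had to format a str and encode it, or concatenate pieces by hand.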
Also note that Armin has repeatedly praised Rust's text model, which is much more similar to P3's than P2's (except with static types and no messy legacy).
> and here he describes several embarrassing failures of Python 3 to handle OS paths correctly
And (fucking surprise) the issue with that is the text model of FS paths is an embarrassing pile of garbage, Python 2 is convenient because it doesn't try to touch that mess at all and just hands the flaming bag of shit to whoever comes next.
> Also note that Armin has repeatedly praised Rust's text model, which is much more similar to P3's than P2's (except with static types and no messy legacy).
That is incorrect. Rust's text model has (almost) free (and copyless) transmutes from bytes to strings. Python does not. The text model of rust is much closer to Python 2 than 3 in many ways.
> That is incorrect. […] The text model of rust is much closer to Python 2 than 3 in many ways.
Rust's text model strictly separates proper strings and bytestrings, defaults to proper strings and requires that strings be properly formed (so much so that it has additional completely separated platform-dependent types for dealing with OS-originated "stuff").
The one "difference" (which is more in the realm of implementation detail than language text model) is that Rust leverages its ownership system to make UTF8 "encoding" and "decoding" free (literally for the former, essentially for the latter). The encoding and decoding are still there and explicit operations though.
> Rust's text model has (almost) free (and copyless) transmutes from bytes to strings.
Only for the specific case of input bytes already in the language's internal encoding (which granted will be common as most inputs would be ascii or utf-8) and with the same ownership constraints as the input, and that's mostly enabled by Rust's ownership model.
> Python does not.
Python doesn't generally do no-alloc/0-copy operations so that's not overly surprising.
> Only for the specific case of input bytes already in the language's internal encoding (which granted will be common as most inputs would be ascii or utf-8) and with the same ownership constraints as the input, and that's mostly enabled by Rust's ownership model.
Except of course on operating systems where text I/O is done entirely in UTF-16. Say, Windows.
Since Python strings have no fixed encoding, but choose "the most efficient one" (heuristically) when decoding, they can cope better than a fixed UTF-8 encoding in these cases.
>> Python does not.
> Python doesn't generally do no-alloc/0-copy operations so that's not overly surprising.
Indeed. Even when the encoding is not changed, the string will always be copied. One could think of an API that does that, though, to optimize all those cases where memory is already owned by a shim in the runtime.
> Since Python strings have no fixed encoding, but choose "the most efficient one" (heuristically) when decoding, they can cope better than a fixed UTF-8 encoding in these cases.
That is wrong. Python can never pick the most efficient encoding unless you decode from latin1.
Rust having strings that are utf-8 is a guarantee and as such allows you to do very efficient operations on them. Python gives you a vague guarantee that it gives you O(1) access to something like a glyph.
These are very different and incompatible text models.
At no point is Python's text model fast or overly useful.
Armin has backed off of this stance since then. And for good reason.
As someone who works with Python text processing extensively, I can tell you that the Python 2.7 text model is broken and dangerous, due to the silent bytes-unicode coercion and misguided use of ascii instead of UTF-8 as the default text encoding. Many people don't realize this and will argue that it's not broken, because they have never fed non-ascii text through their app to watch it blow up! And once they realize that they have a problem, they then have to deal with a rat's nest of silent bytes-unicode coercions happening implicitly all over their app, sometimes impossible to deal with due to library code outside their control.
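To make the "ticking timebomb" concrete, here is a small sketch run on Python 3, with the Python 2 behaviour described in comments. The function name `join_name` is hypothetical.

```python
def join_name(prefix, name):
    # Stands in for any library function that concatenates its arguments.
    return prefix + name


# Python 2 would silently decode the byte string as ASCII here and return
# unicode. That works until the bytes contain a non-ASCII character, at
# which point it raises UnicodeDecodeError far from the real mistake.
# Python 3 refuses to mix the two types at all, on every input.
try:
    join_name(b"user: ", "Zo\u00eb")
    outcome = "coerced"
except TypeError:
    outcome = "TypeError"

# The fix is the same on both versions: decode explicitly at the boundary.
explicit = join_name(b"user: ".decode("ascii"), "Zo\u00eb")
print(outcome, explicit)
```

The point is that Python 3 fails on the first call with any input, while Python 2 only fails once real-world data shows up.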
There is a good discussion to be had on whether a language should prioritize bytes or unicode strings as the main data type, but there is no excuse for the "ticking timebomb" string data type design that pre-3 Python has with strings and the default encoding.
For this reason alone I'm very happy that 2.7 is starting to lose its grip. Its continued support is a problem, and I have no love for people who are trying to hold on to it.
There are many other features in 3 that I can no longer live without - most of them now available through backports modules - but types and asyncio can't be easily backported either, and people are starting to use them extensively.
Yeah, but if you are dealing only with a subset of the English Language in the U.S., and your API endpoint that you are scraping wants to serve to all peoples in all locales in all situations, you are fucked if you want to use Python3 and its csv module.
You genuinely are better off using Python 2.7.x and its naive approach to text.
I don't understand what you mean by "your API endpoint that you are scraping wants to serve to all peoples in all locales in all situations".
That would mean to me that the API endpoint could be sending me Unicode, in which case Python 3's Unicode-aware CSV is going to work great, and Python 2's csv is fucked. The limitations of Python 2's csv module were one of the key points that moved my company to Python 3.
On Python 3, if you want to be naive about text (not sure why you're celebrating only working in a subset of English, but you have this option), you could open the file as Latin-1 and get the same results as Python 2.
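A rough sketch of the Latin-1 trick, assuming the input arrives as bytes (the sample data is invented). Latin-1 maps every byte value to a code point, so decoding can never fail and the original bytes can always be recovered.

```python
import csv
import io

# Invented sample input: UTF-8 bytes, but we pretend not to know that.
raw = "name,city\nZo\u00eb,Krak\u00f3w\n".encode("utf-8")

# Decode as Latin-1: never raises, one code point per input byte.
naive = io.TextIOWrapper(io.BytesIO(raw), encoding="latin-1", newline="")
rows = list(csv.reader(naive))

# The fields are mojibake as text, but encoding back gives the exact bytes,
# which is the same "don't look at it" behaviour Python 2 had.
restored = rows[1][0].encode("latin-1")
print(rows[0], restored)
```

This only stays correct while the delimiters and quote characters are plain ASCII, which is the same limitation the Python 2 module had.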
Many CSVs are made with Excel. Excel's only form of Unicode CSV is tab-separated UTF-16. Python 2's csv can't parse those at all, can it?
> Python 2's csv can't parse those at all, can it?
Nope, not without re-encoding to UTF-8 before parsing (learned that the hard way and found out it's easier to just take excel files as input).
P2's CSV module works byte-based, and basically only handles ASCII-compatible supersets, assuming your special characters (quote chars, field and record separators) are straight ASCII.
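For comparison, a sketch of reading that kind of export on Python 3. The sample bytes are invented to look like a tab-separated UTF-16 file with a BOM; `excel-tab` is the dialect that ships with the csv module.

```python
import csv
import io

# Invented stand-in for an Excel "Unicode text" export: UTF-16 with a BOM,
# tab-separated, CRLF line endings.
excel_bytes = "name\tcity\r\nZo\u00eb\tKrak\u00f3w\r\n".encode("utf-16")

# The decoding happens in the text layer; csv only ever sees str.
text = io.TextIOWrapper(io.BytesIO(excel_bytes), encoding="utf-16", newline="")
rows = list(csv.reader(text, dialect="excel-tab"))
print(rows)
```

With a real file you would pass the same `encoding` and `newline=""` arguments to `open()` instead of wrapping a `BytesIO`.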
I don't think it's honest to post this without context, and without mentioning that in the five years that passed most of these things were remedied, and indeed, some things were already remedied at the time of his writing.
Some points Armin makes are valid and remain valid for Linux-ish systems, but have been shown and refuted countless times for other operating systems; Python is not a Linux-only show. I won't re-iterate all that here.
You should be aware Armin now has a more-or-less followup post telling people not to do what you just did (i.e., reference his 2011 post as an authoritative "Python 3 is bad" explanation, because both Python 3 and his own opinions have evolved since he wrote that post).
This approach could not have worked for modernizing python. The whole point of the Python 3 thing was to be able to remove warts in the language that could not have been fixed without breaking backwards compatibility. One core part of this is unicode support -- Python had a horrible story for international text before this.
The fact that there are some parts of "modern" Python which could have been implemented in Python 2.7 backwards-compatibly is irrelevant.
Python 3 is not the language designers worrying about minor subjective issues in python like the print keyword or the design of iterators and deciding that they want to change it all. It is the language designers worrying about major issues like international text, realizing that they regrettably will have to break backwards compatibility to fix those, and then just taking the opportunity to revamp things like printing and iteration since they're breaking backcompat in some pretty major ways anyway.
> Python had a horrible story for international text before this.
No, Python had a horrible story for text, and people who worked in limited/sheltered domains didn't realize it. I personally lost all kinds of valuable hours of my life fighting with Python 2's "pretend everything is ASCII until it isn't, then fall over dead" model, because I -- a US citizen, working at US companies, and for quite a while dealing only with English-language content -- still ran into non-ASCII characters with regularity.
And here I'm being charitable; I simply refuse to believe that the overwhelming majority of people who used Python 2 never once had to deal with someone copy/pasting text out of Word or another program that used "smart quotes".
It's super-sad that "Unicode" is a prominent stated motivation for Python 3.
Unicode in Python 2 was fundamentally broken in that whether it had UTF-16 semantics or UTF-32 semantics depended on how the interpreter was compiled. That's a terrible, terrible idea. However, they could have fixed it by sticking to one option: UTF-16 (which provided compatibility with some interesting things that Python interoperated with like Cocoa and, via Jython, Java).
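A quick way to see the two semantics side by side, as a sketch for Python 3.3 or later, where `sys.maxunicode` no longer depends on the build:

```python
import sys

face = "\U0001F600"  # one code point outside the Basic Multilingual Plane

# Python 3.3+ always has code point semantics: one element per code point.
code_points = len(face)

# Under UTF-16 semantics (a "narrow" Python 2 build, Java, JavaScript) the
# same character is a surrogate pair, so its length would be reported as 2.
utf16_units = len(face.encode("utf-16-le")) // 2

print(sys.maxunicode == 0x10FFFF, code_points, utf16_units)
```

On a narrow Python 2 build `sys.maxunicode` was 65535 and `len()` of the equivalent unicode literal returned 2, which is the build dependence being complained about.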
UTF-16 is a sad legacy mistake, but APIs providing Unicode operations of any kind can be built on top. So UTF-16 is a mistake to begin with, but it's not a blocker for supporting all of Unicode and features targeted at the needs of all writing systems and languages. Java, Windows, the Web Platform (including JS) show that proper i18n can be built on top of the bad but backward-compatible 16-bit code unit foundation.
Now, the _even_ sadder part of Python 3 is that if you decide that UTF-16 is a mistake and want to fix it, UTF-32 is the naive and wrong solution. When a Unicode newbie is told about surrogates, they think that UTF-32 is the answer. But then they waste memory and cache line space (and, if dynamically omitting leading zeros on a per-string basis, the compute and copy cost of promoting to different unit width when adding one emoji). And once the damage is done, someone points out that grapheme clusters are a thing, so they still didn't get O(1) indexing to user-perceived units.
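The per-string width can be observed directly. This is a sketch that assumes CPython 3.3 or later, since `sys.getsizeof` reports implementation details and other interpreters may store strings differently. The helper name `bytes_per_char` is made up.

```python
import sys


def bytes_per_char(ch: str) -> float:
    # Compare two lengths of the same character so the fixed object header
    # cancels out and only the per-character storage cost remains.
    small = sys.getsizeof(ch * 1000)
    large = sys.getsizeof(ch * 2000)
    return (large - small) / 1000


ascii_width = bytes_per_char("a")            # Latin-1 range
bmp_width = bytes_per_char("\u4e2d")         # Basic Multilingual Plane
astral_width = bytes_per_char("\U0001F600")  # emoji, outside the BMP

# Appending one emoji to ASCII text forces the whole string to widen.
plain = sys.getsizeof("a" * 1000)
with_emoji = sys.getsizeof("a" * 1000 + "\U0001F600")

# And code points are still not user-perceived characters:
# "e" followed by a combining acute accent renders as one glyph.
combining_len = len("e\u0301")

print(ascii_width, bmp_width, astral_width, plain, with_emoji, combining_len)
```

So O(1) indexing lands on code points, not grapheme clusters, which is the objection raised above.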
The enlightened thing, of course, is to do what Rust does: use UTF-8 and use iterators on top for accessing pieces larger than a code unit (code point, grapheme cluster). (To my taste, Swift strings are too magic and DWIM-y. At least back when I read the Swift book, it didn't even explain the underlying representation. With Rust, the representation is very explicitly known.)
"UTF-16 sucks" is what Python 3 got right. That UTF-32 (with dynamic leading zero omission on a per-string basis) is the answer is what Python 3 got very, very wrong. The correct answers are either UTF-8 (for a new language like Rust) or holding the nose and making stuff work on top of UTF-16 (Java, JavaScript) without breaking old programs.
To be clear, I don't agree with py3's Unicode model. I think it sucks for the same reasons you do.
I also think that default Unicode is a major improvement over py2 and is enough to justify breaking the language because modern languages should at least have that.
Python 3 fixes no fundamental issues with python 2 and introduced far more warts than it removed. GIL is still there, crummy runtime is still there and unicode is now an even greater mess. I really wonder how many people who bang on about unicode actually have a good grasp of unicode and text processing because python3's unicode design is obviously terrible. I can now access or count code points in O(1) (neither of which is in any way useful) at the cost of tremendously increasing space and time overhead for any basic text operation on non ascii text and having some bizarre hacks to deal with the pretense that stdin, stdout and sys.argv are always text.
It most certainly fixes support for Unicode on Windows in terms of filesystems paths, OS function boundaries and the console. Some of these fixes have even taken until 3.6 to get implemented.
As someone who writes cross-platform code, Python 3 was a breath of fresh air after fumbling around in the dark with Python 2.
I can definitely believe that – but windows has basically lost[1] and has a much worse text model to boot. Like Java and unlike python 3, they at least have the excellent excuse that this was not obvious at the time. And under unix the impedance mismatch has definitely increased. Not a good trade.
[1] I wouldn't count them out, but they're definitely on the back foot as bash inclusion shows.
> I can now access or count code points in O(1) (neither of which is in any way useful)
Oh, yeah, I agree that Python's unicode model isn't great. I like Ruby's, and Swift has its own cool thing going where it's very explicit about the uselessness of code points.
However, I think that Python 3 having some form of default unicode support is way better than what Python 2 had. It could be improved (backwards-compatibly too!), but it passes my minimum bar for a "modern" language's text story.
In all seriousness, I'd much rather python3 had kept str, phased out the unicode type altogether, got rid of all the harebrained locale crap (sys.{get,set}defaultencoding etc) and just provided tooling (collation, regexp, denormalization etc.) for working with utf-8 encoded byte-'str's. This would probably have been a much smoother transition and ended up with a vastly superior result.
I'm pretty sure the people complaining here about how python2 str only supports ascii and they couldn't paste their smartquotes were bitten either by windows or unnecessarily bad unicode/str interactions due to python not just hardcoding utf-8 auto-conversion. That is the only sane thing to do (Your locale isn't *.UTF-8? Well sucks to be you. By now even the Japanese and Chinese seem to slowly have come around to the utf-8 bandwagon, and they had better reasons than most).
I might be wrong, but I can see basically 3 non-idiotic ways to do text in a programming language:
1. arrays of utf-8 bytes (Rust, Go). Python was close to that already and then messed it up. Indexing indexes into bytes O(1).
Upsides:
- efficient: most text you're going to get is already utf-8 and the rest should be converted on ingress/egress; html/css/most code will be represented fairly efficiently even if the body text is mostly say, Chinese; you can do a lot of text processing by just working on the ascii range (e.g. CSV parsing).
- sane: no BOM, no 32 bit encoding of 21 bit quantities etc; unix-compatible
Downsides:
- can't efficiently access individual logical characters or know the fixed-font width of the text
- normalization is kinda nasty (concatenation etc.), in practice people just tend to ignore that
- hard to constrain to only valid utf-8 without significant downsides
- maybe not that beginner friendly
2. use some non-array type that doesn't allow for indexing (e.g. ropes), probably using (mostly) utf-8 for internal encoding.
3. arrays of logical characters. That means you need to make up fake characters to handle graphemes that are not directly representable as a single pre-composed code point in unicode. The upside is that this has beginner friendly semantics in a sense and allows indexing on what's meaningful in the domain (graphemes). The downside is that I can't see how to do this without a lot of complexity and some nasty gotchas. This seems to be what perl6 does https://design.perl6.org/S15.html#NFG
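Option 1 can be approximated in Python 3 with the bytes type, which makes a handy sketch of both the upside and the downsides listed for it. The sample row is invented.

```python
import unicodedata

# Invented sample row, held as raw UTF-8 bytes rather than str.
row = "\u540d\u524d,\u6771\u4eac,caf\u00e9".encode("utf-8")

# Upside: multi-byte UTF-8 sequences never contain ASCII byte values, so
# splitting on an ASCII delimiter is safe without decoding anything.
fields = [f.decode("utf-8") for f in row.split(b",")]

# Downside: O(1) indexing is by byte, so a slice can cut a character in half.
try:
    row[:1].decode("utf-8")
    slice_is_valid = True
except UnicodeDecodeError:
    slice_is_valid = False

# Downside: the same visible text can have different byte sequences
# depending on normalization form.
nfc = "caf\u00e9"
nfd = unicodedata.normalize("NFD", nfc)
nfc_len = len(nfc.encode("utf-8"))
nfd_len = len(nfd.encode("utf-8"))

print(fields, slice_is_valid, nfc_len, nfd_len)
```

Rust and Go differ in how they police validity, so treat this only as an illustration of the trade-offs, not of either language's actual string type.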
as mentioned in a comment above, the choice to take 2.7 behaviour when 3 behaves differently means this wannabe-python '2.8' is neither backward nor forward compatible
So when you see a project written in Python 2.8, you'll know that it contains lots of legacy code under active development by developers who are gung-ho about fancy language features, but that this team hasn't had the time/resources/gumption to do a code overhaul anytime in the past six years.
And also that you yourself, and any users of your own project, will need to run it with this guy's own homebrewed version of Python in order to know it will behave correctly.
I don't think this is a good idea simply because it reduces the need to upgrade your library from 2.x to 3. It's a clear cut and a good chance to weed out unmaintained libraries
That would mean someone develops a library that uses features of Python 3, but needs some horrible hacking to port those features to Python 2. Thanks, but no - please make a cut and decide or split it in two separate packages.
> That would mean someone develops a library that uses features of Python 3, but needs some horrible hacking to port those features to Python 2.
The "horrible hacking" already exists bundled into libraries and tools like Six and Future.
> split it in two separate packages.
That was tried, and failed every time. Because now you end up with two diverging and hard to reconcile code bases. A single-source cross-version library, while not trivial (and not allowing the use of more advanced P3 features) works way better.
> The "horrible hacking" already exists bundled into libraries and tools like Six and Future.
Only for the simplest cases. How is Six going to help you write a regex that matches emoji, for example?
On Python 3 you write a range that includes the emoji you want to match, and you're done. On Python 2, that regex may or may not compile, depending on what sys.maxunicode is. If sys.maxunicode is 65535 you have to fall back on a different complicated regex that has a bunch of cases to find emoji in their UTF-16 representation. If you want to avoid that system-specific behavior, you can I guess encode it to UTF-8 bytes and write an extremely complicated regex that finds emoji in UTF-8.
This is my prime example about how something that's easy on Python 3 can require horrible hacking on Python 2. A wrapper library doesn't fix that problem.
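A sketch of the Python 3 side of that comparison. The range below covers only the Unicode "Emoticons" block (U+1F600 to U+1F64F), picked as an example; a real emoji matcher would need more ranges.

```python
import re

# One character class with a range above U+FFFF. On Python 3.3+ this
# compiles the same way on every platform and build.
EMOTICONS = re.compile("[\U0001F600-\U0001F64F]")

text = "build passed \U0001F600, deploy failed \U0001F62D"
found = EMOTICONS.findall(text)
print(found)
```

On a narrow Python 2 build each end of that range is a surrogate pair, so the character class means something entirely different, which is why no compatibility wrapper can paper over it.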
Python 2 and 3 have different semantics, and the only way a correct, automated translation between them would be possible would be to emulate one inside the other.
Slightly tangential, but: I've noticed an interesting parallel between the 2.7 / 3.x partisan divide, and the US political partisan divide. In both cases, as partisan passions have increased without relief, there's both A) an increasing unwillingness to agree on basic facts about reality essential to the debate, and B) increasing presumptions of bad faith on the part of their opponents.
Examples of A in this discussion:
- Disagreements about the degree of library support for 3.x
- '' the ease/value of porting from 2 to 3
- '' the rate of industry adoption of 3 for new projects
- '' the degree to which people are driven away from Python entirely because of the version situation
Examples of B in this discussion:
- Claims of paternalism on the part of GVR/ the PSF in pushing 3
- Claims of unreasonable/emotional attachment to 2 by partisan devs
- Claims of willful distortion of facts by both sides (see A)
- Claims that the writing is on the wall for 2, because usage of 3 is supposedly accelerating
- Claims that the writing is on the wall for 3, because it's supposedly taken too long to drive not enough adoption
It seems outrageous to suggest that the differences between 2 and 3 are anywhere near as significant as the differences between, say political conservatism and liberalism, and yet the level of partisanship seems nearly the same. How did it come to be like this?
I like how your comment stands out as being such a rational, non-aggressive observation in all this. I wish I had the answer for you, but I can't figure it out either. shrug Maybe some people are using this as an opportunity to vent the frustrations they've had with using the language (despite it being so loved, and ranked 3rd in most-used according to IEEE), and since there are two versions of the language instead of one, they can more readily create an object doomed for the epitome of their hatred, the sacrificial lamb going to the slaughter (sounds like something from a bizarre school of thought in psychology). You'd think it'd just be easier to let people do what works best for them. I'm sure many expect to take collateral damage from the differences in versions (imagine being a Python 3 fan and having to start working for a company that exclusively uses Python 2 or vice versa), but these people also readily accept working with languages with radically different designs and welcome in copy-cats (like Clojure is to other LISP-like languages). The important thing is that the language does what you need it to do. The name on the download link shouldn't make that much of a difference. The world has adapted to accepting Python 2 and Python 3 regardless of the bickering. There's room for "Python 2.8", even if it is poorly-named. I imagine there are probably dozens of Python 2.8s in existence - they just haven't been noticed by people here on HN. Besides, in about 3 months, we probably won't be hearing more about Python 2.8 anyways unless someone else decides to use it.
Thanks for doing this! I took a look at the arguments underlying the proposed switch to Python 3 and came to the conclusion that they did not meet the necessary threshold of effort to switch. Instead, I want to see every feature in Python 3 backported into 2.7. I don't think the unicode change can be accommodated, but 99% of the rest of the new features in 3 can.
It's not dead, it's just really slow. The 2.7 release has been very long in the making, released only 18 months ago. There are stabilization releases to it still coming out. There's a tiny, but dedicated group building it and it shows. Jython 3 is possible if you're willing to put your money where your mouth is.
Yes, that would be true on all counts :) as one of those Jython maintainers. Jython 2.7.1 is currently bottlenecked on finalizing support on Windows for pip/setuptools. It should be the last fix.
The discussion is a nice overview how py3k crowd just steamrolls over any discussion that there are indeed inherent difficulties with forcing lossy conversion of imperfect outside input.
You are only told "we know python.org uses utf-8 so just decode it as utf-8." No further discussion, no pointers are provided on how to correctly fetch a URL with text content into a string. Even a small convenience function that at least tries to look at the Content-Type: header would help here!
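Something like the convenience function asked for can be sketched with the standard library alone. The name `decode_body` and the UTF-8 fallback are my own assumptions, and it deliberately ignores other hints such as a meta tag or a BOM.

```python
from email.message import Message


def decode_body(body: bytes, content_type: str, fallback: str = "utf-8") -> str:
    # Reuse the stdlib header parser to pull out the charset parameter.
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or fallback
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # The server named an encoding Python does not know about.
        return body.decode(fallback, errors="replace")


latin2 = decode_body("za\u017c\u00f3\u0142\u0107".encode("iso-8859-2"),
                     "text/html; charset=ISO-8859-2")
default = decode_body(b"caf\xc3\xa9", "text/html")
print(latin2, default)
```

With `urllib.request`, the response's `headers` object is itself an `email.message.Message` subclass, so `response.headers.get_content_charset()` gives the same value without rebuilding the header.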
I am well aware that "in py2k it just worked" was mostly an illusion. But honestly, is the situation above an improvement?
Nobody should believe me, but I kind of expected something like this would happen. People just want to keep using 2.7, and let the language evolve slowly over time, as is happening with PHP, for example. So if 2.7 development was officially stopped, someone will continue with it unofficially.
I think the problem for me is that they did these changes and made a big song and dance about Python 3. If they had called Python 3.0 v2.9 instead everyone would have been clear they had to migrate.
PHP has been successfully deprecating features for decades now and cleaned up their code base; they have never had a schism in the way Python has so you can argue all you want about legacy, or people preferring 2.7. The reasons this happened (and is still happening) are human, not technical.
> If they had called Python 3.0 v2.9 instead everyone would have been clear they had to migrate. [...] The reasons this happened (and is still happening) are human, not technical.
You know, as a strong proponent of Python 3.* (it's a significantly better Python), I started to write a withering critique of your statements here (and I do strongly disagree about naming it Python v2.9). But then I started actually looking at the evidence.
For example, when you go to download Python at the python.org website, you're still presented with a choice between Python 2 and Python 3 side-by-side, looking for all purposes like equivalent choices (Python 2 should be much less prominent and toward the bottom and/or the download button should be much smaller than the Python 3 download button)[0].
And then, when you follow the link for "Wondering which version to use?", you get a full page worth of hemming and hawing, and cost-benefit analyses and so on[1]. In fact, they should be saying something like:
"Always use Python 3 unless you have a strong reason to do otherwise.
This typically only applies to people who have large legacy code
bases and are for some reason unable to upgrade, or else need a special
legacy library only available in Python 2 (most libraries are
available in Python 3)."
So wow, having read the official python.org statements on the Python 2 vs 3 issue, I'm pretty appalled. No wonder some people who are relatively new to Python are confused or surprised. I'm on Debian or Ubuntu mostly and so download Python via package manager or PPA, or build from source. So I had no idea.
It's also been a huge mistake for GVR to have allowed so many new and attractive features to be backported to Python 2.7. It's taken away a significant amount of incentive for people to make the move. Python 2.* should have been bugfixes only for a long, long time now.
That said, if you're in charge of significant amount of Python code for a company and you've not seen for years that Python 2 is a deadend, you've either been engaging in wishful thinking or oblivious to what's happening in the Python community.
> It's also been a huge mistake for GVR to have allowed so many new and attractive features to be backported to Python 2.7. It's taken away a significant amount of incentive for people to make the move.
The idea that Python 2 needs to be sabotaged and the community forbidden from improving it so that people make the move, with the people who are stuck with Python 2 codebases held as hostages, should be an indicator that something was done very, very wrong...
Personally, I'm neutral as to 2 vs. 3, I use both, but the schism is the main drawback of Python for me and it often drives me to just use other languages. Python 3 is great and has very neat improvements, but it should have just been called a different name so that both branches could evolve freely and compete on their own merits, rather than on the PSF mandating to use one over the other.
That's a really good point about the download links. The Python 2 link should be lower down on the page, or perhaps smaller, and maybe labeled "legacy support version" or similar.
To be fair to the "Should I use Python 2 or Python 3 for my development activity?" page, it starts with:
> Short version: Python 2.x is legacy, Python 3.x is the present and future of the language
They could deprecate the breaking changes - for example Python 2.8 could make print statements needlessly more awkward to write but warn instead of error and Python 2.9/3.0 could make the changes permanent.
Rather than have this huge break, maybe just take the time to move all of Python forward, rather than leave us still talking about it 8 years after Python 3.
> They could deprecate the breaking changes - for example Python 2.8 could make print statements needlessly more awkward to write but warn instead of error
Interesting. Now I want to make my own Python 2.8 which is 2.7.latest with the "-3" command line argument hardcoded to always be set.
The effort to move from PHP v4 to v5 is comparable to the Python v2 to v3 move now. A lot of projects did not move to PHP5 for a long time. Lots of libraries had to be rewritten. The change brought a lot of subtle defects because programmers had unwittingly or knowingly relied on the by-value assignment of objects in PHP4.
The PHP Project did manage to allow people to write code that worked on both 4 and 5. At the same time, the project failed to get Unicode in. It's unclear how that will be possible without breaking a lot of programs. If PHP ever switches string handling like Python3 did now, it will be just as painful. Well, that's my prediction anyway, maybe they will figure something out. Just realize that their first try (PHP6) didn't fly at all.
> I've been working on Python 2.8 (not an official Python release) because I want to give all the people who use Python 2 access to the Python 3 language features, which I think are actually pretty cool.
There's a module called __future__ for exactly this purpose
I hope I don't come across as too obtuse, but it might still make economical sense for the PSF to backpedal on their decision: even if Python3 is a good solution for the future, it's caused years of arguments and still hasn't been adopted by the majority[1].
Ridiculous as it might sound, that implies it would actually take less work to devise a path for Python2 that's backwards compatible but still has a future, and invest the time to backport all 3-only code, than it would to proceed with killing off 2 and porting everything over. It's also nothing more than everyone whose code is 2-only is being asked to do sometime in the future (they chose "the wrong competing standard" and have to pay the cost). Either way someone has to foot the bill for a unified Python but couldn't it just be that the early adopters of 3 could pay that cost if it saves updating (say) twice as much legacy code in favour of backporting the 3-only code?
Of course, the future version of Python should be the best form of the language with the right features, and that's why it was decided to kill off 2, but if after this many years the initiative hasn't completely succeeded, there's always the option to reverse course.
Heck, in the time until Python 2 is officially gone, maybe this new fork will evolve into a better language than 3 and still be backwards-compatible with 2!
For now I'm sticking with 2.7 for as long as I can but will just accept it when the time comes.
Now you're just adding on more technical debt for users of Python 2. Who's going to maintain your Python 2.8? What happens when there are conflicts between your Python 2.8 and Python 2.7?
this is incredible - give me unicode support in python 2. asyncio ? probably .. i'll still be happy with gevent. This is a clear path for a python upgrade.
Nope. Can't be done without getting Python 3 either way, because Python 3's text model is not compatible with Python 2's. That is why the core team allowed the other breaking changes, because software was going to be broken in the first place.
> Changing to `unicode_literals` will likely introduce regressions on Python 2 that require an initial investment of time to find and fix. The APIs may be changed in subtle ways that are not immediately obvious.
Unless you're willing to put in the time and energy to extensively test, this is basically a recipe for disaster. You're basically paying at least 80% of the cost of a full Py2 -> Py3 port (since looking for regressions is a huge part of that cost), for only a fraction of the benefit.
Being able to change things in one part of the code base at a time -- or one feature/data field at a time -- is a HUGE thing. For production systems you just cannot put the backlog on pause for a month or two while porting. You can, however, incrementally fix things over the course of some years.
That's probably true, but unicode_literals isn't the right tool for an incremental port, because it obeys neither good py2 nor good py3 text handling conventions.
It would become a substantial detour and end up being a bit more total work.
Also, porting can be done in parallel to normal dev. You aim your port at a specific release while you continue fixing bugs. When the port is done, you port over all the patches. Repeat until port and original version converge.
Isn't it as simple as a) do proper string handling in py2 unicode and b) if you write more u"" in a file than "", flip the switch and write "" and b"" instead?
Not quite, because the py2 library is often not compatible with unicode literals, and also because if you flip the switch in a module it changes the API of any functions that returned strings. That might break calling code from other modules.
So then you might have to add code to work around these issues, code that is needed in neither pure py2 or pure py3 -- hence making it a weird detour.
In general, it is more compelling to use `unicode_literals` when back-porting new or existing Python 3 code to Python 2/3 than when porting existing Python 2 code to 2/3 [1]
Right.. our codebase is already unicode all over the place because otherwise we could not i18n properly. So we basically have py2 unicode-correct code with encode/decode in the proper places for interfacing with stdlib. I didn't consider people might use str and not unicode for text..
As the page notes, many Py2 APIs are completely broken with unicode literals, some noisily and others silently (due to implicit ascii encoding/decoding) leading to hard to diagnose regressions in the codebase under Python 2.
I think leveraging PEP414 and carefully using bytes, unicode or "native" literals is a much more resilient mode of operation.
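A rough sketch of what that looks like in practice, assuming the code targets 2.7 and 3.3+ (the first Python 3 release where PEP 414 restored the `u""` prefix). The variable names and `describe` helper are made up for illustration.

```python
# Three kinds of literal, spelled so the same source runs on 2.7 and 3.3+.
payload = b"\x00\x01binary"    # bytes on both versions
label = u"na\u00efve"          # text (unicode) on both versions, thanks to PEP 414
header_name = "Content-Type"   # "native": byte str on 2.x, text str on 3.x

def describe(value):
    """Return the type name so the difference is visible at runtime."""
    return type(value).__name__

print("%s %s %s" % (describe(payload), describe(label), describe(header_name)))
```

Each literal states its intent at the point of use, so nothing changes meaning module-wide the way it does when `unicode_literals` is switched on.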
If by this you mean "give me exactly Python 3's text model without breaking my unmodified Python 2 code", you should be aware that A) this is impossible because B) the Python 3 text model is backwards-incompatible with unmodified Python 2 code, and C) that's kinda why Python 3 existed and was backwards-incompatible.
i know it does - but it would be nice to have the everything-is-unicode mechanism of python 3.
Asian developers hit unicode problems before US based developers because of the natural differences in underlying OS language.
... You can't have your cake and eat it. If you want Unicode everywhere then you use Python 3, because it's a breaking change. Saying you'd want it in Python 2 is a confused way of saying "I need to use Python 3".
We should make this Python 4.0 and move on. Some high paid google engineers working the "official" 3.x will be pissed, but that's life. I am sad to see the schism in Python community.
A non-trivial number of people seem to want this, but why does it need to be called Python? Keep the syntax and C extension compatibility, but ditch the name. One could even work out a means of selectively bringing in ideas from elsewhere:
- more from Lua, Julia, Scala, and Haskell (within reason)
There are already plenty of examples of Python modules that implement significant modifications (e.g. Cython, Rpy2, Dask), so I don't think it would really feel all that different from typical Python programming.
from __past__ import bytestring_literals, loose_comparison, integer_division, ...
? Being able to upgrade one file at a time to Py3 would have made porting so much easier! I realize some things could be hard to support file-at-a-time, but others (like my examples) would obviously not.
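For comparison, the per-file switch that does exist runs in the other direction: `__future__` imports opt a single module into Python 3 semantics while it still runs on 2.7. A minimal sketch, where `average` is just an illustrative function:

```python
# These imports affect only the module that contains them.
from __future__ import absolute_import, division, print_function

def average(values):
    # True division on both 2.7 and 3.x because of the `division` import.
    return sum(values) / len(values)

print("mean:", average([1, 2]))
```

On Python 3 the imports are accepted and have no effect, so the same file keeps working after the interpreter is switched.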
I've held off on porting the Py2 code that I'm responsible for because of the lack of total test coverage. I know that something tricky with unicode, None comparison, or the list->generator split will cause a failure in a weird, untested edge case. If I were making a numeric library, that'd be one thing. However, I mostly write complicated business logic API glue code that is hard to test without huge amounts of mocks.
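Two of those edge cases can be sketched in a few lines. This is written for Python 3, with the Python 2 behaviour noted in the comments, and `none_sorts_first` is a hypothetical helper, not anything from a real codebase.

```python
def none_sorts_first(values):
    """Python 2 ordered None before everything; Python 3 raises TypeError."""
    try:
        return sorted(values)[0] is None
    except TypeError:
        return False

# map() returns a list on Python 2 and a one-shot lazy iterator on Python 3.
doubled = map(lambda x: x * 2, [1, 2, 3])
first_pass = list(doubled)
second_pass = list(doubled)  # Python 3: the iterator is already exhausted

print(none_sorts_first([3, None, 1]))
print(first_pass)
print(second_pass)
```

Neither failure shows up at import time; both need the right data to flow through the right branch, which is exactly what incomplete test coverage misses.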
Also, this kind of thing gives me nightmares about porting:
WSGI therefore defines two kinds of "string":
"Native" strings (which are always implemented using the type named str ) that are used for request/response headers and metadata
"Bytestrings" (which are implemented using the bytes type in Python 3, and str elsewhere), that are used for the bodies of requests and responses (e.g. POST/PUT input data and HTML page outputs).
So according to the spec (!) WSGI headers MUST be `str` in Python 2 & 3, but that means that the semantic meaning of the types changes. How on earth am I supposed to write good code for working with headers (let alone mocking and testing) when I'm required to decode them in Python 2 but not in Python 3?
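One way to cope is to hide the difference behind a single helper. A minimal sketch, assuming the header bytes are really UTF-8 (an assumption, since the spec only guarantees that Python 3 native strings hold latin-1 code points); `header_text` is a made-up name, not part of WSGI.

```python
import sys

PY2 = sys.version_info[0] == 2

def header_text(value, encoding="utf-8"):
    """Turn a WSGI "native string" header value into real text."""
    if PY2:
        # Native str is a byte string here, so decode it directly.
        return value.decode(encoding)
    # On Python 3 the server hands over a str whose code points are the
    # raw bytes (latin-1), so round-trip back to bytes before decoding.
    return value.encode("latin-1").decode(encoding)

print(header_text("text/html"))
```

Tests can then target `header_text` alone instead of every place that reads a header.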
I consider the fact that 2.7 isn't getting new features a good thing. After a point, every new feature takes away something from everything that came before.
I never knew that, despite Python 3 being called the future by the PSF and having been around for a long time, people still want to stick to Python 2. I mean, it isn't 2013 anymore, when libs weren't ported; most famous ones generally are ported now. Don't know what the problem is.
The arrogance and stubbornness of the CPython dev team is starting to have the consequences we all knew were coming. If only they'd at least done unicode right. Had they gone with assuming UTF-8 and kept the indistinction between bytes and unicode, more of us would be onboard.
Why does Python have so many hilariously convoluted trials and tribulations over decades trying to upgrade their language when the JavaScript community is able to run smoothly across many different versions and many different interpreters simultaneously?
As a Python 2.7 project why would I trust more to this unofficial version with a lot of backported code compared to the tried and tested 2.7 to 3 route?
What would happen with my python 2.8 when it needs to be converted to the official python 3?
1. Calling it Python 2.8 is a really bad idea. If you want to fork Python in this way, great. I'm sure very few people in the community would have a problem with it if you called it Brothon or P8thon or Hackthon (could run into trouble with that one, but who knows?) or IHate3.xThon. You can call it LifeOfBrython or Snake, depending on how much you care about where the name came from.
Guido van Rossum is one of the least litigious people you can find in the open source community. When he wanted people to stop naming packages after PEPs (pep8, pep257), he didn't try to go to court and figure it out later. He talked to the maintainers of the packages and explained why he thought it was problematic [0]. I wouldn't expect a cease and desist showing up anytime soon, but it's in poor taste, and I hope you reconsider it. Python is not yours just because you are free to use it and do as you please with it. It's really stretching the concept of open source for you to unilaterally declare a collection of hacks to be a point release that's been specifically addressed and decided against by the community as a whole.
2. I have so much sympathy for people who must maintain Python 2.x applications and cannot make the business case for putting the time into porting it to Python 3. I'm in that situation myself right now, and I have been for years. I can't find a compelling argument to warrant the time so long as the 2.7.x branch is getting security updates. When the security updates stop, the move will finish.
The reality is that those of us who have been working with Python for a long time now (10+ years) have already figured out reliable workflows to get around the most lacking aspects of the language in our domains of expertise. Or we have swapped out subsystems in other languages for cases where we absolutely cannot find workarounds.
I get it. I'm in that boat. I understand that boat because I live in it.
I can even understand why this project happened. Because hacking things you love to make them better for you as an individual is the heart and soul of what makes software development so great.
What I absolutely cannot understand is why anyone (absent certain libraries needed) would start a new project on a deprecated branch of a language. And I don't understand why this is such a contentious issue specifically with Python. I don't hear any of my friends who work with C# or Go getting into arguments about how the latest version has failed or shouldn't have been rolled out. As much as Swift 3 has caused problems, I don't hear any of my iOS developer friends complaining about how that shouldn't have happened and trying to backport the good stuff into the 2.x branch. I genuinely don't understand this behavior in the Python community.
*
I do, however, suspect that it has something to do with the overall age of the community.
Using Python has always been a matter of taste and style. I get that. Python is a language that makes explicit tradeoffs between performance and style. It is no shock to me that people who learned to love the language in its older form want it to stay that way. And I say this as someone who is closer to 40 than I am 30: we older folks in the industry need to fight against the stereotype that we can't or won't learn new tricks. Yes, there's a place for battle scars and pushing back against every new flavor-of-the-month stack that rehashes old problems in new ways. But obstinately sticking with stuff simply because it's what we know does all of us who are getting on in years a huge disservice.
Getting a good gig as a software engineer in your 20s is not that difficult. Getting the same gig when you're pushing 40 and the hiring managers are still in their 20s is a lot harder. If you're making decisions to start new projects on old versions of any language, please, please, please, do all of us a favor: make sure that you are doing it for solid reasons and not simply because it's what you already know and are already comfortable with.
*
Because this post isn't long enough already, I have a second pet theory about the reluctant adopters.
Python 3 is a _great_ language. It's not the same as Python 2. It makes different tradeoffs about styles and choices of expression from what Python 2 does because these things can and should change over time.
I think that the expressive power of Python, the closeness to human language, and the ease of reading well-formed code have a lot to do with the resistance to Python 3. There are people (I'm one of them) who get really really picky about language in general. I think the Oxford comma is a necessity in almost every case. And I think people who casually omit it are stupid, shallow, non-thinking drones who don't care about the history of language and don't care about precise meaning of words, and therefore don't care about me because they can't be arsed to toss a comma in a place that greatly clarifies meaning.
I don't actually think all of that, but I'm closer to that than I am to not caring at all.
And there are many like me. When you're dealing with a programming language with intent at its heart, people are going to take that in different ways. When you're dealing with a language that cares about whitespace and eschews braces and wants to limit brackets and just expose the pure logic of the program, people really are going to get fired up about print vs. print().
That's expected. And beautiful that so many people care that much. But as much as I lean towards prescriptivism and that rules are good for a language, even I have to admit that language evolves. But it usually evolves to be more inclusive and more expressive, not less. The really bad fights about languages in general revolve around what to include, not what to exclude. Because languages are exclusive by default.
A counterpoint to my idea above that we are all just old and lazy, is this: that we really do genuinely care, and that we care for good reasons. But something has to give. We cannot refute the evolutionary pressure. Python 3 is as necessary for modern speakers of programs as a recent edition of a dictionary is in your preferred language.
Yes, you can get by with something older, and you can even make yourself understood by most people who speak the language.
But you are limiting your ability to express your intent when you make an intentional choice to refuse to adopt what the rest of the culture around you is doing.
Aaaaaaand I'm done. Sorry for how long that was. I didn't intend that. It just sort of happened. If you read all the way to the end, let me know, and I'll upvote you.
This is a bad idea. My guess is the author will not be able to call this thing "Python", and for good reason.
Should it exist? Sure. Why not? The nature of FOSS is that anyone can tinker with it and make it into whatever they dream. Go for it! Just don't call it Python.
I use both 2.7 and 3.x. No drama. New projects go through a "Do we see any issues with 3.x?" phase where we try to list libraries we'll need and check for support. Not that difficult.
I absolutely understand companies/people holding on to 2.7 for dear life. Converting a non-trivial working code-base would be costly and very difficult to debug if that code base doesn't have extensive test coverage (probably true of most). It would also be utterly irresponsible in most business cases.
Most businesses don't have software developers sitting around doing nothing. Revenue comes from existing products and new features along with bug fixing. No customer of a software product will pay one dime for a company devoting a year to port their entire code-base to 3.x. Can you visualize that announcement?
"We stopped delivering new features and fixing bugs a year ago. Instead we ported our entire product to Python 3.x. Today, a year later, we give you exactly what you were using a year ago. Enjoy!"
Yeah. Exactly. The huge sucking sound you'll hear is that of customers leaving the company throughout an entire year of nothingness. In a dynamic free market competitors would eat you alive as you stop delivering features and fixes while they zoom right past you with a better offering to your customers.
That said, sticking to 2.7 for the long haul --say, ten years from now-- will create the Python equivalent of old COBOL code still running in deep dark places within financial institutions. It will create codebases nobody wants to look at or touch. It will create codebases that will be anywhere from hard to impossible to support as libraries will surely evolve to support the 3.x and, eventually, 4.x branch.
It is perfectly sensible for a company to, given today's realities, stick to 2.7. This is almost exactly the problem described in "The Innovator's Dilemma". Good management means focusing on delivering what your current customers are buying and want to buy. Unless they are clamoring for your product to use 3.x it could actually be really bad management to make the switch.
The only way to do it correctly would be to hire a full parallel team of programmers to port the codebase while mirroring every single new feature and bug fix implemented as customers' needs are met. At some future point the two branches would achieve parity in function and reliability. This parity would allow seamlessly switching to the new codebase without damaging customer relationships. Of course, this would cost a ton of money for a non-trivial product and, at the end of the process, the company would probably have to fire one of the two teams. Pretty messy and costly in more than just financial terms, isn't it?
I am not advocating either approach. Just saying I understand this from both engineering and business perspectives. People pushing others to just switch are doing so from a frame of reference devoid of any understanding of the realities of business. Most businesses are not about the technology, they are about what problem you solve for your customer. They don't care about "the geek stuff" behind the curtains. And rightly so.
What's even more ironic here is that the print command he used is Python 2 only rather than version-agnostic. I wonder which version Randall Munroe would prefer, if either.
We sometimes detach threads that users have already flagkilled when they don't bring new information to a discussion, since we're all here to learn and memes and clichés can get in the way of that. "Obligatory xkcd", especially the standards one, is as predictable as anything here.
Please please please explain how that is inhibiting or restraining personal freedom. Or how it is inhibiting or preventing the expression or awareness of your thoughts or desires.
Oh, but you're expressing your thoughts and desires perfectly well though. You want someone else (not you, of course) to maintain Python 2 for you for the princely sum of £0.00, so that you don't have to do any work on upgrading to Python 3.
And when the time finally comes around that those people, who have been maintaining Python 2 for many many years (for free), want to focus their efforts on an easier to maintain and more modern language that actually has a future they are repressing you?
>Oh, but you're expressing your thoughts and desires perfectly well though. You want someone else (not you, of course) to maintain Python 2 for you for the princely sum of £0.00, so that you don't have to do any work on upgrading to Python 3.
That's how it works with programming language communities.
Not everybody is directly involved in maintaining the language, but the whole community has a stake (and a say) in the future of the language.
Furthermore, it's not just the core team that's responsible for the success of the language, but also the users and the companies that adopted it. Without those, Python would be some obscure toy language by a Dutch academic, and he wouldn't have a job in Dropbox etc.
There are lots of people that have been major contributors to Python's success, including large businesses that employed people like Guido, which also have concerns regarding the switch.
> That's how it works with programming language communities
Not always, python has a BDFL rather than a steering committee. But nonetheless it's the people who actually maintain the language who drive it forward.
> but the whole community has a stake (and a say) in the future of the language.
Comments like "the core team are repressing me by not updating 2.7" and random people making half-baked 2.8 releases don't help the future of the language.
Look, it's simple. Core team doesn't want to update Python 2.x anymore for a large number of good reasons. For some people (including you I assume) this isn't the decision you wanted.
But this decision was made years ago. Either move to a different language, update to Python 3 (again, you've had years of warning) or pay for a supported 2.7 version. Or just carry on using 2.7, it's supported until 2020.
Bitching about non-existent repression on Hacker News achieves squat.
So? For one, almost everybody I've read, even if they are OK with Python 3, says that that decision wasn't the best course the core team could have taken.
Now, given that the decision has already been taken and followed through for 6+ years, should they now stick with it and see it through? It depends. There's no reason some of us should not just say "no" to that.
>Either move to a different language, update to Python 3 (again, you've had years of warning) or pay for a supported 2.7 version. Or just carry on using 2.7, it's supported until 2020.
Or you know, we can do all/either of those things, and still criticize Python 3 and try to get them to change course.
It would not prevent you from receiving updates, it would prevent you from receiving updates for free. You may of course pay somebody to update 2.7 for you. The Python Software Foundation never promised they would continue maintaining any release indefinitely, and it's unreasonable to expect them to do so. But in fact they are continuing to fix critical bugs even in 2.7, so they are being generous. Nobody is being blackmailed or repressed.
No, but it's reasonable to expect them to hear the concerns of the largest user base of Python, which is 2.x users - even if they decide not to follow them in the end.
Except if they just do it "for fun" and "for the sake of it" and could not care less for adoption or the community in general.