There is still a delta between how unicode behaves on 2 and how str behaves on 3, for example. In general, you'll be fine as long as you have a test suite that runs in CI for both 2 and 3 - to catch incompatibilities early.
When it comes to finding which packages need to be ported:
I've been disappointed by the "Python 3 Wall of Superpowers" and the fact that they've basically never updated their package list. (They update the Python 3 readiness, but you'd never find out if a popular package created in the last 4 years was incompatible, or if packages on the list have become unpopular.)
Today I found py3readiness.org, which is much more relevant and up-to-date. It has a better call to action, too.
Unfortunately, just because a package has supposedly been "ported" to Python 3 doesn't mean it works properly. I've found, reported, and developed workarounds for bugs in two of the packages shown in green on py3readiness.org.[1] Bugs that should have been found years ago if the package was being heavily used.
Both lists should replace python-ldap with ldap3. The former is abandonware and will probably never be upgraded. The latter already works great with Python 3 (and is a better library all-around).
We know about that. Many of the packages are overridden if there are drop-in replacements like PIL -> Pillow.
In the above list, these are not drop-in replacements, and it will require lots of changes in your application code if you want to use them. One idea is to show alternative packages: https://github.com/chhantyal/py3readiness/issues/9
Am I correct to assume this would be a fantastic place to start contributing to open source projects by helping convert these Debian tools over to Python 3?
If you're interested in this effort, please email me. This is a really good new
contributor task, so if anyone's asked you how they could get involved with
Debian, you should send them to us!
Paul, I already emailed you but I'm responding here because I have a feeling your response will be useful to other people interested in helping: could you give some direction for people like me who know Python 2 and/or 3 and want to help, but don't know the first thing about the structure of Debian packages or how they are developed?
For example: I'm looking at ./main/f/flask-wtf/flask-wtf_0.10.2-1.json and I'm not even sure what I'm looking at. The contents of that file are:
On my machine `apt-cache search flask-wtf` finds a package called `python-flaskext-wtf`. Digging deeper with `apt-cache showpkg` I find the homepage http://packages.python.org/Flask-WTF/, which in turn gets me to the code https://github.com/lepture/flask-wtf. I haven't updated the code to Python 3, but let's assume for a second that I have: now what? How do I tie this all together? Do I just send you a patch for this code? Is this even the right code?
If you point me in the right direction I'll be happy to write up a tutorial to help other folks contribute.
All the very same questions I had. Also, in the list there are different versions of the same library. Presumably there are different parts of Debian that rely on different versions of the same library - will they be updated to use newer versions?
I see for example, openpyxl_1.7.0 is listed as needing attention. Looking around it seems that openpyxl supports python3 now, so I guess whatever has that dependency needs to be checked?
flask-wtf is a source package. python-flaskext.wtf is a binary package - a single source package can build one or more binary packages.
Binary packages are what's found by apt-cache search and installed by apt-get install.
You can install a Debian source package using:
apt-get source flask-wtf
(This just downloads and unpacks it into your current directory, so you don't have to be root - it's not actually modifying the installed package database).
Information on the source package is available here:
Sorry, I didn't document this since I'm rushing stuff out as soon as I get it together -- I was looking at the Trove classifiers, and setting candidate to true iff it says it's Python 3 compatible but there is no Python 3 package :)
Forgive my ignorance but does Debian (and other distros) maintain forks of all their software packages? For example, how would a hypothetical mailman "port to python3" happen? Would Debian maintain its own fork or would you contribute directly to the GNU mailman project?
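The candidate rule described here could be sketched roughly as follows. The function name `is_candidate`, its parameters, and the sample data are hypothetical and not taken from the actual tool; only the `Programming Language :: Python :: 3` classifier prefix is a real Trove convention.

```python
PY3_PREFIX = "Programming Language :: Python :: 3"


def is_candidate(classifiers, has_py3_package):
    """Return True iff upstream claims Python 3 support but no Python 3
    package exists yet (hypothetical helper, not the real tool's code)."""
    claims_py3 = any(c.startswith(PY3_PREFIX) for c in classifiers)
    return claims_py3 and not has_py3_package


# Made-up sample metadata, for illustration only.
sample = [
    "Programming Language :: Python :: 2.7",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.4",
]
print(is_candidate(sample, has_py3_package=False))
print(is_candidate(sample, has_py3_package=True))
```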
> We can all soon look forward to the day where we no longer have to play Unicode whack-a-mole
I don't expect Unicode trouble to disappear with Python 3, quite the opposite: it's trivial to insert ticking Unicode time bombs into Python 3 code, especially in things surrounding shell scripting. Something as trivial as this:
print(os.listdir())
Will explode when it runs into filenames that can't be decoded as Unicode (i.e. perfectly legal filenames on Linux). Linux allows raw binary in a lot of places (filenames, arguments, environment variables) where Python expects Unicode.
In my experience, there's much less trouble with Unicode in Python 3, as it forces you to explicitly convert to/from Unicode.
Filesystem encoding is somewhat of an exception, but there's no good solution to this. In Python 2, the above would print binary characters, which is generally not what you want, potentially confusing the terminal emulator. In Python 3, you have to explicitly specify whether you want to output the original bytes or replace any bytes which aren't valid Unicode:
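A minimal sketch of making that choice explicit. To stay independent of the local filesystem, it builds the problematic name by decoding the bytes `b"caf\xe9"` with the `surrogateescape` error handler, which is an assumption standing in for what `os.listdir()` would typically return on a UTF-8 Linux system. The helper names `original_bytes` and `printable` are hypothetical.

```python
# Latin-1 bytes that are not valid UTF-8, carried inside a str as a lone surrogate.
name = b"caf\xe9".decode("utf-8", "surrogateescape")


def original_bytes(filename):
    """Recover the exact bytes that were smuggled into the str."""
    return filename.encode("utf-8", "surrogateescape")


def printable(filename):
    """Replace anything that cannot be encoded as UTF-8 with '?'."""
    return filename.encode("utf-8", "replace").decode("utf-8")


# A strict encode (the default error handling) rejects the lone surrogate.
try:
    name.encode("utf-8")
except UnicodeEncodeError as exc:
    print("strict encode failed:", exc.reason)

print(printable(name))       # safe to print, but lossy
print(original_bytes(name))  # shows the bytes object, nothing is lost
```

To emit the raw bytes themselves rather than their repr, they can be written to `sys.stdout.buffer`, which bypasses the text layer.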
This is more verbose, but also much more explicit, which is a good thing. If you want this as the default behaviour, you can set the environment variable PYTHONIOENCODING=utf-8:surrogateescape (or alternatively "replace"), and the following now works, just as with Python 2:
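One rough way to check that behaviour from a script is to run a child interpreter with the variable set and capture what it writes. This is only a sketch: the child program and the sample name are made up for illustration, and the name contains a lone surrogate of the kind `os.listdir()` produces for undecodable bytes.

```python
import os
import subprocess
import sys

# Child program: print a str containing a lone surrogate (an escaped raw byte).
child_code = r"print('caf\udce9')"

# Copy the current environment and add the variable discussed above.
env = dict(os.environ, PYTHONIOENCODING="utf-8:surrogateescape")
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
# The child exits cleanly and the original byte 0xE9 comes back out.
print(result.returncode, result.stdout)
```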
Judging from the responses in this thread, it's either not properly documented, or people aren't reading the docs.
There is some discussion about making "surrogateescape" the default sys.std{out,in,err} encoding error handler in Python 3.5 in order to avoid surprises.
Are you implying that by using Python 3 incorrectly you can run into unicode problems? I agree with that.
However, there are ways to do this safely. At the boundaries of your code you have to deal with the unicode/bytes question, in any language out there. The good thing about Python 3 is that once that boundary is crossed, unicode and bytes are nicely separated.
> Are you implying that by using Python 3 incorrectly you can run into unicode problems?
The problem is that `print(some_string)` is now incorrect, even though it looks perfectly harmless. So far I haven't seen any "best practice" guide on how to fix that. `sys.stdout.buffer.write(os.fsencode(some_string))` works, but is pretty ugly. Handling all filenames as bytes instead of strings works as well, but again, that gets ugly quick. Cleanest way seems to be:
But should I depend on users setting `PYTHONIOENCODING`? Should I do that in my scripts? It feels all very rough and causes a lot of issues that you really shouldn't have to deal with in the first place.
> The good thing about Python 3 is that once that boundary is crossed, unicode and bytes are nicely separated.
Not exactly "nicely separated", the unencodable bytes get squished as surrogates into the Unicode strings waiting to cause trouble later on.
JFTR. In Python 3.4 and later I think, stdout has a surrogateescape error handler by default. Not that it helps much, because this will fall apart if you write to other things. The story with this apparently is now so confusing that I am not sure what the correct way to do it on Python 3 is.
From my experience most people don't care anyways. In some of my libraries the Python 3 version just refuses to work unless the encoding is utf-8 to avoid accidental failures down the line which nobody can debug.
Python 3 does not have a good way to do cross-platform filesystem access, because some OSes treat filenames as byte sequences and other OSes treat them as strings.
Arguably this was always a problem, because there were values in your program that you weren't sure whether they were a byte sequence or a string. But in practice, under python 2 you could regard a filename as an opaque "handle" of type "string" (that would in fact be a string on some platforms and a byte sequence on others), and as long as you didn't need to manipulate it directly, everything would work.
Python 3 can't do that, so it desperately needs a better cross-platform way of handling filenames.
It's still an opaque handle - as long as you're just passing the filename to other OS functions, everything works. If you're printing ("directly manipulating") it, however, you have to decide what to do about non-Unicode bytes.
Even this is new, though; I don't remember when they fixed it, but Python 3.0 seriously would just ignore filenames that were not in the expected encoding... as in listing the files in the directory would just skip them as if they were not in the folder... Python 3 had this big promise of solving a bunch of Unicode issues, but it doesn't really seem to have enabled anything not already possible--and not even terribly hard--in Python 2, while making a lot of missteps along the way and only very recently (as in, not even yet in Python 3.4, though committed to some development branch, per statements by others elsewhere on this thread) providing anything actually new in the area most often considered the "carrot" for upgrading to Python 3.
Python 3 did solve a bunch of Unicode issues (some of the fixes were backported to Python 2). The most important one is probably lack of consistency - while it's possible to enforce the bytes/Unicode barrier in Python 2, it's not intuitive and many programmers just ignore it.
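A small sketch of what that barrier looks like in Python 3, where mixing the two types is an error rather than an implicit conversion. The sample bytes are made up for illustration.

```python
data = b"caf\xc3\xa9"        # bytes, e.g. as read from a file or socket
text = data.decode("utf-8")  # explicit decode at the boundary

try:
    data + "!"               # bytes + str raises TypeError in Python 3
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Five bytes, four characters, and no silent mixing of the two types.
print(len(data), len(text), mixed_ok)
```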
There was a recent LWN.net submission about the other carrots:
You have to look at Python 3.0 to get a good understanding of what happened with the Python 3.x transition, as that was the critical moment that set the mentality of how people were going to view the people telling everyone "you need to port your stuff to Python 3"; when you see that some of the underlying complexity and issues still aren't fixed in 3.4 from the same decision, it is damning to the argument that Python 3 was really solving a real world problem at all. If nothing else, it means you have to start the transition clock not when Python 3 was first released, but from when it first manages to not be horribly worse than Python 2 by reasonably objective measures.
Python 2.6/2.7 handle Unicode just fine. I've had all-Unicode programs running since the early days of Python 2.6. Python 3 just does it somewhat differently.
I think Armin has got it; repr() is allowed to (and does) return unicode in Python 3, whereas it's explicitly expected in Python 2.x that it will return a byte string.
As someone who uses a lot of Python scripting for sys-admin automation stuff, is there anyone here on HN who could ask Apple to start distributing a python3 binary with the OS by default?
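For illustration, a short Python 3 sketch of that difference. The built-in `ascii()` escapes non-ASCII characters, which is closer to what Python 2's `repr()` produced; the sample string is made up.

```python
s = "caf\u00e9"

r = repr(s)   # keeps the printable non-ASCII character as-is
a = ascii(s)  # escapes it, so the result is pure ASCII

print(a)
print(all(ord(c) < 128 for c in r))  # checks whether repr() stayed ASCII
```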
Otherwise we're going to end up with a similar situation as BASH and other utils in a few years time, where OSX comes with Python2.7.x, and everything else in the world is Python3...
(Although, by 2020, Apple will probably have deprecated OSX in favour of iOS, and have javascript as the only blessed scripting language...)
(to be less pithy, it's super easy to write code that will run on both 2.7 and 3.x -- avoid print-as-statement and use format strings instead of '%' for interpolation and that will get you 95% of the way there--probably 99-100% of the way for simple automation scripts)
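A minimal sketch of that advice, assuming a simple automation script. The function name and host name are hypothetical. The same file should run unchanged on 2.7 and 3.x because it only uses single-argument `print(...)` calls and `str.format`.

```python
def summarize(host, count):
    # str.format behaves the same on 2.7 and 3.x
    return "{0}: {1} files synced".format(host, count)


# A parenthesized, single-argument print is valid on both versions.
print(summarize("editstation-01", 42))
```

For multi-argument prints on 2.7, `from __future__ import print_function` at the top of the file gives the Python 3 behaviour.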
Yes, python2/3 scripts are possible, and yes, Homebrew works.
However, I don't really want (if I can help it) to install homebrew on all the workstations of everyone on the team (mostly video-editing), and for personal laptops, I don't really want to add that as a requirement. I certainly don't want to turn my < 100kb of scripts into hundreds of megabytes of requirements.
I'm already having to bundle an up-to-date static compiled standard rsync rather than the apple one (to cope with extended attributes across SMB mounts...).
It would be really nice to not have to think the whole time "Is this compatible python2 and python3", but to just write pure python3.
A lot of our scripts are to do with managing and synching all the files needed for video-editing, and other media-management utilities. I already dumped BASH (my initial attempt), as it failed or became ludicrously verbose when dealing with things like folder names starting with '-' or '*', newlines, quotation marks, tabs, or other "special" characters in file names. I don't know HOW they managed to get newlines into filenames, but our editors are creative people. I wouldn't be surprised if they managed to embed fonts and colours next.
So it does involve quite a lot of string processing, our automation scripts. Our tape archival system is CentOS, our main Server is a Synology RAID, and all the main edit stations are OSX, with the possibility of adding Windows boxen in the future. And since not all our projects are in English, everything is UTF-8, and mostly 'just works'. But I don't think the transition to Python3 is going to be 100% seamless. Hopefully unittests help, but I'm sure there are cases I'll miss.
As the maintainer of a Python project that is 2.7 and 3.3+ compatible on the same code base, there is nothing super easy about it. It's horrific actually. Enough so that I'll never do it again. It goes far beyond just print functions. Libraries like `six` help, but the subtle differences in how Unicode handling has changed make it a real challenge.
I've written a largish project (~10k lines) targeting 2.7 and another one for 3.4. Other than the new features present in Python 3 (function annotations and asyncio), my code style has not changed at all. Neither project deals directly with unicode encoding/decoding however.
I've read all the relevant material. That doesn't mean it isn't painful. The same author of that blog post wrote another blog post complaining about how Unicode is handled in Python 3.
Ha, yes. I kind of forgot about swift. But, remember Atwood's Law and all.
Apple have invested quite a bit into JS recently, with their new LLVM JIT for Safari, I believe? (OK, by Apple standards, not much investment, but the fact that they did it at all says quite a lot...).
Since they did all that work, and are committed to maintaining it now, would it not (in some way) make sense to then use the same pluggable scripting language VM into other parts of the OS, and deprecate applescript, etc?
> Python 2 is scheduled to be EOL'd upstream officially and for good in 2020.
We're in 2015 now (wow, that went quickly), and keeping our release cadence up
(3 years a pop) puts Stretch up in 2018, and Buster in 2021.
Am I missing something? Did Debian decide to extend their release cycles?
Jessie is due to be released next week, which is 2 years since Wheezy (May 2013). In fact, the last several years have all been 2-year cycles, not 3. The only three-year cycle was Woody → Sarge (2002-2005), and AFAIK, that delay was one of the reasons that Ubuntu had room to enter the market. Since this put pressure on Debian to make their development more rapid, I'd be surprised if they decided to move in the opposite direction.
Nice, I'm glad to see that Python 3 adoption is finally picking up steam! Python 2 is nice, but it has really started to overstay its welcome, I think.
Currently we are waiting for the last few packages to be ported (anaconda is already running on python3 and ready for the Fedora 23 release). Ironically the hardest part isn't porting (we've sent a huge number of patches to upstream projects) but getting upstreams to accept python3 as a supported interpreter ("what is this python3 thing?" yes, this is a real quote).
I am happy to see new distros joining the party, gl!
If you want to switch back and forth between python2 and python3, one option is to install miniconda (like Anaconda, but with just the package manager) as your Python distribution and create a python2 environment
(conda create --name=myproject-py2 python=2.7 && source activate myproject-py2)
that you can then pip install the python2-only libraries into. Then, when your libraries support Python3, create a new Python3 environment and install them into that
(conda env export > environment.yml && conda create --name=myproject-py3 python=3.4 && conda env update --name=myproject-py3 --file=environment.yml && source activate myproject-py3)
One disadvantage of conda/anaconda is that you can't, or are discouraged from using virtualenv with conda/anaconda.
I use Anaconda on Win7, like it a lot, but I haven't figured out how to use, say, the python 2 that comes with anaconda alongside the python 2 that comes from python.org. I've looked around, haven't really understood what I should be looking at for that. But whatever it is, it isn't virtualenv if you use Anaconda.
> One disadvantage of conda/anaconda is that you can't, or are discouraged from using virtualenv with conda/anaconda.
True, but conda has its own virtual environments that work quite nicely. You can make a new one with conda create and switch between them with activate/deactivate. You even get the same sort of interaction with pip (for the packages you want that aren't in conda or binstar), and I've been able to use e.g. Emacs packages that expect to work with virtualenv and have (mostly) had success using conda's envs instead.
Why would you want to do that? If you're using anaconda or miniconda, you should be using Continuum Analytics' builds of python, not python.org python.
Because I want to use Anaconda and its infrastructure, but skeptical people in my circle would want to see that whatever was built will run on the version of python that they would download (which would not be Anaconda).
"Why would you want to ..." answers are generally pretty frustrating. It's a big universe with a very large possible set of individual circumstances.
If you're building conda packages then the only possible consumer is other users of miniconda or anaconda.
I'm not sure what "whatever was built" refers to in this case. Some kind of app? Of course you can use anaconda for normal library or application development, since there's functionally no difference between Continuum Analytics' python builds and the python.org python build.
Have you tried using the logging module? Might be overkill for really small scripts as it requires a few lines to set up, but even the tiniest tool ends up requiring print-debugging at some point. With logging it works the same in 2/3, you can leave all the debugging statements in, and if a library also uses it then you get additional debug output from that too. Add a "debug" command line flag to pass when something goes wrong down the road (to change the default log level), and you don't have to add/remove statements for debugging.
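A minimal sketch of that setup, assuming a small command-line tool. The logger name `synctool` and the `configure` helper are hypothetical; `argparse` and `logging` are standard library modules available on both 2.7 and 3.x.

```python
import argparse
import logging

log = logging.getLogger("synctool")  # hypothetical tool name


def configure(argv):
    """Parse the command line and set the log level from --debug."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--debug", action="store_true",
                        help="show debug output")
    args = parser.parse_args(argv)
    level = logging.DEBUG if args.debug else logging.INFO
    # Configures the root logger, so libraries that use logging are covered too.
    logging.basicConfig(level=level,
                        format="%(levelname)s %(name)s: %(message)s")
    # Set our own logger explicitly as well, because basicConfig() does
    # nothing if logging was already configured elsewhere.
    log.setLevel(level)
    return args


configure(["--debug"])
log.debug("this only appears with --debug")
log.info("this appears at the default level too")
```

In a real script the argument would be `sys.argv[1:]` instead of a hard-coded list, and the debugging statements can simply stay in the code.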
For a vim extension - I use snipMate http://www.vim.org/scripts/script.php?script_id=2540 It's a bit on the dead side but works great as-is. ifmain followed by the expander key (shift-tab in my case) expands to the if __name__ ... idiom. You could add your own "p" snippet that expanded to print() in python code.
Yeah that really is trivial. print now works like any other function, like it does in most languages. The new way really is objectively better. Short of allowing all functions to be called as "function argument" à la Haskell (which I would support, but would be a massive change) I don't see any alternative.
Blender has been on Py3K since 2008 and is an underrated software platform.
Asyncio is a revolutionary concurrency framework based on coroutines. The "yield from" feature is useful both for coroutines and for other generator functions.
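As an illustration of `yield from` outside of asyncio, here is a small sketch of generator delegation. The function names and data are made up; the point is that the inner generator's `return` value becomes the value of the `yield from` expression.

```python
def read_chunk(items):
    """Yield each item, then return the chunk's sum to the delegating caller."""
    total = 0
    for item in items:
        yield item
        total += item
    return total


def read_all(chunks):
    """Delegate to read_chunk for every chunk and finish with a grand total."""
    grand_total = 0
    for chunk in chunks:
        subtotal = yield from read_chunk(chunk)
        grand_total += subtotal
    yield ("total", grand_total)


result = list(read_all([[1, 2], [3]]))
print(result)
```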
Then don't. The differences aren't that big that it would take you more than an hour or two to learn them. It's just that it takes time and effort to port older code.
Unfortunately, I think what this means is definitely up for debate. Python 3 "happening" may be nothing more than a permanent split, not the obsolescence of 2.
In fact, that's what it looks more and more like every day.
What IS happening, is that libraries are not, will not, and cannot drop 2.7 support. The opposite isn't true and may never be.
Some libraries will never drop 2.7. I can see why some projects like Django may drop 2.7 in a few years. Django releases tend to break backwards compatibility over time anyway. I don't care either way. My older projects can't be easily upgraded from Django version to version, and my newer projects start out on Py3k.
I looked into the changes between Python 2 and 3 a few years back (like, 2011, 2012). There were lots of minor changes that were all nice, but no single big change that was a compelling reason to switch. If, e.g. they had ditched the GIL and allowed pure Python code to exploit multicore systems, that would have been compelling to me.
Instead, at the time, Wikipedia claimed that Python 3 ran somewhat slower on CPython than Python 2. :(
Maybe all of those minor changes taken together constitute a big improvement, once I give it a try. The Unicode handling is said to be much better, and Unicode handling does suck really, really hard on Python 2. So there's that.
I would love that. Ruby has become my go-to language for quick scripts, even in places where bash does the job. Bash is for things where I can keep everything in my head and there aren't too many lines. Ruby for everything else.
It would be better for everybody if we'd treat Python 2 and Python 3 like separate languages. Then we can reason about the sanity of spending development time to port existing and working code from one language to another.
I concur. In this case, I also think the development time to port the code to Python 3 is worth it, simply because Python 2 will be EOLed in 2020, which isn't that far away!
You're being downvoted because writing Debian's system infrastructure in a language whose spec isn't finalized or is not well known is a Bad Idea when other options exist.
Python 2 -> Python 3 is a trivial port compared to adopting a new language and the technical/cultural baggage that come with it.
Only because the core dev team is making it their priority to move distros over to 3. It won't make any difference, the strength of Python is in the library support, not the distro support. 2.x support by libraries will last a very long time, nothing worth its salt will be dropping it.
By the time they do, Python 3, wherever it is by that point, will be unrecognizable compared to the original release of Python 3.0.