This is incredible. Give me Unicode support in Python 2. asyncio? Probably... I'd still be happy with gevent. This is a clear path for a Python upgrade.
Nope. Can't be done without getting Python 3 either way, because Python 3's text model is not compatible with Python 2's. That is why the core team allowed the other breaking changes, because software was going to be broken in the first place.
> Changing to `unicode_literals` will likely introduce regressions on Python 2 that require an initial investment of time to find and fix. The APIs may be changed in subtle ways that are not immediately obvious.
Unless you're willing to put in the time and energy to test extensively, this is a recipe for disaster. You're paying at least 80% of the cost of a full Py2 -> Py3 port (since looking for regressions is a huge part of that cost) for only a fraction of the benefit.
Being able to change things in one part of the code base at a time -- or one feature/data field at a time -- is a HUGE thing. For production systems you just cannot put the backlog on pause for a month or two while porting. You can, however, fix things incrementally over the course of some years.
That's probably true, but unicode_literals isn't the right tool for an incremental port, because it obeys neither good py2 nor good py3 text-handling conventions.
It would become a substantial detour and end up being a lot more total work.
Also, porting can be done in parallel to normal dev. You aim your port at a specific release while you continue fixing bugs. When the port is done, you port over all the patches. Repeat until port and original version converge.
Isn't it as simple as a) do proper string handling in py2 unicode and b) if you write more u"" in a file than "", flip the switch and write "" and b"" instead?
Not quite, because the py2 library is often not compatible with unicode literals, and also because if you flip the switch in a module it changes the API of any functions that returned strings. That might break calling code from other modules.
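A minimal sketch of that API change, with hypothetical function names. The `u` prefix in `label_after` stands in for what `unicode_literals` would do to a bare literal on Python 2. On Python 3 both functions return `str`, so the caller's check only fails on Python 2.

```python
def label_before():
    # Before the switch: a native str (bytes on Python 2, text on Python 3).
    return str("total")


def label_after():
    # After the switch on Python 2, the same bare literal would be unicode.
    # The u prefix simulates that here without the __future__ import.
    return u"total"


def caller_accepts(value):
    # Hypothetical calling code in another module that checks for native str.
    return type(value) is str


print(caller_accepts(label_before()))  # True on both Python 2 and 3
print(caller_accepts(label_after()))   # False on Python 2, True on Python 3
```

The values compare equal on both versions, which is why this kind of break tends to surface only where code checks types or hands the value to a bytes-only API.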
So then you might have to add code to work around these issues, code that is needed in neither pure py2 nor pure py3 -- hence making it a weird detour.
In general, it is more compelling to use `unicode_literals` when back-porting new or existing Python 3 code to Python 2/3 than when porting existing Python 2 code to 2/3 [1]
Right... our codebase is already unicode all over the place, because otherwise we could not do i18n properly. So we basically have py2 unicode-correct code with encode/decode in the proper places for interfacing with the stdlib. I didn't consider that people might use str and not unicode for text.
As the page notes, many Py2 APIs are completely broken with unicode literals, some noisily and others silently (due to implicit ASCII encoding/decoding), leading to hard-to-diagnose regressions in the codebase under Python 2.
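To illustrate the noisy versus silent cases, here is a small sketch with hypothetical helper names. On Python 2 the mixed concatenation goes through an implicit ASCII decode, so it passes with ASCII-only data and raises `UnicodeDecodeError` once non-ASCII bytes show up. On Python 3 it raises `TypeError` regardless of the data.

```python
def join_name(prefix, name):
    # Hypothetical helper that naively concatenates whatever it is given.
    return prefix + name


def try_join(prefix, name):
    # Return the joined value, or the name of the exception that was raised.
    try:
        return join_name(prefix, name)
    except (TypeError, UnicodeDecodeError) as exc:
        return type(exc).__name__


ascii_bytes = b"file"
utf8_bytes = b"caf\xc3\xa9"  # UTF-8 encoded, contains non-ASCII bytes

# Python 2: silently returns u"dir/file". Python 3: "TypeError".
print(try_join(u"dir/", ascii_bytes))

# Python 2: "UnicodeDecodeError". Python 3: "TypeError".
print(try_join(u"dir/", utf8_bytes))

# Decoding explicitly at the boundary works the same way on both versions.
print(try_join(u"dir/", utf8_bytes.decode("utf-8")) == u"dir/caf\xe9")
```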
I think leveraging PEP 414 and carefully using bytes, unicode or "native" literals is a much more resilient mode of operation.
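Roughly what I mean, as a sketch that should run unchanged on Python 2.7 and Python 3.3+. The constant names and the `to_text` helper are made up for illustration.

```python
RAW_HEADER = b"\x89PNG"         # bytes literal: bytes on both Python 2 and 3
GREETING = u"na\xefve caf\xe9"  # text literal: u"" is accepted again in 3.3+ (PEP 414)
ATTR_NAME = "encoding"          # native literal: bytes on Python 2, text on Python 3


def to_text(value, encoding="utf-8"):
    # Hypothetical boundary helper: decode bytes, pass text through unchanged.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


print(to_text(b"caf\xc3\xa9") == u"caf\xe9")  # True on both versions
print(to_text(GREETING) is GREETING)          # True: already text, returned as-is
```

Each literal states its intent in the source, so no module-wide switch silently changes what a bare `""` means.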
If by this you mean "give me exactly Python 3's text model without breaking my unmodified Python 2 code", you should be aware that A) this is impossible because B) the Python 3 text model is backwards-incompatible with unmodified Python 2 code, and C) that's kinda why Python 3 existed and was backwards-incompatible.
I know it does, but it would be nice to have the everything-is-unicode mechanism of Python 3.
Asian developers hit Unicode problems before US-based developers because of differences in the underlying OS language.
... You can't have your cake and eat it. If you want Unicode everywhere then you use Python 3, because it's a breaking change. Saying you'd want it in Python 2 is a confused way of saying "I need to use Python 3".