Complaints around infelicities in the Python stdlib are fair enough - I don't think anyone would defend them, but what can you do? You can't change published APIs.
For most of the rest of his issues, I think Zed's problem is that he thinks there's one obvious answer, but lots of other people would disagree with his one answer. Python tends to be either very opinionated or very agnostic. Where there's One Way To Do Things, that's what you must do, except where there's obvious disagreement, in which case it doesn't bless a single answer.
API doc autogeneration; I can guarantee that if anyone came up with a default tool with default output such that "doctool module html_directory" was all you needed (and indeed people have) then there would be widespread, perfectly legitimate, disagreement on any number of choices in the design of it.
(Personally, I think API doc autogeneration is entirely pointless, and indeed fundamentally the Wrong Thing To Do for most Python libraries. When I come across pages of JavaDoc-ish API documentation, I groan internally and just look at the source instead.)
>Complaints around infelicities in the Python stdlib are fair enough - I don't think anyone would defend them, but what can you do? You can't change published APIs.
You're absolutely right, but that doesn't mean Zed isn't on to something. How many different ways are there to open a subshell in Python? There's popen, popen2, popen3, popen4 (I think), plus more (subprocess?). Yes, the past can't be changed, but it's still a good idea to stop and ask why there are so many mistakes like this.
Why is there a time module, and a date module, and a datetime module, and yet all three are broken?
The main question is: what factors allowed the stdlib to get so quirky, and what can be done to fix that? Pointing out that the stdlib is quirky is the first step in fixing the problem.
And subprocess is available in 2.6, too. It looks like a well-designed API and certainly makes it easy to run external filters on data. For example, to get the output of 'tidy -q' on a string s:
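The snippet that belonged here did not survive, so what follows is only a rough sketch of the idea, not the original code. It uses the Python 3 `subprocess.run` call (on 2.6 itself the same thing would go through `Popen` and `communicate()`). The helper name `run_filter` and the stand-in filter command are made up for illustration; with tidy installed, the call would presumably be `run_filter(["tidy", "-q"], s)`.

```python
import subprocess
import sys


def run_filter(command, text):
    """Feed text to the command's stdin and return whatever it writes to stdout."""
    result = subprocess.run(
        command,
        input=text,
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # work with str rather than bytes
        check=False,          # a filter may exit non-zero and still produce output
    )
    return result.stdout


# Stand-in filter so the sketch runs without tidy: upper-case everything on stdin.
upper_filter = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
output = run_filter(upper_filter, "<p>hello</p>")
print(output)
```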
If you have a quirky stdlib, this can be a big problem if its source is visible. This can be a tremendous source of bad examples to the community. This happened with the various Smalltalk images -- there was some rather good code, and then there was a lot of procedural/spaghetti code by junior programmers. A lot of it happened to get into places where it would be very visible to other junior programmers. (Database/OR tools.)
I think that a lot of this quirky stuff is due to the "uncool" factor of cleaning up such infrastructure level stuff. It's much cooler to be working on a proxy framework that talks to some new protocol. Making sure dates and times work well seems pedestrian. But it is actually tremendously important.
> Complaints around infelicities in the Python stdlib are fair enough - I don't think anyone would defend them, but what can you do? You can't change published APIs.
I was under the impression that Python 3 broke a lot of backwards compatibility, so could they not introduce new APIs with that? Or is that in fact what they did, and Zed is just complaining about 2.x?
I apologise if I'm mistaken on this; I've hardly used Python at all (though I keep meaning to learn it so I can finally stop writing bash scripts).
Python 3 removed old, deprecated APIs, but did not change any existing APIs. Probably the most extreme re-organization was the urllib/urllib2 merger, which was at its core just removing old stuff. Because Python is dynamically typed, the "protocol" of a library is extremely important, and the semantics of existing functions/classes are almost never changed.
Of Zed's complaints, I feel that two are misguided (time handling seems quite reasonable to me, and I've never seen an easier documentation system than Sphinx).
I think applying the "del" keyword to objects in a container is a mistake in the language, but that's how it was designed, so it probably wouldn't have been changed in the 2 -> 3 transition. It certainly won't be changed now -- the best that can be hoped for is for its use to be discouraged in the documentation.
Install/Uninstall of modules is fairly easy, but requires all concerned parties to work together. Otherwise you end up with half-installed, broken modules. There are PEPs in progress to work on this, but it's a social/political problem rather than a technical one.
Python's file-manipulation APIs are an ongoing calamity.
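One frequently cited instance, assuming it is the sort of thing meant here: `os.rmdir` refuses to remove a non-empty directory, while the recursive version lives in a different module as `shutil.rmtree`. The directory and file names below are made up for the example.

```python
import os
import shutil
import tempfile

# Build a throwaway directory containing one file.
root = tempfile.mkdtemp()
target = os.path.join(root, "project")
os.mkdir(target)
with open(os.path.join(target, "notes.txt"), "w") as handle:
    handle.write("not empty\n")

# os.rmdir only removes empty directories, so this attempt fails.
try:
    os.rmdir(target)
    rmdir_failed = False
except OSError:
    rmdir_failed = True

# The recursive removal is in shutil, not os.
shutil.rmtree(target)
still_exists = os.path.exists(target)

shutil.rmtree(root)  # clean up the temporary parent directory
print(rmdir_failed, still_exists)
```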
> Python 3 removed old, deprecated APIs, but did not change any existing APIs.
This is simply false. Modules were added, renamed, and modified. Module names now conform to PEP 8 (Queue -> queue, SocketServer -> socketserver, etc). cPickle, cProfile and cStringIO are now accessible through their non-c counterparts instead of individually. Packages have been grouped - for example the http libraries are now in http.* instead of the top level.
Examples of API changes include the removal of sys.maxint and sys.exitfunc and friends, many libraries returning unicode strings instead of byte strings by default, and lots more.
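A quick way to see a few of these on a Python 3 interpreter is sketched below. It only checks names mentioned above, and the variable names are made up for the example.

```python
import http.client   # the http libraries now live under the http package
import queue         # was Queue in Python 2
import socketserver  # was SocketServer in Python 2
import sys

# The renamed modules still expose the familiar classes.
renamed = [
    queue.Queue.__name__,
    socketserver.TCPServer.__name__,
    http.client.HTTPConnection.__name__,
]

# sys.maxint is gone; sys.maxsize is what remains.
has_maxint = hasattr(sys, "maxint")
has_maxsize = hasattr(sys, "maxsize")
print(renamed, has_maxint, has_maxsize)
```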
Do you have any examples of where the behavior of a library function/method was changed, without renaming that method? The only examples I can think of are the change to return lazy iterables from map(), filter(), etc, which won't affect most use cases.
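For reference, a minimal sketch of that `map()` change on Python 3. Code that only loops over the result once will not notice it, but code that indexes the result or iterates it twice will.

```python
numbers = [1, 2, 3]

# In Python 3, map() returns a lazy iterator rather than a list.
doubled = map(lambda n: n * 2, numbers)
is_list = isinstance(doubled, list)

first_pass = list(doubled)   # consumes the iterator
second_pass = list(doubled)  # nothing left the second time around
print(is_list, first_pass, second_pass)
```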
"API doc autogeneration; I can guarantee that if anyone came up with a default tool with default output such that "doctool module html_directory" was all you needed (and indeed people have) then there would be widespread, perfectly legitimate, disagreement on any number of choices in the design of it."
In my experience, the opposite actually happens. When something is written, people either use it and customize it over time, or they just find something else and say it didn't suit their needs.
It's when an extremely useful/general library is PROPOSED that it becomes a huge disagreement. This is, for instance, why C++'s Boost still doesn't have logging or a unified XML library, and won't likely have a garbage collection library before a C++ TR defines the interface: everyone argues over the color of the bikeshed, and the people who try to write the libraries get caught up in infinite microdiscussions over details.
Conversely, the success of someone like Linus Torvalds is that he just makes the things he needs/wants (ignoring bikeshed discussions), and people generally end up using his creations. When it comes to software design, a dictatorship is often more productive than a democracy!
"When I come across pages of JavaDoc-ish API documentation, I groan internally and just look at the source instead."
> a dictatorship is often more productive than a democracy!
"The only thing democracy has ever really been good for is making sure that things people don't like get vetoed. Democracy is no way to plan; it is merely a way to restrict one's choices, possibly to none at all."
"No easy_UNinstall" - I'm one of the lucky few who has access to an extensive and sensible archive of easily installable and uninstallable Python library packages; I call it "Ubuntu". From what I can see, distutils is a good metadata format that helps real packagers like Debian and Ubuntu create real packages - but trying to write a cross-platform installer is just asking for trouble. Just unpack the package somewhere and set $PYTHONPATH in your startup scripts; that's what your startup scripts are for, anyway.
"rm -rf" - my understanding is that the contents of the "os" stdlib package are a thin wrapper around POSIX, and rmdir(2) won't remove a non-empty directory either (although Zed says Python's rmdir will remove a directory with subdirectories, but not files... that's odd). "shutil.rmtree" isn't in POSIX, so it isn't in "os" either.
"Time Converstion" - Zed says "If all they did was give me the exact same POSIX C API I’d be happy.", but so far as I can tell, the 'time' stdlib module basically is the POSIX C API, with braindead awkwardness fully intact. I'll grant that "calendar" is stupid and "datetime" is crippled, although the third-party "dateutil" module fills a lot of the holes in "datetime". (also, for as long as I've known of it, mx.DateTime has been freely available)
"API Documentation Generation" - It's not in the standard library, but I've been quite happy with the third-party tool "epydoc" for Python API doc generation, and it pretty much is as easy as "epydoc path/to/package" (and predates Sphinx and a lot of the other tools).
I have to agree with the rest of his examples, though - Python's had a long, rocky road from "procedural Unix scripting language" to "Object-oriented, Internet service-providing language", and although Python 3.0 has cleaned up a lot of the cruft there's still some oddness that remains (like the len function, or the del keyword). A lot of the standard-library crud has come from people saying 'here, I've found/written an 80% solution to this particular problem, let's put it in the standard library since it's better than the solution that's in there at the moment' rather than 'here's a problem, let's design a 90% or 95% solution for the standard library'.
> A lot of the standard-library crud has come from people saying 'here, I've found/written an 80% solution to this particular problem, let's put it in the standard library since it's better than the solution that's in there at the moment' rather than 'here's a problem, let's design a 90% or 95% solution for the standard library'.
This is always a cultural/community issue. It would need a cultural/community fix. And, given what I've seen in other communities, it should be fixed, as there's a lot to be gained by doing so.
>> Then when you are told about it, you’d make up excuses trying to explain
>> why it is totally normal.
> That's a pretty incredible rhetorical device. "If you
> disagree, you're fundamentally incapable of reasoned
> thought."
No, that's not what Zed's saying at all. He's saying that people often respond to valid criticism of things they know well by justifying the behavior being criticised.
I.e., people mold their way of thinking to suit their existing tools. Since they're familiar with how things are, they have problems seeing what could be.
It's not an ad-hominem attack at all. Think of the academic who has no idea of how to teach because he knows his subject so well he can no longer see it from the perspective of an outsider.
Except, do you doubt its truth? Humans are rationalization engines. Communities will go to great lengths to prove that everything about their culture is good.
I not only doubt its truth: I know for sure it's false.
If I disagree with someone, then I often still grant that their opinion is the result of reasoned thought. In such a case it's the underlying assumptions that we disagree on and those assumptions are usually not amenable to reasoning.
A programming example is bracing styles: I have my preferred one and a colleague has his preferred one. I have my arguments and he has his arguments. In the end, it comes down to what each of us considers to be 'best readable', which is an entirely irrational consideration. We get along very well, despite our differences (and of course, for each project, we settle upon a style, based on exterior considerations like: what would be consistent with this client's codebase).
Another example is religion: I'm a staunch atheist and my girlfriend is a Christian. 'nuff said.
That's not an example of what he's talking about. He's talking about people creating reasons to explain away a real problem in a way that preserves reputation or personal peace-of-mind.
What he's talking about is the pattern you get when you speak to an Alzheimer's sufferer who is trying to pretend they have nothing wrong. They can be extremely convincing to you and themselves for a while making up excuses for why things are the way they are. Eventually they exhaust (rapidly, if you make them reload context) and the facade falls away.
Programmers do this stuff all the time. "Oh it has to be this way because ... [some bullshit reason that doesn't explain why a customer has a reasonable objection to the software cracking its head doing a double backflip when they asked it to step forward]". Is it reasonable to the educated impartial observer that a piece of enterprise software should crap itself on a null pointer exception and start silently failing? Not remotely. Will you find programmers defending their software when it exhibits such behaviour? All the time.
This is exactly why people who learn multiple programming languages/paradigms are so valuable to these types of conversations. The problem is that reasoned and experienced thought gets thrown out with the speculative garbage.
Enough said. If you bloody well knew what I typed, just exit FFS.
Python's design is full of shortsighted, dogmatic decisions like this - don't even get me started with functions versus methods and __len__-esque line noise, argggg! And there should only be one obvious way to do something? Give us a break - that's just not how life, mathematics, or anything should work.
Explicit function calling via () is one of my favorite things in the language. To each their own.
"exit" merely references a function. It doesn't call by name alone. It can thus be passed to a function (a callback for example), saved to a variable, etc for later calling. Maybe not so useful for the exit function in the interactive shell but very useful for functional programming. This is an example of consistency, sometimes at the expense of convenience; not great for hacking, awesome for collaboration and large projects (imo).
But how does "exit" by itself print that pretty little message telling you that, although it's obvious you wanted to exit, it's not going to let you? It's not just a function reference if it shows up that way.
In CPython, exit is actually an object of type site.Quitter (the actual class is implementation dependent). site.Quitter overrides the __repr__ special method (which is called by the interpreter on any expression typed in the shell to print the result). site.Quitter also overrides the __call__ special method, so when an object of type site.Quitter is called, the overridden __call__ method invokes system exit.
>>> exit_class = type(exit) #gets a reference to the class
>>> my_exit = exit_class('bye') #the arg is used to print the message
>>> my_exit
Use bye() or Ctrl-D (i.e. EOF) to exit
>>> my_exit()
<python shell exits>
Minor inconsistency: typing "bye()" doesn't work so technically the message is incorrect. But I suppose they don't want you to be hacking exit() in the first place.
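To make the mechanism concrete without touching the real `exit` object, here is a minimal stand-alone sketch of the same pattern. The class below is a made-up imitation, not the actual `site` implementation: `__repr__` supplies the message the shell prints, and `__call__` raises `SystemExit`.

```python
class Quitter:
    """Imitation of the shell's exit object: prints a hint, exits only when called."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # The interactive shell shows repr() of any bare expression.
        return "Use %s() or Ctrl-D (i.e. EOF) to exit" % self.name

    def __call__(self, code=None):
        # Calling the object is what actually ends the interpreter.
        raise SystemExit(code)


bye = Quitter("bye")
message = repr(bye)  # what typing "bye" at the prompt would display

# Catch SystemExit here only so the sketch can report the exit code.
try:
    bye(0)
except SystemExit as exc:
    exit_code = exc.code

print(message, exit_code)
```

Because `bye` is an ordinary object, it can still be stored or passed around as a callback, which is the consistency argument made earlier in the thread.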
"exit" merely references a function. It doesn't call by name alone.
His point was that the user already typed in "exit", and the interpreter recognized the user's intent to exit, so it should just shut up and exit already, instead of telling the user to exit in a different way.
Life was just the first item on my list. I also said mathematics. There's plenty of precedent for programming languages to "mimic" mathematics due to being a branch of mathematics, after all. Having only one obvious way to do something is religious dogma unmatched by any similarly practical activity I can think of right now.
Yes, the Python APIs are very inconsistent, but I disagree about the del statement. That's a different matter and the way Zed looks at it is very much the kind of hemispatial neglect that tends to befall people who are immersed too deeply in the OO paradigm.
del x unbinds x from whatever namespace it resolves to. How would you say c.remove(x) if you don't know which collection c represents? Now, of course you could argue that del x is too different from removing something from a collection to have it use the same syntax. But that's not about neglect. It's just a different opinion about what is consistent.
The reasoning is probably that removing a variable should always be done in one way and that is del x no matter if x is inside a collection we know or not. It's a very conscious attempt to make it consistent and that's not neglect. Whether it's a good idea I don't know.
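A small sketch of that uniformity: the same statement unbinds a plain name, a list slot, a dictionary key, and an attribute. The names (`items`, `mapping`, `Box`) are made up for the example.

```python
x = 42
del x  # unbinds the name x from its namespace
try:
    x
    name_gone = False
except NameError:
    name_gone = True

items = ["a", "b", "c"]
del items[1]  # removes the slot at index 1

mapping = {"k": 1}
del mapping["k"]  # removes the key

class Box:
    pass

box = Box()
box.value = 1
del box.value  # removes the attribute

print(name_gone, items, mapping, hasattr(box, "value"))
```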
Maybe a global del function would be better. If x is inside c and we know what c is we would say c.del(x) and if we want c to be resolved we say del(x). But is that really so different from what we have now?
More generally, my opinion is that the way functions are used in OO is in itself very inconsistent. If you have a function f and variables x, y, and z, it's mostly a consideration of implementation details that leads to a decision of whether it should be x.f(y, z), y.f(x, z) or z.f(x, y). Why does the user of an API have to think about or remember which one it is?
That's one of the problems: Inline blocks are fundamentally incompatible with significant whitespace, or at least no one has thought of a good way to combine the two.
I think I misunderstood you before. I thought you were talking about putting multiple statements on a line.
But Haskell can pass around code blocks as well, with or without significant whitespace.
For example, I can define a function called 'forever' to run a sequence of actions forever like so:
forever x = x >> forever x
Where 'x' is any sequence of actions. You can pronounce '>>' as "followed by". (These are monadic actions, actually, but never mind that.)
I can invoke it like so:
myFunc = forever (do {this; that; other})
or:
myFunc = forever $ do
    this
    that
    other
The dollar sign is a function that lets you dispense with some parentheses. Think of it like an opening paren that closes at the end of the subexpression it's in.
If the block is a function rather than a sequence of actions, Haskell uses '\' as a lambda for introducing anonymous functions.
For example, the function 'map' takes a unary function and calls it with each value from a list. Below I'll call it with an anonymous function which returns double its argument:
How about when the function is being passed as an argument to a function? Does the ')' just go on the next line by itself? What if it's not the last argument? There are many opportunities for ambiguity and/or ugliness.
What I love is when I want to install a single ruby gem and I get to sit around waiting for a half hour while it fetches seventeen unrelated things from github. For extra special sauce, let github be down.
I hate that when criticised, people demand that the criticizer fix the problem they're pointing out. I realize your request is less strong than that, but fixing "your" software is not "my" problem. Especially when you consider that nearly every piece of software I've ever used has bugs, and most of the bug reports I submit are ignored.
I'm sorry; No. I'm not demanding he fix it; I'm asking he do something more constructive than nothing. Python isn't "my" software - it's "our" software. Zed's smart enough to contribute and help fix things - I hold him to a higher standard than someone just walking in off the street.
Python-core does not ignore bug reports, especially ones which really fix things, and come with a patch. Asking him to help isn't out of bounds.
Though I agree these things are best addressed patch in hand, it's worthwhile pointing out that there are design issues that people do need to be aware of to prevent future occurrences of the idiosyncrasies he is mentioning.
Yet, here’s what you have to do for Sphinx which is an insane amount of work for something that JavaDoc, POD, Doxygen, RubyDoc
Nit: Sphinx is a system for long-form, separately written documentation. Like manuals and tutorials; think stuff that requires indices. That's why autodoc is an extension, why you usually set up a separate directory, why you have the flexible build system, and so on. (Although all you have to do is create a file and type `automodule` and then the module name.)
Is that intuitively the orthogonal method to append()? Not in my view. If I appended something, I'd want to remove it or delete it, not pop it. If I push'ed, then popping would be obvious.
It's not just that they don't overlap; it's that they don't even really relate or affect each other. In other words, one is pointing North and the other West. They aren't opposites; just different.
Both of you are correct. In programming languages / the IT world, object A is orthogonal to object B if object A can be used without thinking about the potential consequences to object B.
In mathematics, orthogonal means perpendicular in a geometric-sense (think two vectors) but can also be used in other contexts with a different meaning.
Even more generally, two vectors are orthogonal if their inner product is zero.
This is why the word "perpendicular" is not used, as sometimes your vectors don't really have "directions" in the intuitive sense of the word (e.g. the inner product space of functions).
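A rough numerical sketch of both cases: two plain vectors pointing North and West, and two functions treated as vectors under the integral inner product on [0, 2π]. The helper names and the midpoint-rule approximation are choices made for this example only.

```python
import math


def dot(u, v):
    """Inner product of two finite-dimensional vectors."""
    return sum(a * b for a, b in zip(u, v))


def inner_product(f, g, lo, hi, steps=1000):
    """Approximate the integral of f(x) * g(x) over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += f(x) * g(x)
    return total * width


north_west = dot((0, 1), (-1, 0))  # North and West
sin_cos = inner_product(math.sin, math.cos, 0.0, 2 * math.pi)
sin_sin = inner_product(math.sin, math.sin, 0.0, 2 * math.pi)
print(north_west, sin_cos, sin_sin)
```

Here sin and cos come out orthogonal (inner product near zero) even though neither has a "direction", while sin with itself does not.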
I think that comes from the ideas behind vector decomposition (remember physics 101? ;-)).
When you decompose a goal into several linearly independent sub-goals - those sub-goals are said to be 'orthogonal' since they don't have any interaction/interdependence with one another.
Orthogonal is an appropriate synonym for complementary in a lot of cases. It wasn't an appropriate usage here though. Features complement each other by not overlapping, hence being orthogonal, which is probably the only usage that most programmers hear and the source of the confusion.
pop(i) or del a[i] removes the item at i; insert(i, x) inserts x before the item at i; append(x) adds x after the last item (which insert can only do when given len(a) as the index). remove(x) is unordered as it operates on the item not the index, so the opposite of remove should be an unordered add (which exists in an unordered set but not in a list). The rest of his points are more valid.
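Spelled out as a sketch, with made-up values, the index-based and value-based operations pair up like this:

```python
a = ["x", "y", "z"]

popped = a.pop(1)   # index-based removal that also returns the item
del a[0]            # index-based removal that returns nothing
a.insert(0, "w")    # index-based insertion before position 0
a.append("q")       # adds after the last item
a.remove("w")       # value-based removal: no index involved

# The value-based counterpart of remove() exists on sets, not lists.
s = {"q", "z"}
s.add("w")
s.remove("w")

print(popped, a, sorted(s))
```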
Coming from a Java-background, I've often found myself trying to call list.add(). I think that would be an even more natural complement to list.remove().
I think the del statement as in "get rid of this" is perfectly fine. There is no need to pollute the namespace of lists (or anything, BTW) with a method to duplicate an already existing, and very fundamental, language feature.
The one that always bugged me but seems to be true for most dynamic languages is immutable types being passed by value and everything else passed by reference.
Speed? Not sure if this is actually what the OP was thinking of, but pass-by-value usually implies copying, while pass by reference just involves handing around a pointer.
I.e., if you had an enormous immutable value (in Python, a tuple with lots of entries), then in a hypothetical Python implementation which did call-by-value, it might be significantly slower than call-by-reference, on account of having to copy the entire tuple.
But this falls squarely into 'implementation detail' - if you're so inclined, you could implement either CBV or CBR in hundreds of other ways. Certainly from the perspective of time-independent program behaviour you shouldn't be able to tell the difference.
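For what it's worth, a small sketch of what CPython actually does: every argument, mutable or not, arrives as a reference to the same object. Immutables merely look like they were copied because nothing can change them in place. The function names are made up for the example.

```python
def same_object(arg, original_id):
    """Report whether the callee received the very object the caller holds."""
    return id(arg) == original_id


def mutate(seq):
    seq.append(4)    # changes the shared object, so the caller sees it


def rebind(seq):
    seq = seq + [5]  # rebinds the local name only; the caller's list is untouched


big = tuple(range(100000))
tuple_shared = same_object(big, id(big))  # no copy is made, even for an immutable

numbers = [1, 2, 3]
mutate(numbers)
rebind(numbers)
print(tuple_shared, numbers)
```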
I've ranted on this subject to my coworkers many times in the past. Python's API is full of missing inverses, missing analogues, and similar things with oddly different shapes.
mystuff[4] = 'apple' doesn't add an 'apple', it overwrites an existing element in a list of at least 5 elements, and del mystuff[4] actually does change the length of the list. They aren't symmetric at all.
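The asymmetry is easy to check in a short sketch, here with a made-up five-element list:

```python
mystuff = ["a", "b", "c", "d", "e"]

mystuff[4] = "apple"             # overwrites the last slot
len_after_assign = len(mystuff)  # length is unchanged

del mystuff[4]                   # removes the slot entirely
len_after_del = len(mystuff)     # length has shrunk by one

# Assigning to the same index again does not put the slot back.
try:
    mystuff[4] = "apple"
    assign_failed = False
except IndexError:
    assign_failed = True

print(len_after_assign, len_after_del, assign_failed)
```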
"A normal person will eat everything in front of them, but a person with neglect will happily eat only the things on the left side of their body (emphasis added)."
It appears that this condition is more serious than we originally thought.