Myths of Enterprise Python (paypal-engineering.com)
301 points by rbanffy on March 24, 2015 | 250 comments



This is a fairly weak article, full of deflections around the real weaknesses of Python. I'm surprised it is on the PayPal Engineering blog.

Like all languages, Python has strengths and weaknesses, and there is no shame in that. An honest article would address the negatives head-on, acknowledge them as potential negatives (not skip around them), and provide alternatives.

The strawman "Python has a weak type system" is a good example of such deflection.

No one (who understands type systems) would complain that "Python has a weak type system".

A more common "criticism" would be "Python does not have a static type system, which would be handy for large codebases worked on by large teams".

Address that upfront, and you have a decent article. This is just a fanboy (who happens to be at PayPal) praising his favorite language and ignoring its weaknesses.


The keyword here is "Enterprise" Python. The Enterprise is a place where you have "decision makers", who are often people who don't understand type systems and have absolutely no incentive to do so.

Paypal is a large organization which needs to hire Python programmers. Everyone reading this now has a reminder that Python is popular at Paypal. We also know that endless PLT arguments are probably not common in the workplace.


Fortunately, there are other large organisations employing decision-makers who understand the business value of type systems.

Apple: https://developer.apple.com/swift

Facebook: http://flowtype.org + http://hacklang.org + http://hhvm.com

Microsoft: http://fsharp.org + https://haskell.org + http://typescriptlang.org

Mozilla: http://rust-lang.org

Google: The jury is out on this one.


Google, Apple, Facebook are the tech companies.

Even though they are enterprises, the term enterprise software often refers to the other companies, which use tech as an aid to run their main businesses.

Think about Box, MobileIron, and many such companies targeted at enterprises, and about who their consumers are. That is the enterprise software market.


Can you clarify what you mean about Google?


For one, both Dart and Go repeat Hoare’s Billion Dollar Mistake, by admitting null references instead of implementing option types.

In a previous HN discussion, I wrote to one of the Go designers:

> I disagree with you on the relative complexity of type systems, and, as someone passionate about my craft, I despair that Go is merely good, not great, due to what appear to be uninformed design decisions.

> You may prefer to write in Go, but you don't work in a vacuum. Your creation is out there, gaining mindshare, and propagating mistakes made half a century ago. As a language designer, you have the power to shape human thought for years to come, and the responsibility to remain intellectually honest. This is the meaning of Hoare's apology. He didn't know better. You have no such excuse.

https://news.ycombinator.com/item?id=5822545


Well said. Your comment on that link was quite insightful as well. I'll quote it here below:

"So, Go has message passing. Just like Erlang, since 1986.

Only without per-process heaps, which make crashing processes safe. And without process linking and supervision, which helps systems built using Erlang/OTP achieve nine nines of uptime. Instead, it includes null references, also known as Hoare's Billion Dollar Mistake.

But it's not enough to scorn the industry; Go's designers also look down their noses at academia.

A modern, ML-derived static type system? Generics, which would enable the unwashed masses to write their own `append`? Ain't nobody got time for that — wait, what? Oh, Rust does?

Go's tooling is fantastic, and its pragmatism is commendable, but ignoring the last 30 years of programming language research is not."


I think in both cases, they are allowing pragmatism to win out over correctness. The underlying systems that Dart & Go try to interface with embrace NULL with a kind of oblivious joy that can't be replicated.


I find the idea that optionals aren't pragmatic pretty bizarre. Of all the features that "advanced" type systems give you, optionals are some of the most straightforward. I still haven't heard a convincing argument for why they should not be in Go (e.g., someone pointed out that all types in Go have a default value, but there is a very obvious candidate for a default value for optionals--None; "in practice these issues don't come up"--doesn't that just mean people don't use null much? Why allow it to inhabit every type, then?).
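To make the default-value point concrete, here's a tiny sketch in Python (the thread's language; Some/Nothing are hypothetical stand-ins for a built-in option type). The natural Go-style zero value falls out for free:

    class Some(object):
        def __init__(self, value):
            self.value = value

    class Nothing(object):
        pass

    # the obvious default ("zero value") for any Option[T]:
    DEFAULT = Nothing()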

Anyway, neither Dart nor Go is particularly "close to the hardware" as they both have fairly substantial runtimes. We're not talking about a macro assembler here, we're talking about a typed programming language. What the compiler does under the hood is largely irrelevant in any language that doesn't make a distinction between register, stack, and heap allocation.


It’s an example of the “New Jersey approach” to programming language design.

> Simplicity-the design must be simple, both in implementation and interface. It is more important for the implementation to be simple than the interface. Simplicity is the most important consideration in a design.

http://www.jwz.org/doc/worse-is-better.html


If the underlying interface (JavaScript or POSIX as the case may be) is using NULL's up the wazoo, then you need a way to represent that. You can try to represent it as optionals, but that creates impedance, which ultimately may cost you more complexity (and performance).


It doesn't have to cost performance (not even compiler performance!) or create impedance. You just represent the None variant as null in the compiled output. I realize this sounds suspiciously easy and it seems like there must be something I'm glossing over, but there honestly isn't, as long as you never introduce null to your own language in the first place. Once you've done that, though, it gets harder, because you have to differentiate between Some(null) and None, which means they can't both be compiled to null. But this is a completely self-imposed problem; it is not an issue when you create a language from scratch, only if you want to retrofit optionals onto an existing language.
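A rough sketch of the erasure being described, in Python for illustration (Some and NOTHING are hypothetical stand-ins for the source language's option type):

    class Some(object):
        def __init__(self, value):
            self.value = value

    NOTHING = object()  # the None variant

    def erase(opt):
        # compile-time erasure: Some(x) -> x, the None variant -> null
        return opt.value if isinstance(opt, Some) else None

    # This is reversible as long as x itself can never be null. But in a
    # language that already has null, erase(Some(None)) == erase(NOTHING),
    # so the two values would need a tag -- the retrofitting problem above.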


> I realize this sounds suspiciously easy and it seems like there must be something I'm glossing over, but there honestly isn't, as long as you never introduce null to your own language in the first place.

Yeah, having done this before, it isn't that easy. You basically have to map NULL to something else, and if that mapping is so direct and straightforward, you actually haven't improved your engineering one bit.


It is literally that easy. This is how it is done in Rust, for instance. You have improved your engineering by (1) requiring exhaustive match on anything that is potentially null, and (2) eliminating the need to check for null anywhere else. I don't understand why people take it as an article of faith that this must be difficult. In fact, the sheer simplicity of it is why I believe it should be in Go, and am confused about why it is not.



Null is much less of a problem than having no static type system at all.

A "billion dollar" mistake is a mistake Google can affod to make :-/


Most likely a quip about how Go's type system is uninspiring.


Ironically, Go being uninspiring is its greatest strength.


Or the criticisms of Python that he has encountered are just different from your criticisms.

I have not seen anyone complain that Python is not compiled for at least a decade, but maybe the author has.


I always read "not compiled" as a synonym for "slow."

I'm not worried about Python's type system. At worst it's manageable, at best it's expressive for prototyping.

But when I see benchmarks that suggest Python is 10 to 100X slower for critical server code, I have to wonder why anyone would use it for enterprise development.

Which is why there are so many Java and C++ code jockeys working in enterprise. Neither language is pretty or fun or interesting from a CS point of view. But there's no arguing both consistently run faster than anything this side of assembler.

I would have expected critical industrial infrastructure code to pay some attention to that - because speed isn't an abstraction. When you're running giant data centres, extra cycles consistently cost real money.

Dev costs are relatively small compared to operating costs. So it's well worth spending extra time getting good, fast compiled code working.


PyPy really closes the speed gap. The latest PyPy 2.5 release uses pinning to pass pointers between the C layer and PyPy, greatly improving IO performance [1]. I've noticed this in a project I've been working on holding open large amounts of concurrent connections (> 100k at a time); PyPy has been completely competitive with Go, and actually uses less memory. Yes, it's not as fast as a Java/C++ version, but with PyPy it's more like 2-5x slower, not 10-100x slower, which really changes things.

[1] http://morepypy.blogspot.com/2015/02/pypy-250-released.html


>But when I see benchmarks that suggest Python is 10 to 100X slower for critical server code, I have to wonder why anyone would use it for enterprise development.

Because CPU cycles are cheap and bugs are not.

>Dev costs are relatively small compared to operating costs.

Uhh, not in my experience.


It really just depends on the project. Sometimes, all you want to do is, like, take data, type-check it, maybe do a couple of simple transforms, and then store it. But you want to do it 50,000 times per second.

In that case, it may very well be the case that ops costs absolutely dwarf dev costs.

Similarly, it may be that what you want is to take data and run it through super-complicated algorithms depending on a lot of business data, and massage it all over the place... but you only need to do this 10 times per second. In which case your dev costs may absolutely dwarf your ops costs.


>> Dev costs are relatively small compared to operating costs.

> Uhh, not in my experience.

It's probably not possible to make a true statement out of context about which costs less. This depends quite heavily on what you're doing.


Python being really fast to code means that the biggest optimizations (those that'll give you a million times or more improvement in performance) are fast and cheap to implement. If performance is still that relevant after that, you can always replace parts of the code, with the biggest optimizations already in... And replacing code that already solves a problem tends to be much safer than solving it in "development time intensive" languages.

C++ does have its place (I'm not convinced about Java), but starting a project in it that you could just as well build in Python, purely because of performance, isn't a good policy.


On my latest project, I've spent about 5k in operating costs, and about 100k in dev costs, and that is just one example. I really don't know where you're coming from when you say dev costs are relatively small compared to operating costs...


In much enterprise development (indeed much of development in general) development time is much more important than CPU time. Not just because of price, but also in reaction time to new and changing requirements.


> But there's no arguing both consistently run faster than anything this side of assembler.

FORTRAN is faster than C++ and Java but you don't see it used much in enterprise anymore...


> But when I see benchmarks that suggest Python is 10 to 100X slower for critical server code, I have to wonder why anyone would use it for enterprise development.

Because not all enterprise development relates to critical server code?

> Neither language is pretty or fun or interesting from a CS point of view.

I find working with C++ brings huge amounts of fun, and I do that almost daily :P But maybe I don't have a CS view (not sure what that is supposed to be?)


Oh sure. I hear complaints all the time about Python not catching things you'd sure wish a compiler would catch. You end up just writing more unit tests to compensate, but that annoys people.


Some people - who do understand type systems - only think of static type systems as type systems. And in that regard they might look at "dynamically typed languages" as having "weak type systems", since they are often just unityped (only have one type).


Unfortunately, the words “strong” and “weak” often don’t mean anything more than:

> Strong typing: A type system that I like and feel comfortable with

> Weak typing: A type system that worries me, or makes me feel uncomfortable

http://web.archive.org/web/20091227121956/http://www.pphsg.o...


I agree to a degree, though some things are at least confined to certain dimensions, like strong/weak typing. A continuum more than a binary distinction, but at least you can say things like "stronger" and "weaker" than something else.

(Note that I wrote "weak type system", not "weak typing". "Weak" here is a just a generic adjective, and not meant to be precise.)


If you want me to trust what you say about a language — or any technology, actually — be forthright about its deficiencies.

Because I don't believe you really, truly understand a language until you can tell me what sucks about it. It takes significant time (in a reasonably decent language) to discover the corner cases, performance bottlenecks, quirks, big-deficiencies-hidden-in-plain-sight and outright bugs in a language like, say, Python.

Something as "rah-rah" this, which goes so far as to basically call the GIL a source of unicorns and rainbows, is convincing almost in inverse proportion to its stridency. Suddenly I'm wondering if all these "myths" about python might be smoke pointing toward a fire. That's probably a bit unfair, but it's hard to know what to take seriously when you're listening to a voice that's less than credible.


I've had a chance to use Python several times in various projects.

The language really excelled at small projects, where the expressiveness of the language and the excellent libraries available really let us get a lot done with little code and time. Avoiding the recompile/redeploy steps also helped.

But using the language for larger projects was quite a different thing. Suddenly you could find yourself in code several layers down, being passed an object of God knows what type from unfamiliar code, and having to find out what you were receiving essentially by trial and error. And once the execution times started to climb, it became increasingly frustrating to see trivial problems that would have been caught during compilation in a statically typed language appear in Python only after 30-60 minutes of execution.

There are surely ways to alleviate these specific problems -- I can think of some things to try myself -- but my experience suggests that the sweet spot for dynamic languages like Python is in projects that fit between one pair of ears, and stricter statically typed languages become increasingly useful as things scale up.

(Hardly a radical position, I know.)


>Something as "rah-rah" this, which goes so far as to basically call the GIL a source of unicorns and rainbows

The GIL is definitely no source of unicorns and rainbows, but I think the case against it is usually overstated. There are numerous ways of sidestepping it (multiprocessing, pypy, C extensions, etc.), and it does serve a useful purpose.
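For instance, CPU-bound work can be farmed out with the standard-library multiprocessing module, which gives each worker its own interpreter and its own GIL. A minimal sketch:

    from multiprocessing import Pool

    def cpu_bound(n):
        # each worker is a separate process with its own interpreter and GIL
        return sum(i * i for i in range(n))

    if __name__ == '__main__':
        pool = Pool(4)
        print(pool.map(cpu_bound, [10 ** 6] * 8))
        pool.close()
        pool.join()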


>> The GIL is definitely no source of unicorns and rainbows, but I think the case against it is usually overstated. There are numerous ways of sidestepping it (multiprocessing, pypy, C extensions, etc.), and it does serve a useful purpose.

That's definitely all true, but the article brushes its implications for multithreaded Python off as if they don't exist, which is what one of the posters above me was probably referring to.

Yes, you can do multiprocessing, but if I have lots of shared, volatile state that's not what I want. Yes, you can use PyPy, but if it doesn't work for some Python framework I use, or if I can't control my deployment environment and it only has CPython, I can't use PyPy. Obviously you can write C extensions for about any language; using that as an argument for why the GIL is not a problem for multithreaded Python is disingenuous. Maybe I don't know or don't like to program in C? Green threads are not a substitute for multi-threading either, as they still don't allow full utilization of multiple cores and are really only a solution for I/O-bound processing.

Of the 'numerous ways to sidestep the GIL', none is satisfactory if you have a CPU-bound problem operating on shared state that lends itself well to parallel execution, and there are many such problems. I wouldn't use Python to write a video codec or to do DNA sequence processing, for example. It's not a fatal flaw for Python-as-a-language, but it's a flaw of CPython nonetheless, and not an insignificant one.
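To make the CPU-bound case concrete, the classic experiment looks like this; on CPython the threaded version is typically no faster than running the two calls serially, because the GIL serializes the bytecode:

    import threading

    def countdown(n):
        while n > 0:
            n -= 1

    # serial: countdown(20000000); countdown(20000000)
    # "parallel" on CPython: same work, but the GIL serializes it
    t1 = threading.Thread(target=countdown, args=(20000000,))
    t2 = threading.Thread(target=countdown, args=(20000000,))
    t1.start(); t2.start()
    t1.join(); t2.join()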


I wouldn't write a video codec in python either, nor would I write code that requires a huge amount of shared volatile state, but these things are not common coding tasks in general, particularly not in enterprisey-type programming.

> Maybe I don't know or don't like to program in C?

If you want to write a video codec or highly performant multithreaded code, you should probably give it a go.

>Of the 'numerous ways to sidestep the GIL' none are satisfactory if you have a CPU bound problem operating on shared state, that lends itself well to parallel execution, which are many.

You mean exactly like the matrix calculations done in the C extensions of numpy?
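Those routines release the GIL while they run, so even plain OS threads can keep several cores busy. A rough sketch, assuming a BLAS-backed numpy build:

    import threading
    import numpy as np

    a = np.random.rand(2000, 2000)
    b = np.random.rand(2000, 2000)

    def worker():
        # the matrix multiply runs in C/BLAS and releases the GIL
        np.dot(a, b)

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()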


Also numba. It is a multithreaded JIT compiler for a subset of python code. Just need a decorator.
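A hedged sketch of what that looks like (numba's nogil flag compiles the function to native code that runs without holding the GIL, so ordinary threads can execute it in parallel):

    import numba

    @numba.jit(nopython=True, nogil=True)
    def total(arr):
        # compiled to machine code; does not hold the GIL while running
        s = 0.0
        for x in arr:
            s += x
        return s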


PyPy also has a GIL. Their proposal to get rid of it (PyPy STM) is far from being ready for production and comes with a pretty important overhead: http://pypy.readthedocs.org/en/latest/stm.html#introduction

(but it's a really promising project that all Python developers should follow!)


You can deal with the GIL, it's not the death sentence some people think it is, I agree.

But this guy actually DOES try to argue it's a _positive_ somehow! He really is claiming it's all unicorns and rainbows. He doesn't mention _any_ negative to it, he says it doesn't affect concurrency at all, look at generators and deferreds, that proves it (not true, a misdirection), and he even tries to say it's a positive: the "GIL is a performance optimization for most use cases of Python, and a development ease optimization for virtually all CPython code. The GIL makes it much easier to use OS threads"

I think either this is an intellectually dishonest argument meant to confuse less-technical people, or the guy doesn't understand what he's talking about.

And I still think python is probably fine for 'enterprise', although as always it depends on what you're doing with it!


The GIL absolutely makes it easier to use OS threads in terms of development ease, but the "performance optimization" piece I can't really see, except insofar as the development ease aspect means that developers are less likely to avoid using threads where they would provide a performance benefit.


"The GIL absolutely makes it easier to use OS threads in terms of development ease"

Yes, but pretending that you can have the ease of not worrying about thread safety while still having the performance benefits of multi-threading (real OS threads, mind you) is ridiculous. You can't have your cake and eat it too.


You can have the ease with some of the performance benefits with the GIL (since the GIL is released in certain low level operations) -- because, for the things where you get the performance benefits, someone else has dealt with the pain of not having the ease.


I think the performance comment is in regard to the single-threaded common case. The reason the GIL is still around is that Guido has said that a replacement is only acceptable if it avoids performance regressions for single-threaded code. Most of the Python world seems to have concluded that that's not possible in practice.


This is one of my favorite questions when interviewing software developers: "What is your favorite language and why?" Followed by: "Tell me what you would change about it if you could."

It's a good fanboy filter.


I would just answer "Nothing."

Because, really, it isn't just because a language has its deficiencies that I would want to change it. That would probably result in a new language, which is not desirable. Usually a language is what it is because of its pros, which unfortunately happen to create the cons. If you remove the cons you would probably also lose some of the pros.

So, I think a better question would be: "And what wouldn't you use [language] for? Why?" Or, if you want to sound cool, "What do you hate about it?"


> If you remove the cons you would probably also lose some of the pros.

Especially true of Lisps, of course.


Not exactly. I use Guava's Optional frequently when writing Java. It really helps take care of the null problem. It doesn't change the language significantly. However, I really wish it was part of the language instead of a library. Other libraries typically don't return Optionals which forces me to add even more lines of code protecting against nulls.

edit: s/Java/Android Java/, unfortunately the spat between Oracle & Google has prevented most of Java 7 & all of Java 8 from being incorporated into Android Java, which is what I work with on a daily basis.



It's in Java 8. java.util.Optional<T>

Still a library but at least it's part of the standard library.


Optionals are part of Java 8.


There are some cases that won't change the language. Python can easily add better support for static analysis without changing the language at all. I shouldn't have to do this:

""" :type member_list: list of [string] """

To have that hinting. I know this is getting better, but it isn't to the point of being very useful.
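For reference, the direction it's heading: PEP 484-style annotations put the same information where tools can read it without parsing docstrings. A sketch (add_members is a made-up example):

    from typing import List

    def add_members(member_list: List[str]) -> None:
        for member in member_list:
            print(member)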


Meh.... I think he's right that there's no good reason _not_ to use Python.

But I think he overstates the case.

Calling Python "compiled" doesn't seem right -- he's more right that this doesn't matter, there's no reason not to use something because it's "not compiled".

Calling Python's typing "strong typing", eh, really? He provides a link to a Wikipedia article on typing in general; how about a link to anyone else making the case that Python should be considered 'strongly typed', for a particular definition of 'strongly typed'? I'm dubious that it would be any definition of 'strongly typed' that those who think they want it... want. Again, I think a better argument might be: So what, you don't need strong typing for success, because of a, b and c.

And the concurrency discussion is just silly. Sure, you can do _all sorts_ of things without multi-core concurrency; you may not need it or even benefit from it, and scaling to multi-process can sometimes work just fine too. But there are also _all sorts_ of cases where you do really want multi-core parallelism, and Python really doesn't have it -- and trying to justify this as "makes it much easier to use OS threads" is just silly. I guess it makes it 'easier' in that it protects you from some but not all race conditions in fairly unpredictable ways -- but only by eliminating half the conceptual use cases for threads in the first place (still good for IO-bound work, no longer good for CPU-bound work).

I think there's a reasonable argument to be made that Python will work just fine for 'enterprise' work. There are also, surely, like for all languages/platforms, cases where it would be unsuitable. By overstating the case with some half-truths and confusions, he just sounds like a fanboy and does not help increase understanding or accurate decision-making -- or confidence in the Python community's understanding!


I can think of three ways of thinking about the term "strongly typed" once you realize it doesn't mean "statically typed".

1) It's meaningless--it's almost impossible to produce a real ranking of languages on the strength of their types.

2) These dynamic languages are all unityped because they're not statically typed. The fact that runtime tags won't allow certain operations to succeed has nothing to do with types.

3) Strongly typed is a rough and ready way of saying you don't do many implicit conversions.

I sympathize with (1). (2) is accurate, but really a terminological point. There's clearly a phenomenon to talk about, even if "typed" is the wrong word. I also sympathize with (3).

Python does far fewer implicit conversions than PHP, JavaScript or Perl. It even doesn't do at least one that Java does. So people often say Python is strongly, but dynamically typed. It's a real aspect of the language that you can view as a plus or minus.


  >>> 1 + "0"
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: unsupported operand type(s) for +: 'int' and 'str'
This is what strong typing refers to.


It's not so clearcut. If this is evidence Python is strongly typed, would the following be evidence to the contrary?

  In [1]: 1 + 2.5
  Out[1]: 3.5

  In [2]: "foo" * 3
  Out[2]: 'foofoofoo'

  In [3]: [1, 2] * 3
  Out[3]: [1, 2, 1, 2, 1, 2]


It's not contrary, it's just convenience. One of the weaknesses in Python (if you can call it that) is that everything is an "object." The classes implementing string, list, and integer simply have methods that respond to those operators and rhs types.

  In [1]: "foo" + 3
  ---------------------------------------------------------------------------
  TypeError                                 Traceback (most recent call last)
  <ipython-input-1-21582e79f06e> in <module>()
  ----> 1 "foo" + 3

  TypeError: cannot concatenate 'str' and 'int' objects
That's because the "special" method "__add__" implemented by the string class returns NotImplemented for any object that isn't an instance of a subclass of string, and the interpreter then raises TypeError.

In a way it is kind of funny to call it, "strongly typed," but it does work. Maybe it should be called, "instancely-typed."


Okay, with this definition of 'strongly typed', can anyone come up with an example language that _isn't_ strongly typed?

It seems to make 'strongly typed' pretty meaningless, and this definition is probably _not_ what anyone who says they want a "strongly typed" language is using, so it hardly counters them to say that Python is "strongly typed" under another definition; it's just confusing them with semantics.

(Of course, the people who say they want 'strongly typed' may have no idea what they're actually talking about, but wouldn't it be better to educate them than to take advantage of their ignorance to push your pro-Python agenda?)


C is not strongly typed.

In general, when someone is talking about strong typing, they are talking about silent failure for unintuitive or ambiguous constructs. For example, if the expression

    x = '1' + 1
results in an error, you are probably using a strongly-typed language. In C, where '1' is just the integer 49, this is equivalent to

    x = 50;
In javascript you get

    x = '11'
In PHP you get

    x = 2
These are examples of weak typing.


Strong, as opposed to weak, implying that there is type-constraint checking; it's just done at run-time in Python.

You could also describe Python's type system to be dynamic in that instances of the type meta-class define the constraints and objects (instances of a class) can have constraints added and removed at run-time. Python is still fairly strong in this regard in that the built-in classes are immutable (ie: it is a TypeError to assign a bound method to an attribute on a built-in class such as str).

I suggest "instancely-typed" because categories, unions, and type theory. I'm only coming to grips with that in that OCaml's type solver can be both awesome, annoying, and cryptic. And at the end of the day I'm still not sure what it's buying me other than proving exhaustive pattern matches in certain conditions, fast pattern dispatching, and requiring specialized operators (+, -, , / for ints... +.,-.,.,/. for floats... etc). I'm sure the enlightenment will come when it stops becoming such a PITA to write a basic program.

Sometimes not having to satisfy the constraints up-front makes exploratory programming (where the constraints are not specified and known up front) easier. Python is going the annotation route in newer versions of the language which is rather useful so that tools could be written to verify consistency up-front (or at least provide hints).


Javascript comes to mind, since it only functionally has I think 5 types (string, number, boolean, array, object - and the line between array and object is blurry) and no user-added ones. There are no built-in type semantics whatsoever other than what properties are present on an object at the time of its use. Functions of different arities aren't even type-differentiable (although arity is available as a reflectable value).


I don't know. Runtime type checking is necessary because the language doesn't have constructs which allow dispatching based on type.

Otherwise, you'd run into Python's strong type system when doing an incompatible operation.


No. That would be evidence of polymorphism. Those objects have explicit ways of handling those operations based on the types or interface of the object left/right of the operator.

The interpreter/type system is doing nothing implicitly behind the scenes to coerce the objects.


I love Python for things like this. 'foo' * 3 is so intuitive. Seriously, to hell with Java's StringWriters and StringBuffers and StringBufferInputStream throws IOException.


This is a great example, which is why I think "strong vs. weak" typing is properly thought of as a scale rather than a classification. The blog post uses it properly, merely comparing some instances of implicit conversion on the JVM that make less sense by default (all of your examples can be argued to be deliberately useful, as opposed to JavaScript WAT-style statements).


I agree with your second point. Python doesn't pretend to be strongly-typed, but it does use its, uh, 'loose' typing to great advantage.

I particularly like Python's arbitrary-precision integers, where ints are implicitly converted to longs, allowing you to have the performance of ints without the problem of integer overflows.
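For example, in a Python 2 session (the trailing L marks the silent promotion to long):

    >>> import sys
    >>> sys.maxint
    9223372036854775807
    >>> sys.maxint + 1
    9223372036854775808L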


You say "python doesn't pretend to be strongly-typed", yet the OP is claiming it _is_ strongly typed.

I think the reality is "strongly typed" doesn't really mean much.


No. Why would it?

Like most OO languages, Python has function overloading based on the type of the parameters. The fact that those operators do different things when operating on different types is evidence that Python has strict types, not loose.


Not overloading in the sense of Java and what not.

There is no dynamic dispatch based on types in the interpreter, you have to manually do it in the body of the method, of which you have one.


Which makes it ambiguous:

    Python 2.7.1 (r271:86882M, Nov 30 2010, 10:35:34) 
    [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 1/2.3
    0.4347826086956522
    >>> 1 / 2.3
    fish: Job 2, 'python' terminated by signal SIGSEGV (Address boundary error)
(I didn't expect that last part.)


There's something weird going on with your system. There is absolutely no way under normal circumstances that you'd get a segmentation fault dividing 1 by 2.3.


    Python 2.7.5 (default, Nov  3 2014, 14:26:24) 
    [GCC 4.8.3 20140911 (Red Hat 4.8.3-7)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> 1/2.3
    0.4347826086956522
    >>> 1 / 2.3
    0.4347826086956522
Identical output under Python 3.3.2


> There's something weird going on with your system.

Yep -- it seems like a readline problem. As you can tell, I don't use this machine for Python.


Indeed, it is strongly typed but not statically typed. The only thing is, almost every language is strongly typed now, the only counterexample I know is C (and derivatives). Not that there aren't others, but I'm not sure there are others that anyone really uses. So I wonder if it is really even worth saying a language is "strongly typed" anymore, it's hardly ever a distinction. Might as well just say "not C".


C, JavaScript, PHP, lisp...

There are many popular and weakly typed languages in use today.


JS is strongly typed, not sure about PHP but I don't think so. lisp is not a language, it's a family of languages


    > [] + {}
    "[object Object]"
    > {} + []
    0

JS and PHP are the prototypical examples of weak typing.


Again, not sure about PHP. But no, JS is strongly typed. What you've shown there is coercion, not type-punning or anything of the sort.


Strongly typed is pretty precisely defined and Python is certainly strongly typed.


Defined as what? The Wikipedia article says the definition is unclear:

> Languages are often colloquially referred to as "strongly typed" or "weakly typed". In fact, there is no universally accepted definition of what these terms mean. In general, there are more precise terms to represent the differences between type systems that lead people to call them "strong" or "weak".


I agree that there are precise definitions of the phrase "strongly typed", but from my experience if you put 10 computer scientists in the same room, you'll hear 3 different definitions of what it means.


True, but the complaint that people tend to come up with is that Python is not statically typed. What they are looking for here are things such as clear, accurate and reliable IntelliSense, and the ability to right click on a method definition, select "Rename", and be assured that the refactoring will be propagated through to every usage throughout your codebase with no false positives and no false negatives.


Strong typing saves you from a whole universe of obscure bugs at no extra cost.

Static typing would be nice, but not in a Java or C# or C++ way. The extra verbosity makes it not worth the benefits. Sure, you get slightly more reliable intellisense, but at the expense of a program that's 15-20% longer.


But Python is strongly typed. A language can be dynamically typed, and strongly typed - no problem with that. More details in http://eli.thegreenplace.net/2006/11/25/a-taxonomy-of-typing...


There are plenty of options for concurrent and parallel operations in Python[0]. It is heavily used in data analytics and the scientific programming communities. Though it does often need a little help from C to take advantage of SMP hardware.

[0] https://wiki.python.org/moin/ParallelProcessing


Python is compiled in the sense that it produces bytecode, but in being so it lacks the benefits of other compiled languages such as thorough compile time error checking. I've found that this is sometimes why people don't even notice the compiler.
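The bytecode half of that is easy to see with the standard dis module:

    import dis

    def add(a, b):
        return a + b

    dis.dis(add)
    # prints something like:
    #   0 LOAD_FAST      0 (a)
    #   3 LOAD_FAST      1 (b)
    #   6 BINARY_ADD
    #   7 RETURN_VALUE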


[OS threads are] "no longer good for CPU-bound work" "Python will work just fine for 'enterprise' work"

These two statements conflict.


As a Pythonista, the only issue I've had is with types. A lot of the time, I don't know what to expect from a function; it could be None or an int (legacy code). In a typed language like Go, you would not have the ability to do this. So, although powerful, it opens the door for potentially bad decisions.

Python's proposal for type hinting will dampen the effect of type inconsistency, I feel.
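Something like this sketch of the proposed annotation syntax (lookup_score is a hypothetical legacy function); a checker could then flag any caller that forgets the None case:

    from typing import Optional

    def lookup_score(user_id: int) -> Optional[int]:
        """Return the score, or None when the user is unknown (legacy)."""
        ...

    score = lookup_score(42)
    if score is not None:  # a static checker can insist on this guard
        print(score + 1)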


Go is not a good example of a type system - its types are limited and restrictive and really will stop you writing some good code that you wanted to write in Python. If you want a Pythonic language with static types, maybe take a look at OCaml (or my personal favourite, Scala - though that language suffers from a bunch of warts for the sake of JVM compatibility). http://roscidus.com/blog/blog/2014/06/06/python-to-ocaml-ret... is one example of someone making the move.


I see Go and OCaml recommended to people fed up with Python's lack of static typing, and I just cringe. I love experimenting with programming languages, and Go seems to be picking up momentum, but unfortunately both Go and OCaml have ridiculously restrictive syntax, which makes them deeply unattractive for prototyping anything.

If you're looking for a statically typed Python that can be grokked in a matter of minutes, and compiles down to a static binary like Go does, and runs as fast as C in benchmarks, you should definitely check out the Nim compiler [1].

Code example:

    import rdstdin, strutils, sequtils  # sequtils provides `map`

    let
      time24 = readLineFromStdin("Enter a 24-hour time: ").split(':').map(parseInt)
      hours24 = time24[0]
      minutes24 = time24[1]
      flights: array[8, tuple[since: int,
                              depart: string,
                              arrive: string]] = [(480, "8:00 a.m.", "10:16 a.m."),
                                                  (583, "9:43 a.m.", "11:52 a.m."),
                                                  (679, "11:19 a.m.", "1:31 p.m."),
                                                  (767, "12:47 p.m.", "3:00 p.m."),
                                                  (840, "2:00 p.m.", "4:08 p.m."),
                                                  (945, "3:45 p.m.", "5:55 p.m."),
                                                  (1140, "7:00 p.m.", "9:20 p.m."),
                                                  (1305, "9:45 p.m.", "11:58 p.m.")]

    proc minutesSinceMidnight(hours: int = hours24, minutes: int = minutes24): int =
      hours * 60 + minutes

    proc cmpFlights(m = minutesSinceMidnight()): seq[int] =
      result = newSeq[int](flights.len)
      for i in 0 .. <flights.len:
        result[i] = abs(m - flights[i].since)

    proc getClosest(): int =
      for k,v in cmpFlights():
        if v == cmpFlights().min: return k

    echo "Closest departure time is ", flights[getClosest()].depart,
      ", arriving at ", flights[getClosest()].arrive
Statistics (on an x86_64 Intel Core2Quad Q9300):

    Lang    Time [ms]  Memory [KB]  Compile Time [ms]  Compressed Code [B]
    Nim          1400         1460                893                  486
    C++          1478         2717                774                  728
    D            1518         2388               1614                  669
    Rust         1623         2632               6735                  934
    Java         1874        24428                812                  778
    OCaml        2384         4496                125                  782
    Go           3116         1664                596                  618
    Haskell      3329         5268               3002                 1091
    LuaJit       3857         2368                  -                  519
    Lisp         8219        15876               1043                 1007
    Racket       8503       130284              24793                  741
This language deserves your attention.

[1]: http://goran.krampe.se/2014/10/20/i-missed-nim/

[2]: https://github.com/Araq/Nim/wiki/Nim-for-C-programmers


Nim is nowhere near the maturity of OCaml, and everything I've seen about it has the whiff of zealotry. I'll wait until I see more nuanced talks about it, and an established ecosystem that doesn't rely on C libraries.


> ecosystem that doesn't rely on C libraries

If something is compiled to native code, what's the point of writing a library that's not just a binding? In python it makes sense, because you get the pure-python installation of your app - it doesn't depend on the OS, python versions, etc.

But once you're going to compile your app for a target platform... what's the point of not relying on C libraries?


If your app uses C libraries then it inherits the problems of C: there will almost surely be bugs in the library that mean your app might segfault, or worse, have security problems. Thus e.g. the recent effort to write a full SSL stack in OCaml.


Agreed on the "lack of declared types" thing. While I'm exceptionally fast writing new Python code, reading other people's Python code is a different story - particularly, if they are new to Python.

While there are a lot of concurrency options to not get you in trouble with the GIL, it is annoying that there is no canonical way or pythonic way of doing it. I saw people using things like monkey patching, which made my jaw drop. I had to implement Websockets recently and was totally confused as to the right way of doing it. I ended up evaluating NodeJS and Go for this function, and went with Go.

Overall ... I love Python. It is perfect for small projects. For larger codebases (3KLOC+), I am slightly skeptical. It can be done but developers need to have quality code and strong test coverage.


My hobby project is something like 20k lines of Python. As long as you've got good tools (e.g. PyCharm), large codebases in Python are just about as easy to navigate as they would be in something like Java.

Our codebase at work is.. phew, I don't even know, a couple hundred thousand lines of Python? Still, easy peasy to dig into.

I will 100% agree on the websockets thing. Python is really failing in that area.


I code in Python from time to time but I am by no means an expert. For a small game server I used http://autobahn.ws. Is there something wrong with this library? It was the easiest websocket library I found for pretty much any language. I plan to use it again for a larger, more serious project. Is there any reason I should avoid using this, and Python in particular, for websocket stuff?


I thought both Tornado and Twisted had decent websocket implementations?


A couple 100KLOC of Python, excluding 3rd party libraries? Other than Google or maybe NASA, I can't think of many places with such large Python codebases.


A 100k Python project doesn't sound that bad, especially with a modular design and a great IDE (PyCharm). Good tools make reading and navigating the codebase easy. At that point, it matters a lot less how big it is, because you can trust that your tools enable you to find the answers you seek.

The project that I help maintain is a collection of Django apps totalling more than 700k lines. Thankfully I do not maintain all of it, but I am the primary maintainer for several sub-apps, totalling about 180k lines of Python. (This doesn't count the UI code, which is not in Python.)

I'm the sole author for about 12% of the part I maintain (since I wrote some of the individual sub-apps), and am familiar enough to describe probably half of the rest. There are still some parts that I really have to puzzle out ("What happens on the codepath to generate those reports?"), but that part shrinks over time.

The modular design that our original engineers used when they first built this system really helps, and the fantastic tooling in PyCharm is critical to my being able to navigate the codebase. Between grepping the python sources (thank god for shell aliases wrapping grep!) and my base familiarity with the system, I can start tracing with PyCharm's "find definition" to investigate things. Being able to run my own copy of the system, and put debugging breakpoints in (even in libraries!) is __amazing__.


RightNow, acquired by Oracle. A couple of 100k+ line code bases in Python. Wasn't that hard to figure out what was expected when calling a function.

Gets even easier when you use a code linter.


Medical industry.


>"I had to implement Websockets recently and was totally confused as to the right way of doing it. I ended up evaluating NodeJS and Go for this function, and went with Go."

You had to implement Websockets in Python? Instead of using an already-functioning, tested library?


At $work, we call these people "problem solvers".


I'm guessing they meant they implemented a websockets application, not the actual protocol.


I'm guessing that too as I had a similar issue in a hobby web project in Python. I wanted instant messaging between users and the time required to read the docs for twisted and port the whole thing over was off putting. I ended up making a message server with Node and Socket IO that's like 50 lines and sends messages to and from clients and the server. It works but it seems a bit of a kludge.


It's this simple if you had come across Tornado:

    import tornado.websocket

    class WebSocketHandler(tornado.websocket.WebSocketHandler):
        def on_message(self, message):
            self.write_message("Echo: %s" % message)
I think the problem is that you chose a pretty heavyweight / complicated library.


Ta - maybe I'll give that a go. I didn't really try it. Actually I did glance at some docs for the framework I was using, web2py, and it suggested some stuff with Tornado, but one version only worked on Chrome and another needed Flash, so presumably it wouldn't work with iOS. So I thought sod it - node/socket.io is easy and works anywhere.


I agree.

I know we can get picky about the exact characteristics of a language's type system, but for me they fall into two camps: "Typeful" languages and "(effectively) Typeless" languages.

The former have a "good" typesystem, and expose it as a tool the programmer can use to help themselves solve problems. In the latter case, there may or may not be a type system lurking in the language somewhere, but the language either doesn't expose that system as a tool for the programmer to use, or it does so inconsistently.

Personally I think Python is in the latter camp: it has types, but it doesn't really give the programmer much help when solving problems.

Clojure is another example. The Java type-system is down there somewhere in the murk, but idiomatic Clojure code doesn't use it, and it's not really a tool for the programmer to use, more an accident of the platform.


Indeed, this is one of the biggest problems of Python for large-scale projects (saying this as an avid Pythonista myself). This is due to dynamic typing, though, not weak typing (Python's typing is actually strong).

Dynamic typing has its merits, but sometimes static typing is very useful for reading and understanding code.


I can relate, on some level. Sometimes I feel like writing test cases that increase coverage is not enough. There are days where I think I need a SAT solver in order to get confidence that I'll never see incorrect behavior from my code.

> type hinting will dampen the effect of type inconsistency I feel

I'm not sure that I see type hinting being that advantageous. I kinda hope that it could allow PyPy and other VMs to make assumptions that would otherwise require heuristics.

Perhaps type hinting could make static checkers/linters for Python more effective, though.


For me it's not so much the dynamic typing that makes the code hard to reason about--it's the object system, with its inheritance and mutable state.

This is why I like dynamic languages that have immutable-by-default data structures, like Clojure.


Yes, in theory this is a problem in lots of languages which are weakly typed. There are lots of ways of handling this, some generic and some language specific.

In Python I would suggest wrapping your call in a try / except block or using "isinstance()".

If you are using a publicly available library or popular piece of code that can return None or an Integer, I would argue that piece of code was written incorrectly. Newbie python users might do this, but I think experienced python devs would see the problem.

Finally, some documentation or justification for the reasoning behind this decision at the top of the function or on a web page somewhere would help as well.

As with most powerful, full-featured languages, it is pretty easy to shoot yourself in the foot if you are not careful. I don't think this is a Python-specific problem...


I would counter that isinstance() checking is - at least in the general case - an antipattern.

Better to check that the object you've been passed implements the interface you need rather than reasoning about its inheritance chain. Duck typing!
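In code, the difference looks something like this (consume and its read() requirement are stand-ins for whatever interface you actually need):

    import io

    def consume(source):
        # isinstance would pin the inheritance chain:
        #     if not isinstance(source, file): raise TypeError(...)
        # duck typing just uses the interface, so anything file-like
        # works: StringIO, sockets, mocks
        return source.read()

    print(consume(io.StringIO(u"quack")))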


Sometimes one wants classes to be semantic, not only structural. That means that when the developer writes "class C(Foo)" he means something different than writing "class C(Bar)", even if both classes implement the same methods.


As much as I hate the Java language, "instanceof" does allow you to check for an interface, not just a class.


>"Yes, in theory this is a problem in lots of languages which are weakly typed."

Minor pedantic correction: Python is dynamically typed, and strongly typed. Those are separate concepts.

    Strongly-typed vs weakly-typed
    Statically-typed vs dynamically-typed


We really need a language that defaults to statically typed, but that can be dynamically typed where you ask nicely.

E.g. - as opposed to the Java practice of having 3 or more layers of fun to read (NOT!) XML on top of the language's native static typing to glue things together. I'm gonna be sick now...




Have you looked at Haskell?

It's not really dynamically typed, but it has no requirement of making the types explicit, nor of making your functions work over a defined set of types.


Python is strongly typed.


True. What most people are complaining about is that it's dynamically typed, not weakly typed. That's where most of the confusion sets in.


Python is great for relatively small projects written by a single person or for glue code, but large Python projects are scary. Diving in and reading someone else's code can be quite difficult (what are the types of these parameters? they're being used like dicts, but what if they're really a completely different class with different behavior? how can I trace this down when it goes through multiple IPC indirections, since they chose to use multiple processes to get around the GIL? oh, I'm missing packages and they don't seem to be in pip...). The bigger the project, the more places there are for runtime errors that a statically compiled language would have caught or at least has better tooling to find (syntax errors in an error path that rarely gets hit, type errors, uncaught exceptions, missing methods, accidentally fat-fingering a variable name and instantiating a new variable). You end up leaning really heavily on tests that wouldn't have to be written in other languages and on documentation that would otherwise be statically enforced in many other languages. Even when using Python for glue code, be wary that it will need to be in charge (i.e. embedding a Python interpreter in your code versus embedding your code in Python, especially with CPython).

Not saying it's not possible and doesn't work well for a lot of companies, but I do think you take on technical debt to get up and running fast when you choose Python. It's a choice I would personally think twice about.


http://news.efinancialcareers.com/us-en/173476/investment-ba...

"...There are around 5,000 developers using [Python] at Bank of America, .. there are close to 10 million lines of Python code .. and we got close to 3,000 commits a day."

IMHO, it would be scary with any language.


Is this different for Python over any dynamic language?

This seems to be a better argument for statically compiled languages over dynamic languages rather than an argument against Python.


That's a good point, and perhaps it is. The majority of my experience with dynamic languages is probably in Python.

One thing that makes a difference between Python and other commonly used dynamic languages (here referring to JavaScript / ECMAScript and Lua) is that Python is dramatically more complex out of the box. This is both in terms of language features and Python's batteries-included standard libraries. For example, the amount of overloading that you can do in Python is very impressive, but the result is that common operations aren't necessarily predictable or dependable without more context (not that this isn't a problem in static languages - looking at you C++, but at least in the other languages you get type information to help out and some checking at compile time).


Funny, my experience has been that large/ complex python code is actually easier to follow than JavaScript because of the line noise that JS requires. Side by side, the same code will usually be easier to read in Python (or Ruby.)

Same goes for Java, although in a slightly different sense. I've seen non-trivial Java projects balloon quickly, where you have dozens or hundreds of classes & interfaces with no clear sense of structure. High line noise, lots of boilerplate. Sure it compiles, but at the end of the day it's just as likely to hit an NPE, so you have to rely on functional and unit tests regardless of static/dynamic/compiled/interpreted.

Obviously a lot of it comes down to who's writing the code. For me, a language that lets me succinctly express my intentions with minimal cruft is what wins.


I totally agree that it comes down to who the code is written by. I've done both Python and Java, and in Python I would go so far as to define interfaces as a form of documentation in my Python code, so people who use it know what I expect out of the classes which want to interact with my object. It doesn't require the class to explicitly implement the interface, but they can if it makes sense (fairly similar to Zope Interfaces, I guess). Unfortunately, I don't see this a lot in most Python applications, and you're left searching around through calls to figure out exactly what is required of the object being passed in. Look at a lot of popular libraries -- you are still left searching through code just to see what can possibly be returned, because the return values aren't always the same type. Also, exceptions aren't part of the function or method definition, which means that if you don't document your exceptions, I'm up shit creek.

I think a lot of the problem around the Python ecosystem, again, that I've seen, is that people just don't follow best practices -- documentation is a core component of Python development; it's one of the arguments they use against static typing. Personally, I think Python developers need to be much more strict about development practices than Java developers. I think it's a lot easier to write bad Python code than it is to write bad Java code. The same goes for C, which I also used to do. To be a good C developer, you have to be very strict and structured (however, for somewhat different reasons). With great power comes great responsibility.

I've seen shit java code as well. A lot of bad java code usually revolves around things not being modular or not having some form of consistent development patterns or not breaking methods down into simpler sub-problems -- I think documentation isn't as important as it is in python for the fact that I know exactly what is being returned, what exceptions can be thrown, and what exactly needs to be passed in just by looking at a method. In terms of the business logic associated with the class, that still needs to be documented.

That said, I understand why python is generally used at startups -- it allows fast initial development where at startups, time is critical. Long term development really relies a lot on the teams ability to make structured decisions and organize their code, which is a difficult task in any language.


Most of these runtime errors rarely make their way to production and are easily identified and rectified in cases when they do slip through.

More restricted languages, e.g. Java, do have advantages in terms of forcing developers to do some more self-documenting and better opportunities for tools to help with syntax highlighting, refactoring, and warning about those errors you mention. But those errors are usually caught easily even with basic test procedures and almost always if there's a half-decent automated testing regimen.

A lot of it comes down to, who's writing this big project? If developers are talented, there's a lot to be said for the freedom granted by a language like Python or Ruby. If developers are more mediocre, the forcing function of a language like Java may help to produce a "boring" and larger but understandable code base.


>If developers are talented, there's a lot to be said for the freedom granted by a language like Python or Ruby. If developers are more mediocre, the forcing function of a language like Java may help to produce a "boring" and larger but understandable code base.

This is a bit of a false dichotomy.

Much like I don't think lisp is popular for its functional-ness or whatnot, I don't think python or ruby are popular for their dynamic properties. My guess is that 90% of it is just how the code ends up looking. And with type inference getting further along, I think we'll see more people realizing that typing is a great sanity test to have in your code (especially in big codebases). See the popularity of Go in pythonic circles.

I have never been shown any non-tiny piece of robust code that was easier to deal with because of a language's dynamic nature. I have been on this Earth a bit less time than others, but my gut feeling is that it doesn't actually exist.

(note: my day job involves a lot of python, and I love the language for many things. I am looking forward to optional type annotations, though, as are many of my coworkers)


This is an argument for unit tests, not for a statically typed language.

Type errors tend to be very few and far between compared to logic errors.

Multiple processes in most web applications are best handled by a pre-forking webserver in front of something like mod_wsgi, and using an engine like Celery for asynchronous and long-running operations on the backend, often launched with something like supervisor.
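
For concreteness, a minimal sketch of the Celery half of that setup. The Celery API here is real; the broker URL and the task itself are invented for illustration:

    # tasks.py
    from celery import Celery

    app = Celery('tasks', broker='redis://localhost:6379/0')

    @app.task
    def generate_report(user_id):
        # long-running work happens here, outside the request/response cycle
        ...

The web process calls generate_report.delay(user_id) and returns immediately; a worker started with "celery -A tasks worker" (kept alive by supervisor) picks the job up.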

I do take issue with some parts of the article, namely the claim that there is a good type system (there's not, but it's all fine: duck typing, etc.) and that Twisted is a good framework.

It is definitely true however that Python is a great fit for all kinds of serious applications, but with all things, it takes discipline.


> This is an argument for unit tests, not for a statically typed language.

Static typing is like unit tests that take no maintenance.

> Type errors tend to be very few and far between compared to logic errors.

False distinction. If you just take your python code and write it line-by-line in another language, sure, you won't gain much. But if you actually work with the language, you turn logic errors into type errors. See e.g. http://spin.atomicobject.com/2014/12/09/typed-language-tdd-p... , and note that that's a lot more verbose than it would be in a modern language with type inference, higher kinds and so on.


> Static typing is like unit tests that take no maintenance.

Um, no. At best they are like a tiny fraction (and a trivial fraction at that) of the unit tests required. AND static typing often requires maintenance (as all code does), including the initial burden of writing the annotations in the first place.


>Static typing is like unit tests that take no maintenance.

And makes your code 20% longer, which makes for a lot of extra fun not-really-so-cost-free-after-all maintenance.

I found the complaint that it finds bugs earlier (at compilation time!) ironic given that, in my experience, running a suite of unit tests takes less time in python than, say, compiling does in Java, never mind compiling and running tests.


> And makes your code 20% longer

Not necessarily true, especially if you are using a language with type-inference or you are able to encode your logic in types.

Additionally - what do you think unit tests are? They aren't additional code you have to write?

I'm not someone who thinks types completely reduce the need for testing, but I absolutely do not get why dynamic language fans are like "Ugh, I HATE having to write types", but then end up basically reimplementing a type system in a much more verbose testing framework.


>Not necessarily true, especially if you are using a language with type-inference or you are able to encode your logic in types.

I agree that it is not necessarily true, but the languages which manage to squeeze in static typing and still end up with programs that are shorter and sweeter than python's are all fairly niche right now, which carries its own set of problems.

>Additionally - what do you think unit tests are? They aren't additional code you have to write?

Never denied it for a second. There aren't any languages which don't require unit testing though, and there probably never will be. Let's not pretend otherwise.

>I'm not someone who thinks types completely reduce the need for testing, but I absolutely do not get why dynamic language fans are like "Uhg, I HATE having to write types", but then end up basically reimplementing a type system in a much more verbose testing framework.

I've never done this.

I'd wager that the amount of code I have to write, including tests, is less than in all other practical languages. Often much less. That is very valuable.


I would guess a well-done Scala project would be at least an order of magnitude shorter than a comparable Python project. You just can't get the same level of abstraction without a type system to help you out.


> I'd wager that the amount of code I have to write, including tests, is less than in all other practical languages. Often much less. That is very valuable.

It would take a very gerrymandered definition of "practical" to say that Python counts but F# doesn't, and I'm pretty confident F# would win that comparison for most problems. (If you'll allow me Scala, which is my language of choice and the one I use full-time at my job, I'm very confident it would win the comparison for the vast majority of problems)


And as someone who has coded both Python and Ruby on the job, and most recently Scala, I find the amount of time I'm debugging runtime errors much less in Scala, which is very valuable.

I also don't feel the amount of code I am writing to be all that significantly larger than what I was writing in Python/Ruby.


F# might, but I doubt scala would.


> And makes your code 20% longer, which makes for a lot of extra fun not-really-so-cost-free-after-all maintenance.

Less than that, in my experience. Certainly a lot less than the amount of tests and documentation you'd need to make up for the absence of types.

> I found the complaint that it finds bugs earlier (at compilation time!) ironic given that, in my experience, running a suite of unit tests takes less time in python than, say, compiling does in Java, never mind compiling and running tests.

You need to work with the language. Is a python test cycle faster than a full rebuild? Probably. Is it faster than the time between making a change and seeing the red underline in one's IDE? Absolutely not, in my experience.


>Less than that, in my experience. Certainly a lot less than the amount of tests and documentation you'd need to make up for the absence of types.

Certainly more, in my experience.

>You need to work with the language. Is a python test cycle faster than a full rebuild? Probably.

By an order of magnitude.

> Is it faster than the time between making a change and seeing the red underline in one's IDE?

It can be that quick, yes. I have a unit test watcher that reruns them every time a file save in the project is detected (using watchdog/epoll), and I get the results back in seconds.

That has the added advantage of detecting more than just trivial type errors. It catches logic errors too.
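
For the curious, a minimal sketch of such a watcher using the real watchdog package; the choice of py.test as the runner is my own assumption:

    import subprocess
    import time
    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler

    class RerunTests(FileSystemEventHandler):
        def on_any_event(self, event):
            # rerun the suite whenever a Python file changes
            if event.src_path.endswith('.py'):
                subprocess.call(['py.test', '-q'])

    observer = Observer()
    observer.schedule(RerunTests(), path='.', recursive=True)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()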


> And makes your code 20% longer

One thing I've noticed on larger dynamic language projects: Developers tend to adopt a very defensive strategy in terms of validating inputs, to ensure it is clear a bug is not in their component. This includes a lot of what are essentially runtime type checks.

This is not necessarily a bad thing, because it ultimately makes the code more reliable. But it is "more code" which needs to be maintained, and it eliminates much of the advantage of using a dynamic type system in the first place.
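
A sketch of the defensive style being described; the function and its checks are invented for the example:

    def schedule_jobs(jobs, max_workers):
        # boundary checks that a static type system would make redundant
        if not isinstance(jobs, list):
            raise TypeError("jobs must be a list, got %r" % type(jobs))
        if not isinstance(max_workers, int) or max_workers < 1:
            raise ValueError("max_workers must be a positive int")
        ...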


> And makes your code 20% longer, which makes for a lot of extra fun not-really-so-cost-free-after-all maintenance.

Just 20%? Sounds like a good deal, considering those that brag that "over half of our code base is just tests!". I don't know if that kind of thing is fashionable or widespread any more, though.


What is the benefit to using a dynamically typed language if you have to spend all your time writing tests and verifying the static properties of your code?


Mainly the boilerplate other languages force upon you. Whenever I look at Java code I start to chuckle.

It's so much more information that you need to load into your head.

To quote a picture from the post: https://www.paypal-engineering.com/wordpress/wp-content/uplo...

And it's not like you need static typing everywhere; just in some places it's better to enforce it, for your own sanity. Type hinting looks like a good way to solve this problem in Python.


That image isn't really a fair comparison (in that C++ / Java aren't the only statically typed languages.)

I imagine a statically typed Python dialect would only have an extra 15 or so lines, and the benefits would be numerous.

I love python as much as anyone, but having to keep a reference manual handy just to use someone's library because they had the audacity to use a variable is not my idea of a fun time.

EDIT: I'm probably exaggerating too much.

Would I use Python in a large, complicated, multi-developer project? Yes, I would. And my only real complaint would probably be that I like static typing so much that I'd miss it.


> I imagine a statically typed Python dialect would only have an extra 15 or so lines

Actually you can have a statically typed Python if you want. Have a look at http://docs.cython.org/src/quickstart/cythonize.html

All type declarations are optional. It compiles modules to versions completely interoperable with the rest of Python code. They're importable, behave as you'd expect, etc.
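
To give a flavor (a minimal sketch; the typed declarations are Cython-only syntax, everything else is plain Python):

    # fib.pyx
    def fib(int n):
        cdef int i, a = 0, b = 1
        for i in range(n):
            a, b = b, a + b
        return a

Compiled with cythonize, the module is importable from ordinary Python, and the same function with the declarations stripped out still compiles fine.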


Consider Haskell. It's about as anti-boilerplate as they come. And with automatic type inference.


I've never had to spend all my time doing that. I am making the case that type errors (passing a hash instead of an int, etc) are NOT one of the major causes of errors in software development.

I'm apt to screw up other things, but that's not one of them.


> I am making the case that type errors are NOT one of the major causes of errors in software development.

That's a symptom of a bad type system. I'm with a moderately sized Haskell codebase right now, and type errors represent the biggest share of errors on it by a huge margin - several times bigger than runtime bugs.


Haskell makes programming into a duel with the type system.


Yes, it does. That duel replaces a chase on unknown and strange runtime bugs.

Sometimes it's a win, other times it isn't.


I prefer to think of it as a dialectic.


Shouldn't you be writing tests anyway?

NB I'm not being flippant - I've been using Python for a couple of years and appreciate the lack of boilerplate compared to many other languages and if there are "type" problems my unit tests, which I will be writing anyway, find them. Using an IDE like PyCharm does a good job of warning you about type problems at development time so I don't have a lot of run time type problems.


Yeah, you should, but you don't have to check trivial things like "is this a list?"


Well, I wouldn't write a unit test for that - I would write a unit test for the high-level observable behavior, and if I get it wrong that something is a list, then it'll break and my unit test will fail.
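
Something like this, say (function and data invented for the example):

    def total_price(items):
        return sum(item['price'] for item in items)

    def test_total_price():
        # exercises the observable behavior; passing the wrong shape of
        # data fails here rather than in production
        assert total_price([{'price': 3}, {'price': 4}]) == 7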


I think that's closer to functional testing than unit testing though...


The idea is that writing code + writing tests in a dynamically typed language is easier and faster than just writing code in a statically typed language. (Of course it depends on the languages in question, the programmers, and a number of other factors.)


> This is an argument for unit tests, not for a statically typed language.

> Type errors tend to be very few and far between compared to logic errors.

The divide between type errors and logic errors is entirely dependent on the language and the discipline/inclination of the developers. In some languages, what you might think of as "logic errors" can easily be moved to "type errors". And in that regard, the difference between what is unit testable and statically typeable is also entirely dependent on the language and the discipline/inclination of the developer. So what might have to be expressed as unit tests in one language, might be checked by the static type system in another language.

But I can understand how some people might think that static type systems are only for catching things like "error: expected addition of integers, but got attempt at adding an integer with a FireTruck". ;)
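
As a small illustration of moving a logic error into a type error, here is a sketch using the kind of optional annotations discussed elsewhere in this thread, checked by an external tool such as mypy; the Id types are invented for the example:

    from typing import NewType

    UserId = NewType('UserId', int)
    OrderId = NewType('OrderId', int)

    def cancel_order(order_id: OrderId) -> None: ...

    uid = UserId(42)
    cancel_order(uid)  # runs fine at runtime; a type checker flags the mix-up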


I've found over the years that there's a fairly widespread prejudice among many enterprise developers in general and .NET developers in particular towards Python. Mention to a few random .NET developers that you're using it on a side project and perhaps about half of them will give you a look of puzzlement at best.

The main problem isn't anything to do with technical failings of the language, but a misconception that it's an obscure niche language that nobody uses outside of academia, with an ecosystem bordering on the nonexistent. People seem to think that if you start using Python you're going to end up with an incomprehensible codebase that's impossible to maintain because you can't hire developers who know it. They're generally quite surprised when I tell them how widely used it is and what all it gets used for.


A .NET developer recently was telling me the .NET community tends to be very insular, a lot of interesting ideas are ignored unless Microsoft makes them first-class citizens of the ecosystem. That's unfortunate IMO.


This is always how it has been with Microsoft. They control the developer toolchain 100%. They have gone out of their way in years past to burn other development tools (they pretty much shut Delphi out of the market with a last-minute change in one of their specs, back in the day - was it COM?) The advantage of this is that you get some really nice APIs - the gulf between the .NET and Java core library architectures is huge - but the drawbacks are too many to name: forced obsolescence every few years, continual new experimental technologies forced on developers using the Embrace, Extend, Extinguish paradigm, and you really don't have the time or energy to try to work outside this model. You are either in the community or out of it. This is showing signs of change with the opening of the .NET toolchain, but I'll be curious to see if there is a point at which some of it actually starts to be community driven. Miguel de Icaza cannot shoulder the entire movement by himself;)


That's the most common criticism of the Microsoft ecosystem by far. It's not just being insular, it's that huge swathes of the .NET community insist on being spoon-fed by Microsoft. The attitude is that if something isn't included out of the box with Visual Studio, you've no business whatsoever paying the slightest bit of attention to it.

You see it in spades in the Silverlight community. They're all blaming Microsoft for the decline of Silverlight, even though it's largely due to factors outside Microsoft's control.


This is my experience of the .NET world: spoon-fed is, oddly, preferred. I don't get it, myself.


That depends on the type of developer you are speaking with. Most enterprise developers working for medium to large companies feel this way. Contractors are a different story. They embrace open source as much as the Java community.

Disclaimer: .Net developer learning python/django for a rather large-scale cloud implementation.


Relative to the .NET stack, Python's toolchain has always been no-cost for anyone to use, and the language is so fun that there are many people who enjoy just coding in it for free.

I can almost see these guys' angle, of why they might see this as a possible threat that could undermine their career. .NET coding pays well, and is usually semi-exclusive to those who previously had an employer pay for them to use the tools, even though that's not theoretically necessary.


On the flip side of that, F# is the first language I've seen that has made me consider dropping python. It really is a work of art.


I could not agree more. F# has made me feel exactly the same.


Considering it's replacing C++ as the starter language at many universities, this may not be true for too much longer, as younger developers get more exposure to Python.


Hooking onto that, is there any reason to use C++ as a programming language in non-technical university courses?

Someone I know is in a math and economics program, and they use some programming to find optimal solutions for certain problems (e.g. traveling salesman). They are taught C++ with pointers and by-reference parameters and OOP and everything, in a 7-week course. Why not Python with some functions? That perfectly suits their purposes and makes life a lot easier both for the teacher (less code to check; clearer code) and the student (easier to learn; less code to write).


I'd probably go for C or C++ if you're teaching data structures. It can be helpful when pointers and memory layout are right there in front of you, and not obscured by the language.

For other computer science topics, probably not.


> I'd probably go for C or C++ if you're teaching data structures.

Hmm perhaps if you are interested in the technical working of a data structure, but even if you use Python you'll have to mind your memory and CPU usage. They have to implement things like the traveling salesman problem and get the correct answer to a reasonable number of cities.

Python is slower than machine code by definition, but not so much that it becomes much harder to do the same calculation. They have to think about speed regardless of the language, and doing that is much easier when you pass around lists instead of pointers to pointers of doubles (double **).


I have a fairly large code base (80K LOC). When the size grows, lack of typing can become a problem. 95% of the time, the code is explicit enough. However, when you end up with meta code, it can become really difficult to track the types down in the n-th level of recursion...

Python 2 to 3 migration is not easy. There are tools, but the problem is that they don't see everything, and therefore you end up with 95% of your code converted. Then it's up to you to figure out the last 5%, which is an order of magnitude harder than the first 95%... So I ended up having a fairly long transition of migrating Python 2 code to Python 2+3 code.

For both of these issues, the common discourse is: have proper test coverage. But we live in the real world, and maintaining code coverage strong enough to alleviate the problems (around 90%) is just very hard. If you're coding alone, that may be just too much (my case). In a team setting, with money to spend, it may be possible, but you'd need a very disciplined team.

But anyway, AFAIC, working with Python is just super productive (I compare to Java). It also feels much more battle tested than, say, Ruby.

For me Python is not a scripting language but a "glue" language set in the middle of a huge library ecosystem.

Now, I didn't do XA transaction stuff, high performance stuff, etc. For me it's more like a well-done Visual Basic: you can achieve a lot very quickly. Contrary to VB, the language and its ecosystem are really clean.

I'm lovin' it


I work on a team of three that has a project slightly larger than yours, 100K-110K LOC (and growing by about 5K LOC a week). We've managed to keep test coverage at about 95%, and have found it's worth the upfront investment, as it makes refactoring so much easier.

Looking back, I don't think that even for a personal project I would ever do something that wasn't a one-off without good test coverage. Skipping it is essentially taking on technical debt, as it makes you much more afraid of fixing anything.


I developed this library ( https://github.com/hhuuggoo/thedoctor ) to help deal with this problem. My belief is that type validation is only part of the problem; being guaranteed certain properties of your inputs and outputs is also necessary (for example, making sure the input dataframes have datetime indices and include the following columns, or ensuring that this matrix is nonsingular). I have worked on some of the largest Python projects at NYC investment banks, and I think I've seen what I can safely call the scariest Python project in the world.
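
The general shape of the idea, as a generic sketch (not thedoctor's actual API, which I haven't checked):

    import inspect
    from functools import wraps

    def validate(**checks):
        """Run a predicate against named arguments before each call."""
        def decorator(fn):
            @wraps(fn)
            def wrapper(*args, **kwargs):
                bound = inspect.getcallargs(fn, *args, **kwargs)
                for name, predicate in checks.items():
                    if not predicate(bound[name]):
                        raise ValueError("validation failed for %r" % name)
                return fn(*args, **kwargs)
            return wrapper
        return decorator

    @validate(xs=lambda xs: len(xs) > 0)
    def mean(xs):
        return sum(xs) / float(len(xs))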

There is also another library out there which approaches the problem a bit differently

https://andreacensi.github.io/contracts/


>I have a fairly large code base (80K LOC). When the size grows, lack of typing can become a problem.

I do as well, and I find that while we occasionally run into typing bugs, the tests nearly always catch it and they catch it quickly. Moreover, these are tests that we'd write in any language, and static typing would, in most other languages, mean more verbose code. Overall we still come out ahead.

>However, when you end up with meta code, then it can become really difficult to track the types down in the n-th level of recursion...

Yea, I try to avoid that. If there's a library that does meta-code that's unit tested to hell and back, maybe. If I have the time to write one and unit test it, maybe. But, I still try and keep regular projects clear of it.


Re the typing point:

Statements like "Python is more strongly-typed than Java" can mean too many different things without a precise definition of what "strongly-typed" means. The Wikipedia page linked from the article even supports the position that there isn't an accepted definition for the strong vs. weak! (https://en.wikipedia.org/wiki/Type_system#.22Strong.22_and_....)

These terms are not very illuminating, and I don't understand the post's argument about types as a result, especially w.r.t None vs. null.

One argument that the post might be making, and I'd like to see fleshed out, is "Python's expressions and built-in operators check their inputs at runtime in a way that gives useful and effective error messages in practice." That seems like a lesson from experience that could be backed up with anecdotes and provide some useful feedback on the language. That avoids the terminology debate about types, which is more about picking definitions than about the quality of the language for certain purposes.
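
For instance, the runtime behavior in question looks like this (the error message shown is Python 2's; Python 3 words it differently):

    >>> "1" + 1
    Traceback (most recent call last):
      ...
    TypeError: cannot concatenate 'str' and 'int' objects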

It does require defining "useful and effective" for error messages, but I'm more interested in that debate :-)


Python is very much an established language, and myths like these need to go away in order for people to feel comfortable using Python for serious, well-designed projects that scale and do anything and everything a person may want or need. Great post.


I agree BUT... Python 2 and Python 3 migration is still a major issue. It has come a LONG way in the last 18 months.

Also, Python is good at a lot of things, but

> feel comfortable using python for serious, well designed projects that scale and do every and any thing that a person may want or need

Might be a little too bold of a statement :)


As a non Python programmer, is 3 the way to go for all new projects? Or is there some advantage to 2 still?


There is no advantage to 2 unless you're using a legacy environment or a legacy dependency.

Edit: the "Python 3 Wall of Superpowers"[1] is a good resource for seeing how far the ecosystem's conversion to Py3k has come. Many of the "red" packages even have Python 3-compatible forks (e.g., Supervisor).

[1]: https://python3wos.appspot.com


There are still a lot of libraries which haven't moved yet.

I periodically check my requirements.txt's and there's still a few big holdouts left.


Which are they?

I know that python3wos is out of date regarding the libraries people actually use. When I look at the py2-only libraries on there, I see:

- the big "legacy dependencies", Twisted and gevent

- libraries that are ported but the site doesn't know it, such as protobuf

- highly specific code to extend a particular system, such as tiddlywebplugins

- system utilities that it doesn't matter what language they're in, such as supervisor and Fabric

- libraries that have been abandoned (in all versions) and superseded, such as MySQL-Python

I'm not saying you're wrong, I'm saying that we need a better python3wos. It makes it look like "unless you bet on a massive legacy asynchronous framework, you're fine".

I think that to some extent this should be the case, but to some extent, there are things that should be ported that we're not seeing because of the unrepresentative set of packages on python3wos.


Twisted is one, and it's there partly because a lot of code relies upon it, not because I like it. I also use a few niche libraries which haven't been bumped and there's a few libraries like mechanize too - not so niche yet still not bumped.

Honestly, if there were some big incentive at this point I would go through the hassle of upgrading, but there isn't. The gains on python 3 seem relatively minor and incremental.


The only major reason to use 2 instead of 3 is library support; however, a lot of good libraries still aren't updated to Python 3, such as google-api-python-client, gevent, and scrapy, though the list is getting smaller every day. There are a few other reasons, but they're pretty uncommon/advanced, and not something new users will encounter [0].

Python 2 gets many of Python 3's features back-ported (they can be enabled with an import statement), which is really convenient when working with 2, but it also helps contribute to the lack of a migration from 2 to 3.

Python 2 and 3 are very similar though, and most projects can even run under either 2 or 3 with the exact same codebase. So it takes little (or no) effort to move a project from 3 down to 2 while still in early development, if requirements necessitate it.

[0] One example that comes to mind is a difference in the way str.format() works. In Python 3, strings are unicode text objects; in Py2, they are bytestrings. Some projects used str.format() in Py2 to format binary data rather than text, which breaks when moving to 3. Porting a compatible change to Py3 has been rejected several times because of the complexities in implementation, as well as it being unidiomatic to the language. See:

https://bugs.python.org/issue3982
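
A hedged illustration of the incompatibility described in the footnote above:

    # Python 2: str is a bytestring, so format() "works" on binary data
    >>> '\x00{0}'.format('x')
    '\x00x'

    # Python 3: str is text, and bytes has no format() at all
    >>> b'\x00{0}'.format(b'x')
    AttributeError: 'bytes' object has no attribute 'format'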


There's still a bunch of libraries/frameworks that only work with python 2. If you don't expect to be needing those, python 3 can be safely used.


It depends on your problem domain. If it's Ansible-related for example, you go with 2. If it's web related, 2 is a good default, as those are still the most tested code paths. If it's scientific, go with 3, most of those tools are newer.


> If it's scientific, go with 3, most of those tools are newer.

Which tools? I use python for scientific computing and I can't think of any tool that I use where the python 3 version (if it exists) is newer/better than the python 2 version.


numpy and pandas are both Python 3.

Newer and Better is a totally different question and this is why we have the Python 2 or 3 debate.


Sure, but as far as I know numpy and pandas for python2 and python 3 are exactly the same. I don't think there is a newer python 3 version of either.


I only knew they were in active development and supposed that's where the action is; it's not my area at all. So most scientific use is also with version 2, then. Then I can't think of any problem domain where you should go with version 3 in practice. I would be interested in what the other recommendations here are based on, other than "I like it for my own stuff".


I would definitely recommend learning with 3 and going forward from there.

Python 3's library is growing all the time and in my opinion already has all the libraries or bindings you'll need right now.


If you need unicode (i.e. your app will handle more languages than English, and you'll spend a lot of time on that; hint: GUI!), then 3 is the way to go. Unicode support is so much better...

For the rest, it depends on the libraries you need...


> It has come a LONG way in the last 18 months.

Google actually just started supporting Python 3 last week, and they were one of the last big holdouts.


It may be bold, but why not be a little bold now and then.


I love python, and this is a terrible article, as many others have said here.

You have to understand that these articles are written for a number of reasons:

* The company (here, paypal/ebay) needs to recruit people.

* Often, the language or technology is under siege internally, and these external posts strengthen its position.

* The company wants to get some message out there, but it doesn't actually have any interesting technology of its own, so it goes on and on about some known tech.


Python is crème de la crème.

I was using the Requests library last night to work with some APIs, it's an absolute gem, it's so easy to use. The Python library manager (pip) just works and the idea to isolate your dev environment with virtualenv is fantastic. PEP8 for a universal style guide is underestimated in large projects.


The need to create a separate virtual environment for every piece of software one touches in order to avoid version collisions is not exactly what I'd call fantastic. Somehow other language ecosystems manage to sidestep this problem entirely by simply versioning libraries within a single shared repository. That, is fantastic.


You can do worse than pip and virtualenv, yes, but I wouldn't call pip the cream of the crop by any means. NPM is way easier to use for starters, and Go, Rust and Nim go one step beyond NPM by compiling language dependencies down to a single binary file which can be shipped to users. It's very succinctly done.

Python really needs to step up its game to stay ahead of up-and-coming languages like Nim, which looks like this:

    import rdstdin, strutils, sequtils

    let
      time24 = readLineFromStdin("Enter a 24-hour time: ").split(':').map(parseInt)
      hours24 = time24[0]
      minutes24 = time24[1]
      flights: array[8, tuple[since: int,
                              depart: string,
                              arrive: string]] = [(480, "8:00 a.m.", "10:16 a.m."),
                                                  (583, "9:43 a.m.", "11:52 a.m."),
                                                  (679, "11:19 a.m.", "1:31 p.m."),
                                                  (767, "12:47 p.m.", "3:00 p.m."),
                                                  (840, "2:00 p.m.", "4:08 p.m."),
                                                  (945, "3:45 p.m.", "5:55 p.m."),
                                                  (1140, "7:00 p.m.", "9:20 p.m."),
                                                  (1305, "9:45 p.m.", "11:58 p.m.")]

    proc minutesSinceMidnight(hours: int = hours24, minutes: int = minutes24): int =
      hours * 60 + minutes

    proc cmpFlights(m = minutesSinceMidnight()): seq[int] =
      result = newSeq[int](flights.len)
      for i in 0 .. <flights.len:
        result[i] = abs(m - flights[i].since)

    proc getClosest(): int =
      for k,v in cmpFlights():
        if v == cmpFlights().min: return k

    echo "Closest departure time is ", flights[getClosest()].depart,
      ", arriving at ", flights[getClosest()].arrive
And performs like this:

    Lang    Time [ms]  Memory [KB]  Compile Time [ms]  Compressed Code [B]
    Nim          1400         1460                893                  486
    C++          1478         2717                774                  728
    D            1518         2388               1614                  669
    Rust         1623         2632               6735                  934
    Java         1874        24428                812                  778
    OCaml        2384         4496                125                  782
    Go           3116         1664                596                  618
    Haskell      3329         5268               3002                 1091
    LuaJit       3857         2368                  -                  519
    Lisp         8219        15876               1043                 1007
    Racket       8503       130284              24793                  741
http://goran.krampe.se/2014/10/20/i-missed-nim/


> Go, Rust and Nim go one step beyond NPM by compiling language dependencies down to a single binary file

And the person in the security hat now says: so how do you deal with library upgrades? If you need to go back to the original app developers to get a new version just to update one library, then you've got a problem.


Rust gives you the option to dynamically link, and I expect Nim does as well. As for Go, I believe dynamic linking is somewhere on their roadmap, though I don't know how high of a priority it is.


>> Python has great concurrency primitives, including generators, greenlets, Deferreds, and futures.

It's a controversial statement.

Generators, greenlets, deferreds, and futures are definitely not great concurrency primitives. They're okay for typical "python" tasks, but many applications need more powerful solutions, like Go's channels, for example.


Go's channels aren't a concurrency primitive, they are a concurrency abstraction. Goroutines are the primitive. I also don't think it would be hard to implement Go channel semantics in Python.
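
A rough sketch of what the parent means; Queue supplies the blocking send/receive, though a true unbuffered, select-able Go channel would take more work:

    import threading
    from queue import Queue  # the module is named 'Queue' on Python 2

    def worker(ch):
        ch.put('hello from another thread')  # roughly: ch <- v

    ch = Queue(maxsize=1)
    threading.Thread(target=worker, args=(ch,)).start()
    print(ch.get())  # roughly: <-ch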


In my opinion, the difference between abstractions and primitives lies in the fact that an abstraction cannot be instantiated.

Channels can be instantiated, goroutines cannot.

Correct me if I'm wrong please.


Your definition seems to have its roots tightly coupled to abstraction vs primitive in the sense of Object-oriented programming. Your use of the word "instantiation" makes me feel that way. Let me know if that is misguided.

In the world of concurrency you could probably even call goroutines an abstraction but goroutines are at least closer to the fundamental concurrency primitives such as threads, locks, mutexes and tasks/coroutines/whatever.


Mahmoud makes a spirited defense of Python with ten general themes, but the most important thing he had to tell us about PayPal's experience with Python was this:

"Our most common success story starts with a Java or C++ project slated to take a team of 3-5 developers somewhere between 2-6 months, and ends with a single motivated developer completing the project in 2-6 weeks (or hours, for that matter)."


I don't understand why the GIL gets so much derision from the cool kids, where node.js remains trendy.


I expect that "cool kids" aren't actually a homogeneous group, and the subset of them deriding Python for the GIL are disjoint, or nearly so, from the subset making node.js trendy.

Or, they aren't actually deriding Python for the GIL, they are noting that, given the GIL, you need to use an evented approach, so you might as well use a platform designed for that as its central model, rather than one that's designed around the threaded model but without the ability to use it effectively.


Wouldn't you expect the cool kids to all be using Twisted then?

I get the impression that most web devs wants to keep up with the latest tech, and by most counts Python is old now.


> Wouldn't use expect the cool kids to all be using Twisted then?

No, because Python with Twisted isn't the same as a platform built for the evented model from the ground up; it's a library for a traditionally threaded platform, built to handle the evented model.

> I get the impression that most web devs wants to keep up with the latest tech, and by most counts Python is old now.

I don't think the desire of devs to be working in something that they perceive to be in demand and growing moreso is restricted to web devs.


I think a lot of the 'trendy' GIL hatred comes from that one blog post by Zed Shaw where he moaned about python 3 not having anything really great in it.


Because their parents used Python.


This article overshoots. Yes, Python is slow, and yes, all four runtimes cited are slow. As I've said before, I no longer believe the idea that languages don't have performance characteristics, only runtimes do, and as it happens, the decades-long efforts to speed up Python and their general failure to get much past 10-20x slower than C are a big part of why I believe that. NumPy being fast doesn't make Python fast; it's essentially a binding. A great binding that, if it meets your use case, means you can do great work in Python, but does nothing to help you if you don't need that stuff.

As for PyPy's "faster than C" performance, people really really really need to stop believing anything about a JIT running a single tight loop that exercises one f'ing line of code! Follow that link and tell me if your code even remotely resembles that code. In practice, PyPy is, I believe, "faster than CPython" with a lot of caveats still, but "faster than CPython" isn't a very high bar.

(Similarly, though another topic, Javascript is not a "fast language". People seem to believe this partially because of JIT demonstrations in which integers being summed in a tight loop runs at C speed. But this is easy mode for a JIT, the base level of functionality you expect from one, not proof that the whole language can be run at C speeds.)

There is no version of Python that will run you at anything like C or C++ or Java or Go or LuaJIT speeds on general code. It can't; for one thing you have no choice but to write cache-incoherent code in Python, to say nothing of the numerous other problems preventing Python from going fast. (Copious hash lookups despite the various optimizations, excessive dynamicness requiring many things to be continuously looked up by the interpreter or verified by the JIT, etc.)

I've drilled down on this one, but there are several other "debunkings" here that are equally questionable. Python does not have a "great" concurrency story... it has a collection of hacks of varying quality (some really quite good, though; gevent is awesome) that get you around various issues, at the cost of some other tradeoff. The definition of "strongly typed" that Python conforms to is almost useless, because everything is a "strongly typed" language by that definition. In practice, it's a dynamically typed language, and yes, that can cause problems. Another collection of hacks is available to get around that, but they're add-ons, which means the standard library and other libraries won't use or support them. Yes, Python is a scripting language; it's just that it turns out "scripting languages" are a great deal more powerful than was initially conceived.

Wow, I must hate Python, huh? Nope. It's a fantastic language, certainly in my top 3, and still probably underutilized and underrespected despite its general acceptance. When I hear I get to work with it, I generally breathe a sigh of relief! It is suitable for a wide variety of tasks and should certainly be in consideration for a wide variety of tasks you may have to solve.

But, it is always bad advocacy to gloss over the problems a language has, and all languages have problems, since no one language can solve all problems perfectly. If you are doing something really performance sensitive, stay away from Python. If you've got something highly concurrent, think twice... your problem needs to comfortably fit on one CPU using one of the existing solutions, or there's hardly any reason to prefer Python. (Yes, Python can sort of use more than one CPU, but if you're going to do that you'll probably be happier elsewhere. "Just throw lots of processors at the problem" isn't generally a good solution when you're starting from a language that can easily be ~50-100x slower than the competition... that's still an expensive answer, even today.) Yes, it is dynamically typed, and there are situations where that is OK and situations where it is contraindicated. In the long term, you don't help a language by trying to minimize the problems... you help by making it clear exactly what it is good for, when you should use it, and when you shouldn't. Otherwise, you convince a hapless programmer to pick up your solution, they pour a year or two into discovering it doesn't actually do what you said it did, and now you've made an enemy for your language. Better for them to never pick it up because you truthfully told them it wasn't suitable.

That said, though, be sure your task is performance sensitive before just writing Python off... modern machines are really hard to understand the performance of and most people's intuitions are pretty bad nowadays. Dynamic typing has its problems, but so does static. Etc. etc. No easy answers, alas.


The need to be 10-20x faster than CPython is a niche. The General User in General Case just does not care.

I've had projects fail because of not getting done or being buggy, but never for not being fast enough.

Finally, slowness, until you get really low-level, is a relatively easy problem with many solutions.


To be clear, in general, I agree. However...

"Finally, slowness, until you get really low-down is relatively easy problem with many solutions."

There is a barrier that you can hit in Python/Perl/Ruby/Javascript where you're trying to do something, you've optimized the Python/etc. to within an inch of its life, and it's still just too slow. I've hit it twice now in pretty serious ways. Once you've removed all the slowness-that-has-easy-solutions, you're still using a very slow language... the 50-100x number I cite is with the slowness already removed for optimal code, though, to be fair, this is in comparison to fairly optimal C/C++ as well. Well-written Python can be competitive with poorly-written C, and that is also not even slightly a joke, since it's generally easier to get to the well-written Python. But you can still run out of juice on a modern machine.

But ultimately this is just something you want to know and understand, and not be too bedazzled by claims that everything's hunky dory in every way.


In my opinion, the definitions of 'strongly typed language' and 'dynamically typed language' are orthogonal, and Python is both. You can't say that Python isn't strongly typed because it's dynamically typed. For example, C is statically typed, but weakly typed at the same time.

I agree on the rest, though.

Edit: since a lot of people here are arguing about what 'strong typing' means, I take it from what I learned at college: it means that, apart from typical conversions (like int -> float), the language doesn't do many automatic conversions for you.


In Python world deployment is still not as easy as dropping a jar in a container. Right?


It's not. Some people are solving this problem with https://github.com/conda/conda. There are many ways to use conda for deployment. My preferred approach is the following:

For builds:

- build a conda environment for your project

- export your conda environment using conda list -e, then take all the conda packages for those and put them into a single tarball

For deployment:

- bootstrap a base python environment using miniconda: http://conda.pydata.org/miniconda.html

- install that tarball of packages with conda install <mytarball.tar>

It's not as simple as a jar, but it's reasonably close


It's easier: just drop an egg/deb/rpm/msi into your pythonpath.


Sorta. It's not that easy in practice, although not terribly hard, since you can package up your Python project with setuptools, deploy it to an internal (or the actual) PyPI, and follow up by using distribution packages for a WSGI server like gunicorn, mod_wsgi, etc. In reality, you typically care about third-party modules enough, and building rpms/debs is typically not fun enough, that you just stick with the normal pip/virtualenv story.

There is a solution from Twitter that my devops team has been flirting with called PEX[1]. It builds all of your dependencies into a zip file similar to a jar and sets it up to work by just putting it on your pythonpath. This would in practice be very similar to an uberjar.

[1] - https://pantsbuild.github.io/pex_design.html


Actually, building rpm/deb is quite easy: setuptools handles bdist_rpm by itself, and for bdist_deb there is a plugin. For Windows folks, bdist_msi is part of setuptools by default[1]. As a backup plan, you can still build eggs with bdist_egg and let pip handle the package manager duties for you.

The real complexity comes elsewhere: in both the Java and Python worlds, setting up the Java application server or the WSGI server for Python is more involved than just dropping an app there. And then comes debugging the exceptions... there I would pretty much prefer the Python world.

Also, be careful with zipping python projects. While .zip is a valid member of pythonpath, packages can have problems with finding their assets (if they have any). For example, you cannot zip django this way.

[1] It even handles setting up vcvars when building native extensions. I was impressed, it was easiest building of windows binaries for free software I've ever seen.


Except that bdist_rpm only builds the current package. That's not really any different from pip. You'd need to recursively build packages for all of the dependencies, and have some magic to autodetect the provides/requires inside your little package ecosystem, for it to be reasonable for any non-trivial app to be deployable via rpm. This isn't impossible, but it's a far cry from PEX or an uberjar.


I'm the project lead of Crowdcrafting (powered by PyBossa) with 21.7k lines of code written in Python.

Our stack in Crowdcrafting is fairly simple in terms of hardware (we have two servers with 2 GB of RAM each and only 2 cores, while the DBs have 4 GB of RAM with 4 cores). You can read more about it here: http://daniellombrana.es/blog/2015/02/10/infrastructure.html

In my experience the problems that we've had are always related to how we develop, not the technologies. One of our ways of solving problems is to avoid increasing the hardware resources, as that will "hide" the real issues in your software. Thanks to this approach we were able to store more than 1.5 records per second over a period of 24 hours in 2014, with a DB running on less than 1GB of RAM and our servers with 2GB. It worked really well! (Actually, after that our provider called to ask us to spend more on our hardware.)

Our platform is designed to scale horizontally without problems. We use a very simple set of technologies that are widely used, and we try to use very simple libraries all the time. This has proven to us to be very efficient and we've managed to scale without problems.

We heavily test and analyze our software. This is mandatory. We have almost 1000 tests written, covering 97% of our code base. We also analyze the quality of our code using a third-party service, where we have a score of 93%; this helps people send patches and helps new developers join the platform.

Right now our servers are responding in less than 50ms on average, and we are not even using Varnish (check out this blog post: http://daniellombrana.es/blog/2015/03/05/uwsgi.html). Thus, yes, Python is a good language, it can scale, and if you have a problem it will usually be in your own code. Hence: debug it!

Cheers,

Daniel


As it pertains to "Enterprise" and Python, I think the most interesting project is OpenStack. I don't use OpenStack but it has been interesting that in conversations with Enterprise devs/managers OpenStack has come up several times. It turns out Enterprise is still really skeptical of the public cloud and are going through the efforts of running their own IaaS/PaaS internally. In my experience OpenStack seems to be coming up much more frequently than CloudFoundry.

And for the record, by Enterprise I'm not talking Silicon Valley Enterprises. These are strictly non-tech Enterprises that traditionally view anything software related as a cost center.


The section on concurrency is a hilarious joke that deftly dodges the obvious drawback of Python's concurrency model while peppering the reader with distracting solutions built on top of multiple processes or single threads. What is missing here? True, multi-threaded (at the OS level) concurrency.

We're engineers here so let's be real and recognize that different tools are suited to different tasks. Limiting oneself to concurrency without OS threads is, to be polite, not necessary. Obviously you CAN build pretty much anything you want with pretty much any tool, as the examples in this article show, but that doesn't mean you should.

(written by someone who writes Python daily)


I'd also add to this that the government is pretty heavily invested in Python (at least NASA, DoD, and DoE, based on direct experience). It's kind of becoming the scientist's and engineer's new Fortran.


It's a matter of ongoing surprise to see many scientist-engineers in my niche, in the USAF-sphere, who are jumping from Matlab to Julia, leapfrogging Python. Kudos to them.


Julia is pretty cool; I made a point to experiment with it a bit last year. What we work on is pretty heavily object-oriented, and at the time it seemed to me that the type system was a bit too awkward to use for our purposes. I'll probably look at it again this year (probably through a Jupyter/IPython notebook).


I really hope Julia succeeds. I will keep working on Octave as the springboard that they can take from Matlab to Julia.

My hope is for some day Matlab to be as obsolete as Octave itself.


I've found Python to be a rolling disaster and a source of technical debt for any source code which starts to get beyond a single file or 150 lines. (5 years of professional Python experience.) I hope to transition somehow to writing Java or another language with better design framing.

edit: (I know a boatload of different languages - Java is the most comparable and least shocking for developers who aren't FP nerds. Go is inadequately expressive.).


> and while it has excellent modularity and packaging characteristics

If only... In actuality, the majority of python's package management is being redone by pypa, and it's not complete.

That's been my biggest problem with Python: the circuitous path to deploying the Right Way in an enterprise, where one doesn't simply publish to PyPI.

It'll get there, but to throw around the word "excellent" hides a lot of the current pain.


Does anyone else get a consistent ssl error visiting that site?:

    Secure Connection Failed
    An error occurred during a connection to www.paypal-engineering.com. Cannot communicate securely with peer: no  common encryption algorithm(s). (Error code: ssl_error_no_cypher_overlap)


This is such a troll. This article mixes valid points with obvious inaccuracies.

Ok, Python is a fine language, even for building large-scale systems. No, it is not free from drawbacks.

#1: Ok, Python is not new and is mature.

#2: "Compiled" is almost a meaningless word. The general definition of a compiler is a program that transforms a document written in a source language into a document written in a destination language. TeX is a compiler; a CSV-to-XML converter is a compiler. As regards security/reverse engineering, Python and Java bytecode files can be very easily decompiled into the original program. That is not the case for C/C++, and thus reverse engineering is much more difficult. Anyway, this issue can be mitigated using obfuscation. So ok for this myth.

#3: Again speaking about "security". I don't see why Python would be more "secure" than Java. I can see why it would be more secure than C/C++ (it runs on a virtual machine, so fewer buffer overflows, bounds checking, etc.).

#5: Yes, a very valid point for Python. And then you start comparing the JVM (a platform, a runtime, a virtual machine) with Python (a language). This is like comparing apples to oranges. And no, Java is not dynamically typed. Python is.

#6: This is the most wrong point of your article. Python is slow, very slow. Excuse me: CPython, the runtime that virtually everyone uses, is very slow (as are all non-JIT virtual machines), usually 5-10x slower than a native program and 2-4x slower than a JVM program. You can't completely decorrelate the language and the platform. Speaking about Java performance means discussing the JVM (although Java can be compiled). Speaking about Python means discussing CPython.

Performance-wise, Jython is not much better than (and sometimes a lot worse than) CPython, and they only implement Python 2.5. PyPy does an excellent job of optimizing Python and often yields on-par performance with the JVM. However, it is incompatible with C extensions/modules written for CPython, which means it can rarely be used in production (goodbye databases, etc.)

There are countless stories on HN of startup developers rewriting their systems from Python/Ruby to Java/Go because performance was an issue.

#7: I think some of your examples are contrived (e.g. YouTube), as we don't know exactly where Python is used in the infrastructure. I only use Python to draw graphs; does that mean my infrastructure relies on it?

About GC pauses. Yeah, there are no pauses in Python because the GIL allows only one thread to run concurrently.

#8: The GIL is a performance optimization? Really? The GIL is there because it eased the development of the CPython VM, but it is terrible for performance. Because of the GIL, CPython cannot run more than one thread concurrently. In the era of multicore processors, this is terrible for performance. Yes, there are workarounds for I/O operations (greenlets etc.), but for processing-intensive tasks it is impossible to exploit a multicore processor with Python. And no, multiprocessing is not a solution, as it does not allow sharing memory between processes. Also remember that Python is often 4x slower than Java on a single core...

#9: True, but this looks like a strawman. Nobody ever said Python programmers are scarce.

#10: I am not having this debate again, but a strong type system is useful for big projects. I'm not saying it's mandatory, but it reduces development time.


> And no, multiprocessing is not a solution, as it does not allow sharing memory between processes.

You actually can. multiprocessing defines a few classes (e.g. Array) that store their data in a mmap kept in the module. Data there will be visible to all processes, and you can even have a numpy array as a view to this data.
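
A minimal sketch of that pattern (from memory, so treat the details as approximate):

    import multiprocessing as mp
    import numpy as np

    def double_in_place(shared):
        arr = np.frombuffer(shared.get_obj())  # numpy view, no copy
        arr *= 2

    if __name__ == '__main__':
        shared = mp.Array('d', 8)  # 8 doubles backed by shared memory
        np.frombuffer(shared.get_obj())[:] = np.arange(8)
        p = mp.Process(target=double_in_place, args=(shared,))
        p.start()
        p.join()
        print(np.frombuffer(shared.get_obj()))  # parent sees child's writes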

That said, my attempts at working with this recently have been pretty painful. multiprocessing tries, but at least in Python 2.7 does not succeed in abstracting away the differences between Unix forks and Windows spawns. This results in a lot of weird issues: things running differently in command line vs. console IPython vs. IPython notebook; the need to structure your code to avoid pickling errors; etc. This is probably the biggest portability issue I've encountered with Python thus far (using it mostly for scientific problems.)

I'm aware that there are solutions to some of these issues out there, but it's too bad that the standard library implementation has these issues.


The mmap solution, along with various others, is a ridiculous workaround for folks dead set on jamming a square peg into a round, single-threaded hole. If a technology doesn't natively support something that you need, don't rely on lame workarounds; use a more appropriate technology.


I predict this will become a reference link for pro-Python arguments in the same way that insufferable 'fractal' article has become a boondoggle whenever PHP is mentioned.


Eh. As a Python fanboy, I find a lot of his points pretty weak. His examples of Python being fast include libraries which call into non-Python code, and a non-standard implementation.

Besides, there are way better links for pro-Python arguments. I would point to Norvig's spell-checker in 21 lines of Python for example (http://norvig.com/spell-correct.html).


Not sure about trying to predict the future, but it sure reads the same way.


When I read the title, I thought it actually referred to an enterprise version of Python, something along the lines of ActivePython, where certain versions of the language and packages are supported. It's strange that the post doesn't mention versioning or packaging at all. This is hugely important for enterprise software. Hopefully the work Nick Coghlan is doing around packaging will address this to some degree.


Guido has been developing the idea of gradual typing in Python and wants to introduce it in Python 3.5. I think this will be a great mix of both dynamic and static type checking.

https://www.python.org/dev/peps/pep-0484/ http://baypiggies.net/


Picking out BAML and JPM as examples of successful adoption at scale has got to be a joke!

Giving a large number of "enterprise" developers a language like python means that you end up with ~10m lines of poorly designed, un-pythonic mess.

My personal opinion is that using Python for any kind of large-scale infrastructure is a very bad idea.


I recently heard someone advise a friend to learn Python 2.7 instead of Python 3. As someone not in the industry, I found this surprising. Are companies/people still writing new code in Python 2?


This is also what I heard as of a year or two ago, but I think it's becoming a progressively more outdated piece of advice as the world of Python 3 libraries catches up with 2. But certainly, Python 2.7 is still being written widely, and will remain so for a while.


I don't think the issue is new code, so much as it's all the old code that needs supporting.

Also, library support still isn't 100% there. So, if one critical piece of your stack isn't on 3.x, you've got to throw the whole thing out and stick with 2.x.


The biggest issue is that there's no really compelling reason to migrate away from Python 2 right now. This isn't meant as an attack on Python 3; Python 2.7 is still well supported, and is plenty good enough for a lot of us.

That said, if I were starting out now, I'd start with Python 3.


Almost every finance shop in NYC that is using Python is using Python 2.7. And yes, for new code as well.


Think of Python 2 and Python 3 as different languages, with the first having a larger ecosystem and slightly better performance in some common workloads on the main implementation.


> Python is in fact compiled to bytecode

When we talk about compiled languages, we usually mean "compiled to machine code". It's wrong to put a non-JITed "compiled to bytecode" language in the same category as C/C++/Haskell/Rust/Nim/etc.
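You can watch the bytecode half of this from the REPL; what dis prints is stack-machine instructions for the CPython VM, a very different artifact from native machine code:

    import dis

    def add(a, b):
        return a + b

    dis.dis(add)
    # Output is along the lines of:
    #   LOAD_FAST    a
    #   LOAD_FAST    b
    #   BINARY_ADD
    #   RETURN_VALUE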

> Furthermore, CPython addresses these issues by being a simple, stable, and easily-auditable virtual machine.

http://www.cvedetails.com/vulnerability-list/vendor_id-10210...

> Each runtime has its own performance characteristics, and none of them are slow per se.

Unfortunately, they are all slow when compared to the most efficient language/compiler available on the same platform, and no hand-waving will fix this. Nor will it fix the small and not-so-small differences that render non-CPython implementations unusable in practice. The only valid argument remaining is that not all programs are CPU-bound, but how many Python programmers are willing to say "I can only write non-CPU-bound software in this language"?

> Having cleared that up, here is a small selection of cases where Python has offered significant performance advantages:

> Using NumPy as an interface to Intel’s MKL SIMD

The fast parts of NumPy are written in C. Python doesn't get to claim any performance advantage here.
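Easy to verify with a rough (illustrative) timing; the gap below comes from NumPy's C/MKL code paths, not from the interpreter:

    import timeit

    setup = 'import numpy as np; xs = list(range(10**6)); a = np.arange(10**6)'
    print(timeit.timeit('sum(x * x for x in xs)', setup, number=10))  # pure Python
    print(timeit.timeit('np.dot(a, a)', setup, number=10))            # dispatches to C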

> PyPy‘s JIT compilation achieves faster-than-C performance

It doesn't; they're just comparing different algorithms.

> Disqus scales from 250 to 500 million users on the same 100 boxes

Yes, thanks to Varnish, which caches those slow Django requests, according to the linked article. Can you guess what language Varnish is written in? In the meantime, Disqus has also moved its slow-as-hell services to Go: http://highscalability.com/blog/2014/5/7/update-on-disqus-it... (a case where a shitty compiler that produces machine code beats a bytecode VM by so much it isn't even funny).

> It would be easy to get side-tracked into the wide world of high-performance Python

This depends on what you're smoking.

> One would be hard pressed to find Python programmers concerned about garbage collection pauses or application startup time.

Not garbage collection pauses, but on more than one occasion, when doing data migrations with the processing done in Python, I had to call gc.collect() manually every N iterations to keep memory usage under control.
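i.e., a pattern along these lines (migrate() and the data source are stand-ins):

    import gc

    def migrate(record):
        pass  # stand-in for the real per-record processing

    records = range(10**6)  # stand-in data source
    BATCH = 10000

    for i, record in enumerate(records):
        migrate(record)
        if i % BATCH == 0:
            # Reference cycles otherwise accumulate between automatic
            # collections; forcing a pass caps peak memory usage.
            gc.collect()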

> With strong platform and networking support, Python naturally lends itself to smart horizontal scalability, as manifested in systems like BitTorrent.

The BitTorrent library of choice nowadays is http://www.libtorrent.org/ - written in C++.

> The GIL makes it much easier to use OS threads or green threads (greenlets usually), and does not affect using multiple processes.

The devil is in the details. OS threads can only run in parallel if at most one of them is executing Python bytecode (the rest must be blocked in C code that releases the GIL). To parallelize multiple instances of Python code, you need to start multiple instances of the Python interpreter. Or you can pretend you don't really need parallelism after all and cycle green threads on the same core while bragging about millions of requests per... day.
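A rough sketch of the difference, with burn() standing in for CPU-bound work (timings are indicative, not a benchmark):

    import multiprocessing
    import threading
    import time

    def burn(n):
        while n:
            n -= 1

    if __name__ == '__main__':
        count = 10**7

        start = time.time()
        threads = [threading.Thread(target=burn, args=(count,)) for _ in range(2)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print('threads:  ', time.time() - start)  # GIL serializes the bytecode

        start = time.time()
        pool = multiprocessing.Pool(2)
        pool.map(burn, [count, count])
        print('processes:', time.time() - start)  # two interpreters, two cores
        pool.close()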

> Here’s hoping that this post manages to extinguish a flame war [...]

:-)



