For those who don't know the name, Tim Peters is the author of "The Zen of Python". He is the one who uniquely captured what Python is all about with this inspirational little poem:
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
The praise for Mr. Peters means that much more coming from you. Don't sell yourself short; your PyCon lectures have done an immense service along the same lines of "capturing what Python is all about", and Python would feel like a very different language without functional-programming builtins like `any` and `all`, not to mention the `itertools` standard library (among the many other contributions you list in your profile).
The reason Guido didn't want 3.6 to guarantee dict ordering was to protect 3.5 projects from mysteriously failing when using code that implicitly relied on 3.6 behaviors (for example, cutting and pasting a snippet from StackOverflow).
He thought that one cycle of "no ordering assumptions" would give a smoother transition. All 3.6 implementations would have dict ordering, but it was safer to not have people rely on it right away.
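A quick illustration of the behavior in question. On 3.6, insertion ordering was an implementation detail of the new compact dict; 3.7 promoted it to a language guarantee:

```python
# Dicts preserve insertion order: guaranteed on 3.7+,
# true in practice (but not guaranteed) on 3.6.
d = {}
d["b"] = 1
d["a"] = 2
d["c"] = 3
print(list(d))  # ['b', 'a', 'c']
```

Code relying on that ordering would run fine on 3.6 but break mysteriously if pasted into a 3.5 project, which is exactly the transition hazard described above.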
# variable A has three causes: C1,C2,C3
C1 <- rnorm(100)
C2 <- rnorm(100)
C3 <- rnorm(100)
A <- ifelse(C1 + C2 + C3 > 1, 1, 0)
cor(A, C1)
cor(A, C2)
cor(A, C3)
# If we set the values of A ourselves...
A <- sample(c(1,0), 100, replace=TRUE)
# then A no longer has correlation with its natural causes
cor(A, C1)
cor(A, C2)
cor(A, C3)
We should probably get rid of that. It is old (predating numpy) and has limited functionality. In almost every case I can think of, you would be better off with numpy.
If you don't want to add a dep on numpy (which is a big complex module) then it's nice to have a stdlib option. So there are certainly at least some cases where you're not better off with numpy.
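Presumably the module under discussion is the stdlib `array` module; a minimal sketch of what it already gives you without any third-party dependency:

```python
from array import array

# Compact typed storage: "d" means C double, so each element
# takes 8 bytes rather than a full Python float object.
a = array("d", [1.0, 2.0, 3.0])
print(a.itemsize)  # 8
print(sum(a))      # 6.0 -- arrays support the normal sequence protocol
```

It stores numbers compactly and supports the usual sequence operations, but has no vectorized arithmetic, which is the gap the rest of this thread is about.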
Even better if Python adds a mainline pandas/numpy-like C-based table structure, with a very small subset of the pandas/numpy functionality, that's also convertible to pandas/numpy/etc.
What kind of subset would you have in mind? I think that any kind of numeric operation would be off the table, for the reasons given in PEP 465:
"Providing a quality implementation of matrix multiplication is highly non-trivial. Naive nested loop implementations are very slow and shipping such an implementation in CPython would just create a trap for users. But the alternative – providing a modern, competitive matrix multiply – would require that CPython link to a BLAS library, which brings a set of new complications. In particular, several popular BLAS libraries (including the one that ships by default on OS X) currently break the use of multiprocessing."
Numpy is incredibly widespread and basically a standard, so I would propose: it should have exactly the same layout in memory as a numpy array. It's fine if it has a very limited set of operations out of the box, maybe something like get, set, and elementwise arithmetic. Then work with the numpy project to make it possible to cast it to a numpy array, to help the common case where someone is fine with a dependency on numpy and wants the full set of numpy operations.
That's the best they can do without BLAS. It doesn't have to be as fast as numpy, just faster and more memory efficient than doing it in native Python, without the dependency.
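Part of the proposed interop already exists today via the buffer protocol: numpy can wrap a stdlib `array` zero-copy (e.g. with `np.frombuffer`), and `memoryview` demonstrates the same shared-memory mechanism using only the stdlib:

```python
from array import array

buf = array("d", [1.0, 2.0, 3.0])
m = memoryview(buf)  # zero-copy view via the buffer protocol --
                     # the same mechanism numpy uses to wrap stdlib arrays
m[0] = 10.0          # writes through to the underlying storage
print(buf[0])        # 10.0
```

The missing piece, as the thread notes, is fast built-in arithmetic over that shared representation.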
This is an impressive post showing some nice investigative work that isolates a pain point and produces a performant work-around.
However, the conclusion is debatable. Not everyone has this problem. Not everyone would benefit from the same solution.
Sure, if your data can be loaded, manipulated, and summarized outside of Python land, then lazy object creation is a good way to go. But then you're giving up all of the Python tooling that likely drove you to Python in the first place.
Most of the Python ecosystem from sets and dicts to the standard library is focused on manipulating native Python objects. While the syntax supports method calls to data encapsulated elsewhere, it can be costly to constantly "box and unbox" data to move back and forth between the two worlds.
First off, thank you for all your contributions to Python!
I completely take your point that there are many places where this approach won't fit. It was a surprise for me to trace the performance issue to allocations and GC, specifically because it is rare.
WRT boxing and unboxing, I'd imagine it depends on access patterns primarily - given I was extracting a small portion of data from the AST only once each, it was a good fit. But I can imagine that the boxing and unboxing could be a net loss for more read-heavy use cases.
You could create a custom C type that wrapped an arbitrary AST node and dynamically created values for attributes when you accessed them. The values would also be wrappers around the next AST node, and they could generate new AST nodes on writes. Python objects would be created on traversal, but each one would be smaller. It wouldn't use Python lists to handle repeated fields. It seems like a non-trivial implementation, but not fundamentally hard.
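A pure-Python analogue of that design (not the proposed C type, just a sketch of the access pattern; `LazyNode` is a made-up name): wrappers are only materialized when an attribute is actually accessed.

```python
import ast

class LazyNode:
    """Hypothetical sketch: wrap an AST node, creating child
    wrappers lazily on attribute access."""
    __slots__ = ("_node",)

    def __init__(self, node):
        self._node = node

    def __getattr__(self, name):
        value = getattr(self._node, name)
        if isinstance(value, ast.AST):
            return LazyNode(value)          # wrap child nodes on demand
        if isinstance(value, list):
            return [LazyNode(v) if isinstance(v, ast.AST) else v
                    for v in value]
        return value

tree = LazyNode(ast.parse("import os"))
first = tree.body[0]                 # wrapper created only on access
print(type(first._node).__name__)    # Import
```

The C version would be cheaper still (no per-access Python list, smaller objects), but the traversal-time creation it describes looks like this.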
The analogy with numpy doesn’t seem quite right, as Raymond observes, because numpy depends on lots of builtin operations that operate on the underlying data representation. We don’t have any such code for the AST. You’ll still want to write Python code to traverse, inspect, and modify the AST.
Very fair points. For general purpose ASTs from Python your design should be more efficient while essentially keeping the existing interface.
When I referenced numpy, I was thinking of a query layer which could push traversal into the extension as well. Something that could have given me “.select(ast.Import).all()”, which in my head is kind of like doing a filtered sum in numpy.
Very cool to get your thoughts on this, thanks for making an account :)
>However, the conclusion is debatable. Not everyone has this problem. Not everyone would benefit from the same solution.
Everyone would benefit from developers being more performance-minded and not doing unnecessary work, though! Especially Python, which has long suffered from performance issues.
You can focus on multiple things, you know. There is some low-hanging fruit in Python for performance in certain circumstances (mostly hot loops, at least, in my experiments). For example, if you need to extract a string from a datetime object, doing so manually with f-strings is about 20% faster than strftime. If you use the string mini-format language instead, it’s 40% faster.
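The three approaches mentioned produce identical output; the relative speedups vary by Python version, so the percentages above are best treated as ballpark:

```python
from datetime import datetime

dt = datetime(2024, 5, 17, 9, 30)

s1 = dt.strftime("%Y-%m-%d %H:%M")  # strftime
s2 = f"{dt:%Y-%m-%d %H:%M}"         # f-string with the format mini-language
s3 = (f"{dt.year}-{dt.month:02}-{dt.day:02} "
      f"{dt.hour:02}:{dt.minute:02}")  # manual f-string

print(s1 == s2 == s3)  # True
```

For hot loops, wrapping each variant in `timeit` will show which wins on your interpreter.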
What you’re describing is myopia. Focusing purely on performance at the expense of anything else would probably result in highly unreadable code, yes. Being aware of and caring about performance, and choosing to prioritize it when reasonable is not the same thing.
You can, but each added focus degrades the quality of the others.
The key principle is thinking with a mindset of cost.
Even if it’s low-hanging fruit, there’s a world of difference between assuming we can work it in, and saying, “this is what it will cost, and this is what we won’t work on as a result”.
And similarly, saying ”that’s impossible” is not in the same universe as “the cost is extremely high and it’s not important enough to us to pay that cost”.
IME from the perspective of an SRE / DBRE, performance is nearly always given up (if it’s ever even considered) in favor of making things easier, and this tends to have large consequences later on.
People seem to have taken the quote, “premature optimization is the root of all evil” to mean “don’t focus on performance until you absolutely have to,” but when you push them to prove that they don’t have to, they often haven’t even profiled their code! Cloud compute – and especially K8s – has made it such that it’s expected that you simply over-provision and scale as necessary to get the throughput you need. I don’t personally see running through a profiler (ideally as part of CI) as being particularly difficult or onerous.
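A minimal sketch of the kind of profiling pass being argued for, using only the stdlib; `hot_path` is a stand-in for whatever entry point you actually care about:

```python
import cProfile
import io
import pstats

def hot_path():
    # Stand-in workload; replace with your own entry point.
    return sum(i * i for i in range(10_000))

profiler = cProfile.Profile()
profiler.enable()
hot_path()
profiler.disable()

# Report the top functions by cumulative time.
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

A few lines like this in a CI job is enough to catch "we never profiled it" before it becomes an over-provisioning bill.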
We've had to contact the IRS a number of times. It was always difficult to get through. However once we did get through, the representatives were polite, professional, and really seemed to care about achieving case resolution.
Your nice work on the JS itertools port has a todo for a "better tee". This was my fault because the old "rough equivalent" code in the Python docs was too obscure and didn't provide a good emulation.
Here is an update that should be much easier to convert to JS:
def tee(iterable, n=2):
    iterator = iter(iterable)
    shared_link = [None, None]
    return tuple(_tee(iterator, shared_link) for _ in range(n))

def _tee(iterator, link):
    try:
        while True:
            if link[1] is None:
                link[0] = next(iterator)
                link[1] = [None, None]
            value, link = link
            yield value
    except StopIteration:
        return
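For a quick sanity check of the semantics the emulation is meant to match, here is the real `itertools.tee` on the same kind of input:

```python
from itertools import tee

it = iter([1, 2, 3])
a, b = tee(it)
next(a)  # advancing one branch buffers values for the other
la, lb = list(a), list(b)
print(la, lb)  # [2, 3] [1, 2, 3]
```

The shared-link version above should behave identically: each branch walks the same linked list of buffered values, so consuming one branch never loses data for the other.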
I think it's not actually in the training data. The LLM is just using RAG, meaning it gets the top search results for that query and generates text based on them. Kinda like Perplexity, but apparently worse than it.