Python wats (github.com/cosmologicon)
123 points by luu on Oct 10, 2015 | 122 comments



Of these, only += on tuple elements and indexing with floats strike me as WTFs; many of the others are evidence of the author really going out of their way to look for behaviors to nitpick. The vast majority of these are things that Python programmers either will never encounter, or can easily explain from Python "first principles" (e.g. most of the nitpicks on string and sequence behaviors).

I can tell you what I consider WTF worthy - things that I and others have undoubtedly been bitten by because they are ubiquitous. Those are mutable defaults, lack of loop scoping, and "global"/"nonlocal" scoping. Oh, and add the crazy "ascii" I/O encoding default on 2.7 (and the inability to cleanly reset it) to that list.
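For the unfamiliar, a minimal sketch of the mutable-defaults one (my example, not from the repo):

    def append_to(item, target=[]):   # the default list is created once, at def time
        target.append(item)
        return target

    print(append_to(1))  # [1]
    print(append_to(2))  # [1, 2] -- same list object, shared across calls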


Truthiness values are also WTF worthy (although not mentioned above).

    if not x:
        do_something()
where do_something will run if x is None, zero, 0.0, an empty string, or even midnight in some versions of Python.

This has bitten me multiple times. It's a violation of Python's "explicit is better than implicit". It's an example of weak typing in an otherwise strongly typed language. There's really no good reason for it.

IMO, if x isn't a boolean, that line should just throw an exception. If you want to check for x being 0.0 or "", you should do x == "" or x == 0.0.


I completely disagree. Your snippet above is exactly what I do whenever I want to check for falsiness. If I get a None, a 0, 0.0, "" or [], I check for it using "not x", sometimes using it for multiple return values (e.g. if I get a None or a 0 from mydict.get("param"), meaning that the value wasn't there or is 0).

To me, this is the absolutely most reasonable way to check for truthiness or falsiness. If you want to check for equality or identity, use == or is.


As a person who reads your code, I'd like to know which one you were checking for - None or an empty list, for instance - because there is often a semantic difference between them.

E.g. one can mean "value not provided" while the other means "there aren't any of that thing" and the behavior you want will often be different depending upon which one it is.

It's this type of subtle difference that often causes obscure bugs.


If I need different behaviour between "value not provided" and "there aren't any of that thing", I would perform both checks (if x is None:/if not x:).

The idiomatic way to check for an empty list is "if not mylist", but I still enjoy the fact that I can check for "if this is anything other than a thing I care about, do this" with one check.

Not to mention that doing anything else would violate duck typing. You don't care if the thing passed is an empty list, you care if it's falsy. If you had a "reverse()" function that checked for an empty list and raised an exception, you'd get in trouble when someone passed an empty tuple. What you care about is whether the object is iterable or not, not whether it's a list.


>If I need different behaviour between "value not provided" and "there aren't any of that thing", I would perform both checks (if x is None:/if not x:).

They will both be true if x is None. What you probably want is actually:

    if x is None:
and

    if len(x) == 0:


I know they'll both be true, the first one will return (or I'll use elif). You definitely want the falsy comparison, the second object may not have len defined.
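In other words, something like this hypothetical sketch:

    def describe(x):
        if x is None:
            return "value not provided"
        elif not x:               # only reached when x is not None
            return "empty"
        return "has content"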


Your code is brittle; it'll fail when I pass it a generator expression, and for no good reason.


If I'm checking the length of what I think is a list and it turned out to be a generator I WANT it to fail.


PHP does few things right, but it provides an `empty` function for documenting this use case - it will return true for empty strings, null, empty arrays, and unset variables, among others.


Why is null "empty"? That seems worse than being implicit - it's misleading.


Because there is nothing in it?


Why not do `x == False` then? Arguing about a language not being as strongly typed when you're not being as explicit about your own types seems pretty roundabout.


You want `x is False`, because `0 == False`: Python didn't originally have booleans, so `0` and `1` were used for false and true respectively. In order not to break existing code using equality comparisons (rather than implicit truthiness) or type checks, when booleans were introduced `bool` was made a subclass of `int`, with `False` and `True` equal to `0` and `1` respectively.

This is one of the warts which wasn't changed in Python 3.
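A quick illustration of the consequences:

    x = 0
    print(x == False)            # True: bool subclasses int, and False == 0
    print(x is False)            # False: equal values, different objects
    print(isinstance(True, int)) # True
    print(True + True)           # 2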


>Why not do `x == False` then

Because it's no more explicit than "not x" and because it's 5 characters longer.

There's always a trade off to be made between clarity and verbosity.

Judging by the number of bugs I've squashed caused by the developer either not realizing or simply forgetting that ("", 0, 0.0, midnight, False, [], {}, None) all resolve to false, I've come to believe this is one of those times where the clarity should probably take precedence.

You can go to the other extreme by making everything explicitly statically typed which will make your code much more verbose (e.g. see Java). I believe that trade off isn't worth it either, despite the small increase in type safety.


> You can go to the other extreme by making everything explicitly statically typed which will make your code much more verbose (e.g. see Java).

Java is not an example of a good static typing implementation, nor is it representative of the verbosity of modern statically typed languages.


Both behaviors are documented [1][2], and "if not midnight" adds a fair bit of complexity; it's probably a good idea to know exactly what happens when you write that.

[1] https://docs.python.org/2/library/stdtypes.html#truth-value-...

[2] https://docs.python.org/2/library/datetime.html#time-objects


The midnight behavior was found to be too buggy and surprising, with little to no real-world use, so it was removed in more recent Python versions (3.5).


If Python's `if` is weakly typed, then Rust's `for` is weakly typed. Few would argue the latter, so I disagree that this is weak typing.


+= on lists is an annoying one:

    def myfunc(mylist):
        mylist += [1,2,3]
is NOT the same as:

    def myfunc(mylist):
        mylist = mylist + [1,2,3]
Try this on each of those definitions:

    a = [4,5,6]
    myfunc(a)
    print a
The first one mutates 'a', the second only reassigns (and discards) a temporary `mylist`.

Put another way, += is not just syntactic sugar. With lists, it actually mutates the original list.
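A small sketch of the difference, using a second name to observe the mutation:

    lst = [1, 2]
    alias = lst
    lst += [3]          # calls lst.__iadd__([3]), which mutates in place
    print(alias)        # [1, 2, 3] -- the alias sees the change

    lst = lst + [4]     # __add__ builds a brand-new list and rebinds lst
    print(alias)        # [1, 2, 3] -- the alias does not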


It is syntactic sugar for calling mylist.__iadd__ (and rebinding the name to the result). The in-place modification is common to other mutable data structures as well.

I'm not sure, but I think list.__iadd__ is even trickier: when the result still fits into the allocated array, it modifies the list in place; otherwise it returns a newly allocated list and doesn't modify the original. So you cannot rely on __iadd__ modifying a list (or anything else) in place.


> I'm not sure but I think that list.__iadd__ is even more tricky

False; __iadd__ is always in-place for lists.


Which is what every language with a C inheritance does. += has always mutated the original value, I don't know why you expected it to be different.

To me the real wtf is the second, where Python creates a variable with the same name and discards it.


Which is what every language with a C inheritance does. Arguments are passed by value and put into ordinary variables, I don't know why you expected it to be different.


Well no. What would you expect this code to do?

    void increment(int x) {
        x += 1;
    }

    void printit() {
        int y = 0;
        increment(y);
        printf("%d\n", y);
    }
C doesn't have lists so I had to use an int here, but the Python example does the equivalent of that code printing 1, not 0. (But only for lists. That same Python code with ints does what you expect.)


The indexing with a float was on a dict. If you can index with a string, why not a float? If it's hashable, it's good!


Yup. If hash(x) == hash(y), then some_dict[x] and some_dict[y] will return the same value. The int/float is just a convenient red herring on that one, since it's an easy way to get two different-looking keys which hash the same, combined with the double red herring of the initial "can't index" example being a list (which does require integer values for indexing).


Nope. Here I use the trick that hash(i) == i for most ints to quickly find two values with the same hash, and put them in a dict.

    >>> k1 = "foo"
    >>> k2 = hash(k1)
    >>> hash(k1) == hash(k2)
    True
    >>> d = {k1: "v1", k2: "v2"}
    >>> d[k1]
    'v1'
    >>> d[k2]
    'v2'
Now that I've been pedantic, I hope I get this right: in Python, it is equality (==) that defines same-ness for dicts and sets. But if x == y, then you have to ensure that hash(x) == hash(y). Python uses this characteristic to make the initial check for dict and set membership integer comparisons on the hash, but when two items have the same hash, Python goes on to distinguish them on the basis of equality.


Oh, while we're talking about things with predictable hashes, here's a Python WTF that I enjoy:

    class A:
        def __hash__(self):
            return -1

    print(hash(A()))
Prints -2.

For the same reason, if you ever need an easy hash collision, use the hashes of -1 and -2.


I'll explain the reason, BTW: a hash value of -1 isn't valid in CPython because that integer is reserved as a flag value. So any hash method that returns -1 quietly gets its value changed to -2.
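To illustrate (CPython):

    class A:
        def __hash__(self):
            return -1

    print(hash(A()))           # -2: CPython rewrites the reserved -1
    print(hash(-1), hash(-2))  # -2 -2, for the same reason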


> If hash(x) == hash(y), then some_dict[x] and some_dict[y] will return the same value

This was my first thought too, so I checked but

    >>> hash(0)
    0
    >>> hash(0.0)
    0
    >>> hash('')
    0
    >>> {0: 4}['']
    KeyError: ''

Which makes sense, because dict lookup does check for hash collisions, and collisions are resolved by comparing keys with ==: 0 == 0.0, while 0 != ''.


It's more like this (although I'm not sure of the exact implementation):

    def make_key(obj):
        return (obj.__class__.__module__, obj.__class__.__name__, hash(obj))

    make_key(x) == make_key(y)


The += on tuple elements thing kind of looks like a bug in CPython.


It's quite obvious where it comes from though, it first calls __iadd__ on the list and that succeeds, and then it calls __setitem__ on the tuple which fails; however, the first operation already mutated the list.

This has nothing to do with tuples, I think, as one can construct many other examples just like this one.
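For concreteness, the tuple case spelled out (my sketch):

    tup = ([],)
    try:
        tup[0] += [1]   # list.__iadd__ succeeds, then tuple.__setitem__ raises
    except TypeError:
        pass
    print(tup)          # ([1],) -- the error fired, but the list was mutated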


I stand corrected. My initial comment was just a gut reaction to the weirdness of an operation on builtins raising an error but still having side-effects.


Agreed. tup[0] = tup[0] + [1] fails as expected.


The semantics of augmented assignment are (slightly) different than normal assignment:

https://docs.python.org/3.4/reference/simple_stmts.html#gram...


3 different kinds of NaNs is pretty rad too. No idea what's happening here:

   >>> x = 0*1e400
   >>> set({x, x, float(x), float(x), 0*1e400, 0*1e400})
   set([nan, nan, nan])
   >>> set({x, float(x), 0*1e400, 0*1e400})
   set([nan, nan, nan])
   >>> set({x, float(x), 0*1e400})
   set([nan, nan])
EDIT: it gets worse:

   >>> x = 0*1e400
   >>> y = 0*1e400
   >>> z = 0*1e400
   >>> set({x, x, x})
   set([nan])
   >>> set({x, y, z})
   set([nan, nan, nan])
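The usual explanation, for what it's worth: set membership short-circuits on object identity before falling back to ==, and NaN is unequal to everything, including itself. A quick sketch (my example):

    x = float('nan')
    y = float('nan')
    print(x == x)         # False: NaN != NaN
    print(float(x) is x)  # True: float() returns float arguments unchanged
    print({x, x})         # {nan}: same object, deduplicated by identity
    print({x, y})         # {nan, nan}: distinct objects that compare unequal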


Yeah, things like the nitpick on (1,2,3) == sorted((1,2,3)) aren't really wats. sorted returns a list, surprise surprise.

The only actual surprise on this list is, I agree, that += operator. Especially because tup[0] = tup[0] + [1] fails as expected.


sorted is always a list, whereas reversed is an iterator (at least in Python 3)


Same in Python 2. A collection can even override reversed to provide its own reversed collection/iterator [0]; not so for `sorted`.

[0] https://docs.python.org/3/reference/datamodel.html?highlight...


The fact that calling sorted(b), where b is a reversed object, affects the following comparison sorted(b) == sorted(b) is weirding me out...

Ref: http://pastebin.com/VQs9GSzp
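(For reference, what I believe is happening: reversed returns a one-shot iterator, and the first sorted(b) drains it. A quick sketch:)

    b = reversed([3, 1, 2])
    print(sorted(b))   # [1, 2, 3]
    print(sorted(b))   # []: the iterator is already exhausted
    # hence sorted(b) == sorted(b) is False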


Bool acting as an integer instead of requiring a conversion (True//(True + True) == False*25).


Python originally didn't have a boolean type, and people used integer 0 and 1 as stand-ins for boolean values. When the boolean type was added, first (before the type existed) "True" and "False" were added as built-in aliases for integer 1 and 0, and then bool was implemented as a subclass of int, locked down so only two instances (True and False, with integer values 1 and 0) could exist.

So while it looks weird there is a rational backwards-compatibility reason for it.


If that were the sole reason, it should have been fixed when introducing Python 3. It seems to be much less controversial than other breaking changes in Python 3.

I think the more important reason is that Python doesn't really have an emphasis on the boolean type in general. The `if` statement works for every type, in contrast to other languages allowing only the boolean type in `if`. It is very common to use `if string_or_number_or_anything: blah blah` in Python. While I won't take a stance on whether this is a good thing or not, it is still awkward that Python allows implicit conversion between `bool` and `int`.


>So while it looks weird there is a rational backwards-compatibility reason for it.

That was a while ago, however. This behavior ought to be phased out in favor of treating it as entirely its own type.


Thing is, it's not actually there for backwards compatibility. Guido says as much:

http://stackoverflow.com/a/6865824/1763356

Alex Martelli's answer on the same page goes into more depth.

I get the worries, but this is Python - you should know what domain you're working on anyway, because you have to. It's a different philosophy to statically typed languages, and since having True and False as aliases is convenient I'm personally glad for it:

    # Truth as samples
    sum(x > 10 for x in xs)
    (numpy.random.randint(0, 10, 1000) == 7).mean()

    # Several properties are made obvious
    assert True > False
    my_flag ^= True


>I get the worries, but this is Python - you should know what domain you're working on anyway, because you have to. It's a different philosophy to statically typed languages

The only real benefit to this appears to be some shortcuts that shave a few characters off and do so, IMO, at the expense of readability.

E.g. this reads easier for me:

    len([x for x in xs if x > 10])
Than this does:

    sum(x > 10 for x in xs)
Martelli may call this stuff contortions but to me it feels more natural.


> E.g. this reads easier for me: len([x for x in xs if x > 10])

At the expense of allocating a new list, maybe. But the `sum` convention is faster, space-efficient and cleaner. It's also pretty obvious once you've seen it.


Ok with generators then...

I'm sure some people will look at this and think that it's summing all of the numbers over 10 in xs:

    sum(x > 10 for x in xs)
Whereas with this, which is explicit, it's substantially less likely they'll think that:

    sum(1 for x in xs if x > 10)
Again, I don't see the point of maintaining the weakened type system for a few minor 'clever' shortcuts like this that barely save anything. Look at what that kind of thinking did to Perl.

If python started throwing exceptions on all the code that treated True and False as integers, I'm pretty sure all the fixes done to accommodate that would probably make the code cleaner and easier to understand.


> I'm sure some people will look at this and think that it's summing all of the numbers over 10 in xs

Perhaps it's just a bias of familiarity, but that argument seems contrived to me.

I'm worried the rest of my arguments will amount to "I like what I know", so perhaps we should lay this to rest as a difference in tastes. It does remind me a little of concatenation with `+`, which is oft-hated... and yet I've not seen a single error caused by it.


How about this:

   sum(int(x > 10) for x in xs)
Just require the conversion to be explicit.


I would argue that the vast majority of behaviours described here are precisely the desired behaviour.

In fact, I think the ones he picks reflect tremendously on his skill as a programmer.

Take for instance his apparent insistence that int(2 * '3') should produce 6. Cool. Magic "special" behaviour for the multiply operator if the string happens to contain a number.

The same goes for his desire for the string "False" to be falsey. Wonderful magic special behaviour dependent on the (possibly unknown) content of a string means I've got to go round checking my strings for magic values.


I agree, but there's enough weird shit going on here that it's still a worthwhile "article". For me, the `mixing numerical types`, `Operator precedence?`, `extend vs +=`, `Indexing with floats`, `all and emptiness`, and `Comparing NaNs` were all interesting.


I don't find many of these wat-worthy.

In Python 2.x you could redefine True and False, and this is fun:

    >>> object() < object()
    True
    >>> object() < object()
    False
    >>> object() < object()
    True
But Python 3 dropped a lot of wat-ness.

This one is fun too: https://bugs.python.org/issue13936


I interpret "wat" in the original "wat" presentation as "certain behaviors that can't be predicted by reading the code". In this regard, I don't think many things pointed out in the article can be considered as "wat". I could predict the behavior of many items fairly easily. The same thing could not be applied when I watched the original "wat" presentation.

The only exception I think is the bool coercion. Even though I can predict the behavior, I find it fairly disturbing. I believe that should have been removed when transitioning to Python 3. It is very unfortunate, considering Python is generally reluctant to implicit type conversion.


Besides the wats already debunked above, the

   all([[[]]]) == True

one makes perfect sense, because all `all` does is check the truthiness of the argument's elements. An empty list is false, a non-empty one is true, no matter the contents.


By itself, yes, but

  >>> all([[[]]]) == True
  True
  >>> all([[]]) == True
  False
  >>> all([]) == True
  True
is pretty counter-intuitive at first glance.


At first glance, yes. Needs a moment to think this through. But not a wat because this works exactly as the logic says.

If we denote

  x = []
then

  bool(x) == False    # empty list is falsy
  bool([x]) == True   # nonempty list is truthy
  bool([[x]]) == True # nonempty again
and finally,

  [bool(elem) for elem in [[x]]] == [True]  # all True!
which is the thing `all` is interested in. It's more of a newbie mistake, or careless documentation reading, to think `all` recurses through the lists and all of the nesting too.


Can someone explain why the first one is a wat? To me it seems perfectly logical

Cast False to a string, then check the boolean value of that string. It's not an empty string, so of course it's true.


Because people coming from other languages, or without a CS background, expect a type-cast round trip to return the original value whenever the intermediate type can represent every value of the source type (e.g. a string can represent any value of bool).


So someone would expect that an int cast of "2" would be 2? What would an int cast of "a" be?


I'll add to that - In what "normal langs" is this not a problem?

What would a C or C++ programmer expect? Because my experience in those languages says "casting" is a meaningless way to understand what's going on.

Perl doesn't have bareword true/false values. Ruby doesn't use "casting", I think. That is, I think the idiomatic way to express this in Ruby is:

  >> !!true.to_s
  => true
  >> !!false.to_s
  => true


I have a favorite NumPy wat:

   >>> import numpy as np
   >>> 'x'*np.float64(3.5)
   'xxx'
(on recent NumPy releases this raises a deprecation warning)

Clearly the correct answer is 'xxx>' ;)


How about

    >>> type(np.uint64(1) + 1)
    numpy.float64


Okay, I don't get this one:

  >>> False == False in [False]
  True


Both "in" and "==" are comparison operators that can be chained and so it's interpreted as

    False == False and False in [False]
https://twitter.com/marcusaureliusf/status/55794887300903731...


    >>> 1 in [1] in [[1]]
    True


Again, chained comparisons. For any comparison operator (the comparison operators are <, >, ==, >=, <=, <>, !=, 'is', 'is not', 'in' and 'not in'):

    x (operator1) y (operator2) z
is defined by the language as equivalent to

    (x (operator1) y) and (y (operator2) z)
So

    1 in [1] in [[1]]
is equivalent to

    (1 in [1]) and ([1] in [[1]])


Though, according to the docs, `y` in your first example is only evaluated once when using the shorthand chained method.
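E.g. (hypothetical sketch):

    def y():
        print("evaluating y")
        return 5

    result = 1 < y() < 10   # prints "evaluating y" exactly once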


Right. I was giving another example that might make it clearer :P


I tend to write complex conditionals with parentheses, sometimes excessively, but the clarity and guarantee of correct evaluation far outweigh any loss of brevity.

    >>> False == (False in [False, ])
    False


The very first example is stupid. What sane language would assume a non-empty string should be converted to a False boolean?


Isn't it doing the opposite? str(False) is the string "False", which bool() casts to True because it is a non-empty string; non-empty strings are cast to True booleans.

This is much faster than checking if the string has some semantic meaning first, which would have to be localized.


That's exactly what the grandparent is saying


I thought this was going to be about the "Investigating Python Wats" talk Amy Hanlon (my co-worker) gave at PyCon last year: https://www.youtube.com/watch?v=sH4XF6pKKmk


Excellent talk. Her use of the dis module to explain wats by showing bytecode was particularly enlightening. It inspired me to give a similar talk next weekend at PyConFr: https://github.com/makinacorpus/makina-slides/blob/master/py... (in French, but the meat of it is Python code).


You might post this as an HN submission.


It seems like these are all exclusive to dynamically/weakly typed languages. Does anyone have examples of wats for languages with strong static typing?


Java Puzzlers spring to mind. There's a book [0], and some fun talks available on youtube [1][2][3].

[0]: http://www.javapuzzlers.com/

[1]: https://www.youtube.com/watch?v=wbp-3BJWsU8

[2]: https://www.youtube.com/watch?v=yGFok5AJAyc

[3]: https://www.youtube.com/watch?v=Wnzyp1aitb8


Java:

* Reassigning interned integers via reflection

* private fields work on a class level, not an instance level, i.e. instances of a class can read private fields of other instances of the same class.

* Arguably package-private fields, since the docs like to avoid mentioning they exist.

* ==, particularly in regards to boxed types.

* List<String> x = new ArrayList<String>() {{ add("hello"); add("world"); }};

* The behaviour of .equals with the object you create above

* Type Erasure


to add another:

    System.out.println(0.0 == -0.0); // true
    System.out.println(java.util.Arrays.equals(new double[] {0.0}, new double[] {-0.0})); // false
(It's documented in the contract for Arrays.equals, but still kind of ridiculous.)


Byte is signed. `Byte b = 127; b += 1; assert b != 128;`. That's not cool man.


OCaml's parsing rules are pretty opaque sometimes. For example, if you have an "if" statement that's just performing effects (not returning a value), you may run into behavior like this:

    # let x = ref 0;;
    val x : int ref = {contents = 0}
    # let y = ref 0;;
    val y : int ref = {contents = 0}
    # if false then
        x := 1;
        y := 1;
      ;;
    - : unit = ()
    # (!x, !y);;
    - : int * int = (0, 1)
    # y := 0;;
    - : unit = ()
    # if false then
        let z = 1 in
        x := z;
        y := z;
      ;;
    - : unit = ()
    # (!x, !y);;
    - : int * int = (0, 0)


Some guy made https://www.reddit.com/r/lolhaskell and put 3 posts there.


Sure. Java's signed byte type comes to mind.

Oh, and `null`, in any language ;-)


To me the first two: "Converting to a string and back" and "Mixing integers with strings" aren't "wats" at all.


Actually, almost none of these seem like wats to me... just what I would expect.

The one wat was adding a float to a huge integer. Granted, I assumed something fishy might happen with such large numbers, so I steer away from doing those operations.


Well, to be fair, it is only a wat if you don't know about floating-point imprecision, because that is exactly what is causing this, contrary to what is stated. Also, Python just gets this behavior from C, so the other claim, that this is a special case of Python, is also not true.

Since the 1 is shifted 53 bits to the left, the number is right at the start of the range where doubles can represent only even integers.

You get 9007199254740993 as the decimal number, which casts to the float 9007199254740992.0.

Then you add 1.0 and still get 9007199254740992.0 because of floating-point imprecision and rounding; the next representable floating-point number is 9007199254740994.0.

...92 is less than ...93, and this gets you the seemingly weird x + 1.0 < x == True.
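Assuming the wat in question is the 2**53 one, the arithmetic sketched out:

    x = (1 << 53) + 1    # 9007199254740993, odd, not exactly representable
    print(float(x))      # 9007199254740992.0: the conversion rounds down
    print(x + 1.0)       # 9007199254740992.0: x is coerced to float first
    print(x + 1.0 < x)   # True, via Python's exact int/float comparison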


It's still unfortunate that `x + 1.0` rounds incorrectly. Python is one of the few languages that has correct (infinite-precision) float-int comparisons, so it's reasonable to expect addition to work similarly.

That said, I'm not aware of a language that does better, and I'm aware of many that do much worse.


There is Frink (http://futureboy.us/frinkdocs/), which has interval arithmetic; that seems so much saner than floating-point numbers...


I looked at Frink a little; it doesn't seem to implement integer-floating addition properly either.

    9999999999999999999002 + 49.0
rounds to 9.9999999999999999990e+21, whereas the lesser value

    9999999999999999999000 + 50.0
rounds to 9.9999999999999999991e+21.

Plus, I'm not a fan of computing with decimal arithmetic, since it's less stable. For instance,

    min(a, b) <= (a + b) / 2 <= max(a, b)
doesn't always hold for decimal floats, whereas it does for (non-overflowing) binary floats. Decimals are generally more prone to this kind of inaccuracy, since they lose more bits when the exponent changes.

(Consider a = 1.00000000000000000001, b = 1.00000000000000000003.)

Interval arithmetic support is cool, but not useful without effort - bounds like to grow. Plus, Python has bindings for them anyway ;).


> all and emptiness [...]

If `all` worked any other way, it would be seriously broken. (Given the rules for how a list converts to bool.)


Anyone know the story behind the converse implication operator? What it's useful for, and why it's undocumented?


`**` is the exponentiation operator. True is implicitly converted to 1, and False to 0. It's a coincidence that 0**0, 0**1, 1**0, and 1**1 line up with converse implication.


Maybe someone can explain this better, but I don't think it's a coincidence. In category theory there is a general notion of an "exponential object". It works like normal exponentiation in various categories of numbers, and like modus ponens (and also in reverse: given the exponent and the power, you can deduce the base) in various categories of logic.


There's no coercion - True is just an alias for 1, and False for 0. True <pow> False is just an ugly way of writing 1 <pow> 0.


Remember that booleans are integers in Python. This is just the regular exponentiation operator.
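Spelled out:

    for x in (False, True):
        for y in (False, True):
            print(x, y, bool(x ** y))
    # False False True
    # False True False
    # True False True
    # True True True   -- i.e. x ** y is "y implies x"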


Working in Python, my biggest wat is that a test can cause a failed assertion in the code and still succeed (if AssertionError is caught somewhere upwards in the code, which will happen at any `except Exception`).


I've hit this too; it would be really nice if AssertionError didn't inherit from Exception. There's a whole exception hierarchy, but it seems like none of the parents of Exception are actually usable for anything.


That would partially solve this problem (it's far rarer to see bare excepts or `except BaseException`), but could cause others. The question there should be what is (or should be) intended when someone says `except Exception` - it's not clear to me whether that should or should not include AssertionError.

A more direct solution might be to have a means of storing off every assertion failure into a (clearable) list. The test harness could then check that list at the end of each test. If a particular test should succeed after triggering an assertion failure, it can check the list to make sure it triggered exactly the expected failures and then clear it.
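A hypothetical sketch of that idea, with made-up names: a check() helper used instead of bare assert, plus a hook for the harness to call.

    _failures = []

    def check(condition, message=""):
        # Record the failure before raising, so a swallowed
        # AssertionError still leaves a trace for the harness.
        if not condition:
            _failures.append(message)
            raise AssertionError(message)

    def drain_failures():
        # Called by the harness at the end of each test.
        failed = list(_failures)
        _failures.clear()
        return failed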


> The question there should be what is (or should be) intended when someone says `except Exception` - it's not clear to me whether that should or should not include AssertionError

Right, I guess I'm not suggesting changing the language today to do this. Rather, I'm suggesting that it would be better if it were already done that way from the beginning. If it were designed that way from the start then there would be no question about the intention because it would just be part of the definition of Exception


"If it were designed that way from the start then there would be no question about the intention because it would just be part of the definition of Exception"

Heh, doesn't that argument apply currently? I think I was trying to ask, "when someone fails to think deeply enough about it, what is likely to give the 'correct' results"?


A similar issue exists for StopIteration.


That's why you should inject defects into your code, to see whether your tests catch them. There are tools to do that automatically.


I feel like I may be missing some of what you're trying to communicate. I agree that injecting defects can help test a test suite and that this can be a good practice. I don't quite see the relevance to the issue I described.

Edited to add - thanks for responding, by the way. I'm confused by all the down-votes.


Don't put too much store in early downvotes. It still usually balances itself out.

I still don't quite get what you meant.


Assertions in code are for documenting and checking things that can't happen unless there's a bug. This helps programmers reason about the code as they read it, and helps us find errors closer to the defect that caused them.

If a test case triggers an assertion violation down in some method, there is a bug. That should break the test, so that I'm told about the bug, and can investigate and fix it. If there happens to be a `try...except Exception` anywhere in the stack above that method, the test never learns that an assertion fired and might even pass. This makes every test less useful than it could be.


Oh, that's why you inject defects: if the injection doesn't cause your test to fail, e.g. because of a try..except Exception, you know that your test is wrong.

(Then you investigate and hopefully remove the catch-all.)


But catch-alls of that form can be correct.

This isn't terribly relevant to narrow unit tests, where I should be able to know what I expect to have happened and presumably if an assertion pops it won't have happened, but it makes larger scale fuzzing substantially less useful.


I always run into trouble in Python with closures and assignment. E.g.,

  def f():
      x = 0
      def g():
          x += 1
      g()
  f()
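(That raises UnboundLocalError: the assignment makes x local to g. In Python 3, nonlocal is the fix:)

    def f():
        x = 0
        def g():
            nonlocal x   # rebind the enclosing f()'s x
            x += 1
        g()
        return x

    assert f() == 1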


Here's a quick interpretation of Python's method:

Any assignment operator (compound or not) follows the same namespace rules. Scope lookups are consistent within a scope, regardless of order. Any "namespaced" lookup (`x[y]` and `foo.bar`) will act inside that namespace.

Non-namespaced identifiers will always assign into the nearest enclosing scope (a scope is always a named function) by default, since clobbering outer scopes - particularly globals - is dangerous. You can opt in to clobbering with `global` and `nonlocal`. If you don't ever assign to a name in the current scope, lookups will be lexical (since a local lookup cannot succeed).

---

Hopefully most of these value judgements should be relatively obvious.


The behavior of `is` seems pretty hard to predict:

  >>> 3.1 is 3.1
  True
  >>> a = 3.1
  >>> a is 3.1
  False


It seems to be a case of leaky abstraction, where the way Python caches small objects becomes visible. Another example:

  >>> a = 'hn'
  >>> a is 'hn'
  True
  >>> a = ''.join(['h', 'n'])
  >>> a is 'hn'
  False
  >>> a
  'hn'
  >>> a = 'h' + 'n'
  >>> a is 'hn'
  True
Edit: found another interesting case

  >>> a = 1
  >>> b = 1
  >>> a is b
  True
  >>> a = 500
  >>> b = 500
  >>> a is b
  False


Second case is because of the range of integers that are cached, right?


It's actually more complicated than that. CPython will cache constants within a scope, so

    a = 500
    b = 500
    a is b
will give `True`. But the REPL forces each line to compile separately, and CPython won't cache across them. However, there's a global cache for small integers (-5 through 256), so those are `is`-equivalent even in the REPL.


`is` means "do these things reside in the same location in memory?", and the runtime is merely required to give a plausible answer.

For inequivalent values or `True`, `False` and `None`, the result is defined. Identity is also preserved across assignment. For everything else, the answer is "whatever the runtime feels like". PyPy, for instance, will always say True for integers and floats.

It's not just hard to predict - it's inherently unpredictable. The runtime gets free rein as long as you can't prove that it's lying.


How are the converse implication operators worthy of a wat? That's completely normal...


I thought I knew Python... I guess I do not know Python.

Awesome idea - thank you.


False as a valid fd: os.stat(False) works, since False == 0.


Not sure if I can call this as `wat`, but this one puzzled me at first:

    foo = 1.0 + 2 # Foo is now 3.0
    foo = 1,0 + 2 # Foo is now a tuple: (1,2)
    foo = 3 # Foo is now 3
    foo = 3, # Foo is now a tuple: (3,)
source - https://wiki.theory.org/YourLanguageSucks#Python_sucks_becau...


The ',' always creates a tuple. Why is that puzzling?



