Black: An uncompromising Python code formatter

zestyping · on May 25, 2018

This is great except for the enforcement of double-quotes around all strings and spaces around slice operators. These two choices contradict the standard Python documentation, most of the standard library, and the behaviour of the interpreter itself.

When the language itself has an established convention, Black should follow that convention, not fight it. These two weird choices just generate needless churn, which is surprising as it seems Black has quite the opposite goal.

It's a shame, because all the other design choices in Black are pretty good!

ambivalence · on May 25, 2018

Colons in slices are implemented to the letter of PEP 8. Your disagreement here probably stems from pycodestyle mistakenly enforcing a different rule (no spaces before colons on if-statements, defs, and so on) in the slice context.

The language itself doesn't have an established standard for string quote usage. If it did, Black would follow it. What repr() does is a weak indicator, and how the documentation is written is random: not only was there no enforcement as to which quotes to use, there wasn't even a recommendation. Black standardizes on double quotes since that choice has clear benefits whereas the other option does not.

zestyping · on May 26, 2018

I thought of another way to express this that might resonate better.

What I like about the Black philosophy is that it wants to make code style _uninteresting_. People should think about other things, not formatting. That's a great goal. So it seems to me that the best style choice is the most _boring_ choice. The least creative, least novel way. It should try to avoid inventing new formatting algorithms.

What's the most boring way to format a string literal?

The way the language already does it. The way every Python programmer has already seen string literals formatted from the very first day they started typing things into a Python interpreter.

Even if half of us liked single-quotes and half of us liked double-quotes, you can guarantee that every Python programmer on the planet has seen and lived with strings that are formatted the repr() way, including the double-quote fans. You can't guarantee the opposite.

No one could fault you for doing it the repr() way. Blame Guido :) he made that choice decades ago, and everyone has already had to make their peace with it. It's a solved problem. For anyone writing a program that emits Python code, it's the default way to format a string. It's the least assailable option, and that's a good thing.

How does that sit with you?

weberc2 · on May 26, 2018

I’m a Python programmer and I never had the perception that single quotes are more canonical in any capacity. I always felt that single quotes were for degenerates who didn’t realize that single quotes are most commonly used for chars by the broader programming community. ;)

vijucat · on May 26, 2018

I've seen single quotes for chars vs. double quotes for Strings in Java, but in sh, it's "Strong Quoting with Single Quotes" vs. "Weak Quoting with Double Quotes":

http://www.grymoire.com/Unix/Quote.html#uh-1

"When you need to quote several character at once, you could use several backslashes. This is ugly but works. It is easier to use pairs of quotation marks to indicate the start and end of the characters to be quoted. Inside the single quotes, you can include almost all meta-characters"

So some of us associate single quotes with proper const strings, with no variable expansion, command substitution or other interpreter hanky-panky.

heavenlyblue · on May 26, 2018

I’ve always thought that double quotes are for degenerates who don’t pay attention to what a standard repr output is for strings.

kuroguro · on May 26, 2018

And I’ve always thought that triple quotes are for degenerates who... no, wait, nvm.

heavenlyblue · on May 26, 2018

... who can’t write self-explanatory code.

nerdwaller · on May 26, 2018

Code only captures the what and the how; comments for any complex routines can capture the why. Obviously overuse is probably a red flag, but in my experience code alone doesn’t always capture enough context in any sufficiently complex system.

I get your specific comment may be a bit sarcastic, given the ancestry - but I make this one more for younger developers I see that don’t always filter that ;)

aldanor · on May 28, 2018

You don't have to press Shift as often to type string literals so it's a win, all other things being equal.

rndgermandude · on May 26, 2018

I despise people using single quotes even more than people using tabs instead of space, emacs instead of vim or GNU-style indentation instead of... anything sane :D

greyman · on May 26, 2018

I used to use single quotes for dictionary keys and other constants, and double quotes for strings intended to be read by a human. I read some article about it and considered it a good convention. Now I see it actually isn't a convention. :-) I switched to black just yesterday and overall I like it, still I miss a bit this distinction between single and double quotes.

zestyping · on May 25, 2018

Slices: My reading of README.md is that Black inserts spaces around the colon, is that right? Almost none of the Python I've seen in 20+ years is written that way. The tutorial and documentation on python.org don't use spaces around the colon, and it is extremely rare in the standard library (out of over 1100 slice expressions that have operands on both sides of a colon, I count 10 that use the extra spaces).

Quotes: We have a difference of opinion over what constitutes an established convention. If the language's own way of displaying strings has been stable for its entire history, I consider that established. A reasonable choice would have been to do what repr() does (single-quotes unless the string contains a literal single-quote) or a simplified version of it (single-quotes always), not the opposite of what the language itself does.

(I wouldn't care so much about this except that I've really wanted a tool like Black for a while and greatly appreciate the philosophy!)
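For anyone who hasn't looked closely at the repr() rule described above, here is a minimal sketch of how CPython picks the delimiter. The helper name `quote_char` and the sample strings are made up for illustration.

```python
def quote_char(value: str) -> str:
    """Return the delimiter that repr() chose for this string value."""
    return repr(value)[0]


# Hypothetical sample values, chosen to cover each case.
samples = ["spam", "it's", 'say "hi"', "it's \"both\""]
for s in samples:
    print(repr(s))
# 'spam'
# "it's"
# 'say "hi"'
# 'it\'s "both"'
```

Single quotes are the default; double quotes appear only when the value contains a single quote and no double quote.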

Shish2k · on May 25, 2018

> My reading of README.md is that Black inserts spaces around the colon, is that right?

I can't speak for the readme, but:

    $ cat test.py
    x = [1,2,3]
    print(x[1: 3])

    $ black test.py
    reformatted test.py

    $ cat test.py
    x = [1, 2, 3]
    print(x[1:3])

zestyping · on May 26, 2018

Oh, I must have misunderstood. Sorry! I take back my issue with the slices, then.

geofft · on May 26, 2018

The README confused me, but I think it's saying that it inserts spaces if needed to make it clear that it's a lower-precedence operator. But unlike operators like +, it's not one that inherently requires spaces.

I tried out black and it does the following (the file originally had no spaces):

    x = a + b
    x = m[a:b]
    x = m[a + 1 : b]
    x = m[a:-b]
    x = m[a : 1 - b]

I would personally still write m[a + 1:b], I think, but black's approach is totally defensible. (I guess I would really write m[a+1:b], and black would rightly correct me.)
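A rough sketch of a rule that reproduces the five outputs listed above, assuming "simple" means a name, a constant, or a unary operator applied to one. This is a guess fitted to those examples, not Black's actual implementation; `is_simple` and `format_slice` are hypothetical names, and `ast.unparse` needs Python 3.9+.

```python
import ast


def is_simple(node: ast.expr) -> bool:
    """Names, constants, and unary operators applied to them count as simple."""
    if isinstance(node, (ast.Name, ast.Constant)):
        return True
    if isinstance(node, ast.UnaryOp):
        return is_simple(node.operand)
    return False


def format_slice(source: str) -> str:
    """Re-space an expression of the form value[lower:upper]."""
    node = ast.parse(source, mode="eval").body
    if not (isinstance(node, ast.Subscript) and isinstance(node.slice, ast.Slice)):
        raise ValueError("expected a slice expression")
    lower, upper = node.slice.lower, node.slice.upper
    if lower is None or upper is None or node.slice.step is not None:
        raise ValueError("this sketch only handles value[lower:upper]")
    # Hug the colon only when both bounds are simple; otherwise space it out.
    colon = ":" if is_simple(lower) and is_simple(upper) else " : "
    return f"{ast.unparse(node.value)}[{ast.unparse(lower)}{colon}{ast.unparse(upper)}]"


for src in ["m[a:b]", "m[a+1:b]", "m[a:-b]", "m[a:1-b]"]:
    print(format_slice(src))
```

Omitted bounds and step values are deliberately left out to keep the sketch short.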

jwilk · on May 26, 2018

m[a+1:b] and f(x=y+1) are weird corner cases of Python formatting.

I usually avoid this problem by adding extra parentheses:

  m[(a + 1):b]
  f(x=(y + 1))

hawkice · on May 26, 2018

Wouldn't being closer imply being used together by the operator? So: 6/2 * 3 = 9 but 6 / 2*3 = 1? In the slice context it seems very obvious the colon has to be slicing, because it can't stand alone to produce an indexable value. But spacing should use this rule, right? [I think this rule was the one used by fortress, the Guy Steele language?]

geofft · on May 26, 2018

Yes, that's the rule black is implementing (assuming I'm reading your comment right and understanding its README right...). m[a + 1:b] is in danger of being read as "take the slice 1-to-b, add it to a, index m on that" - even though we know that isn't a well-typed set of operations, the spacing implies that's how it should be read, same as e.g. m[a + 1/b].

Spacings that don't conflict with precedence are m[a+1:b], m[a+1 : b], or m[a + 1 : b]. While the middle one makes precedence obvious, all three are acceptable. black picks the last one, and I think it picks that one because of another rule that it should prefer writing a + 1 instead of a+1.
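To be clear that this is purely about how the code reads, here is a quick check that the parser ignores the spacing entirely. The variable names are arbitrary.

```python
import ast

# Four spellings of the same slice, including the misleading one.
variants = ["m[a+1:b]", "m[a+1 : b]", "m[a + 1 : b]", "m[a + 1:b]"]
trees = {ast.dump(ast.parse(v, mode="eval")) for v in variants}
print(len(trees))  # 1: every spelling parses to the same tree

# The same holds for arithmetic: spacing never changes precedence.
print(6/2 * 3 == 6 / 2*3 == 9.0)  # True
```

So the only question is which spelling avoids suggesting a grouping the parser doesn't use.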

mixmastamyk · on May 25, 2018

Double quotes also have drawbacks, visual noise and the doubling of keypresses required on the most common keyboard layouts.

rapind · on May 26, 2018

The "visual noise" makes it clearer to me that it surrounds a block of text, so I don't buy this.

Double quotes are also the convention in english to delineate a literal, so I would argue it's more obvious.

I'll grant you keyboard presses though. There's an obvious advantage to single quotes here, but consider why this is the case. To make the use of "apostrophe" more efficient since it appears far more often than a double quote in english. I suppose it's pragmatic to leverage this advantage in code where string quoting is extremely common...

Perhaps double quoting is a bias I've developed writing code, but I suspect it's actually a bias I carried over from reading and writing english and what simply seemed more obvious.

Language design that supports both makes me crazy. Pick one and enforce it!

ambivalence · on May 25, 2018

You can still type single quotes. You have a tool to convert that for you.

The visual noise complaint is interesting. Do you also consider the letter W to be more noisy than the letter V? Should we discourage the use of noisy letters in the alphabet?

zestyping · on May 25, 2018

A double-quote is more noisy than a single-quote, and W is more noisy than V.

The difference is that " and ' are equally usable options in the context we're talking about. Quotes are very common, so the visual noise adds up when your screen is full of quote marks. Given that they mean the same thing, and one is both harder to type and harder to read, it makes sense to prefer the other.

mixmastamyk · on May 25, 2018

> Given that they mean the same thing, and one is both harder to type and harder to read, it makes sense to prefer the other.

Nailed it.

prawl · on May 26, 2018

A double quote is in no way harder to read. Only on Hacker News.

abiox · on May 26, 2018

this seems a bit reductive.

mixmastamyk · on May 25, 2018

Yes, that's why www. was dropped, and quotes are used everywhere in most Python code. The triple doubles for doc strings are the worst example, though I have no illusions pep8 will be changed any time soon.

davnn · on May 25, 2018

> The visual noise complaint is interesting.

I thought the same. I guess you could just modify your programming font to make the double quotes really tiny :)

ben509 · on May 25, 2018

I agree about the noise, but, it's much easier to work with. raise FooException("Can't load bar") works, as does f"The flange is elevated by {flange['spronge']} degrees of spronge."

mixmastamyk · on May 25, 2018

Goes both ways, sometimes there's text to quote:

raise KeyError('"%s" not found.' % name)

f'The flange is elevated by {flange["spronge"]} degrees of spronge.'

I don't normally use many contractions or possessives in my code, but am willing to admit it happens occasionally.
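A tiny check that the two f-string spellings above are interchangeable, as long as the inner quotes differ from the outer ones. The dictionary value is invented for the example.

```python
flange = {"spronge": 42}  # hypothetical data

double_outer = f"The flange is elevated by {flange['spronge']} degrees of spronge."
single_outer = f'The flange is elevated by {flange["spronge"]} degrees of spronge.'

print(double_outer == single_outer)  # True
```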

zestyping · on May 26, 2018

Yep, that's why I like how repr() does it. Guido solved this nicely a long time ago :)

woolvalley · on May 25, 2018

Eh, many programming languages use double quotes for strings, and single quotes for singular characters. Languages that can use single quotes for strings that I know of are ruby and python.

ComputerGuru · on May 25, 2018

Don’t forget VimL, the language that insanely decided to use double quotes as a “start of comment” indicator, then went back and gave them special meaning based on their position so they either signal the start of a comment or a string.

prawl · on May 26, 2018

A different comment character could have been chosen, but this affects nothing. It's not insane.

samatman · on May 25, 2018

Lua, and there are others. I favor double quotes for this same reason even when single quotes are allowed.

I'd like «guillemets» but that's going to have to be self-serve...

spc476 · on May 26, 2018

I use single quotes when the string in question is used like an enumeration:

    set_color('black','white')

And double quotes for strings meant to be read by a human:

    print("Hello there.  How are you?")

And an example with both forms:

    syslog('warning',"unit %s spin rate %f too low",u,rate)

For me, it's an indication of how I expect the string to be used.

Inityx · on May 25, 2018

And even then, Perl and Ruby (heck, even Bash) have a semantic difference between double- and single-quoted strings. The only other popular general-purpose languages where they're synonymous are PHP, JavaScript, and Lua.

ComputerGuru · on May 25, 2018

>The only other popular general-purpose languages where they're synonymous are PHP, JavaScript, and Lua.

PHP does treat single and double quotes differently. Contents of single quotes strings are not parsed for variable substitution.

ciupicri · on May 26, 2018

Also Pascal and SQL.

zestyping · on May 26, 2018

And JavaScript!

lambda · on May 25, 2018

Hmm. My reading of PEP 8 on slices is that the spaces surrounding : in slices are optional, and used to implement the later rule "If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies). Use your own judgment; however, never use more than one space, and always have the same amount of whitespace on both sides of a binary operator."

This leads to the following difference between the PEP 8 recommendation and Black:

PEP 8 accepts both of these:

  ham[lower+offset : upper+offset]
  ham[lower + offset : upper + offset]

While Black just uses:

  ham[lower + offset : upper + offset]

Also, this is one that PEP 8 doesn't have an example on, but I think this would be one of the cases in which the space wasn't necessary:

  slice[a.b : c.d]

The "." operator binds so tightly in my mind that it doesn't need the spaces around the ":" to disambiguate. It's at the same precedence level as subscription and function call, and higher precedence than unary + and -.

After typing this all out, I think that it might be the case that the bigger difference is actually in how binary operators are treated. It looks like Black always puts whitespace around binary operators, which contradicts PEP 8's recommendation that you are allowed to vary spacing around binary operators to make precedence clear.

Since Black does allow for leaving in extraneous parentheses to make precedence clear, I wonder why it doesn't allow varying the space around binary operators as well? Of course, it should enforce that the spacing actually does match the precedence, and that the spacing is consistent within an expression. That would allow the following examples which PEP 8 lists as "Yes":

  x = x*2 - 1
  hypot2 = x*x + y*y
  c = (a+b) * (a-b)

While right now Black rewrites them as follows, which PEP 8 lists as "No" (though my reading is that varying the spacing like this is optional, so the following could be accepted as well):

  x = x * 2 - 1
  hypot2 = x * x + y * y
  c = (a + b) * (a - b)

ambivalence · on May 25, 2018

Black does not take existing formatting into account. Doing so would cause non-deterministic formatting (Black would sometimes change its mind about its own formatting and a second pass would cause a different formatting than the first).

With this in mind, it has to enforce a rule around operators. Since any operand might be complex, it's more robust to default to spaces around operands always. Otherwise we would inevitably end up hugging operands with an operator that humans consider too tight. And since that's subjective, there is actually no rule that we can hard-code about that.

uryga · on May 26, 2018

> Black would sometimes change its mind about its own formatting and a second pass would cause a different formatting than the first

Why not? I mean, I never built a formatter so idk about the complexity involved, but to me it seems that if Black had a rule like "if you see `a+b` (no spaces), just leave it alone", applying the rule twice shouldn't change anything.

(though I imagine that the interactions of such rules could become hard to understand...)

zestyping · on May 26, 2018

The whole idea is to _prevent_ people from thinking about formatting, by actually making it impossible to make any formatting decisions. There's exactly one way to format the program and that's it. It's draconian, but that's what's great about it.

uryga · on May 27, 2018

That's the feeling I got from the readme, but GP sounds like this particular decision was made not so much for 'philosophical' reasons, as because Black-ing a file twice might give different results. I was curious why :)

komali2 · on May 25, 2018

I'm new to python, I didn't realize there's a difference, can you help me understand the benefits?

ambivalence · on May 25, 2018

I put it in the README. Let me know if anything is unclear.

jsmeaton · on May 25, 2018

I have to admit that I was very excited to begin enforcing black on all of our projects at work - until the double quotes decision. I’m used to single quotes everywhere (except docstrings!).

I’m probably going to still move to black but I know I’m likely to face some pushback now on this one choice, and I don’t yet have the will to defend it.

I also know I’m being mostly unreasonable .. “why is this opinionated tool not perfectly aligned with MY opinions??” .. but feelings.

nas · on May 26, 2018

The double quotes are the only thing that bother me as well. Double quotes are harder to type (on my keyboards) and look more noisy. All other Black formatting I think I could live with. I suppose I can live with double quotes on strings too but I'm not going to like it. ;-)

kstrauser · on May 25, 2018

That's the #1 complaint I have (and hear) about Black. Everything else, I can live with. And honestly, I can live with using double quotes everywhere, especially when I can type single quotes and then let it re-write them. It's just that they currently look really, really odd to me because that's not how Python is traditionally written.

stevesimmons · on May 26, 2018

I prefer a quoting convention of single quotes for text identifiers (as they rarely contain quotes) and double quotes for English text (because contractions are common). Thus:

    key['first_name']
    print("Isn't this clearer?")

gvx · on May 26, 2018

I generally follow this convention:

1. Any string literal that contains either ' xor " will be delimited by the other one.

2. For all other string literals, delimit by " if it's meant for human eyes only, delimit by ' if it's meant for computers as well.

masklinn · on May 26, 2018

Same, though I'd word the first one as programmatic identifiers.

Adopted that from Erlang, where atoms are single-quoted and text strings are double-quoted.

waleedka · on May 25, 2018

Actually I prefer standardizing on double quotes and I do that in my code on a regular basis. I know some people use a mix, sometimes single quotes and sometimes double quotes, which is fine, but I prefer consistency in this case.

fishywang · on May 25, 2018

PEP8 doesn't have a preference between single or double quotes. Black has:

> It will replace the latter with the former as long as it does not result in more backslash escapes than before.

And I think that's good enough (and in compliance with PEP8)
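A simplified sketch of that rule, assuming plain strings with no prefixes, triple quotes, or control characters. It compares the two renderings of a value rather than looking at the original source text, so it is an approximation and not Black's actual code; `render` and `choose_quotes` are hypothetical names.

```python
def render(value: str, quote: str) -> str:
    """Render value as a literal delimited by quote, escaping backslashes and that quote."""
    escaped = value.replace("\\", "\\\\").replace(quote, "\\" + quote)
    return quote + escaped + quote


def choose_quotes(value: str) -> str:
    """Prefer double quotes unless they would need more backslash escapes."""
    double = render(value, '"')
    single = render(value, "'")
    if double.count("\\") > single.count("\\"):
        return single
    return double


print(choose_quotes("abc"))       # "abc"
print(choose_quotes("it's"))      # "it's"
print(choose_quotes('say "hi"'))  # 'say "hi"'
```

The last case stays single-quoted because switching would add two escapes.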

icebraining · on May 25, 2018

The slices thing seems to follow PEP8.

aphextron · on May 26, 2018

Everyone has an opinion, and this is theirs. That's kind of the whole point.

cup-of-tea · on May 26, 2018

Coming from a C background I always stuck with double quotes for strings. Single quotes are used for characters in C. I think double quotes are better anyway because an apostrophe is much more likely to turn up in a string than a double quote itself. I rarely have to manually escape quotes inside strings.

vjeux · on May 25, 2018

At Facebook, we are now using prettier[1] on all our JavaScript files, a growing number of Hack files are formatted with Hackfmt[2] and now black is being rolled out for Python. It's a really exciting time :)

[1] https://prettier.io/

[2] https://github.com/facebook/hhvm/blob/master/hphp/hack/src/h...

nickpresta · on May 25, 2018

Hey Vjeux. What does black mean for the prettier python plugin[1]? I had high hopes to move over all projects to prettier (for JS/Python). Is there going to be any merging between prettier python and black? Do you recommend one over the other?

Thanks

[1] https://github.com/prettier/plugin-python

vjeux · on May 25, 2018

I don't know :)

I worked on prettier myself because I wanted to solve formatting for the language I was involved in. It turned out that the prettier infrastructure was actually really good for other languages so we used it for CSS, Markdown, GraphQL... and added support for a plugin system for other people to build printers for their own language. patrick91 (not working at Facebook) is working on a python formatter using the prettier infrastructure.

Independently, ambv (working at Facebook) started black which is written in Python. He's part of the Python core team and the Python infrastructure team at Facebook so it made sense for him to drive adoption of black within Facebook.

One interesting thing I realized is that communities are built around programming languages and it's really hard to influence another community from the outside. So my bet would be that black has the most chance of succeeding within the Python community.

ballenf · on May 25, 2018

> One interesting thing I realized is that communities are built around programming languages and it's really hard to influence another community from the outside.

IMO, that point is worthy of a detailed blog post or conference talk, if you would be so inclined. Would love to hear more.

vjeux · on May 25, 2018

That's true, I'll consider it, in the meantime here are some thoughts.

When I started working on React Native, I thought that the most difficult thing would be to design a good set of APIs to make it easy to write mobile apps using React that felt good. This turned out to be the "easy part", we started the project wanting to solve this and having lots of good ideas on how to do it.

What turned out to be a lot harder was the fact that we were trying to use JavaScript from within iOS and Android ecosystems.

1) Those at the time were in different repos, how do you synchronize code between them?

2) The three ecosystems use a different set of tools for everything: IDE (xcode, intellij, sublime/atom/code/emacs), package manager (cocoapods, maven, npm), linters (eslint), build (how do you hook up with the play button in xcode?), profilers (can you display stack traces with the two languages calling each other?)...

3) Mixing and matching languages inside of a single project is hard because there are a lot of subtle different semantics (eg: javascript doesn't have int32 or int64). If you have type systems, they are incompatible (flow vs obj-c). So in practice you end up with a lot of boilerplate to talk between the two languages and it's a performance overhead.

There's also a social aspect where you invested so much learning an ecosystem that it becomes part of your identity. So you see someone wanting to bring another language as trying to attack you directly.

My mission since then has been trying to "break down the silos" and trying to build tools that can work with all those languages. It's not been easy :)

ComputerGuru · on May 25, 2018

Wow, that website is hard to use! You can’t touch scroll down on iOS unless you click towards the bottom of the page as scrolls in the animation zone are ignored.

philip1209 · on May 25, 2018

Why black over yapf?

ambivalence · on May 25, 2018

Black is a simple tool. It tries to implement a single code style well. It's not configurable.

We tried YAPF before and could never roll it out for everybody. I even contributed the "facebook" style to the tool. There were a few reasons why YAPF didn't work out for us but the most important were:

- YAPF would at times not produce deterministic formatting (formatting the same file the second time with no changes in between would create a different formatting); Black treats this as a bug;

- YAPF would not format all files that use the latest Python 3.6 features (we have a lot of f-strings, there's cases of async generators, complex unpacking in collections and function calls, and so on); Black solves that;

- YAPF is based on a sophisticated algorithm that unwinds the line and applies "penalty points" for things that the user configured they don't like to see. With a bit of dynamic programming magic it arrives at a formatting with the minimal penalty value. This works fine most of the time. When it doesn't, and surprised people ask you to explain, you don't really know why. You might be able to suggest changing the penalty point value of a particular decision from, say, 47 to 48. It might help with this particular situation... but break five others in different places of the codebase.

vjeux · on May 25, 2018

Another issue is that yapf uses an algorithm that is quadratic, so code you actually see in practice can take seconds to minutes to format.

The authors have been adding specific workarounds for some cases but it's a general issue with the approach:

- https://github.com/google/yapf/issues/264

- https://github.com/google/yapf/issues/39

Black's algorithm doesn't explode in such a way. It's also a lot faster overall, which makes it possible and reasonable to enable a good format-on-save experience.

omoikane · on May 25, 2018

I do like Black's line wrapping rules of keeping things on separate lines to minimize diffs. I really hate auto formatting tools that try to compress lines vertically through binpacking techniques that produce unaligned code.

gvx · on May 26, 2018

RE: non-deterministic formatting: I'm not familiar with YAPF, but what kind of behaviour does it have? Like if you run it on two copies of the same file you might get different results, or that it's not idempotent (i.e. sometimes YAPF(YAPF(source)) != YAPF(source))?
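To make the two properties concrete, here is a small sketch with toy formatters standing in for a real tool. All function names are hypothetical and neither toy resembles YAPF or Black internally.

```python
import re
from typing import Callable

Formatter = Callable[[str], str]


def normalize_commas(source: str) -> str:
    """Toy formatter: exactly one space after every comma."""
    return re.sub(r",[ \t]*", ", ", source)


def naive_commas(source: str) -> str:
    """Toy formatter with a bug: adds a space even if one is already there."""
    return source.replace(",", ", ")


def is_deterministic(fmt: Formatter, source: str, runs: int = 5) -> bool:
    """Same input gives the same output on every run."""
    return len({fmt(source) for _ in range(runs)}) == 1


def is_idempotent(fmt: Formatter, source: str) -> bool:
    """Formatting already-formatted output changes nothing."""
    once = fmt(source)
    return fmt(once) == once


code = "x = [1,2,  3]"
print(is_deterministic(naive_commas, code), is_idempotent(naive_commas, code))
# True False
print(is_deterministic(normalize_commas, code), is_idempotent(normalize_commas, code))
# True True
```

The buggy toy is perfectly deterministic yet keeps changing its own output, which is the second case I was asking about.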

aagimene · on May 25, 2018

> It's not configurable.

Except for the line length:

> if you're paid by the line of code you write, you can pass --line-length with a lower number.

ambivalence · on May 25, 2018

Yeah, I wasn't willing to die on that hill, especially since I'm introducing a new default value that wasn't popular before.

nemoniac · on May 26, 2018

Ah but there's the irony.

Everyone would like to stick to the standard formatting rules. Well... except for that one little idiosyncratic thing that they just won't sacrifice.

Then black comes storming in to enforce conformity. Well... except for that one little idiosyncratic thing the author of black just can't bring himself to adhere to.

Don't get me wrong. I'm a big fan of the approach in general. I use things like paredit-mode and aggressive-indent-mode and whitespace-mode and I've just set up emacs to use black together with blacken-mode, but with blacken-line-length set to 80.

gregw2 · on May 26, 2018

Time for a fork!

Let's call it "black-ish"

ben509 · on May 25, 2018

It was a good call. Some code bases use long identifiers and an 80 column limit gets to be a mess.

And for the 80 column fetishists, you'd have to pry their VT-100 terminals from their cold, dead hands. Disposing of the body is enough of a pain, but worse, you have to find a recycler who can deal with the lead glass in the CRT.

asah · on May 27, 2018

The modern argument for lower-line-lengths is the ability for more screens+fonts+UIs to display two blocks of code side-by-side. Somewhere between 80 and 120, you start to force many UIs to wrap. Adding to the subjectivity, approx nobody cares about one line of long code out of 100,000 -- so what percent of code-that-wraps is too much?

To avoid dying on that hill, many eng leads just throw up their hands and give into the 80-column folks, even if it's less productive for everybody else.

The beauty of gofmt and black is that formatting can become a commit hook, so no human wastes time cutting lines manually like some early 20th century typesetter.

lambda · on May 25, 2018

I've got to say, I'm quite happy with the default line length in Black.

The reason I always fight with people who want to increase line length is that they generally want to increase it to 100. Which is just a bit too big to fit two columns comfortably side by side at a reasonable size on a laptop screen, or three columns side by side on a wide desktop screen, which is how I generally set up my editors.

I'm always frustrated with codebases that use a 100 character standard, since I'm always running into lines that wrap, but when I use type annotations, 80 characters causes too many function signatures to have to wrap.

klodolph · on May 25, 2018

When I read that line in the documentation, it hurt a little. I'm going to be honest here. I did not feel good with someone saying that an 80-column limit is just there to pad my paycheck, rather than an informed decision I made. I realize that it was tongue in cheek, but it still felt bad.

sametmax · on May 25, 2018

If that hurt you, you need bigger problems in life.

ubernostrum · on May 25, 2018

I like a line length rule because I use an editor that can show me multiple files side-by-side, and keeping things to a certain length ensures that I can productively do that.

And my terminal happens to be around 240 characters wide for the font size I use and the size of my laptop's screen, which means if I limit to 80 characters or less I can snugly fit three files, or if I set to 100 I can get two with some breathing room.

Plus, if a line really is going over those lengths, then it's often a code smell: maybe I've got code that's too complex and ending up deeply-nested, or maybe I've got functions or methods taking way too many arguments, and hitting a line-length rule will warn me about that.

klodolph · on May 25, 2018

Why are you being so dismissive? It seems like you are making this into some kind of contest to see who has bigger problems.

I am just being honest, and I think I am being helpful because other people will also react negatively to the way the documentation is written. I support the project and want it to be successful, providing feedback like this achieves progress towards that goal, in my estimation.

revfried · on May 25, 2018

yapf is not consistent: you can rerun it on source and it will reformat it multiple times. The rules for how it formats are also hard to understand.

pomber · on May 25, 2018

Prettier is what I miss most every time I switch from JS to Python. I hope Black fixes that.

breckuh · on May 25, 2018

I feel the same way. Prettier is a dream--instant speed, great output and 100% reliability for over a year.

So far black seems great. I just ran it on some existing Python packages and it was fast and the output was correct. Still need to try the editor plugins but very excited so far.

Now, if only someone would lead a similar project for R. :)

acdha · on May 25, 2018

That's been my experience. I'm using the VSCode plugin and have for the first time ever enabled format-on-save for Python. I have yet to have any reason to see that as anything other than a positive.

ljm · on May 25, 2018

In case of credit where credit’s due, is gofmt where the concept of auto-formatting syntax became mainstream? At first I hated it because I thought I had a style, but later on I enjoyed the consistency far more than I enjoyed my signature approach.

A programming language has an opinion on how you build software with it, so it’s appreciated that it also has an opinion on how you should write it so that it remains consistent and easy to follow. No debate about where to put braces or semicolons or whatever else doesn’t matter when putting something in front of your users.

It feels like dumbing down in a way, which is sad, but I think this is more for the benefit of collaboration than individualism or artistic intent. In that case you either disable the tool or refuse to use it.

In every other case, you’ve automated away almost every nitpick from a code review.

sametmax · on May 25, 2018

When I heard about gofmt, I hated it. I wanted choice. Then I realised I loved Python because it forced people to indent. I gave Black a try. I still hate some style decisions, but who cares? The benefits are far too great to pass up.

ljm · on May 25, 2018

I felt the same with Prettier and the 80 char limit. At the same time, it solved every single style problem except the choice of line length!

That is a fantastic achievement.

platinumrad · on May 25, 2018

I think the unconfigurability of gofmt may be its greatest strength. Indent, etc. have existed for years but getting everyone to just agree on a reasonable style is invaluable. GNU Indent's default style is one that (almost) nobody uses and the number of available options is overwhelming. rustfmt's default style is fine, but it also has an overwhelming number of options, in addition to not handling line wrapping well. ocp-indent is the only other formatter I've used that's as painless as gofmt.

biztos · on May 25, 2018

In the Perl world auto-formatting syntax is old hat, though of course (it being Perl) the formatter is highly configurable[0]. So in practice you get consistency within an organization, but perhaps not over time.

Coming from that background I was initially put off by the lack of options with gofmt, but once I realized the whole world of go coders would be using the same formatting I immediately fell in love with it.

[0]: https://metacpan.org/pod/distribution/Perl-Tidy/bin/perltidy

sametmax · on May 27, 2018

Configurability for this means:

- debate

- time to setup the tool to your liking

- testing (and adapting to the style)

- and going back to the first point from time to time

So at best it's going to be costly, morally draining and repeated regularly, especially if you change team or in open source. At worst, which is the case for perl, it will not be used by most devs.

iainmerrick · on May 25, 2018

Go is probably where it went mainstream in the current generation, yeah, but the idea has definitely been around for a while.

The earliest implementation I know of was COMAL, a very clean and tidy version of BASIC, I think dating from around 1980.

A lot of 80s micro BASICs did some level of auto-formatting purely as a side-effect of the source code being stored in tokenized form to save memory.

jejones3141 · on May 26, 2018

A friend had COMAL on his C-64. It was very impressive.

Microware Systems Corporation's BASIC09 from about the same time took the code you entered and converted it into "I-code". It's VM code, Jim, but not quite as we know it. The basic09 program didn't do the things we expect from IDEs now, but it did let you modify code etc., so its internal form reflected the BASIC09 statements and control structures rather than having lower-level branch and conditional branch instructions. That let it prettyprint your code with a consistent format when you listed it. (It also let it avoid the insane interpretation and symbol table lookup overhead of Microsoft BASICs of the era, inherited from the days when Altair BASIC had to run on a system with 4K of RAM.)

Too · on May 26, 2018

Visual Studio has done auto-formatting of C# since at least 2005. I've used many other toolchains that did it before Go became popular. As others noted, the biggest difference with gofmt is probably the lack of configuration.

cup-of-tea · on May 26, 2018

As usual the practice can be traced back a long time to Lisp. It's completely expected that Lisp code will be formatted in the standard way. This isn't a Go thing.

danpalmer · on May 25, 2018

I really want to use Black, but we have a particular style point that we’d miss so much that we haven’t adopted it yet...

Double quotes for strings that need to be human readable, single quotes otherwise.

This makes it so obvious when something is going to be sent to the user, we find it really useful. That said, I think Black’s appeal is its uncompromising nature, so I wouldn’t ask it to change. Adding the option to turn off quote formatting would probably go against its vision. Also, it could be argued that we should use the internationalisation functions to denote strings sent to the user, but hey we don’t do i18n yet.

For now, this, and one or two places that it fails to have an opinion (number of lines after imports) are keeping us from using it.

ambivalence · on May 25, 2018

Do you find your team enforcing the string quote rule consistently? It seems to me like it's easy to miss at times as automatic enforcement is impossible. Are there no cases where a string that wasn't originally planned to be user-visible ends up being so? I've heard this idea at times but when I looked at actual codebases it turns out it's more of an aspiration than an actual rule. And if you can't depend on it, why have it?

As for number of lines after imports, how is a lack of enforcement there stopping you from using the tool? Black enforces one line but is fine if you put two (on module level). In general, if you give up on the tool due to a missing rule, you end up having to manually enforce tens of other rules that you'd otherwise be free from.

yycom · on May 26, 2018

> And if you can't depend on it, why have it?

That's a bit rich. There are other conventions in programming that you can't depend on technically but serve a real purpose. Identifier naming and comments are the first that come to mind.

If a language gives you a choice of token that has no semantic distinction then different people will adopt different semantics by convention.

As an aside, calling a tool "opinionated" is code for "my conventions are better than yours". That's fine if I don't have any conventions or I can't decide, but if I have decided, then it's just offensive.

zestyping · on May 26, 2018

Sometimes it's helpful just to have a decision; any decision, followed consistently, is better than no decision or continued debate. This is one of those situations.

So I don't read "opinionated" to necessarily mean "better than your opinions"; it's more like "makes decisions for you so you can avoid the cost of debating them."

yycom · on May 26, 2018

And what if I have already incurred the cost and am happy with my decisions, and they differ to yours? I now cannot use your potentially useful tool, even if 90% of our decisions do accord with each other. That's disappointing.

jrs95 · on May 26, 2018

Comments are often a sign of poorly written code though. Ideally the name and structure of the program should make the intent and purpose obvious, which eliminates the need for a lot of the comments people leave. Of course this isn’t always true, but if you’re using comments to compensate for bad/confusing code, you shouldn’t think you’re doing “the right thing”.

lambda · on May 25, 2018

I kind of subconsciously do the same thing; use single quotes for symbol-like strings, and double for human readable.

But I can buy the argument that just having a single auto-enforced rule improves consistency and that has greater benefits than the somewhat vague distinction that is not enforced.

danpalmer · on May 26, 2018

> Do you find your team enforcing the string quote rule consistently?

Yes. It's pretty much the only formatting style point that we don't have automated.

> As for number of lines after imports, how is a lack of enforcement there stopping you from using the tool?

Our current automated linting enforces it, but Black doesn't always reformat it, so we might get linter errors from Black formatted code.

ambivalence · on May 26, 2018

> Yes. It's pretty much the only formatting style point that we don't have automated.

Alright, fair enough!

> Black doesn't always reformat it, so we might get linter errors from Black formatted code.

Well, as long as it doesn't add new linter errors, that should be fine, do you disagree?

There are always going to be suboptimal formattings and missing transformations but as long as the situation gets better automatically on average and can be further improved with minimal manual input, you should be fine.

waleedka · on May 25, 2018

One idea you might want to consider is to create a global function, let’s call it h(), and pass all your human-readable strings through it, like so: h("hello world"). This function simply returns the same string. This is more explicit than relying on quote types. It also allows you to do interesting things such as logging everything to a text file and running a spell checker on it, or checking for wrongly encoded strings, etc.
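
A minimal sketch of that idea, assuming an in-memory list stands in for the log file. The name `seen_human_strings` is made up for illustration.

```python
# Hypothetical collector; a real project might write to a file instead.
seen_human_strings = []


def h(text):
    """Mark a string as human-readable and return it unchanged."""
    seen_human_strings.append(text)
    return text


greeting = h("hello world")  # user-facing
column = 'user_id'           # internal identifier, not wrapped
```

Because the marker is a function call rather than a quote style, a formatter that normalizes quotes cannot erase it, and the collected strings can be fed to a spell checker later.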

demosito666 · on May 25, 2018

I believe _() would be even better, given that you might want to translate messages in the future. At first you might just def _(s): return s.

ComputerGuru · on May 25, 2018

A single underscore is pretty much de facto reserved for localization.

jedberg · on May 26, 2018

Which is exactly what this is. Putting everything a human reads into a function is step one of localization.

cup-of-tea · on May 26, 2018

That's pretty standard in libraries that do i18n (like Qt, for example). It's a very good practice IMO. Not only does it enable i18n, it means you can easily collect every user-destined string and check it for things like forbidden words etc.

danpalmer · on May 26, 2018

Yep, as others have commented, this is what I meant by using the translation tools.

Too · on May 26, 2018

Why would you have strings that are not supposed to be human readable? Machine to machine should ideally use bytes (b""-strings) or integer enums for that purpose.

Then you also have the ambiguity of what is considered human readable, is an xml document human readable? Http headers? File paths? Urls? Is a programmer considered human?

zachsnow · on May 26, 2018

I believe the point is whether the string is intended to be displayed to a human, not whether a human might read it in the code. Still some wiggle room around logging, etc. I guess.

Too · on May 27, 2018

Sorry if I wasn't clear; what I meant was actually: why would you have strings that are not intended to be displayed to the end user?

Smells like a non-binary serialization format or something similar, which is usually a code smell. It's convenient for the first 2 weeks, but once the project grows you need a stricter schema, and once you have that you might as well use a serialization library, which might as well have a binary serialization backend.

aldanor · on May 28, 2018

> why would you have strings that are not intended to be displayed to end user

Why would you not? E.g., you use pandas and columns all have names. Colors are typically also strings, etc. Thus you would often do things like

    grouped = df.groupby(['foo', 'bar'])['baz'].mean()

However, the parent's point, IIUC, is that he'd do

    grouped.plot(color='red', title="User-facing title.")

Too · on May 28, 2018

Colors as strings is a perfect example of what I'm trying to question here. How do I know which colors are valid? Run time error? No thanks. Reading online documentation[0]? No thanks. I'd rather have my auto-complete[0] tell me right away the available options and my compiler tell me if I misspelled 'turquoiuse'. A better solution for this is a color class with a long list of predefined colors as constants, and since they are just constants they could have any underlying format, not necessarily a string. Even in a dynamic language such as Python this should be preferred over random strings.

Then yes, there are valid uses, especially in ad-hoc scripts. Column names and dictionary keys are one of the gray zones, though again, in my experience once your project grows these are also usually better to code generate from your db schema or serialization protocol; either complete data structures, api-functions, or just a list of constants. Anyway, my point is not to ban strings entirely, it's to question what we use it for. If you use strings as data/identifiers so frequently that you need a special convention for them something smells quite fishy.

[0] Auto complete = More accessible form of documentation. Before someone starts screaming that i'm stupid for "ignoring documentation".
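
One way the "color class with predefined constants" idea could look in Python, sketched with the standard enum module. The `Color` class and `describe` helper are hypothetical; this is not how pandas or matplotlib actually accept colors.

```python
from enum import Enum


class Color(Enum):
    """Hypothetical set of predefined colors; the values are CSS hex codes."""
    RED = "#ff0000"
    TURQUOISE = "#40e0d0"


def describe(color: Color) -> str:
    # Accepting a Color instead of a str lets editors auto-complete the
    # options and lets static checkers flag unknown member names.
    return f"{color.name.lower()} ({color.value})"
```

A misspelling such as `Color.TURQUOIUSE` fails with an AttributeError at the point of use, and a type checker can flag it before the code runs, instead of a bad string travelling deep into a plotting call.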

nicpottier · on May 25, 2018

We did this as well, but I think the benefits of black and auto-formatting outweigh losing this convention.

bpicolo · on May 25, 2018

Could fork it and remove that rule

alexhill · on May 25, 2018

I love it. Something I always wish for with linters is an easy way to run them only for the lines changed in a particular diff, to allow a codebase to gradually converge on consistency without breaking git blame by reformatting everything. Is there a nice way to do that for any Python linter?

vjeux · on May 25, 2018

At Facebook we only tell you about lint violations for the lines you touch using arcanist from phabricator[1]. While it works great for most lint warnings, this hasn't worked that well for code formatters.

The most successful strategy was to add a flag in the file (@format in the header) to tell that a file is automatically formatted. The immediate benefit is that we enable format on save for developers on those files when they use Nuclide (>90% of penetration for JavaScript and Hack).

The other advantage is that when we release a new version of the formatter, we can re-run it on all those files so that people don't have lint warnings on code they already formatted in the past.

With that setup, there's a strong incentive for individual engineers to run the formatter on their team codebase in one PR and then everyone benefits from now on.

[1] https://secure.phabricator.com/

philwelch · on May 25, 2018

Have you run into issues where the "let's reformat the entire codebase" commit makes `git blame` unusable?

ambivalence · on May 25, 2018

It doesn't. Use `git hyper-blame` or `git blame $REV^ -- $PATH`.

Sure, there is an additional step but we feel this shouldn't be a blocker for significant workflow improvements.

In fact, a single big "reformat all" commit is better than a bunch of incremental ones that reformat areas that you also change semantically. That is harder to filter and makes diffs harder to follow (which changes are logic and which are just style?).

sciurus · on May 25, 2018

I hadn't heard of hyper-blame before. It's part of chromium's depot_tools.

> git hyper-blame is like git blame but it can ignore or "look through" a given set of commits, to find the real culprit.

https://commondatastorage.googleapis.com/chrome-infra-docs/f...

ianamartin · on May 25, 2018

I'm so glad this is a thing. I get so tired of people arguing about the small stuff. Sometimes too much freedom is a bad thing. It seems okay at first, and you try to be as democratic as you can with your team, and then someone wants something really wonky, and PEP8 and PEP257 don't say it's absolutely wrong after all, and shit just wastes time.

I don't agree with every detail--double quotes as default is going to be hard for me to adjust to--but the things I don't agree with aren't as important to me as being able to set it and forget it and stop debating it every so often.

This is the gofmt the python world needs. As far as I'm concerned this is the new standard.

Well done.

kungtotte · on May 25, 2018

Even if your team is totally on board with your style guide, everyone wants to stay pep8/257 compliant, no arguments, you use linters to warn about errors, etc. you will still have some instances of people committing code that breaks the guidelines.

Either it gets caught in code review and you have to waste time with nitpicking, or worse it makes it through to the repo and now you have to make a commit to fix what amounts to a typo.

Autoformatting with a unified, consistent tool means that you remove all those problems.

acdha · on May 25, 2018

> I don't agree with every detail--double quotes as default is going to be hard for me to adjust to--but the things I don't agree with aren't as important to me as being able to set it and forget it and stop debating it every so often.

That's the key part for me, too: Black offers the freedom of not needing to waste time talking about things which really don't matter. No more wasting time on code review where the real issues are obscured by sloppy whitespace, idiosyncratic formatting preferences, etc.

I also had a preference for single quotes but … every file in every project I work on is consistent as soon as I hit save and I certainly don't care enough not to let that outweigh a minor aesthetic point.

curtis · on May 25, 2018

> By using it, you agree to cede control over minutiae of hand-formatting.

I may be in a minority, but I do not want to cede control over minutiae of hand-formatting. Am I the only person that feels this way?

jjkoletar · on May 25, 2018

Your complaint is the entire reason it's an interesting project: it gives you practically no choice. That's why it's called black, as a reference to the Henry Ford quote: "Any customer can have a car painted any color that he wants so long as it is black."

The idea is to toss aside control over nitpicky formatting _configuration_ options in favor of not worrying about formatting configuration and just going with someone else's opinion of what the configuration should be instead.

jaredklewis · on May 25, 2018

Definitely not the only one, but I think the trend is toward using formatters.

José Valim, creator of Elixir and general programming whiz, I think perfectly summed up why formatters are so great in a talk he gave at Elixir conf.

(I am paraphrasing from memory here so if someone has the source, please chime in.)

The gist: “I started using the formatter and at first I ran it on my code and I hated it. It’s taking all my carefully hand-formatted code and messing it up! But then I ran it on OTHER people’s code and I loved it, as the code started to look like the standard format I had gotten used to.”

I think many people don’t start to like formatters until they see what it does to other people’s code. Many people like their own fine tuning, but that’s only half the question. For a big project, most of the code I read will not be my code. I prefer all of that code be in one, consistent style. Sometimes formatting is expressive, so it is a trade-off, but for me the lost expressiveness is far outweighed by the gained consistency.

njharman · on May 25, 2018

Why are you interested in minutiae? Esp that which has no functional impact and can be automated.

People are into things like this, pep8, etc. because they don't want to waste another second of their lives thinking about formatting. Or, worse, discussing, arguing, bikeshedding, documenting, enforcing, teaching the new guy how we format "here".

I'm sorry to sound snarky, but this is one of the things you slowly learn over years of development. I've had more than 25. Long ago I felt like you. No longer.

eslaught · on May 25, 2018

The phrasing "minutiae" is unfortunate because it makes it sound like the exact formatting is a matter of taste or doesn't matter. In my experience, there are always corner cases where tools like these produce bad results. "Bad" isn't just some aesthetic property, it can mean the difference between being able to absorb the meaning of a block of code in 10 seconds, versus having to spend a minute taking it all in. Across a large body of code, those paper cuts really add up.

Personally, the only code formatter I've ever been really comfortable using is clang-format. And the reason is that they really try hard to get the corner cases right. Black might be fine, but I've been burned many times with other tools and in general would be reluctant to trust a tool like this without seeing what it does in practice to a large code base.

Barrin92 · on May 25, 2018

I think most people would agree that corner cases are an issue, but in large codebases the payoff of having a strong, unified standard outweighs the cost of spotty edge cases.

The larger the codebase and the more developers you have working on a project, the less important edge cases become and the more benefit you get from a common standard.

ambivalence · on May 25, 2018

For a large codebase that uses Black, you might want to look at PyPA/Warehouse or Fabric 2.

bunderbunder · on May 25, 2018

I've noticed that, these days, I only find myself worrying about careful hand-formatting when the auto-formatter is doing an inadequate job. e.g., IntelliJ's default Java standard makes far too much stuff optional, which is another way of saying that it is full of cases where it forces me to make the decision.

Looking through Black's rules, it seems to me like the rules it's implementing are comprehensive and specific enough that I'd probably end up with few, if any, situations where I even want to take control. And I'd gladly give those up in return for not having to wade through so much diff clutter when I'm doing code reviews.

spathi_fwiffo · on May 25, 2018

around the same length of time developing. ... i still hand format; mostly because i think most auto-formatters have bad defaults. Which may no longer be true, since i turn off indent in vim immediately on any installation.

It isn't really interest in minutia -- My fingers just do the thing automatically at this point; which means if autoindent is on, I then have to go back and delete all the stuff my muscle memory has made me do.

I'm ok with it if i'm forced into some IDE with an editor that is not built for actually writing code (i.e. every IDE default editor); in those instances auto-formatting is very useful. I just avoid those environments, if at all possible.

ianamartin · on May 26, 2018

Totally agree with this. After having to go through how we do things "here" at a few different jobs, it's just not worth it. It's not worth it when I have to learn someone else's stupid idiosyncrasies, and it's not worth it when I "get" to enforce my own.

I'm willing to let go of the things I'm used to in favor of having something completely uncontroversial. I've been in the business not as long as you, but long enough to count the time lost on this stuff.

We do get attached to style and personalization. I think I was a lot more attached when I was younger at this. Perhaps my formatting was more important when my code itself was less personal or less elegant or something. Or maybe the tasks were simply things that weren't all that interesting but just needed to be done. So my way of leaving my mark was to make the formatting just absolutely perfect. Perhaps it was a way of asserting some agency in junior positions where the architecture was predetermined, the problem was well-defined, and the solution was already known when the ticket was assigned. Just get in there and write the code.

I think--and I may be wrong about this--that as I've gotten older and into roles that are more autonomous, where I get to architect entire components of core company business or start from scratch or do other things that assert my personality and agency in code, I care a hell of a lot less about formatting. Mine, yours, someone else's, I don't fucking care, just forget about it and move on.

I also suspect that caring a lot about code formatting is one of the few ways that juniors can signal that they are really engaged in their work and get a little attention. You can't argue about an application's design or anything actually important, so you push a little on what you can, which is somewhat reasonable, and probably a signal of poor management, really.

Anyway, I digress. Bottom line is that I care less and less as I get older and have other things to worry about. I'm starting to view people who are really picky about personal conventions of code formatting as people who either don't or can't contribute anything more interesting to a conversation.

jjuel · on May 25, 2018

This is one of the things I miss in Python from Go. I loved gofmt because I didn't have to think about the formatting; it was just formatted. It did take time to get used to, but now I miss it in other languages. Especially when working with other developers who may not share the same formatting style as myself.

"Gofmt's style is no one's favorite, yet gofmt is everyone's favorite."

philosopherlawr · on May 25, 2018

Maybe give it a try. I felt the same way about Prettier, but then I realized how much time I save by no longer having to worry about formatting. As I type, I just type it all on one line and then do a keyboard shortcut and it magically gets reformatted and looks pretty.

tom_ · on May 25, 2018

This is what I do too. Provided it's easy to type out, I don't really care all that much what the code looks like - especially when I'm getting paid.

(It just wants to be consistent, something the computer is very good at enforcing.)

dexterdog · on May 25, 2018

Not at all. I would never use something like this except to clean up some sloppy code that I am inheriting.

joemaller1 · on May 25, 2018

Sometimes that sloppy code is your own code from yesterday or last week or last month. Tighten that feedback loop and sooner or later you're formatting on save and really happy about it.

cjbprime · on May 25, 2018

I probably won't use this for my own personal solo projects. But I'm happy to cede control over minutiae when working on a large team, because it's almost impossible to stick to one style in that case. Even just communicating what the coding style is exactly is too hard for most teams.

0xffff2 · on May 25, 2018

I'm fine with ceding control right up until the formatter does something I don't like for no good reason. For languages like Rust, where there is a single format convention that is closely tied to the language (via rustfmt), I am okay with this kind of forced standard. For something like Python, C++, or Java where there isn't a single "winner" for format guidelines, there's virtually no chance that I would embrace something like this.

revfried · on May 25, 2018

A little bit of a catch 22 you think?

C++ has clang-format.

Black has good momentum right now; it very well might be the clear winner in a few months.

0xffff2 · on May 25, 2018

I don't think it's a catch-22 at all. Either the language has an official style guide (C#) and/or formatter (Rust, Go), or it doesn't (C++, Java). Given that black doesn't even seem to comply with PEP-8, I don't consider it acceptable.

I use clang-format because it gives me full control over the style. I cede control to it because it happens that I can express all of my personal minutiae of hand formatting in clang-format rules. In contrast, I am currently writing Java in VS Code, and the Java formatting plugin doesn't give me an easy way to change its rules, so I disabled it entirely.

ambivalence · on May 25, 2018

> black doesn't even seem to comply with PEP-8

That is just not true. Where does Black not conform to PEP 8?

zbentley · on May 25, 2018

> right up until the formatter does something I don't like for no good reason.

Hand-formatting. You are describing formatting code by hand.

spathi_fwiffo · on May 25, 2018

I almost always turn off auto-formatting, at least when I can use vim. If I'm forced outside of vim (where I don't have a 'shift' operation), then I don't mind it as much, but I almost always hate the way auto-formatted code looks.

freyir · on May 25, 2018

I used to feel this way before using Go.

ben509 · on May 25, 2018

We get a fair amount of code from non-developers, and it invariably looks like hot garbage. Having something that auto-cleans it is a godsend.

Forge36 · on May 25, 2018

You are certainly not a minority in the discussion. Personally I hate repeating the same argument as it's a style question. I program in C# and dislike braces on new lines, I leave it in place because my company rolled out a standard and if I deviate from that I create more work for myself and others.

Are you consistent with your formatting rules? I'd be interested to know how much time you spend formatting your code vs ceding control to an auto-formatter.

curtis · on May 25, 2018

It takes me a lot longer to write the code than it does to format it for readability.

I'm inclined to think that the reason that auto-formatters are popular is not because manually formatting code is hard, but simply to head off nitpicking in code review.

I think there's a better solution to the "style nitpicking in code review" problem: Just don't do it.

If you're nitpicking style during a code review, chances are good that you are not looking for real problems.

anonnel · on May 25, 2018

It's more cargo cult for nitpickers.

jowiar · on May 25, 2018

If you’re working alone, fine.

But when working with other people, getting everyone to do the same thing, and having that automatically done for you / enforced, is incredibly valuable. It’s such a massive win that any deviation from “my personal optimum formatting” is rounding error.

zephyrfalcon · on May 25, 2018

Not really. I worked at one place where they enforced PEP8 (using flake8 etc) using a git commit hook. In other words, when you try to commit some code, it would run flake8 which would most likely balk at some of the changes you made (and not let you commit until those were fixed). A lot of the rules/warnings are extremely nitpicky, and they had all of them turned on (except for the line length); I don't know how many times I could not commit my code because I left a space at the end of a line, or an empty line contained whitespace, or I had only one blank line between two class definitions, etc. It was just ridiculous. I can understand the argument for having all code in the same format, but this kind of mechanism just seriously decreased my productivity, not least because it constantly pissed me off.

Admittedly, Black works differently; as I understand it, it will just auto-reformat your code rather than yelling at you and making you go through your code and fix everything by hand, which is what the aforementioned approach did.

Still, I wonder about the usefulness of such tools. Python code is already much more uniform than most other languages. Also, I am not sure you should take the last crumbs of creativity or personal preference away from programmers. Last but not least, PEP 8 is meant as a style guide, not as a book of law that needs to be enforced at all costs. Some of the Python core developers seem to agree; I have seen comments from Guido and others who apparently think that such tools go against the spirit of the PEP.

JoshTriplett · on May 25, 2018

> I don't know how many times I could not commit my code because I left a space at the end of a line, or an empty line contained whitespace

Trailing whitespace causes issues with git, editors, diff tools, and numerous other things, as well; keeping it out of a repository is a good thing.

> I had only one blank line between two class definitions

I certainly agree that that's the kind of thing a tool should help with rather than complain about and make you fix.

zo1 · on May 25, 2018

> "I certainly agree that that's the kind of thing a tool should help with rather than complain about and make you fix. "

It's also the type of thing you should be getting your IDE to worry about. And not leave it till it's time to commit/push.

crehn · on May 25, 2018

Just run

  black .

or

  autopep8 -ir .

before commit, what's the big deal? Keeps the code consistent, improves readability, simplifies CI and allows focusing on more important things.

0xffff2 · on May 25, 2018

> I don't know how many times I could not commit my code because I left a space at the end of a line, or an empty line contained whitespace, or I had only one blank line between two class definitions, etc.

On the other hand, it drives me nuts when I see stuff like this in our source files. I would love to have a commit hook that did nothing but enforce a minimal set of white space rules. Just requiring no trailing white space and no mixed tabs and spaces would make me so happy.
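For what it's worth, a check like that is small. Here is a minimal sketch in Python of just those two rules; the function name `whitespace_problems` is made up for illustration, and wiring it into an actual commit hook is left out.

```python
def whitespace_problems(text):
    """Return (line_number, description) pairs for two minimal whitespace rules."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Trailing whitespace: the line differs from its right-stripped form.
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        # Mixed indentation: the leading whitespace holds both a tab and a space.
        indent = line[: len(line) - len(line.lstrip())]
        if " " in indent and "\t" in indent:
            problems.append((number, "mixed tabs and spaces"))
    return problems
```

A hook built on this would presumably refuse the commit whenever the returned list is non-empty for any changed file.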

> Admittedly, Black works differently; as I understand it, it will just auto-reformat your code rather than yelling at you and making you go through your code and fix everything by hand, which is what the aforementioned approach did.

I don't actually write much Python, but surely you could have run the same tool (or some other formatter configured to match the linter) locally to have it do that auto-format for you?

zephyrfalcon · on May 25, 2018

> I don't actually write much Python, but surely you could have run the same tool (or some other formatter configured to match the linter) locally to have it do that auto-format for you?

Maybe... I don't work there anymore, but if I'm ever in a similar situation, I will consider that approach, assuming it will only reformat files as needed. (I suspect the company-mandated flake8 script scanned all the code (200K lines), rather than just the files that changed, considering how slow it was.)
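Something along these lines is what I have in mind: a rough sketch that asks git for the staged files and hands only the Python ones to Black. The helper names are invented for illustration, it assumes `git` and `black` are on the PATH, and I have not run it against a real hook setup.

```python
import subprocess


def python_files(paths):
    """Keep only the paths that look like Python source files."""
    return [p for p in paths if p.endswith(".py")]


def staged_python_files():
    """Ask git for staged (added, copied, modified) files and keep the .py ones."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    return python_files(result.stdout.splitlines())


def format_staged():
    """Run Black on staged Python files only; do nothing when there are none."""
    files = staged_python_files()
    if files:
        subprocess.run(["black", *files], check=True)
    return files
```

Note that files Black rewrites would still need to be staged again before committing, which this sketch does not do.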

repsilat · on May 25, 2018

So long as it isn't too helpful, I guess.

A coworker of mine at a previous job used a JS autoformatter built into his editor, and he couldn't insert a `debugger` statement into his source when testing locally because his editor would delete it immediately...

I spend about equal amounts of time fighting with my style linter, formatting my code, and disabling dumb lint rules. Maybe an auto-formatter would save time by reducing the first two more than it would increase the second (though if the formatter introduces bugs by deleting bad code all bets are off.)

I also don't really care about style... Who really cares where line breaks are? Who cares whether you line up your comments with spaces or not? That stuff doesn't "take time", affect readability, cause arguments etc, because we're not children.

ben509 · on May 25, 2018

> I can understand the argument for having all code in the same format, but this kind of mechanism just seriously decreased my productivity, not in the least because it constantly pissed me off.

Globally, though, code is read an order of magnitude more times than it's written. So it's a huge productivity improvement in the not particularly long run.

zephyrfalcon · on May 26, 2018

That doesn't really apply here... Like I said, Python code is already much more uniform than most other programming languages. Unless somebody's formatting style is particularly ridiculous, their code should be just as readable as anybody else's, even if they put a space where it supposedly doesn't belong, or use too many/too few blank lines, or use the "wrong" way to split and indent a long list of function parameters. There is no productivity gain for others here. Maybe for languages like Javascript or C, but not Python. Almost all of it is just nitpicking.

(Of course you can make code less readable in other ways, e.g. by choosing undescriptive names, or weird idioms, but flake8/Black naturally don't address those issues.)

joshuamorton · on May 27, 2018

As someone who reads and writes a lot of Python, a consistent style is incredibly valuable.

The only reason I don't use autoformatters is because most of them are bad. In my experience, black doesn't have those issues.

takeda · on May 26, 2018

> Last but not least, PEP 8 is meant as a style guide, not as a book of law that needs to be enforced at all costs. Some of the Python core developers seem to agree; I have seen comments from Guido and others who apparently think that such tools go against the spirit of the PEP.

Actually PEP 8 starts with: "A Foolish Consistency is the Hobgoblin of Little Minds"[1]. A lot of people miss that part.

[1] https://www.python.org/dev/peps/pep-0008/#id15

davidfstr · on May 26, 2018

Love the idea of eliminating formatting debates when writing Python.

At the risk of being redundant, I also raise an eyebrow at the choice to prefer double quotes for strings. My company standardized on single quotes, mainly to be consistent with repr and also encourage the use of double quotes in messages displayed to the user.

Everything else seems in order. I might increase the line width to 90 just to use an easier value to remember when configuring editors and other tools ;)

epr · on May 25, 2018

--single-quoted-strings would be a really nice option for the vast majority of python programmers that use single quotes by default.

aldanor · on May 28, 2018

Ditto, no single quoted strings -- not even going to consider it no matter how brilliant it (potentially) is.

prawl · on May 26, 2018

This isn't a vast majority by any stretch of the imagination.

jhall1468 · on May 26, 2018

The overwhelming response here saying use single quotes begs to differ.

ambivalence · on May 26, 2018

That's not how this works. All the people who are just happy or indifferent about double quotes don't comment about it. And some of the ones that aren't happy about it commented here multiple times.

Judging from the additional stars on GitHub, and projects that just migrated (pytest!), I'd say there's a very vocal minority which is very attached to single quotes.

jhall1468 · on May 28, 2018

> Judging from the additional stars on GitHub, and projects that just migrated (pytest!), I'd say there's a very vocal minority which is very attached to single quotes.

Accuse me of selection bias. Immediately use even more biased selection bias.

pkilgore · on May 26, 2018

Selection bias is real here.

I thought double quotes were smart for all the reasons in the README. Me commenting 'this is great' is just noise on HN, and discouraged by the rules. Never take self-selected anything as truth, especially comments (tweets/posts/voluntary votes).

aldanor · on May 28, 2018

This is all subjective unless some data is collected. I could say with just as much confidence that all of the major Python projects I've seen or contributed to use single quotes -- e.g. numpy or pandas.

mjkunc · on May 25, 2018

This is the only issue stopping my team from adopting Black.

49bc · on May 25, 2018

> You will save time and mental energy for more important matters.

Exactly why every language, from here until the end of time, should have a “go fmt” equivalent.

quietbritishjim · on May 26, 2018

The problem with code formatters for Python is that you can't just break lines using whitespace; you need to insert symbols, ideally parentheses. For example, if you need to break this line of code:

    left[first][second] = right[first][second][third]

Manual breaking looks like this:

    left[first][second] = (
        right[first][second]
            [third]
    )

Code formatters will produce something like the following atrocity:

    left[first][second
        ] = right[first][second][
        third]

Comments and strings are also unwrappable if the formatter is afraid of inserting characters.

ambivalence · on May 26, 2018

Did you actually try Black on a line like this?

freyir · on May 26, 2018

In:

    left[a_rather_long_key][a_rather_long_key] = right[a_rather_long_key][a_rather_long_key][a_rather_long_key]

Out:

    left[a_rather_long_key][a_rather_long_key] = right[a_rather_long_key][
        a_rather_long_key
    ][a_rather_long_key]

quietbritishjim · on May 26, 2018

I must admit I didn't. I felt confident it was restricted to adding and removing whitespace because the README says at the start:

> Black ignores previous formatting and applies uniform horizontal and vertical whitespace to your code.

I now see it does sometimes modify non-whitespace characters e.g. later in the README it mentions:

> In [certain] cases, parentheses are removed when the entire statement fits in one line

I'm not in a position to test black out right now (I can't run Python on the computer I'm posting this comment on). I'd be curious to know what it does on the code I posted, and on over-length comments and string literals.
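In the meantime, one way to convince yourself that adding or removing parentheses like this cannot change behaviour is to compare syntax trees, since grouping parentheses and line breaks inside brackets do not appear in the AST. A small sketch using the standard `ast` module on the three layouts from my earlier comment (the constant and function names are just for illustration, and this is not a claim about how Black itself is implemented):

```python
import ast

ONE_LINE = "left[first][second] = right[first][second][third]\n"

# The hand-wrapped version, using inserted parentheses.
MANUAL = (
    "left[first][second] = (\n"
    "    right[first][second]\n"
    "        [third]\n"
    ")\n"
)

# The whitespace-only version that breaks inside existing brackets.
BRACKETS = (
    "left[first][second\n"
    "    ] = right[first][second][\n"
    "    third]\n"
)


def same_ast(first_source, second_source):
    """True when both sources parse to the same tree, ignoring positions."""
    return ast.dump(ast.parse(first_source)) == ast.dump(ast.parse(second_source))
```

Calling `same_ast(ONE_LINE, MANUAL)` and `same_ast(ONE_LINE, BRACKETS)` should both report equality, while dropping the final `[third]` from either side should not.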

freyir · on May 26, 2018

I tried it. Black doesn't seem to touch long strings or comments. It just leaves you with "line too long" errors that you can clean up yourself.

For your code example, see my comment above.

jsmeaton · on May 25, 2018

I wanted to add a bit of positivity and mention that I'm really liking how the black project is approaching problems. For example, a few people brought up fluent interfaces[0] as an issue. There were many opinions about the right and wrong thing to do, but discussion got to a very pragmatic decision I feel.

Then there were requests to add command line arguments specifically so that tools could integrate with black that were added almost immediately.

Congrats on gaining so much traction so quickly, and thanks for listening to users (when it makes sense).

[0] https://github.com/ambv/black/issues/67

erikig · on May 25, 2018

I chuckled at this "If you're paid by the line of code you write, you can pass --line-length with a lower number."

zaius · on May 26, 2018

Has anyone rolled this out to an existing codebase? What's the best practice? A single commit that reformats the whole codebase? How do you avoid creating merge hell?

ambivalence · on May 26, 2018

You can see how this was done for Fabric, PyPA/Warehouse, and pytest.

General guidelines:

1. One commit with only the automatic formatting. Afterwards you'll be able to skip over it easily with `git hyper-blame` or `git blame $BLACK_REV^ -- $FILE`.

2. Avoid leaving open pull requests. If you do, after landing the blackening commit, blacken all pull requests, too. They shouldn't conflict then.

3. Set up enforcement with pre-commit or CI (you can run `black --check` on Travis or similar).

4. Don't forget the repo badge ;-)
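For step 3, a CI job only needs to look at the exit status of `black --check`. A minimal sketch in Python, assuming `black` is installed on the CI machine; the helper names are made up, and the exit-code meanings are the ones described in the README for `--check`:

```python
import subprocess

# Exit codes as described in Black's README for --check.
MEANINGS = {
    0: "nothing would change",
    1: "some files would be reformatted",
    123: "internal error",
}


def describe_check_result(returncode):
    """Translate a `black --check` exit code into a short message."""
    return MEANINGS.get(returncode, "unexpected exit code {}".format(returncode))


def run_black_check(path="."):
    """Run `black --check` on path and return (passed, description)."""
    completed = subprocess.run(["black", "--check", path])
    return completed.returncode == 0, describe_check_result(completed.returncode)
```

A build script could call `run_black_check()` and fail the job whenever `passed` is false.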

ra7 · on May 25, 2018

How does this compare with Yapf (https://github.com/google/yapf)?

ambivalence · on May 25, 2018

Answered here: https://news.ycombinator.com/item?id=17155205

joobus · on May 25, 2018

Anyone know how I can get auto format on save to work in Vim with Black? I installed with Plug.

Figured it out:

`autocmd BufWritePre *.py Black`

crooked-v · on May 25, 2018

88 characters per line is a weird choice. Why not 90?

ambivalence · on May 25, 2018

Originally PEP 8 had 79 characters. Now that was a weird choice so most companies went with 80 instead, including Facebook. You want a low-ish limit because it makes it possible to fit two files side by side on a typical screen resolution. Even if you don't edit like that, you look at diffs like that. More importantly, a low column limit is helpful to disabled engineers who don't have to navigate horizontally so much.

So we used 80. I always felt bad when the linter stopped people from pushing impactful changes because they went over the limit by two characters. So two years ago I set up a "highway speed limit" style warning in flake8 (code B950 in PyCQA/flake8-bugbear). What it does is it keeps your limit intact (for example "80") but doesn't trigger unless you went over by more than 10%. So the limit happens to be 88.
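The arithmetic, as a small Python sketch. This is only an illustration of the "limit plus 10%" idea, not the actual flake8-bugbear code, and the function names are invented:

```python
def allowed_length(limit=80, tolerance_percent=10):
    """Longest accepted line: the nominal limit plus the tolerance, in whole characters."""
    return limit + limit * tolerance_percent // 100


def is_too_long(line, limit=80, tolerance_percent=10):
    """True only when the line exceeds the limit by more than the tolerance."""
    return len(line) > allowed_length(limit, tolerance_percent)
```

With the defaults, `allowed_length()` comes out at 88, so a line of 81 to 88 characters passes and a line of 89 does not.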

When I was working on Black, I was faced with a dilemma. Should the formatter stick to 80 or be able to "go over" a bit, too, as we would let humans do? I felt like the latter made more sense as the resulting code looks nicer (fewer occasions to break a single line into three or more). Then I remembered Raymond's talk "Beyond PEP 8" where he mentions that experience shows "90-ish" is the wisest choice. So I went with it.

bpicolo · on May 25, 2018

> More importantly, a low column limit is helpful to disabled engineers who don't have to navigate horizontally so much.

I saw that you mentioned that - where have you seen a study that claims 100 is the cutoff? Would be interested in seeing that.

ambivalence · on May 25, 2018

I don't have a formal study, just conversations with several legally blind engineers I work with.

B1FF_PSUVM · on May 25, 2018

> Originally PEP 8 had 79 characters. Now that was a weird choice so most companies went with 80 instead

Probably thinking of one of these:

- backslash continuations

- terminals/etc that counted newlines

- off by one errors

Izkata · on May 26, 2018

Also possibly a terminal text editor - the cursor sits on the column where the next character would be input (such as vim's insert mode). So with 79 character lines, the cursor is sitting at 80 while waiting for the next input.

If your screen is larger than that it's no big deal, but if your screen is 80 columns and the cursor was at column 81, it would wrap to the next line without actually being a newline.

mappu · on May 26, 2018

ASCII-rendered scrollbars (like EDIT.COM and friends) also take up a character.

samatman · on May 26, 2018

Yes.

The old standard was actually 78, so that a diff would fit on the screen.

revfried · on May 25, 2018

Science. 88 produced smaller files than 80. And 90 and above didn’t noticeably shorten files.

Like Raymond Hettinger said in his talk "Beyond PEP 8", 90-ish is better than a strict 80. If you have 81 characters on a line, it's a waste of time to split that into three lines, and the result is harder to read. So there should be some buffer. That buffer is 10%. The goal is 80, but we are OK with up to 88.

gugagore · on May 25, 2018

But no matter what, it's a strict something, right? In Black it's a strict 88 by default (unless it violates some other rule, if I understood the README).

drivingmenuts · on May 25, 2018

90 characters?!?!? You heathen. I want 132 characters per line.

philosopherlawr · on May 25, 2018

How does this compare to yapf?

ambivalence · on May 25, 2018

Answered here: https://news.ycombinator.com/item?id=17155205

hrez · on May 25, 2018

What's with the Python 3.6 requirement? Major stable distros are at Python 3.5, so I can't even try it without messing with the system Python or the OS.