It's funny you should cite limited cognitive capacity, because the same starting point leads me in exactly the opposite direction.
"Human brainpower is limited, so we should use weaker languages" seems a strange argument. The whole point of using more powerful languages is that they can express more with less, allowing us to fit more into limited human RAM. It's because our brains can't manage millions of details that we don't build systems directly out of machine instructions in the first place and resort to cleverer strategies like higher-level languages. At what point along the way does that process stop being valid?
It's for the same reason that we sometimes favor Newton's tools over Einstein's: RAM isn't the only cognitive limit. "Powerful" in the sense you're using is similar to compression. The deeper the compression, the more effort it takes to unpack fully.
Deeper abstractions are only useful in coding to the extent that you don't have to understand the details. But abstractions leak:
As Martin Fowler writes, "Any fool can write code that a computer can understand. Good programmers write code that humans can understand." You might be able to turn 100 lines of Java into 1 line of Lisp, but if it takes your colleague more time to understand and modify the 1 line of Lisp than the 100 lines of Java, then you've made things worse.
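To make the trade-off concrete, here's a contrived Java sketch (my example, not Fowler's; the names are mine). Both methods compute the same word counts; the first compresses the whole computation into one expression, the second spells out every step:

    import java.util.*;
    import java.util.stream.*;

    class WordCounts {
        // Dense: one expression, every symbol carrying weight.
        static Map<String, Long> countDense(List<String> words) {
            return words.stream()
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
        }

        // Plain: the same computation, each step spelled out.
        static Map<String, Long> countPlain(List<String> words) {
            Map<String, Long> counts = new HashMap<>();
            for (String w : words) {
                counts.put(w, counts.getOrDefault(w, 0L) + 1);
            }
            return counts;
        }
    }

Neither is wrong. The question is which one the next reader unpacks faster, and that depends on what they already know.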
"The deeper the compression, the more effort it takes to unpack fully."
That implies that humans must mentally compile the code. By definition that's not true of a higher-level language.
Ditto for your invocation of leaky abstraction: by definition, a program written in a language can be understood by learning the language and reading the program. If that's not true, then the program must not really be (fully) written in that language.
The whole reason why we prefer a program in, say, C to a 100x larger program written in assembly language is that it's easier to understand. Given that available evidence points to source code size as the best measure of complexity and best predictor of errors -- there was a recent thread on HN about this with links to several papers -- I think the burden is on you to explain how the benefit of writing smaller programs in more powerful languages holds good up to, say, Java, but breaks down when you move to, say, Lisp or K.
In theory, theory and practice are the same. In practice, they're different.
The higher the language, the more abstract the abstractions. Each symbol represents more, and therefore hides more. There are manipulations of the symbols that are plausible in context but idiotic once you look at the hidden details. I see this regularly with developers who have only worked in higher-level languages: from their perspective, two approaches to a problem appear equivalent, yet there is a difference of three orders of magnitude in performance between them. By definition (well, your definition) that doesn't matter in a higher-level language. In practice, it does.
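A canned Java illustration of the kind of thing I mean (not one of the actual incidents, and the names are mine): at the level of "combine these strings", the two methods below read as the same operation. Underneath, the first copies the entire accumulated string on every iteration, so it does quadratic work; the second appends into a growable buffer and does linear work. At a hundred thousand parts, that is easily the factor of a thousand I'm talking about:

    class Join {
        // Looks equivalent to joinFast, but does O(n^2) work:
        // each += allocates a fresh string and copies everything so far.
        static String joinSlow(String[] parts) {
            String s = "";
            for (String p : parts) {
                s += p;
            }
            return s;
        }

        // O(n): appends into a resizable buffer, one copy at the end.
        static String joinFast(String[] parts) {
            StringBuilder sb = new StringBuilder();
            for (String p : parts) {
                sb.append(p);
            }
            return sb.toString();
        }
    }

If strings are just an abstract value type to you, nothing in the source distinguishes these.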
I don't think I've said that Java is some sort of magic universal middle point in languages (and indeed, I don't think that exists), so I don't accept the burden you're pushing on me. You asked how a language could be too powerful, and I tried to answer your question. If you have more questions, feel free to ask.
Not "pushing on you". Just very interested in this question.
So this is about performance? But that is a separate issue. The claim in the OP is that a language can be "too powerful" because programs written in it, although shorter, are too hard to understand. I'm looking for evidence of this that doesn't simply reduce to "intuitive is just what you've seen before". The interesting thing is that there doesn't seem to be any.
(Performance is a separate issue because it has to do with the gap between abstractions and the machines they have to execute on. I think this may be the Achilles' heel of the highly mathematical FP approaches. Still, it's a change of subject to interpret "too powerful" as "too powerful for the machine" rather than "too powerful for the brain".)
The underlying problem seems to be that abstraction is hard. That is, for any complex system, it's hard to come up with a set of concepts that can express the system with less overall complexity. This is true of programming in all languages. The problem is masked in lower-level languages because the abstractive weakness of the tool provides an excuse to produce the usual reams of code and thus the usual runaway complexity. But that doesn't mean the task is easy in a more powerful language. The hardest part -- coming up with a good set of concepts in the first place -- remains. Faced with that, people do sometimes rush to apply the technical devices offered by, say, a Lisp to a not-clear-enough set of concepts. The resulting programs can indeed be hard to understand. I've done it myself.
Performance is not a separate issue because, as professional programmers, we aren't selecting tools for anything other than their ability to help us get things done for our users. This isn't a question of "too powerful for the machine"; it's a question of "too abstract an approach for a team of humans to efficiently understand the implications of the choices its members make".
In that sense, some abstractions are indeed too "powerful" for the problem and the people at hand. When training somebody to work the register at a corner store, one does not start by deriving number theory from Peano's axioms and then telling them to work it out from there. You make sure they understand basic arithmetic through simple practice and then you train them on the specific operations for money, mainly with further practice. For the novice cashier, "power" in the sense you use comes at too high a price with too low a payoff to be worth it.
By the way, what came across as pushy was you insisting I had a burden to justify something I had never said (and incidentally don't believe).
"Human brainpower is limited, so we should use weaker languages" seems a strange argument. The whole point of using more powerful languages is that they can express more with less, allowing us to fit more into limited human RAM. It's because our brains can't manage millions of details that we don't build systems directly out of machine instructions in the first place and resort to cleverer strategies like higher-level languages. At what point along the way does that process stop being valid?