The answer was "not while computing was getting exponentially better every year". This sort of improvement curve is really difficult to compete with.
It's in the process of being liberated now. GPUs are already deployed in many machines with a fundamentally different architecture. And there's a bubbling ferment of alternatives being developed now.
In the end, I don't think it was necessarily a case of us getting stuck on "von Neumann" per se... we would have gotten stuck on anything we settled on in the 1960s. And despite familiarity breeding contempt, I'm not that convinced we could have done that much better in the 1960s. A lot of the alternatives are not universally better; they're better in some limited domain, and it's not obvious that they are going to uniformly beat von Neumann for general computing, even accounting for the fact that "general computing" has coevolved for 50 years with von Neumann.
GPUs are much closer to von Neumann machines than to anything people typically deriding the latter have in mind (such as, say, the Reduceron architecture.) Every commercially significant accelerator, GPUs included, has a RAM (often multi-banked to support vectorized random access), has an imperative programming model, and runs (often vector) instructions from program memory that operate on registers, with data occasionally exchanged between registers and RAM addresses. $100 says this will remain true for the next 20 years.
The thing farthest from a von Neumann machine that is commercially significant is the FPGA, and it is nothing like what people deriding von Neumann machines have in mind. It's imperative, has registers and RAM blocks, and it will cause nausea in the typical functional programming supremacist. (Except those that choose to squander its performance potential on implementing a wasteful machine like the Reduceron because graph reduction is just better than an imperative low-level programming model.)
Programmers need to be liberated from misconceptions wrt von Neumann machines.
I'm taking the general case of the question, admittedly.
I think a lot of the von Neumann detractors think that there's going to be this revolution, where one day they'll wake up and omigosh, stream processors have suddenly taken over the market (or whatever your favorite non-von-Neumann architecture is) and That was the Day that Everything Changed.
That's not how it's going to be. It can't be. Too much inertia. And that inertia isn't just a vague mass... it's everything you do now, browsers, video games, everything. It's going to be gradual. No, it is gradual, since it's actually happening now. From that perspective it's not a surprise that GPGPUs are still "mostly von Neumann", and are the first things out of the gate that have become wildly popular.
I think it also demonstrates my "familiarity breeds contempt" point fairly well. I remember around the time the PS3 came out, there was a convergence between the Cell hype and what-became-CUDA hype, and I recall a small legion of fanboys running around claiming that in five or ten years, GPUs or Cells would be the dominant form of processor. Which ignores the fact that von Neumann is not all disadvantage; if all we had were non-von-Neumann processors, von Neumann processors would be hailed as a breakthrough when somebody invented them, because wow, look at these workloads they just tear through compared to the competition! Even heavy-duty FP programmers will dip into true imperative programming on certain data sets without shame, using the FP system to confine the side effects but freely using them within that confine.
We'll get there. But it'll take time. And as always, initial naive ideas and wild promises will give way to surprising practical problems and performance that wasn't quite as good as we hoped, but may still be advantageous for certain problems.
I... think we're vehemently agreeing?
On the plus side, once these architectures get proofed, if they're in conventional silicon they'll have very swift access to the very-small-number-of-nanometers chip fabrication processes.
I don't think it's inertia. I think RAM is fundamentally very useful, and that's why it's not going away. You want to store data somewhere if you access it more than once, it has to be a large array of cells if it's a lot of data, and accessing data in the array by address seems to me more useful than accessing it by something else (such as by key/content.) Similarly, a file of registers addressable by number is very useful, etc.
I don't mind new and original ideas - let's hear new and original ideas; if someone invents a non-round wheel that's better than a round one, great! I do mind a general assumption that wheels are round due to inertia, without explaining what we're going to use instead of round wheels once we're past the inertia.
I think you are right that the von Neumann architecture was not an arbitrary choice. But I think big computers are evolving into something else.
In theory, a von Neumann machine asks for a word of data and gets it back from RAM. But nowadays we need caches which communicate with the real RAM using something vaguely like network packets. The hardware then emulates a von Neumann addressing scheme on top of that.
But look at what software does with all that RAM: good programming isolates objects from each other and forces them to communicate through well defined interfaces. Bitter experience with threads teaches us to pass values around more than to share state.
It seems there is room for the software abstractions to map more directly onto the hardware than they do now. In which case computers will no longer be giant von Neumann machines, but networks of little ones.
I agreed with your previous comment, but here I think you're missing an elephant in the room: a brain does not use "RAM" as it's implemented in von Neumann computers. There seems to be a long term storage of some sort, sure, but that's more like hard drive storage than RAM. And in the brain, memory is integrated in the compute elements themselves. So the obvious way to copy such an architecture is to use analog computing elements, which are at the same time memory elements - neural network weights.
When NN algorithm development reaches the point where NNs work well enough for commercial applications (which seems to be rather soon), the next step will be specialized hardware to run those applications, and that's where we can start building analog chips with huge weight arrays, programmed as analog values on transistors/capacitors/memristors/etc. That's when a true departure from von Neumann might take place.
You can't program a brain the way you can program computers. You can't program NNs either, in that sense. They have their place; I work for a company shipping them now with hw acceleration. Let's ignore the role of RAM here and the fact that NNs are not like real neurons, and let me grant the point that it's not a vNA, programming-model-wise. Most of computing, however, will not be NN-based, because you want to program things precisely, and the brain is terrible at executing commands precisely.
But that's the thing - I don't want to program things precisely! Most people don't. I want to tell my computer what I want in plain English, and I want it to understand me, translate my wish into actions, and perform those actions. Just like a human servant would do. That's the future of programming (or so we should hope, at least).
Well, I would say it sounds like you want a computer that understands you imprecisely so it can account for the idiosyncrasies of human communication, but I think you still want it to carry out the action it's decided you want in a precise manner. Protocol negotiation does not lend itself to imprecise ordering of actions or commands.
Of course. If I hire a human servant, I will expect him to understand me even when I'm not very precise in my commands, and perform the action I wanted precisely. In fact, the best servant is the one who does what you want before you have to ask him, right?
That's exactly what I expect from computers in the future.
I don't know why you're assuming the AI is smart enough to perfectly understand what you want even with incomplete knowledge but is stupid enough to need your help even though it's effectively capable of doing everything by itself.
It's perfectly possible to have an automaton that's good at predicting needs and inferring outcomes without assuming that it can set independent goals of its own without being prompted.
One is driven by external stimuli in a deterministic way, which makes it a computer like any other.
The other is internally directed - which edges into questions of sentience, free-will, and independent mentation, and is unknown territory for CS.
Siri and Viv are already heading in the former direction. Siri has some nice canned responses which can create a fun illusion of a personality, but not many developers - and even fewer NLP designers - are going to tell you Siri is independently goal seeking.
Human brains are also very inefficient at computation (per unit energy, time, mass, volume, or some combination of the four), unless it's a problem they were tuned for over a really long evolutionary time (e.g. facial recognition, bipedal locomotion, etc).
Sure, if you find a problem that a NN algorithm is really good at, then mapping the hardware to the NN algorithm will get an efficiency speedup, but that's because of the application-specific hardware, not because the vNA is a bad general computation model in terms of what we can physically instantiate.
NN != brain. Not by a long shot. We still have no clue how it operates, don't kid yourself. There are some processes happening in the brain that aren't supposed to happen, like the massive amount of ions passing through membranes, causing some to think there is some time-bending quantum effect making it happen. We simply have no clue.
You can say the same thing about our computers: they are very inefficient at tasks they were not designed to do. Actually they are very inefficient even at tasks they were designed to do, but since they do them very fast, we've been ignoring it (at least until Moore's law started to slow down). Just think about all layers of abstraction present in current computing platforms, and how much computation really happens when you type 2+2 in a Python console.
Also, there are a lot more problems a brain solves better than computers. Software NNs slowly learn to solve more and more of those, primarily by better copying how the brain does it.
So, in summary, vNA might not be a "bad" general computational model, but it surely is very inefficient compared to the human brain when it comes to "computing" the vast majority of tasks.
>You can say the same thing about our computers: they are very inefficient at tasks they were not designed to do.
Hold on -- we need to be consistent about what part of them was designed for what purpose. In the context of the discussion, it's the hardware. Computer hardware is only "designed" to be able to maintain a small state while performing a huge number of Boolean circuits (from some large set of possible circuits) per second, writing to external stores.
They were not designed to e.g. make graphs, host discussion forums, etc., and yet do that very well, because we can apply software that converts those problems into "do these Boolean circuits/writes really fast". Considering how handily they beat the alternatives at those tasks, I think it would be fair to say that they're good at things they're not designed for!
It's just that there remain a few problems that aren't tractable to this approach so far; but focusing on those problems is a misleading picture.
>Actually they are very inefficient even at tasks they were designed to do, but since they do them very fast, we've been ignoring it (at least until Moore's law started to slow down)
As above, they were only designed to do those circuits and they're good at that.
I have no idea what you mean by "perform Boolean circuits" in the context of this discussion.
Bottom line: vNA based computers are inefficient, regardless of what they are trying to compute. But they do most of the tasks fast enough for us not to care. When it's not fast enough, people build ASICs.
A Boolean circuit is a circuit that implements a Boolean expression (given input signals set to 1/0, output the 1/0 value corresponding to the result of substituting them into that expression.)
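To make that concrete, here's a toy illustration (in Haskell; the names are mine, not anything from the thread): a one-bit full adder written purely as Boolean expressions, the kind of building block a CPU fixes in silicon and evaluates billions of times per second.

-- A one-bit full adder as a Boolean expression: given input bits a, b
-- and a carry-in c, produce the sum bit and the carry-out bit.
fullAdder :: Bool -> Bool -> Bool -> (Bool, Bool)
fullAdder a b c = (s, carryOut)
  where
    xor p q  = p /= q
    s        = (a `xor` b) `xor` c
    carryOut = (a && b) || (c && (a `xor` b))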
Computers at their core are machines designed to implement many such circuits, of many varieties, per second, and send their signals to output devices or their own internal state. When CPU makers design them, they are optimizing for the efficiency (with respect to cost, space, mass, energy consumption, et/vel cetera) with which they perform that task. They do not specifically optimize them for hosting forum discussions; instead, the software or OS designer chooses instructions for the CPU that allow it to accomplish that efficiently.
All of the above was in response to your claim that computers are only (somewhat) efficient at what they're designed to do. But they're designed to implement boolean circuits (and write to peripherals), they're not designed to host discussion forums, et cetera, and yet they're still good at the latter.
What do you disagree with there?
>Bottom line: vNA based computers are inefficient, regardless of what they are trying to compute
If modern computers are "inefficient" at summing numbers (or hosting discussion forums, for that matter), then may I forever be dredged in Soviet bureaucracy!
Computers are circuits. Those circuits don't change every second. They never change, unless there's a programmable logic embedded somewhere. Perhaps you're confusing "circuits" and "operations", e.g. "adder" and "addition"?
Of course CPU designers optimize those circuits for speed/power/etc. of the instructions. No one says the implementation of those instructions is inefficient. It's the way of performing high level tasks, dictated by available instructions in the ISA, which in turn is dictated by vNA, that is inefficient.
Why would we care about how efficient the instructions are, if the tasks we are interested in, when performed with those instructions, are inefficient?
I'm not sure why you're belaboring the distinction between circuits and ops, if you understood and agreed with the (sub)point I was making there.
Do you disagree with the broader point I was making, that "computers" -- in the sense of computer hardware -- aren't specifically designed for many tasks for which they're used? We can agree that computers aren't designed to host discussion forums, right? And we can agree that the hardware design is focused on efficiency of fast execution of the cir-- sorry, operations, not the discussion forums per se, right? So what were you disputing there?
If you agree that computer hardware architecture isn't optimized for hosting discussion forums, then it sounds like you agree computers aren't designed for that. And if you agree that they're nonetheless good at hosting forums, then it sounds like you're walking back your earlier claim.
>It's the way of performing high level tasks, dictated by available instructions in the ISA, which in turn is dictated by vNA, that is inefficient.
Again, where are you getting this? In what relevant sense is a vNA computer "inefficient" at sums or hosting discussion forums? It's not only efficient in absolute terms, but more efficient than any other known method. Again, if that's "inefficiency", I don't want to be efficient!
What would be the non-vNA architecture that would do it more efficiently? Why would a neural net be better at hosting discussions? Again, you can definitely hammer them into the shape of one application-specific computation of one size; but having to do that for every application you'd ever want to do does not sound like efficient hardware.
Being good at something and being efficient at it are not, in general, the same thing.
The earliest computers were designed for calculating missile trajectories, and they were good at that task. Not efficient (they required a large room, cost millions, and consumed megawatts of power), but fast enough to produce results faster than any other known method.
Later, this design, which was inefficient even for the original task, was used for other tasks, and it was of course even more inefficient there. Modern computers have become more efficient at the original task (numeric calculations), but they are still horribly inefficient for hosting forums, because of the limits of vNA. If you want something that is more efficient, the answer is easy - build a specialized processor, designed to host forums. It won't be able to do anything else, but it will be orders of magnitude more efficient at hosting forums. You have to choose between having one computer that is inefficient at many different tasks, or many computers that are each efficient at a single task.
I don't see where I ever disputed that specialized hardware is better than general purpose hardware at the task for which it is specialized; in fact, I specifically made note of that. Is there a reason you think it refutes a part of the thesis I was presenting?
The claim under contention is that there's some root deficiency of the vNA at computing in general, that perhaps could be surpassed by some ingenious FP-inspired model. If the only "limit" or "inefficiency" of vNA is that it doesn't achieve ASIC-level efficiency on every task, that's not much of a criticism. Even applied "horribly" to hosting discussion forums, the vNA is light years beyond all other general hardware.
I was under the assumption that a more substantial criticism of vNA was being offered. But the above criticism makes no sense unless it were somehow economical to rearchitect a computer for every distinct problem you plan to work on.
I was disputing your claim that brains are less efficient than vNA based computers. My point is that for the vast majority of tasks we face on a daily basis, our brains are vastly more efficient than our computers. Note that a brain is general purpose, non-vNA-based hardware.
My conclusion: we should try to build a brain-like hardware out of silicon, as soon as we learn enough about how brains work.
> Just think about all layers of abstraction present in current computing platforms, and how much computation really happens when you type 2+2 in a Python console.
That's not a limitation of von Neumann architecture. That's a consequence of us using the vast and cheap computational resources available to us to make things easy.
Last I saw, a custom built computer is many many orders of magnitude worse than the brain at playing Go per unit energy. And it's not like we evolved to play Go.
Fair enough, my claim was overbroad. I would still say it's true for most computation tasks: a vNA computer on typical hardware is still going to be more efficient as a general-use computer than something on a neural architecture.
Humans do currently have an advantage on certain tasks that require detection of overall global patterns and symmetries. (But even then, only a subset of those tasks for which someone hasn't been able to specify a good algorithm for finding that structure.)
Even so, the core point remains: it's not that vNA is a bad architecture, nor that neural computers are more efficient in general; and the latter are probably much harder to reason about.
I don't think von Neumann computing will go away, but I agree that analog computing will massively boost the performance of neural nets in the near future. There is a really good, in-depth talk about the subject from Microsoft Research a couple of years ago - still extremely relevant:
"Will Analog Computing and Neural Nets Power Innovation after Moore's Law?"
Yeah, I kind of get the impression that all these alternatives are about how to structure programs at a high level; none of them seem to avoid the need to translate it to run on hardware with a von Neumann architecture, nor remove any of the related frictions, nor identify a superior hardware architecture that would avoid the problems.
> GPUs are much closer to von Neumann machines than to anything people typically deriding the latter have in mind
The GPU architecture was developed to enable the application of many von Neumann machines to a particular type of "mildly embarrassing" parallelism. I'm not so sure that we couldn't greatly benefit from going a little further in this direction.
Right now, I see a mainstream architecture that does fairly well with read parallelism, with a need for more efficient two-way communication between separate threads. What if we had better hardware support for the things we're trying to do with the Disruptor that came out of HFT? This would also benefit multiplayer games.
Not only are FPGAs as far from the ideal of "computation as reduction" expressed in the paper as they could possibly be, but they're the hardest machine to fit into this sort of a paradigm.
The typical CPU will run functional code fairly efficiently if enough work is put into the compiler (to some functional programmers this sounds bad because hardware should make writing functional language compilers easy. People who know about hardware draw a different conclusion - that hardware is just fine, as Ken Thompson said, "the PDP-11 is a fine Lisp machine, it's not a language special enough to warrant a specialized machine.")
The typical accelerator is a harder target for a general-purpose functional programming language (it's a poor target for a general-purpose imperative language to begin with.)
The FPGA, with its explicit management of every imaginable resource, is the worst target for a general-purpose functional programming language.
In general, the farther away a commercially significant machine is from the classical von Neumann architecture, the worse a target it is for functional languages. This never prevented FP aficionados from talking about "the end of Moore's law bringing about the long-awaited demise of the von Neumann machine" - a "demise" that largely doesn't happen, and where it does, it does not mean what they think it means.
> to some functional programmers this sounds bad because hardware should make writing functional language compilers easy. People who know about hardware draw a different conclusion - that hardware is just fine, as Ken Thompson said, "the PDP-11 is a fine Lisp machine, it's not a language special enough to warrant a specialized machine."
Wow, really? C got a lot of things wrong, but the one thing that it got right is its abstract model of the machine. As a functional programmer, rather than specialized hardware, what I want is functional languages designed to target the abstract C machine. (Though maybe I am in a minority.)
> The FPGA, with its explicit management of every imaginable resource, is the worst target for a general-purpose functional programming language.
Explicit resource management is compatible with functional programming, but it requires a type system capable of understanding the ephemeral nature of resources, à la Rust.
> The FPGA, with its explicit management of every imaginable resource, is the worst target for a general-purpose functional programming language.
That's a great way to put it...
but perhaps there's a more suitable FPGA cell for functional programming (than block lookup tables with state)? Maybe something with a local stack, recursion assistance, an iterator?
Also, a good deal of the FPGA resource management is focused on optimization... What if the optimization-at-all-costs constraint were relaxed somewhat, or supplanted with meta-information to indicate the completion of a group of cells to allow a message to propagate?
With relaxed constraints, partial, on-the-fly, and self-modifications could become regular features.
> Maybe something with a local stack, recursion assistance, an iterator? [...] supplanted with meta-information to indicate the completion of a group of cells to allow a message to propagate?
What you're describing sounds a lot more like many CPU cores connected by a network than like an FPGA. The point of an FPGA is that it gives you very precise and flexible control over things like timing and interconnects. Without that, FPGAs are actually pretty slow and you'd better use a CPU.
> With relaxed constraints, partial, on-the-fly, and self-modifications could become regular features.
There's a reason why self-modifying software is restricted to niche use cases: it's hard to reason about. Dynamic partial reconfiguration is slowly becoming a thing in FPGA design, but it's typically used to work around resource limitations.
To put it simply, FPGA design (as practiced in nearly 100% of commercial use cases, i.e. with VHDL or Verilog) is imperative programming on steroids – even a sequence of actions requires an explicit mutable state variable. If you need to access external memory (e.g., DDR3 RAM), you have to communicate with a memory controller module. So not only is there no garbage collection, there's not even a malloc! And no recursion.
From time to time you'll read statements along the lines of "functional programming languages map really well to FPGA design", allegedly because the logic functions between register stages behave like declarative constructs. Such claims are excellent bullshit detectors, indicating that the person uttering them doesn't have the slightest clue what they're talking about. It's on par with believing that expressions like "a + b * c" make C a functional language.
It's different than what you describe because the language is explicitly aware of the fact that it's running on an FPGA. However, it's seemingly pretty declarative and functional and has an FPGA as a possible target.
First impression, could be misinformed, but I'll bite.
Since a hardware circuit is a recursive function that takes a state of the hardware and produces a new state, starting from an initial state (and so is the universe where the initial state was supplied by the Big Bang), you can of course express it straightforwardly as a recursive function, which I think is what they sometimes do:
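(The snippet being discussed isn't reproduced here; the following is a hypothetical reconstruction in the same spirit - the type and names are mine, not the original code.)

type State = ([Bool], Int)          -- (leds, mode)

-- One clock tick: the old state (leds, mode) comes in, the new state
-- (leds', mode') goes out.
tick :: State -> State
tick (leds, mode) = (leds', mode')
  where
    mode' = (mode + 1) `mod` 2
    leds' = if mode == 0 then map not leds else leds

-- The circuit is this function unfolded from an initial state; the
-- recursion plays the role of the clock.
run :: State -> [State]
run s = s : run (tick s)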
This is VHDL/Verilog spelled in Haskell (real hardware description languages will not make you spell the old state - leds,mode... - and the new state - leds',mode'... explicitly, which gets annoying for 20 state variables. Of course this could be taken care of by a preprocessor or a macro system, my point isn't that it's verbose but that it's VHDL spelled in Haskell, a bit awkwardly and maybe you can do it less awkwardly but the programming model is the same so calling it Haskell is cheating, kinda.)
This is definitely not what TFA is about - you don't represent the blinking leds as graph reduction, rather, you generate VHDL from VHDL-like Haskell.
This is not to say they aren't doing cool things in this project. If indeed they compile this:
fir coeffs x = dotp coeffs (window x)
  where
    dotp as bs = sum (zipWith (*) as bs)
...to efficient hardware description code, that's really cool. In general, I think that a sufficiently restricted functional language can compile very nicely to FPGA/hardware (and to DSPs and GPUs etc. etc.) I'm just saying that a general-purpose functional language is better targeted at a CPU, and that a functional language targeting something else is a DSL of sorts.
To me, the inability of these "haskell to hardware" systems to handle general recursion shows that they are fundamentally missing the most important part of the mapping from functional computation to combinational logic plus registers. They don't easily permit the "unrolling" of recursively-written code, so your recursive programs must squirrel away their state between every iteration, even when they could potentially be performed in parallel. In trying to sum a list in parallel in hardware, how do you know how many per-element operations to do at once? This determines the number of gates you'll need, and the buffer size for storing the rest of the list until those gates are free.
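To make the problem concrete, here's a plain-Haskell sketch of my own (not code from any of these systems) of the kind of recursion in question:

-- Pairwise reduction. For an input of unknown length this is just a
-- recursive program: each pass has to stash the intermediate list
-- somewhere before the next pass can run. But if the length is bounded
-- (say, known to be 8), the recursion unrolls completely into a
-- three-level balanced adder tree, which *is* a hardware shape.
pairSum :: Num a => [a] -> a
pairSum []  = 0
pairSum [x] = x
pairSum xs  = pairSum (pairs xs)
  where
    pairs (a:b:rest) = (a + b) : pairs rest
    pairs rest       = rest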
One way to get around this is to use termination proofs to bound the size of input data, which is exactly what's done in total functional programming. In total programs, no term is a bottom, so laziness vs. strictness is moot (both give the same answer for all programs with no exceptions or infinite loops in either evaluation strategy). Which is what you would expect since you're modelling hardware which has a clock and will clock out results into its output pins each cycle anyhow.
How can we bridge the mapping? Well, you could write a totality checker which searches for a recursion principle which is well-founded, and use this to bound data/iteration sizes and determine the shape of necessary hardware. Or you could build a theorem-proving language whose constructs enable extraction of these sizing bounds, and have humans write the proof with application of tactics.
What I really think we should be doing is writing specifications (aka, type signatures, per Curry-Howard), which do not themselves impose a particular mode of execution of the program which fulfills the specification, but only completely specify its results (and effects, if in an effectful language), and then do (constrained, by the desirable properties of the resulting programs/hardware designs) searches for proofs of those theorems.
I'm a Haskell fan, so I'll bite: what do you use to program your FPGAs? VHDL or Verilog? Or do you have some other higher level language? I have not found VHDL or Verilog easy to use at all.
In addition to what yosefk has written: I really don't see the advantage of trying to fit the round peg Haskell through the square hole FPGA design – it even seems to add friction (see the bundle/unbundle stuff in the examples).
Besides, if you thought that getting C programmers to use a pure, functional language like Haskell is difficult, have fun convincing an FPGA designer (who typically comes from an electrical engineering background, for historical reasons) to do this. Their UART example [1] uses monads, monad transformers, and lenses...
"The thing most far away from a von Neumann machine that is commercially significant is the FPGA and it is nothing like what people deriding the von Neumann machines have in mind."
Are you referring to Reduceron-architecture FPGAs here?
Now that Moore's Law has effectively ended on a single chip (ended around 2007), FP is front-and-center again, but mostly because of the emphasis on immutability, which is a prerequisite for parallelization. Moore's Law has been replaced with Amdahl's Law.
> Now that Moore's Law has effectively ended on a single chip (ended around 2007), FP is front-and-center again
Ah yes, the decades-old claim that FP will make parallel programming much simpler. I wonder when that notion will go the way of "Java will be faster than C because of the JIT compiler" or "Compilers become better and better, so dynamic scripting languages will be quite fast soon."
There are two main problems with this sentiment:
1) Immutability does not automatically result in parallelism. If your algorithm is sequential, there's nothing a functional, immutable language can do to help. Oh, and a lot of algorithms are hard or impossible to parallelize efficiently, and even if you can parallelize them in theory, you still have to work with limited latency and throughput to other processing elements.
2) Mutable arrays of compact, pointer-less data elements map incredibly well to transistor arrays (SRAM for caches) and transistor/capacitor arrays (DRAM for main memory).
> the emphasis on immutability, which is a prerequisite for parallelization
This is blatantly false. Basically all parallel algorithms in high-performance-computing make extensive use of mutable memory.
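To illustrate with a sketch of my own (assuming the vector package; the function is made up): even in a pure language, the performance-sensitive inner loops end up as flat, pointer-free arrays updated in place - exactly the layout that SRAM and DRAM reward.

import qualified Data.Vector.Unboxed.Mutable as MV

-- An in-place "multiply every element by 2" pass over a flat, unboxed
-- array of doubles: no immutable copies, no pointer chasing.
scaleInPlace :: MV.IOVector Double -> IO ()
scaleInPlace v = go 0
  where
    n = MV.length v
    go i
      | i >= n    = pure ()
      | otherwise = do
          x <- MV.read v i
          MV.write v i (2 * x)
          go (i + 1)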
While I agree with your main argument, your examples are a bit unfortunate because Java is faster than (or certainly as fast as) C already for a similar development effort (with one caveat[1]) and dynamic languages are quite fast (see V8). Today, projects like Truffle/Graal can run pretty much any dynamic language within 3-5x slower than C. So I'd say that those are two predictions that have come to pass. They do, however, require greater computational resources (RAM and energy) which makes them less suitable in some constrained environments, so whether you want to call that mission accomplished or not is up to you.
[1]: Today a simple Java program running on HotSpot compiles pretty much to the same machine instructions as a comparable C program, and a large Java program would likely be faster than a large C program given the same development effort. The major hindrance to Java performance these days is the simple lack of value types, which creates many unnecessary cache faults and has nothing to do with the JIT. And there's the GC, which normally trades better throughput for worse latency, but, again -- nothing to do with the JIT. BTW, value types are the top feature being worked on by the HotSpot team.
> Java is faster than (or certainly as fast as) C already for a similar development effort
Last I heard, this only applies if you write C in Java (i.e., no object composition through references, basic arrays instead of ArrayList, avoid GC wherever possible). I remember a project some years back where a git client (?) was to be written in Java, and even with these techniques they only got to within 2x C performance.
> for a similar development effort
Well, then that defeats the main reason for using Java, doesn't it? The gist of that claim was that you could write idiomatic Java (with GC, small objects, etc.) and the JIT would use the information gathered during runtime to optimize the program on-the-fly to be faster than C.
> Today, projects like Truffle/Graal can run pretty much any dynamic language within 3-5x slower than C.
Do you have a source for that? Not doubting you, genuinely curious.
The JIT can and does do all that, and I am talking about idiomatic Java (I'm speaking mainly from the experience of migrating a >5MLOC soft realtime defense application from C++ to Java). Similar performance can be achieved for much less effort (although C would always beat Java given significantly greater effort), and this effect is more pronounced the larger and more concurrent the application is. The cost -- for a large application -- is just significantly greater RAM consumption. For small/constrained applications there are other costs.
As I said, the main barrier (which is currently being addressed) is memory layout of arrays-of-structs, whose fixing requires additions to the language semantics (which are being done). Perhaps that particular issue was a failed promise, but it's mostly a result of hardware changes that occurred 10 years after Java's introduction.
> Well, then that defeats the main reason for using Java, doesn't it? The gist of that claim was that you could write idiomatic Java (with GC, small objects, etc.) and the JIT would use the information gathered during runtime to optimize the program on-the-fly to be faster than C.
You can put the same amount of effort in and the Java will perform better than the C. Or you can require the same level of performance and the Java will take less effort than the C (with the possible exception of very extreme performance cases).
That is simply not true, unless the C code is written by a C programmer inexperienced in writing high performance C code (a Java programmer, for example).
Java can't compete with C the moment you start creating/destroying objects. A fact that Java proponents either don't understand or carefully avoid talking about.
Just a few weeks ago, there was a question on /r/haskell that was basically "Hey, I wrote this code, and it is taking all my cores. Why is it doing so if it is sequential?" And, of course, the answer was that, no, his code was not sequential, he just failed to understand how the compiler parallelized it.
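For anyone who hasn't seen it, the shape of code that behaves that way is roughly this - a sketch using Control.Parallel.Strategies from the parallel package, with made-up function names, not the code from that thread:

import Control.Parallel.Strategies (parMap, rdeepseq)

-- Reads like ordinary sequential code, but parMap sparks one evaluation
-- per list element; built with -threaded and run with +RTS -N it will
-- happily use every core.
expensive :: Int -> Int
expensive n = sum [1 .. n * 100000]

main :: IO ()
main = print (sum (parMap rdeepseq expensive [1 .. 64]))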
My point being, until you've tried what we have today, and understood how powerful it is, you don't get to make a good point about it.
Anyway, yes, Amdahl's law is true. And no, it does not impose huge constraints on most practical problems, at least not for the next decade.
And by the way, yes, specialized hardware can be a great speed-up. And it's been designed with FP-like constraints since forever, because if programming parallel general purpose computers is hard, specializing hardware for parallel operation is orders of magnitude worse. Our high level FP languages aren't up to the task, that's true. But our high level imperative ones can't even theoretically get there.
What I should have said is that immutability allows for code that is both amenable to parallelization and easier to reason about by the people who write it.
Inertia, mostly. All the common tools are built around C, all the existing code is written in C, all the jobs to maintain and extend that code are all in C.
Like most "Why does everyone only ever use language X to do Y" questions, considerations of actual technical merit / best technical fit are a very minor factor in the equation.
> but can you show me any demo where a FP language outperforms C for parallel tasks?
Modulo rewriting your C so that it does exactly what the FP language (runtime) does, such examples are not difficult to come by. Stalin, Stalingrad and Richard Fateman would be good starting points.
Regarding rewriting your C to match what the FP compiles into (or executes on the fly): the point is that it's very hard to write such _correct_ C unaided. So you use the higher level constructs to write that machine code / assembly / C.
I'm not sure this is a worthwhile comparison. Because the idiom of parallel programming is the same as the idiom for sequential programming in FP. It's completely different (and more difficult to make correct) in imperative languages. Which means it will be harder to maintain and harder to evolve even though it might be slightly faster in certain situations.
Well, this discussion started when someone suggested the speed increases fueled by Moore's law held FP back. This is what I've been questioning. If it's not about speed, then what does it have to do with Moore's law?
"but can you show me any demo where a FP language outperforms C for parallel tasks?"
That's not a fair comparison. C proponents should ask, "Is there a demo where (alternative) outperforms C for (application domain) when C has (alternative)'s safety-checks enabled, too?" Memory and dataflow safety in C, using checks instead of exotic stuff, easily adds 20-500% overhead depending on the application. Whereas these safer or more functional languages are usually within 200%, with some Common Lisps and Schemes matching or outperforming C in a few benchmarks. Concurrent Haskell wouldn't be giving me a performance reduction: I'd be getting more capabilities in development and reliability with a performance sacrifice. One that goes down every year for at least one implementation of each language or style.
The question didn't even mention safety; maybe FP people are tying their own hands by insisting on safety features that users don't value. And if you're comparing N-way parallel FP code against serial C, maybe 500% overhead still leaves room for FP to win.
It's implied in the correctness argument. Is a speedup over C fine if it gives incorrect results? In that case, I'll demolish C myself by optimizing away any parts of the user program that don't map to the fastest instructions and the registers. :P
The elephant in the room for provable algorithms is the limits of real-number representation in any imaginable hardware (hello, irrational numbers!). You simply can't make complete proofs for most math-intense problems. Combinatorics should be fine, though.
Huh? I might have missed your point, but I'm talking about how C makes no effort toward correctness of results. The rest that do, to varying degrees, sacrifice some performance to get that. Can't speak on the proofs as it's not my specialty. As far as real numbers go, that's what analog computers do, with some models for making it more general purpose.
I was just going one step further, due to my experience in proving correctness of algorithms in functional languages. You probably had in mind the usual array safety checks and the control flow stuff for which aspects were invented; I raised the bar a bit by pointing out that there are some fundamental issues which even super functional purists can't overcome, so the overall talk about code safety is kinda funny ;-) You will still crash the Space Shuttle if you are running on Haskell, just in a different fashion than with C.
I've rarely seen what you're describing, although I know plenty of work remains to be done. So, could you give examples of code injections or mission-ending errors at the code level that Haskell toolchains or Isabelle/HOL couldn't handle?
Huh? My impression is that functional has been growing massively over the last 5-10 years. OCaml has been around for 20 years but it's only recently that you hear people talking about it. The amount of Haskell or Scala hitting HN goes up and up - and that's matched by the increase in jobs in these languages.
Lisp (first formulated in 1959) is still alive and well: clojure.org
New languages (such as Javascript, Ruby, Python, even newer iterations of Java) become popular to the extent that they marry Algol-like (i.e. C-like) syntax with functional features derived from 1960s-era Lisp in an appealing fashion.
Reading between the lines- it's not really something to celebrate, given it is happening because progress is slowing. I'd rather be chained to a flawed model and continue to enjoy exponential improvement for a long time.
A better example would be any modern SMP x86 machine. These use a QuickPath interconnect to connect distributed memory and CPUs, basically a cluster. Intel goes to great pains to give the illusion of a von Neumann machine, but it's a leaky abstraction.
Really? The only time I see anything close to a leak is when I look at the speed of various operations (and let's be honest, von Neumann's machines always had bottlenecks, as much as he tried to keep them out)
As soon as you run more than one thread, you'll have to worry about memory barriers, cache coherency, processor affinity etc. Even with one thread, modern IO and system calls are highly asynchronous. Not the sort of machine that e.g. C or C++ was originally envisioned for.
It already has, unfortunately it's not too popular. This is why Prolog gives most folks a case of headache. My moment of enlightenment while studying Prolog happened when I realized that I wasn't logical, but trying to solve the problem in Von Neumann style. I was humbled to see simple logic solutions in place of my overly complicated solutions. This has made me such a better developer, I keep my code as declarative as possible.
Oh, Prolog. I once wrote a hardware configuration in Prolog. (Dependency resolution - rules like "An A requires a B or a C".) The configuration part was fine. Trying to write a menu system in pure Prolog, though...
Most of the troubles with functional programming come when you actually have to do something. The imperative/functional boundary remains a problem.
(User-visible, generic) monads are absolutely terrible for expressing effects. That so many people have a hard time understanding them fully shows that it's a real problem of the most serious kind (aside from their inherent shortcomings). That just a vanishingly small portion of developers enjoys working with them is not an indication of a lack of a problem, but rather of its presence. That people in PL research are actively trying to find alternatives to monads (and you have people like Martin Odersky publicly saying they don't like monads) is yet another (though less powerful) indication.
That some pure languages like Haskell decided to treat mutation as a side-effect like IO (requiring monads) only exacerbates matters.
Backus himself admitted this, in an interview circa 1995:
> Doing a functional language and combining it with the ability to do real things is very hard. And everybody has had a problem with it.[0]
Or, as the editors nonsensically summarize it:
> FP [the language] is not suited to many mundane programming tasks, particularly those that consist primarily of fetching data, updating it, and returning it to some database. Tasks such as updating an account balance are naturally suited to the von Neumann paradigm.
This book is for a general audience, but that's a particularly bad example since ledger accounts are just about the oldest form of append-only database. (edit, Like old enough that Chaucer makes a very raunchy joke about "double entry" bookkeeping.)
[0] From chapter 1 of _Out of Their Minds_, by Shasha and Lazere.
Prolog is rather imperative. If it were declarative and pure, it wouldn't mandate a specific execution strategy, but rather leave it as an implementation detail - a user-adjustable one, even.
In an imperative language (including all functional ones), optimizing a program is rewriting it. In a declarative language, optimizing a program is adjusting the language implementation (e.g., making a smarter SQL query planner).
A better argument for declarative languages is the Haskell technique of decoupling the abstract syntax of a DSL (usually a free monad) from its possible interpreters (monad morphisms).
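A minimal sketch of that decoupling, assuming the free package (the DSL and all names here are made up for illustration):

{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free (Free (..), liftF)

-- The abstract syntax of a tiny DSL, as a functor.
data CmdF next = Say String next | Ask (String -> next)
  deriving Functor

type Cmd = Free CmdF

say :: String -> Cmd ()
say s = liftF (Say s ())

ask :: Cmd String
ask = liftF (Ask id)

-- A program written only against the abstract syntax...
greet :: Cmd ()
greet = do
  say "Name?"
  name <- ask
  say ("Hello, " ++ name)

-- ...and one of many possible interpreters. Another could log, replay
-- a pure test script, or compile the syntax to something else entirely.
runIO :: Cmd a -> IO a
runIO (Pure a)         = pure a
runIO (Free (Say s k)) = putStrLn s >> runIO k
runIO (Free (Ask k))   = getLine >>= runIO . k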
SQL is not Prolog though. SQL, at least the parts mapping to relational algebra, is not Turing-complete, Prolog is. I'm pretty sure Prolog not specifying an execution strategy would mean that your program terminates under one interpreter but not under another, not to mention their asymptotic complexity, which I think is not as bad with SQL.
In other words, it's not a lie to promise that you can specify a query over tables in a declarative way, but it's sort of a lie to promise that you can spell out predicates describing a solution and the machine will find a solution satisfying these predicates, because sometimes it can't. Specifying an execution strategy doesn't make it less of a lie but it lets you write a working program kinda looking as if it weren't a lie.
> I'm pretty sure Prolog not specifying an execution strategy would mean that your program terminates under one interpreter but not under another, not to mention their asymptotic complexity, which I think is not as bad with SQL.
Yup! But it's not as bad as it seems. In a pure logic language, changing the execution strategy can only change three things:
(0) Whether a query terminates, in which case it returns the entire solution set.
(1) If the query doesn't terminate, which subset of the solution set is returned.
(2) How fast it runs, of course.
Which is a lot better than the subtle things that can go wrong when optimizing imperative programs by hand. Selecting an execution strategy for a declarative program is no different from tuning a garbage collector: if you do it wrong, bad things can still happen, but they won't be as bad as if you manage things by hand and do it wrong.
> it's sort of a lie to promise that you can spell out predicates describing a solution and the machine will find a solution satisfying these predicates, because sometimes it can't.
Yup! Which leads (IMO) to the conclusion that general-purpose languages must always include some imperative (explicit control) component. Declarative languages really shine as DSLs, though.
SQL is turing complete, at least on Postgres.
Case statements provide control of flow and WITH Recursive CTEs provide looping/iteration/recursion.
Pure SQL functions can accomplish nearly anything (except dynamic execution, think macros) that procedural PL/pgSQL can; WITH statements provide variables and temp tables.
SQL is the most declarative language I'm aware of.
I have the feeling many schools try to convey the imperative, functional, logic progression scheme, but very few people will have either the love or the wit to follow. FP and Prolog flipped my mind around computing twice. The notion of tasks, stacks, processes, programs, functions, properties, sets, space, graph traversal, continuations, etc. all fit in a small framework; it's maddening.
Backus and von Neumann have a history, from their time at IBM.
When Backus first proposed the idea of Fortran to his boss, he first "had to overcome the persuasive arguments of von Neumann, who was then a consultant to IBM." In Backus' words:
> He didn't see programming as a big problem. I think one of his major objections is that you wouldn't know what you were getting with floating point calculations. You at least knew where trouble was with fixed point if there was trouble. But he wasn't sensitive to the issue of the cost of programming. He really felt that Fortran was a wasted effort.[0]
Backus and von Neumann had made innovations in the use of floating-point and fixed-point numbers by computers, respectively.
From what I can gather, von Neumann was an incredible genius who just continuously pushed new ideas. He introduced "scaling factors" (fixed point) to solve a scientific problem (in nuclear physics) and didn't worry about improving its "usability."
Backus, on the other hand, cared about the difficulty of programming. Both Fortran and his work on floating-point, as well as his later efforts in functional programming, manifested that.
After leaving computing altogether, Backus took to meditation and, as he calls it, "living."
[0] Quotes are from chapter 1 of Out of Their Minds: the Lives and Discoveries of 15 Great Computer Scientists, by Shasha and Lazere.
I think abandonment of the Von Neumann paradigm will be crucial for AI. For decades we have labored under the misguided assumption that the brain is isomorphic to the Von Neumann architecture, with "memory banks" and "processing units." This approach has failed.
Admittedly, Lisp was once the first choice for AI research, and it did not readily lead us into the promised land either. Perhaps a functional approach is not radical enough :)
Some of the most interesting applications of ML include recurrent neural nets that act as processors, augmented with memory and an attention mechanism (memory addressing). They even call them Neural Turing Machines. The fundamental difference is that NTMs learn from experience, and are not programmed top-down. So there is still a twist on the Von Neumann architecture.
> For decades we have labored under the misguided assumption that the brain is isomorphic to the Von Neumann architecture, with "memory banks" and "processing units."
Who has been working under this assumption for decades? It seems rather obvious to me that this is very far from reality, so I'm a bit surprised if the AI community thought the brain were a CPU for several decades.
He seems to be under several misconceptions... This one may just be a mutation of the widely accepted idea that we can in principle emulate a brain using our standard computer models, even if it'll take n more decades of hardware advances - the idea that the brain is discretely computable.
I guess one way to approach questions like this, that challenge our fundamentals of Doing Things, is to ask this:
Would an alien race be doing the same things? Using electricity, decimals, binary, registers, memory etc. the same way we do?
As opposed to say photonics or biochemicals-based computers, trinary or quarternary number systems, belt architectures and so on..
How much of our world has been shaped by simply the ORDER in which we discovered things?
If there is only one Right Way of doing things then all the intelligent civilizations out there will eventually use the exact same architectures and have the exact same technology and, personally, that would make for a very boring universe.
It's a bit silly, but that's exactly the way I tend to approach such questions, and my conclusion after musing on the subject for years is that we are hamstrung by the historical ordering of past breakthroughs. Further, it is very difficult to determine the extent to which our current ways of thinking are limiting us.
Would an alien race discover things in the same ORDER we did?
For most things, I do think the answer is a huge yes. Electronics before photonics is a serious candidate for a universal order. We got biochemical computers first, but we didn't get an easy way to improve them; again, it looks probable that this is universal.
About binary: yes, that's a theoretical best. If we stop using it some day, it will only be because our physical computers somehow suck at encoding it, or shine a lot on another specific base.
Here's a much smaller and less ambitious interpreter in Python as a hackable taste of the language: https://github.com/darius/parson/blob/master/eg_fp.py -- or rather another dialect, since everyone who implements FP makes up their own concrete syntax. Mine reads left to right instead of right to left, and leaves out the noisy compose operator (it's implicit by juxtaposition), making it read kind of like Forth or Joy.
I also wrote an optimizing compiler to Scheme from this dialect back around 1990, but it's lost in the sands of time.
I find functional languages great for modelling application logic, but when working with graphics cards, printers, web APIs, etc., procedural code is more intuitive. Maybe it's a limitation of my thinking, but I find an impure functional language (C# and F# are good examples) the best fit for most problems.
I would argue that this is largely because the systems you mention do not have the requisite (hardware or software) functional abstractions required to make them mesh nicely with functional or declarative control* code. Backus here is writing from a hypothetical world in which that is not a problem.
---
* a dangerous word in and of itself in such a conversation, because of its imperative significance.
Wouldn't that just place the burden of providing those abstractions on the hardware implementer? In essence, are we not just pushing the problem down the stack?
No. I always wanted to say that to the title of this paper ;)
The complaint about von Neumann programming languages, as the paper describes it, is that they can only be extended as a framework. Today's imperative languages are much more powerful: either they are untyped (implicitly generic), or they include generics, or at least have objects with polymorphism.
The paper puts forward applicative (functional) programming as a start, because it can be extended much better. The main point: functions can be composed, because you don't have to name, and thus don't have to give types to, the arguments.
To make the system complete (useful), he adds an applicative state transition system, i.e. mutable state. Think IO monad.
I have a different take. You want systems where everything can be abstracted over. That is, put inside of a function (abstraction). And whatever the system does is truly hidden, unobservable to the outside. That might sound like fp, but it has problems when you want to hide state, or when the abstraction is about input/output. Example:
function downloadAndVerify(url):
    data = http.get(url)
    hash = http.get(url + ".md5")
    return data, md5(data) == hash
In my opinion, the above expresses what I want. If I need async, my callers need async. If I need IO monad, my callers need IO monad. Neither can hide that. If it blocks, it can be solved the same way as CPU intensive operations, do it in the background. So processes need to be easy.
You want a language where any async api can be converted into a blocking api, and any blocking api into an async api. (Using processes, most likely.)
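A minimal sketch of that blocking/async conversion in Haskell, using lightweight threads (asAsync is a made-up helper; error handling omitted - a real version would reach for the async package):

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Run any blocking action on its own lightweight thread and hand back
-- a "wait for the result" action; running that action is the
-- conversion back from async to blocking.
asAsync :: IO a -> IO (IO a)
asAsync blocking = do
  box <- newEmptyMVar
  _ <- forkIO (blocking >>= putMVar box)
  pure (takeMVar box)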
When I saw this, I wondered whether this was the same Von Neumann who gave his name to Von Neumann probes, which are hypothetical self-replicating spacecraft that can fill the galaxy within a relatively short timespan.
It turns out that this is the same von Neumann, who appeared to have been very influential in the early days of computing. In reading about him, I was impressed with all the work he'd done. The people who came up with the concept of Von Neumann probes named them after von Neumann because of his theoretical work regarding how computing machines can self-replicate.
Fermi also used the concept of Von Neumann probes and the lack of alien Von Neumann spacecraft as an argument that aliens (or at least interstellar spacefaring aliens) don't exist.
Von Neumann was often considered by colleagues to be the smartest person they'd ever met. There's an old story that goes something like this:
Von Neumann was asked a mathematical puzzle: two bicyclists start 20 miles apart and head towards each other at a steady speed of 10 mph each. At the same time, a fly that travels at a steady 15 mph starts from the front wheel of the southbound bicycle and heads to the front wheel of the northbound one, then immediately turns around and heads to the wheel of the southbound one, and continues until the bicycles meet and he is crushed. What total distance did the fly travel?
The trick is that it seems you'd need to compute an infinite series, figuring out the distance traveled on the first leg, then the second, then the third, and sum them all. In fact, you can just realize that two bicycles starting 20 miles apart, each moving at 10 miles an hour, will meet in 1 hour; if the fly's traveling at 15 mph constantly during that time, it travels 15 miles.
Von Neumann provided the solution immediately. The man who posed it to him said "Oh, you must have heard the trick already!" "What trick?" said von Neumann. "I just summed the geometric series!"
The way I solved it was to say that the bicycles meet in the middle because their speeds are equal. They each travel 20/2 = 10 miles at v = 10 mph, so they meet in 1 hour. In that 1 hour, the fly travels 15 miles, no matter how often it changes direction.
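For anyone who wants to see the "hard way" actually converge, here is a quick numerical check of my own (not from the thread) that summing the fly's back-and-forth legs approaches the same 15 miles:

    def fly_distance(gap=20.0, bike=10.0, fly=15.0, legs=50):
        # bikes start at positions 0 and gap; the fly starts on the far bike
        a, b, pos, total = 0.0, gap, gap, 0.0
        heading_to_a = True
        for _ in range(legs):
            target = a if heading_to_a else b
            # fly and the bike it is heading for approach at (fly + bike) mph
            t = abs(pos - target) / (fly + bike)
            total += fly * t
            a, b = a + bike * t, b - bike * t
            pos = a if heading_to_a else b
            heading_to_a = not heading_to_a
        return total

    print(fly_distance())  # ~15.0; the leg lengths form a geometric series with ratio 1/5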
Fermi and von Neumann overlapped. They collaborated on problems of Taylor instabilities and they wrote a report. When Fermi went back to Chicago after that work he called in his very close collaborator, namely Herbert Anderson, a young Ph.D. student at Columbia, a collaboration that began from Fermi's very first days at Columbia and lasted up until the very last moment. Herb was an experimental physicist. (If you want to know about Fermi in great detail, you would do well to interview Herbert Anderson.) But, at any rate, when Fermi got back he called in Herb Anderson to his office and he said, "You know, Herb, how much faster I am in thinking than you are. That is how much faster von Neumann is compared to me."
There are a number of stories about von Neumann, but almost all of them are apocryphal. I do remember his former professors remarking that if they put unsolved problems on the chalkboard, he'd have them solved by the end of class.
Legend has it that when Gödel presented his incompleteness theorem, Von Neumann understood that Hilbert's program was destroyed and immediately threw in the towel.
Claude Shannon, arguably one of the greatest electronic engineers of the 20th century (which would be "ever", I guess), even went to Von Neumann for advice. The "novelty" of a signal was thus named "entropy" (the inverse thereof) because Von Neumann understood how information theory fitted into the broader picture of theoretical physics. Entropy = amount of disorder in a system = 1 / novelty, or something like that. Point is, Von Neumann didn't just understand Shannon's ground-breaking new theory right away, but also understood how it tessellated with all the other ground-breaking theories of the day.
Von Neumann was one of the greatest mathematicians ever. Theoretical physics, computer science, game theory, statistics, he was a master at everything.
Alternatively, it could be the case that introducing even minor bugs into the programming of your first Von Neumann probe can be a civilization-ending event.
So maybe we haven't seen any probes because they first disassembled their creators for building resources, then became too busy disassembling each other to get very far.
Wikipedia argues [0] that we actually use a "Modified Harvard Architecture" rather than a "Von Neumann Architecture" today, since we have separate data and instruction caches. Also, various embedded chips really do separate instructions and data.
Not all of von Neumann's architectures were von Neumann architectures [1].
That is to say, not all of the architectures he invented [2] ended up being given the name "von Neumann Architecture". His 29-state cellular automaton rule [3] for implementing a universal constructor [2] was "non von Neumann" in the sense of [1].
It's just so amazing what he accomplished with his mind, a pencil, and a piece of paper, without the use of computers, which he also helped invent.
Kanerva has discussed alternative computing styles as well. In particular, he asks why it is hard to encode intelligence in such a rigid setup.
> The disparity in architecture between brains and computers is matched by disparity in performance. Notably, computers excel in routine tasks that we - our brains - accomplish with effort, such as calculation, whereas they are yet to be programmed for universal human traits such as flexible learning, language use, and understanding.
He focuses on the representation problem and develops a more flexible representation of concepts with what he called hyperdimensional computing[1].
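For readers unfamiliar with the term, here is a toy sketch of the basic idea (my own illustration, not Kanerva's code): concepts become random high-dimensional +/-1 vectors, associated by elementwise multiplication (binding), superposed by addition (bundling), and compared by cosine similarity.

    import numpy as np

    rng = np.random.default_rng(0)
    D = 10_000  # robustness comes from very high dimensionality

    def hv():
        # a fresh random hypervector
        return rng.choice([-1, 1], size=D)

    def bind(a, b):
        # associate two concepts; binding is its own inverse (a * a == ones)
        return a * b

    def bundle(*vs):
        # superpose several concepts into one vector
        return np.sign(np.sum(vs, axis=0))

    def sim(a, b):
        # cosine similarity
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # a tiny role-filler record: {capital: washington, currency: dollar}
    capital, currency = hv(), hv()
    washington, dollar = hv(), hv()
    record = bundle(bind(capital, washington), bind(currency, dollar))

    # unbinding the 'capital' role recovers something close to 'washington'
    print(sim(bind(record, capital), washington))  # well above chance (~0.7)
    print(sim(bind(record, capital), dollar))      # near 0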
You can always come up with really elegant and expressive formal systems to model programming/computing, and that's the mathematics side of things, BUT people often ignore the physical/engineering side. We need to engineer actual physical systems that can implement those so-called elegant formal systems of mathematics. This is where the problem is: engineering a physical system is a very complex problem with lots of constraints, and it is not as simple as coming up with a bunch of symbols and the rules of a formal system.
Physics is about reality and mathematics is about representations; you can create representations from thin air, but you can't do that with reality :)
"Out of the Tarpit" (http://shaffner.us/cs/papers/tarpit.pdf) is a paper with a similar bent arguing that we can eliminate complexity with a functional/relational approach to designing software.
Reminds me that we put so much work into making things "digital" that we forget how many layers of abstraction there are. All that work to add numbers, when superposition is instantaneous.
It's not just the tyranny of memory, but tyranny of the clock! It will be interesting when we get rid of both...
It's not about liberating programming, it's about liberating programmers by enabling them with more powerful abstractions. And pure lazy statically-typed FP is one hell of a powerful abstraction.
Note: in the time it took me to write this ramble (plus the code review I had to do in between writing and posting), others like catnaroek have addressed similar themes.
Speculating wildly here, but part of why I think there is continued (albeit unconscious) resistance to other paradigms is that the procedural/imperative paradigm seems to be the most `hackable' of the bunch.
I'm not using `hackable' in a positive sense here. Procedural code is bare-level, almost the machine code of programming models. "First do this thing, then do this thing, then do this thing", etc etc. There's a relatively small implicit underlying `physics'[1] in a procedural system. In some sense, every procedural program involves the creation of some higher-order logic (a model) as a part of its construction, in order to both define the state machine, and dictate the manner in which state flows through it.
The trick, of course, is that every model has a bias, encodes some manner of thinking. An aspect of that is that it is easier to represent some things and harder to represent others[2]. In a procedural program, when the model you've (consciously or not) developed fails, it's trivial to 'fall out' of the model and revert to writing base-level imperative code, hacking your way to a solution.
Functional and other Declarative paradigms, on the other hand, have a stronger, more complex `physics' to them. In the case of declarative, for example, a user is asked only to declare the state machine; the execution of it is left to the `physics' of the system. This can mean that a well-written program in a functional or declarative language appears to be simpler, more elegant. In reality, this is largely because a large set of assumptions that would need to be explicitly declared in a procedural language have been encoded directly into the language itself in the functional/declarative case[3].
This means that when you're operating within the paradigm that the functional/declarative language affords, everything is smooth, beautiful, elegant, and verifiable according to the physics of the system. However, it's much harder to 'fall out' of the assumed model, because the granularity required to do so isn't at the native resolution of the language (see the sketch after the footnotes).
---
[1] By physics, I mean a set of assumptions one needs to make about the behavior of the system that are not directly encoded by the user's input
[2] Think of how easy it is to, say, pick up a piece of steak with a fork, and how hard it is to scoop soup with the same implement. Different models have different affordances; they suit different problems, or different solutions to the same problem.
[3] As the Church-Turing thesis reminds us, these systems -are- equivalent in power, so those semantics still have to be somewhere. To paraphrase Hofstadter, there are two kinds of music-playing systems (with a continuum in between). On one end, a system with a different record for each song and a single record player capable of playing all records. On the other, a system with a single record and a different record player for each song. The difference is how much semantic weight you put on the encoding and how much you put on the decoding (but they still need to sum to 1).
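To make the "falling out of the model" point concrete, here is a small example of my own (not from the comment above): the imperative loop has an obvious seam for bolting on an ad-hoc escape hatch, while the declarative expression does not.

    def total_imperative(rows, limit):
        # easy to 'fall out' of the model: early exits, logging, retries --
        # anything can be spliced into the middle of the computation
        total = 0
        for r in rows:
            total += r
            if total > limit:
                print("bailing early at", r)
                return total
        return total

    def total_declarative(rows):
        # elegant while the model fits; "stop and log halfway through" has no
        # natural place to live in this expression
        return sum(r for r in rows)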
Thank you for saying this. I don't think anybody else noticed. In fact, people in the J community occasionally get confused about this as well.
Most people shit their pants when confronted with something like FP (Backus's language demonstrating these ideas): I know I did when I first encountered J. It's much more confusing than ML or Lisp. I'm still not sure it's the right thing, but for parallel problems it probably is. Kids don't seem to have a problem with it, though, so it's probably just old habits.
No. Other styles include logic (e.g. Prolog), (forward chaining) production systems (e.g. CLIPS), dataflow, and (gated) quantum computing. There's some overlap between the different styles.
There are a bunch of other paradigms; e.g. declarative programming was considered the thing some decades ago. But in general, nowadays most domains are best handled by the tools and approaches we have from direct imperative simplicity, the OOP camp, or the functional mindset.
I think declarative is still preferred, if you can make it work. I think the real issue is coming up with a declarative language that is general purpose and efficient enough.
For specific domains, however, declarative languages have done quite well, e.g. SQL, regexps, XPath, etc.
I believe that what we're seeing is the declarative paradigm used in subsystems / domain-specific languages that do a particular core task accurately and efficiently, surrounded by a general-purpose "scripting environment" in which you handle the messy interfaces with the surrounding world and users.
Besides the obvious SQL example, current ML systems such as TensorFlow are a great illustration: you declare a particular computation graph and then the system "just executes it" over multiple GPUs.
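A minimal sketch of that pattern, using SQL via Python's standard library (my example, not the commenter's): the query is the declarative core, and the surrounding script does the messy imperative work.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [("alice", 30.0), ("bob", 12.5), ("alice", 7.5)])

    # the declarative part: say *what* you want, not how to loop over the rows
    query = """
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        ORDER BY total DESC
    """

    # the imperative "scripting" part: formatting, printing, talking to the world
    for customer, total in conn.execute(query):
        print("%s: %.2f" % (customer, total))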
Functional is one of the many possible idioms of declarative programming.
It's just one that's completely focused on an idealization of "software", instead of any more concrete problem. That makes it more general than most. And the functional idealization is more similar to what real computers run than other representations (some people already mentioned Prolog). Those give FP some advantages over the others, at least for now.
As someone who studied SICP in the 80s and continued on to ML, etc., I can say that even though functional programming is far from dominant now, it's way trendier than back then, so much so that there's just no question.
I graduated in the mid-90s. I remember using Lisp and ML in multiple classes. We also used Prolog, which is not an FP language but is certainly non-traditional.
Disagree. I've found over the years that I pick up new programming languages rapidly. I attribute this to already having a broad base of experience in different paradigms and techniques such that at this point everything is just a variation on something I've worked with before.
But can something that everybody uses become a competitive advantage? Intuitively that doesn't seem true to me, although it would be an interesting dynamic if it was.
So it seems like you'd need, as a middle ground, a small collection of successful early adopters to prove its value before there's any concentration of advantage at all.
I assume you're putting "competitive advantage" in reference to "gets me hired" instead of strictly "gets the job done well", in which case, yeah, someone hiring someone else for the way they get a job done rather than their ability to do so would probably get you stuck. You could fake your way in.
Otherwise you might be referring to "no one will get my code", which is also a legitimate concern.
I think you are probably better off being a top performer in a very common field. There is more opportunity that way. If you are lucky you can also be an early adopter in a growing field and ride the wave.
For example, Erlang and Haskell are cool languages and probably a technically better choice but I am not sure being an Erlang developer is a good choice for career development. There are not too many options for working in that field.
This strikes me as completely the wrong way to make decisions. If you're taking it as a given that you're a top performer in a common field, then yeah. You might as well assume you're a billionaire and argue how many doors that opens for you.
But you're not a top performer in a common field until you've committed years of effort to not being a top performer in that field. And you can hardly argue that the world is a worse place because someone wanted to be the world's best yodeler instead of the best truck driver.
What made me think functional programming is worth being aware of is that good testing practices head, pretty quickly, towards things that are done more easily in functional programming. Programming in functions guarantees a compartmentalization of state and a determinism of results (as opposed to non-determinism).
Procedural programs and Object-Oriented programs need to postulate concepts like design-by-contract, assertions with explicit precondition/post-condition checks, and unit testing suites as a form of insurance against non-determinism and poor compartmentalization. So it's possible to get this kind of control but only after writing a lot more code.
Going completely stateless is probably foolish though.
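A tiny contrast of my own to illustrate the point above: the pure function is pinned down by a single assert, while the stateful version needs its state constructed, sequenced, and guarded by explicit pre/post-condition checks.

    def total_pure(prices):
        # same input, same output: nothing to set up, nothing to tear down
        return sum(prices)

    class Cart:
        def __init__(self):
            self.items = []

        def add(self, price):
            assert price >= 0, "precondition: non-negative price"
            self.items.append(price)

        def total(self):
            return sum(self.items)

    # pure: one line of test
    assert total_pure([1, 2, 3]) == 6

    # stateful: the test has to build up and sequence the hidden state itself
    cart = Cart()
    cart.add(1)
    cart.add(2)
    cart.add(3)
    assert cart.total() == 6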
As for your heuristic: yes, I agree, a top performer in a common field is a safer bet and a sound strategy. If you're able to secure yourself properly, you won't be overcome by a large influx of top performers, if such a thing is possible. We have an over-supply of lawyers, so I guess in some fields it is.
You don't have to be an "X" developer. Developers can know more than one language, and knowing functional programming can definitely be a competitive advantage, even if you don't typically use functional languages.
Not in all banks. Standard Chartered and Barclays have probably the largest Haskell codebases in existence. Many others are investing in Scala (e.g. Morgan Stanley). It's probably more of an alien concept to the likes of Google and Amazon.
Ah yes, Clojure: Citigroup had a team of 70+ full-time Clojure developers in Credit, last I heard. The examples I'm giving are all strategic commitments to FP by senior management. The Google situation sounds more likely to be viewed as a "problem" by them, as (last I heard) Clojure was not on their whitelist.
See also https://news.ycombinator.com/item?id=12155561 (which may well have been the inspiration for this submission)