BQN: An APL Variant from Marshall Lochbaum (mlochbaum.github.io)
60 points by chrispsn on Aug 16, 2020 | 29 comments



That's me! It's a pretty exciting time for BQN because I've just put the self-hosted version online where you can try it out. In addition to exploring a lot of new APL ideas[0], BQN is also a test of the Co-dfns[1] approach to compilation, meaning that with an appropriate runtime the entire bytecode compiler could be run in parallel on a GPU. It's still at the proof-of-concept stage, so the online compiler is very slow, and one running on dzaima/BQN, another implementation, compiles only about as fast as Go despite having far fewer features. However, having written several thousand lines of BQN I'm very happy with the language's fundamentals, and I feel it's a big advance over APL or J.

[0] https://mlochbaum.github.io/BQN/doc/index.html

[1] https://mlochbaum.github.io/BQN/implementation/codfns.html


I've always been curious about one decision pervading the various APLs: what are the relative merits of right-to-left evaluation order versus more conventional left-to-right method chains? The core principles are incredibly elegant, but I must admit hitting "expressions are executed right to left, except if you hit a {} lambda, wherein separate statements are ordered left to right, but are themselves expressions going right to left again" is always kind of jarring.


For left-to-right execution, the dissonance between execution order and statement separators (including functions and list notation) is very bad, as you note.[0] For right-to-left execution, assignment has a similar problem: the name really needs to be on the left to allow easily scanning definitions, but this conflicts with the evaluation order. In particular, statements with multiple assignments in them don't work at all. Right-to-left is usually framed as a more "declarative" or "functional" style (and note that application and composition of mathematical functions follow this order) while left-to-right is considered "imperative".
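A quick sketch of what that tension looks like (BQN syntax; x is just a throwaway name):

    2×3+1        # 8: the 3+1 on the right is evaluated first
    x ← 2×3+1    # but the definition of x is scanned starting from the left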

In my first major language, I,[1] I did change to a left-to-right order. For BQN, which is intended to stick much closer to traditional APL, I didn't want to make such a large break from the methods that have worked in the past. I do think I'll introduce some mechanism like a "pipe" that goes in statement order so that longer chains of functions and operators can be built up without such a discontinuous reading order.

[0] https://mlochbaum.github.io/BQN/problems.html#right-to-left-...

[1] https://github.com/mlochbaum/ILanguage


> Parentheses may be used to specify the order, but there's a (usually) better way as well: whitespace! Fewer spaces means the function will be executed earlier.

Oh man, that is a fascinating idea I've played around with some; glad someone's put it more seriously into practice.

Re: assignments, Aaron sure seems to sprinkle them liberally inside lines - though not generally ones referenced in subsequent ones, to be fair. (I suppose another notable attribute of the co-dfns codebase is extensive use of ⊣ as a leftwards statement separator)
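For the unfamiliar, a rough sketch of that idiom (BQN's ⊣ behaves the same way; lo and hi are throwaway names):

    lo ← 0 ⊣ hi ← 10   # evaluated right to left: hi is set first, then lo; ⊣ discards the 10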

Mathematical functions are prefix, which imo is much harder to follow than either infix (OO/APL) or suffix (RPN/Forth), though this may be more of an English-speaking intuition than some manner of universal truth. I do at least observe a trend towards "imperative but functionally pure / side-effect isolated" in recent language design.


Ah, I see ILang deals with the issue by… not having a name binding primitive, making you SKI-calculus your arguments into place. (Compare the piles of roll-swap-dup you get in variable-less Forth dialects, though it's not as bad here since arguments are passed in from two directions.)

I suspect there is room here for an unprincipled "preceding word/pattern is defined by the following line(s)" loose-binding operator, to restore skimmability, and a separate inline destructuring let form that cannot leak out of lexical scope. Which I suppose would warrant lambdas - perhaps with ⍺ ⍺⍺ ⍺⍺⍺ to denote enclosing function input, rather than argument/operand/hyperand, though nonconcrete functions would stress the type system as is.


It does have : for assignment, although there's no scoping, and you have to assign to symbols that are surrounded in single quotes. You can see some examples at https://github.com/mlochbaum/ILanguage/blob/master/examples/... . They need extra parens, since : is an ordinary function and has the symbol on the left. Your overall point is right; I didn't really try to address any issues with assignment. And of course there are lots of things to try, and I hope some of them make it into some other language I design!


I think it's history: in the 1960s, machines were strongly card (or at least line) oriented, so pure right-to-left evaluation within a card yields a very small "parser" compatible with traditional mathematical notation, and the unavoidable card-to-card transitions become top-to-bottom on a teletype.

(FWIW in my array noodling, the equivalent of dfns are given in the haskell style of "expr where defs" instead of the strictly temporal "let defs in expr", which does result in a consistent right to left for those who wish to read bottom-up.)


Just curious: were you aiming for a ROT1'd name?


I spent about an hour believing I'd succeeded; by that time I'd come up with the long form "Big Questions Notation" and realized the pun on "bacon" (paralleling APL and "apple"), so I certainly didn't want to correct the mistake.


The default program on https://mlochbaum.github.io/BQN/try.html returns an error on mobile Safari.


Lovely to see. Congratulations on all the effort you put in. I see, from the github README, that BQN supports trains. I'm curious how far you go with this? Monadic and dyadic forks and hooks à la J?


BQN uses the 2-modifiers ⊸⟜ for hooks and, because data values (numbers, characters, and arrays) apply as constant functions, to bind arguments to functions. I find them very flexible and intuitive and would say they're one of its best features. For example, F⊸G⟜H is the function that applies F to its left argument, H to its right, and then G to the results together.
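A couple of small (untested) illustrations of how that reads:

    9 √⊸+⟜- 4    # (√9) + (-4), i.e. 3 + ¯4, giving ¯1
    1⊸+ 5        # the number 1 applies as a constant function, giving 1 + 5 = 6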

As in Dyalog APL (and subsequently NARS2000, ngn/apl, and dzaima/APL), BQN uses the two-train for simple composition, so that (F G) applies F to the result of G. The "nothing" indicator · allows a function's left argument to be omitted and can accomplish the same thing: (·F G) is another way to write that 2-train, and it can be part of a longer train with more functions to the left. So it functions like J's Cap, but also works outside of trains.
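For example (again untested):

    (⌽ ↕) 5     # Reverse applied to ↕5: 4 3 2 1 0
    (·⌽ ↕) 5    # the same 2-train, written with · filling the left slot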


I am a HUGE fan of J, so having new sibling languages is very exciting to me.

I look forward to giving this a try :)


> The Javascript-based compiler[1] is also slow [...] this is purely due to the slow runtime

Top-level functions are defined with 'let'. I wonder if 'const' is more likely to be inlined?

Arrays are given a flag property 'sh'. I wonder if adding a property to an Array drops it off Array operation fast paths? Perhaps it interferes with the engine's element-kinds optimization?

Functions are given a flag property 'm1' xor 'm2'. So shape polymorphism, different hidden classes, and associated burden on inline caches at call sites, property accesses, etc. Better to always set both. I wonder if adding any property to a function drops it off function call fast paths? Or interferes with inlining? Perhaps these flags might be avoided using 'Function.length'?

I wonder if there's some low-hanging performance fruit here.

[1] https://github.com/mlochbaum/BQN/blob/master/docs/bqn.js


Thanks for the tips; this looks like some good info. Overall I would say Javascript's performance when evaluating the runtime is outstanding: not to say it couldn't be improved, but it beat a Go version that I wrote even before dzaima built me a bytecode->JS converter. That the runtime is a snail in spite of this is no mystery: the VM provides very few functions natively, and the rest are all implemented as blocks run in the VM, with the full lexical scoping machinery. The compiler needs to be optimizing away some of that. I'm more interested in these sorts of optimizations because I do think I'll get the implementation on Wasm eventually, although it will probably have to JIT things to keep up with JS.

JS should be able to figure out that all those let variables aren't modified, and the timings don't show any difference (they did in an earlier version based on eval() instead of Function()). The scheme with m1 and m2 can almost certainly be improved; I guess I'll look at having a single type property that's always set.


Hey Marshall,

Do you still work at Dyalog? Is this just a personal hobby project?

Why choose JS or Go as implementation languages if you're already quite familiar with C? I guess web browser support is neat, but what I've always really wanted was a single executable (no install) that can run as a REPL or build self-contained executables that bundle the code and interpreter. I think that could be pretty useful. I know J exists, but it is a huge install.


I left Dyalog in June and have since been taking a break from employment to work on BQN roughly full time. Eventually I'll get another job, but I hope to keep working on BQN and won't seek to commercialize it.

I hope to have BQN running directly on at least Wasm and x86 eventually. Currently I'm focusing on the self-hosted parts of the implementation, so the VM is written in whatever's convenient, portable, and fast. Because the VM is very small it's easy to port between languages—the Go port I did in about three days without knowing any Go beforehand. But manual memory management adds a lot of difficulty, particularly because closures can form reference loops and have to be garbage collected. I'd rather stay in a memory-managed environment for as long as I can to work on the unknown aspects of BQN before attacking the known but hard memory management problems, and when I make that jump I might skip directly to assembly/machine code because I have more control that way.


Since I can't edit my current comment, I'll add that I wish your current project the best of luck and hope it morphs into something that people want to use and can use for development.


Ah that makes sense and sounds good!


A Python or Julia implementation, integrated with normal numpy, might enable low-barrier exploration and incremental adoption?


This is a nice idea. I'd thought about trying to run BQN on top of an APL (difficult because existing APLs' boxed or nested models are at odds with BQN's based "like a normal programming language" array model), but trying the more mainstream APL-adjacent languages hadn't occurred to me. numpy at least looks promising.


A story of "it's a Python package, a DSL for numpy" could reduce several barriers to exploration and incremental adoption. DSLs let one use the familiar host language, except for that one little bit where you'd be willing to learn a new and better way of doing things.

A further step in that direction might be to have a second, more familiarly Pythonic interface. Sort of a Q for K, but with minimized cognitive distance from a Pythonic ideal, the numpy API, and BQN, attempting to create a low-barrier space encompassing them, for learning and trade-offs. Though I'd have to think about what that might look like. Semantics and vocabulary might be learned separately from syntax. Perhaps someone might use BQN-native syntax for prototyping, and then dump it for release as company-acceptable "just using a normal Python library" verbose Pythonic code. A counter argument is that the APL and Pythonic ideals for code are rather different, so it might be worth using Python and numpy simply as infrastructure, without worrying about chasing hearts and minds.


Awesome work! Hopefully array languages will undergo a renaissance!

One thing is that even after reading running.md, I'm still kind of confused about how to actually run BQN. Maybe provide an easy way to get started with the self-hosted bytecode compiler? Or maybe there is an easy way, but I'm just stupid.


You can use the self-hosted version at https://mlochbaum.github.io/BQN/try.html . I haven't built any direct way to run it locally yet; you have to require docs/bqn.js from Node.js, and then call the result on strings to run them. You can also call test/tj.js with any number of string arguments to evaluate them and print the results (as formatted Javascript objects, not with nice formatting).

This stuff is all shifting underfoot now, as I only put the online version up yesterday. I'll probably make a real executable/REPL to use locally pretty soon, and I also intend to make some or all of the code blocks shown in documentation dynamic so you can change the code and reevaluate.


Thanks! Gonna give it a try now.

I've actually developed my own array language inspired by kdb+/q and forth called xs: https://cryptm.org/xs/

It's a little more pedestrian than what you're doing: APL/J have this ethereal magical quality that k/q seem to lack for whatever reason. Probably because of the lack of tacit programming (which xs supports btw).

It's a Java world; we array language folks need to stick together. Maybe we can even break the glass ceiling and get array languages used for front-end work in a couple decades!


I'd seen xs before, actually. I spent a while building concatenative array languages (nothing this complex though—file functions! The luxury!) before deciding that I really needed some more structure to make sense of the code.

A good amount of implementation discussion goes on at https://chat.stackexchange.com/rooms/52405/the-apl-orchard . I'm sure we'd enjoy your contributions.


I think there's a conflict between the essence of array languages (building a DAG of monadic (in the FP sense) operations and letting arrays flow through it) and the one-dimensionality of a stream of symbols in a text file. Ben Lynn [https://crypto.stanford.edu/~blynn/c/apl.html] puts it best:

"Worse still, tacit hooks, forks, and ranks obscures the beautiful category theory connections: hooks and forks are function composition and application within the Reader monad, while adjusting the rank of a function corresponds to applying functors."

Hooks are a structured approach to this problem, while stacks are an unstructured one. And there will always be tension between the two for as long as the current textual model of programming is used (or maybe no matter what representation we use). How obscure are we willing to get in order to get closer to the beautiful DAG we have in our head?

I of course don't have a good answer to that, and I think the dominant paradigm of array languages has yet to be discovered. APL gets us close, but we clearly agree that there's more work yet to be done.

Looking at your bytecode design, I was reminded of this old paper about building APL hardware: http://www.softwarepreservation.org/projects/apl/Papers/1970... . Not sure how relevant that is, but maybe it'll be the inspiration for a beautiful VM architecture!


Exciting. I’m really interested to see how this develops; great work to this point.


If you could get this to run fast on GPUs you might get some converts from other APL family languages.



