The comments here are interesting, lots of possibilities for other ur-languages. My suggestion is macro based languages. Macro based programming predates all programming languages other than ASM[1]. The simple macro systems, like early assemblers provided, aren't ur-languages, but once macros can expand other macros and generate definitions of new macros, the macro systems can become general purpose programming systems.
The two earliest macro systems that were clearly designed to be general-purpose languages that I know of are Christopher Strachey's GPM[2] and Calvin Mooers' TRAC[3] programming language. These languages appeared at roughly the same time, the mid-1960s. I prefer the syntax of TRAC, but otherwise they are almost isomorphic. TRAC was featured in Computer Lib/Dream Machines[4] by Ted Nelson, where the author said it was one of the three important languages for programmers to learn. A good introduction to TRAC and its implementation can be found in Études for Programmers[5].
Other, more contemporary examples of macro programming languages are m4 and TeX. LaTeX is programmed in the TeX macro system.
[4] Ted Nelson, "Computer Lib/Dream Machines", 1974, Self-published. (There is a 2nd edition from Microsoft Press, but I'm only familiar with the 1st edition).
[5] Charles Wetherell, "Études for Programmers", 1978, Prentice Hall. (It's out of print and available from Amazon for $427. I'm going to have to start locking up my old books.)
Macro/concatenative languages should replace the Forth line. The author is off-base in thinking RPN or stacks are important to Forth rather than concatenativity. RPN is just an easy way to make interpretation simple, an implementation detail, not a functional requirement. It would be like defining APL as being evaluated right-to-left.
The stack is important to Forth, but equally so is its threaded-subroutine-call nature, which isn't present in other early languages that I know of. That puts Forth halfway between assembly and higher-level (and easier to read) languages. I never saw it as a macro language.
At least that's the viewpoint I'm familiar with from back when we heavily used it.
It's a way of structuring an interpreter. In Forth a word is just a pointer to its definition, which in turn is just a list of pointers to other words, potentially user-defined or built-in primitives. Execution threads through these pointers, much like making subroutine calls with arguments being passed implicitly on the stack.
It's much more compact and lower overhead than a classic interpreter walking a full Abstract Syntax Tree structure. These days most languages are going to a full native code JIT however.
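For anyone who hasn't seen the idea, here is a rough sketch of that structure (in Ruby rather than a real Forth, and eliding the return stack and the actual threaded-code representation): a word is either a primitive or a list of other words, and execution just threads through those lists with arguments passed implicitly on a shared stack.

    stack = []
    dict  = {}

    # Primitives are procs that operate directly on the stack.
    dict[:dup] = -> { stack.push(stack.last) }
    dict[:+]   = -> { b = stack.pop; a = stack.pop; stack.push(a + b) }
    dict[:"."] = -> { print stack.pop, " " }

    # User-defined words are just lists of other words.
    dict[:double]    = [:dup, :+]
    dict[:quadruple] = [:double, :double]

    run = lambda do |word|
      body = dict[word]
      if body.is_a?(Proc)
        body.call                       # built-in primitive
      else
        body.each { |w| run.call(w) }   # thread through the definition
      end
    end

    stack.push(5)
    run.call(:quadruple)
    run.call(:".")    # prints 20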
I read about TRAC in Computer Lib while working on a template language that basically just has #! as a special prefix in JSON objects to signal a macro-expansion. I expand macros until there are no more macros to expand, and return that as the result to render in HTML. Reading about TRAC is what made me realize I had a general purpose language on my hands. Too bad, I could have avoided the whole exercise of writing an interpreter, scheduler, debugger if I just left it as an alternative syntax for interpolating variables into HTML.
Thanks for the tip on Études for Programmers, looks like I'll be able to look at it in a library at least, surprised archive.org didn't have it.
Pretty excellent summary, this is roughly the taxonomy I have in my head.
I would maybe add SQL as an ur-language as well. It's not quite general purpose like most of these, but it should have a place in this list, I think. It has some kinship with Prolog and the declarative style, but it's really its own thing.
You could also maybe argue for something like LabView. Many programmers look down on purely graphical programming languages, but with Houdini, Unreal's Blueprints and the various node-based shader/material systems in gamedev, I think it probably deserves its own little branch of this family tree.
SQL is in the same group as Prolog. In fact, it is a simpler and purer implementation of a constraint-solving language, so it is probably a better option for learning the group than Prolog.
For the signal transformation ones, there are also the hardware definition languages, in the same category as LabView and animation languages.
To your point, Berkeley's intro-level CS class, CS61A, used to teach Prolog as an example of logic programming. They use SQL now instead.
I can't really attest to whether that's a better choice for learning logic programming, though. It's easier to "run programs" in Prolog. SQL wants to be a cog in a machine where Prolog is more freestanding.
I definitely did a bunch of Prolog in my CMPT216 class back in undergrad and I hadn't really thought of SQL as an alternative until today. I agree with you it's easier to "run programs" in Prolog, and a lot of the concepts from the class would have been more difficult to map to SQL than Prolog (e.g. writing a Sudoku solver in Prolog was mind-melting in a good way, writing a Sudoku solver in SQL would probably be mind-melting in a bad way). The one thing that SQL does have going for it though is that there's a ton of tooling for getting practical real-world data into it.
I would absolutely love to take a set of logic/set-oriented problems and solve them both in Prolog and SQL, just to see which ones are ergonomic in each language. Maybe this summer at the cabin...
The tooling thing goes both ways. I remember classmates struggling with homework because they were trying to use a different flavor of SQL than the SQLite we ran for class. This stuff is obvious to engineers, but very confusing to someone whose first hello-world was 8 weeks ago.
Ok fair enough. I suspect that's a pretty non-standard/rarely-used feature though. If you learn SQL you likely won't encounter this and I still contest the idea that SQL is a good entry into the logic paradigm.
SQL and Prolog are both relational. That's very unique to both. But SQL is all about querying databases. Prolog can also be used and understood as a database querying language but it's also very strong for
- parsers
- interpreters
- expert systems
- solving combinatorial problems
If you really want to you can probably use SQL for that too. Or any language for that matter. But going out and learning SQL won't naturally expose you to these applications and how well the logic paradigm lends itself to them.
> I suspect that's a pretty non-standard/rarely-used feature though. If you learn SQL you likely won't encounter this
Recursive common table expressions are part of the SQL standard (since 1999) and are quite frequently used to traverse hierarchical data (aka "adjacency list").
It is part of basically all (good) SQL tutorials - at least in the "advanced" part.
I don't remember using recursion in a real project, but I built a HN clone on top of Postgres, with the following query:
    WITH RECURSIVE thread(id, parent_id, user_id, post_id, timestamp, text, depth) AS (
        SELECT id, parent_id, user_id, post_id, timestamp, text, 0
        FROM comments
        WHERE user_id = 1
          AND parent_id IS NULL
        UNION ALL
        SELECT c.id, c.parent_id, c.user_id, c.post_id, c.timestamp, c.text, t.depth + 1
        FROM comments c
        JOIN thread t ON c.parent_id = t.id
        WHERE c.user_id != t.user_id
    )
    SELECT * FROM thread ORDER BY timestamp ASC;
Oh, it's certainly not a "type specimen". It does actually have recursion and matching-like behavior, but those are convoluted and not really used, so it is not very suitable for general programming.
But lacking those general features makes it an even purer "learning example" of the group.
Not to mention the various graphical programming languages used in industrial control - ladder logic, continuous function chart, and sequential function chart. Ladder in particular is incredibly widespread, as it was a direct descendant of the diagrams used to document and debug relay logic, and was adopted in some of the first PLCs (programmable logic controllers) in the late 60s/early 70s.
Programming PLCs sometimes feels like going to, say, Australia or New Zealand, and being immersed in an environment which went down a different evolutionary tree very early on in its development.
SQL might be in its own category or not, but it shares a trait that languages in other categories ended up with: "I know, we'll make it kind of like writing English so that people who aren't experts can program!".
This is one of those things that seems to keep coming back - recently in the Ruby world with Cucumber.
I suspect that idea is now dead. Anything new that accepts something that looks like natural language now needs to be at least as smart as ChatGPT. Stuff that sort of looks like a natural language but is really a formal language now comes across badly.
There have been very few non-English programming languages. There was a French version of COBOL once. I'm surprised that something hasn't come out of China.
There's a Chinese dialect of Python. Does anyone use that?
Interesting point about AI... that really could be a game changer, although to me it doesn't feel like it's as precise as we often ask code to be. Yet.
SQL is based on Relational Algebra, a mathematical foundation for modeling, factoring and joining data. In this it's probably unique. Other languages may be based on other mathematical models, such as pure functional languages being based on the lambda calculus, etc.
I think your first instinct is correct, given libraries like Linq and GQL in Groovy. The relational algebra is decidedly different versus, say, map, filter and reduce.
SQL is accidentally general purpose. Recursive queries, standardized in SQL:1999 and now present in nearly all implementations, put it just over the line of Turing completeness.
IMO the eighth ur-language is Erlang (with cousin Elixir), which brings the ultra-scaling actor model, an abstract operating system designed for high reliability and low latency, and a bunch of nice rare features like builtin binary structuring/destructuring and pattern matching. While it had its origins in Prolog, it's now sufficiently far away from Prolog's inner machinery that it only has syntactical similarity.
Learning erlang made me definitely at least 10% smarter as a programmer. Highly recommend, even if you never use it in anger.
Erlang is a nice and unique language, but I feel at its core it falls into the object-oriented/message-passing paradigm, very much in the vein of Smalltalk. Consider these paragraphs from Smalltalk's Wikipedia entry [1]:
> A Smalltalk object can do exactly three things:
> 1. Hold state (references to other objects).
> 2. Receive a message from itself or another object.
> 3. In the course of processing a message, send messages to itself or another object.
Replace "Smalltalk object" here with "Erlang process" and the description holds.
> Unlike most other languages, Smalltalk objects can be modified while the system is running. Live coding and applying fixes ‘on-the-fly’ is a dominant programming methodology for Smalltalk and is one of the main reasons for its efficiency.
Erlang's famed robustness similarly owes much to its ability to update code in a running program.
Joe Armstrong talked about that in the past, and I can sort of see it if I squint, but at the end of the day, there are a lot of things in Erlang that are not processes, like most of your basic data types, so I'd just say Erlang is good at borrowing, like a lot of practical real world languages are, and in doing so, came up with a niche where it's pretty good. It takes things from the FP world, its syntax from Prolog, and definitely has message passing as an integral part of how you architect a system with it.
Fair point! It definitely is a mishmash of good ideas. I think the message passing / OO aspect stands out to me as most distinct from other languages, even if it's not as pervasive in the language as in Smalltalk.
It's funny to me. OTP Processes have a ton of similarities to Agents, but from what I recall it sounds like those two things were developed almost entirely independently and it wasn't until way after the fact when people outside of Ericsson started using Erlang that people started noticing and commenting on that.
I agree. Many of the core language concepts from Erlang you can get from Self and Prolog, but experiencing the OTP is something different that I think you only get from the Erlang family.
Likewise, experiencing JVM is something different from ALGOL ...
And not quite sure if SQL could be grouped together with Prolog due to its declarative nature. But if so, DBMS is, like OTP, quite far from Prolog, just in a slightly different direction :)
I guess what I'm trying to say is that the list of ur-languages seems quite fine to me.
That got briefly mentioned in the Object-Oriented section. It's worth noticing that the article specifically excluded Java and C++ from that section, putting them in the Algol heading instead. Smalltalk and Erlang share enough ideologically that it seems reasonable to me.
I disagree that Erlang is an ur-language (it's too close to the Self and Prolog ur-languages to be its own, IMO), but I do agree that OTP/ERTS could be a sort of ur-framework or ur-runtime, for exactly the reasons you describe: even explicit attempts to implement "Erlang-style concurrency in $LANGUAGE" routinely only cover some superficial "yeah we've got actors that pass messages and maybe supervise one another" without covering preemptive multitasking, hot-reloading of functions, distribution across nodes, and the myriad other features that give Erlang its reputation for extreme fault-tolerance.
I think reactive/synchronous languages [1] deserve a category of their own. They share little/no overlap with any of the others. My background in electrical engineering biases me to Verilog, VHDL, and LabVIEW as exemplars, but there are many others. The distinguishing characteristic of the category is that programs are effectively declarative functions of time and can be composed as such, much like electronic circuits.
(I categorized the temporal declarative language TLA+ into the Prolog family in another comment -- a main distinction here is, though one can express reactive relationships in TLA+, the intent of the language and design of the TLC model checker is that such relationships are only usable for verification code, whereas implementation code must be written as state-succession pairs. A true reactive language permits and encourages both styles of coding for implementation code -- e.g. Verilog's = and <= operators. Vice-versa, a true logic language includes nondeterminism as a core construct, whereas reactive languages typically do not.)
Verilog/VHDL isn't so much a programming language as a hardware definition language, and woe to anyone who conflates the two. Verilog/VHDL is its own class of thing and sits outside of software languages.
From a PLT perspective they are programming languages just like any other, and one can analyse them using the same machinery (formal semantics, typically of the operational style) as any other programming language.
Just because they are not even close to an imperative paradigm doesn’t make them any less of a programming language. They are reasonably close to process-calculus/message-passing languages in the sense that you can treat a Verilog process as a receiver listening to channels that contain messages generated when nets/variables are driven.
I agree that a software developer who tries to write Verilog as if it were C will very quickly run into trouble, but that isn’t because it’s not a programming language; it’s because it’s a programming language with a similar syntax to C but with drastically different semantics.
They're languages for descriptions of reactive dataflow. That they happen to specialize in dataflow which is synthesizable as hardware doesn't change that they're programming languages.
(Yes, it's easy to accidentally write unsynthesizable code in them -- but similar issues hold true for many declarative languages.)
Without weighing in on exactly if it is or isn't a "language", I think VHDL/verilog absolutely have a place in the list and are a fantastic thing for (some, particularly lower-level interested) programmers to learn precisely because it's such a different way of thinking about programming and huge piles of heterogeneous behavior happening concurrently. It really cooked my head for the first week or two and I think I came out the other side better. (I hated writing VHDL with a passion. It was painful medicine for a software person used to algol-style languages. But I'm really glad I did.)
My education was in EE but I switched over to programming ASAP. So my take on Verilog/VHDL is, like, super uninformed/perpetual honeymoon/starry-eyed. But it always seemed to me that these HDLs, since they fundamentally are pretty low-level and asynchronous, if translated to assembly somehow, ought to expose a ton of instruction level parallelism.
Paradigm-wise I find these the most interesting. All the others (maybe except prolog) are really just different ways to write a sequence of instructions. HW definition is fundamentally different. It does everything all the time and you have to figure out how to make it do what you want. Lots of opportunities for tricks to sneak in some calculations on temporarily unused wires, etc.
An article like this is hard to write. This is a good one. I have a few quibbles, but they are just quibbles.
IMHO* a distinguishing feature of Algol like languages (aka "procedural" languages) is the distinction between expressions and statements. Though personally I've never seen the appeal, that distinction seems to be popular for some reason.
* this isn't even a quibble -- the article is fine without it. Just something that has always seemed weird to me.
But if you change your definitions such that expressions always evaluate to a single value and do not have any side effects while statements produce some kind of side effect (and may or may not yield a value), the distinction becomes important.
Especially if you believe side-effects need special handling (i.e. you are a functional programmer).
It's been done in some Pascal descendants: functions were required to be pure but procedures were allowed to mutate global state. Some even went as far as to remove functions altogether (with quite unpalatable results).
The only popular language I can think of that requires side effects to be declared is Haskell. Which doesn't have a distinction between statements and expressions.
Are there any good examples of languages that have it, where it isn't "useless"?
Python, JavaScript, pretty much any language that is popular. I believe everything in Haskell is an expression, even the do notation, no?
But I think the argument being advanced is that the distinction between statements and expressions is fundamentally unnecessary. I don’t really know of any good argument in favour of them: I tend to think lisps have the perfect and simplest possible syntax.
I'm not sure I fully agree with that characterization. Many people would intuitively classify assignment as a statement, yet it's an expression in C, which is fully in the Algol tradition. And Rust is certainly a descendant of C, but blurs the lines a lot further with loops being expressions.
In the case of Lua, the expression/statement distinction allows the language to be completely whitespace insensitive with no delimiters. The Lua grammar top level rule is just Stmt (ws Stmt)*. I think Ruby might work similarly, but I don't remember.
C is less particular about the expression/statement distinction. Java enforces redundancy.
There is no difference between expressions and statements in Ruby (a descendant of the ALGOL family). Everything evaluates to a value so everything is an expression.
(to be fair, it took a lot of inspiration from Smalltalk and LISPs)
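A quick illustration of that point (plain Ruby, nothing beyond what's said above): control-flow constructs themselves yield values, so they can sit on the right-hand side of an assignment.

    label = if 2 + 2 == 4
              "math works"
            else
              "uh oh"
            end

    size = case label.length
           when 0..5 then :short
           else :long
           end

    puts label, size    # prints "math works" then "long"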
My big quibble with this is that the OO category is way too narrow. As worded, it excludes almost all languages we'd currently consider OO, and is so strict it largely makes the category pointless. Loosen it to drop the "You do conditionals..." bit and focus on the message passing, and e.g. Ruby and other languages where message passing is central enough to dominate the semantics over an ALGOL-related syntax fit firmly in this category.
Listing Self rather than Smalltalk as the basis of OO languages is also a bit odd. Calling it a "purer form" is I guess a justification, but the other view is that once you loosen the definition as above, Self is just the root of one small (albeit influential, via JavaScript) branch of OO languages, almost all of which owe more to Smalltalk than to Self. If a language other than Smalltalk should be at the root, it should be Simula, for inspiring message passing, not Self.
I always thought that the "message passing" style of OO (which I always hated as a term, because it implies asynchronicity, but I digress) is firmly something else than the "classes and interfaces" style of OO, and that it's an unfortunate accident of history that we give them the same name.
I think part of the problem of trying to separate the two is that the most prominent implementations appear deceptively similar on the surface, even with unfamiliar syntax. E.g. put even Smalltalk in front of someone familiar with C++, and they'll quickly latch on to the similarities once a few of the basics are explained, to the point that explaining message passing as different from method invocation is tricky, not least because so much of message dispatch is implemented in terms of method invocation.
This is ironically exacerbated when what is often presented as the important bit of Smalltalk are examples like the one in the article of allowing definition of control structures (you can do that in Ruby too, e.g. a partial impl: "def true.ifTrue = yield" - now you can do "(1 < 5).ifTrue { ... called if true }" - with a bit more thought you can chain ifTrue/ifFalse).
But while that's neat, that doesn't even require dynamic dispatch, just the combination of being able to invoke methods/dispatch messages to true and false, combined with convenient syntax and support for closures. You could add that to a language and still not have a Smalltalk descendant in any meaningful way.
What matters much more is that objects at least may take control over the target of a message dispatch even if they often don't, whereas method invocations in "those other" OO languages are mainly controlled by the class definition, and that is often glossed over as an advanced subject or "scary magic". For example, Ruby ORMs tend to use introspection to let you dynamically treat columns on database tables as methods, based on the actual current schema of the database - in other words, the ability to do not just dynamic dispatch but late binding.
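A minimal sketch of that ORM-ish late binding (not any particular library's API; FakeRow and its column hash are made up for the illustration): the set of "methods" isn't known until runtime, yet calls look like ordinary method invocations.

    class FakeRow
      def initialize(columns)
        @columns = columns   # imagine this hash came from inspecting the schema
      end

      # Resolve unknown messages against whatever columns exist right now.
      def method_missing(name, *args)
        key = name.to_s
        return @columns[key] if @columns.key?(key)
        super
      end

      def respond_to_missing?(name, include_private = false)
        @columns.key?(name.to_s) || super
      end
    end

    row = FakeRow.new("title" => "Ur-languages", "score" => 42)
    puts row.title   # resolved at call time, not from any class definition
    puts row.score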
ST80 does not really do the object controlled message dispatch that was there in earlier Smalltalks. Well, you can override Behavior>>#doesNotUnderstand: and do all sorts of cool tricks with that, but it is mostly meant as error-recovery path (which is partly apparent from the name), not as something that should be regularly used.
One weird aspect of Smalltalk that more or less directly comes from the control structures implemented as messages taking blocks is that on the language level there are two distinct function-like objects: methods and blocks (ie. lambdas) that behave differently and interact with each other (return statement is scoped to method and only valid during the dynamic extent of said method invocation).
> ST80 does not really do the object controlled message dispatch that was there in earlier Smalltalks. Well, you can override Behavior>>#doesNotUnderstand: and do all sorts of cool tricks with that, but it is mostly meant as error-recovery path (which is partly apparent from the name), not as something that should be regularly used.
The same way you can override "method_missing" in Ruby but you apply it as rarely as possible, but the dispatch is still dynamic and methods can be overridden and dynamically defined. That's the point. Not that you literally implement a dispatch directly on the object.
Put another way, the main distinction is that the precise method body invoked by sending a given message to an object may be impossible to statically determine.
> One weird aspect of Smalltalk that more or less directly comes from the control structures implemented as messages taking blocks is that on the language level there are two distinct function-like objects: methods and blocks (ie. lambdas) that behave differently and interact with each other (return statement is scoped to method and only valid during the dynamic extent of said method invocation).
Ruby sort-of inherits this too, but at any point where you take the value of a block, it becomes an object - it's purely an implementation artefact, and with lambda/proc providing both lexical and method-local scope for return.
this distinction is something i have been struggling with for some time. i am coming from pike which only uses method invocation terminology (in pike it's actually called "function call"), and never even hints at message passing, yet in pike objects can take control over the target of a function call and even dynamically create functions based on arguments given.
when learning smalltalk i could not see what was so special about message passing, and why it is even called that.
consequently, i find the distinction between message passing vs function or method calling academic. more interesting are distinctions like static vs dynamic dispatch and early vs late binding and whether things are defined at compile time or runtime.
the term "message passing" always made me feel like this should be something completely different and not even remotely similar to function calling. yet i couldn't see that difference and that left me irritated because i felt like i was missing something.
The language is made a lot more confusing by the fact that most of the time you can improve performance of a conceptually message-passing implementation by implementing it as much as possible in terms of invocation, so it's totally reasonably why even very dynamic implementations would end up spoken of as method invocation.
E.g. my long-languishing partial Ruby compiler uses C++-style vtables because they're fast; the "only" challenge is that you need to propagate method re-definitions down a chain of descendant classes (while avoiding overwriting overridden versions in the descendants). In practical terms, pseudo-code for a method dispatch is ob->class_ptr.vtable[method_slot](args...). All of the dynamism happens through a combination of dynamically overwriting and propagating method pointers down the vtable chain (I was worried it'd lead to way too much memory spent on sparse vtables and having to fall back on a hash table for less-used method names, but in practice the number of classes is usually very constrained) and filling in thunks that forward to method_missing for names that are seen in the system but not implemented by the current class.
Effectively you can consider the vtables as perfect pre-filled caches. To someone casually looking at them to get an idea of Ruby, it'll look like Ruby's object model is almost the same as C++'s.
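For what it's worth, here is a toy model of that propagation idea (plain Ruby, nothing to do with the actual compiler): each class keeps an array of method procs, and a redefinition is pushed down to subclasses unless the subclass has its own override for that slot.

    class VClass
      attr_reader :vtable, :subclasses

      def initialize(parent = nil)
        @vtable     = parent ? parent.vtable.dup : []  # "pre-filled cache"
        @own        = {}                               # slots defined locally
        @subclasses = []
        parent&.subclasses&.push(self)
      end

      def define(slot, &body)
        @own[slot] = true
        write(slot, body)
      end

      def dispatch(slot, *args)
        @vtable[slot].call(*args)   # roughly ob->class_ptr.vtable[slot](args...)
      end

      protected

      # Propagate a method pointer down the chain, skipping local overrides.
      def write(slot, body, from_parent: false)
        return if from_parent && @own[slot]
        @vtable[slot] = body
        @subclasses.each { |sub| sub.write(slot, body, from_parent: true) }
      end
    end

    animal = VClass.new
    dog    = VClass.new(animal)
    animal.define(0) { "generic noise" }
    dog.define(0)    { "woof" }
    animal.define(1) { "eats" }      # propagates to dog; slot 1 not overridden
    puts dog.dispatch(0)             # => woof
    puts dog.dispatch(1)             # => eats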
So I agree with you that the key practical difference is static vs. dynamic dispatch and early vs. late binding. With those distinctions you don't really even need to make the compile vs. runtime distinction. That is, you can statically compile something that includes dynamic calls to modify the object model, like my prototype compiler.
Elsewhere I suggested one way of looking at it is that to distinguish the C++/Java etc. and Smalltalk model of OO, the key test is how common it is for there to be code where determining which method body will be invoked by a given call devolves to the halting problem if you don't have the precise inputs ahead of time (e.g. in Ruby, almost every ORM would cause this).
I'm a bit confused about this quibble when the article text is about "Self and Smalltalk", not just Self? If anything the quibble should be "why is the section called Self when it's about the _two_ ur languages Self (the ur language for prototype based OO) and Smalltalk (the ur language for class based OO)"?
The title is "Self (object oriented programming)" and it ends with "Self disposed of the notion of class and worked solely with objects. As this is a purer form, I have chosen Self as the type specimen for this ur-language", so while it gives Smalltalk the space it deserves, it then turns around and points out Self as the ur-language for OO.
You seem to confuse "ur language" for meaning "the original that others came from", which makes sense if you're familiar with the term "ur", but is not how this article uses the term. They use it to mean type specimen (as noted in the article). As type specimen, Self makes a lot more sense: it took the OO concepts pioneered by Smalltalk and then made them even more OO.
As a type specimen, Self makes less sense to me. In that sense it might make sense as the ur-language of its own type (prototype-based OO languages), but it's too specific to be a specimen of the overall OO category.
I don't agree it made the concepts more OO in any sense. Self's changes feel far less significant than what Smalltalk brought to the table. That an object inherits its structure and functionality from something that is itself an object is true of all Smalltalk-derived languages I'm aware of, after all.
Smalltalk says "objects are instances of some class of thing", whereas Self went "no, objects are just objects, and what class of thing they are is mutable".
Since we can get from Self's model to Smalltalk's model by adding restrictions on what you can do to both objects and prototypes, Self is quite objectively a better type specimen. I wouldn't call it an "ur language", that was a silly choice in term, but as basis for discussing OOP models, it's the most permissive, and you can get the other OOP models by adding various (sets of) restrictions.
(Much like how a regular grammar is a context free grammar with additional rules)
I suspect we'll have to agree to disagree on this, as I don't agree with you at all that it is "objectively a better type specimen". I'd argue it's not even a good type specimen, in that it shares fewer characteristics with the majority of OO languages than Smalltalk does.
The defining aspect of OO languages if we go by Kay is message passing and late binding. How that is achieved is secondary to the classification. Both Self and Smalltalk provide that, but Smalltalk's way of providing it is more typical of the class of languages as a whole.
To me, either Kay is authoritative, or we go the other direction and include more of the ALGOL-derived languages. In neither case is Self a good representative of what OO languages are like, and that to me makes it a poor type specimen.
He was absolutely authoritative, but: back in the early days of OOP. Both the meaning and fundamental aspects of OOP have changed (quite a bit) since then, and what Kay called OOP might be the historical forefather to, but not current type specimen of, OOP.
> Listing Self rather than Smalltalk as the basis of OO languages is also a bit odd.
That was my first reaction too. I also agree that Simula deserves a mention here. AFAICT more languages derived from that model than from Smalltalk or Self directly, and I'm pretty sure the authors of later languages such as C++ have acknowledged as much.
Yeah, at least Stroustrup and Gosling have acknowledged that.
The reason I'd be ok with Smalltalk there is that Smalltalk at least was a significant break and you could argue that many of the "non-Smalltalk-y" OO languages are really ALGOL-derived languages that took some OO aspects (especially if you move Ruby out of the ALGOL bucket) in that most of them tend to combine support for both non-object types and objects, and have lots of constructs that operate on non-object values.
As such I can see the merits in both/either of Smalltalk and Simula treated as the "ur-language" for OO in a way I can't for Self.
Self is remarkable, but as I've mentioned before, more for the advances its implementation brought.
The author's notion of OO is intentionally narrow. I don't know about Simula, but C++ is still reasonably close to Algol, compared to what Smalltalk and Self bring to the table (the author also mentions programming environments vs. text files).
If you accept that narrow definition at least for the scope of the article, it makes sense.
It's exceptionally narrow, in that it rules out the vast majority of languages that even purists would agree are OO.
E.g. in Ruby you cannot evaluate anything and get back something that isn't an object (e.g. integers are objects, true is an object, nil is an object), but Ruby is not an OO language by the article's definition because it fails the part about conditionals.
Even though you can do this in Ruby (probably buggy, just threw it together) - it's just not idiomatic and the language has syntactic sugar for "less OO" forms:
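(Something along these lines; a sketch of the idea, not necessarily the exact snippet the comment had in mind:)

    class TrueClass
      def ifTrue;  yield; self; end
      def ifFalse; self;        end
    end

    class FalseClass
      def ifTrue;  self;        end
      def ifFalse; yield; self; end
    end

    (1 < 5).ifTrue { puts "called if true" }.ifFalse { puts "never called" }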
So it might make sense if you accept that narrow definition, but I don't think many people will find that narrow definition to make sense. I certainly don't.
Simula itself was almost a superset of Algol-60. It's pretty much Algol-60 cleaned up a little bit and with a Java-like object model bolted on top. C++ is a direct descendant of that, sometimes even syntactically - e.g. the keywords "class", "new", and "virtual" all come from Simula where they had largely the same meaning. It's close enough to our mainstream OO languages today that Simula code can be easily understood, at least so long as it doesn't use the async features.
From the perspective of "Here is a list of language families from which you probably should learn at least one of each", the message-passing OO languages are completely divorced from the class-based OO languages. Tossing the latter in with the ALGOL languages is probably fine, as long as the actual language-to-learn recommendation is one of them.
I think this taxonomy is interesting, but I have some quibbles on the characterization of the languages I use the most.
If I were to put Ruby into one of these categories, I would place it first under Self (the object-oriented languages). Ruby is the most object-oriented language that I've ever used in that everything is an object that you send messages to. Even classes in Ruby are objects (they are instances of the `Class` class). Ruby was explicitly inspired by Smalltalk, one of the two exemplars cited by the post.
After the "Self" category, Ruby would fit better under the "Lisp" family than the "ALGOL" family because of Ruby's deep metaprogrammability.
I'm guessing the author was fooled by the availability of C-style `for` loops in Ruby, but that's generally not how Ruby programmers write a loop. It's much more common to write `list_of_things.map {…}` or use any number of other iteration methods available through the `Enumerable` module.
JavaScript to me fits better under the ML (functional languages) family than "ALGOL". The first-class nature of JavaScript functions is the core feature of the language. Of course, if you define "functional languages" by having static type systems this grouping wouldn't work for you. But for me it's all about the functions. You can pass functions around and return them from other functions. You can write utility functions to memoize or otherwise transform functions.
And while not everything in JavaScript is an object the non-object values in JavaScript have "object" versions. JavaScript still has some object-oriented chops. Functions are themselves objects, and while method calls are usually just reading a function off of an object and calling it, you can intervene in the property-reading step to enforce a more "message passing" style.
I'm guessing that most developers will have similar quibbles about the categorization in this article of the languages they are most familiar with. But this is still an interesting frame of reference. And if you only work in languages that fit squarely in the "imperative" category (or write code in an imperative way), I encourage you to explore some of the others.
I agree with your characterization of Ruby as a Self/Smalltalk category language. (Or, a language with an ALGOL-style syntax and a Smalltalk-style OO system.)
> After the "Self" category, Ruby would fit better under the "Lisp" family than the "ALGOL" family because of Ruby's deep metaprogrammability.
I don't agree with this: Lisp's metaprogramming capabilities come from its macro system, while Ruby's metaprogramming capabilities are due to metaclasses (which is exactly why it fits in the Smalltalk category).
JavaScript is also not a ML-style language in my opinion. ML-like languages have not just types, but also pattern matching and algebraic data types. First-class functions is not unique to the ML category, it also applies to the Lisp and Smalltalk categories (and maybe APL, I don't know it enough).
I also still think of JavaScript as an ALGOL-style language augmented with Self concepts (Self, not Smalltalk!), but this may be because when I first started with JavaScript it didn't have classes yet. The good old days...
I think of "functional" meaning first-class and higher-order functions, but lots of people use that term to mean pattern matching and type systems. It definitely seems like the author of this article meant the "ML" category to mean the latter.
When they added the `class` keyword in JavaScript it didn't change the capabilities of the language—it's still prototypal under the hood, but I guess the syntax matters. I certainly see people writing "classes" in JS a lot more now.
ALGOL 68 has first-class and higher-order functions. Arguably ALGOL 60’s call-by-name allows you to do some higher-order things, even though its functions are not first class.
> I don't agree with this: Lisp's metaprogramming capabilities come from its macro system, while Ruby's metaprogramming capabilities are due to metaclasses (which is exactly why it fits in the Smalltalk category).
Though, see the Common Lisp Object System Meta Object Protocol (CLOS MOP). There were a bunch of meta-object systems for Lisp, with the CLOS MOP as the most prominent example.
>If I were to put Ruby into one of these categories, I would place it first under Self (the object-oriented languages). Ruby is the most object-oriented language that I've ever used in that everything is an object that you send signals to. Even classes in Ruby are objects (they are instances of the `Class` class). Ruby was explicitly inspired by Smalltalk, one of the two exemplars cited by the post.
Not everything, there are still a few keywords that are not objects. For example `end.class` will raise a syntax error.
>Every programmer needs to know a language in the ALGOL family well. Once you do, then it’s worth branching out. Learning a new language that traces to an unfamiliar ur-language each year will pay dividends
I second this. I would consider myself an intermediate level programmer and learning Scheme (via the excellent book "Structure and Interpretation of Computer Programs") took my programming to a new level and made me think of programming from a completely different angle.
Agree on the families, but I would pick a different representative for many of the categories.
Algol -> C. Mostly because you can actually do things with C, and yet it remains a fairly small language that's a relatively pure exemplar of the Algol tradition.
Lisp -> Scheme. Also because it's a tiny language that tries to push the fundamentals of the Lisp family (code-as-data, recursion, functional programming, macros) as far as possible.
ML -> Haskell. ML is eager, Haskell is lazy. If you're going to learn about the ML family, you might as well learn the concept of laziness, which results in a very different style of programming than FP of the Scheme variety (in the Lisp family).
APL -> J. The usage of special symbols is largely irrelevant to the concepts in APL, and it's a barrier to accessibility. You can learn all the important parts of array-oriented programming with J and use actual words to do it.
Self, Forth, and Prolog I would keep as exemplars of their type. I would also add TCL as another ur-language for string-based scripting languages (with Perl, PHP, and SNOBOL as other representatives of the category).
In academic literature, "ur-" doesn't just mean original, as much as it means "essential essence of" or "fundamental". In contexts where you're talking about the first/original thing, or something that provided characteristics that would eventually become a trend, "proto-" would be more appropriate.
For a famous example, Umberto Eco's essay "Ur-Fascism" isn't describing what the first or original fascist movement was, but instead exploring what makes Fascism Fascism. This usage seems to be pretty consistent across academia, which engages in the vast majority of cases of slapping foreign prefixes onto English words.
Yeah, K doesn't even have proper n-dimensional arrays. Funny enough, the author writes "If you do a lot of numerical work, learn J earlier." which may be a typo, but still…
> You can learn all the important parts of array-oriented programming with J and use actual words to do it.
J doesn't use actual words. It uses symbols just like APL. In fact, APL uses proper words or abbreviations for utilities things that are not part of the core language, whereas J just uses numeric codes combined with more glyphs. However, J is ASCII-only (and uses bi- and even tri-glyphs) where APL uses pleasant and mnemonic Unicode single glyphs, so you don't need to parse which adjacent ASCII symbols form a "word".
Same for APL, but the idea is to learn what the symbols mean, so you can keep it concise and think in composing functions and not reading pages of text. Sort of like mathematicians and mathematical symbols.
I program in J and APL, but I have taken to BQN lately. It is the best of both with some additions too.
> A reader pointed out to me that getting Forth in a deep way usually involves building Forths, since they’re small enough for a single person to build one from the ground up fairly quickly.
I would say the same about Lisp.
In fact, each list in Lisp is like its own little stack in Forth. Prepending an element to the front of a list is like pushing onto a stack. Separating the first element from the rest of a list is like popping from a stack.
Lists of lists & atoms are like stacks that can contain references to other stacks, as well as atoms.
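To make the analogy concrete, a tiny sketch (cons cells modelled as two-element Ruby arrays purely for illustration):

    cons = ->(head, tail) { [head, tail] }
    car  = ->(cell) { cell[0] }
    cdr  = ->(cell) { cell[1] }

    stack = nil                  # the empty list is the empty stack
    stack = cons.(1, stack)      # "push" 1
    stack = cons.(2, stack)      # "push" 2
    top   = car.(stack)          # "pop" reads the top element...
    stack = cdr.(stack)          # ...and drops it, leaving (1)
    puts top                     # => 2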
Years of bitter personal experience have taught me that FORTRAN is the real ur-language in the sense that a suitably determined programmer can write FORTRAN code in just about any language. I have certainly had to debug this code all over the place.
I've never heard of ALGOL both predating and influencing FORTRAN. FORTRAN was announced as a product at the Western Joint Computer Conference in 1957 [1]. ALGOL was developed at a meeting in Zurich in 1958 [2].
ALGOL has certainly influenced the development of FORTRAN at least, with Fortran 90 and later fitting in pretty neatly amongst the various ALGOL-likes.
It is sort of angels dancing on pins, but who came first between COBOL, Fortran, and ALGOL is curious. Even this article seems to be putting ALGOL afterwards. It was a more collective language, but I didn't see it penetrate as much as remain a discussion topic. Perhaps the thread through C gave it entrance?
And then there's BASIC… Maybe that is just a pidgin, giving those who could not speak at all some words?
I've always considered Fortran to be the "first", but with some quick research, it looks like Algol was the first C-like lang to add an if/then statement. My guess is that the author was trying to approximate modern imperative programming.
Fortran was the first compiled language, that is, the first language at a level higher than assembler. But Algol was a block-structured language in a way that Fortran wasn't, and virtually all languages of that "group" are block-structured today ("goto considered harmful" and all that). So Algol is more appropriate as the "prototype" of what exists today.
Mercury Autocode is perhaps a "missing link" between assembler and Fortran, though I'm not sure of the dates and whether the people developing Fortran knew about Mercury Autocode. Things developed quite fast in the 1950s.
> Lisp is about a year younger than Fortran, which makes it the second oldest language still in use today
I disagree with this line since Lisp is not really a "language", but a family of them. If we consider Clojure and original Lisp to be the same language, we should also consider that to be true of Algol and Rust.
The original LISP was a language and its implementation. It was also documented - see for example the LISP 1 manual and other publications about it. From that a series of languages evolved, which shared the original core (syntax, data structures, operators, ...) That's what I would consider the core Lisp language family. Most of these languages will have 'Lisp' in its name.
> I am aware of seven ur-languages in software today. I’ll name them for a type specimen, the way a species in paleontology is named for a particular fossil that defines it and then other fossils are compared to the type specimen to determine their identity
Very useful for an intermediate programmer who appreciates the craft and wants to extend their skills by becoming a true polyglot.
As for total beginners, it can be demeaning to explain why asking which language of the ALGOL descendants to learn is a pointless question. For their case, I don't bury the lead: I tell them "They all have the same lineage and are fundamentally similar, so just pick the one you'll learn the most robustly with. Most everything else is learning libraries."
The claim that Fortran and assembly languages "trace to" Algol is... odd. Fortran I was released in 1957; the first fragmentary spec for Algol wasn't released until 1958 (with Fortran project leader John Backus as one of the primary contributors). And there were also recognizable assemblers, in the modern sense, at the time (although terms like that weren't always used with their modern meanings in the 1950s, which lays up all sorts of traps for the unwary).
The first point jumped out to me as contradictory: for immutable values copying is only an implementation detail. Apparently "value semantics" can also refer to mutable values with exclusive ownership, is that the idea here or something else?
I had the same reaction. I think it stems from choosing an extremely strict set of characteristics. If you take the infamous Alan Kay quote [1], then e.g. Ruby fits in that category. But with the characteristics as-is, the category of OO languages as opposed to language that allow OO programming, is near empty and it becomes easier to "get away with" talking about the purity of Self rather than recognising that Smalltalk has a lot of spiritual descendants, while Self is the basis for a much smaller branch (Javascript being by far the most prominent spiritual descendant, but even JS is ruled out by the characteristics given).
Ending up with Javascript and especially Ruby as ALGOL derived makes no sense to me. Where more limited OO languages have imported OO aspects, JS and Ruby have wrapped ALGOL syntax and a few concepts around semantics that are much closer to Self and Smalltalk respectively.
[1] "OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme LateBinding of all things".
I'm ambivalent about that. As a technicality and inspiration it makes sense, but having learnt Simula (I studied in Oslo; it was the introductory language for many years) it does not give you the same focus on objects as Smalltalk does. But if you want to put another language than Smalltalk as the OO ur language, I'd certainly pick Simula over Self.
My knowledge of Simula is limited to reading about it and looking at small code samples, so I'm learning. But it's probably telling that I've learned more about Simula and seen Simula code but I've only ever heard about Self as a kind of footnote.
Self is mostly important for its implementation and the papers about it, much more so than the language itself. If you're interested in compiler technology, especially the original papers by Urs Hölzle and David Ungar are very approachable. There's a very extensive bibliography here [1]. E.g. the paper on Polymorphic Inline Caches [2] by Urs Hölzle, Craig Chambers, and David Ungar was a seminal paper on optimising method dispatch for dynamic languages (note that lots of other work has been done since, but many of these papers are useful because they're short and simple and give a good foundation for digging into newer techniques)
Simula 67 is much closer to the mainstream OO concepts we see in C++, Java, C# and Python than Smalltalk, or let alone Self. ST was the first dynamically typed OO language (scripting language actually) though.
It is, because it inspired both the Smalltalk branch and the more mainstream OO languages, and hence it makes sense to consider it as a possible ur-language in that sense.
Smalltalk is a branch, but it's an important enough branch introducing important enough new concepts that unlike with Self I wouldn't have an issue with people considering Smalltalk it's own ur-language, and because I agree with you that Simula at least on the surface will seem more familiar to people familiar with ALGOL-derived languages with OO mechanisms than to Smalltalk.
Self, on the other hand, is less important for its concepts (ok, so it has prototypes instead of classes, but classes in Smalltalk are also objects, so I don't buy that it's that conceptually different, especially in a dynamic language where you can dynamically instantiate and mutate classes) than it is for the papers on its implementation.
> considering Smalltalk it's own ur-language, and because I agree with you that Simula at least on the surface will seem more familiar to people familiar with ALGOL-derived
For ST we have to differentiate the 72 and 74 from the 76 and later versions. Starting from 76 it has inheritance, compiled methods and virtual method dispatch quite similar to (though less efficient than) Simula 67.
> What do you have in mind, apart from duck typing?
The focus on message passing and late binding combined. "Duck typing" is seriously diminishing it. You can write code that appears that way even in C++ with RTTI and inheriting from a shared root class and heavy use of virtual. But to achieve the equivalent of the combination of message passing and late binding in a language like C++ not built for it, you typically end up having to build your own message dispatch machinery with no syntactic support to make it cleaner, which will make your code look fundamentally un-idiomatic, and so doing so tends to be limited to specific problems.
It's this combination that makes Smalltalk-derived languages feel different.
> For ST we have to differentiate the 72 and 74 from the 76 and later versions. Starting from 76 it has inheritance, compiled methods and virtual method dispatch quite similar to (though less efficient than) Simula 67.
I'd argue when we say Smalltalk without qualifying it, most of us will be talking about Smalltalk-80.
> message passing and late binding combined. "Duck typing" is seriously diminishing it
Actually even ST-72 made synchronous calls, but at least with a token stream interpreted by the receiving object (thus at least a bit of "message passing"). In ST-76 and later versions "message passing" is just nomenclature used by the ST folks for something that is just ordinary method dispatch and call (if you have doubts, you can analyze the innards of the ST-80 VM yourself e.g. with these tools: https://github.com/rochus-keller/Smalltalk ). The major difference is the dispatch based on signature hash (similar to e.g. Java interface method calls) instead of static offsets, which enables late binding (at the expense of performance); and since everything including ordinary integers derive from Object, all values and objects are subject to dynamic method dispatch; it's no coincidence that Smalltalk was the first language to be associated with duck typing. The unification of scalar values and references, dynamic typing, and likewise the minimal syntax where control structures are implemented by means of runtime constructs were already known from Lisp; also closures (i.e. ST blocks) were already known before they were added to ST.
I've not suggested it is anything but synchronous, so I don't know why you're bringing that up. It's not what we're talking about when we talk about "message passing" in this context.
> In ST-76 and later versions "message passing" is just nomenclature used by the ST folks for something that is just ordinary method dispatch and call (if you have doubts, you can analyze the innards of the ST-80 VM yourself e.g. with these tools: https://github.com/rochus-keller/Smalltalk ).
Sure, you can implement method dispatch the same way. I've written a (partial; unfinished; very buggy) Ruby compiler that allows dynamic method redefinition with even basic C++-style vtables. The point is not the dispatch method but the ability to override them at will.
> The major difference is the dispatch based on signature hash (similar to e.g. Java interface method calls) instead of static offsets, which enables late binding
That late binding is an important part of it.
But you don't even need to deviate from static offsets to enable that late binding (you do need to do so if you want the ability to do dynamic interface-based inheritance, but even then you can use a vtable-like approach - see e.g. Protocol Extension: A Technique for Structuring Large Extensible Software-Systems, M. Franz, 1995 - which adds dynamic inheritance at runtime to Oberon) as long as the dictionaries/vtables/whatever you look them up in are mutable.
Which closes the loop to my point, that Simula 67 is much closer to the mainstream OO concepts we see in C++, Java, C# and Python than Smalltalk, and there is no reason to elevate Self (nor ST) as the "ur OO language".
Which I agreed with you is a reasonable stance. To quote myself:
> It is, because it inspired both the Smalltalk branch and the more mainstream OO languages, and hence it makes sense to consider it as a possible ur-language in that sense.
I then went on to argue simply that because Smalltalk is at the root of a significant branch, I wouldn't have an issue with considering that an ur-language if one considers that branch important enough and/or consider message passing and late binding to be essential for a language to be object oriented, as opposed to having some object oriented features.
But I went on to again agree with you:
> I agree with you that Simula at least on the surface will seem more familiar to people familiar with ALGOL-derived languages with OO mechanisms than to Smalltalk.
To sum it up: I've argued that a reasonable case can be made either for Simula or Smalltalk depending on how you define OO, but that no well established definition of OO would make Self a reasonable candidate.
It is absolutely reasonable to group Self and Javascript together for that reason, sure. I just don't think even that is sufficient reason to consider Self the "ur-language". If he'd created a separate category for prototype-based OO, Self would belong there, though.
I take issue with calling stack languages the inverse of Lisp. This isn't really the case since functions in stack languages are fixed arity, while Lisp functions are not. The inverse of RPN would be the fixed-arity prefix notation used by the REBOL family of languages.
I think you could make a case for including Ada and/or COBOL, separately from ALGOL. Ada is/was an early attempt to remove errors from programming, and COBOL was an attempt to make a "business language" that would be understandable by suits.
Ah Mumps. I did a very little bit of programming in that. It should definitely be on a list, but maybe not this one. Perhaps the "languages you should run away from screaming" list :-)
I despise this writing style. What is the point the author is trying to make? Define what an ur-language is up front, don't be so damn coy. This isn't much more than a list of interesting languages for some definition of interesting.
It's talking about different families of programming languages, much like how we classify the different languages that people today speak[1]. I definitely got thrown off by the choice of terminology, but it makes sense once you go through the list.
Jesus! I thought these navel gazing low effort language posts went out of favor in 2012.
I know all of these languages, and Forth isn't foundational for anything we currently use (maybe the JVM, because it has a stack, if you really torture the definition). It is just a collection of languages so the author can look smart.
If they wanted to provide insight, they would demarcate how the semantics of computation differ across these languages. It doesn't even mention the power of Forth: being able to extend the language and runtime from within the language itself.
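For anyone who hasn't seen that property, here is a toy sketch of my own (in Python, so only a cartoon of real Forth): colon definitions add new words to the same dictionary the interpreter already runs from, so the language grows at runtime using nothing but itself.

    # Toy Forth-ish interpreter: a colon definition compiles a new word into
    # the very dictionary the interpreter is running from.
    stack = []
    words = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
        ".": lambda: print(stack.pop()),
    }

    def interpret(source):
        tokens = iter(source.split())
        for tok in tokens:
            if tok == ":":                    # start of a colon definition
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                words[name] = " ".join(body)  # the language just grew, from within
            elif tok in words:
                w = words[tok]
                w() if callable(w) else interpret(w)
            else:
                stack.append(int(tok))

    interpret(": square dup * ;  5 square .")   # prints 25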
Ahh, this is good. But what are ur-languages, really? Can we tell when we have counted all of them? Is the collection determined more by the application domain (the problems we think are important to solve) or by the architecture of von Neumann-style computing? E.g. I don't know much about quantum computing, but I suppose there is an ur-language associated with it.
Whenever I see a taxonomy I itch for some underlying logic that unifies it. If the explanatory scheme is successful, then you might actually convert the taxonomy into a tree, where the root is some property that all of the ur-languages share, etc.
They're grouped more by their main mode of abstraction than by underlying hardware architecture.
In theory you can emulate anything on something that's Turing complete, can't you? It will just be harder or easier. May not be worth the trouble. But still possible.
Objective C is a good example of a Smalltalk derived language that was in heavy use for a while. Even though it's technically a superset of C, in actual use it's more like a message passing language.
It's really Smalltalk and C pasted together with hardly any mixing, like JSX=JS+HTML. I can't see how you could put it in either column, it's absolutely in both.
Ruby is mostly a Self in this classification, right? Or maybe a Self/ALGOL hybrid. It does have iteration as a method (.each), but also has traditional if statements.
To some extent yes, it's certainly influenced by Smalltalk. I just wish the author hadn't focused that much on control flow in that paragraph.
Ruby afaik also supports live programming up to a point (REPL), but it's still not the same live-ness as in Self or Smalltalk.
Most of these are too old for me, but I've always had this idea of fundamental language paradigms too. My ur-languages are maybe Python, Lisp, Haskell, SQL. If you drop the requirement to choose the oldest language in each paradigm (which is more practical from a learner's POV), there are many sets of basis vectors (languages) that span the entire vector space (all language features).
I think mentioning Simula would be fair.
The first object-oriented language.
""
Simula (1967) is generally accepted as being the first language with the primary features of an object-oriented language. It was created for making simulation programs, in which what came to be called objects were the most important information representation.
I also think a lot of programmers could get a sort of enlightenment by getting very proficient in SQL. (To the point where you can do general programming in it using recursive CTEs if you have to, even though that's probably impractical for real use cases.)
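For what it's worth, a quick sketch of that idea (mine, not from the thread): a recursive CTE computing Fibonacci numbers, i.e. general computation expressed in plain SQL. Python's sqlite3 is used only to keep the example self-contained and runnable.

    # Fibonacci via a recursive CTE; sqlite3 just provides a SQL engine.
    import sqlite3

    query = """
    WITH RECURSIVE fib(n, a, b) AS (
        SELECT 1, 0, 1
        UNION ALL
        SELECT n + 1, b, a + b FROM fib WHERE n < 10
    )
    SELECT n, a FROM fib;
    """

    conn = sqlite3.connect(":memory:")
    for n, value in conn.execute(query):
        print(n, value)        # 1 0, 2 1, 3 1, ..., 10 34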
TLA+ and PlusCal are specification languages, not programming languages. Spec languages have their own ur-languages: TLA+ is from the LTL dynasty, Alloy and Event-B are from Z, PRISM is CSP I think?
Respectfully (you having literally written the book!) I disagree that there's a strong difference between specification and programming languages -- and specifically I feel that the "gap" between verification and implementation can be filled by recognizing this.
To take TLA+ as an example -- I've certainly written (short!) programs in both TLA+ and PlusCal -- both imperative (typically an implementation of something that I'm modeling) and declarative (to solve logic problems in the same manner I would use Prolog).
And there's nothing privileged about TLC's capabilities and limitations -- one could equally imagine a TLA+ interpreter which disallows cross-state nondeterminism or use of temporal predicates but allows interaction with the environment via a special variable or predicates, without needing to augment the language, thus allowing "normal" programs to be written and executed. Or a TLA+ interpreter which permits full nondeterminism via a SMT backend, which can act as a richer version of Prolog.
Imagine writing correctness-critical code snippets in TLA+, verifying them, then compiling and linking them into an application written primarily in another language! I don't believe there is anything about TLA+-the-language which would need to change for such a tool to exist.
There are of course language elements specific to verification in TLA+ -- namely, the temporal operators pose a challenge to "synthesizing" an executable program. But I don't think this makes TLA+ any less a programming language, any more than the existence of, say, Haskell's type language makes Haskell less of one.
Verilog sits in a similar space, where it was designed for verification, and is full of constructs usable only for verification, but also has a "synthesizable" subset which can be and is used expressly for implementation without involving verification of any sort. Although one can argue Verilog is a hardware description language rather than a programming language, I haven't seen anyone argue Verilog is a specification language rather than either of those.
You raise good points! Ultimately the technical distinction between a specification and programming language comes down to the existence or not of a compiler. Imagine if Python came out in 1960, but without anybody building a Python interpreter. People would consider it a spec language.
(Also, lots of spec languages are weaker than programming languages, to make verifying them easier. IIRC mcrl2, FDR, and Promela all fall under this category.)
I still think there are practical differences, though, between a language designed for the purposes of specification and one designed for implementing programs. TLA+ has temporal predicates and cross-state nondeterminism because it makes modeling at a high-level easier, even though it makes programmatic implementation harder. That leads to different design decisions through the language.
More importantly to the OP, thinking of them as fundamentally different is useful because they draw on different inspirations. It could be that PLs and SLs converge later, but their origins and lineages are still different. And that's interesting and worth studying.
I don't think talking about ur-languages is as useful as talking about archetypes: "ur-" to me implies "the original", and that's a hard thing to pin down. TFA has some urs, like Self, that aren't the original but are still useful as archetypes. Off the top of my head, the archetypes of formal specification would be
* Temporal Logic (LTL, CTL, TLA)
* Relational algebra (Z, B, Alloy)
* Guarded Command Language (Promela, SPIN)
* Process Calculi (CSP, FDR)
* Labelled Transition Systems (Petri Nets, mCRL2)
* Just drawing a diagram and figuring out the semantics later (UML)
* Abstraction over an existing programming language
This is all real messy, and I see lots of overlapping concepts and unclear cases. Are state machines their own thing, part of LTS, or an implementation detail of the specification approach? Should we be distinguishing between the mathematical formalisms and the languages themselves? What about the vaster world of specifying code and not designs? etc etc etc
Agreed! Phylogeny is messy. And thinking a bit more I realize I'm wading deep into "cereal-is-soup" territory, which is a sure recipe for unhappy diners.
PlusCal is very much in the ALGOL family (with the extension of nondeterminism, which is needed to serve its purpose of verification).
TLA+ is similar to Prolog, extended with a next-state operator and the temporal operators (again, to serve its niche purpose).
(Though the main TLA+ implementation, TLC, doesn't implement logic variables as richly as Prolog does, so coding style within a relation tends to be a bit more imperative. They do exist in the language, though, via the unbounded ∃ operator.)
Yes, having used both TLA+ and Prolog extensively, they share many similarities (both having roots in first-order logic). In practice -- TLA+ code is often a bit more "imperative" owing to the fact that unbounded existential quantification isn't implemented in the TLC interpreter (whereas in Prolog, it is a core aspect of any interpreter).
And of course, TLA+ has a suite of temporal operators used exclusively for declaring contracts in service of its role for verifying specifications. Which (as @hwayne points out elsewhere) stem from mathematics -- Linear Temporal Logic specifically -- and not programming.
[1] Daniel Weise and Roger Crew, "Programmable Syntax Macros", ACM SIGPLAN, 1993, https://dl.acm.org/doi/pdf/10.1145/173262.155105
[2] Christopher Strachey, "A general purpose macrogenerator", Computer Journal, Vol 8(3), 1965, pp. 225-241
[3] Calvin Mooers, "TRAC, a procedure-describing language for the reactive typewriter", CACM, Vol 9(3), March 1966, pp. 215-219, https://dl.acm.org/doi/10.1145/365230.365270