Worth noting that bjz is a compiler hacker, and this is a list of "languages" from a language implementor's perspective. From the perspective of a user of the language, nearly all of these are just "Rust, the language"; to a user there's no such thing as e.g. a "trait language" or an "attribute language", there's just Rust, which has traits and attributes in the same way that JavaScript has objects and closures without having an "object language" or "closure language". Of the things that can be described as actual languages there, Unsafe Rust is just a minimal superset of Safe Rust, whereas Const Rust is just a subset of Safe Rust; it's not three languages to learn, it's one language where a handful of things are either available or not in certain contexts. The one thing that legitimately is its own language is the macro-rules language, which is a DSL for syntax manipulation.
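For anyone who hasn't touched it, here is a rough sketch of what that syntax-manipulation DSL looks like. The macro name `square_all` and the helper `squares` are made up for illustration; the point is only that the matcher/transcriber notation (`$x:expr`, `$(...),*`) reads differently from ordinary Rust.

```rust
// Hypothetical macro, for illustration only.
macro_rules! square_all {
    // Matcher: zero or more comma-separated expressions, optional trailing comma.
    ($($x:expr),* $(,)?) => {
        // Transcriber: expand each matched expression into its square.
        // Binding to `v` first means each expression is evaluated only once.
        vec![$({ let v = $x; v * v }),*]
    };
}

fn squares() -> Vec<i32> {
    square_all!(1, 2, 3)
}

fn main() {
    println!("{:?}", squares());
}
```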
Not quite sure what you mean by sublanguages - procedural and declarative macros perhaps?
In any case I wonder about the simplicity/complexity dance. Simplification ends up with something like Lisp, in which you write DSLs that each have to be learned independently. Elixir does something similar. Some differentiation might just be a natural consequence of chasing ergonomics.
If I had to guess: Safe Rust, Unsafe Rust, proc macros, declarative macros, the attribute language, and maybe async Rust (although I'd really just consider this sugar on Safe Rust).
I love Rust, but I would love to see more syntactic unification between these components. Both macro languages, in particular.
As far as I'm concerned any programming language either has a good, builtin, type safe macro system, or people are going to use code generation (on large complex projects). And I never want to have to work on something with custom code generation.
TypeScript seems to have embraced the third option, which is having such a batshit insanely expressive dynamic type system that you don't need either 99% of the time while still having proper type safety.
Unsafe Rust is a minimal superset of Safe Rust, proc macros aren't their own language, there's no such distinction between Rust and "the attribute language", and async Rust isn't its own distinct thing, it's all just Rust.
Unsafe Rust is, in the Rust Language book's own words, a "second hidden language" within Safe Rust[1]. You're absolutely right that it's a superset; that's what makes it distinct and therefore a unique language with unique semantics (even if the syntax is nearly identical).
Procedural macros can be used to define new languages within Rust, which means that grokking them requires the developer to understand their expressive capacity. For example, the paste[2] crate introduces a little bit of novel syntax for combining identifiers.
Attributes are the counterpart to Unsafe Rust: their syntax is described separately from the rest of Rust[3]. Understanding how to use them involves components of Rust that aren't tied to the syntax of programs (e.g. doc and feature attributes, which connect to Rust's standard tooling instead).
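A small sketch of what I mean by attributes that talk to the tooling rather than to the program logic. The `Point` type and `build_kind` function are hypothetical, and I'm using `cfg(debug_assertions)` as a stand-in for conditional compilation in general:

```rust
// `doc` feeds rustdoc; writing `/// ...` above an item is shorthand for it.
#[doc = "A 2D point (hypothetical type, for illustration only)."]
// `derive` asks the compiler to generate trait implementations.
#[derive(Debug, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// `cfg` includes or drops an item depending on how the crate is built.
#[cfg(debug_assertions)]
fn build_kind() -> &'static str {
    "debug"
}

#[cfg(not(debug_assertions))]
fn build_kind() -> &'static str {
    "release"
}

fn main() {
    let p = Point { x: 1, y: 2 };
    println!("{:?} (sum {}) built in {} mode", p.clone(), p.x + p.y, build_kind());
}
```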
I agree that async Rust isn't its own distinct thing. I threw that one in as a possible interpretation, to make the count work.
> that's what makes it distinct and therefore a unique language with unique semantics
Unsafe Rust doesn't have different semantics from Safe Rust, it's the exact same language but it allows more operations. Which is to say, for any given expression that compiles with Safe Rust, if you wrap that expression in an unsafe block it will have precisely the same semantics. For the book to describe it as a "second, hidden language" is being a bit fanciful on its part; it later goes on to clarify that it just gives you a handful of extra powers.
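To make that concrete, here is a minimal sketch (function names are hypothetical). The same expression appears once plain and once wrapped in `unsafe`; the only difference the compiler sees is an unnecessary block, which it would normally report through the `unused_unsafe` lint:

```rust
// Hypothetical helper names, for illustration only.
fn sum_plain(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

// Same expression, wrapped in `unsafe`. Nothing inside needs it, so rustc
// would warn about an unnecessary `unsafe` block; the lint is silenced here.
#[allow(unused_unsafe)]
fn sum_wrapped(xs: &[i32]) -> i32 {
    unsafe { xs.iter().sum() }
}

fn main() {
    let xs = [1, 2, 3, 4];
    println!("{} {}", sum_plain(&xs), sum_wrapped(&xs));
}
```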
> Procedural macros can be used to define new languages within Rust
Sure, but by that logic, Rust (and every other language that supports DSLs) has infinite languages inside of it, which does not amount to a particularly useful distinction; either a language supports crafting DSLs or it doesn't. To suggest that this means that the language itself has infinite sublanguages obscures its actual sublanguages, such as the built-in macro-rules DSL.
> Attributes are the counterpart to Unsafe Rust: their syntax is described separately from the rest of Rust
This is a misunderstanding of the reference manual (which is itself non-normative and fairly ad-hoc in structure); most pages contain their own grammar specification, e.g. the syntax for items is described at https://doc.rust-lang.org/reference/items.html , but there is no user-facing item sublanguage in the same way that there is no user-facing attribute sublanguage. Attributes are attributes, they're just a part of Rust.
> Unsafe Rust doesn't have different semantics from Safe Rust, it's the exact same language but it allows more operations.
"Allows more operations" seems like a straightforward meaning for "different semantics" to me. In Safe Rust, you can't access union members. In Unsafe Rust you can.
I agree that it's a little fanciful, but I also don't think it's really wrong.
> To suggest that this means that the language itself has infinite sublanguages obscures its actual sublanguages, such as the built-in macro-rules DSL.
In many languages, the macro language is either homoiconic or nearly homoiconic. In Crystal, for example, the macro DSL looks like normal Crystal, but with a few extra sigils. You can't invent new syntax in it.
When we talk about understanding C we generally take that to include the C preprocessor, despite the latter being a conceptually separate macro language. I think the same standard applies to procedural macros in Rust (which are much, much better than the preprocessor): you need to understand a separate set of token production and consumption rules that interact but do not align with Rust's core syntax.
> This is a misunderstanding of the reference manual
I wasn't referring to the production rules themselves; I'm aware they're on most pages. I was referring to the fact that the attribute language is explicitly taken directly from a separate language (C#).
> "Allows more operations" seems like a straightforward meaning for "different semantics" to me. In Safe Rust, you can't access union members. In Unsafe Rust you can
The semantics are the same. The meaning of the operation is the same. In both safe and unsafe code, accessing a union member is the same operation with the same meaning. The only difference is that in Unsafe Rust you are allowed to do it. This is a difference in permission, not meaning.
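A quick sketch of the union case, with a made-up `IntOrFloat` type. Constructing the union is allowed anywhere; it is specifically reading a field that the compiler rejects outside an `unsafe` block (error E0133), and the read means the same thing either way:

```rust
// Hypothetical union, for illustration only. Both fields are 4 bytes wide.
union IntOrFloat {
    i: u32,
    f: f32,
}

fn bits_of(x: f32) -> u32 {
    // Constructing a union value needs no `unsafe`.
    let u = IntOrFloat { f: x };
    // Reading a field does: the compiler cannot know which field was
    // last written. Without the block this is error E0133.
    unsafe { u.i }
}

fn main() {
    println!("{:#x}", bits_of(1.0));
}
```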
Unsafe Rust is just Rust, but there are a few extra things you can do inside unsafe blocks that you can't normally do outside them (because they're unsafe). I wouldn't call it a different language.
See my response to 'kibwen. We can nitpick all day about whether it's really a separate language; I think it satisfies at least one important definition of "language" by having its own separate syntax and semantics that aren't accessible to Safe Rust.
But by that argument every feature in a language is its own separate language. "This is the language that can only be written inside function definitions", "This is the language that can only be written inside ternary operators" and so on.
The list I gave was the ones I've heard people in the Rust community, including official Rust language documents, refer to as separate languages. Does that guarantee that we're carving the nature of "computer language" at its joints? No; it's merely a shorthand for expressing the range of syntaxes and semantics that you need to know to "fully" know Rust.
It's the same syntax. There are no special operators in Unsafe Rust. All unsafe code is syntactically valid safe code. The reason it will not compile is that certain operations are not permitted, not that there are syntax errors.
These two are the exact same language. Some features (like calling unsafe functions, or dereferencing raw pointers) are "locked" outside unsafe blocks, but syntactically and semantically they are identical. It's easy to show this: put an unsafe block around a normal "safe" block, and all that happens is an extra "this does not need to be an unsafe block" warning from the compiler.
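Here is a minimal sketch of one of those "locked" features, using a hypothetical `read_through_raw` function. Creating the raw pointer is ordinary safe code; only the dereference has to sit inside the `unsafe` block, and the `*p` syntax is the same one used for references:

```rust
// Hypothetical function, for illustration only.
fn read_through_raw(value: &i32) -> i32 {
    // Coercing a reference to a raw pointer is allowed in safe code.
    let p: *const i32 = value;
    // Dereferencing it is not: outside `unsafe` this is error E0133.
    // The pointer comes from a live reference, so the read is sound here.
    unsafe { *p }
}

fn main() {
    let n = 42;
    println!("{}", read_through_raw(&n));
}
```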
My other responses say this in more detail, but: they're not the exact same language if they have separate semantics and syntaxes, which they do. The fact that they can be composed together (much like C and the C preprocessor) does not make them the same language; it just makes them compatible.
There's something wrong if a language is so unsuited to programming itself that it needs another language - or five more languages - to get the job done.
This is one of the great things about JavaScript - it's flexible enough to do everything in its ecosystem.
I don't think that's a good metric. Most of the most successful programming languages of all time have had at least one macro or attribute language attached to them.
I find it hard to believe that there's no way to get the benefits of Rust without 5 languages.
Why can't a programming language be one language that does everything it needs to do? Seems crazy to me to have 5 or 6 sublanguages.