There was a Guy Steele talk a while ago on this topic. He also gives searchable names for some of the things (eg the ‘2d inference rule diagrams’). He calls it ‘computer science metanotation’, though I think it’s more of a PLT thing. https://m.youtube.com/watch?v=dCuZkaaou0Q
> The most popular programming language in computer science has no compiler, interpreter, or complete specification. It is remarkably concise. It grew over decades; today, dozens of variations are in use. Its complexity has reached the point where it needs to be re-explained every time it is used—but too often it is not; instead, each variant is more or less taken for granted. This has led to ambiguities and inconsistencies. Much effort has been spent in hand-translating, in both directions, between this language and other languages that do have compilers, primarily for the purpose of implementing yet other languages.
To me this seems like a pretty sorry state of affairs. The folks trying the hardest to speak precisely about the most precise human art use ambiguous and inconsistent notation. wat?
This is also the notation that is used in logic as well.
The thing about it is that, while every paper can have a slightly different variant of this metalanguage, all texts are supposed to contain an introduction defining exactly what it means. That's because it's common that, if you are talking about a new logic, you will have new syntax as well, or a slightly different way to make inferences.
The notation goes back to Frege. It's not easy to google if you don't know what you're looking for, but this seems like a reasonable summary: https://plato.stanford.edu/entries/frege-logic
You can see that the turnstile symbol |- was already in use, and that the horizontal line (called "Fregescher Schlussstrich", Frege's conclusion stroke, in my class) was apparently originally part of the turnstile itself, only becoming a separate element in modern notation.
Agreed, I stumbled through a course on essentially TAPL at university, and bought & read Pierce's TAPL to help. It's a great book, but the submission would've helped me a lot as an additional primer. I didn't have the background in mathematical logic for it frankly.
(It was a third or fourth year CS course, but where everyone else would've had more formal logic in first and second year, I'd had the electronics part of my programme, which was actually based in EE. I naïvely thought I'd read enough to have the requisite background, and I absolutely hadn't, or at least not enough to understand & follow fast enough.)
Despite a CS major, I still get confused about the meaning of |- vs. |=, and about the respective meta-syntactic levels of the variables used. Ironically, one issue for me is the lack of explicit typing in the notation itself.
This reminds me that the “model” terminology also confuses me. A “model of X” for me is an idealized and simplified/abstracted version of X, whereas in mathematical logic it seems to be the opposite, a concretization/instantiation.
To anyone hesitating to read this: it's an explanation of type system notation in Computer Science papers, basically a primer on BNF notation, inference rules, etc for type systems.
Honestly, I just need a cheat sheet for how to read the symbols using english words. I understand the concepts of the logical application of typing, but I don't read computer science dissertations regularly enough to make them the mapping of symbols to meaning stick.
> Honestly, I just need a cheat sheet for how to read the symbols using english words.
Amen… however I believe the existence of such a cheat sheet would represent an existential economic risk for PLT researchers, and they’d be incentivized to add more Greek (and, eventually, Cyrillic, then Czech, then eventually Xhosa and probably Klingon) symbols in order to retain access to funding. It’s only a matter of time before someone reinvents the paper town inside of PLT papers, if they haven’t already.*
* the above is meant to be taken facetiously
> I understand the concepts of the logical application of typing, but I don't read computer science dissertations regularly enough to make them the mapping of symbols to meaning stick.
I just wish CS (and, indeed, mathematics) was even remotely internally consistent… cross the boundaries between any two extremely narrow domains and ω can mean anything. It’s (very) vaguely consistent within PLT, but even there getting lowercase omega “stuck” in your head as having one English translation is just a pathway to pain.
I read `A |- x : t` as "In context A, x has type t". An entire inference rule (i.e. the horizontal bar separating some premises from a conclusion) is read as as "if P, Q, ... hold, then R holds"; or, sometimes, "We obtain R if we have P, Q, ...".
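To make that reading concrete, here's a toy sketch (entirely my own encoding; `type_of`, `ctx`, and the tuple representation are hypothetical, not from any paper) of the judgment `A |- e : t` and an inference rule for addition as a checker:

```python
# A minimal sketch of the judgment "A |- e : t" and the rule
#
#   A |- e1 : Int    A |- e2 : Int
#   ------------------------------   (T-Add)
#        A |- e1 + e2 : Int
#
# The context A is a dict mapping variable names to types.

def type_of(ctx, expr):
    """Return the type of expr in context ctx, or raise TypeError."""
    if isinstance(expr, str):      # variable: look it up in the context A
        return ctx[expr]
    if isinstance(expr, int):      # integer literal
        return "Int"
    op, e1, e2 = expr              # compound expression, e.g. ("+", e1, e2)
    if op == "+":
        # premises: both subterms must have type Int ...
        if type_of(ctx, e1) == "Int" and type_of(ctx, e2) == "Int":
            return "Int"           # ... then the conclusion holds
    raise TypeError(f"ill-typed: {expr!r}")

# "In context {x: Int}, x + 1 has type Int"
print(type_of({"x": "Int"}, ("+", "x", 1)))  # Int
```

The recursive calls are exactly the premises above the bar, and the returned type is the conclusion below it.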
That works if typing rules are unambiguous. But you might have |- x : t1 as well as |- x : t2 (for example, numeric literals in Haskell can have several types).
Instead of saying "x has type t", one could think of it as "x can be inferred to be of type t" or "we can derive/prove that x has type t". The latter mirrors the usage in formal logic, where the turnstile symbol generally means "is provable".
This is all true, although nothing about the reading "x has type t" implies that "x" can only have one type. It is a separate metatheorem about your system of inference rules that terms have unique types -- and not even one that is always true, e.g. in the presence of subtyping.
> for example, numeric literals in Haskell can have several types
Technically, numeric literals in Haskell have precisely one type: `(Num a) => a`. The literal `42` is actually interpreted as the expression `fromInteger (42 :: Integer)`, where `fromInteger` is a function on the `Num` typeclass. In turn, the `Num` typeclass constraint desugars to a function taking a dictionary of functions, `NumDict a -> a`. The typeclass machinery just means GHC will infer the appropriate instance whose dictionary should be provided.
(technically technically, GHC can be even cleverer if it knows statically which dictionary will be used, avoiding using the dictionary entirely; but this desugaring to dictionaries is ground truth, semantically.)
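The dictionary-passing desugaring described above can be sketched in Python (this is an illustration of the idea only, not GHC's actual implementation; `NumDict`, `num_int`, etc. are hypothetical names):

```python
# Sketch of dictionary-passing: a "Num a" constraint becomes an explicit
# record of functions, and a literal like 42 becomes fromInteger(42).

class NumDict:
    """Explicit dictionary standing in for the Num typeclass (simplified)."""
    def __init__(self, from_integer, add):
        self.from_integer = from_integer
        self.add = add

# Two "instances": roughly Num Int and Num Double
num_int = NumDict(from_integer=int, add=lambda a, b: a + b)
num_float = NumDict(from_integer=float, add=lambda a, b: a + b)

def double_literal(num, n):
    # Haskell's  double x = x + x  applied to a literal: double 21.
    # The literal desugars via the dictionary's fromInteger.
    lit = num.from_integer(n)
    return num.add(lit, lit)

print(double_literal(num_int, 21))    # 42
print(double_literal(num_float, 21))  # 42.0
```

The same literal yields an `int` or a `float` depending purely on which dictionary is passed, which is the sense in which it "has several types".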
One of those rare comments that I can't intuit the meaning of even though I know the definitions of all the words. In so far as this is math, this is completely, perfectly, thoroughly, abstract.
So did you actually mean to say reified, i.e. made concrete? Or did you mean to say obviated, i.e. made irrelevant? And you think computer science did this? The thing where the point of the game is to build more and more abstract representations of things?
Like what do you think the word "abstracted" actually means?
The comment is praise for maintaining compatibility. To me, the sentence "It’s a testament to computer science that this has been so thoroughly abstracted over the years" means: "It’s a testament to computer science that the original notation has been successfully applied to many things without drastic changes". I don't know whether that is one of the standard definitions of the past tense verb "abstracted", but it's a common usage in software engineering.
> has been successfully applied to many things without drastic changes. but it's a common usage in software engineering.
give me an example sentence. let me try one: "<protobufs> have been abstracted over the years because they have been applied to many things without drastic changes". doesn't work
the only way in which "abstracted" is used in SWE is "abstracted away" or "abstracted out", which means ~the details aren't important or ~it's being handled by someone/something else for you. having written out this response i am now 100% sure it's the former in this case.
no one seems to get that what i'm making fun of here is this person reveling in the fact that they don't have to understand type theory "like our forefathers did" but then stating that fact using theory speak. it's like which is it: do you care about abstractions and abstracting things out (and thus should be fine with type theory) or are you a blue collar dev that just copy-pastes. you can't have it both ways (certainly you might try but this leads to the cognitive dissonance on display here).
It seems clear enough to me what the GP might have meant, at least clear enough that it need not be responded to with remarks like "Like what do you think the word "abstracted" actually means?" that reads to me as saying that, whatever the GP thinks, it is wrong.
Type notation presumably started off as a very concrete thing—probably just about anyone can read something like BOOL = TRUE | FALSE. But then—I'm speaking here from a general perspective of computer-theoretic notation, so it's entirely possible that these are not actual notations that occur in type theory—things were added like syntactic sugar, say, for character classes, about whose abstract-ness we can argue but which definitely adds a decoding layer for people not used to it; and things like parametrized types/higher types that very definitely do add abstraction. It's very likely that, almost any time a definition benefits from an extra side remark, "For example, …", it is illustrating an additional abstraction.
> let me give you an example sentence and you tell me if the word "abstracted" makes sense, in context:
> "the accounting spreadsheet abstracts away all of the calculus because all the numbers are written down exhaustively"
I'm not even sure what the meaning is supposed to be, because I would not expect to use a spreadsheet to do calculus. (Of course there are ways of forcing calculus into a spreadsheet, but I wonder if you just meant 'calculation'.) Here I would say that it is absolutely relevant to speak of a spreadsheet abstracting away the details of a computation: the logic of the spreadsheet will probably refer often to the contents of cells, without worrying about what the contents are, and that is very definitely a kind of abstraction.
I would certainly not call the process of entering numbers into a spreadsheet an abstraction; but it seems to be the opposite of what type theory does. For example, type theory allows general reasoning about families of types, without reasoning case by case for each individual one. This seems to be the very essence of abstraction (in which capacity it is, conceptually, not completely distinct from the formulas in a spreadsheet).
If you see no abstraction, only concreteness, in type-theory notation, I suspect that is more an indication that you have learned to see through the abstraction than that it isn't there.
how does it make sense to utter the sentence "this has been abstracted [away]" when the `this` is itself the essence of abstraction.
i'm not really sure how people aren't getting what i'm putting down here: type theory can't be abstracted away because type theory itself is the thing that does the abstracting.
> > It’s a testament to computer science that this has been so thoroughly abstracted over the years
As your brackets indicate, you are reading a word "[away]" that is not explicitly present in the text. It seems to me to make more sense to read it without adding "away". (Wiktionary, for example, does not require that, or any, preposition in sense 6 of https://en.wiktionary.org/wiki/abstract#Verb, which is the one that seems to me to fit here.)
For example, I am a mathematician, and I could say that mathematics has been abstracted over the years (millennia!). I do not mean that the mathematics has been abstracted away, which I agree with you would be in some sense meaningless, but rather that the mathematics itself has literally been abstracted: from only being able to count and do arithmetic with specific real-world objects, to being able to count and do arithmetic with numerals that did not refer to a specific thing, to being able to use letters to refer to numerals of unknown or variable value, to … well, there's not such a clear linear progression after that, but there's definitely a tower of abstraction. I hope that this sense of "mathematics has been thoroughly abstracted over the years" is an intelligible one (even if phrased differently from the way I assume you, or perhaps even I, might phrase it ourselves). It is in that sense that I take the sentence "It’s a testament to computer science that this [meaning, I think, notation in type theory] has been so thoroughly abstracted over the years".
1. Separated or disconnected; withdrawn; removed; apart. [First attested in the mid 16th century.]
2. (now rare) Separated from matter; abstract; ideal, not concrete.
3. (now rare) Abstract; abstruse; difficult.
4. Inattentive to surrounding objects; absent in mind; meditative.
three of these four connote the same thing as "away".
i mean what's the point of fighting so hard to make up an interpretation of this word that doesn't exist when the obvious interpretation clearly fits: this person is happy that the type theory isn't relevant anymore.
> i mean what's the point of fighting so hard to make up an interpretation of this word that doesn't exist when the obvious interpretation clearly fits: this person is happy that the type theory isn't relevant anymore.
I agree that it's probably not worth two or more people, none of whom is ljm, fighting over what ljm might have meant, but neither I nor hathawsh (https://news.ycombinator.com/item?id=37149803) reads the comment the way that you do, so I am not so sure that this is the obvious interpretation, and I am fairly sure that it is not the most charitable interpretation.
The interpretation I reference in https://news.ycombinator.com/item?id=37152933 does exist, and is supported by a dictionary definition just as yours is; so, without addressing whether it is obvious or whether it fits—surely subjective matters—it is certainly a possible interpretation, just as yours is.
anything is possible, including that none of these words are actually the words we think they are and we're actually communicating across an interdimensional gateway that's doing the translation for us. it is certainly possible, but just like with which interpretation of abstracted to take, it's certainly not probable.
Type theory started being developed before CS and computers were a thing. It was developed as a foundation for logic and math after naive set theory was shown to be self-contradictory. Its notation developed alongside the general development of math notation as logic and model theory evolved over time.
Lambda calculus is an abstraction of computation - but it needs to be defined, somehow, and you can use prose ("a term is a character representing a parameter or value, an application of a term to another term is expressed by writing the two terms with a space between them") or rules ("term := x | y | z ; app := x x") where the rules are an abstraction of the ideas in the prose.
Similarly you can describe substitution with words or a formal syntax. The latter is usually more concise and less ambiguous.
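As a sketch of how the prose becomes formal rules, here is substitution on lambda terms written as code (my own toy encoding; the tuple representation and the naive treatment of capture are simplifying assumptions):

```python
# Terms are variables, applications, or abstractions, mirroring the
# grammar  term := x | (term term) | (\x. term).
# subst implements e[x := v] by cases on the term.
# (Naive: assumes v has no free variables that could be captured.)

Var = lambda name: ("var", name)
App = lambda f, a: ("app", f, a)
Lam = lambda x, body: ("lam", x, body)

def subst(term, x, v):
    tag = term[0]
    if tag == "var":
        return v if term[1] == x else term
    if tag == "app":
        return App(subst(term[1], x, v), subst(term[2], x, v))
    if tag == "lam":
        y, body = term[1], term[2]
        if y == x:                 # x is shadowed under this binder: stop
            return term
        return Lam(y, subst(body, x, v))

# (\y. x y)[x := z]  ==  \y. z y
print(subst(Lam("y", App(Var("x"), Var("y"))), "x", Var("z")))
```

Each branch of the function corresponds to one case of the usual prose definition, which is exactly the conciseness the parent comment is pointing at.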
Lambda calculus is arguably far more abstract, because type theories often imply a lot of implementation constraints, which makes sense since that's kind of the point.
> 𝗍𝗋𝗎𝖾+2:𝖨𝗇𝗍 means “𝗍𝗋𝗎𝖾+2 has type 𝖨𝗇𝗍”, which makes even less sense, as the expression 𝗍𝗋𝗎𝖾+2 is nonsense and does not have any type at all.
Well, Python called: there, `True + 2` actually is an int and is equal to 3. :) This doesn't say anything about whether this should be the case, just that it is.
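For the curious, this is easy to verify in a Python REPL; `bool` is literally a subclass of `int` in Python's object model:

```python
# In Python, bool subclasses int, so True behaves as the integer 1.
print(True + 2)               # 3
print(isinstance(True, int))  # True
print(True == 1)              # True
```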
If you think that True + 2 makes sense, you can just define your own judgments that allow it (*). Logic, and the theory behind type systems, don't care about your axioms and inference rules, they just let you reason about them, and their interactions.
(*) Something like
|- True : Bool
|- True : Int
or, if you only want to allow it in certain expressions
Even though 'True + 2' doesn't throw an error in Python or C, it's still stupid because it makes the semantics of the language hard to reason about, in order to give programmers a little syntactic sugar.
This is for backward-compatibility reasons. The comparison operators in Python produced 1 or 0 before booleans were a thing, and programs existed that exploited this, e.g. you can do something like `(a < b) - (a > b)` in a sorting algorithm. Same thing in C, or even more so, as the comparison operators there produce an `int` instead of a `bool`.
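The trick mentioned above still works in modern Python precisely because bools coerce to 0/1 (the function name `three_way` is mine):

```python
# (a < b) and (a > b) evaluate to True/False, which coerce to 1/0,
# so their difference is a three-way comparison result.
def three_way(a, b):
    return (a < b) - (a > b)

print(three_way(1, 2))  # 1
print(three_way(2, 1))  # -1
print(three_way(2, 2))  # 0
```

(Note the sign convention here follows the quoted expression: it returns 1 when a < b. Python 2's old `cmp(a, b)` used the opposite sign, i.e. `(a > b) - (a < b)`.)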
>Even though 'True + 2' doesn't throw an error in Python or C, it's still stupid because it makes the semantics of the language hard to reason about
It's consistent. If we add a 64 bit integer and a 32 bit integer, we'd expect to get a 64 bit integer as the result. Same with if we add a 64 bit integer and a 4 bit integer. So why shouldn't it also apply if we add a 64 bit integer and a 1 bit integer? Mathematically, bool is a 1-bit integer with fancy names for 1 and 0.
Because booleans are not 1-bit integers. They're not integers at all.
The only thing computers represent directly are ones and zeros. 1 and 0 are not integers either; they're just high or low voltage levels on wires. The fact that we label them "1" and "0" is just for convenience; it doesn't make them integers.
Even addition itself is an abstraction. As far as the CPU is concerned, addition is just a multi-bit XOR operation coupled with some AND operations.
We abstract sets of 1s and 0s to mean numbers, pointers, characters, booleans, symbols, arrays, structures, etc. and we abstract operations on them as well. If we don't make our abstractions explicit, we cannot reason about what results to expect from a computer's operations, nor can we enlist computers to help us with that reasoning process.
By your argument I might as well add integers to strings or arrays or structures -- they're all just bits after all. We might as well not have types at all and just let the programmer keep track of all operations manually. That's assembler programming and it's extremely bug-prone and it doesn't scale.
Sometimes I just hate it when someone gives away my hard-earned arcane knowledge for free ;) I sure wish this post had existed when I learned this stuff. I hope increased accessibility leads to fewer shit languages though.
When I was reading the Ada Reference Manual[1], I immediately recognized this kind of syntax. I didn't actually know what it was called, but it's interesting to see it in practical use. The whole language is defined in that kind of notation.
The Ada Reference Manual specifies the notation they use. They're using a variant of Backus-Naur Form and they describe their particular variant in the section I linked above.
This seems like maybe a right place to evangelize [1] the only hill I've decided to die on: the formatting of type annotations that use a colon in their syntax. Specifically, there should be an equal amount of space on both sides of the colon. I keep meaning to write a proper essay, but this is the essential argument.
I think of it this way: there are two symbols that by coincidence are written the same (two aligned dots). I mean that genuinely (as in, I actually think of them as separate things), but it also helps as an explanative device.
The first I'll call the prefatory or label(ing) colon. Its usage matches usage in English [2], where the text preceding the colon introduces the text after the colon, or the thing on the left is a label or name for the thing on the right. For example, it is used to start a block (as in Python), or to define a key-value pair, or define a name-value pair in a struct (as in C or Rust).
The second is the type annotation. This syntax is borrowed from mathematics. It is a binary relation, and binary relations are written with equal space on the left and right. Just as you'd never write "x= 1" (another relation), "x> y" (relation), or "x+ z" (operator), you'd never write "x: X". Instead you write "x = 1", "x > y", and "x + z" or "x+z" instead.
Whenever I see "a: b", I immediately think labeling colon. Despite having seen it thousands of times, I always have to perform an additional mental step (however trivial) when it turns out to be a type annotation; it takes a small amount of mental energy to dismiss each instance of syntax that looks wrong.
[Truthfully, I'm somewhat baffled how it would ever even cross anyone's mind when designing a language to write type annotations like "x: X" given the long-established, pre-existing precedent in mathematics and the way its semantics seem backwards (if you had to use a labeling colon, "X: x" would make much more sense to me).]
edit: I should be explicit that I'm referring to programming language syntax, and that I much prefer "x : X" over "X x" (there are good reasons I think for the type to be on the right-hand side, but other people have already written about that).
[1] "Evangelion" is a really lovely word, from εὐαγγέλιον good news = εὐ- good + ἄγγελος messenger, the double gamma pronounced even in ancient Greek as "ŋg" ("ng") rather than "ːg" ("g").
You may be operating under several misconceptions. Firstly, the notation f: X->Y, putting more space to the right of the colon, does occur in mathematical writing (admittedly only 1 of the 3 books I checked used that notation exclusively). Secondly, that is still a labelling: you're labelling a map of a specific form (it's almost never used to label anything else, except when dealing with types, but that's begging the question).
I suppose it's hard to tell the difference between labelling and stating a property, if I had to give an argument I'd say that you rarely see a formula on the left hand side, except when defining new notations (e.g. let f+g : X -> Y be the map defined by (f+g)(x) = f(x) + g(x)). Compare this to something like 'let x^2 + y^2 = 1' which kind of functions as a labelling (or omits it really) but is really more stating a proposition.
One alternative meaning in mathematics for the colon that does truly differ is using ':' as shorthand for 'such that', though sometimes people use a . or no symbol at all. This is commonly used when defining a set by some proposition e.g. { x : x \in IN and x | 2}, but also sometimes when using quantifiers.
f: X->Y means f maps an input from domain X (set of all real numbers, say) to output range Y (set of all positive real numbers, say, f could be 'abs').
I think you might be thinking it's LX.Y (L for lambda)? But consider X=Y, LX.X is the identity function, some f: X->X could be the identity function, but it could also be anything else that returns a value of the same type as its argument - it could be 'add1' in this example of X being the reals, for example.
I'm not quite sure why this seems to be so controversial, except for the light abuse of notation in my english. Saying that f: X->Y means f maps an input from domain X to output in range Y is to say that we are defining a function -- and labeling it f, since it could just as easily be labeled g and still be the same function. Defining is naming -- naming is labeling [with scope].
It's not defining a function though, it's describing (by one aspect) a whole class of functions.
Take:
f: ints -> ints
f can be 'add1', 'identity', 'double', etc.
If you know that, and calling it 'defining a function' is just loose English (you didn't mean completely defining) then fine, but that's what's confusing people.
It read to me like you thought it was equivalent to a function definition in code, X & Y described what would actually happen over different inputs, so I was just trying to explain, didn't mean to offend if you knew already.
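The point that `f : ints -> ints` names a whole class of functions rather than one function maps neatly onto Python's type-hint notation (a sketch; `IntToInt` and the function names are my own):

```python
from typing import Callable

# "f : int -> int" as a type: the class of all functions from int to int.
IntToInt = Callable[[int], int]

# Many different functions inhabit that one type:
add1: IntToInt = lambda n: n + 1
identity: IntToInt = lambda n: n
double: IntToInt = lambda n: n * 2

for f in (add1, identity, double):
    print(f(3))  # 4, 3, 6 -- same type, different functions
```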
Oooh, I see what you're saying. For want of an "s". Yes, in this context, I am aware that it is a whole class of functions -- and that f is a label being affixed to that class of functions. Sorry for the confusion -- that makes much more sense now.
No offense received -- was just genuinely confused as to why it netted two comments and a few downvotes for a seemingly uncontroversial statement. But then again, we're talking mathematical logic here, so loose english is of course reasonably discouraged.
Thanks for the clarification, and I hope you have a great evening!
That is an interesting view. What you said about the extra mental step is true for me when I read the common "X x" notation, which is astonishing because I've seen it vastly more often than "x: X". The latter reads very naturally for me, and I suspect that, at least for my brain, it is close to how the colon is used in natural language. Often you have a proposition, and what follows after the colon adds detail to the proposition or describes it further - exactly like a type is additional information about the thing to its left.
Pretty sure it's standard practice to always write "t[space]:[space]T" with equal space on both sides of the colon. In spite of Type Theory generally being an inconsistent mess, this is a rare case where everyone seems to be fairly consistent. I was actually curious how I did it in undergrad, and even I had the common sense to make it look nicely symmetrical[1] :)
Ah, I should've been more explicit that I was referring to programming language syntax. For example, this is unfortunately not the convention in Python, Rust, Swift, or TypeScript.
Ah gotcha, I actually 100% agree here. I think both the parser (and my brain) would probably be better served by space on both sides of a variable : type (or type : variable) declaration.
For my part, while I have some mixed feelings about introducing whitespace significance, I sympathize with the OP, and in TypeScript projects I have sometimes wished for more intuitive disambiguation. As for conventions, I might prefer
I try to do this at home [eg 0], and it flabbergasted some coworkers when one such notation accidentally slipped into a day-job PR last year. I didn't try to argue for it, but I like it personally, mainly because it makes control flow easier to skim.
That doesn't match my experience with the usage of a labeling or prefatory colon (continuing with my ad hoc terminology). I'd read it as
Here's an x: X.
– which is nonsensical (the type X is not an example of the instance x; indeed, it's the other way round), or as
It's the case that x: more detailed things that explain or follow from x are X.
– which is also backward (since a type is a more general thing than one of many specific instances of the type), or as
Here's a quick premise x: the main point is X.
– which is wrong because x is more relevant (because it's a local variable, whereas X is likely global and shows up in many places; indeed, within a particular scope many variables may simultaneously have the same type, but a name refers unambiguously to a particular variable), or as
I think I would completely agree for you if it were a definition and not a type annotation!
That is, "x: 5" instead of "x = 5" (or "x ← 5", &c) looks fine to my eyes. And of course, that's how values for the members of a struct are in fact defined in many languages (such as Rust).
I suppose the underlying point is that has-type is a relation, not a definition. Asymmetrical space around a binary relation violates (my) experience, but it's also a bit confusing to use the same syntax for definition and type annotation.
You may have a broader scenario in mind, but my experience is limited to languages like Rust and C where the type is linked to an object the moment that object is declared for the first time, and can't be changed later. These languages do use the same syntax for definition and type annotation.
I will remember this conversation when I encounter a language that allows this (declaration in the first line, type annotations follow):
var x
x : Integer
x = 5
...
x : String
x = "hello"
> Truthfully, I'm somewhat baffled how it would ever even cross anyone's mind when designing a language to write type annotations like "x: X" given the long-established, pre-existing precedent in mathematics and the way its semantics seem backwards (if you had to use a labeling colon, "X: x" would make much more sense to me).
What maths precedent are you refering to? The thing I would consider most similar to today's type annotations are statements like "x ∈ ℝ" or "let x be a set of integers". In both cases the variable comes first.
Besides, it's much more practical this way. An identifier is very simple to parse and usually short, while types can be long monsters.
Putting the type first would be weird in languages that do type inference. Because there you sometimes leave out the type info, but need to add it occasionally. Then it makes sense to append it, rather than pre-pend.
I had assumed OCaml had the space because you put spaces on both sides of the colon in French.
I don’t particularly disagree with you, but one thing I think about is whether the symmetry of a relation or operator as written matches its meaning. E.g. I prefer an assignment operator not to have a vertical line of symmetry. The other operator examples you give, like < or + or =, take the same type on both sides. : doesn’t even take the same kind on each side (the lhs being a value and the rhs a type). Maybe it should be written in an asymmetric way because of this.
I guess because StackExchange has higher Google ranking and allows comments, easy edits, supports latex etc. It's a great platform for stuff like this.
Also, the langdev SE is still in beta, and pretty early beta at that; posting useful material there helps get it off the ground. That said, I wouldn't be surprised if Alexis also posts it to her blog at some point.
It is used in The Definition of Standard ML to define the syntax and semantics of the language. That is the most "in the wild" usage I'm familiar with, in the sense that it gives a formal specification of the language, which has been used successfully across multiple implementations of the language...