The Derivative of a Regular Type is its Type of One-Hole Contexts (2001) [pdf] (strictlypositive.org)
109 points by pxeger1 on Jan 12, 2023 | 39 comments



Some related links (specifically that they perform "high school" algebra on types):

- Seven trees in one [0], a paper showing a bijection between binary trees and seven-tuples of binary trees.

- The algebra of algebraic data types [1] discusses Taylor expansions of data types.

- Differentiation of higher-order types [2] comments on the OP's paper.

I used to be really interested in type theory and haven't had as much time to explore it lately, but over the years I've saved these links (and the one posted).

[0]: https://arxiv.org/pdf/math/9405205.pdf

[1]: https://codewords.recurse.com/issues/three/algebra-and-calcu...

[2]: http://conal.net/blog/posts/differentiation-of-higher-order-...


> Seven trees in one [0], a paper showing a bijection between binary trees and seven-tuples of binary trees.

Is this a surprising result? I am not quite interested enough to read through the paper in order to find out. Naively I would say it sounds maybe a bit fiddly, there could be a clever trick to do it elegantly, but it does not sound too surprising or hard.

Build a complete binary tree with six nodes and attach the seven trees from the tuple as children. This is not a bijection, as trees without the complete tree do not map back into any tuple of trees, so fiddle a bit with those six nodes to encode some special tuples or classes of tuples. Okay, now you have a new problem, two encodings for some tuples, so we have to also fiddle a bit with the trees before attaching them. Yeah, that could get fiddly...


It's definitely surprising, for a couple of reasons:

1. It isn't just the uninteresting result that the set of trees has the same cardinality as the set of 7-tuples of trees; the bijection here is given by a finite, non-looping program built out of `isEmpty : Tree -> Bool`, `getLeft : Tree -> Maybe Tree`, `getRight : Tree -> Maybe Tree`, and the constructors `empty : Tree` and `join : Tree x Tree -> Tree`.

2. It isn't true for any number 1 < x < 7.

3. In any case, why should it work out exactly? Why not "one tree can be encoded into seven trees, or one of these 13 remaining cases"?
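For concreteness, that interface can be sketched in Python (my own encoding, not from the paper): the empty tree is `None` and `join` builds a nested pair.

```python
# A sketch of the tree interface from the comment above, in Python.
# Encoding (my choice): the empty tree is None; join(l, r) is the pair (l, r).
# Note the pun: get_left/get_right return the subtree itself, with None
# standing in for both "no subtree here" and "the empty subtree".
empty = None

def join(left, right):
    return (left, right)

def is_empty(tree):
    return tree is None

def get_left(tree):
    return tree[0] if tree is not None else None

def get_right(tree):
    return tree[1] if tree is not None else None
```

A "finite, non-looping program" in the comment's sense is then one built from these five operations without recursion.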

The paper is quite good, but Dan Piponi has a great blog post that recasts the isomorphism as a game of "nuclear pennies", which is a fun puzzle to work out yourself: http://blog.sigfpe.com/2007/09/arboreal-isomorphisms-from-nu...


Nice puzzle. I found an 18-step solution, but of course somebody posted one in the comments years ago.

> Why not "one tree can be encoded into seven trees, or one of these 13 remaining cases"?

This direction is no problem, at least if we accept a graph with zero vertices as a tree (which is probably non-standard, as it should then have -1 edges). If we allow it, then a trivial mapping would be T -> (T,Ø,Ø,Ø,Ø,Ø,Ø). The other direction is much more problematic: one can easily map n trees into one, but then one gets stuck with either some trees that do not map back to any tuple or some tuples that have multiple representations. So now I am really curious what makes 7 special, and I will probably have to read the paper after all.


Why doesn't this work for x=2?

Let T denote the space of binary trees.

Any 2-tuple of trees (L, R) can be written as a binary tree with left subtree L and right subtree R, so we have an injection from T x T to T.

For the other direction we can embed a binary tree S as (S, {}) (with {} an empty binary tree).

By Schröder–Bernstein there is a bijection between T and T x T.


It gives a set-theoretic bijection, but not one that plays by all the rules I mentioned above. In particular, you don't get a bijection that corresponds to a finite, non-looping program.

Generally speaking, Cantor–Schröder–Bernstein does not give you a (finite) construction of the bijection, because you have to follow the inverses of f and g back until you either end up with something from the first set with no preimage under g, or something from the second set with no preimage under f, or you find that you are iterating forever. You have to make a decision based on whether or not a certain computation terminates. That's essentially why it is a non-constructive theorem (in fact, it even implies the law of excluded middle).

But in your specific case, we're kind of in luck. The injections are simple enough that you can work out the bijection that you'd get from König's proof manually, and it goes like this:

Concretely, let's write (AB) for the tree with left subtree A and right subtree B, and write . for the empty tree. Your injections are f(A,B) = (AB) and g(T) = (T.). Alternately applying f/g partitions the set of trees into a bunch of infinite sequences. Your bijection is given by: if T is part of the sequence that begins with the empty tree, pair T with (T,.). If T is part of some other sequence, then T is not the empty tree; it is of the form T = (LR), and you should pair T with (L,R). Concretely, the sequence that starts with the empty tree looks like:

. -> (., .) -> (..) -> ((..), .) -> ((..).) -> (((..).), .) -> (((..).).) -> ...

In other words, the single trees that appear in the empty tree's sequence are the fully-left-leaning trees like ((((..).).).); all other trees are in the other sequences. So to decide where your tree goes in the bijection, you have to do this:

if (T is fully left-leaning) then (T,.) else (left-child(T), right-child(T))

And computing whether or not T is fully left-leaning involves an unbounded amount of computation. You have to actually walk the whole tree. So this bijection won't correspond to a finite, non-looping program. In a sense, the algorithm you get from Cantor–Schröder–Bernstein is not "continuous", but the one you get from the Seven Trees In One construction is.
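As a sanity check, the tree-to-pair direction of that bijection can be written out directly (a sketch, with trees encoded as nested Python pairs and None as the empty tree, my own encoding):

```python
# The tree -> pair direction of the bijection described above.
# Encoding (my choice): None is the empty tree, (l, r) the tree with
# subtrees l and r; a pair of trees is also just a Python 2-tuple.
def fully_left_leaning(tree):
    # True for ., (..), ((..).), ...: every right subtree along the spine is empty.
    while tree is not None:
        if tree[1] is not None:
            return False
        tree = tree[0]
    return True

def tree_to_pair(tree):
    if fully_left_leaning(tree):
        return (tree, None)   # T is in the empty tree's sequence: pair it with (T, .)
    left, right = tree        # otherwise T = (L R): pair it with (L, R)
    return (left, right)
```

Note that `fully_left_leaning` walks the whole left spine: this is exactly the unbounded computation the comment points out, so the function is not a finite, non-looping program in the sense above.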


Maybe Schröder–Bernstein could simplify things. It's easier to write two injections.


That theorem is not valid constructively, so there is no way to make a computer program out of it. I.e., there is no program that, given types A, B and injections (f : A -> B) and (g : B -> A), produces a bijection A -> B.


I'm pretty sure it is.


But from reading the first bit of the paper, I think they are talking about something stronger: not any bijection, but one that only requires inspecting and manipulating a bounded number of layers of the trees.




I think lenses/optics have stolen some of zippers' (very localized) thunder in recent years, but I still enjoy finding excuses to use them every once in a while. (OK, so really just a couple Advent of Code problems.)

I have never actually had the wherewithal to go through the whole automatic-derivation-of-zippers thing, but this (https://stackoverflow.com/questions/25554062/zipper-comonads...) StackOverflow Q&A (along with all of Conor McBride's SO answers - compiled at https://personal.cis.strath.ac.uk/conor.mcbride/so-pigworker...) might be of interest in conjunction with the linked paper.


I like the bit at the end giving credit to trains:

> I feel very lucky to have stumbled on this interpretation of differentiation for datatypes with such potential for both utility and fascination, and has been intriguing me since the day I made the connection (whilst changing trains at Shrewsbury). This work has benefited greatly from hours spent on trains, [...]


My favorite is from Bob Atkey's Quantitative Type Theory paper:

> This work is dedicated to Orwell the dog. Orwell was a good dog and knew well the difference between zero, one, and many

https://bentnib.org/quantitative-type-theory.pdf

(zero, one, many was the ring that the paper used.)


I'm so bummed that I can't understand this paper, coming from the lisp camp. Do I need to go to conferences or something to understand this? I read "Learn You a Haskell", but this seems to be a bit beyond that. I can't tell if the concept of a "derivative of a type" is an overload of calculus operators (totally fine) with no actual relationship, or if this is a serious suggestion of a "derivative". Huge fan of ADTs, though, great way to write composable code. So if anyone has some insight on how to get some insight, I'd love to hear it!


Maybe with some examples:

https://wiki.haskell.org/Zipper

https://en.m.wikipedia.org/wiki/Zipper_(data_structure)

To give you an example: the zipper of a list is two lists, one for the elements before the current position and one for the elements after it, plus the current element (the filled hole).
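A minimal Python sketch of that list zipper (names are my own; the "before" list is stored reversed so that both moves are O(1)):

```python
# A list zipper: (elements before the focus, reversed; the focus; elements after).
def from_list(xs):
    # Focus on the first element; assumes xs is non-empty.
    return ([], xs[0], xs[1:])

def move_right(z):
    before, focus, after = z
    if not after:
        return None                              # already at the right end
    return ([focus] + before, after[0], after[1:])

def move_left(z):
    before, focus, after = z
    if not before:
        return None                              # already at the left end
    return (before[1:], before[0], [focus] + after)

def to_list(z):
    before, focus, after = z
    return list(reversed(before)) + [focus] + after
```

The pair of lists is exactly the "one-hole context" of the focused element, which is the connection to the linked paper: d/dX (List X) = List X × List X.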


Hey sorry this is late but thank you so much! The paper on zippers was especially helpful.


The Erlang version might be more familiar to you: https://ferd.ca/yet-another-article-on-zippers.html


Slightly related, but Jimmy Koppel teaches "Algebraic Refactoring", where you use tricks from factoring polynomials, applied at the type level of functions, to yield better designs. It's super cool stuff.


Any link you could share to check this out? A quick search leads to a YouTube video of his, but the audio is so bad it's unwatchable (to me at least); maybe there is something better out there?


Sorry, it was part of his software training. Not seeing a public version out there. Maybe he'll jump in with a link I've missed.


I can't find any such thing in his research papers. I wonder why he didn't publish on it, even as a functional pearl. I really liked his paper "Semantic Code Search via Equational Reasoning".


Reminds me of this beautiful 1964 paper by Janusz Brzozowski, where he applies differential calculus rules symbolically to regular expressions to derive recognisers for them.

https://dl.acm.org/doi/pdf/10.1145/321239.321249
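The idea can be sketched in a few lines of Python (my own rendering, not code from the paper): the derivative of a regex with respect to a character c matches exactly the remainders of the words that begin with c, so matching a whole word is just repeated differentiation.

```python
# Brzozowski derivatives, sketched with regexes as tagged tuples.
EMPTY, EPS = ('empty',), ('eps',)        # matches nothing / only the empty word

def char(c): return ('char', c)
def cat(r, s): return ('cat', r, s)
def alt(r, s): return ('alt', r, s)
def star(r): return ('star', r)

def nullable(r):
    # Does r match the empty word?
    tag = r[0]
    if tag in ('eps', 'star'): return True
    if tag in ('empty', 'char'): return False
    if tag == 'cat': return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])   # alt

def deriv(r, c):
    # The regex matching { w : c+w is matched by r }.
    tag = r[0]
    if tag in ('empty', 'eps'): return EMPTY
    if tag == 'char': return EPS if r[1] == c else EMPTY
    if tag == 'alt': return alt(deriv(r[1], c), deriv(r[2], c))
    if tag == 'star': return cat(deriv(r[1], c), r)
    d = cat(deriv(r[1], c), r[2])             # cat: differentiate the head...
    return alt(d, deriv(r[2], c)) if nullable(r[1]) else d   # ...or skip it if nullable

def matches(r, word):
    # Differentiate by each character in turn; accept iff the result is nullable.
    for c in word:
        r = deriv(r, c)
    return nullable(r)
```

This is the recogniser in miniature: the state reached after reading a word is itself a regular expression.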


For the derivative, shouldn't there be some notion of continuum, of a limiting process to define a slope at a particular point? Otherwise this is just a delta is it not?


There is a general notion of derivations that I believe they're using, i.e. anything that obeys the Leibniz rule (fg)' = f'g + fg'.

It's an informative exercise to see how many of the standard rules of calculus can be derived using just this rule.
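One instance of that exercise: the power rule follows from the Leibniz rule alone by induction, assuming only that x' = 1.

```latex
% Induction step: assume (x^n)' = n x^{n-1}; then by the Leibniz rule
(x^{n+1})' = (x \cdot x^n)'
           = x' \, x^n + x \, (x^n)'
           = x^n + x \cdot n x^{n-1}
           = (n+1) \, x^n .
```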



Something that I've found interesting is that the domain of a derivative always seems to be phrased in terms of open sets, even when the original function is defined on a closed set. It's like taking the derivative removes exactly the bounds from the domain. I wonder if one-hole contexts are to blame for that.


No, because in discrete differentiation like this, there is no limiting process where openness matters, and open/closedness is completely trivial in discrete spaces.

The "one-hole" is in the discrete types, not the potentially continuous values.


Right, the "elements of x" made me think values, but those elements really are other types.

Crazy where operator overloading will get you. If there are postmodern math papers, I feel like this is one of them.


If you want to take the derivative from all directions, the domain must be open. That's probably all there is to it.


Can you give an example of what you mean?



Ah, I think I see. Open sets are just how we encode information topologically. They're sort of the "atoms" of topology; the kinds of open sets you have say how the topology behaves.

In particular open sets are neighborhoods of all of the points they contain. This means they contain all the topological information about all of their points.

Closed sets are not neighborhoods of their points in general. E.g. [0,1] contains no neighborhood of 0 or 1, so we would require knowledge of the space around those points to know how a function behaves just on [0,1].

In standard calculus this amounts to "taking the left and right limits".

Does that help?


> Open sets are just how we encode information topologically. They're sort of the "atoms" of topology; the kinds of open sets you have say how the topology behaves.

This is an interesting perspective, thank you. It reminds me of NAND gates


Now take the derivative of a delimited continuation.


This unpublished manuscript dates from 2001, according to Google Scholar.


The concept of a derivation on types could be interesting... but the notation is a bit much for me.

Differential fields are (as I understand it) a hot topic at the moment. Maybe there are some interesting crossovers.


wtf?

> Clearly it’s equivalent to Choice’s use of a boolean tag. Isn’t it remarkable that algebraic manipulations agree with our intuition?

No, that's not interesting at all. It is literally just encoding left/right in a bool.

That page seems to be doing "calculus" not on the sets but on the sizes of the sets. Entirely uninteresting and boring.

Haskell: making the easy stuff hard.



