This paper[1] is a good way for object-oriented programmers to understand why functional programmers care about these higher-order abstractions. The goal is not so different from OOP, which is about structuring programs as reusable components (AKA design patterns). The difference, of course, is that one doesn't need to resort to diagrams and prose; the patterns can be written directly in the programming language.
Relatedly, one of my favorite books on functional programming is Algebra of Programming[2], which is about treating programs as an algebra that can be manipulated as easily (with practice) as algebraic expressions in math. The book solves various problems through program derivation, for instance solving the maximum subsegment problem by refining a naïve spec, and deriving sorting algorithms, greedy algorithms, dynamic-programming algorithms and thinning algorithms.
This draft book[3] has similar content to the earlier chapters of AoP but is far more accessible. The downside of AoP is that I had to resort to reading it with a math professor to understand it.
>> The goal is not so different from OOP, which is about structuring programs as reusable components (AKA design patterns)
And it is exactly the overuse of design patterns that is one of the reasons that OOP has become less popular over the years. I see a lot of similarities with FP there.
Once you learn a new design pattern it opens your eyes to new and beautiful solutions for existing problems. And of course from then on you want to use it wherever you can. To repeat a cliché: If your latest tool is a hammer, every problem looks like a nail.
This overuse of abstractions leads to software that is hard to understand and because of that, it actually becomes harder to maintain or extend, while the whole point of design patterns was to make it easier.
This doesn't mean that design patterns, either in OOP or FP, are bad, no they still are really powerful but only if used when they are strictly needed; they should most of the time not be used to introduce some opportunity for reusability if reusability is currently not needed.
Luckily, the nice thing about functional "design patterns" as invented by the Haskell community is that they always have mathematical laws attached to them.
The advantage of having formal laws is that functional programmers can reason mathematically about their design patterns.
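To make "laws" concrete, here is a minimal sketch of the two Functor laws written as QuickCheck properties over lists (the property names and the use of the QuickCheck package are my own illustration, not something from the paper):

import Test.QuickCheck

-- Law 1: mapping the identity function changes nothing.
prop_fmapId :: [Int] -> Bool
prop_fmapId xs = fmap id xs == xs

-- Law 2: mapping a composition equals composing the maps.
prop_fmapCompose :: [Int] -> Bool
prop_fmapCompose xs = fmap ((+ 1) . (* 2)) xs == (fmap (+ 1) . fmap (* 2)) xs

main :: IO ()
main = quickCheck prop_fmapId >> quickCheck prop_fmapCompose

Every lawful Functor instance must satisfy these properties, which is what makes reasoning about (and refactoring with) them safe.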
This is completely different from the OOP case, where a "design pattern" is something someone found useful X number of times, with no rhyme or reason for structuring things this way, which is why, in my opinion, it's so hard to talk about OOP design patterns beyond "it seems to fit".
This is completely unlike the OOP community, where a design pattern has a poor spec written in English.
It's the difference between the engineer's approach and the mathematician's. (Physicists, they just write FORTRAN.) To be fair, design patterns were quite explicitly introduced as "these are things OO people seem to reinvent and converge on, so they seemed worth documenting."
Yes, overuse of abstractions is a problem, however I'd argue from experience that using recursion schemes in FP allows us to be free from "the goto of FP", that is, direct recursion. You can of course still write programs in directly recursive style, then later replace them with a mathematically equivalent program that uses recursion schemes, which is essentially refactoring. Similarly, I can easily generalize a program by replacing, for instance, map on lists with fmap, and now it will work for any functor.
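As a quick illustration of that last point (a sketch with my own function names, not taken from the comment above):

-- Specific to lists:
doubleAll :: [Int] -> [Int]
doubleAll = map (* 2)

-- Generalized: the exact same logic now works for any Functor
-- (lists, Maybe, trees, IO, ...).
doubleAll' :: Functor f => f Int -> f Int
doubleAll' = fmap (* 2)

example :: ([Int], Maybe Int)
example = (doubleAll' [1, 2, 3], doubleAll' (Just 4))
-- ([2,4,6], Just 8)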
I think what adds to the value of FP's abstractions is how they have mathematical (viz. categorical) laws in mind, which make them precise and elegant.
Does anyone know if the same can be said for design patterns? They tend to require more classes and more boilerplate, which definitely adds to the complexity.
Also, these laws are immensely valuable to compilers. The Haskell foldr/build rewrite rule allows for intermediate lists to be eliminated altogether[0], so you can write
sum (map square [1..n])
and the compiler will remove the intermediate list, putting a tail-recursive loop in the expression's place.
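For reference, this is roughly what the rewrite rule looks like in GHC's base library (paraphrased from memory rather than copied verbatim, so treat it as a sketch):

{-# RULES
"fold/build"
    forall k z (g :: forall b. (a -> b -> b) -> b -> b).
    foldr k z (build g) = g k z
  #-}

GHC's list functions such as map and enumFromTo are set up (via further rules) to take part in this fusion, which is how the whole pipeline collapses into one loop.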
Let's try to make this a tad less abstract with a (hopefully) more practical teaser:
1. Let's define a simple binary tree and a (recursive) function to process it:
data Tree a
  = Leaf a
  | Node (Tree a) (Tree a)

processTree :: (a -> b) -> (b -> b -> b) -> Tree a -> b
processTree processLeaf processNode = walk
  where
    walk (Leaf v) = processLeaf v
    walk (Node l r) = processNode (walk l) (walk r)
processTree takes two functions, namely the "how to process a leaf?" and the "how to process a node?" ones, as well as a tree, and then... processes it :) Now, given a tree of Ints, we can use the above to e.g. sum its values, find the biggest one or count its leaves:
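For instance (a sketch; the function names are mine):

sumTree :: Tree Int -> Int
sumTree = processTree id (+)

maxTree :: Tree Int -> Int
maxTree = processTree id max

countLeaves :: Tree Int -> Int
countLeaves = processTree (const 1) (+)

-- e.g. sumTree (Node (Leaf 1) (Node (Leaf 2) (Leaf 3))) == 6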
Great. Now, instead of defining the above Tree type as an inductive type, let's factor that part out:
data TreeShape a rec
  = Leaf a
  | Node rec rec
This type is kinda like the original except we add a new type variable (rec) for the inductive part. Ok, now comes the interesting (crazy?) part: Let's define a weird type called Fix:
newtype Fix f = Fix {unFix :: f (Fix f)}
Don't get lost in it! Just read on. Let's use this weird Fix type to define our original Tree type as such:
type Tree a = Fix (TreeShape a)
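To make the Fix wrapping concrete, here is what a small tree looks like under this encoding (just an illustrative value):

exampleTree :: Tree Int
exampleTree =
  Fix (Node (Fix (Leaf 1))
            (Fix (Node (Fix (Leaf 2)) (Fix (Leaf 3)))))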
Also let's rewrite the original processTree function using this new type:
processTree :: (TreeShape a b -> b) -> Tree a -> b
processTree process = walk
  where
    walk (Fix (Leaf v)) = process (Leaf v)
    walk (Fix (Node l r)) = process (Node (walk l) (walk r))
Looks kinda like the original except that it now uses the same(!) function (aka. algebra) to process both leaves and nodes. This allows us to write simple (non-recursive) functions:
-- (named *Alg to avoid clashing with Prelude's sum and max)
sumAlg :: TreeShape Int Int -> Int
sumAlg (Leaf v) = v
sumAlg (Node l r) = l + r

maxAlg :: TreeShape Int Int -> Int
maxAlg (Leaf v) = v
maxAlg (Node l r) = max l r

sizeAlg :: TreeShape a Int -> Int
sizeAlg (Leaf _) = 1
sizeAlg (Node l r) = l + r
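And applying them with the new processTree (reusing the illustrative exampleTree value from above):

totalValue :: Int
totalValue = processTree sumAlg exampleTree -- 6

biggest :: Int
biggest = processTree maxAlg exampleTree -- 3

leafCount :: Int
leafCount = processTree sizeAlg exampleTree -- 3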
At this point you might rightfully ask: Why oh why would we go through all this suffering just to recover the same functionality we had originally? Good question :) To answer it we need to realise that "TreeShape a" (and thus also "Tree a") is a member of a class called Functor, which means it knows how to map functions over itself. The magic incantation to do such mapping is called fmap in Haskell. Let's make TreeShape work with fmap:
instance Functor (TreeShape a) where
  fmap _ (Leaf v) = Leaf v
  fmap f (Node l r) = Node (f l) (f r)
Now, knowing the above, we can rewrite our processTree function into a _much_ more generic form that works not just on our Tree type but on _any_ Functor:
processFunctor :: Functor f => (f b -> b) -> Fix f -> b
processFunctor process = process . fmap (processFunctor process) . unFix
This processFunctor function (aka. a catamorphism) has no notion of the shape it processes other than that the shape should know how to map a function over itself (i.e. it's a Functor). As an added bonus, we don't even need to write Functor instances for our types by hand, since deriving them is a deterministic, mechanical process and the compiler is more than happy to do it for us:
data TreeShape a rec
  = Leaf a
  | Node rec rec
  deriving stock Functor
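Putting it all together (a sketch reusing the names from above):

sumTree' :: Tree Int -> Int
sumTree' = processFunctor sumAlg

-- sumTree' exampleTree == 6, with no hand-written recursion anywhere.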
To summarise: if we want to process some recursive data structure and that data structure happens to be a Functor, then all we need to do is derive its Functor instance, write a simple function to do something with an element of the structure, and apply that function to the whole using processFunctor.
Not quite - think of the functor as a container for the recursive field, not the values attached to leaves. The leaf is your base recursive case and doesn't have a value to map over. Having your functor instance map over this recursive field is what allows you to write a non-recursive function that expects the field to already hold the processed value from its children!
And yeah, the compiler can derive the functor instance! I don't know the mechanism for how it works but I think it might assume the last type parameter is the 'container' field (think the values in a list or Maybe).
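For what it's worth, GHC's DeriveFunctor does indeed map over the last type parameter; a tiny sketch:

{-# LANGUAGE DeriveFunctor #-}

data Pair a b = Pair a b deriving Functor

-- The derived instance is equivalent to:
--   fmap f (Pair x y) = Pair x (f y)
-- i.e. only the last type parameter (b) is mapped over.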
It's not in the standard lib, but you can find a few implementations (it's usually called cata, for catamorphism) available on Hackage. I have used recursion-schemes and it works pretty well.
Compare the Functor instances of the original inductive Tree type with TreeShape a:
instance Functor Tree where
  fmap :: (b -> c) -> Tree b -> Tree c
  fmap f (Leaf v) = Leaf (f v)
  fmap f (Node l r) = Node (fmap f l) (fmap f r)

instance Functor (TreeShape a) where
  fmap :: (rec -> b) -> TreeShape a rec -> TreeShape a b
  fmap _ (Leaf v) = Leaf v
  fmap f (Node l r) = Node (f l) (f r)
You see that in the latter the function we map over acts on the rec type which is the induction we factored out and is only present in the Node case. You can view the Leaf case as the termination case where Fix stops expanding. It's useful to write out the evaluation by hand on a small example to see how it works.
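For instance, a rough hand evaluation of processFunctor with the sumAlg sketched earlier, on a two-leaf tree (an informal trace, not compiler output):

--   processFunctor sumAlg (Fix (Node (Fix (Leaf 1)) (Fix (Leaf 2))))
-- = sumAlg (fmap (processFunctor sumAlg) (unFix (Fix (Node (Fix (Leaf 1)) (Fix (Leaf 2))))))
-- = sumAlg (fmap (processFunctor sumAlg) (Node (Fix (Leaf 1)) (Fix (Leaf 2))))
-- = sumAlg (Node (processFunctor sumAlg (Fix (Leaf 1))) (processFunctor sumAlg (Fix (Leaf 2))))
-- = sumAlg (Node (sumAlg (Leaf 1)) (sumAlg (Leaf 2)))
-- = sumAlg (Node 1 2)
-- = 3

Note how the Leaf cases stop the recursion precisely because fmap leaves them untouched.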
> I'm also curious about "the compiler is more than happy to do it for us" - is that in the form of `deriving Functor`?
It is indeed deriving Functor. You can even examine what the compiler generates by dumping the derived instances with the -ddump-deriv flag.
For more background reading as to why this is possible here are a few links:
The stock keyword comes from the DerivingStrategies language extension, which makes it explicit that this derivation is "built in". Everything works the same without it, though, so we can just as well ignore it.
> Lastly, does `processFunctor` exist in the standard library?
If by standard you mean the Prelude then no, but there are plenty of packages; recursion-schemes (mentioned above) is the usual one.
TLDR explanation of the title: the paper introduces special mathematical notation for some recursion schemes: banana brackets (|...|) for folds (catamorphisms), lens brackets [(...)] for unfolds (anamorphisms), as well as envelope brackets [[...]] for hylomorphisms and barbed wire (overlaying <...> and [...]) for paramorphisms.
[0] https://maartenfokkinga.github.io/utwente/mmf91m.pdf
[1] http://www.cs.ox.ac.uk/jeremy.gibbons/publications/hodgp.pdf
[2] https://themattchan.com/docs/algprog.pdf
[3] http://www4.di.uminho.pt/~jno/ps/pdbc.pdf