Generalizing 'jq' and Traversal Systems using optics and standard monads (chrispenner.ca)
161 points by todsacerdoti on Oct 7, 2020 | 98 comments



This is a really exciting area. See also the Cambria project [1] and the HN discussion from yesterday [2]. See [3,4] for a great introduction to category theory for programmers--we are all indebted to Milewski / Fong / Spivak / et al. for making this topic more accessible.

[1] https://www.inkandswitch.com/cambria.html

[2] https://news.ycombinator.com/item?id=24699615

[3] https://bartoszmilewski.com/2014/10/28/category-theory-for-p...

[4] http://brendanfong.com/programmingcats.html


So, how does Cambria differ from XSLT?

For anyone interested in computing pre-history, early twentieth century punched-card processing, before von Neumann, was all about taking (monoidal) data structures (encoded as a deck of cards) with historical values and either querying them or running updates (encoded as a second deck of cards) to produce freshly copied current values.

https://en.wikipedia.org/wiki/Unit_record_equipment


Whenever I see Milewski’s name I immediately think of this video: https://youtu.be/ADqLBc1vFwI

It’s a shame I don’t know that many people who’d enjoy it as much as I do!


A great find. Thanks.


Optics, as the OP calls them, address what is definitely an underestimated problem in modern computing.

In most languages, so much work goes into processing nested data structures, especially in this day and age of heterogeneous APIs, and a lot of it is actually not that difficult to generalize conceptually. A few weeks ago I wrote a PoC in Python and it was remarkably easy to make a lens for traversing data structures.

Declarative optics and updates would make writing business logic a lot easier in so many domains.
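The core of it really is tiny. For comparison, a rough hand-rolled sketch in Haskell (record names made up; real libraries use a cleverer encoding, but the essence is just a getter paired with a copying setter):

    -- a lens is a getter paired with a (pure, copying) setter
    data Lens s a = Lens { view :: s -> a, set :: a -> s -> s }

    -- lenses compose, so a "path" into a nested structure is an ordinary value
    compose :: Lens s a -> Lens a b -> Lens s b
    compose outer inner = Lens
      { view = view inner . view outer
      , set  = \b s -> set outer (set inner b (view outer s)) s
      }

    data Address = Address { city :: String }                     deriving Show
    data Person  = Person  { name :: String, address :: Address } deriving Show

    cityL :: Lens Person String
    cityL = compose (Lens address (\a p -> p { address = a }))
                    (Lens city    (\c a -> a { city    = c }))

    -- view cityL (Person "Ada" (Address "Paris"))          ==> "Paris"
    -- set  cityL "London" (Person "Ada" (Address "Paris")) ==> an updated copy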


Is using the word 'optics' for things other than projecting light with glass a new thing? I don't recall seeing the word used for other purposes much, say, 10-15 years ago. I'm not a native English speaker, but I've been reading English for at least 30 years.


The earliest Newton reference of which I am aware is Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire (1991), where iirc both the bananas and the lenses are (visually!) optical.

https://research.utwente.nl/en/publications/functional-progr...

In particular, the metaphor isn't used in Algorithmics: Towards Programming as a Mathematical Activity (1986).

https://ir.cwi.nl/pub/2686


I believe the concepts in the first paper are unrelated to those of modern lenses (i.e. functional references). See: https://stackoverflow.com/questions/17198072/how-is-anamorph...


I wonder how these "optics" compare to .NET's Language Integrated Queries (LINQ) (see https://en.wikipedia.org/wiki/Language_Integrated_Query )


You are onto something, since LINQ was developed by Erik Meijer, one of the authors of the paper linked in the child comment of the sibling comment.


LINQ is definitely a precursor, and has tons of fantastic ideas. It's unfortunate that Microsoft didn't continue to invest heavily into this as it's rarely mentioned nowadays.


I'm not sure I'd call LINQ a precursor when LINQ itself was based on Haskell's monadic list comprehensions.

Monadic list comprehensions aren't quite the same thing as optics, however.


List comprehensions aren't quite the same thing; they are declarative but they describe an iterative method of creating lists through maps and filters, whereas optics are semantically oriented around traversals and projections.


The nice thing about comprehensions is that, depending upon how one looks at them, they can be declaratively defined yet iteratively implemented.


isn't that what I said?


Wikipedia claims there are LINQ implementations for Javascript/Typescript and for PHP (in addition to C#, of course).


These are really just simplified versions, though.


Why do you need all this theory when you can use simple imperative languages?

    for staff in company.staff:
      for pet in staff.pets:
        if pet.type == 'cat':
          print(pet.name + " belongs to " + staff.name)
I don't understand why functional languages are used at all.


Functions are composable. Your statement is not. If I asked you to create a snippet that searches for cats, another that searches for dogs, and a third that searches for staff who own both a cat and a dog, you would either duplicate a lot of code or end up with functions.
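Roughly, in Haskell (the record types and names are made up for the sake of the example), all three searches fall out of one small, composable predicate:

    data Pet   = Pet   { petType :: String }
    data Staff = Staff { name :: String, pets :: [Pet] }

    ownsA :: String -> Staff -> Bool
    ownsA kind s = any ((== kind) . petType) (pets s)

    catOwners, dogOwners, bothOwners :: [Staff] -> [Staff]
    catOwners  = filter (ownsA "cat")
    dogOwners  = filter (ownsA "dog")
    bothOwners = filter (\s -> ownsA "cat" s && ownsA "dog" s)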

Now, if you restrict yourself to not introducing any side effects, the interpreter can be proven correct, the execution can be analysed much better, and you get some neat safety guarantees.

That's why.


What's wrong with functions? A function `printCatsBelongingToStaff()` is much easier to read than a line of functional code.

I don't understand what you mean by "analyzed much better" and "neat safety guarantees". Is my code hard to analyze or unsafe?


Setting aside for the moment that optics themselves are (edit: in many implementations, at least) just functions, albeit of a somewhat different flavor than `printCatsBelongingToStaff`...

Functional programmers (of the Haskell bent, at least) would prefer that they not have to write printCatsBelongingToStaff, printCatsBelongingToCustomers, printCatsBelongingToChildrenOfStaff, printDogsBelongingToStaff, renameCatsBelongingToStaff, &c. They would instead prefer to write print, rename, staff, customers, cats, dogs (or perhaps just pets, with an ability to discriminate between cats and dogs...), children, and compose. They would also prefer to write these functions in such a way that the computer (rather than the human) can analyze the resulting compositions and guarantee that they operate in the way you would expect (with "expect" being precisely codified between the types of the functions and a succinct set of laws).

The big picture here is that computers are much faster and more consistent in their checking ability than humans, so the more things we can encode in our programs in a computer-checkable way, or in laws that we can prove are preserved under composition, the more reliable our software will be on the whole.
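To make that concrete, here's a rough sketch with the Haskell lens library (the record fields are invented; makeLenses generates a lens per field):

    {-# LANGUAGE TemplateHaskell, RankNTypes #-}
    import Control.Lens

    data Pet     = Pet     { _petType :: String, _petName :: String }
    data Staff   = Staff   { _staffName :: String, _pets :: [Pet] }
    data Company = Company { _staff :: [Staff], _customers :: [Staff] }
    makeLenses ''Pet
    makeLenses ''Staff
    makeLenses ''Company

    -- one small, reusable piece of vocabulary...
    cats :: Traversal' Staff Pet
    cats = pets . traverse . filtered ((== "cat") . _petType)

    -- ...and the specific queries and updates are just compositions of it
    staffCatNames, customerCatNames :: Company -> [String]
    staffCatNames    c = c ^.. staff     . traverse . cats . petName
    customerCatNames c = c ^.. customers . traverse . cats . petName

    renameStaffCats :: (String -> String) -> Company -> Company
    renameStaffCats = over (staff . traverse . cats . petName)

Each piece is tiny on its own, and the compiler checks that the compositions line up.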


Depends on the use case. YAGNI? The presented function is very simple to read and understand what it does.

Of course in Python you could parametrize the predicates or the attributes or attribute values, and it would still be fine FP code.


> "analyzed much better" and "neat safety guarantees"

Ah - how about if we add "by the compiler": They can be analyzed much better by the compiler, and we get neat safety guarantees by the compiler.

> Is my code hard to analyze or unsafe?

Not for a human, no :) But optics allow us to drop some boilerplate, retain all flexibility, and also allow the compiler to do a deeper analysis so we can compose computational patterns. That composition composes, so as we build more complex things we have tools with us to help navigate them.


A functional coding technique is strictly more powerful.

Say we want to reuse your cat search code. Sometimes we want to log it, and sometimes we want to export it as a different data structure.

You could write a function that took a company and returned a list of pets.

What if you wanted to take the first 10 of 10 million? Returning the full list wouldn't be very efficient; you would need to alter your function to support limits.

What if you were then given a list of companies? You would need to modify the function, or write a wrapper.

What if it isn't a list anymore but an async stream of companies? More modifications.

With functional techniques, we can abstract the concept of "filtering a company for pets" without worrying about the concrete implementation of how the returned pets are stored. This allows the one function to be reused efficiently in many different applications, in ways not possible with imperative code.

This is just one example of abstraction that is possible. Functional techniques are not a good fit for all problems, but they are a good tool to have in your toolbox as they make some problems much easier to reason about.
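A rough Haskell sketch of that last point (the record types are made up):

    data Pet   = Pet   { petType :: String, petName :: String }
    data Staff = Staff { pets :: [Pet] }

    -- "filtering for cats" written once, against any Foldable container of staff
    catsOf :: Foldable f => f Staff -> [Pet]
    catsOf = filter ((== "cat") . petType) . concatMap pets

    -- the same function serves each caller:
    --   catsOf staffList                     -- a plain list
    --   take 10 (catsOf hugeStaffList)       -- laziness stops once 10 cats are found
    --   concatMap catsOf perCompanyStaff     -- a list of staff lists, one per company
    --   catsOf staffVector                   -- or any other Foldable container

None of the callers had to care how the pets are produced or stored.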

Paul Graham has written a lot on the power of functional languages. On Lisp is a good place to start: http://www.paulgraham.com/onlisp.html

Or some of his early blog posts discuss these concepts: http://www.paulgraham.com/articles.html

Another classic that will change the way you look at code design is Structure and Interpretation of Computer Programs: https://web.mit.edu/alexmv/6.037/sicp.pdf


Your code is impossible to analyze. Console.log is a side effect.

For loops can be easily mapped onto functional constructs, but your snippet does not compute anything. It's not a function.


Don't be pedantic, this is super simple code that is easy to analyze. Side effects are not the terror that Haskell programmers make them out to be.


After working on and maintaining several very large codebases, I would strongly disagree. When large blocks of code start being called only for their side-effects then refactoring/deleting old code gets pretty tricky.

Conversely, when writing green-fields code, side effects are extremely convenient and make developers more productive in the very short-term.


(Side) effects are the reason you write code in the first place.

So I'd agree that focusing on the reason you are writing the code makes everyone more productive, though not just in the short term. ¯\_(ツ)_/¯

Now how you organize your code so that you don't mix things up willy-nilly is a different, more nuanced and, I think, more interesting question.

I am a big fan of hexagonal architecture, where you have a very localized and super-testable core with, ideally, all the complex functionality, surrounded by adapters that are as trivial as possible and communicate with the outside world, be that the user, persistence, network etc.

FP is one way of achieving this, but certainly not the only one.


A better reading of his comment is this. You can analyze side effects easily in a small function or system. Not so easily in a large system. That's why it's better to avoid side effects.


Oh, you mean using analyzing software. Yeah, I've never had a use for that. On the other hand, you could make my function append to a string and then return the string. Then it wouldn't have side effects so it would be analyzable.


> Oh, you mean using analyzing software. Yeah, I've never had a use for that.

One of the major goals for Haskell programmers (and users of fancy-type-systems in general) is to make it easy to transform runtime errors to compile-time errors. Further, an additional goal (albeit one which Haskell doesn't prioritize as much as some of its relatives) is to make the compiler better at telling the programmer what is wrong and how to fix it. If you have no interest in these projects and don't see why they could lead to code which is more reliable and easier to maintain, then your confusion about why people care about this stuff in the first place is perfectly reasonable IMO.


> On the other hand, you could make my function append to a string and then return the string. Then it wouldn't have side effects

State mutation is a side effect. If you mean it builds it up locally, it would then not have nonlocal side effects (e.g., it would have a pure functional interface even though it has an imperative implementation). But to do that, you'd have construction and mutation overhead in the code, whereas with functional idioms / comprehensions, you would not.

This is especially the case when (as is often the case in practice) you are taking collection data structures and producing collection data structures to work with.


(I'm new to FP so excuse this beginner question) How could you ever (in a non-trivial case) avoid local state mutation? It seems like any function which takes a collection and returns a collection would have to maintain some local state.

For example, if you want to take a list of integers and return a list of those integers plus one ([1,2,3] -> [2,3,4]), that new list needs to be built up in memory somehow before it is returned, no? Sure, that process might be hidden behind a generic `map` function (something like `map(lambda x: x+1, [1,2,3])`), but then you just have the `map` function building up the new list. Or if `map` returns a generator (so you don't need to construct the whole return list first), you're still storing some local state to keep track of which element of the input list you're currently processing.

What am I missing or getting wrong here?


Let's take a look at how we might write a function like `map` the (pure) functional way in a language like JS:

    const head = ([x, ...xs]) => x;
    const tail = ([x, ...xs]) => xs;
    
    const map = (list, fn) => {
      if (list.length === 0) {
        return [];
      } else {
        return [fn(head(list)), ...map(tail(list), fn)];
      }
    };
    
    map([1, 2, 3], x => x + 1); // [2, 3, 4]
We're not keeping track of any state here. Using recursion, you don't need to keep track of the current element in the list, for example (when you run this on a physical machine it will, of course, but not at the conceptual level).


> How could you ever (in a non-trivial case) avoid local state mutation? It seems like any function which takes a collection and returns a collection would have to maintain some local state.

Somewhere underneath there will need to be something maintaining state, but it won't have to be local state in the function (in a pure language, typically it will be within a built-in with a pure interface.) FP isn't about changing the fact that computers operate by mutating state, but to isolate such mutations (and other side effects) behind pure interfaces, so that risk and difficulties associated with effectful code are isolated to, ideally, extremely well understood pieces of infrastructure code rather than permeating large codebases.


You are correct that memory allocation takes place to implement the FP, and you could call that a state mutation. The underlying computer mutates RAM.

But that's not what FP people mean by state.

RAM is an implementation detail. (You don't even strictly need RAM to compute. But that's another conversation.)

In your example, one part of the state is the list [1,2,3]. That isn't mutated when the program runs. It's passed around. The implementation probably passes it by reference - a pointer to the list - but actually the programmer can't tell and doesn't care if it's copied or passed by reference. There's no visible difference, when values can't be mutated.

The other part is [2,3,4]. As the program runs, it will be allocated in new memory, and from the programmer's point of view, it's as if the value [2,3,4] always existed, just waiting to be looked at. When it does look, that's always the value it's going to find, so in a very meaningful sense, that value is already there.

It's not usually allocated and stored in RAM until the program looks there, but it could be; it makes no difference from the perspective of the FP programmer. (In some implementations it actually might be. If you called map(f,[1,2,3]) twice it might "rediscover" the value already existing in RAM on the second call and use that.)

And the [1,2,3] might get freed at some point. But that only happens when it's not being "looked at". It gets forgotten then. (Or in an exotic implementation if there's still a reference to it, the memory containing the value might be freed, and it might be reallocated and recalculated when it's looked at again later. All invisible to the FP programmer.)

For your generator example, the implementation might create a state variable to implement it, or it might not. Either way it's hidden from the pure FP program, and it's as if the list [2,3,4] is just there, waiting to be looked at. Some implementations won't use a stateful generator like you'd think in Python though. They may instead represent the list as having a lazy tail: [2,3,lazymore...] where lazymore... is a placeholder in the list in RAM, which represents the part of the list which hasn't been looked at yet. This is lazy evaluation. The lazymore... is completely invisible to the pure FP program, because the act of looking at it (to do something useful) causes it to be replaced with the "real" calculated value, [4] in this case. Only those "real" values are visible to the program.

Overall, in the pure FP programming model, it's as if all the values exist already and never change, and they are determined by the FP expressions from other values which also never change.

The only "effect" is an implementation detail, triggered by the the act of looking at values to see what they already are, which converts lazy placeholders into useful values. The equivalence between lazy placeholder and useful value is so well hidden in pure FP that the implementation is free to do things like calculate values before they are needed or even if they are never needed, and to discard some values (putting the placeholder back) and recalculate them again later whenever it feels like. Yet to the pure FP programmer, it's always the same values.

The underlying implementation will allocate, free, move values around in memory, and perform lazy evaluation as its way of "looking at" values as requested. But those are all implementation details which are hidden from the pure FP programmer, and the details will vary between different implementations too. In practice there's still debugging and timing and memory usage visible, but not to the FP program (except through "cheating" non-pure FP escape hatches), and we think of those as part of the implementation, separate from the FP programming model.
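A tiny GHCi illustration of values being "just there, waiting to be looked at":

    ghci> let ys = map (+1) [1 ..]   -- conceptually an infinite list; nothing is computed yet
    ghci> take 3 ys                  -- "looking at" the first three cells forces just those
    [2,3,4]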


To me the biggest advantage is expression. You can package up a whole traversal or other optics into a single expression and then you can give it a name, refer to it, augment it. For example what if later you'd like to add another `if` into the inner loop? Would you duplicate the code? Would you add a callback to the original imperative code? The big idea here is that access to some subpart of a big object (like a path) can be abstracted into its own expression type and made composable, and the less important idea is that it can also be well-typed.

The second advantage is conciseness. I don't know if you've actually used jq, but if you have, I really don't know why anyone prefers a verbose for-loop-and-if-statement block over a single concise jq expression. Paul Graham expressed this idea better than I could have so read this: http://www.paulgraham.com/power.html
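To make the first point concrete, here's roughly what it looks like with the Haskell lens / lens-aeson vocabulary (the JSON field names "staff", "pets", "colour" are made up):

    {-# LANGUAGE OverloadedStrings, RankNTypes #-}
    import Control.Lens
    import Data.Aeson (Value)
    import Data.Aeson.Lens (key, values, _String)

    -- name the traversal once...
    staffCats :: Traversal' Value Value
    staffCats = key "staff" . values . key "pets" . values
              . filtered (\p -> p ^? key "type" . _String == Just "cat")

    -- ...then reuse, update through, or refine it without touching the original
    catNames  doc = doc ^.. staffCats . key "name" . _String
    shout     doc = doc &   staffCats . key "name" . _String %~ ("LOUD " <>)
    blackCats doc = doc ^.. staffCats
                          . filtered (\p -> p ^? key "colour" . _String == Just "black")

Adding "another if" later is just composing one more filtered onto the named traversal.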


Optics allow us to do something like this:

  import optics.for as o_for
  o_for company.staff.pets as (.{pet}, optics.parentNode as owner) where (pet.type == "cat"):
    print "{.name|%s} belongs to {owner.name|%s}")
We're guaranteed to be safe against null values in company and staff and pets – we don't have to think about it – and no "for" or "if" is required for walking down the structure. Also the types of pet and owner are completely certain, and we can give verified compile-time guarantees that the string we're populating is 100% correct to match the input.

This is the very simplest use of optics. Optics also allow us to express wild things over those structures, such as selecting nodes based on some defined statistical relationship to the statistics of defined parts of the whole structure. It's as simple and concise as the example above. And as efficient as possible.


The point is really that lenses are values that represent locations in a data structure. And, as values, they can be combined, transformed, serialized, etc etc. Imagine having a type that represents a chain of method selectors, and that gives you some idea of the purpose.

The fact that method selectors only appear very rarely as first-class values in most languages means that most people aren’t tuned in to scenarios where they could be applied. But I bet you’ve invented special cases of this yourself, when you had a function that needed to dig data out of one of several locations, depending on other inputs.


Simple?

I know it's easy to forget how much .. stuff .. there is in an imperative language. You have the side-effect in print, and if you replace that with appending a string you're going to invoke the weird and wonderful assignment operator.

It's easy to forget how strange these things (state and side effects) are once you have mastered them in your craft.

The reason I love functional style with immutable data is because it is so dead simple and easy on my tired brain.


Because it is easier to reason about data transformations. Here, when I see loops and conditionals it is confusing because anything could be happening. company.staff.filter(notCats).forEach(printMessage) conveys more about what is going on, for example, imho. But there is a lot more to it.


How would you define the `printMessage` function? The `pet` object doesn't have context about `staff`, and the `filter` function is filtering staff, not pets. It could return staff that have both cats and dogs, so it would incorrectly print dogs.


It was a rough example, but the idea is thinking in terms of data in and data out and how it can be done. Here the train of thought would first be: do I have the right data form for the thing I am doing? Here we have:

{company: {staff: [{..., pets: []}]}}

And what we want to do is produce a list of all the pet cats with their owner's name.

[{cat: "bla", owner: "bla"}...]

or

[{owner: "bla", cats:[...],...}, ...]

So I guess what I'm trying to explain is that what is important is the change in the way of thinking about problems. When you think of them as data transformations, a whole lot of possibilities open up. And it is not more expensive, because you even have things like transducers. When you abstract away the iteration you are able to compose much more easily.

And to answer your question directly: in my example the print function would print for each cat belonging to the staff member, which is not great; flattening would be more elegant. What I wanted to convey is thinking in data transformations and abstracting away the implementation details of the iteration.


In javascript:

  const stf = [
      {name: "x", pets: [{type: "cat", name: "kitty"},
                       {type: "cat", name: "kitty2"}]},
      {name: "y", pets: [{type: "dog"}]},
      {name: "z", pets: [{type: "cat", name: "miau"}]},
  ];

  const myTransform = ({ pets, name: ownerName }) => pets
        .filter(({type}) => type === "cat")
        .map(({ name: catName }) => ({ ownerName, catName }))

  stf
      .map(myTransform)
      .flat()
      .forEach(({ownerName, catName}) => console.log(`Cat ${catName} to ${ownerName}`))


Because I can abstract over whether it's staff.pets or staff.children etc. etc.

It's all right. In the sense of only reaching for a tool when you find yourself constrained, perhaps it's worth waiting till you find a problem where you feel "there's got to be a better way".

It's happened to me when I was writing generators/shrinkers for generative tests that operate over any Thrift or Protobuf type.


I like both imperative and functional style programming. They're isomorphic if composition is favored. Imagine interpreters iterating over ASTs. Easy peasy.

I strongly dislike hybrids, multiparadigm, metaprogramming, and inheritance-heavy OOAD.

Most of our work is CRUD, meaning data centric. Right?

My arm-wavy overgeneralization is that functional programming promotes functional decomposition style programming, which emphasizes flow of control modeling over data modeling. Especially when noobs go overboard with lambdas, stream processing, reducers, and so forth, which obfuscates the data transformations.

Whereas imperative programming can be more straightforward for simple data processing. Until noobs go overboard with OOAD stuff like inheritance, making wrapper classes for everything, and design pattern overkill. Then the wheels fall off and the engine explodes once someone reads a book about inversion of control, dependency injection, mocking, aspects, or optionals (non-null).

YMMV.


That's perfectly fine functional code you just wrote. Type system trickery can also be FP but a lot (most of?) practical FP code is written in dynamic FP languages like Clojure and Erlang and Elixir, or just Python, TypeScript, etc.


Functional idioms are often more expressive when you want to do something with a result other than an immediate side effect like printing. Sure, you can use imperative loops to build up a data structure to work with, but functional idioms (or comprehensions, which are generally an imperative-looking wrapper over functional idioms) can be quite concise and expressive for that purpose.

  for (owner <- company.staff; pet <- owner.pets if pet.type == "cat")
    yield (owner.name, pet.name)
builds a data structure with the same information yours prints, with no extra boilerplate around building the data structure.
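(For comparison, roughly the same thing in Haskell with made-up record types, plus the plain functional idioms the comprehension is sugar for:)

    data Pet   = Pet   { petType :: String, petName :: String }
    data Staff = Staff { staffName :: String, pets :: [Pet] }

    catPairs :: [Staff] -> [(String, String)]
    catPairs staff =
      [ (staffName s, petName p) | s <- staff, p <- pets s, petType p == "cat" ]

    -- the comprehension desugars to ordinary concatMap / filter / map
    catPairs' :: [Staff] -> [(String, String)]
    catPairs' staff =
      concatMap (\s -> map (\p -> (staffName s, petName p))
                           (filter ((== "cat") . petType) (pets s)))
                staff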


i would say this is just imperative programming.


That's more difficult to type than something like

`.company.staff | .pets | flatten | filter .type == "cat" | log(...)`


Maybe so, but it's much easier to read. Code is read 10x more often than it is written, especially in teams, so it makes sense to optimize for legibility rather than elegance in some "higher" mindset, or for fewer keys pressed (because most of the time will be spent staring at the line you wrote rather than actually typing, if you write a functional line like that).


That's a bad argument because you didn't specify who the audience is. Functional code is more readable to me (and many teams I've worked with); imperative code is more readable to you, and that's fine. Go with your team's preference. And, most of the time, what your team prefers can easily be changed with a few sessions to introduce them to a functional style.


Perhaps it's easier for you to read. It takes me 2-3 times longer to read the loops, since each loop is unique and I have to parse the semantics/context, whereas it's quick for me to parse the behavior from the named functional operators even at a glance. Conversely, I find writing the functional version a bit trickier for complex reduce operations.


I totally disagree. I can read the short version pretty quickly and get an idea what it does - every word matters.

The original version with for loops just looks like a block of text. Spacing would help, but I would still need to look carefully to pick out the important bits.


Forget it, Jake. It’s Chinatown.


Underrated comment.


It's a tradeoff. The problem is that academia often comes from the angle of abstracting and generalizing everything, which is even more harmful than copy & pasting and using the simplest code possible (mostly because copy & paste + simplest code is still readable and can be improved upon, while academic style code usually is inaccessible).

You don't want to listen to any specific camp. The best is always in the middle and you are on the right track:

"Pick what is easy to read."

However there is a caveat, which is that functional style is only hard to read at the beginning. Have a look at how modern languages integrate functional programming (see Java, C#, C++, etc.).

Java streams are a common example. They make most code much easier to read ONCE you are used to it, because they provide standardized behaviour that is side-effect free. When you write everything in plain syntax, the reader needs to abstract your code in his head, while when using streams, you just know what `filter`, `map`, `reduce`, etc. do and you only need to look at the small customizations. It's easier to read once you are used to it.

However there is a limit and this limit gets hit quickly. A lot of people like to make their code as complicated and fancy as possible and reduce every possible line. This is definitely wrong. Code must be easy to understand and change. If you fail that, you are doing it wrong.

This also depends on your environment. If your co-workers don't understand the code you write, you have failed, no matter if your code is "nicer" or not. You either need to write simpler code or educate your co-workers to create common ground.


Wait a second, you left out the most important part: What's in the `log(...)` function? How do you get the name of the pet and the name of the staff in that line if they have the same accessor `.name`?

That's my point. The fact that you didn't include it implies to me that it was too difficult to figure out, and a true implementation would be far more complicated, so it's not a fair comparison.


I disagree, there are many ways to solve this problem. For example:

   .company.staff | .pets | flatten | filter .type == "cat" | print(.name + " belongs to " + ..name)


The theory lets compilers catch a ton of bugs at compile time. When you can get a functional language to type check, it really does Just Work. And when you limit side-effects, you know another function isn't going to monkey with something it's not supposed to.

That said, I'm presently struggling with AST transformations and other fun stuff in Haskell, so I completely agree with stating it imperatively. It's far, far easier to learn and read simple imperative code.

That's why in my language Tenet[1] I'm trying a hybrid approach: the underlying representation is immutable, but the language uses ordinary control flow and mutation.

[1]: https://tenet-lang.org/


>When you can get a functional language to type check, it really does Just Work.

Not all functional languages have static typing. Also, type checking helps, but saying that if the types check out it just works is pushing it, IMO.

No type checker will catch this error:

  sqrt :: Double -> Double
  sqrt x = x

  sqrt 10


Dependent types are able to express the constraints that prevent or catch this error.

The types of the arguments to the function can have value constraints on them, and those constraints can be determined from a value that exists there: the name of the function.


What would be the dependent type you could use to prevent the bug in the given example though? There is nothing wrong with the function itself, there is just a discrepancy between what it does and how it is named. I don't know of any constraint that would fix that.

In any case, the more code you put into the type system the higher the chance that your dependent type specification has a bug in it.


In dependent types, types depend on values. Here, the type depends on the value that is the function name :) As far as I know, this is literally actually possible in Idris, right now.

Wrote a bit more earlier: https://news.ycombinator.com/item?id=24716477

> the more code you put into the type system the higher the chance that your dependent type specification has a bug in it.

Unbounded infinity has infinite, uncountable bugs. A defined, bounded, well-constructed, proven type system has fewer bugs. Dependent types are not an axis of explosion, but rather a way to express useful bounds on multiple axes.


Reading your other comment, I'll concede that it's not impossible to encode these things into the type system. I don't think it will scale beyond toy examples (at least until AGI is a thing). There are plenty of function names that represent highly specific business processes (and jargon) and I don't see how any type system will have enough context for that.

Fewer bugs does sound good, but after a minute's thought there still seem to be uncountably infinite bugs even in the presence of dependent types. Just a few fewer than without. :)


We have a lllooootttt of work ahead of us! heh!

(But yes: It's a good place to hook the AGI up!)


Not only do dependent types have huge decidability issues, it's my understanding that integrating them with side effects is still an area of active research.


I think fortunately those are orthogonal matters :)

First, I don't know that dependent types have huge decidability issues that are an inherent obstacle in general; Idris for one has a very practical and elegant model of computation – things like totality checking, linearity rules and elimination of "scaffolding" types do a lot for us.

Dependent types do allow telling the compiler about more dimensions to consider – in a way that allows for a computational explosion at compile time – but those dimensions were always available to consider as part of the computation; dependent types don't add that complexity, rather they allow us to address it in the type system. I see it such that we're in a stronger position to manage and navigate that complexity by allowing us to express it to the compiler in a succinct and clear way near the core context of desired action. Dependent types allow us to do less by allowing us to do more.

And integrating purely functional mechanisms with side effects is a fun and interesting avenue of research, and I don't know that adding dependent types makes that more difficult; I'd think it makes it easier?


To continue the thought: This dependent type is expressible, trivially decidable, and has no additional side effect burden:

"This function's type depends on its function name. If its function name is "sqrt", then {check some simple rules about square roots, like if x is 0 then output is 0, if x > 1 then output < x, etc.}"

This one too:

"This function's type depends on its function name. If its function name is "sqrt", then check that it calls and returns a formally verified square root function applicable to its input type that is known to be decidable and appropriate to the floating point math definitions we're operating in right now"


> Not all functional languages have static typing.

Even the multi-paradigm, late-bound languages that are adding functional programming are adding static typing, e.g. TypeScript and mypy.

> No type checker will catch this error...

Sure, math is hard. In the domain of structural transformations, one's intuition combined with a decent type checker really does work, though.

Especially, I've done large refactorings of complex transformations, tracked down the typing errors, and then been pleasantly surprised when my test-suites passed the first time.

> sqrt :: double -> double

I can use QuickCheck[1]:

    prop_Sqrt_Sqr n = n >= 0 ==> abs (sqrt n * sqrt n - n) <= 1e-9 * max 1 n
Because it can exploit the type system, it can then plug in various values of Double to see if squaring my square root squares properly.

[1]: http://www.cse.chalmers.se/~rjmh/QuickCheck/manual.html


because it isn't a type error?

Obviously if the spec says to create a word processor program, and you instead write a flight simulator, the compiler isn't going to correct that mistake.


That is exactly my point: static type systems help a lot, but saying that 'if the types check there are no bugs' is too much. There are many classes of errors that are beyond types, even dependent types.


How would you abstract the code so that you can also update a pet's name in addition to the above statement, while minimizing duplicate code? I think that is one of the key things about lenses.


The syntax you just typed is an example of the theory in practice. Syntax != semantics.


Because of "composability", even if the highly coupled functions will likely never be reused.

And leaky "abstractions" that create more problems than they solve but look cooler than doing things the straightforward way.


That exact syntax works for pretty much any Monad. You need the category theory to avoid reinventing the wheel every time.


"Optics" and "lenses" seem like such bad names just to be able to make a "view data through a lens" pun.


It definitely can feel a bit strained at times, but the basic metaphor of:

- lenses “focus” on elements of a product type

- prisms “split” a sum type so that optics can work over selected branches

...feels nice when you’ve been working with it for awhile.
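A rough Haskell sketch of both:

    import Control.Lens

    -- a lens focuses on one field of a product type
    person :: (String, Int)
    person = ("Ada", 42)

    theAge = person ^. _2        -- 42
    older  = person & _2 +~ 1    -- ("Ada", 43): an updated copy

    -- a prism "splits" a sum type, selecting one branch when it is present
    parsed :: Either String Int
    parsed = Right 7

    doubled = parsed & _Right *~ 2   -- Right 14; a Left would pass through untouched
    oops    = parsed ^? _Left        -- Nothing: the other branch isn't there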


It's not a pun, it's an analogy. When we want to see things in a different way in the real world, we literally use lenses (or more generally, optics).


They're good names for a couple reasons

1. The analogies made do hint at their meanings.

2. But at the same time, the names are more proper nouns than definitions. For something so abstract, it's better to give it an opaque, easy-to-remember name than to give it a "better" name that maybe oversimplifies what it is.

This is basically the "Functor" vs "Mappable" argument.


They're practically useless for searching for them. Unsurprisingly trying to co-opt widely used terms tends to be ambiguous and confusing (cf. crypto <-> cryptography, cryptocurrency, and those are at least related). Perhaps the only exception in this group is "data lens", which works as a compound [1]. Guess what optics software does? No ETL, that's for sure. Optics patterns? Uh-oh. Optics development? Nuh-uh. Optics data? Turns out there are multiple companies with that name, and they're not doing ETL. Optics library? Image libraries of lenses (the kind you can touch). Optics computing? Sorry, that's a synonym of optical computing. Optics design? That's the engineering of optical systems. Optics scheme? Guess what arrangements of lenses are called... Optics programming? Aha, now the topic at hand actually manages to be in the majority of the results - barely, since four out of ten are still about that other thing humans have been at for a couple hundred years. Optics <programming language here>? Things for doing optical computations and simulations.

[1] Also follows the rule of "if you're going to name something after a common word, maybe give it a qualifier not usually used with that word to make it unique and easy to find". Examples: sphinx (doc, search), click (cli), pyramid (web framework), pacman (linux), language (programming).


Try searching "optics Haskell"

https://www.google.com/search?source=hp&ei=mL5-X7iCJNCs0PEPo...

First result is to the popular and well-documented `optics` library which is exactly this topic. It's perfectly discoverable.

On that note, try googling "Haskell >>=" or "Haskell <$>". Good results for both. Looks like the "Haskell's symbols and esoterica isn't googlable" is a dead argument now :)


> On that note, try googling "Haskell >>=" or "Haskell <$>". Good results for both. Looks like the "Haskell's symbols and esoterica isn't googlable" is a dead argument now :)

Wow, it works! I wonder when this happened.


In response to [1], specifically: “Van Laarhoven optics” and “profunctor optics” are qualifiers for two of the more common constructions.

In response to the naming in general: I’m not overly convinced about the naming thing considering that the Go Programming Language is not often confused with the game of Go, nor is the Rust Programming Language with the fungus or iron oxide.


We recently implemented a subset of JQ in Clojure, converting filters and functions into Clojure predicates to allow our users to apply (K)JQ queries directly to Kafka topics holding JSON-like data (e.g. Avro, JSON, Transit, etc). If it's maps or vectors, it works.

The basic implementation is predicate composition as we support multiple combinations of filters.

I hadn't thought any deeper about 'traversal systems' or optics, just parsing a grammar, creating predicates, and composing them. This is an interesting read, perhaps I should take the Clojure impl. further.

https://www.youtube.com/watch?v=7krDIMjVzZM

Probably the most fun coding I've done this year because I had an excuse to use the wonderful Instaparse again.


Not CT (IIRC) but see also "Domain Modeling Made Functional" - Scott Wlaschin

https://www.youtube.com/watch?v=Up7LcbGZFuo


So overly complicated. The "state of the art" of FP now means heavy type systems and heavy machinery to deal with them. The 80/20 pareto of FP is just pure functions + composing those functions and it works in any language. Here is all of it in python:

    from __future__ import print_function
    import json

    struct = json.loads(open('staff.json','r').read())
    staff = struct['staff']
    salaries = struct['salaries']

    def visit_pets(staff, cond, visit):
        for person in staff:
            for pet in person["pets"]:
                if (cond((person, pet))):
                    visit((person, pet))

    # Find all cats owned by any of our employees
    visit_pets(staff, lambda t: t[1]["type"] == "cat", lambda t: print(t[1]))

    # Find each pet and their owner
    visit_pets(staff, lambda t: True, lambda t: print("{} belongs to {}".format(t[1]["name"], t[0]["name"])))

    # Give a $5 raise to anyone who owns a dog
    dog_owners = set()
    visit_pets(staff, lambda t: t[1]["type"] == "dog", lambda t: dog_owners.add(t[0]["id"]))
    for staff_id,salary in salaries.items():
        if staff_id in dog_owners:
            salaries[staff_id] += 5
In my view it gets you 80% of the power of optics in exchange for zero machinery and can be understood by almost anyone. For doing these kinds of ops on a deeper structure, using Scala/Javascript or something with first-class support for HOFs, folds/filters/maps would work better.

I wish more people promoted this view of FP - just pure functions, higher order functions and referential transparency. It gives you the ability to reason locally and extend code well. The remaining type system and architecture astronomy buys very little in relation to what it costs.


For me, the remaining power of optics (what you call 20%):

1. let the compiler reason about my code to catch errors when the requirement changes (say, when some record fields change their names or types),

2. thus lower my cognitive load, and

3. allow me to reason about more important aspects of the project.

At work, the use of a powerful and expressive type system allowed me to work on multiple projects at once, achieving at least 2 times boost in productivity (definitely more than 20% gain).

Moreover, if optics is not well-known to everyone, I can just link to this blog post as a documentation, and give more examples if needed (I think people underestimate how coworkers could be productive without completely understanding everything).

It is hard to explain the experience of “pair-programming with the compiler”, but such complication is worthwhile for the productivity gain (and developer happiness).


This example has nothing to do with type systems. The same system exists e.g. for Clojure (see Specter).

The fundamental idea behind the approach in the article is the following:

Treat the path through a nested data structure as a first-class thing.

This helps us mainly in two scenarios:

1. Allowing for easy field-update-like code to exist for immutable data structures in addition to mutable data structures. This is especially valuable for updates.

2. Unify everything that can be "field-like" under a consistent field syntax.

In most class-based languages we can write something like

    myObject.field0.subfield1.subsubfield2
but `field0.subfield1.subsubfield2` isn't a first-class item that we can pass around and use. We can certainly write something like the following:

    def myPath(someObject):
        return someObject.field0.subfield1.subsubfield2
but (to the first point) if someObject is deeply immutable, we can't use `myPath` to actually update `subsubfield2` and instead have to cumbersomely write out the entire reconstruction of an immutable data structure at every level.

To the second point, even if someObject were mutable, there are a variety of things that would be nice to have as field accesses, but aren't, simply because they don't fit into a class definition. As an example, consider the bits of an integer.

    # Imagine .permissions.bitMask.lastBit to be a path 
    # through user
    set(user, .permissions.bitMask.lastBit, 0)
is so much nicer than the corresponding bit shifting code.

If path access through data were a first-class concept, there's no reason that paths should be restricted to only fields in classes. You can fit the same interface to anything that supports the notion of "field-like" access.
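In Haskell's lens library that pseudocode is more or less directly expressible. A rough sketch (record names invented; if I remember right, the library even ships a bit-level lens in Data.Bits.Lens):

    {-# LANGUAGE TemplateHaskell, RankNTypes #-}
    import Control.Lens
    import Data.Bits.Lens (bitAt)

    data Permissions = Permissions { _bitMask :: Int }             deriving Show
    data User        = User        { _permissions :: Permissions } deriving Show
    makeLenses ''Permissions
    makeLenses ''User

    -- the "path" is an ordinary value we can name, pass around, and compose
    lastBit :: Lens' User Bool
    lastBit = permissions . bitMask . bitAt 7

    demo :: User
    demo = set lastBit False (User (Permissions 255))   -- an updated immutable copy

The same view/set/over machinery works whether the focus is a record field or a single bit.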


This is exactly how we shipped our JQ-like product feature in 62 lines of Clojure. Just pure functions composed into a single predicate, and a decent grammar to interpret the input JQ.


What's the best optics library for typescript (or javascript)? Optika, partial.lenses, or other?


While I think it's closer to dead on arrival (PartiQL), I like the PartiQL/GoogleSQL way (generalizing SQL over nested structures) much more than the examples in the article.


I wish this blog were better optimized for mobile. I can't scroll back horizontally.


This guy is way better at jq than me.


The theory that describes what jq does and is proven to lead to further places is way better at jq than the implementation of jq that currently exists.


i get it, i just grabbed an optics lib to use it.


Sincere apology, in joy too though: My previous comment was intended to be a joke-via-overdefinition but also constructive :) On reread it came out as kinda not-fun pedantry. Thanks for your understanding. I got your joke too :)


less of the !



