The counting-zeroes-in-"100!" problem is actually a common high-school-level math contest problem. You're supposed to calculate it by hand, not with code. The trick is to notice:
- the number of trailing zeros equals the number of times you can divide the number by 10
- there are far more factors of 2 than factors of 5, so the number of 5s is the limiting factor
- 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 each contribute (at least) one factor of 5
- 25, 50, 75, 100 each contribute another factor of 5
So the answer is 20 + 4 = 24 zeros.
I love monoids, but I personally thought that was a bad example. Not only did it complicate the problem, it didn't actually lead to any understanding that lets you write a generic and efficient solution for "n!". Something like this is O(log n):
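(A sketch of what I mean: repeatedly divide by 5 and sum the quotients, i.e. ⌊n/5⌋ + ⌊n/25⌋ + ⌊n/125⌋ + … — the exponent of 5 in n! by Legendre's formula.)

```haskell
-- Trailing zeros of n!: sum floor(n/5) + floor(n/25) + floor(n/125) + ...
-- i.e. Legendre's formula restricted to the prime 5. O(log n) divisions.
trailingZeros :: Integer -> Integer
trailingZeros n = go (n `div` 5)
  where
    go 0 = 0
    go m = m + go (m `div` 5)

-- trailingZeros 100 == 24
```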
Your solution still assumes we can break down the problem according to the fundamental theorem of arithmetic, which is a monoid homomorphism. You have simply added more insight into the problem than I have.
The goal of the article is to show that there's a consistent structural framework of problem solving underneath. I briefly mention that you can improve the solution along your lines.
The idea that lets us prove your algorithm above correct using this theory is that monoid morphisms compose (they form a functor category), and that an equivalence relation that respects monoid composition induces a monoid morphism onto the quotient classes.
Then we just define the equivalence relation where two numbers are equivalent if they have the same number of fives in their prime factorization. The composite monoid morphism yields your algorithm.
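Concretely, a sketch of that composite morphism: counting factors of 5 is a monoid morphism from the positive integers under multiplication to the naturals under addition, and summing it over 1..n gives the trailing-zero count.

```haskell
-- fives n: the exponent of 5 in n's prime factorization.
-- It is a monoid morphism from (positive integers, *) to (naturals, +):
--   fives (a * b) == fives a + fives b
fives :: Integer -> Integer
fives n
  | n `mod` 5 == 0 = 1 + fives (n `div` 5)
  | otherwise      = 0

-- Trailing zeros of n! = the morphism summed over the factors 1..n
trailingZerosViaHom :: Integer -> Integer
trailingZerosViaHom n = sum (map fives [1 .. n])
```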
I didn't want to overcomplicate the article with this, but I did mention it (it's in the hard-to-parse paragraph at the end of the section).
I use monoid homomorphisms extensively in xi-editor, and have a couple of slides on it at my RustConf talk earlier this month. I hope to be writing up the ideas in more detail soon (both what I've implemented, and speculative explorations such as doing incremental syntax highlighting).
I don't fully understand how category theory terms are helpful in programming. It always feels backwards to me, that first you write the code, then you look through the code and pick out all the places where category-theory terminology might appear. Monoids are nice and all, but in the end it's just a name given to a specific design pattern: you may as well reason about it in terms of code.
Sometimes, wondering if your type is an example of some particular abstraction can reveal unexpected useful operations, when you fill in the gaps.
For example, there's a Haskell library that defines a type called "Fold a b" [1]. It's basically a little state machine that you feed values of type "a"; when you close it, it returns an output of type "b".
Turns out these Folds are examples of an abstraction called a "Comonad". This basically means that there is a function
> extract :: Fold a b -> b
That runs a fold without giving it any input, and a function
> duplicate :: Fold a b -> Fold a (Fold a b)
That turns a Fold that consumes "a"s and returns a "b" into a Fold that consumes "a"s and returns another Fold that consumes "a"s and returns a "b". But what on earth could this be useful for?
Turns out that "duplicate" lets you keep a Fold "open" when you pass it to functions that run the Fold and only return its result (say, a function that feeds the Fold with values read from a file). If you pass a "duplicated Fold" to such functions, you will get another Fold as result that you can keep feeding.
The Applicative instance is even more useful: it lets you combine two Folds into one so that their results can be calculated in a single run.
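A hand-rolled miniature of the idea (a sketch following the shape described above, not the actual foldl library, whose definition differs in details):

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A stripped-down Fold: a step function, an initial state, and a finalizer.
data Fold a b = forall s. Fold (s -> a -> s) s (s -> b)

runFold :: Fold a b -> [a] -> b
runFold (Fold step begin done) xs = done (foldl step begin xs)

-- The Comonad-flavoured operations:
extract :: Fold a b -> b
extract (Fold _ begin done) = done begin      -- run with no input at all

duplicate :: Fold a b -> Fold a (Fold a b)
duplicate (Fold step begin done) =
  Fold step begin (\s -> Fold step s done)    -- keep the fold "open"

instance Functor (Fold a) where
  fmap f (Fold step begin done) = Fold step begin (f . done)

instance Applicative (Fold a) where
  pure b = Fold const () (const b)
  Fold stepL beginL doneL <*> Fold stepR beginR doneR =
    Fold (\(l, r) a -> (stepL l a, stepR r a))  -- run both in lockstep
         (beginL, beginR)
         (\(l, r) -> doneL l (doneR r))

sumF :: Num a => Fold a a
sumF = Fold (+) 0 id

lengthF :: Fold a Int
lengthF = Fold (\n _ -> n + 1) 0 id

-- Two Folds combined into one: sum and length in a single pass.
mean :: Fold Double Double
mean = (\s n -> s / fromIntegral n) <$> sumF <*> lengthF
```

Here `runFold (duplicate sumF) xs` returns a Fold whose state already includes `xs`, so you can keep feeding it, and `runFold mean` traverses its input only once.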
This kind of useful discovery doesn't always happen, though. Sometimes the new functions turn out to be useless, or too inefficient.
I, for one, think that Fold's documentation could be seriously improved by describing its behaviour in terms of code rather than category theory. All I'm saying is that even when you describe a Fold, you do so in non-category-theory terms, and category theory itself adds nothing. In much the same way, lenses have a category-theoretic interpretation, but exactly the same level of understanding can be achieved by thinking about them in terms of code. And if there's no difference, that means category theory is essentially unnecessary.
I also want to say that I don't believe Fold was discovered from category theory. It seems to me more like a neat generalization of foldl, and once they had the idea, they asked questions like "is this a comonad?". Indeed, it's far easier for me to figure out what's going on by reading http://hackage.haskell.org/package/foldl-1.2.1/docs/src/Cont... to see the Applicative instance itself.
Honestly, it just requires a certain familiarity with the general concepts. Once it's internalized you'll often be thinking in terms of the abstractions themselves. E.g. "If I can make my <domain-object-X> into a <Monoid/whatever>, I can apply <all-these-laws>!" type thing. For a concrete example, I've been doing this in an (unreleased) game to just regularize all the special cases of "effects" (poison, slowness, etc.) affecting the player. Bundling it all up into an Effect data type and defining a Monoid for it means that I don't have to care at all when or how they get applied, whether they have a timer, etc.
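A hypothetical sketch of that pattern (the Effect fields here are invented for illustration, not taken from the actual game):

```haskell
-- Hypothetical player-effect type: each field combines independently.
data Effect = Effect
  { poisonPerTick :: Int     -- damage per tick; combines by addition
  , speedFactor   :: Double  -- movement multiplier; combines by multiplication
  } deriving (Show, Eq)

instance Semigroup Effect where
  Effect p1 s1 <> Effect p2 s2 = Effect (p1 + p2) (s1 * s2)

instance Monoid Effect where
  mempty = Effect 0 1.0      -- "no effect": zero poison, unchanged speed

-- All active effects collapse into one, regardless of order or timing:
--   mconcat [Effect 2 0.5, Effect 1 1.0, Effect 0 0.8] == Effect 3 0.4
```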
> and that nothing is added by category theory itself
In some sense you're right. CT is really just a systematic way of describing patterns. Theoretically we could just prove loads of special cases and never have to resort to CT to prove anything... but it sure helps when you can just invoke the Yoneda Lemma for some special case that you're working with, rather than going through the whole process of re-proving the Yoneda Lemma for the umpteenth time.
(Btw, this isn't specific to CT. It's just that math-level specification/proof basically always tends towards generalization... for really good reasons.)
EDIT: I should say that I'm not yet close to the level where I can say that a "Monad is just a monoid in the category of endofunctors". I'm still just a muggle in this whole CT thing and yet it's even helping me!
One of the benefits of having these more abstract concepts is that you can more easily reason about code you're about to write. Instead of asking "will this new code work in cases A, B, C, D..." you can ask yourself if it preserves the properties of a monoid morphism.
For this to actually be a helpful tool, you of course need to have internalized the concept.
No, you first think about the code, and the concepts you have handy structure the way you think and then write. If you don't know what a subroutine is, you might write monolithic code and occasionally wonder how to eliminate some of the repetition. If you don't know the concept of objects, you can't structure your code the OO way.
If you don't have a working knowledge of [insert here], you are also losing something that might give your code a better structure.
The names communicate a lot; for example, a Monoid must satisfy certain (useful) laws. We would all be writing fewer ad hoc APIs and more compositional code by understanding such algebraic structures.
Apart from all the other answers already given, it will allow you to write automatic unit tests proving the properties of your code (using the laws of the given category). Perhaps there are already editors which allow for automatic refactoring, given the homomorphisms of the categories.
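For instance, the monoid laws can be checked mechanically. With a property-testing library these would be randomized properties; this self-contained sketch just states the laws as predicates and checks them over sample inputs:

```haskell
-- The three Monoid laws, stated as reusable predicates.
associativity :: (Eq a, Monoid a) => a -> a -> a -> Bool
associativity x y z = (x <> y) <> z == x <> (y <> z)

leftIdentity, rightIdentity :: (Eq a, Monoid a) => a -> Bool
leftIdentity x  = mempty <> x == x
rightIdentity x = x <> mempty == x

-- Spot-check the laws for two known monoids: lists and Strings.
lawsHold :: Bool
lawsHold = and
  [ associativity [1, 2] [3] [4, 5 :: Int]
  , leftIdentity "abc"
  , rightIdentity "abc"
  ]
```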
"Monoid homomorphism" is an extremely compact way to communicate the concept to someone who already knows some first-year undergrad maths. "Divide and conquer" is more specific and ties in less with what a mathematician already knows about monoids.
> "Monoid homomorphism" is an extremely compact way to communicate the concept to someone who already knows some first-year undergrad maths.
This looks like an intentional exaggeration, and I think serves no purpose. I am a math professor now, and my speciality is very algebraic, but I definitely didn't know what a monoid, or even what a homomorphism, was in my first year of undergrad. (I may not even have known what a monoid was in my first year of grad school ….)
Fair enough. I was generalising from the example of myself. (Although I am surprised you didn't know what a homomorphism was. Presumably, therefore, you didn't know what a group was at that time?)
In the same way you had to understand the concept of inheritance before it was useful for you to use it in your code, understanding monoid/monad/<insert category pattern> is useful for you to utilise the pattern.
It's not just a pattern (i.e. a structured way of working around a language deficiency). In a good language you can factor out these concepts explicitly in code: you represent e.g. Monoid as an actual interface, and then write code that makes use of that interface.
In which case you need a name for that interface. When you have an interface that corresponds to something that's already known under a particular name, normally you'd use that name for the interface.
Ah... Monoids-shmonoids. While I appreciate abstract algebra, I feel that forcing it onto students of programming is a terrible idea. An entire generation of potential engineers and scientists (and even mathematicians) was lost when they started teaching children that geometric figures can only be "congruent" to each other, never "equal". In the same way, I would never have been able to use, say, generic lists without second-guessing the idea if they had first told me that those are an example of an important kind of "endofunctor in the category of types" (i.e. a monad).
There are a million monad tutorials for Haskell (and, with respect to their use in Haskell, it's probably better to learn them from there). I think that a good one is "You Could Have Invented Monads!".
This[1] is a good paper about monads in functional programming. It reads quite similarly to a monad tutorial, but with a little more formality.
As for actual category theory, I've enjoyed the first few chapters of "Algebra: Chapter 0", by Aluffi.
If you're after more type theory, the usual recommendations are "Types and Programming Languages", by Pierce, and "Software Foundations", also by Pierce.
I'd suggest http://eed3si9n.com/herding-cats/ . Though a lot of the time I find it's best to wait until you have a concrete use case, otherwise the concept is too abstract to make sense.
If you're looking for textbooks, Abstract and Concrete Categories—The Joy of Cats[0] by Adámek, Herrlich, and Strecker provides a nice introduction and is freely available as a PDF download.