Are there any examples of Haskell being used to do anything really cool and useful to the average programmer? To me it seems like an esoteric language that I would simply not bother using, because no one uses it and deploying it would be a headache.
"really cool and useful to the average programmer" is pretty vague and hard to answer, but I'd say the answer is absolutely yes. The main draw is the ability to build things with less code and have them be MUCH more maintainable in the long run, thanks to strong static types and purity.
Case in point: one company ported a 43k line Groovy app to 8200 lines of Haskell [1]. The resulting Haskell app was also at least 64x faster and eliminated a number of major bugs in the Groovy system. If that's not "really cool and useful to the average programmer", then I don't know what is.
But this is just the tip of the iceberg. Haskell has an awesome library called diagrams [2] for the construction of declarative graphics. It can construct some impressively complex images with a remarkably small amount of code [3]. The pandoc universal document converter [4] is written in Haskell. There's also the classic paper comparing Haskell, Ada, C++, and Awk [5]. I'm sure there are other really cool and useful things that I'm forgetting.
I'm hoping that we can find a GHCJS-based FRP solution at some point soon. I've seen Elm in action, and it's pretty neat, same with PureScript, but I'd rather stay within legitimate Haskell for the time being.
Note that you can translate the skills you learn from Haskell to other languages. The way I write code in Python and JavaScript for my day job has changed drastically since I learned Haskell.
The major change I see is that my Python code has become more pure. Side-effect-free code is easy to reason about: I can jump back into it even after many months and understand it without much effort. Using function composition to glue things together makes complex systems very simple.
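To make the "pure functions glued by composition" idea concrete, here is a minimal sketch in Haskell itself. The `slugify` function is a made-up example, not something from the comment above; the point is only that each stage is a small pure function and the pipeline is their composition.

```haskell
import Data.Char (isAlphaNum, isSpace, toLower)
import Data.List (intercalate)

-- Hypothetical example: turn a title into a URL-friendly slug.
-- Reads right to left: drop punctuation, lowercase, split into words,
-- then join the words with dashes. No stage touches any outside state.
slugify :: String -> String
slugify =
  intercalate "-"
    . words
    . map toLower
    . filter (\c -> isAlphaNum c || isSpace c)

main :: IO ()
main = putStrLn (slugify "Hello, World!  Again")  -- prints hello-world-again
```

Because nothing here depends on hidden state, each stage can be understood and tested in isolation, which is the property that carries over to Python or JavaScript code written in the same style.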
Imagine you implement some function. As a trivial example, let's say you wrote a plus operator (spelled as (+) when not being applied). One of the properties it should have is associativity.
To check whether it is associative, you can write (in the GHCi REPL):
> import Test.QuickCheck
> let associative op x y z = (x `op` y) `op` z == x `op` (y `op` z)
> quickCheck $ associative (+)
+++ OK, passed 100 tests.
QuickCheck just applied our "associative" function to 100 random triples and made sure that (+) behaved associatively in all of them.
Now, the (-) operator is of course not associative, so let's see what QuickCheck says about that:
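(The REPL session for this step did not survive in the post. Presumably it reused the same "associative" helper with (-) in place of (+). As a standalone sketch, with the type pinned to Integer as an assumption so it compiles outside GHCi, and with the output left out because the test and shrink counts vary from run to run:)

```haskell
import Test.QuickCheck (quickCheck)

-- Same property as in the GHCi session, with an explicit type so it
-- compiles as a standalone file (GHCi would default to Integer anyway).
associative :: (Integer -> Integer -> Integer)
            -> Integer -> Integer -> Integer -> Bool
associative op x y z = (x `op` y) `op` z == x `op` (y `op` z)

main :: IO ()
main = do
  quickCheck (associative (+))  -- expected to pass
  quickCheck (associative (-))  -- expected to fail and report a counterexample
```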
QuickCheck now spells out a simple example triplet (x,y,z = 0,0,1) which breaks the associativity claim.
This kind of testing is called "property testing" and is an awesome way to test pure functions: it finds bugs far more easily (and more thoroughly) than traditional unit tests.
For more "real-worldy" examples, consider QuickChecking that your URL parse function is reversible (build back a URL from the parsed components). Ditto with any kind of parser/builder or [de-]serializer.
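A round-trip property of that kind might look like the sketch below. A real URL parser is too long to show here, so this uses a made-up comma-separated encoding of integers as a stand-in; `render`, `parse`, and `splitOnComma` are hypothetical helpers, not library functions.

```haskell
import Data.List (intercalate)
import Test.QuickCheck (quickCheck)

-- Hypothetical serializer: a list of Ints as comma-separated text.
render :: [Int] -> String
render = intercalate "," . map show

-- Hypothetical parser, only meant to accept what 'render' produces.
parse :: String -> [Int]
parse "" = []
parse s  = map read (splitOnComma s)

splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnComma rest

-- The law being checked: parsing what we rendered gives the input back.
prop_roundTrip :: [Int] -> Bool
prop_roundTrip xs = parse (render xs) == xs

main :: IO ()
main = quickCheck prop_roundTrip
```

The same shape works for any parser/builder pair: generate a value, serialize it, parse it back, and compare.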
Most pure functions are expected to obey various laws, and QuickCheck is a great way to verify they do.
* Parallelism annotations!
Sprinkling `par` annotations on your code (or replacing `map` with `parMap`) is guaranteed not to change the semantics of your program. The program may speed up (due to extra parallelism) or slow down (due to extra overhead), but the annotations are safe to throw around.
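A small sketch of the `map` to `parMap` swap, assuming the `parallel` package (which provides Control.Parallel.Strategies) is installed. The naive `fib` is just a placeholder workload. To actually see more than one core in use, the program has to be built with -threaded and run with +RTS -N.

```haskell
import Control.Parallel.Strategies (parMap, rseq)

-- Deliberately slow placeholder workload.
fib :: Int -> Integer
fib n = if n < 2 then toInteger n else fib (n - 1) + fib (n - 2)

main :: IO ()
main = do
  let inputs     = [20 .. 27]
      sequential = map fib inputs
      -- Same computation, with each element sparked in parallel.
      parallel   = parMap rseq fib inputs
  -- Only the evaluation order differs, so this prints True.
  print (sequential == parallel)
```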
* Type safety
No null dereference errors (though this is finally reaching some mainstream languages).
Exhaustive handling of all inputs can easily be verified by the compiler.
Data types describe your data precisely, yet are declared more concisely than the imprecise data types of other languages.
Advanced features let you use the type checker to get more compile-time assurances (e.g. that a Red Black Tree respects all the RB invariants).
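A minimal illustration of the first three points, using only the Prelude. `Shape`, `area`, `ages`, and `describeAge` are made-up names for the sake of the example.

```haskell
{-# OPTIONS_GHC -Wall #-}

-- A precise data type in two lines: a shape is exactly one of these.
data Shape
  = Circle Double
  | Rectangle Double Double

-- If a constructor is added to Shape and this function is not updated,
-- -Wall (via -Wincomplete-patterns) reports the missing case.
area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h

ages :: [(String, Int)]
ages = [("alice", 31), ("bob", 27)]

-- 'lookup' returns a Maybe, so the "not found" case has to be handled
-- explicitly; there is no null to dereference by accident.
describeAge :: String -> String
describeAge name = case lookup name ages of
  Just age -> name ++ " is " ++ show age
  Nothing  -> name ++ " is unknown"

main :: IO ()
main = do
  print (area (Rectangle 2 3))
  putStrLn (describeAge "alice")
  putStrLn (describeAge "carol")
```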
* Expressiveness
The kind of abstraction power Haskell gives you is quite unique.
This updates the game state by traversing all the units whose distance from target <= 1.0, and subtracts 3 from their health. This kind of expressive power is based on library-level abstractions (here, the lens library) with no language support or even macros.
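(The one-liner being described is not shown above. What follows is only a sketch of what such an update could look like with the lens package; `GameState`, `Unit`, their fields, and `distance` are all hypothetical names, and the original snippet may have been written differently.)

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Control.Lens
import Control.Monad.State (State, execState)

type Point = (Double, Double)

-- Hypothetical game model, for illustration only.
data Unit = Unit
  { _position :: Point
  , _health   :: Int
  } deriving (Show, Eq)

newtype GameState = GameState
  { _units :: [Unit]
  } deriving (Show, Eq)

makeLenses ''Unit
makeLenses ''GameState

distance :: Point -> Point -> Double
distance (x1, y1) (x2, y2) = sqrt (dx * dx + dy * dy)
  where
    dx = x1 - x2
    dy = y1 - y2

-- Visit every unit within distance 1.0 of the target and subtract 3
-- from its health. Everything here is ordinary library code from lens.
damageNear :: Point -> State GameState ()
damageNear target =
  units . traversed
        . filtered (\u -> distance target (u ^. position) <= 1.0)
        . health -= 3

main :: IO ()
main = do
  let before = GameState [Unit (0.5, 0) 10, Unit (5, 5) 10]
  -- The first unit is in range and drops to 7; the second stays at 10.
  print (execState (damageNear (0, 0)) before)
```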
It seems like you already know the answer to your question. There are plenty of other programming-related things to learn besides programming languages, if you don't want to learn (esoteric?) programming languages. Enough to keep you busy for a lifetime.