Gremlin was written by a genius-level developer to be used by other genius-level developers. There are maybe a handful of Gremlin experts in the entire world and fewer than 100 who are any good at it.
It is extremely powerful, but after a few lines, the mental acrobatics needed to understand what the query does are beyond your average developer.
My first paid Neo4j gig 7 years ago was writing a rules engine in Gremlin. It was about 25 lines of code. If you were to ask me today what each of those lines did, I would be at a loss. So would anyone who didn't live in those specific queries day in and day out.
Graph adoption was severely limited by its use. Cypher can be learned in a day, and "business people" can look at a Cypher query and understand what is going on for the most part.
It takes about a week to "bolt on" Gremlin to any database (I've done it myself), which is why you see it so often. It takes months to be any good at it.
Far from a genius myself, I've quite enjoyed using Gremlin and the Apache Tinkerpop project. If you start thinking of it as a functional language, I believe that helps clarify quite a bit what is going on.
Unfortunately, when using it with a distributed backend (Cassandra, for example), having to write query templates in Groovy to take advantage of bit-wise comparisons for parameterization was extremely painful, mostly because I found Groovy very awkward.
But there are language-native protocol implementations of Gremlin now.
The little self-contained Apache Tinkerpop project was always fun to play with toy graphs in.
And there was an ambitious project to implement the Tinkerpop engine on top of a Redis backend (akin to RedisGraph which is now an official module using Cypher) but it is far from a straightforward project. There is even a Tinkerpop implementation on top of PostgreSQL which looks interesting.
I'm not sure this is a death blow for Tinkerpop so much as a marketing coup de grâce for Neo4j. They have a strong product (despite the index-free adjacency!) and even stronger branding behind it. Anything that brings graph DB/technologies into the mainstream is always nice, though. Not that everything in the world is a graph problem... but surely there are plenty that can be classified as such.
The Gremlin traversal language is one piece of a complete database query as run by TinkerPop's Rexster database. You can see it as a lazy sequence or stream API (think SRFI-41 or R7RS Scheme generators) with syntactic sugar optimized for property graphs.
To take full advantage of TinkerPop Rexster you really need to embed the Gremlin DSL inside a Turing-complete language (like Groovy) and execute that.
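To make the lazy-stream reading concrete, here is a minimal sketch in plain Python generators. It is not TinkerPop code: the toy graph, the data, and the helper names `V`, `has`, `out`, and `values` are all made up for illustration, loosely mirroring the shape of a traversal like `g.V().has('name', 'alice').out('knows').values('name')` embedded in a host language.

```python
from itertools import islice

# Toy property graph (hypothetical data, not from any real database).
VERTICES = {
    1: {"label": "person", "name": "alice"},
    2: {"label": "person", "name": "bob"},
    3: {"label": "person", "name": "carol"},
    4: {"label": "software", "name": "gremlin"},
}
EDGES = [  # (source id, edge label, target id)
    (1, "knows", 2),
    (1, "knows", 3),
    (2, "created", 4),
    (3, "created", 4),
]


def V():
    """Lazily yield every vertex id."""
    yield from VERTICES


def has(stream, key, value):
    """Keep only vertices whose property `key` equals `value`."""
    return (v for v in stream if VERTICES[v].get(key) == value)


def out(stream, label):
    """Follow outgoing edges carrying the given label."""
    for v in stream:
        for src, edge_label, dst in EDGES:
            if src == v and edge_label == label:
                yield dst


def values(stream, key):
    """Map each vertex id to one of its property values."""
    return (VERTICES[v][key] for v in stream)


# Nothing is evaluated when the pipeline is built, only when it is consumed.
traversal = values(out(has(V(), "name", "alice"), "knows"), "name")
first = list(islice(traversal, 1))  # pulls just enough to produce one result
rest = list(traversal)              # drains whatever is left
print(first, rest)
```

Each step only pulls from the previous one on demand, which is the property that makes the "stream with graph-flavored sugar" description fit. The host language supplies the loops, variables, and control flow around it.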
I think Gremlin failed because of a) the SQL-like look of Cypher queries, b) a long-running, massive marketing campaign by the company behind Cypher, and c) TinkerPop (and JanusGraph) losing momentum after the TinkerPop developers were hired by the company behind Cassandra.
All these narrow data expert systems that persist data on-disk (!) are doomed to fail! The future is ordered key-value stores and multi-model databases with ACID transactions.
Gremlin is imperative, while Cypher is declarative.
Mostly: if you want the most performant and expressive language, choose Gremlin.
If you want the easiest one and what you implement is standard and not very complex, then use Cypher.
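A rough way to picture the imperative versus declarative split, sketched in plain Python over a toy edge list. Everything here is hypothetical (the data, `friends_imperative`, and the tiny `match` helper); it only imitates the two styles and says nothing about how either engine actually executes queries.

```python
# Toy graph as (source, label, target) triples (hypothetical data).
EDGES = [
    ("alice", "KNOWS", "bob"),
    ("alice", "KNOWS", "carol"),
    ("bob", "CREATED", "gremlin"),
]


# Imperative style, roughly the spirit of
#   g.V().has('name', 'alice').out('KNOWS')
# The caller spells out each step and the order in which it happens.
def friends_imperative(start):
    result = []
    for src, label, dst in EDGES:
        if src == start and label == "KNOWS":
            result.append(dst)
    return result


# Declarative style, roughly the spirit of
#   MATCH (a {name: 'alice'})-[:KNOWS]->(b) RETURN b
# The caller describes a pattern (None marks an unbound variable) and
# leaves the evaluation strategy to the matcher.
def match(pattern):
    src_p, label_p, dst_p = pattern
    return [
        (s, l, d)
        for s, l, d in EDGES
        if src_p in (None, s) and label_p in (None, l) and dst_p in (None, d)
    ]


friends_declarative = [d for _, _, d in match(("alice", "KNOWS", None))]
print(friends_imperative("alice"), friends_declarative)
```

Both give the same answer here. The difference is who decides the evaluation order: in the first form the query author does, in the second the engine is free to pick.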
Gremlin is everywhere. It's a horrible name and sort of difficult to reason about at first, but it works, it's fast, and a lot of databases support it.
I do prefer the declarative approach that Cypher uses, but for most things Gremlin got the job done easier and faster.
I have 101-level experience with both. Cypher is amazingly intuitive and simple; Gremlin, not as much. Just from a dumb user perspective (like me), Cypher felt more like Python, Gremlin more like C++. Both are great, just with a different learning curve and entry bar.
A lot of the Google-able references talk about how Gremlin is more optimizable than Cypher, etc.