This is a great "get to know ES" project for Elasticsearch! It frees you a bit from relying on Google's search while also teaching you how to use queries and aggregations. Awesome work. Will give this a shot later.
Worth noting that if you like the idea of indexing your mail, there's also Notmuch. It's a dedicated email search and indexing tool and is very nice when paired with something like OfflineIMAP to sync messages.
What I really want is mutt's UI, but with a SQLite3/PostgreSQL backend, without mutt iterating over the whole mailbox when opening it, and with an async IMAP client that reconnects as needed.
Similar concept and similar speed, but notmuch is a little more actively developed. I also find the notmuch command line interface a bit easier and the various tools built on top of it to be better. Alot, the terminal UI I mentioned in the parent comment, is almost exactly what I want in a mail program.
In a way, though the instructions specifically handle Gmail labels, which aren't present in other mbox files. Otherwise it's pretty general for any mbox email dump.
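For anyone curious what the Gmail-specific part amounts to, here is a rough sketch using Python's standard `mailbox` module. It assumes the export stores labels as a comma-separated `X-Gmail-Labels` header, which is what Gmail exports typically contain; the function name `iter_messages` and the output field names are made up for illustration and are not taken from the article.

```python
import mailbox


def iter_messages(mbox_path):
    """Yield one plain dict per message in an mbox file.

    Assumption: Gmail labels, if present, live in a comma-separated
    X-Gmail-Labels header. Other mbox dumps simply won't have it.
    """
    for msg in mailbox.mbox(mbox_path):
        raw_labels = str(msg.get("X-Gmail-Labels", ""))
        # Split on commas and drop empty entries; no header means no labels.
        labels = [part.strip() for part in raw_labels.split(",") if part.strip()]
        yield {
            "subject": str(msg.get("Subject", "")),
            "from": str(msg.get("From", "")),
            "labels": labels,
        }
```

For a non-Gmail mbox the `labels` list just comes back empty, so the same loop works for any dump.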
Yes. I'm just adding to Cyberdog's comment that it's not immediately obvious from the title of the article that the instructions are for indexing a static file.
The title makes it sound like instructions to set up an alternative API to Gmail search. I was thinking something like Algolia.
What is that http://ohardt.us/download-gmail-mailbox URL where you're supposed to download your email? It looks fishy, though the hostname doesn't even resolve, so I'm not sure what's going on.
It's a teaching tool. The benefit is to show someone how to use ES for a real-world thing.
With that said, I suspect that with tuning based on your search patterns and usage, you could get more accurate search results, since you control the indices, stop words, etc.
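To make the "you control the stop words" point concrete, here is a minimal sketch of an index body with a custom stop-word list. It only builds the JSON; nothing is sent to a cluster. The names `mail_stop`, `mail_text`, and `build_index_body`, the field names, and the example stop words are all hypothetical, and the typeless mapping shape assumes Elasticsearch 7 or later (older versions nest mappings under a type name).

```python
import json


def build_index_body(stop_words):
    """Return an index-creation body using a custom stop-word filter.

    Assumptions: filter/analyzer/field names are illustrative only, and
    the mapping layout targets Elasticsearch 7+ (no mapping types).
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Built-in "stop" token filter with our own word list.
                    "mail_stop": {"type": "stop", "stopwords": list(stop_words)}
                },
                "analyzer": {
                    "mail_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "mail_stop"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "subject": {"type": "text", "analyzer": "mail_text"},
                "body": {"type": "text", "analyzer": "mail_text"},
            }
        },
    }


if __name__ == "__main__":
    # Example: treat common mail noise words as stop words.
    print(json.dumps(build_index_body(["re", "fwd", "unsubscribe"]), indent=2))
```

Which words belong in that list depends entirely on your own mail, which is sort of the point of running this yourself.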
Thanks for putting this together.