Elasticsearch uses Lucene at the core of its search capability. https://lucene.apache.org/core/

Basically it examines each record you want to index and generates a set of features and metadata to aid in rapidly finding what you're looking for. Some of the features listed on that site are:

  ranked searching -- best results returned first
  many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
  fielded searching (e.g. title, author, contents)
  sorting by any field
  multiple-index searching with merged results
  allows simultaneous update and searching
  flexible faceting, highlighting, joins and result grouping
  fast, memory-efficient and typo-tolerant suggesters
  pluggable ranking models, including the Vector Space Model and Okapi BM25
  configurable storage engine (codecs)
Basically, a standard non-indexed database search (e.g. using LIKE) is fairly stupid and generally defaults to a full scan of every row in the table. A b-tree index (https://en.wikipedia.org/wiki/B-tree), for columns holding only one word, dramatically reduces the scope of that search and makes such lookups almost instantaneous. However, it doesn't help with multi-word text fields, so the database engine has to fall back to the very slow full-table scan and check each record individually.

Lucene notes the position of each word in a text string and has a number of techniques to figure out which records have words similar to the ones you're looking for, at relatively similar positions (i.e. close to each other, in the same order, etc.), and narrows the search scope to those records. In short, it's not searching the individual records, it's searching its own index of what it already knows about the contents of each record.
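
To make that concrete, here's a rough sketch of what an inverted index boils down to, modelled as a plain SQL table (purely conceptual, not how Lucene actually stores anything; all the names are made up):

  -- one row per (word, record, position)
  CREATE TABLE inverted_index (
    term   TEXT,
    doc_id INTEGER,
    pos    INTEGER
  );
  CREATE INDEX idx_term ON inverted_index(term);

  -- "which records contain both 'disk' and 'error'?"
  SELECT doc_id FROM inverted_index WHERE term = 'disk'
  INTERSECT
  SELECT doc_id FROM inverted_index WHERE term = 'error';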

EDIT: however, for a log search a b-tree would still be fine, because log entries generally share a similar structure. For example, if you're looking for a specific error message, that message isn't going to change dramatically from one moment to the next. So having that table/column indexed with a b-tree would let you search for that specific error string and quickly pull up all the results, regardless of table size.
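
As a rough sketch (the table and column names here are hypothetical):

  -- hypothetical log table with a plain b-tree index on the error column
  CREATE TABLE logs (
    id         INTEGER PRIMARY KEY,
    logged_at  TEXT,
    errorvalue TEXT
  );
  CREATE INDEX idx_logs_errorvalue ON logs(errorvalue);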

Just make sure you set up your SQL query to have a line like:

  WHERE column.errorvalue LIKE 'Generic Error Code 1%'
instead of:

  WHERE column.errorvalue LIKE '%Code 1%'
As soon as you put that first '%' in front of the first letter, SQLite ignores any index you have on that column and does a full table scan, which is very slow. (https://www.sqlite.org/optoverview.html#the_like_optimizatio...)
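
If you want to double-check which plan you're actually getting, EXPLAIN QUERY PLAN will tell you. Continuing the hypothetical logs table above (note the LIKE optimization also has case-sensitivity/collation preconditions described in that link):

  -- with the default case-insensitive LIKE, the optimization needs a NOCASE
  -- index, or you can turn on case-sensitive LIKE and keep the BINARY index
  PRAGMA case_sensitive_like = ON;

  EXPLAIN QUERY PLAN
  SELECT * FROM logs WHERE errorvalue LIKE 'Generic Error Code 1%';
  -- expect: SEARCH using idx_logs_errorvalue

  EXPLAIN QUERY PLAN
  SELECT * FROM logs WHERE errorvalue LIKE '%Code 1%';
  -- expect: full SCAN of logs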

That said, if you think you're going to get to 80GB you might want to look at an alternative to SQLite or, at the very least, version your databases by month and use the ATTACH DATABASE command if you need to mine historical data (or even write a small script to search each database individually). SQLite isn't really designed to spread tables across different disks, and it's no fun to regularly back up 80GB files.
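
A rough sketch of the ATTACH approach (the file and table names here are made up):

  -- open the current month's database, then attach an older month's file
  ATTACH DATABASE 'logs_previous_month.db' AS prev;

  -- query both the main database and the attached one
  SELECT * FROM main.logs WHERE errorvalue LIKE 'Generic Error Code 1%'
  UNION ALL
  SELECT * FROM prev.logs WHERE errorvalue LIKE 'Generic Error Code 1%';

  DETACH DATABASE prev;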



Would any of the various full-text search plugins (e.g. https://www.sqlite.org/fts5.html) make searching your 80GB database reasonable?


@snowflakeonice

What a time to be alive! I've just started populating the sqlite virtual fts table. I will report back with my findings!


Any news? I am very interested.


Okay it finally (30 seconds ago!) finished indexing.

The reason it ran out of disk space is that I included 3 columns to index on (in this case: name, path, filename) and it ballooned my 66GB db to 185GB!

However, every single query afterward was instantaneous. Literally milliseconds to pull 90K results from three full-text columns across 500 million rows. And the search words were anywhere in the column, not just the beginning. Incredible. I'm simply blown away.

All I did was run this single command:

  CREATE VIRTUAL TABLE fts USING fts5(hash UNINDEXED, title, path, filename);

Then I wrote a normal INSERT statement to populate it, just like I would a regular table. It was so painless.
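
In rough outline it was something like this (substitute your own source table name for the placeholder 'files'; your column list may differ):

  -- copy the text columns from the existing table into the FTS index
  INSERT INTO fts (hash, title, path, filename)
    SELECT hash, title, path, filename FROM files;

  -- MATCH searches every indexed column at once
  SELECT hash, title FROM fts WHERE fts MATCH 'searchword';

  -- or restrict the match to a single column
  SELECT hash, title FROM fts WHERE fts MATCH 'filename:searchword';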

Just be aware that each column appears to drastically increase the size of the DB.

I'm so excited! I have so many other databases to try this out on!

EDIT: now I'm going to move it off the SSD to a mechanical drive and see if it still holds the same performance.


Wow, thanks for posting your results. This all sounds very promising. I am thinking about gathering a few pieces of data from custom sensors around the house to figure out what I can do to cut down on energy costs (well, that is the excuse ;)). Instead of using Kibana and the like, I'd rather use a lightweight SQLite DB, and your stats make me hopeful that it can handle this.

Also it is all just very interesting.


Final Comment on this: after moving it to a slow mechanical drive the query speed dropped dramatically, as expected. What was almost instantaneous on the SSD took anywhere from 40 to 120 seconds on the slower drive. However, previously the dumb full table scan took anywhere from 120-240 seconds on the SSD and I never even bothered trying on the slow drive!


Hahaha, it's still running. I ended up running out of drive space on the SSD I started it on, which kicked off a lengthy process of shuffling things around, cleaning up indexes, and VACUUMing two very large databases (60GB and 80GB, which ran all night), and finally (as of this morning) restarting the process of populating the FTS table. I will respond when it's finally done, I promise!



