Lucene's FuzzyQuery is 100 times faster in 4.0 (2011) (mikemccandless.com)
48 points by sciurus on Aug 23, 2014 | 6 comments



At first, I was disappointed that the Levenshtein Automaton is not some sort of golem. But then I read the rest of the post and realised that the Lucene committers have at least created a pretty impressive Frankenstein's monster.

Does the Python code execute at runtime as part of Lucene, or is it only used in the build phase, to generate Java code which then executes as part of Lucene? It sounds like the Python code implements the DFA generation, which would surely have to happen at runtime?
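For anyone unsure what the automaton has to decide for each term, here is a minimal sketch of the check itself. It is not how Lucene does it (the post describes building a DFA up front so terms can be tested without recomputing anything per term); this version just steps an edit-distance row over the candidate one character at a time, which is the slow simulation the DFA replaces. The function name `within_distance` is made up for illustration.

```python
def within_distance(query: str, candidate: str, max_edits: int) -> bool:
    """Return True if candidate is within max_edits Levenshtein edits of query.

    Illustrative sketch only: plain insert/delete/substitute edits, no
    transpositions, and no precomputed DFA.
    """
    # row[i] is the edit distance between query[:i] and the part of the
    # candidate consumed so far. Think of the whole row as the current state.
    row = list(range(len(query) + 1))
    for ch in candidate:
        new_row = [row[0] + 1]
        for i, q in enumerate(query, start=1):
            cost = 0 if q == ch else 1
            new_row.append(min(
                new_row[i - 1] + 1,   # skip a character of the query
                row[i] + 1,           # skip a character of the candidate
                row[i - 1] + cost,    # match or substitute
            ))
        row = new_row
        # The row minimum never decreases, so this state can never accept.
        if min(row) > max_edits:
            return False
    return row[-1] <= max_edits


print(within_distance("lucene", "lucine", 1))
print(within_distance("lucene", "python", 2))
```

The early exit is the part that corresponds to a dead state in the automaton: once every entry exceeds the budget, no suffix can bring the candidate back within range.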


I wonder whether they reached out to the authors of that paper, or the author of the Python implementation? You'd think people would be pretty ready to help a big open source project like Lucene. Particularly the academics: it's unlikely they'd find a higher-impact open source use of their algorithm!


>We wrote our own scary Python code, tapping into various functions in the Moman package, to auto-generate unreadable Java code

Hnggg

If I wrote this code, I don't think I could ever convince myself I did it correctly.


Code generation does serve its purpose, but I would argue that it takes only a bit more effort to make the output readable, with generated comments as well.

I once had to build an Oracle PL/SQL interface for a rather complicated integration project between two old systems, and the business requirements for what had to go where changed daily.

In the end I opted to generate the PL/SQL interface and supporting code from a spreadsheet that the business always kept up-to-date with the latest changes to the requirements. Ultimately they could change things to their hearts' content and I would only have to press a button to re-generate it all.
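To make the "readable output with generated comments" point concrete, here is a rough sketch of the idea. It emits Python rather than PL/SQL purely to keep the example self-contained, and the sheet layout (`source_column`, `target_column`, `note`) and the names `SHEET`, `generate_copy_function` and `copy_record` are all invented for illustration, not taken from the project described above.

```python
import csv
import io

# Hypothetical mapping sheet, as it might be exported from a spreadsheet.
SHEET = """source_column,target_column,note
cust_nm,customer_name,Renamed per business request
ord_dt,order_date,Dates stay in source format
"""


def generate_copy_function(sheet_text: str, func_name: str = "copy_record") -> str:
    """Turn the mapping sheet into source code, one commented line per row."""
    rows = list(csv.DictReader(io.StringIO(sheet_text)))
    lines = [
        f"def {func_name}(source: dict) -> dict:",
        '    """GENERATED from the mapping sheet; do not edit by hand."""',
        "    target = {}",
    ]
    for row in rows:
        # Carry the business note into the output so the result stays readable.
        lines.append(f"    # {row['note']}")
        lines.append(
            f"    target[{row['target_column']!r}] = source[{row['source_column']!r}]"
        )
    lines.append("    return target")
    return "\n".join(lines) + "\n"


generated = generate_copy_function(SHEET)
print(generated)

# Load the generated code to show that it actually runs.
namespace = {}
exec(generated, namespace)
print(namespace["copy_record"]({"cust_nm": "Ada", "ord_dt": "2014-08-23"}))
```

The only extra work over an unreadable generator is one more column in the sheet and one more line emitted per row.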


I remember a story like that on thedailywtf. Was it yours? :)




