Hacker News new | past | comments | ask | show | jobs | submit login

You can also implement linear-time regex matching in user code with webassembly. In my tests it was faster than native regex matching (https://jasonhpriestley.com/regex-dfa)



As someone familiar with JS engine regex internals, those results are totally shocking to me. How can I try your example? In particular I wasn't able to figure out which regexes you are testing with.


I tested with this regex:

  /^abcd(a(b|c)*d)*a(bc)*d$/
I used long strings of repeated "abcdabcd" as the test strings.

It's possible I made a mistake somewhere, and I can put together an open-source repo with the test setup when I get the time. But I'm curious why you find the results shocking?


The results are shocking because the native regular expression engines have thousands of man hours poured into them, with specialized JITs, exotic optimizations, etc. And your page is reporting the native one at 10x slower!

One thing I spotted is that you're asking the native engines to record capture groups, but your wasm implementation doesn't support those. For fairness you might switch to non-capturing groups: `(?:bc)` instead of `(bc)`. However this cannot explain the magnitude of the difference.

I dug into it some more, reducing it to this case:

    /^(?:abc*d)*$/
what happens is that the backtracking engine has to be prepared to backtrack into every iteration of the outermost loop. This means that the backtracking stack grows as the length of the input; the engine spends most of its time allocating memory! These nested loops definitely make the NFA approach look good.

Regardless it's a cool project, thanks for sharing!


There is nothing shocking here, parent implements strictly regular expressions and compile them to DFA, so of course it will be fast, especially using only ASCII characters and hand chosen regular expressions. Russ Cox articles covers this very well.


Where can I find the "wat2wast utility" mentioned in your article?


It's part of https://github.com/WebAssembly/wabt

I think I had to build from source.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: