The results are shocking because the native regular expression engines have thousands of man hours poured into them, with specialized JITs, exotic optimizations, etc. And your page is reporting the native one at 10x slower!
One thing I spotted is that you're asking the native engines to record capture groups, but your wasm implementation doesn't support those. For fairness you might switch to non-capturing groups: `(?:bc)` instead of `(bc)`. However this cannot explain the magnitude of the difference.
I dug into it some more, reducing it to this case:
/^(?:abc*d)*$/
what happens is that the backtracking engine has to be prepared to backtrack into every iteration of the outermost loop. This means that the backtracking stack grows as the length of the input; the engine spends most of its time allocating memory! These nested loops definitely make the NFA approach look good.
Regardless it's a cool project, thanks for sharing!
There is nothing shocking here, parent implements strictly regular expressions and compile them to DFA, so of course it will be fast, especially using only ASCII characters and hand chosen regular expressions. Russ Cox articles covers this very well.
One thing I spotted is that you're asking the native engines to record capture groups, but your wasm implementation doesn't support those. For fairness you might switch to non-capturing groups: `(?:bc)` instead of `(bc)`. However this cannot explain the magnitude of the difference.
I dug into it some more, reducing it to this case:
what happens is that the backtracking engine has to be prepared to backtrack into every iteration of the outermost loop. This means that the backtracking stack grows as the length of the input; the engine spends most of its time allocating memory! These nested loops definitely make the NFA approach look good.Regardless it's a cool project, thanks for sharing!