Yes, it does. Capture groups aren't special here. The issue is a semantic quibbl...

ridiculous_fish · on Nov 28, 2018

> when folks say, "a regex engine that guarantees matching in linear time," what they actually mean is, "a regex engine that guarantees matching in linear time with respect to the input searched where the regex itself is treated as a constant."

Well the Rust docs say "all searches execute in linear time with respect to the size of the regular expression and search text." Their engine compiles to a DFA, not an NFA or PikeVM; I suppose this is the basis for their claim.

> As another example, if you compile the regex to a DFA before matching, then the time complexity is going to be O(n) since every byte of input results in executing a small constant number of instructions, regardless of the size of the regex. However, DFAs typically don't handle capturing groups

Now you have arrived at my question! Rust compiles to a DFA that supports capture groups. My question is whether capture groups ruin the linearity of the DFA matching.

Thanks for the Laurikari's Tagged DFAs reference, I hadn't heard of that. I'll check it out!

burntsushi · on Nov 28, 2018

> Well the Rust docs say "all searches execute in linear time with respect to the size of the regular expression and search text."

Yeah, that's ambiguous phrasing on my part. I meant that it was linear time with respect to both the size of the regex and the search text.

> Their engine compiles to a DFA, not an NFA or PikeVM; I suppose this is the basis for their claim.

No, it doesn't. rust/regex uses some combination of the Pike VM, (bounded) backtracking and a lazy DFA. It will compile a DFA ahead of time in some cases where Aho-Corasick can be used.

> Rust compiles to a DFA that supports capture groups.

No, it uses a lazy DFA to answer "where does it match," but it still must use either the Pike VM or the bounded backtracker to resolve capture groups.

> My question is whether capture groups ruin the linearity of the DFA matching.

Yeah I think I would probably look at Tagged DFAs to answer this. You'll want to check out recent papers that cite Laurikari's work, since I think there have been some developments!