Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Squaring isn't backreferences, though you can square more than a literal. An example squaring is /(foo|bar*){2}/ which is equivalent to /(foo|bar*)(foo|bar*)/, not /(foo|bar*)\g1/.

Backreferences aren't technically regular, and as you say they can take exponential time to match. But the theorem that regular expression equivalence is EXPSPACE-complete applies to real regular expressions, not just perl "regexes".

IIRC the proof is something like, given a Turing machine that runs in space S, to consider traces of the the computation, which are concatenations of S-long strings (plus a head indicator holding the machine's internal state etc). You can make a regex that matches invalid executions of the machine, which are basically of the form

/.* (foo1.{S}bar1 | foo2.{S}bar2 | ...) .*/

where foo1->bar1, foo2->bar2 etc are all the types of invalid transitions (basically, either something on the tape not next to the head changed, or the head changed in a way not indicated in the state table).

You also set rules to enforce the initial state. Then you ask whether:

/wrong initial state | invalid execution | execution that doesn't end in "accept"/

is the same regex as

/.*/

If it's the same, then there are no strings which don't match the first regex; such a string must have the right initial state, a valid execution, and end in "accept". So if the two are the same, then all valid executions end in "reject". Since you aren't constrained to allow only a single state transition rule, this actually shows NEXPSPACE hardness, but that's equal to EXPSPACE by Savitch's theorem. Anyway I could be misremembering this, it's been a while, but it doesn't require backreferences.

The squaring is required to make the {S} part without exponential blowup. Otherwise, I think the problem is only PSPACE-hard (PSPACE-complete? I don't remember).



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: