I wish there were a decent set of tools for using regexes as list generators. I've seen a few tools and libraries, but they make weird assumptions and never quite do what I want.
Here's what I'd like:
- Unbounded repetition operators are automatically rewritten as bounded repetition operators based on some global parameters, e.g. * = {0,2}, + = {1,2}, {n,} = {n,n+3} or some such.
- Character classes can be rewritten as well, so . can be kept as "just printable ASCII codes" instead of veering off into all sorts of Unicode bits.
- Exhaustive list generation: generate every possible match using the rules above. Too many tools seem to do some sort of random walk over the FSM, and you never know what you're going to end up with.
- Decent coverage of operators. Too many have weird limits like no {,} operator or no [] character classes.
Getting exhaustive lists like this can be a better way of showing what a regex will match than trying a bunch of examples. A Perl and/or Python library would be extra awesome.
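To make the wish list above concrete, here is a minimal sketch in Python of what such an exhaustive generator might look like. It is not an existing library: the names `expand`, `STAR`, `PLUS`, `OPEN_EXTRA` and `PRINTABLE` are invented for illustration, and it only handles a small subset of regex syntax (literals, backslash-escaped literals, `.`, `[]` classes without negation, groups, `|`, `?`, `*`, `+` and the `{n}`, `{n,}`, `{n,m}`, `{,m}` forms), with no error handling for malformed patterns.

```python
# Hypothetical global parameters (names invented for this sketch).
STAR = (0, 2)        # '*'    is rewritten as {0,2}
PLUS = (1, 2)        # '+'    is rewritten as {1,2}
OPEN_EXTRA = 3       # '{n,}' is rewritten as {n,n+3}
PRINTABLE = "".join(chr(n) for n in range(32, 127))  # alphabet for '.'


def char_class(body):
    """Expand the inside of [...] (ranges and single chars, no negation)."""
    chars, i = set(), 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            chars |= {chr(n) for n in range(ord(body[i]), ord(body[i + 2]) + 1)}
            i += 3
        else:
            chars.add(body[i])
            i += 1
    return chars


def repeat(base, lo, hi):
    """Union of base repeated k times for every k in lo..hi."""
    results, current = set(), {""}
    for count in range(hi + 1):
        if count >= lo:
            results |= current
        if count < hi:
            current = {a + b for a in current for b in base}
    return results


def expand(pattern, dot=PRINTABLE):
    """Return the sorted list of every string the pattern matches."""
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def alternation():
        nonlocal pos
        results = concatenation()
        while peek() == "|":
            pos += 1
            results |= concatenation()
        return results

    def concatenation():
        results = {""}
        while peek() is not None and peek() not in "|)":
            piece = quantified()
            results = {a + b for a in results for b in piece}
        return results

    def quantified():
        nonlocal pos
        base = atom()
        lo, hi = 1, 1
        c = peek()
        if c in ("*", "+", "?"):
            lo, hi = {"*": STAR, "+": PLUS, "?": (0, 1)}[c]
            pos += 1
        elif c == "{":
            end = pattern.index("}", pos)
            low, comma, high = pattern[pos + 1:end].partition(",")
            lo = int(low) if low else 0
            hi = int(high) if high else (lo + OPEN_EXTRA if comma else lo)
            pos = end + 1
        return repeat(base, lo, hi)

    def atom():
        nonlocal pos
        c = pattern[pos]
        pos += 1
        if c == "(":
            inner = alternation()
            pos += 1  # skip the closing ')'
            return inner
        if c == "[":
            end = pattern.index("]", pos)
            body, pos = pattern[pos:end], end + 1
            return char_class(body)
        if c == ".":
            return set(dot)
        if c == "\\":  # escaped literal
            c = pattern[pos]
            pos += 1
        return {c}

    return sorted(alternation())
```

With these settings, `expand("ab*")` gives `['a', 'ab', 'abb']` and `expand("x{1,}")` gives `['x', 'xx', 'xxx', 'xxxx']`. Building whole sets in memory keeps the sketch short; a real tool would probably want a lazy generator, since the output grows quickly with every `.` and every repetition.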
Can you give me a use case for this? It is not very hard to do, given that random matches are already being generated, but I'm skeptical of its usefulness.
1) regex search against an indexed database: use the regex to generate a list of search terms, then look those terms up in a well-indexed database instead of linearly scanning the data with the regex. AFAIK there's no way to build an index that is regex friendly, only wildcard friendly, so most database systems that offer regex search just run a traditional regex match against the contents of each table element. In most cases, even searching for tens of thousands of terms will be faster than the typical regex alternative.
2) education - you might use the generated lists while learning regexes to verify that you've accounted for various edge cases
3) data verification - you're using a regex to verify data, something is slipping through, see if it exists in the edge of generated matches (surprisingly common problem)
4) software testing & quality assurance. Define the possible inputs to a method via regex and generate all possible inputs instead of just testing at boundary conditions etc.
5) security/penetration testing, like #4 but to test cryptographic systems, passwords etc.
there's tons more...
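As a rough illustration of use case 1, the sketch below uses Python's built-in `sqlite3` module. The `hosts` table, its index, the `indexed_lookup` helper and the hand-written candidate list (standing in for the expansion of something like `web[1-3]`) are all made up for the example. Lookups are sent in chunks because SQLite caps the number of bound parameters per statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (name TEXT)")
conn.execute("CREATE INDEX idx_hosts_name ON hosts (name)")
conn.executemany(
    "INSERT INTO hosts VALUES (?)",
    [("web1",), ("web2",), ("web10",), ("db1",)],
)

# Hand-written stand-in for the expansion of the regex web[1-3].
candidates = ["web1", "web2", "web3"]


def indexed_lookup(conn, terms, chunk_size=500):
    """Look up terms with indexed IN (...) queries, one chunk at a time."""
    found = []
    for start in range(0, len(terms), chunk_size):
        chunk = terms[start:start + chunk_size]
        placeholders = ",".join("?" * len(chunk))
        sql = f"SELECT name FROM hosts WHERE name IN ({placeholders})"
        found.extend(row[0] for row in conn.execute(sql, chunk))
    return sorted(found)


matches = indexed_lookup(conn, candidates)
print(matches)  # ['web1', 'web2']
```

Each term becomes an equality lookup that the index can serve, whereas a regex predicate would have to be evaluated against every row. Whether that wins in practice depends on the size of the generated list and on the database, so treat it as something to measure.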
Generating random members isn't sufficient: the runtime can grow extraordinarily long as you keep generating in the hope of hitting every possible member, and you can't be certain when it's safe to halt.
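To put a rough number on that, assume the random generator picks uniformly among n possible matches (an optimistic assumption, since a random walk over an FSM is usually not uniform and would typically need more draws). This is then the classic coupon collector problem, where the expected number of draws to see every member is n times the n-th harmonic number. The function name below is made up for this sketch.

```python
import math


def expected_draws(n):
    """Expected uniform random draws needed to see all n distinct matches."""
    return n * math.fsum(1 / k for k in range(1, n + 1))


for n in (2, 100, 10_000):
    print(n, round(expected_draws(n)))
```

For 100 possible matches that is already about 519 draws on average, and for 10,000 it is close to 98,000. These are only averages, so a random generator still can't tell you that it has produced everything.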
While not a visual tool, I highly recommend txt2regex[1]. It is a regex wizard that helps you build regexes for a number of different environments. It supports:
When you need wizards and debuggers to formulate a question your grandmother would understand, then your API is clearly broken. Regular expressions are a failure. A new interface to 'pattern matching' is long overdue. Smart young people, it's your turn!
As one who always forgets regex after 3 months of not using it, I find them helpful :)
http://pineapple.io/resources/tagged/regular-expressions?sor...
I find this one particularly fascinating, even though I don't fully grasp it: http://jkff.info/articles/ire/