I was super excited until I read that you have to submit requests to be voted on. While I understand the difficulty (impossibility?) of having this kind of service on-demand, I really would rather not have to submit my obscure and possibly business-intelligence related queries to a community for voting. This could be a game-changing service if they could somehow make it on-demand.
It is on-demand, and the number of votes is not a prerequisite (though it does help). However, we do want to keep an eye on the number of grep jobs we can run per day without affecting other services' performance. You should submit your grep!
Here is an example web grep that I ran to find the top-ranked sites that used kissmetrics.com for user tracking. Last month there was a huge blow-up over Kissmetrics possibly using ETags and other hacks to track users across multiple sites.
Grep is a mapjob which takes hours to run. You'll be waiting quite a while until anyone can afford to quickly run regex queries against billions of documents! And by then, there will be hundreds of billions of documents.
Google Code Search does do regexes against an impressively large set of documents nearly instantly, though it's clearly much smaller than the set of all webpages. It'd be interesting to know how much Google could scale it; could they handle 100x the number of documents in the current code search? 10,000x?
One thing you should note is that Google Code Search, as far as I know, supports regular expressions that are actually regular. This means you can't have an expression like /(ab..)\1/, for example.
All in all, re2, the regular expression engine that Google Code Search uses, is a very interesting project; you can read about it on its Google Code page: http://code.google.com/p/re2/.
The issue is not so much how much CPU time the regex evaluation takes up; it's the I/O time of loading every byte of every page we've crawled.
That being said, re2 does look pretty cool... having a guarantee that no regex can blow up is pretty nice, on top of the overall speed improvement.
We do plan to eventually support regexes, but we'll introduce them slowly with limited support rather than jumping head-first into full PCRE. Of course, I'm only talking about this within the scope of webgrepper.