Searching for what is not there using REGEX in only a single step

Bob Bell bbell at hp.com
Tue Jun 1 11:44:00 EDT 2004


On Fri, May 28, 2004 at 01:08:25PM -0400, Greg Rundlett <greg at freephile.com> wrote:
>NOTE:  I know how to solve this problem by processing the text in 
>2 steps, first finding all occurrences of /A(.*)C/ and then searching 
>for B in $1, but I'm wondering if there is some advanced expression for 
>doing it in only one step.
>
>I have an interesting little problem that I'm wondering if someone knows 
>how to solve using regular expressions:
>
>Given some larger text, where you have many subsections that are made up 
>of a token A followed by an indeterminate amount of text NOT including 
>token B and then token C, how can you find those chunks of text?  I've 
>been trying with Perl-compatible Regular Expressions through PHP, but 
>can't come up with a way to do it.

Well, I don't know about PCRE in PHP, but in pure Perl, you could do the 
following:

    /A(?(?=B)(?>.*)|.)*C/

This matches token A followed by token C, with a possible series of 
"stuff" in the middle.  The "stuff" is evaluated conditionally.  At each 
position it uses look-ahead to see whether what's coming matches token B. 
If so, it independently (atomically) matches the rest of the line, 
irrevocably consuming any token C that follows B, so that the required 
match to token C cannot succeed at or beyond B.  Otherwise, the "stuff" 
in the middle matches any character, one character at a time.  The net 
effect is that a match can never extend across token B: the RE either 
backtracks to a token C that comes before B, or fails to match from 
that A.
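
For anyone whose regex engine lacks conditional patterns, here is a rough 
sketch of the same one-step search in Python.  Python's re module does 
not accept a look-ahead as the test of a conditional, so this swaps in 
a different technique: a negative look-ahead checked before every 
character, A(?:(?!B).)*?C.  The function name find_chunks, the tokens 
BEGIN, SKIP and END, and the sample text are all made up for 
illustration.  The middle part is lazy here, so each match stops at the 
nearest token C.

```python
import re


def find_chunks(text, a, b, c):
    """Return substrings that start with a, end with c, and cross no b."""
    # (?:(?!b).)*? consumes one character at a time, refusing any position
    # where token b begins; it is lazy, so it stops at the nearest c.
    pattern = re.compile(
        re.escape(a) + r"(?:(?!" + re.escape(b) + r").)*?" + re.escape(c)
    )
    return pattern.findall(text)


# Hypothetical tokens and sample text, for illustration only.
sample = "BEGIN one END, BEGIN two SKIP three END, BEGIN four END"
print(find_chunks(sample, "BEGIN", "SKIP", "END"))
```

On the sample text this reports the first and last chunks and skips the 
middle one, which has SKIP between BEGIN and END.  As with the Perl 
version, "." does not match a newline by default, so chunks are found 
within a single line only.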

Thanks for the opportunity to learn more about Perl REs. :-)

-- 
Bob Bell



More information about the gnhlug-discuss mailing list