Searching for what is not there using REGEX in only a single step
Greg Rundlett
greg at freephile.com
Fri May 28 13:07:01 EDT 2004
NOTE: I know how to solve this problem by processing the text in two
steps, first finding all occurrences of /A(.*)C/ and then searching for
B in $1, but I'm wondering if there is some advanced expression for
doing it in only one step.
I have an interesting little problem that I'm wondering if someone knows
how to solve using regular expressions:
Given some larger text, where you have many subsections that are made up
of a token A followed by an indeterminate amount of text NOT including
token B and then token C, how can you find those chunks of text? I've
been trying with Perl-compatible Regular Expressions through PHP, but
can't come up with a way to do it.
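
One possible single-step approach is to check a negative lookahead before
every character consumed between A and C, so the match can never step over
B. The sketch below is only an illustration under assumptions: it is
written in Python rather than PHP (the lookahead syntax (?!...) is the same
in PCRE), and the function name and sample tokens are hypothetical.

```python
import re

def find_chunks(text, a, b, c):
    """Return each chunk that starts with a, ends with c, and has no b in between."""
    pattern = re.compile(
        re.escape(a)
        # Consume one character at a time, but only where b does not start.
        + r"(?:(?!" + re.escape(b) + r").)*?"
        + re.escape(c),
        re.DOTALL,  # let "." cross line breaks, since chunks may span lines
    )
    return [m.group(0) for m in pattern.finditer(text)]

# Hypothetical sample: the middle chunk contains B and is skipped.
print(find_chunks("A1C AxBC AyC", "A", "B", "C"))
```

The lazy quantifier stops at the first C after each A, and any attempt that
would have to cross B simply fails at that starting point.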
For example,
I have an XML file, with a bunch of records. Some records are fine.
Others are missing a chunk. I want to find the broken records and
insert the missing tags.
Broken Record:
</fh>
30101 Agoura Ct., #115<br /></location_addr1>
<location_addr2></location_addr2>
Fixed Record:
</fh>
<location id="">
<location_name>
</location_name>
<location_addr1>30101 Agoura Ct., #115<br /></location_addr1>
<location_addr2></location_addr2>
I thought I would be able to find </fh> followed by </location_addr1>
and use a negative lookbehind assertion to say that <location_addr1> was
not present. However, not knowing the length of the text between </fh> and
</location_addr1> seems to make this impossible.
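
For what it's worth, a lookahead (rather than a lookbehind) does not need
to know the length in advance. Here is a rough sketch of how the broken
records might be found and repaired in one substitution. It is an
assumption-laden illustration, not a tested fix for the real file: it is
written in Python instead of PHP, the names BROKEN, MISSING and repair are
made up, and the inserted tags are copied from the Fixed Record example
above.

```python
import re

BROKEN = re.compile(
    r"(</fh>\s*)"                    # group 1: </fh> plus trailing whitespace
    r"((?:(?!<location_addr1>).)*?"  # group 2: text with no opening tag in it ...
    r"</location_addr1>)",           # ... up to and including the closing tag
    re.DOTALL,
)

# Tags assumed to be missing, taken from the Fixed Record example.
MISSING = '<location id="">\n<location_name>\n</location_name>\n<location_addr1>'

def repair(xml_text):
    """Insert the missing tags right after </fh> in broken records only."""
    return BROKEN.sub(lambda m: m.group(1) + MISSING + m.group(2), xml_text)

broken = (
    "</fh>\n"
    "30101 Agoura Ct., #115<br /></location_addr1>\n"
    "<location_addr2></location_addr2>\n"
)
print(repair(broken))
```

Records that already contain <location_addr1> are left alone, because the
lookahead refuses to step over that tag and the match attempt fails.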
--
FREePHILE
We are 'Open' for Business
Free and Open Source Software
http://www.freephile.com
(978) 270-2425
"Paul Lynde to block..."
-- a contestant on "Hollywood Squares"