Pattern Matching & Replacement Prob

Lawrence Tilly mail.list.tilly at gmail.com
Wed Dec 20 11:09:35 EST 2006


Greetings, all.

I've continued to lurk, but it has been a while since I've needed to
post here.  Now, however, I'm turning to the list for suggestions.

I have a file with a lot of commands.  Each command is enclosed in
double quotes.  Within the double quotes are several segments, each
delimited by single quotes.

The problem is that some of those internal segments are filled by
free-text comments from users and...well...users will be users.  Some
like to put in words like " Dan's " or " it's " and so those wonderful
single quotes are messing up the parsing of the command.

The basic pattern that is safe and consistent is a single quote,
followed by zero or more characters, then another single quote
followed immediately by a comma.  For example:

'asbckwwe',   or ' ',
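
To show how a stray apostrophe breaks that pattern, here is a small
sketch, in Python purely for illustration.  The names SAFE_SEGMENT and
find_segments and the sample strings are made up for this example, and
it assumes the text passed in is the inside of one command with the
surrounding double quotes already stripped.

```python
import re

# Hypothetical name: one well-formed segment is a quote, any run of
# non-quote characters, then a quote followed immediately by a comma.
SAFE_SEGMENT = re.compile(r"'[^']*',")


def find_segments(command_body):
    """Return every substring that looks like a well-formed segment."""
    return SAFE_SEGMENT.findall(command_body)


clean = "'asbckwwe',' ',"
broken = "'abc','Dan's car',' ',"   # made-up sample with an apostrophe

print(find_segments(clean))    # both segments are found intact
print(find_segments(broken))   # the middle segment comes back as 's car',
```

In the second call the match for the middle segment starts at the
apostrophe in Dan's rather than at the segment's real opening quote,
which is the mis-parse described above.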

What I need to do is this:
  1.  Find the first ' and use it to flag the start of that segment.
  2.  Continue through the string until the next ' is encountered.
  3.  If that ' is followed immediately by a comma (no white space,
etc.), mark that segment as complete and start again at step 1.
  4.  If that ' is followed by any other character or white space,
escape it by adding a second ' and then continue searching through
that segment.
  5.  Repeat #4 until #3 occurs, since the user may have typed more
than one word with an apostrophe in it in the same comment block.
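
The steps above can be sketched as a single pass over the characters.
This is a minimal sketch in Python, used only to pin down the logic;
the same loop should carry over to Perl or awk.  The function name
escape_apostrophes and the sample input are made up for illustration.
It assumes the input is the inside of one command (outer double quotes
already removed), and it additionally treats a quote at the very end
of the input as closing its segment, which the steps do not specify.

```python
def escape_apostrophes(command_body):
    """Double every single quote inside a segment that does not close it."""
    out = []
    in_segment = False
    last = len(command_body) - 1

    for i, ch in enumerate(command_body):
        if ch != "'":
            out.append(ch)
        elif not in_segment:
            # Step 1: the first quote seen outside a segment opens one.
            in_segment = True
            out.append(ch)
        elif i == last or command_body[i + 1] == ",":
            # Step 3: a quote followed immediately by a comma closes the
            # segment.  Treating end of input the same way is an assumption.
            in_segment = False
            out.append(ch)
        else:
            # Steps 4 and 5: any other quote is an apostrophe in the
            # comment text, so escape it by doubling and keep scanning.
            out.append("''")

    return "".join(out)


sample = "'abc','Dan's car, it's red',' ',"
print(escape_apostrophes(sample))
# prints: 'abc','Dan''s car, it''s red',' ',
```

Two limits follow directly from the rules as stated: a comment that
contains an apostrophe followed immediately by a comma is
indistinguishable from the end of a segment, and running the function
twice on the same text doubles the already-escaped quotes again.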

I started to address this problem with awk since I am weak on
Perl.  However, I am pretty sure Perl would be an easier solution if I
knew the language better.  Plus, some of our other pre-execution
parsing already takes place in a Perl script, and this function could
be added right into that.  So...I hope someone here may be able to
offer up some examples.

advTHANKSance!
-Lawrence


More information about the gnhlug-discuss mailing list