question: text substitution using Perl

Zhao Peng greenmt at gmail.com
Mon Oct 23 13:38:58 EDT 2006


Dear Kevin,

Thanks a lot for your help. I have three follow-up questions about your script.

Question 1:
For the backreferences, you used ${1} for the first capture buffer,
while some books and people simply use $1. Do you use {} as an extra
precaution, to make sure it still refers to the first capture buffer
when digits follow it?


Question 2:
line 1>   perl -i.bak \
line 2>        -pe 's/ \$(\d+)\. / \$ebcdic${1}. /g;
line 3>             s/ (\d+)\. / s370ff${1}. /g;'     \
line 4>        your-directory-somewhere/*readme*

At the end of lines 1 and 3 you have a backslash. Is it there to split
the command across separate lines for better readability? If so, why is
there no backslash at the end of line 2?


Question 3:
You used a period "." after ${1}. Wouldn't it be safer to use "\.", since
the original string always ends with a period and we don't want to change
it? I think "." can match any single character except a newline.


Thank you for your time.

Zhao

Kevin D. Clark wrote:
> Zhao Peng writes:
> 
>> ********substitution 1************
>> The characteristic of original string:
>> 1, always start with "$"
>> 2, then followed by an integer, could be more than 1 digit, such as 23
>> 3, always end with a period "."
>> 4, there is always a blank before & after original string
>> For example: $2.
>>
>> The characteristic of target string: always has "ebcdic" inserted into
>> the original string between "$" and the integer
>> For example: $ebcdic2.
>>
>> So the substitution will look like this
>> $2.  ->  $ebcdic2.
>> $67.  ->  $ebcdic67.
>>
>> Should the regular expression for original string be: \$\d+\.   ?
> 
> Looks pretty much right to me.
> 
> I would make this replacement like this:
>  
>   s/ \$(\d+)\. / \$ebcdic${1}. /g;
> 
> The ${1} in there is something called a "backreference".
> 
> The 'g' at the end of the line generally specifies "do this as many
> times as possible on each line".
> 
> 
>> ********substitution 2************
>> The characteristic of original string:
>> 1, always start with an integer, could be more than 1 digit, such as 23
>> 2, then end with a period "."
>> 3, there is always a blank before & after original string
>> For example: 2.
>>
>> The characteristic of target string:
>> always has "s370ff" added to the beginning of original string
>> For example: s370ff2.
>>
>> So the substitution will look like this
>>
>> 2.  ->  s370ff2.
>> 14.  ->  s370ff14.
>>
>> Should the regular expression for original string be: \d+\.   ?
> 
> I would make this replacement like this:
>  
>   s/ (\d+)\. / s370ff${1}. /g;
> 
> 
>> My real situation is that I have a bunch of files at one directory, of
>> which for the files whose name contained "readme",  I need to do 2
>> substitutions described above.
> 
> One way to quickly do this might be like this:
> 
>   perl -i.bak \
>        -pe 's/ \$(\d+)\. / \$ebcdic${1}. /g;
>             s/ (\d+)\. / s370ff${1}. /g;'     \
>        your-directory-somewhere/*readme*
> 
> 
> This in itself makes a backup for you, but you might want to make your
> own backup files beforehand.
> 
> Just another Perl hacker,
> 
> --kevin

