global search and replace
Kevin D. Clark
kclark at CetaceanNetworks.com
Tue May 13 10:25:34 EDT 2003
Greg Rundlett <greg.rundlett at buzgate.org> writes:
> How do you do this?
Attached is my "transmogrify" script, which I hope you find to be useful.
Regards,
--kevin
--
"There! Now we're both transmogrified! We're even!" -- Calvin
-------------- next part --------------
#!/usr/bin/perl
# Copyright 1998 Kevin D. Clark (alumni.unh.edu!kdc)
# Author: Kevin D. Clark (alumni.unh.edu!kdc)
# Note: I now call the script "doit" because I hate to type too much.
# Description: A utility written in perl for doing a bunch of regexp sorts
#              of things to a bunch of files, all the while preserving
#              file permissions and making backups as necessary.  In addition,
#              if this program doesn't need to make a change to a file, the
#              file's timestamp is not changed (this can save a *ton* of time
#              in a system with thousands of files and millions of lines
#              of code...).
# Note: I have rewritten part of this program to make it faster.
#       Here's the specs:
#
#                                                   "time" output
#       The new doit program:                       166.6 real   29.3 user   11.3 sys
#       The old do-it.pl program:                   492.8 real  357.6 user   12.1 sys
#       The old shell script version (using sed):   212.6 real   39.9 user  100.6 sys
#
#       These numbers come from doing a simple replace on a collection of 586
#       C and C++ source files.
# Note: The code that handles multiple statements within a "-e" flag has
#       changed somewhat (for the better).  See the description below.
# TODO: I gotta throw some perl poetry into this.
#       add "--"
#       need capability for multiple -e and -d options
require 5;
use File::Find;
require 'getopts.pl';
$SIG{'INT'}='sigHandler';
$SIG{'QUIT'}='sigHandler';
$opt_d=".";
$opt_e="s/s/s/s";
$opt_x=".bak";
$opt_R=$/;
&Getopts("d:e:f:x:acsmhbwR:") || (&usage && exit(1));
if ($opt_h) { &usage; exit(0); }
if ($opt_a) { &handleDateExtention; }
if ($opt_b) { $^I=$opt_x; &backupInfo; exit(0); }
if ($opt_c) {
    $opt_m=1;
    push(@ARGV,
         q(\.cc?$),q(\.C$),q(\.cxx$),q(\.cpp$),q(\.hh?$),q(\.java$));
}
if ($opt_f) { &handleFileCommands; }
if ($#ARGV == $[-1) { $opt_s=1; $opt_x=""; }
$/=$opt_R;
chdir($opt_d) || die "Can't cd to $opt_d: $!\n";
if ($opt_m) {
    @searchfor=@ARGV;
    undef @ARGV;
    find(sub {
             my($pat);
             for $pat (@searchfor) {
                 if (/$pat/ && -f && -T) {
                     push (@ARGV,$File::Find::name);
                     last;
                 }
             }
         },
         ".");
}
$files=@ARGV;
@ARGV=grep($opt_w || -e || ((warn "Warning: $_ doesn't exist!\n"), 0), @ARGV);
@ARGV=grep($opt_w || -w || ((warn "Warning: $_ is read-only!\n"), 0), @ARGV);
exit(1) if ($files && !@ARGV);
$^I=$opt_x;
$_="";
$evalMe = q#while (<>) { $changed{$ARGV}=1 if (#;
$evalMe .= $opt_e;
$evalMe .= q#|0); push(@ARGV2,$ARGV) unless $processed{$ARGV}++; print;}#;
eval($evalMe);
if ($@) { $gotSig=1; warn "$@\n"; }    # $@ holds any error raised by the eval
undef %changed if $gotSig;
&deleteUselessBackups if ($opt_x ne "");
if ($gotSig) { warn "No changes made.\n"; exit(1); }
if (!$opt_s) {
    if ($^I eq ".bak") {
        # less clutter for the user who defaults to .bak
        print "Type \"$0 -b\" for info concerning backup files.\n";
    }
    else {
        print "Type \"$0 -b -x\'$^I\'\" for info concerning backup files.\n";
    }
}
exit(0);
sub handleDateExtention {
    my($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst);
    my($month,$day);
    ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
    $year += 1900;    # localtime returns the year as an offset from 1900
    $month = (qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec))[$mon];
    $day = (qw(Sun Mon Tue Wed Thur Fri Sat Sun))[$wday];
    $opt_x = ".${day}_${month}_${mday}_${year}_${hour}.${min}.${sec}";
}
sub handleFileCommands {
    local $/;    # slurp the whole command file in one read
    $opt_e="";
    open(FILE, $opt_f) || die "Can't open $opt_f: $!\n";
    $opt_e = <FILE>;
    close(FILE) || die "can't close $opt_f: $!\n";
}
sub usage {
    # Note: if you are seeing this and you want to cut and paste from the
    #       following here-document, remember that some of the things in
    #       here are escaped so that they are printed out correctly when
    #       the user types "doit -h".  Caveat programmer!
    print <<"USAGE"
Usage: $0 [flags] [filename specification]

Description:  A utility written in perl for doing a bunch of regexp sorts
              of things to a bunch of files, all the while preserving
              file permissions and making backups as necessary.

Flags:
  These flags may appear in any order.

  -h                Print this message.
  -b                Print generic message about backup files.
  -d directory      Start at this particular directory
                    (default is the current directory)
  -e expression(s)  What to do to the file(s) as they stream by.
                    For example, "s/foo/bar/" or
                    "s/fooBAR/fooBar/g | tr/A-Z/a-z/"
                    Obviously, this must be a perl expression.
                    Please notice that within the scope of
                    this expression you should use the '|'
                    character to separate your sub-expressions.
                    (default: do nothing)
  -f filename       Via this option the user can specify a file
                    with editing commands in it, all in the
                    same style as specified in the -e switch.
                    If this option is specified, the -e switch
                    is ignored.
  -R separator      The input-line separator to use.  This
                    may be a multi-character string.  The empty
                    separator implies an awk-like
                    "paragraph-mode".  (default is a newline)
                    This roughly corresponds to the "RS"
                    variable in awk.
  -x extension      Use this extension for backup files.
                    (default is '.bak', specifying an empty
                    extension means "don't make backup copies")
  -w                Make this program attempt to write to files
                    that are unwritable, while at the same time
                    preserving the permissions of these files
                    (the default is not to write to such files).
  -a                (see also the -x option)
                    Automagically create backup files with an
                    extension that indicates the date at which
                    this change is being done.  This same
                    extension will be used for all of the files
                    involved in this change.  This option
                    overrides anything that might have been
                    specified via the -x option.
                    Note: this feature is perhaps
                    not very useful in practice.
  -s                Be silent!  (except for errors)
                    Reading from stdin implies this option.

Filename Specification:
  These "filename specifiers" may be the actual names of files
  themselves, or with the following two flags doit will search for the
  files recursively for you.  These specifiers and flags should appear
  after any of the flags mentioned above.

  -m                Interpret the filenames specified as being
                    meta-filenames and not actual filenames.
                    If this option is specified the filenames
                    specified will be searched for in a recursive
                    manner (starting at the specified directory),
                    and the specified changes will take place on
                    all of the specified filenames.
  -c                "Code mode", this is the equivalent to
                    specifying the metafilename of
          "\\.cc?\$\" "\\.C\$\" "\\.hh?\$\" "\\.cxx\$\" "\\.cpp\$\" "\\.java\$\"
                    (hope someone finds this to be useful!)

  If the -m option is not specified (or implied), then whatever
  filenames appear on the command line will be changed.  If no
  filenames appear on the command line, then this program will act
  as a filter, reading from stdin and writing to stdout.

Author: Kevin D. Clark (alumni.unh.edu!kdc)
See also: man perlre, man perlop

And remember: it's spelled "doit", but it's pronounced "transmogrify".
USAGE
}
sub backupInfo {
    if ($^I eq "") {
        print "Note: via the -x option, you specified an empty backup\n";
        print "extension, which means \"don't make backup files\".\n";
        print "The examples below assume that the backup extension\n";
        print "was \".bak\" instead.\n";
        $^I=".bak";
    }
    if ($ENV{"SHELL"}=~/csh/) {    # blech!
        print <<"BACKUPINFO_CSH"
TO UNDO THESE CHANGES TYPE:
    foreach A ( `find $opt_d -name \\\*$^I -print` )
        mv \$A `echo \$A |sed 's/\\$^I\$//'`
    end
BACKUPINFO_CSH
    }
    else {
        print <<"BACKUPINFO_SH"
TO UNDO THESE CHANGES TYPE:
    for A in `find $opt_d -name \\\*$^I -print` ; do
        mv \$A `echo \$A |sed 's/\\$^I\$//'`
    done
BACKUPINFO_SH
    }
    print "\nTO GET RID OF THE BACKUP FILES TYPE:\n";
    # The glob is printed escaped so that find, not the shell, expands it.
    print "find $opt_d -name \\\*$^I -print | xargs rm -f\n";
    print "\nNOTE:\n";
    print " Neither of these is 100% bulletproof; caveat programmer.\n\n";
}
sub deleteUselessBackups {
    my($file);
    for $file (@ARGV2) {
        if (!$changed{$file}) {
            # Put the untouched original back so its timestamp is preserved.
            rename("$file$^I", $file) ||
                warn "Unable to \"mv $opt_d/$file$^I $opt_d/$file\"\n";
        }
    }
}
# perl doesn't clean things up as gracefully as I want
sub sigHandler {
    my($sig) = @_;
    exit(1) unless $^I;
    warn "Caught signal SIG$sig -- cleaning up gracefully...\n";
    $gotSig=1;
}
# It's the
__END__
of the world as we know it, and I feeeel fine....
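
The heart of the script above is the way it splices the -e expression into
a code string and evals it, using '|' so that every sub-expression runs and
a nonzero result means "this line changed". Here is a minimal in-memory
sketch of that idea. It is not part of doit: the helper name
apply_expression and the sample strings are made up for illustration, and it
works on a list of lines rather than on files via $^I.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of doit): apply a doit-style "-e" expression
# to a list of lines held in memory and report whether anything changed.
sub apply_expression {
    my ($expr, @lines) = @_;
    my $changed = 0;

    # Splice the user's expression into a code string, as the script does.
    # Bitwise '|' evaluates both of its operands, so every sub-expression
    # runs, and the combined value is nonzero if any of them did something.
    my $code = 'for (@lines) { $changed = 1 if (' . $expr . ' | 0); }';

    eval $code;
    die "Bad expression: $@" if $@;    # $@ holds the compile or run-time error

    return ($changed, @lines);
}

my ($changed, @out) =
    apply_expression('s/foo/bar/g | tr/A-Z/a-z/', "foo BAR\n", "plain\n");
print @out;
print $changed ? "changed\n" : "unchanged\n";
```

With the sample input, the first line is rewritten by both sub-expressions
and the second is left alone, so the overall result is reported as changed.
The same "did anything change?" flag is what lets the real script put an
untouched file's backup back in place and leave its timestamp alone.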