searching/grepping for words "near" each other
Andy Bair
pab at korelogic.com
Thu Apr 30 16:25:15 EDT 2009
One way to do what you want is to use hipdig.pl, a utility in the
FTimes suite. You can download FTimes and read more about it at the
following URL.
http://ftimes.sourceforge.net/FTimes/index.shtml
The hipdig utility is a Perl script that "digs" (searches) for hosts,
IPs, passwords, and custom regular expressions. The online man page
for hipdig.pl is located at the URL below.
http://ftimes.sourceforge.net/FTimes/Man+Pages/hipdig.shtml
You can use the hipdig custom type to specify a regex that returns the
characters around target strings, as shown in the following example.
My test file is shown below.
$ cat /tmp/test.1
abc
foobar
def
uvw
barfoo
xyz
The command below specifies a custom type (-t), which is a regex that
searches for the string foobar followed by the string barfoo, with
0-20 characters between them. The (?i) prefix makes the match
case-insensitive. Notice that special characters are URL-encoded in
the output, so %0a is the newline character.
$ hipdig.pl -h -t 'custom=(?i)foobar.{0,20}barfoo' /tmp/test.1
name|type|tag|offset|string
"/tmp/test.1"|regexp||4|foobar%0adef%0auvw%0abarfoo
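If you don't have FTimes handy, the same idea can be sketched in a few
lines of Python. This is only an illustration of the regex approach,
not how hipdig is implemented; the function name find_near and the
max_gap parameter are made up for this example. It assumes that "."
should match newlines (hence re.DOTALL), which is what the hipdig
output above suggests.

```python
import re
from urllib.parse import quote


def find_near(text, first, second, max_gap=20):
    """Return (offset, match) pairs where `first` is followed by `second`
    with at most `max_gap` characters in between (hypothetical helper)."""
    pattern = re.compile(
        re.escape(first) + r".{0," + str(max_gap) + r"}" + re.escape(second),
        # IGNORECASE mirrors (?i); DOTALL lets "." cross line boundaries.
        re.IGNORECASE | re.DOTALL,
    )
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]


# Same contents as /tmp/test.1 above.
sample = "abc\nfoobar\ndef\nuvw\nbarfoo\nxyz\n"

for offset, matched in find_near(sample, "foobar", "barfoo"):
    # URL-encode the match so newlines are visible, similar to hipdig.
    print(f"{offset}|{quote(matched, safe='')}")
```

On the test file this prints 4|foobar%0Adef%0Auvw%0Abarfoo (Python
encodes with uppercase hex digits). Like the hipdig regex, it only
finds foobar followed by barfoo; to catch either order you would run
it a second time with the two words swapped.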
Hope that helps.
Andy
KoreLogic Security
603.465.3236 (Office)
603.340.2498 (Mobile)
http://www.korelogic.com
GnuPG Fingerprint: 688A 79EC B1E5 5748 CE87 1F20 2C45 60E7 0583 23B6
On Thu, Apr 30, 2009 at 03:35:55PM +0000, VirginSnow at vfemail.net wrote:
> OK, I know we have a few grep gurus on this list...
>
> I want to search a text file for a few (alphabetic) words which must
> be "near" each other, but not necessarily on the same line. "Near"
> could be defined however you like... within a certain number of words
> from each other, a certain number of characters from each other, or
> some similar constraint.
>
> Is there any way to do this using grep? If not, is there some other
> tool (short of a desktop search engine) capable of doing this?
>
> This seems like a rather elementary search task, so I figure someone
> must have figured a convenient way to do it...
>
> Any suggestions?
> _______________________________________________
> gnhlug-discuss mailing list
> gnhlug-discuss at mail.gnhlug.org
> http://mail.gnhlug.org/mailman/listinfo/gnhlug-discuss/