searching/grepping for words "near" each other

Andy Bair pab at korelogic.com
Thu Apr 30 16:25:15 EDT 2009


One way to do what you want is to use hipdig.pl, a utility in the
FTimes suite.  You can download FTimes and read more about it at the
following URL.

  http://ftimes.sourceforge.net/FTimes/index.shtml

The hipdig utility is a Perl script that "digs" (searches) for hosts,
IPs, passwords, and custom regular expressions.  The online man page
for hipdig.pl is located at the URL below.

  http://ftimes.sourceforge.net/FTimes/Man+Pages/hipdig.shtml

You can use the hipdig custom type to specify a regex that returns
the characters around your target strings, as shown in the following
example.  My test file is shown below.

  $ cat /tmp/test.1 

    abc
    foobar
    def
    uvw
    barfoo
    xyz

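The command below specifies a custom type (-t) whose regex searches
for the string foobar followed by the string barfoo, with 0-20
characters between them.  The (?i) prefix makes the match
case-insensitive.  Note that the pattern is order-sensitive: it will
not match barfoo followed by foobar.  Notice that special characters
are URL-encoded in the output, so %0a is the newline character.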

  $ hipdig.pl -h -t 'custom=(?i)foobar.{0,20}barfoo' /tmp/test.1 

    name|type|tag|offset|string
    "/tmp/test.1"|regexp||4|foobar%0adef%0auvw%0abarfoo
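
If you don't have FTimes handy, the same idea can be sketched in a few
lines of Python.  This is only an illustration of the regex technique,
not how hipdig is implemented; the function name find_near and the
max_gap parameter are made up for this example.  It assumes the whole
file fits in memory, and it uses DOTALL so that "." also matches
newlines, which is what lets the match span several lines.

```python
import re
from urllib.parse import quote


def find_near(text, first, second, max_gap=20):
    """Return (offset, match) pairs where `first` is followed by `second`
    with at most `max_gap` characters (newlines included) in between."""
    pattern = re.compile(
        re.escape(first) + ".{0," + str(max_gap) + "}" + re.escape(second),
        re.IGNORECASE | re.DOTALL,  # case-insensitive; "." matches newlines too
    )
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]


# Same contents as /tmp/test.1 in the example above.
text = "abc\nfoobar\ndef\nuvw\nbarfoo\nxyz\n"

for offset, match in find_near(text, "foobar", "barfoo"):
    # Percent-encode newlines so each hit prints on a single line.
    print(f"{offset}|{quote(match, safe='')}")
# prints: 4|foobar%0Adef%0Auvw%0Abarfoo
```

Python's quote() writes the hex digits in upper case (%0A rather than
%0a), but otherwise the offset and string agree with the hipdig output.
To catch the words in either order, you could call find_near() a second
time with the two words swapped.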

Hope that helps.

Andy

KoreLogic Security
603.465.3236 (Office)
603.340.2498 (Mobile)
http://www.korelogic.com
GnuPG Fingerprint: 688A 79EC B1E5 5748 CE87  1F20 2C45 60E7 0583 23B6

On Thu, Apr 30, 2009 at 03:35:55PM +0000, VirginSnow at vfemail.net wrote:
> OK, I know we have a few grep gurus on this list...
> 
> I want to search a text file for a few (alphabetic) words which must
> be "near" each other, but not necessarily on the same line.  "Near"
> could be defined however you like... within a certain number of words
> from each other, a certain number of characters from each other, or
> some similar constraint.
> 
> Is there any way to do this using grep?  If not, is there some other
> tool (short of a desktop search engine) capable of doing this?
> 
> This seems like a rather elementary search task, so I figure someone
> must have figured a convenient way to do it...
> 
> Any suggestions?
