And the prize goes to... Re: Oops. And a brainteaser. Re: The Hosstraders retire

Bill Sconce sconce at in-spec-inc.com
Sun Feb 18 15:00:25 EST 2007


On Sun, 18 Feb 2007 10:05:43 -0500
"Michael ODonnell" <michael.odonnell at comcast.net> wrote:

> On Sat, 17 Feb 2007 17:18:41 -0500
> "Ben Scott" <dragonhawk at gmail.com> wrote:
> >  Shouldn't 
> >      [grep -in "hosstrad" ~/Mail/...gnhlug.x/*]
> >  yield *every* line which has "Hosstraders" in it?
> >      -i = case insensitive
> >      -n = line numbers
Bill:
Oops again!  Ben is correct.  It does yield *every* line containing
"hosstrad".  My English was faulty.  (I said "yields one line from
each of the files".)
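
A minimal sketch of that behavior, assuming a scratch directory and two
made-up message files (the names and contents are hypothetical, not from
my folder).  With more than one file on the command line, grep prefixes
each hit with the filename, and -n adds the line number:

```shell
#!/bin/sh
# Hypothetical scratch folder with two tiny "messages".
demo=$(mktemp -d)
printf 'Subject: The Hosstraders retire\nbody\nhosstraders again\n' > "$demo/1"
printf 'nothing here\nHOSSTRADERS once\n' > "$demo/2"

# -i ignores case, -n prints line numbers; every matching line is
# reported, not just one per file.
matches=$(cd "$demo" && grep -in "hosstrad" *)
echo "$matches"
```

Message 1 contributes two lines and message 2 contributes one, so the
count of lines printed can exceed the count of files.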

> "Ben Scott" <dragonhawk at gmail.com> wrote:
> >  Perhaps the shell did not expand the file names in dated order
Bill:
That's warm!


"Michael ODonnell" <michael.odonnell at comcast.net> wrote:
> You're right about that, but that would only be part of the
> problem; MH msgs are stored in files with simple numeric names
> like 1, 2, 423, 1111, etc, but although the numeric ordering
> of those file names initially reflects the order in which the
> msgs arrive in the folder ...
Bill:
Ah HA.  This *could* have caused trouble.  (The sloppiness of my
greppiness becomes ever more apparent.  :)


> (each new msg is written
> into a file whose name is the number of the current largest
> numeric filename + 1) ....
Bill:
However, I've checked: my GNHLUG folder indeed has (still has)
files/messages named in ascending order.  Each filename is just
a string of decimal digits, and the numeric values ascend by
date.


> there are all sorts (pun intended) of
> operations that can affect which msgs reside in which files, so
> it's generally not a safe assumption that the numeric ordering
> of the filenames reflects the timestamp ordering of the msgs
> they contain.  For example, this command:
> 
>    sortm +inbox -textfield subject -nolimit -verbose
> 
> ...would reorder (ie. rename) all files in your inbox such
> that messages in the same thread would be grouped together,
> sorted by date within each thread.  The resultant numeric
> ordering of the filenames would then be useless for figuring
> out the timestamp ordering of the msgs.
Bill:
>>>>  Wow!  I'd never seen the "sortm" command before.
>>>>  I didn't have it.  I do have it now.  (In UbuntuLand
>>>>  a package called mailutils-mh gives it to you.  And a
>>>>  whole lot more.)  Thanks for the tip.

> If I wanted to force the filename/timestamp ordering in
> question I might do this:
> 
>    sortm +myGNHLUGemailFolder   # default sort key is Date:
> 
> ...and then my approach to Bill's search might be this:
> 
>    cd ~/Mail/myGNHLUGemailFolder
>    grep -in "hosstrad" $( ls [0-9]* | sort -n )
Bill:
Now we're talking.  Fire the "sorcerer's apprentice" (me and my
expertise with grep), hire a real sorcerer (tools designed for
the job).  What a concept.  THANKS FOR THE TIP!
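
For anyone following along, here is a rough sketch of what the
"sort -n" step in Michael's command buys you.  The scratch directory
and the five sample filenames are hypothetical; the point is only to
compare the order the shell produces for [0-9]* with the order after a
numeric sort:

```shell
#!/bin/sh
# Hypothetical scratch folder holding MH-style numeric filenames.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 1 2 9 10 100

# The shell expands the glob in collation (string) order.
glob_order=$(echo [0-9]*)

# Piping the names through sort -n puts them in numeric order.
numeric_order=$(ls [0-9]* | sort -n | xargs echo)

echo "glob:    $glob_order"
echo "numeric: $numeric_order"
```

In the glob line, 10 and 100 land ahead of 2 and 9, because the names
are compared as strings rather than as numbers.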

-Bill

__________________________________________________________________
P.S. The prize has yet to be awarded.  Ben's explanation is close,
but his wording (including that "perhaps") makes me hold out for
one or two details.

Tip: there were LOTS of files, each one with a filename of a 
string of decimal digits, the values of which grew by +1 with
each successive message as it arrived (and they had not been 
sorted or renamed).  For instance, the messages bridging New
Year's 2006 were:
    <snip>                      <size>    <date>     <filename>
    -rw------- 1 sconce users    4442 2005-12-31 12:15 4572
    -rw------- 1 sconce users    3867 2005-12-31 12:15 4573
    -rw------- 1 sconce users    3495 2005-12-31 16:16 4574
    -rw------- 1 sconce users    4469 2006-01-01 15:29 4575
    </snip>



