sorting pathnames by basename

Kevin D. Clark kclark at CetaceanNetworks.com
Mon Aug 19 17:31:14 EDT 2002


mod+gnhlug at std.com (Michael O'Donnell) writes:

> Thank you.  I think.  For the benefit of those
> here assembled, please supply an explanation.

OK, since you asked.

You have a list of stuff that you want sorted.  The problem is that
you want it sorted according to a field contained in the input (the
last field).  Further complicating matters is the fact that this
field is located at a non-constant place in each input line.
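
To make the "non-constant place" concrete, here is a minimal sketch in
Python (not the Perl one-liner under discussion).  The helper name
basename_key and the sample paths are made up for illustration; the
sketch assumes "/" is the field separator and that no path ends in a
trailing slash.

```python
def basename_key(path: str) -> str:
    """Return the last '/'-separated field of a pathname (hypothetical helper)."""
    # rsplit from the right, at most once: the basename is whatever follows
    # the final '/', no matter how many directories precede it.
    return path.rsplit("/", 1)[-1]


# Made-up sample input: the basename sits at a different depth in each line.
paths = ["/usr/local/bin/perl", "/bin/ls", "src/main.c"]
keys = [basename_key(p) for p in paths]
```

Note that the key is the fourth field in the first path but only the
second in the others, which is why a fixed field number does not work.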

(I initially thought about using "sort", but then I had trouble with
the sort options, so I gave up and used my favorite tool: Perl.)


Now, just to complicate matters, suppose your input consisted of a
million filenames (or so).  How to do this efficiently?

Big hint:  re-extracting the comparison fields on every single
           sort comparison is going to be really expensive...
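
A rough way to see the cost, sketched in Python under some
assumptions: the counter, the helper names and the generated paths
are all hypothetical, and the comparator deliberately re-extracts both
keys on every call, the way a naive sort block would.

```python
from functools import cmp_to_key

calls = {"n": 0}  # counts how many times a key gets extracted


def basename_key(path):
    """Hypothetical helper: last '/'-separated field, with call counting."""
    calls["n"] += 1
    return path.rsplit("/", 1)[-1]


def compare_by_basename(a, b):
    # Naive comparator: extracts both keys again on every comparison.
    ka, kb = basename_key(a), basename_key(b)
    return (ka > kb) - (ka < kb)


# 1000 made-up pathnames with distinct basenames in scrambled order.
paths = [f"/dir{i % 7}/file{(i * 37) % 1000:04d}" for i in range(1000)]
sorted(paths, key=cmp_to_key(compare_by_basename))
naive_calls = calls["n"]
```

Every comparison costs two extractions, and a comparison sort needs
on the order of n log n comparisons, so naive_calls ends up well above
the 1000 extractions that would suffice if each key were computed once.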


The solution: take the input and generate a list from it, one element
per line, where each element is a tuple consisting of the original
line plus its comparison key.  Sort this list of tuples by the
comparison key, and then, after the sort is done, strip off all of the
comparison keys, returning the original lines (sans tuples), sorted.

This is actually a well-known technique in Perl, called the
Schwartzian Transform.  Look it up on the web -- there are plenty of
good descriptions of it.
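
The three steps described above (build the tuples, sort them by key,
strip the keys) can be sketched as follows.  This is a Python
rendering of the idea, not the original Perl one-liner; the function
name sort_by_basename and the sample paths are hypothetical, and "/"
is assumed to be the separator.

```python
def sort_by_basename(paths):
    """Sort pathnames by their last '/'-separated field (illustrative sketch)."""
    # Step 1: pair each original line with its comparison key,
    # so the key is extracted exactly once per line.
    decorated = [(path, path.rsplit("/", 1)[-1]) for path in paths]
    # Step 2: sort the tuples, looking only at the precomputed key.
    decorated.sort(key=lambda pair: pair[1])
    # Step 3: strip the keys off again, leaving the original lines, sorted.
    return [path for path, _key in decorated]


# Made-up sample input.
paths = ["/usr/local/bin/perl", "/bin/ls", "src/main.c", "/opt/awk"]
result = sort_by_basename(paths)
# result == ["/opt/awk", "/bin/ls", "src/main.c", "/usr/local/bin/perl"]
```

In Python the key= argument to sort already computes each key once
internally, so the explicit tuples are here only to mirror the steps;
in Perl the same shape is usually written as a map, sort, map chain.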

> BTW, this is actually a fairly good example of
> why my immune system always concludes that I'm
> in physical danger when perl code is visible...

Honestly, I wrote that one-liner more with the intent of showing you
how cool Perl is, not with the intent of scaring you off from Perl.

Regards,

--kevin
-- 
Kevin D. Clark / Cetacean Networks / Portsmouth, N.H. (USA)
cetaceannetworks.com!kclark (GnuPG ID: B280F24E)
alumni.unh.edu!kdc