sorting pathnames by basename
Kevin D. Clark
kclark at CetaceanNetworks.com
Mon Aug 19 17:31:14 EDT 2002
mod+gnhlug at std.com (Michael O'Donnell) writes:
> Thank you. I think. For the benefit of those
> here assembled, please supply an explanation.
OK, since you asked.
You have a list of stuff that you want sorted. The problem is that
you want your stuff sorted according to a field contained in the input
(the last field). Further complicating matters is the fact that this
field is located at a non-constant place in each input line.
(I initially thought about using "sort", but I had trouble with
the sort options, so I gave up and used my favorite tool: Perl.)
Now, just to complicate matters, suppose your input consisted of a
million filenames (or so). How to do this efficiently?
Big hint: extracting the comparison field from each line again, once
per sort comparison, is going to be really expensive...
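To put a rough number on that cost, here is a small sketch in Python (not the Perl one-liner from earlier in the thread). It counts how often the basename gets extracted when the extraction happens inside the comparison function, versus once per line. The path names are made up, and the helper names (basename_key, compare_by_basename) are just illustrative.

```python
import functools
import os.path
import random

calls = 0  # how many times the key has been extracted


def basename_key(path):
    """Extract the comparison key (the basename) and count the call."""
    global calls
    calls += 1
    return os.path.basename(path)


def compare_by_basename(a, b):
    """Naive comparator: re-extracts both keys on every comparison."""
    ka, kb = basename_key(a), basename_key(b)
    return (ka > kb) - (ka < kb)


# Hypothetical input: 1000 made-up pathnames, seeded so runs are repeatable.
rng = random.Random(0)
paths = [f"/dir{rng.randrange(100)}/file{rng.randrange(10**6)}"
         for _ in range(1000)]

# Key extraction inside the comparison: two extractions per comparison.
calls = 0
naive = sorted(paths, key=functools.cmp_to_key(compare_by_basename))
naive_calls = calls

# Key extraction once per element.
calls = 0
keyed = sorted(paths, key=basename_key)
keyed_calls = calls

print(f"per-comparison extraction: {naive_calls} calls")
print(f"one-time extraction:       {keyed_calls} calls")
```

Both sorts produce the same order; only the number of key extractions differs, and the gap widens as the input grows.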
The solution: take the input and generate a list with one element per
line. Each element is a tuple consisting of the original line plus
the comparison key. Sort the tuple-list using the comparison key, and
then after the sort is done, strip off all of the comparison keys,
returning the original list (sans tuples), sorted.
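The three steps described above (build the tuples, sort on the key, strip the key) can be sketched as follows. This is a Python rendering of the idea rather than the original Perl, and the function name sort_by_basename and the sample paths are hypothetical.

```python
import os.path


def sort_by_basename(paths):
    """Sort pathnames by their last component, computing each key once."""
    # Decorate: pair every original line with its comparison key.
    decorated = [(os.path.basename(p), p) for p in paths]
    # Sort: compare on the precomputed key only.
    decorated.sort(key=lambda pair: pair[0])
    # Undecorate: strip the keys, leaving the original lines in sorted order.
    return [p for _key, p in decorated]


# Made-up sample input; the basenames sit at different depths.
sample = ["/usr/lib/zlib.so", "/home/user/apple.txt", "/tmp/mango"]
result = sort_by_basename(sample)
print(result)
```

In everyday Python you would simply write sorted(paths, key=os.path.basename), which calls the key function once per element; the explicit version is spelled out here only to mirror the steps in the text.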
This is actually a well-known technique in Perl, called the
Schwartzian Transform. Look it up on the web -- there are plenty of
good descriptions of it.
> BTW, this is actually a fairly good example of
> why my immune system always concludes that I'm
> in physical danger when perl code is visible...
Honestly, I wrote that one-liner more with the intent of showing you
how cool Perl is, not with the intent of scaring you off from Perl.
Regards,
--kevin
--
Kevin D. Clark / Cetacean Networks / Portsmouth, N.H. (USA)
cetaceannetworks.com!kclark (GnuPG ID: B280F24E)
alumni.unh.edu!kdc