how to list file and sort by filename using ls

Kevin D. Clark kclark at mtghouse.com
Mon Apr 24 17:36:01 EDT 2006


Zhao Peng <greenmt at gmail.com> writes:

> output of echo $LANG:
>
>     en_US.UTF-8
>
> "LANG=C ls -ul" does do what I expected to do.
>
> What does "C" mean? character?

To be more specific, I probably should have specified LC_COLLATE
instead of "LANG".  No big deal.
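
As a rough sketch of what that would look like, the following sets only
the collation category for a single ls invocation.  The scratch
directory and the file names are made up purely for illustration:

```shell
# Scratch directory with hypothetical file names, just for this demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/B" "$demo_dir/c"

# LC_ALL, when set to a non-empty value, overrides LC_COLLATE, so it is
# cleared here for this one command only.
c_order=$(cd "$demo_dir" && LC_ALL= LC_COLLATE=C ls)

# With C collation the uppercase name sorts first: B, then a, then c.
echo "$c_order"

rm -r "$demo_dir"
```

Setting the variable on the command line like this affects only that
one command, not the rest of the shell session.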

All of this stuff refers to "locale" settings, which all relate to
"internationalization" (which is frequently abbreviated "I18N").

I think that this web page gives a good description of what UTF-8 is:

   http://docs.sun.com/app/docs/doc/805-4123/6j3tmpc75?a=view

   UTF-8 is a file system safe Universal Character Set Transformation
   Format of Unicode / ISO/IEC 10646-1 formulated by XoJIG of X/Open
   in 1992 and approved by ISO and IEC as Amendment 2 to ISO/IEC
   10646-1:1993 in 1996.

This is a far more precise description of what UTF-8 is than I can
conjure up at this time of day.  (-:

So, part of the notion of a locale is a *character set*, and
furthermore, there is an associated way to *collate/sort* these
characters as well.

en_US.UTF-8 sees 'a' and 'A' as being (nearly) equivalent when these
are sorted; case only serves as a tie-breaker.

When LANG=C, you're telling the system that you want the {old, default,
non-I18N, characters are functionally at most 1*sizeof(char) wide,
this is how the "C" language originally did it} manner of
sorting/collating.  In this "locale", 'a' and 'A' are different.
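
To make the difference concrete, here is a small sketch using sort
rather than ls (the word list is made up).  In the "C" locale, strings
are compared by byte value, so every uppercase ASCII letter sorts ahead
of every lowercase one:

```shell
# printf with a leading quote prints the numeric code of the character.
upper_code=$(printf '%d' "'A")
lower_code=$(printf '%d' "'a")
echo "A=$upper_code a=$lower_code"

# With C collation, "Apple" and "Zebra" both sort ahead of "apple",
# because 'A' (65) and 'Z' (90) are smaller than 'a' (97).
c_sorted=$(printf '%s\n' apple Zebra Apple | LC_ALL=C sort)
echo "$c_sorted"
```

Running the same sort under en_US.UTF-8 (if that locale is installed)
should instead keep "apple" and "Apple" next to each other.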

Many people, including myself, are more used to the "C" locale's way
of sorting, but we can see the merits of other locales too.

You can learn more by reading the man pages for locale, setlocale(),
strcoll(), etc.
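
The locale command itself is also handy for poking at this.  A minimal
sketch, assuming a system that ships the standard locale utility:

```shell
# Show the collation category as a command run with LC_ALL=C would see
# it.  LC_ALL overrides the individual LC_* variables, which in turn
# override LANG.
collate_line=$(LC_ALL=C locale | grep '^LC_COLLATE=')
echo "$collate_line"

# List the locale names installed on this system.
locale -a
```

Running plain "locale" with no arguments shows what your current
session is using for each category.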

Regards,

--kevin
-- 
GnuPG ID: B280F24E                     And the madness of the crowd
alumni.unh.edu!kdc                     Is an epileptic fit
                                       -- Tom Waits



