extract string

Jon maddog Hall maddog at li.org
Wed Jan 11 05:39:00 EST 2006


Zhao,

I am really busy right now, so I have not read all of the responses to your
problem completely, but I did notice this:


greenmt at gmail.com said:
> You said that "there is an extra column in the 3rd line". I disagree with
> you from my perspective. As you can see, there are 3 commas in between
> "jesse" and "Dartmouth college". For these 3 commas, again, if we think of
> the 2nd one as merely an indication that the value for the age column is
> missing, then the 3rd line will be read as ["jesse", MISSING, "Dartmouth
> college"], not ["jesse",empty,empty, "Dartmouth college"] as you suggested.

A lot of these text-processing commands depend on the concept of a "field
delimiter". In your first example, it seemed clear that a possible "field
delimiter" was the comma (","), so if you saw two commas together, they
represented an "empty" field.  Not a "missing" field, because the field was
technically still there; it just had NO data in it.  When you included the line:

 "jesse",,,"Dartmouth college"

and claimed that the middle comma represented a missing age, you created a
problem: to a text-based scanning program that has been told that the comma is
a field separator, that line now has four fields, not just three.
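
As a rough illustration, here is a minimal sketch in Python (chosen only for
the example; it is not one of the commands discussed in this thread) of what a
scanner sees when its only rule is "the comma is the field separator":

```python
# Treat every comma as a field separator, with no other rules.
line = '"jesse",,,"Dartmouth college"'
fields = line.split(",")

print(len(fields))  # 4
print(fields)       # ['"jesse"', '', '', '"Dartmouth college"']
```

The two middle fields are present but empty, which is why the line reads as
having one column more than the others.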

If, from the beginning, you had shown that you meant for the comma to be used
both as a delimiter and as a piece of data, then a lot of the answers would
have been completely different (and probably considerably more complex).
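
To give a feel for the extra complexity, here is a hedged sketch, again in
Python, of one way a parser might treat the comma as both delimiter and data.
The MISSING sentinel, the parse_line name, and the rule itself (a lone comma
between two delimiters means "missing") are all assumptions made up for this
example, not something taken from the earlier answers:

```python
MISSING = object()  # hypothetical sentinel meaning "value not supplied"

def parse_line(line):
    """Split on commas, but read a lone comma sitting between two
    delimiters (",,,") as data meaning MISSING. Assumed rule, for
    illustration only."""
    # Swap the middle comma of each ",,," for a placeholder byte, then split.
    marked = line.replace(",,,", ",\x00,")
    return [MISSING if part == "\x00" else part for part in marked.split(",")]

row = parse_line('"jesse",,,"Dartmouth college"')
print(len(row))           # 3
print(row[1] is MISSING)  # True
```

Even this small rule raises new questions. Two commas together still give an
empty field, and a run of four commas is ambiguous; the sketch just takes the
leftmost three as the marker, which is an arbitrary choice.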

md
-- 
Jon "maddog" Hall
Executive Director           Linux International(R)
email: maddog at li.org         80 Amherst St. 
Voice: +1.603.672.4557       Amherst, N.H. 03031-3032 U.S.A.
WWW: http://www.li.org

Board Member: Uniforum Association, USENIX Association

(R)Linux is a registered trademark of Linus Torvalds in several countries.
(R)Linux International is a registered trademark in the USA used pursuant
   to a license from Linux Mark Institute, authorized licensor of Linus
   Torvalds, owner of the Linux trademark on a worldwide basis
(R)UNIX is a registered trademark of The Open Group in the USA and other
   countries.



