a question regarding the use of Split operator in Perl

Ben Scott dragonhawk at gmail.com
Tue Sep 11 14:35:24 EDT 2007


On 9/11/07, Jerry <greenmt at gmail.com> wrote:
> I want to separate the phrases by 2+ whitespace to
> separate these phrases, so I think the perl code
> should look like this
> @list = split ( /\s{2,}/, $_);

  This may do what you want:

	@list = split ( /\s{2,}|\s\*\s/, $_);

That will split on either "two or more whitespace characters" OR "a
whitespace character, a star, and a whitespace character".  Someone
wise once gave me the tip, "Use split() when you want to specify what
you are throwing away"; in this case, we're throwing away either of
those two things.
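
  To make that concrete, here is a minimal sketch.  The sample line is
made up for illustration (I don't know what your real data looks like),
so treat it as an assumption:

```perl
use strict;
use warnings;

# Hypothetical sample line: phrases separated either by two or more
# whitespace characters or by " * ".
my $line = "red apple  green pear * blue plum";

# Throw away the separators, keep the phrases.
my @list = split /\s{2,}|\s\*\s/, $line;

print join("|", @list), "\n";
```

The single spaces inside each phrase survive, because neither
alternative matches a lone whitespace character that is not next to a
star.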

  Some other tips (unrelated to your actual question):

  (1) Alternate pattern matching delimiters

  The use of alternate pattern delimiters can make regular expressions
more readable.  For example:

	@list = split ( m{\s{2,}|\s\*\s}, $_);

That uses {braces} instead of the default /slashes/ to identify the
pattern.  You can use almost any punctuation character as a delimiter
(I tend to use <> and {} a lot).  Bracketing delimiters nest, which is
why the {2,} inside the pattern does not end it early.  The "m" prefix
signifies a matching pattern (as opposed to s///, which is a
substitution).  The "m" is optional with slashes but required with any
other delimiter.
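
  As a rough illustration, the following sketch shows the same pattern
written with several delimiters, plus a case where an alternate
delimiter saves you from escaping slashes.  The sample strings are
assumptions of mine, not anything from your data:

```perl
use strict;
use warnings;

my $line = "red apple  green pear * blue plum";   # hypothetical sample

# The same pattern with four different delimiters.
my @slashes = split /\s{2,}|\s\*\s/,  $line;   # "m" may be omitted with slashes
my @braces  = split m{\s{2,}|\s\*\s}, $line;   # inner {2,} nests inside m{...}
my @angles  = split m<\s{2,}|\s\*\s>, $line;
my @bangs   = split m!\s{2,}|\s\*\s!, $line;

print join("|", @$_), "\n" for \@slashes, \@braces, \@angles, \@bangs;

# Where it pays off: a pattern that itself contains a slash.
my @escaped   = split /\//, "usr/local/bin";   # slash must be escaped
my @unescaped = split m{/}, "usr/local/bin";   # no escaping needed

print join(",", @unescaped), "\n";
```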

  (2) Optional syntax

  You don't need to put parentheses around the arguments to split, and
you don't need to explicitly specify the default pattern match target
($_).  The result is the following, which I personally find more
readable:

	@list = split m{\s{2,}|\s\*\s};
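
  The short form only works when the text really is in $_, for example
inside a "for" loop with no loop variable.  A minimal sketch, assuming
some made-up input lines:

```perl
use strict;
use warnings;

# Hypothetical input lines, for illustration only.
my @lines = ("red apple  green pear * blue plum", "one  two");

my @counts;
for (@lines) {
    # No second argument: split operates on $_, the current line.
    my @list = split m{\s{2,}|\s\*\s};
    push @counts, scalar @list;
    print scalar(@list), " fields: ", join("|", @list), "\n";
}
```

If the text lives in a named variable instead, pass it as the second
argument as in the earlier examples.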

  (3) The /x modifier

  You can use the /x modifier to allow whitespace and comments to be
embedded in a pattern.  (Literal whitespace becomes syntactically
insignificant.)  So now we get (you'll need to view in a monospace
font for it to line up properly):

	@list = split m{   # split on and discard ...
		\s{2,}     # ... two or more whitespace characters in a row ...
		|          # ... or ...
		\s\*\s     # ... on a space, a star, and a space (exactly; in that order)
		}x;
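
  One thing to watch with /x: a literal space typed into the pattern
is ignored too, so if you want to match one you have to write \s, an
escaped space, or a character class.  Also keep the closing delimiter
out of your comments, since Perl looks for the end of the pattern
before it knows what is a comment.  A small sketch of both points,
using sample strings I made up:

```perl
use strict;
use warnings;

my $line = "red apple  green pear * blue plum";   # hypothetical sample

# The compact and the commented form should split identically.
my @plain   = split m{\s{2,}|\s\*\s}, $line;
my @verbose = split m{
        \s{2,}      # two or more whitespace characters
        |           # or
        \s \* \s    # whitespace, a literal star, whitespace
    }x, $line;

# Under /x the space in "a b" is ignored, so the pattern means "ab".
my $ignored = ("ab"  =~ m{a b}x)   ? 1 : 0;
# Two ways to match a real space under /x.
my $escaped = ("a b" =~ m{a\ b}x)  ? 1 : 0;
my $class   = ("a b" =~ m{a[ ]b}x) ? 1 : 0;

print join("|", @verbose), "\n";
print "ignored=$ignored escaped=$escaped class=$class\n";
```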


  Hope this helps!

-- Ben

