Back on page 50, when we were creating a song list
from a file, we used a regular expression to match the field delimiter
in the input file. We claimed that the expression
line.split(/\s*\|\s*/)
matched a vertical bar surrounded by optional
whitespace. Let's explore regular expressions in more detail to see why
this claim is true.
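As a quick check of that claim, here is a minimal sketch that applies the same pattern to one line of input. The sample line and the variable names are made up for illustration; only the pattern itself comes from the text.

```ruby
# A hypothetical line from the song list file: fields separated by a
# vertical bar, with uneven whitespace around each bar.
line = "/jazz/j00132.mp3  | 3:45 | Fats     Waller     | Ain't Misbehavin'"

# \s*  optional whitespace before the bar
# \|   a literal vertical bar (escaped, because | means alternation)
# \s*  optional whitespace after the bar
fields = line.split(/\s*\|\s*/)

fields.each { |field| puts field.inspect }
```

The whitespace around each bar is consumed as part of the delimiter, while the spaces inside "Fats     Waller" are left alone because no bar is adjacent to them.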
Regular expressions are used to match patterns against strings.
Ruby provides built-in support that makes pattern matching and
substitution convenient and concise. In this section we'll work
through all the main features of regular expressions. There are some
details we won't cover: have a look at page 205 for
more information.
Regular expressions are objects of type Regexp. They can be
created by calling the constructor explicitly or by using the literal
forms /pattern/ and %r\pattern\.
a = Regexp.new('^\s*[a-z]')   »   /^\s*[a-z]/
b = /^\s*[a-z]/               »   /^\s*[a-z]/
c = %r{^\s*[a-z]}             »   /^\s*[a-z]/
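The three forms should produce equivalent objects. The following sketch compares them and shows one case where the %r form tends to be more readable; the variable names and the sample path are illustrative assumptions, not part of the original example.

```ruby
# Three ways to build the same pattern.
by_constructor = Regexp.new('^\s*[a-z]')   # single quotes keep the backslash
by_slashes     = /^\s*[a-z]/
by_percent_r   = %r{^\s*[a-z]}

# Regexp#== compares the pattern source and options.
all_equal = (by_constructor == by_slashes) && (by_slashes == by_percent_r)
puts all_equal

# %r with braces avoids escaping every slash in a path-like pattern.
path_re  = %r{^/usr/local/}
path_pos = "/usr/local/bin/ruby" =~ path_re
puts path_pos.inspect
```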
Once you have a regular expression object, you can match it against a
string using Regexp#match(aString) or the match operators =~
(positive match) and !~ (negative match). The match operators are
defined for both String and Regexp objects. If both operands of the
match operator are Strings, the one on the right will be converted to
a regular expression. (Later Ruby releases no longer perform this
conversion and raise a TypeError instead, so it is safest to put a
Regexp on the right.)
a = "Fats Waller"
a =~ /a/    »   1
a =~ /z/    »   nil
a =~ "ll"   »   7
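To see the three matching styles side by side, here is a small sketch using the same string. It uses a Regexp on the right-hand side throughout; the variable names are illustrative.

```ruby
name = "Fats Waller"

# =~ returns the position of the first match, or nil if there is none.
pos_ll = name =~ /ll/
pos_z  = name =~ /z/

# !~ is the negated form: true when the pattern does NOT match.
no_z = name !~ /z/

# Regexp#match returns a MatchData object (or nil) instead of a position.
md = /W\w+/.match(name)

puts pos_ll.inspect
puts pos_z.inspect
puts no_z
puts md[0]

# Because nil is treated as false, =~ works directly in a condition.
puts "found an 'a'" if name =~ /a/
```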
The match operators return the character position at which the match
occurred. They also have the side effect of setting a whole load of
Ruby variables. $& receives the part of the string that was
matched by the pattern, $` receives the part of the string
that preceded the match, and $' receives the string after the
match. We can use this to write a method, showRE, which
illustrates where a particular pattern matches.
def showRE(a,re)
  if a =~ re
    "#{$`}<<#{$&}>>#{$'}"
  else
    "no match"
  end
end

showRE('very interesting', /t/)   »   very in<<t>>eresting
showRE('Fats Waller', /ll/)       »   Fats Wa<<ll>>er
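A few more calls may help show how showRE behaves at the edges. The sketch below repeats the method so that it runs on its own; the extra patterns and the result variable names are assumptions chosen for illustration.

```ruby
# Same method as in the text, repeated so this snippet is self-contained.
def showRE(a, re)
  if a =~ re
    "#{$`}<<#{$&}>>#{$'}"   # text before, the match itself, text after
  else
    "no match"
  end
end

# Only the first (leftmost) match is marked, even if others exist.
first_only = showRE('Fats Waller', /a/)

# A greedy pattern stretches the match as far as it can.
greedy = showRE('very interesting', /t.*t/)

# When nothing matches, =~ returns nil and the else branch runs.
missing = showRE('very interesting', /z/)

puts first_only
puts greedy
puts missing
```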
The match also sets the thread-local variables $~ and $1 through $9.
The variable $~ is a MatchData object
(described beginning on page 336) that holds everything you might
want to know about the match. $1 and so on hold the values of
parts of the match. We'll talk about these later. And for people who
cringe when they see these Perl-like variable names, stay
tuned. There's good news at the end of the chapter.
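As a brief preview, the sketch below copies these variables into ordinary locals immediately after a match. The pattern with two parenthesized groups and the local names are assumptions made for this illustration.

```ruby
name = "Fats Waller"

if name =~ /(W)(al+)/
  md     = $~   # the MatchData object for the match just performed
  group1 = $1   # text matched by the first parenthesized group
  group2 = $2   # text matched by the second parenthesized group
end

# MatchData exposes the same pieces as $`, $& and $' through methods.
puts md.pre_match    # text before the match
puts md[0]           # the whole match
puts md.post_match   # text after the match
puts group1
puts group2
```

Copying the values into locals right away is a defensive habit: any later match in the same method overwrites $~ and the numbered variables.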