We have to admit that while all these weird variables are very
convenient to use, they aren't very object-oriented, and they're
certainly cryptic. And didn't we say that everything in Ruby was an
object? What's gone wrong here?

Nothing, really. It's just that when Matz designed Ruby, he produced a
fully object-oriented regular expression handling system. He then made
it look familiar to Perl programmers by layering all these
$-variables on top of it. The objects and classes are still there,
underneath the surface. So let's spend a while digging them out.

We've already come across one class: regular expression literals
create instances of class Regexp (documented beginning on page 361).

    re = /cat/
    re.class   »  Regexp

The method Regexp#match matches a regular expression against a string.
If unsuccessful, the method returns nil. On success, it returns an
instance of class MatchData, documented beginning on page 336. And that
MatchData object gives you access to all available information about
the match. All that good stuff that you can get from the $-variables is
bundled in a handy little object.

    re = /(\d+):(\d+)/     # match a time hh:mm
    md = re.match("Time: 12:34am")
    md.class               »  MatchData
    md[0]          # == $&    »  "12:34"
    md[1]          # == $1    »  "12"
    md[2]          # == $2    »  "34"
    md.pre_match   # == $`    »  "Time: "
    md.post_match  # == $'    »  "am"
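
Because Regexp#match returns nil when nothing matches, code that uses the result usually checks for nil before indexing into it. The following is only a sketch of that pattern: the constant TIME_RE and the helper parse_time are hypothetical names invented for illustration, not part of Ruby.

```ruby
# TIME_RE and parse_time are hypothetical names used only for this sketch.
TIME_RE = /(\d+):(\d+)/   # match a time hh:mm

# Returns [hours, minutes] as integers, or nil when the text holds no time.
def parse_time(text)
  md = TIME_RE.match(text)      # MatchData on success, nil on failure
  return nil if md.nil?
  [md[1].to_i, md[2].to_i]      # md[1] and md[2] are the two subpatterns
end

good = parse_time("Time: 12:34am")
bad  = parse_time("no time here")
```

Here good should hold the two numeric fields and bad should be nil, so the caller can tell a failed match from a successful one without touching any $-variable.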

Because the match data is stored in its own object, you can keep the
results of two or more pattern matches available at the same time,
something you can't do using the $-variables. In the next example,
we're matching the same Regexp object against two strings. Each
match returns a unique MatchData object, which we verify by
examining the two subpattern fields.

    re = /(\d+):(\d+)/     # match a time hh:mm
    md1 = re.match("Time: 12:34am")
    md2 = re.match("Time: 10:30pm")
    md1[1, 2]   »  ["12", "34"]
    md2[1, 2]   »  ["10", "30"]
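
The same idea extends to any number of matches. As a rough sketch, with the input strings and variable names made up for illustration, you might collect one MatchData object per line and examine them all later:

```ruby
re = /(\d+):(\d+)/     # match a time hh:mm

# Illustrative input; the third line deliberately contains no time.
lines = ["Time: 12:34am", "Time: 10:30pm", "no time"]

# re.match returns nil for a failed match, so compact drops those entries.
matches = lines.map { |line| re.match(line) }.compact

# Every MatchData object still carries its own match information.
fields = matches.map { |md| [md.pre_match, md[1], md[2], md.post_match] }
```

Each element of matches remains usable however many other matches are performed afterward, which is the property the $-variables lack.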

So how do the $-variables fit in? Well, after every pattern match,
Ruby stores a reference to the result (nil or a MatchData object) in a
thread-local variable (accessible using $~). All the other regular
expression variables are then derived from this object. Although we
can't really think of a use for the following code, it demonstrates
that all the other MatchData-related $-variables are indeed slaved off
the value in $~.

    re = /(\d+):(\d+)/
    md1 = re.match("Time: 12:34am")
    md2 = re.match("Time: 10:30pm")
    [ $1, $2 ]   # last successful match       »  ["10", "30"]
    $~ = md1
    [ $1, $2 ]   # previous successful match   »  ["12", "34"]
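
In current Ruby versions the value held in $~ can also be read through the class method Regexp.last_match, which gives a less cryptic spelling of the same thing. The sketch below assumes a reasonably recent Ruby; the variable names after_second and after_reset are illustrative only.

```ruby
re = /(\d+):(\d+)/
md1 = re.match("Time: 12:34am")
md2 = re.match("Time: 10:30pm")

# Regexp.last_match(n) reads group n from the same match data as $~.
after_second = [Regexp.last_match(1), Regexp.last_match(2)]

# Pointing $~ back at the earlier result changes what every accessor reports.
$~ = md1
after_reset = [$1, $2, $&, Regexp.last_match(0)]
```

After the assignment, both the $-variables and Regexp.last_match should report the fields of the first match, which is consistent with the idea that they are all views onto the single object in $~.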

Having said all this, we have to 'fess up. Andy and Dave normally use
the $-variables rather than worrying about MatchData objects. For
everyday use, they just end up being more convenient. Sometimes we just
can't help being pragmatic.