Follow Techotopia on Twitter

On-line Guides
All Guides
eBook Store
iOS / Android
Linux for Beginners
Office Productivity
Linux Installation
Linux Security
Linux Utilities
Linux Virtualization
Linux Kernel
System/Network Admin
Programming
Scripting Languages
Development Tools
Web Development
GUI Toolkits/Desktop
Databases
Mail Systems
openSolaris
Eclipse Documentation
Techotopia.com
Virtuatopia.com
Answertopia.com

How To Guides
Virtualization
General System Admin
Linux Security
Linux Filesystems
Web Servers
Graphics & Desktop
PC Hardware
Windows
Problem Solutions
Privacy Policy

  




 

 

Ruby Programming
Previous Page Home Next Page

Object-Oriented Regular Expressions

We have to admit that while all these weird variables are very convenient to use, they aren't very object oriented, and they're certainly cryptic. And didn't we say that everything in Ruby was an object? What's gone wrong here?

Nothing, really. It's just that when Matz designed Ruby, he produced a fully object-oriented regular expression handling system. He then made it look familiar to Perl programmers by wrapping all these $-variables on top of it all. The objects and classes are still there, underneath the surface. So let's spend a while digging them out.

We've already come across one class: regular expression literals create instances of class Regexp (documented beginning on page 361).

re = /cat/
re.type Regexp

The method Regexp#match matches a regular expression against a string. If unsuccessful, the method returns nil. On success, it returns an instance of class MatchData, documented beginning on page 336. And that MatchData object gives you access to all available information about the match. All that good stuff that you can get from the $-variables is bundled in a handy little object.

re = /(\d+):(\d+)/     # match a time hh:mm
md = re.match("Time: 12:34am")
md.type MatchData
md[0]         # == $& "12:34"
md[1]         # == $1 "12"
md[2]         # == $2 "34"
md.pre_match  # == $` "Time: "
md.post_match # == $' "am"

Because the match data is stored in its own object, you can keep the results of two or more pattern matches available at the same time, something you can't do using the $-variables. In the next example, we're matching the same Regexp object against two strings. Each match returns a unique MatchData object, which we verify by examining the two subpattern fields.

re = /(\d+):(\d+)/     # match a time hh:mm
md1 = re.match("Time: 12:34am")
md2 = re.match("Time: 10:30pm")
md1[1, 2] ["12", "34"]
md2[1, 2] ["10", "30"]

So how do the $-variables fit in? Well, after every pattern match, Ruby stores a reference to the result (nil or a MatchData object) in a thread-local variable (accessible using $~). All the other regular expression variables are then derived from this object. Although we can't really think of a use for the following code, it demonstrates that all the other MatchData-related $-variables are indeed slaved off the value in $~.

re = /(\d+):(\d+)/
md1 = re.match("Time: 12:34am")
md2 = re.match("Time: 10:30pm")
[ $1, $2 ]   # last successful match ["10", "30"]
$~ = md1
[ $1, $2 ]   # previous successful match ["12", "34"]

Having said all this, we have to 'fess up. Andy and Dave normally use the $-variables rather than worrying about MatchData objects. For everyday use, they just end up being more convenient. Sometimes we just can't help being pragmatic.


Ruby Programming
Previous Page Home Next Page

 
 
  Published under the terms of the Open Publication License Design by Interspire