Ruby Programming

Regular Expressions

Most of Ruby's built-in types will be familiar to all programmers. A majority of languages have strings, integers, floats, arrays, and so on. However, until Ruby came along, regular expression support was generally built into only the so-called scripting languages, such as Perl, Python, and awk. This is a shame: regular expressions, although cryptic, are a powerful tool for working with text.

Entire books have been written about regular expressions (for example, Mastering Regular Expressions), so we won't try to cover everything in just a short section. Instead, we'll look at just a few examples of regular expressions in action. Regular expressions are covered in full later in the book.

A regular expression is simply a way of specifying a pattern of characters to be matched in a string. In Ruby, you typically create a regular expression by writing a pattern between slash characters (/pattern/). And, Ruby being Ruby, regular expressions are of course objects and can be manipulated as such.
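
As a minimal sketch of what "regular expressions are objects" means in practice, the fragment below asks a pattern for its class and source text, and builds an equivalent pattern with Regexp.new. The variable names are chosen purely for illustration.

```ruby
# A pattern written between slashes is an instance of the Regexp class.
pattern = /Perl|Python/
puts pattern.class    # Regexp
puts pattern.source   # Perl|Python

# Regexp.new builds an equivalent object from a string.
same_pattern = Regexp.new("Perl|Python")
puts pattern == same_pattern   # true
```

Because a pattern is an ordinary object, it can be stored in a variable, passed to a method, or kept in an array like any other value.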

For example, you could write a pattern that matches a string containing the text "Perl" or the text "Python" using the following regular expression.

/Perl|Python/

The forward slashes delimit the pattern, which consists of the two things we're matching, separated by a pipe character ("|"). You can use parentheses within patterns, just as you can in arithmetic expressions, so you could also have written this pattern as

/P(erl|ython)/
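
One way to convince yourself that the two spellings are interchangeable is to run both against the same strings. This sketch assumes Ruby 2.4 or later for Regexp#match?, which simply reports true or false; the sample strings are made up for illustration.

```ruby
flat    = /Perl|Python/
grouped = /P(erl|ython)/

# Hypothetical sample strings; any text would do.
samples = ["I like Perl", "Python is fun", "Ruby only", "P"]

samples.each do |text|
  puts "#{text.inspect}: flat=#{flat.match?(text)} grouped=#{grouped.match?(text)}"
end
```

Both patterns report a match for the first two strings and no match for the last two.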

You can also specify repetition within patterns. /ab+c/ matches a string containing an "a" followed by one or more "b"s, followed by a "c". Change the plus to an asterisk, and /ab*c/ creates a regular expression that matches an "a", zero or more "b"s, and a "c".
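
The difference between the plus and the asterisk shows up only when there are no "b"s at all. The following sketch (again assuming Regexp#match? from Ruby 2.4 or later) tries both patterns on three short test strings.

```ruby
plus = /ab+c/   # one or more "b"s
star = /ab*c/   # zero or more "b"s

%w[ac abc abbbc].each do |text|
  puts "#{text}: ab+c=#{plus.match?(text)} ab*c=#{star.match?(text)}"
end
# ac: ab+c=false ab*c=true
# abc: ab+c=true ab*c=true
# abbbc: ab+c=true ab*c=true
```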

You can also match one of a group of characters within a pattern. Some common examples are character classes such as "\s", which matches a whitespace character (space, tab, newline, and so on), "\d", which matches any digit, and "\w", which matches any character that may appear in a typical word (a letter, a digit, or an underscore). The single character "." (a period) matches any character except a newline.
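
To see which characters each class picks out, the sketch below uses String#scan, which returns an array of every match of a pattern in a string. The sample text is invented for illustration.

```ruby
text = "Room 42\tis_open"

p text.scan(/\d/)    # ["4", "2"]
p text.scan(/\s/)    # [" ", "\t"]
p text.scan(/\w+/)   # ["Room", "42", "is_open"]

# By default the period does not match a newline.
p "a\nb".scan(/./)   # ["a", "b"]
```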

We can put all this together to produce some useful regular expressions.

/\d\d:\d\d:\d\d/     # a time such as 12:34:56
/Perl.*Python/       # Perl, zero or more other chars, then Python
/Perl\s+Python/      # Perl, one or more whitespace characters, then Python
/Ruby (Perl|Python)/ # Ruby, a space, and either Perl or Python
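
As a rough check of these four patterns, the sketch below tries each one against a string that should match and one that should not. The test strings and variable names are assumptions made for the example, and Regexp#match? requires Ruby 2.4 or later.

```ruby
time_pattern  = /\d\d:\d\d:\d\d/
any_between   = /Perl.*Python/
space_between = /Perl\s+Python/
ruby_and_one  = /Ruby (Perl|Python)/

puts time_pattern.match?("Backup ran at 12:34:56")   # true
puts time_pattern.match?("Backup ran at 12:34")      # false

puts any_between.match?("Perl, Tcl, and Python")     # true
puts any_between.match?("Python, then Perl")         # false

puts space_between.match?("Perl \t Python")          # true
puts space_between.match?("PerlPython")              # false

puts ruby_and_one.match?("Ruby Python")              # true
puts ruby_and_one.match?("Ruby  Perl")               # false (two spaces)
```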

Once you have created a pattern, it seems a shame not to use it. The match operator "=~" can be used to match a string against a regular expression. If the pattern is found in the string, =~ returns the position at which the match starts; otherwise it returns nil. Because nil counts as false in Ruby and every integer (including 0) counts as true, you can use regular expressions as the condition in if and while statements. For example, the following code fragment writes a message if a string contains the text "Perl" or "Python".

if line =~ /Perl|Python/
  puts "Scripting language mentioned: #{line}"
end
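
The values that =~ hands back can be inspected directly. In this sketch the sample strings are made up, and the last part illustrates that a match at position 0 still counts as true in a condition.

```ruby
pattern = /Perl|Python/

middle = "I prefer Python to Perl"
start  = "Python first"
none   = "Ruby only"

p(middle =~ pattern)   # 9, the index where "Python" begins
p(start =~ pattern)    # 0, a match at the very start
p(none =~ pattern)     # nil, no match

# 0 is not false in Ruby, so this branch runs.
if start =~ pattern
  puts "matched at the start"
end
```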

The part of a string matched by a regular expression can also be replaced with different text using one of Ruby's substitution methods, sub and gsub. Both return a new string and leave the original unchanged.

line.sub(/Perl/, 'Ruby')    # replace first 'Perl' with 'Ruby'
line.gsub(/Python/, 'Ruby') # replace every 'Python' with 'Ruby'
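
A small sketch of the two calls side by side, using an invented sample line, shows what each one returns and that the original string is left alone.

```ruby
line = "Perl and Python, Python and Perl"

first_only = line.sub(/Perl/, 'Ruby')
every_one  = line.gsub(/Python/, 'Ruby')

puts first_only   # Ruby and Python, Python and Perl
puts every_one    # Perl and Ruby, Ruby and Perl
puts line         # Perl and Python, Python and Perl (unchanged)
```

To change the string in place instead, Ruby also provides sub! and gsub!.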

We'll have a lot more to say about regular expressions as we go through the book.

Published under the terms of the Open Publication License