A character class is a set of characters between brackets: [characters] matches any single character between the brackets. [aeiou] will match a vowel, [,.:;!?] matches punctuation, and so on. The significance of the special regular expression characters, .|()[{+^$*?, is turned off inside the brackets. However, normal string substitution still occurs, so (for example) \b represents a backspace character and \n a newline (see Table 18.2 on page 203). In addition, you can use the abbreviations shown in Table 5.1 on page 59, so that (for example) \s matches any whitespace character, not just a literal space.
showRE('It costs $12.', /[aeiou]/)   »   It c<<o>>sts $12.
showRE('It costs $12.', /[\s]/)      »   It<< >>costs $12.
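As a minimal sketch of the rules above, the following uses plain String#scan and =~ rather than the showRE helper, so it runs on its own. The variable names are illustrative only, and the sample string is borrowed from the examples.

```ruby
# Sample string borrowed from the examples; variable names are hypothetical.
price = 'It costs $12.'

# Inside brackets, "." and "$" are ordinary characters, not metacharacters.
literal_hits = price.scan(/[.$]/)          # => ["$", "."]

# Abbreviations such as \s keep their meaning inside brackets.
space_or_vowel = price.scan(/[\saeiou]/)   # => [" ", "o", " "]

# Inside brackets, \b is a backspace character rather than a word boundary.
backspace_at = "ab\bc" =~ /[\b]/           # => 2
```

Note that the uppercase "I" is not matched by [\saeiou], because character classes are case sensitive unless the pattern says otherwise.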
Within the brackets, the sequence c1-c2 represents all the characters between c1 and c2, inclusive.
If you want to include the literal characters ] and - within a character class, they must appear at the start.
a = 'Gamma [Design Patterns-page 123]'
showRE(a, /[]]/)     »   Gamma [Design Patterns-page 123<<]>>
showRE(a, /[B-F]/)   »   Gamma [<<D>>esign Patterns-page 123]
showRE(a, /[-]/)     »   Gamma [Design Patterns<<->>page 123]
showRE(a, /[0-9]/)   »   Gamma [Design Patterns-page <<1>>23]
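The range and literal-character rules can be checked with a short, self-contained sketch. It reuses the sample string from the examples; the variable names are illustrative. One assumption to flag: the last pattern escapes the brackets with a backslash instead of relying on position, which current Ruby versions accept and which avoids the warning some versions print for an unescaped ].

```ruby
a = 'Gamma [Design Patterns-page 123]'

# Ranges are inclusive at both ends and case sensitive.
digits = a.scan(/[0-9]/)    # => ["1", "2", "3"]
b_to_f = a.scan(/[B-F]/)    # => ["D"]

# A hyphen placed first is a literal hyphen, not a range operator.
hyphen_at = a.index(/[-]/)  # => 22

# Assumption: backslash-escaped brackets, rather than leading placement.
punctuation = a.scan(/[-\[\]]/)  # => ["[", "-", "]"]
```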
Put a ^ immediately after the opening bracket to negate a character class: [^a-z] matches any character that isn't a lowercase alphabetic.
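A brief sketch of negation, again using String#scan and String#[] so that it stands alone (the variable names are illustrative). It also shows that a caret is only special in the first position.

```ruby
s = 'It costs $12.'

# Everything that is not a lowercase letter, including spaces and punctuation.
not_lower = s.scan(/[^a-z]/)        # => ["I", " ", " ", "$", "1", "2", "."]

# The first character that is not a letter of either case.
first_non_letter = s[/[^A-Za-z]/]   # => " "

# A caret anywhere but first is just a literal caret.
caret_literal = 'a^b'.scan(/[a^]/)  # => ["a", "^"]
```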
Some character classes are used so frequently that Ruby provides abbreviations for them. These abbreviations are listed in Table 5.1 on page 59; they may be used both within brackets and in the body of a pattern.
showRE('It costs $12.', /\s/)   »   It<< >>costs $12.
showRE('It costs $12.', /\d/)   »   It costs $<<1>>2.
Character class abbreviations

Sequence   As [ ... ]        Meaning
\d         [0-9]             Digit character
\D         [^0-9]            Nondigit
\s         [ \t\r\n\f]       Whitespace character
\S         [^ \t\r\n\f]      Nonwhitespace character
\w         [A-Za-z0-9_]      Word character
\W         [^A-Za-z0-9_]     Nonword character
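To make the table concrete, here is a small sketch comparing an abbreviation with its bracketed equivalent and using one inside brackets. The sample string and variable names are made up for illustration.

```ruby
# Hypothetical sample text containing a tab and a trailing newline.
s = "It costs $12.\tOK_1\n"

digits        = s.scan(/\d/)               # => ["1", "2", "1"]
same_as_class = digits == s.scan(/[0-9]/)  # => true

words = s.scan(/\w+/)   # => ["It", "costs", "12", "OK_1"]
gaps  = s.scan(/\s/)    # => [" ", " ", "\t", "\n"]

# An abbreviation may also appear inside brackets alongside other characters.
amount = s[/[\d.]+/]    # => "12."
```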
Finally, a period (".") appearing outside brackets represents any character except a newline (though in multiline mode it matches a newline, too).
a = 'It costs $12.'
showRE(a, /c.s/)   »   It <<cos>>ts $12.
showRE(a, /./)     »   <<I>>t costs $12.
showRE(a, /\./)    »   It costs $12<<.>>
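The multiline behavior mentioned above is not covered by the examples, so here is a minimal sketch of it. It assumes the /m option is how multiline mode is switched on; the two-line string and the variable names are made up for illustration.

```ruby
# Hypothetical two-line string.
text = "one\ntwo"

# By default the period will not cross the newline.
default_mode   = text[/e.t/]    # => nil

# With the /m option the period matches the newline as well.
multiline_mode = text[/e.t/m]   # => "e\nt"

# An escaped period matches only a literal period.
literal_dot = 'It costs $12.'[/\./]   # => "."
```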