Thinking in Java

Creating regular expressions

You can begin learning regular expressions with a useful subset of the possible constructs. A complete list of constructs for building regular expressions can be found in the JDK documentation for the Pattern class in the package java.util.regex.



B          The specific character B
\xhh       Character with hex value 0xhh
\uhhhh     The Unicode character with hex representation 0xhhhh
\r         Carriage return
\f         Form feed
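
As a minimal sketch of the character constructs above, the following program checks a few of them with Pattern.matches(). The class name CharacterEscapeDemo and the helper matches() are names invented for this illustration.

```java
import java.util.regex.Pattern;

public class CharacterEscapeDemo {
    // Hypothetical helper: true when the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // A plain character matches itself.
        System.out.println(matches("B", "B"));
        // \x42 is the character with hex value 0x42, which is 'B'.
        System.out.println(matches("\\x42", "B"));
        // \u0042 is the Unicode character 0x0042, also 'B'.
        System.out.println(matches("\\u0042", "B"));
        // \r and \f match a carriage return and a form feed.
        System.out.println(matches("\\r", "\r"));
        System.out.println(matches("\\f", "\f"));
    }
}
```

Each call in main() prints true.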



The power of regular expressions begins to appear when defining character classes. Here are some typical ways to create character classes, and some predefined classes:

Character Classes


.              Represents any character
[abc]          Any of the characters a, b, or c (same as a|b|c)
[^abc]         Any character except a, b, and c (negation)
[a-zA-Z]       Any character a through z or A through Z (range)
[abc[hij]]     Any of a,b,c,h,i,j (same as a|b|c|h|i|j) (union)
[a-z&&[hij]]   Either h, i, or j (intersection)
\s             A whitespace character (space, tab, newline, formfeed, carriage return)
\S             A non-whitespace character ([^\s])
\d             A numeric digit [0-9]
\D             A non-digit [^0-9]
\w             A word character [a-zA-Z_0-9]
\W             A non-word character [^\w]
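
The character classes described above can be tried out directly. The following is a small sketch, not a definitive reference; the class name CharacterClassDemo and the helper matches() are invented for this example.

```java
import java.util.regex.Pattern;

public class CharacterClassDemo {
    // Hypothetical helper: true when the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("[abc]", "b"));         // true
        System.out.println(matches("[^abc]", "b"));        // false (negation)
        System.out.println(matches("[a-zA-Z]", "Q"));      // true (range)
        System.out.println(matches("[abc[hij]]", "h"));    // true (union)
        System.out.println(matches("[a-z&&[hij]]", "k"));  // false (intersection)
        // Digit, whitespace, word character in sequence.
        System.out.println(matches("\\d\\s\\w", "7 _"));   // true
        // Non-digit, non-whitespace, non-word character in sequence.
        System.out.println(matches("\\D\\S\\W", "x7!"));   // true
    }
}
```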

If you have any experience with regular expressions in other languages, you’ll immediately notice a difference in the way backslashes are handled. In other languages, “\\” means “I want to insert a plain old (literal) backslash in the regular expression. Don’t give it any special meaning.” In Java, “\\” means “I’m inserting a regular expression backslash, so the following character has special meaning.” For example, if you want to indicate one or more word characters, your regular expression string will be “\\w+”. If you want to insert a literal backslash, you say “\\\\”. However, things like newlines and tabs just use a single backslash: “\n\t”.
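
The backslash rules can be checked with a short program. This is only a sketch; the class name BackslashDemo and the helper matches() are invented for the illustration.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // Hypothetical helper: true when the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // The Java string "\\w+" holds the regex \w+ (one or more word characters).
        System.out.println(matches("\\w+", "hello_42"));  // true
        // The Java string "\\\\" holds the regex \\ (one literal backslash).
        // The input "\\" is a Java string containing a single backslash.
        System.out.println(matches("\\\\", "\\"));        // true
        // "\n\t" places an actual newline and tab in the pattern.
        System.out.println(matches("\n\t", "\n\t"));      // true
    }
}
```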

What’s shown here is only a sampling; you’ll want to have the java.util.regex.Pattern JDK documentation page bookmarked or on your “Start” menu so you can easily access all the possible regular expression patterns.

Logical Operators


XY         X followed by Y
X|Y        X or Y
(X)        A capturing group. You can refer to the ith captured group later in the expression with \i
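
As a sketch of these operators, the program below uses alternation and a back-reference to a captured group. The class name LogicalOperatorDemo, the constant DOUBLED, and the method firstDoubledLetter() are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogicalOperatorDemo {
    // (\w)\1 : a word character captured as group 1, followed by the same character.
    static final Pattern DOUBLED = Pattern.compile("(\\w)\\1");

    // Returns the first character that appears twice in a row, or null if none.
    static String firstDoubledLetter(String input) {
        Matcher m = DOUBLED.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // XY: 'a' followed by 'b'.
        System.out.println(Pattern.matches("ab", "ab"));        // true
        // X|Y: either alternative matches.
        System.out.println(Pattern.matches("cat|dog", "dog"));  // true
        // (X) with back-reference \1.
        System.out.println(firstDoubledLetter("balloon"));      // l
    }
}
```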

Boundary Matchers


^          Beginning of a line
$          End of a line
\b         Word boundary
\B         Non-word boundary
\G         End of the previous match
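
The boundary matchers can be explored with Matcher.find(). The sketch below assumes a helper named findAll(), invented for this example, that collects every match. Note that in java.util.regex the ^ and $ matchers apply to each line only when the Pattern.MULTILINE flag is set; by default they anchor to the whole input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {
    // Hypothetical helper: collects every match of regex in input.
    static List<String> findAll(String regex, int flags, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // \b: "cat" only as a whole word.
        System.out.println(findAll("\\bcat\\b", 0, "cat concat cat"));        // [cat, cat]
        // \B: "cat" only when it does not start at a word boundary.
        System.out.println(findAll("\\Bcat", 0, "cat concat"));               // [cat]
        // ^ and $ per line, with MULTILINE.
        System.out.println(findAll("^\\w+$", Pattern.MULTILINE, "one\ntwo")); // [one, two]
        // \G: each match must start where the previous one ended.
        System.out.println(findAll("\\G\\d", 0, "12a3"));                     // [1, 2]
    }
}
```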

As an example, each of the following represents a valid regular expression, and all will successfully match the character sequence "Rudolph":
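
A short program can confirm that several different expressions all match "Rudolph". The patterns in the sketch below are illustrative choices made for this example and should not be taken as the list referred to above; the class name RudolphDemo and the method allMatch() are likewise invented.

```java
import java.util.regex.Pattern;

public class RudolphDemo {
    // Illustrative patterns chosen for this sketch (an assumption, not a quoted list).
    static final String[] PATTERNS = {
        "Rudolph",               // the literal sequence
        "[rR]udolph",            // either case for the first letter
        "[rR][aeiou][a-z]ol.*",  // classes, then "ol", then anything
        "R.*"                    // R followed by anything
    };

    // True when every pattern matches the entire input.
    static boolean allMatch(String input) {
        for (String regex : PATTERNS) {
            if (!Pattern.matches(regex, input)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String regex : PATTERNS) {
            System.out.println(regex + " matches Rudolph: "
                + Pattern.matches(regex, "Rudolph"));
        }
    }
}
```

Every line printed by main() ends in true.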



Reproduced courtesy of Bruce Eckel, MindView, Inc.