A Matcher object encapsulates a regular expression and a string of
text (a Pattern and a java.lang.CharSequence) and defines methods for
matching the pattern to the text in several different ways, for
obtaining details about pattern matches, and for doing
search-and-replace operations on the text. Matcher has no public
constructor. Obtain a Matcher by passing the
character sequence to be matched to the matcher( )
method of the desired Pattern object. You can also
reuse an existing Matcher object with a new
character sequence (but the same Pattern) by
passing a new CharSequence to the
matcher's reset( ) method. In
Java 5.0, you can use a new Pattern object on the
current character sequence with the usePattern( )
method.
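As a minimal sketch of the three ways of getting a usable Matcher described above, the following example obtains one from a Pattern, reuses it on new text with reset( ), and then swaps in a new pattern with usePattern( ). The class name, method name, and sample strings are illustrative assumptions, not part of the API.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherReuseSketch {
    private static final Pattern DIGITS = Pattern.compile("\\d+");
    private static final Pattern HEX = Pattern.compile("[0-9a-f]+");

    // Runs matches() three times on a single Matcher, changing first
    // the text and then the pattern.
    static boolean[] demo() {
        Matcher m = DIGITS.matcher("12345"); // no constructor: ask the Pattern
        boolean digitsOnDigits = m.matches();

        m.reset("12a45");                    // same pattern, new text
        boolean digitsOnHex = m.matches();

        m.usePattern(HEX);                   // Java 5.0 and later: same text, new pattern
        boolean hexOnHex = m.matches();

        return new boolean[] { digitsOnDigits, digitsOnHex, hexOnHex };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [true, false, true]
    }
}
```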
Once you have created or reset a Matcher, there
are three types of comparisons you can perform between the regular
expression and the character sequence. All three comparisons operate
on the current
region of the character sequence.
By default, this region is the entire sequence. In Java 5.0, however,
you can set the bounds of the region with region( ). The simplest
type of comparison is the matches( ) method. It returns true if the
pattern matches the complete region of the character sequence, and
returns false otherwise. The lookingAt( ) method is similar: it
returns true if the pattern matches the complete region, or if it
matches some subsequence at the beginning of the region. If the
pattern does not match the start of the region, lookingAt( ) returns
false. matches( ) requires the
pattern to match both the beginning and ending of the region, and
lookingAt( ) requires the pattern to match the
beginning. The find( ) method, on the other hand,
has neither of these requirements: it returns true
if the pattern matches any part of the region. As will be described
below, find( ) has some special behavior that
allows it to be used in a loop to find all matches in the text.
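The difference between the three comparisons is easiest to see side by side. The sketch below is illustrative only: the class name, the compare( ) helper, and the sample inputs are assumptions. Each comparison gets a fresh Matcher so that no call is affected by the match position left behind by another.

```java
import java.util.regex.Pattern;

class ComparisonSketch {
    // Reports the result of matches(), lookingAt(), and find() for one input.
    static String compare(String regex, String text) {
        Pattern p = Pattern.compile(regex);
        boolean whole = p.matcher(text).matches();     // entire region must match
        boolean prefix = p.matcher(text).lookingAt();  // match must start at the beginning
        boolean anywhere = p.matcher(text).find();     // match may occur anywhere
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        String[] inputs = { "123", "123abc", "abc123", "abc" };
        for (String input : inputs) {
            System.out.println(input + ": " + compare("\\d+", input));
        }
    }
}
```

With the pattern \d+, only "123" satisfies matches( ), both "123" and "123abc" satisfy lookingAt( ), and every input except "abc" satisfies find( ).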
If matches( ), lookingAt( ), or
find( ) returns true, then
several other Matcher methods can be used to
obtain details about the matched text. The
MatchResult interface defines the start( ), end( ), and group( )
methods that return the starting position, the ending position, and
the text of the match and of any matching
subexpressions within the Pattern. See
MatchResult for details. The
MatchResult interface is new in Java 5.0, but
Matcher implements all of its methods in Java 1.4
as well. Calling MatchResult methods on a
Matcher returns results from the most recent
match. If you want to store these results, call
toMatchResult( ) to obtain an independent,
immutable MatchResult object whose methods can be
queried later.
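The following sketch shows the MatchResult methods in use, and how a snapshot taken with toMatchResult( ) keeps describing the earlier match after the Matcher has moved on. The class name, the describe( ) helper, and the key=value sample text are assumptions made for illustration.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchDetailsSketch {
    // Formats the whole match, its bounds, and its two groups.
    static String describe(MatchResult r) {
        return r.group() + " [" + r.start() + "," + r.end() + ")"
            + " key=" + r.group(1) + " value=" + r.group(2);
    }

    static String[] demo() {
        Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher("width=80 height=24");
        if (!m.find()) throw new IllegalStateException("expected a first match");
        MatchResult saved = m.toMatchResult();   // independent snapshot of match 1

        if (!m.find()) throw new IllegalStateException("expected a second match");
        // The Matcher itself now reports match 2; the snapshot is unchanged.
        return new String[] { describe(saved), describe(m) };
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
        // width=80 [0,8) key=width value=80
        // height=24 [9,18) key=height value=24
    }
}
```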
The no-argument version of find( ) has special
behavior that makes it suitable for use in a loop to find all matches
of a pattern within a region. The first time find( ) is called after
a Matcher is created or after the reset( ) method is called, it starts
its search at the beginning of the region. If it finds a match, it
stores the start and end position of the matched text. If
reset( ) is not called in the meantime, then the
next call to find( ) searches again but starts the
search at the first character after the match: at the position
returned by end( ). (If the previous call to
find( ) matched the empty string, then the next
call begins at end( )+1 instead.) In this way, it
is possible to find all matches of a pattern within a string simply
by calling find( ) repeatedly until it returns
false, indicating that no match was found. After
each repeated call to find( ) you can use the
MatchResult methods to obtain more information
about the text that matched the pattern and any of its subpatterns.
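The loop described above can be sketched as follows. The findAll( ) helper and its "text@position" output format are assumptions chosen for the example, not part of the Matcher API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopSketch {
    // Collects every match of regex in text as "matchedText@startIndex".
    static List<String> findAll(String regex, CharSequence text) {
        List<String> found = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {                 // each call resumes after the previous match
            found.add(m.group() + "@" + m.start());
        }
        return found;                      // empty if find() never returned true
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1 b22 c333")); // [1@1, 22@4, 333@8]
    }
}
```

A pattern that can match the empty string, such as x*, still terminates in this loop because the search position advances after each empty match.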
Matcher also defines methods that perform
search-and-replace operations. replaceFirst( )
searches the character sequence for the first subsequence that
matches the pattern. It then returns a string that is the character
sequence with the matched text replaced with the specified
replacement string. replaceAll( ) is similar, but
replaces all matching subsequences within the character sequence
instead of just replacing the first. The replacement string passed to
replaceFirst( ) and replaceAll( ) is not always inserted literally.
If the replacement contains a dollar sign followed by an integer that
is a valid group number, then the dollar sign and the number are
replaced by the text that matched the numbered group. If you want to
include a literal dollar sign in the replacement string, precede it
with a backslash. In Java 5.0, you can use the static
quoteReplacement( ) method to properly quote any special characters in
a replacement string so that the string will be interpreted literally.
replaceFirst( ) and replaceAll( ) are convenience methods that cover
the most common search-and-replace cases. However, Matcher also
defines lower-level methods that you can use, in conjunction with
calls to find( ), to do a custom search-and-replace operation that
builds up a modified string in a StringBuffer.
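A brief sketch of the convenience methods follows, covering group references, an escaped dollar sign, and quoteReplacement( ). The class name, helper names, and sample strings are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceSketch {
    // Replaces only the first occurrence of "cat".
    static String firstCat(String s) {
        return Pattern.compile("cat").matcher(s).replaceFirst("dog");
    }

    // Replaces every occurrence of "cat".
    static String allCats(String s) {
        return Pattern.compile("cat").matcher(s).replaceAll("dog");
    }

    // $1 and $2 in the replacement refer to the captured groups.
    static String lastNameFirst(String s) {
        return Pattern.compile("(\\w+) (\\w+)").matcher(s).replaceAll("$2, $1");
    }

    // A backslash makes the dollar sign literal.
    static String escapedDollar(String s) {
        return Pattern.compile("PRICE").matcher(s).replaceAll("\\$5");
    }

    // quoteReplacement() escapes an arbitrary replacement so it is used as-is.
    static String literal(String s, String replacement) {
        return Pattern.compile("PRICE").matcher(s)
                      .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(firstCat("cat cat"));               // dog cat
        System.out.println(allCats("cat cat"));                // dog dog
        System.out.println(lastNameFirst("John Smith"));       // Smith, John
        System.out.println(escapedDollar("Cost: PRICE"));      // Cost: $5
        System.out.println(literal("Cost: PRICE", "$1.00"));   // Cost: $1.00
    }
}
```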
In order to understand this search-and-replace procedure, you must
know that a Matcher maintains an
"append position", which starts at
zero when the Matcher is created, and is restored
to zero by the reset( ) method. The
appendReplacement( ) method is designed to be used
after a successful call to find( ). It copies all
the text between the append position and the character before the
start( ) position for the last match into the
specified string buffer. Then it appends the specified replacement
text to that string buffer (performing the same substitutions that
replaceAll( ) does). Finally, it sets the append
position to the end( ) of the last match, so that
a subsequent call to appendReplacement( ) starts
at a new character. appendReplacement( ) is
intended for use after a call to find( ) that
returns true. When find( )
cannot find another match and returns false, you
should complete the replacement operation by calling
appendTail( ): this method copies all text between
the end( ) position of the last match and the end
of the character sequence into the specified
StringBuffer.
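The procedure above can be sketched as a loop that computes each replacement from the matched text, something replaceAll( ) cannot do. The doubleNumbers( ) helper is a hypothetical example; it assumes the matched digits fit in an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AppendReplacementSketch {
    // Doubles every run of digits in the text, leaving everything else intact.
    static String doubleNumbers(String text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String doubled = String.valueOf(Integer.parseInt(m.group()) * 2);
            // Copies the text since the append position, then the replacement.
            // quoteReplacement() is not strictly needed for digits; it is a
            // habit that keeps computed replacements literal.
            m.appendReplacement(sb, Matcher.quoteReplacement(doubled));
        }
        m.appendTail(sb);   // copies whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 10 pears")); // 6 apples and 20 pears
    }
}
```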
The reset( ) method has been mentioned several
times. It erases any saved information about the last match, and
restores the Matcher to its initial state so that
subsequent calls to find( ) and
appendReplacement( ) start at the beginning of the
character sequence. The one-argument version of reset( ) also allows
you to specify an entirely new character
sequence to match against. It is important to understand that several
other Matcher methods discard the position of the previous match
before they perform their operation. The one-argument version of
find( ), replaceAll( ), and replaceFirst( ) call reset( ) themselves;
matches( ) and lookingAt( ) always begin at the start of the region,
regardless of any earlier match.
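The effect of resetting on a find( ) loop can be sketched as follows. The class name, the next( ) helper, and the sample text are assumptions for illustration. The example covers the no-argument reset( ), the one-argument find( ), and the one-argument reset( ).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetSketch {
    // Advances to the next match and returns its text.
    static String next(Matcher m) {
        if (!m.find()) throw new IllegalStateException("no further match");
        return m.group();
    }

    static String demo() {
        Matcher m = Pattern.compile("\\d+").matcher("10 20 30");
        StringBuilder out = new StringBuilder();
        out.append(next(m)).append(' ');     // first match
        out.append(next(m)).append(' ');     // second match

        m.reset();                           // forget the previous match
        out.append(next(m)).append(' ');     // first match again

        // find(int) resets the matcher, then searches from the given index.
        if (!m.find(3)) throw new IllegalStateException("expected a match at index 3");
        out.append(m.group()).append(' ');

        m.reset("7 8");                      // same pattern, entirely new text
        out.append(next(m));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 10 20 10 20 7
    }
}
```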
Prior to Java 5.0, the region of the input text that a
Matcher operates on is the entire character
sequence. In Java 5.0, you can define a different region with the
region( ) method, which specifies the position of
the first character in the region and the position of the first
character after the end of the region. regionStart( ) and
regionEnd( ) return the current values of these region bounds. By
default, the bounds of a region are "anchoring", which means that
the start and end of the region match the ^ and
$ anchors. (See Pattern for
regular expression grammar details.) Call
useAnchoringBounds( ) to turn anchoring bounds on
or off in Java 5.0. The bounds of a region are
"opaque" by default, which means
that the Matcher will not look through the bounds
in an attempt to match look-ahead or look-behind assertions (see
Pattern). In Java 5.0, you can make the bounds
transparent with useTransparentBounds(true).
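The following sketch illustrates how the two kinds of bounds change the outcome of find( ) within a region. The class name, helper names, patterns, and sample text are assumptions. Note that region( ) is called after the bounds are configured; setting the region also resets the matcher.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionSketch {
    private static final String TEXT = "ab123cd";   // region 2..5 covers "123"

    // ^ and $ match at the region bounds only when the bounds are anchoring.
    static boolean anchoredFind(boolean anchoring) {
        Matcher m = Pattern.compile("^\\d+$").matcher(TEXT);
        m.useAnchoringBounds(anchoring);
        m.region(2, 5);
        return m.find();
    }

    // A look-behind can see text before the region only through transparent bounds.
    static boolean lookbehindFind(boolean transparent) {
        Matcher m = Pattern.compile("(?<=ab)\\d+").matcher(TEXT);
        m.useTransparentBounds(transparent);
        m.region(2, 5);
        return m.find();
    }

    // Reports the bounds set by region().
    static String bounds() {
        Matcher m = Pattern.compile("\\d+").matcher(TEXT).region(2, 5);
        return m.regionStart() + ".." + m.regionEnd();
    }

    public static void main(String[] args) {
        System.out.println(bounds());               // 2..5
        System.out.println(anchoredFind(true));     // true  (the default)
        System.out.println(anchoredFind(false));    // false
        System.out.println(lookbehindFind(false));  // false (the default, opaque)
        System.out.println(lookbehindFind(true));   // true
    }
}
```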
Matcher is not threadsafe, and should not be used
by more than one thread concurrently.
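One common way to respect this restriction, sketched below, is to share the compiled Pattern (which is immutable and safe to share) and create a separate Matcher inside each thread. The class and method names are hypothetical, and the two-thread setup is only an illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PerThreadMatcherSketch {
    // A Pattern may be shared between threads; a Matcher may not.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Counts words using a Matcher that never leaves this method.
    static int countWords(CharSequence text) {
        Matcher m = WORD.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Counts the words of two texts on two threads, one Matcher per thread.
    static int[] countConcurrently(final String a, final String b) {
        final int[] counts = new int[2];
        Thread t1 = new Thread(() -> counts[0] = countWords(a));
        Thread t2 = new Thread(() -> counts[1] = countWords(b));
        t1.start();
        t2.start();
        try {
            t1.join();   // join() makes the workers' writes visible here
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countConcurrently("one two three", "a b");
        System.out.println(counts[0] + " " + counts[1]); // 3 2
    }
}
```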
Figure 16-131. java.util.regex.Matcher
public final class Matcher implements MatchResult {
// No Constructor
// Public Class Methods
5.0  public static String quoteReplacement(String s);
// Public Instance Methods
     public Matcher appendReplacement(StringBuffer sb, String replacement);
     public StringBuffer appendTail(StringBuffer sb);
     public int end( );                        Implements:MatchResult
     public int end(int group);                Implements:MatchResult
     public boolean find( );
     public boolean find(int start);
     public String group( );                   Implements:MatchResult
     public String group(int group);           Implements:MatchResult
     public int groupCount( );                 Implements:MatchResult
5.0  public boolean hasAnchoringBounds( );
5.0  public boolean hasTransparentBounds( );
5.0  public boolean hitEnd( );
     public boolean lookingAt( );
     public boolean matches( );
     public Pattern pattern( );
5.0  public Matcher region(int start, int end);
5.0  public int regionEnd( );
5.0  public int regionStart( );
     public String replaceAll(String replacement);
     public String replaceFirst(String replacement);
5.0  public boolean requireEnd( );
     public Matcher reset( );
     public Matcher reset(CharSequence input);
     public int start( );                      Implements:MatchResult
     public int start(int group);              Implements:MatchResult
5.0  public MatchResult toMatchResult( );
5.0  public Matcher useAnchoringBounds(boolean b);
5.0  public Matcher usePattern(Pattern newPattern);
5.0  public Matcher useTransparentBounds(boolean b);
// Methods Implementing MatchResult
     public int end( );
     public int end(int group);
     public String group( );
     public String group(int group);
     public int groupCount( );
     public int start( );
     public int start(int group);
// Public Methods Overriding Object
5.0  public String toString( );
}