Java in a Nutshell, 5th Edition [Electronic resources]


Matcher    java.util.regex

Java 1.4

A Matcher object encapsulates a regular expression and a string of text (a Pattern and a java.lang.CharSequence) and defines methods for matching the pattern to the text in several different ways, for obtaining details about pattern matches, and for doing search-and-replace operations on the text. Matcher has no public constructor. Obtain a Matcher by passing the character sequence to be matched to the matcher( ) method of the desired Pattern object. You can also reuse an existing Matcher object with a new character sequence (but the same Pattern) by passing a new CharSequence to the matcher's reset( ) method. In Java 5.0, you can use a new Pattern object on the current character sequence with the usePattern( ) method.
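As a minimal sketch of the three ways to pair a pattern with text described above, the following example obtains a Matcher from a Pattern, reuses it on new input with reset( ), and then swaps in a different Pattern with usePattern( ). The class name MatcherCreation, the helper allDigits( ), and the sample patterns are illustrative assumptions, not part of the API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only Pattern and Matcher come from java.util.regex.
class MatcherCreation {
    // Reuses an existing Matcher (same Pattern) on new input via reset(CharSequence).
    static boolean allDigits(Matcher m, CharSequence input) {
        return m.reset(input).matches();
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d+");

        // Matcher has no public constructor: ask the Pattern for one.
        Matcher m = digits.matcher("12345");
        System.out.println(m.matches());            // true

        // Same Matcher, same Pattern, new text.
        System.out.println(allDigits(m, "12a45"));  // false

        // Java 5.0: same text ("12a45"), new Pattern.
        m.usePattern(Pattern.compile("[0-9a-f]+"));
        System.out.println(m.matches());            // true
    }
}
```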

Once you have created or reset a Matcher, there are three types of comparisons you can perform between the regular expression and the character sequence. All three comparisons operate on the current region of the character sequence. By default, this region is the entire sequence. In Java 5.0, however, you can set the bounds of the region with region( ). The simplest type of comparison is the matches( ) method. It returns true if the pattern matches the complete region of the character sequence, and returns false otherwise. The lookingAt( ) method is similar: it returns true if the pattern matches the complete region, or if it matches some subsequence at the beginning of the region. If the pattern does not match the start of the region, lookingAt( ) returns false. matches( ) requires the pattern to match both the beginning and the end of the region, and lookingAt( ) requires the pattern to match only the beginning. The find( ) method, on the other hand, has neither of these requirements: it returns true if the pattern matches any part of the region. As described below, find( ) has some special behavior that allows it to be used in a loop to find all matches in the text.
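The difference between the three comparisons is easiest to see side by side. The sketch below applies the same pattern to several inputs; the class name ComparisonKinds and the helper compare( ) are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class comparing matches(), lookingAt(), and find().
class ComparisonKinds {
    // Returns { matches, lookingAt, find } for the regex applied to the text.
    // A fresh Matcher is used for each call so the results are independent.
    static boolean[] compare(String regex, String text) {
        Pattern p = Pattern.compile(regex);
        return new boolean[] {
            p.matcher(text).matches(),    // whole region must match
            p.matcher(text).lookingAt(),  // a prefix of the region must match
            p.matcher(text).find()        // any part of the region may match
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare("\\d+", "123")));    // [true, true, true]
        System.out.println(Arrays.toString(compare("\\d+", "123abc"))); // [false, true, true]
        System.out.println(Arrays.toString(compare("\\d+", "abc123"))); // [false, false, true]
        System.out.println(Arrays.toString(compare("\\d+", "abc")));    // [false, false, false]
    }
}
```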

If matches( ), lookingAt( ), or find( ) returns true, then several other Matcher methods can be used to obtain details about the matched text. The MatchResult interface defines the start( ), end( ), and group( ) methods that return the starting position, the ending position, and the text of the match and of any matching subexpressions within the Pattern. See MatchResult for details. The MatchResult interface is new in Java 5.0, but Matcher implements all of its methods in Java 1.4 as well. Calling MatchResult methods on a Matcher returns results from the most recent match. If you want to store these results, call toMatchResult( ) to obtain an independent, immutable MatchResult object whose methods can be queried later.
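The following sketch queries match details and shows that a MatchResult obtained from toMatchResult( ) is unaffected by later matching. The class name MatchDetails, the helper firstKeyValue( ), and the key=value pattern are assumptions made for the example.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for start(), end(), group(), and toMatchResult().
class MatchDetails {
    // Returns a snapshot of the first "key=number" match, or null if there is none.
    static MatchResult firstKeyValue(CharSequence text) {
        Matcher m = Pattern.compile("(\\w+)=(\\d+)").matcher(text);
        if (!m.find()) return null;
        MatchResult snapshot = m.toMatchResult();  // Java 5.0: independent copy
        m.find();  // advancing the Matcher does not change the snapshot
        return snapshot;
    }

    public static void main(String[] args) {
        MatchResult r = firstKeyValue("width=80 height=24");
        System.out.println(r.group());   // width=80
        System.out.println(r.start());   // 0
        System.out.println(r.end());     // 8
        System.out.println(r.group(1));  // width
        System.out.println(r.group(2));  // 80
        System.out.println(r.start(2));  // 6
    }
}
```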

The no-argument version of find( ) has special behavior that makes it suitable for use in a loop to find all matches of a pattern within a region. The first time find( ) is called after a Matcher is created or after the reset( ) method is called, it starts its search at the beginning of the string. If it finds a match, it stores the start and end positions of the matched text. If reset( ) is not called in the meantime, the next call to find( ) searches again but starts at the first character after the match: at the position returned by end( ). (If the previous call to find( ) matched the empty string, the next call begins at end( )+1 instead.) In this way, you can find all matches of a pattern within a string simply by calling find( ) repeatedly until it returns false, indicating that no further match was found. After each successful call to find( ), you can use the MatchResult methods to obtain more information about the text that matched the pattern and any of its subpatterns.
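The find( ) loop described above can be sketched as follows. The class name FindAll, the helper findAll( ), and the "text@position" result format are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: collect every match of a pattern with repeated find() calls.
class FindAll {
    // Returns each match as "text@startIndex".
    static List<String> findAll(String regex, CharSequence text) {
        List<String> results = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {  // each call resumes after the previous match
            results.add(m.group() + "@" + m.start());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1 b22 c333"));  // [1@1, 22@4, 333@8]
        // Empty matches advance by one character, so the loop still terminates.
        System.out.println(findAll("x*", "ab"));             // [@0, @1, @2]
    }
}
```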

Matcher also defines methods that perform search-and-replace operations. replaceFirst( ) searches the character sequence for the first subsequence that matches the pattern. It then returns a string that is the character sequence with the matched text replaced by the specified replacement string. replaceAll( ) is similar, but replaces all matching subsequences within the character sequence instead of just the first. The replacement string passed to replaceFirst( ) and replaceAll( ) is not always used literally. If the replacement contains a dollar sign followed by an integer that is a valid group number, the dollar sign and the number are replaced by the text that matched the numbered group. If you want to include a literal dollar sign in the replacement string, precede it with a backslash. In Java 5.0, you can use the static quoteReplacement( ) method to quote any special characters in a replacement string so that the string is interpreted literally.
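A short sketch of these replacement rules follows. The class name ReplaceDemo, its helper methods, and the sample strings are hypothetical; only replaceFirst( ), replaceAll( ), and quoteReplacement( ) are Matcher methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for replaceFirst(), replaceAll(), and quoteReplacement().
class ReplaceDemo {
    // Swaps each pair of words using the group references $2 and $1.
    static String swapWords(CharSequence text) {
        return Pattern.compile("(\\w+) (\\w+)").matcher(text).replaceAll("$2 $1");
    }

    // Inserts the replacement literally, even if it contains '$' or '\'.
    static String fillPlaceholder(CharSequence text, String literal) {
        return Pattern.compile("PRICE").matcher(text)
                      .replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("\\d+").matcher("a1b2");
        System.out.println(m.replaceFirst("#"));  // a#b2
        System.out.println(m.replaceAll("#"));    // a#b#

        System.out.println(swapWords("John Smith"));  // Smith John

        // Unquoted, "$5" would be interpreted as a reference to group 5.
        System.out.println(fillPlaceholder("cost: PRICE", "$5"));  // cost: $5
    }
}
```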

replaceFirst( ) and replaceAll( ) are convenience methods that cover the most common search-and-replace cases. However, Matcher also defines lower-level methods that you can use, in conjunction with calls to find( ), to perform a custom search-and-replace operation and build up a modified string in a StringBuffer. To understand this search-and-replace procedure, you must know that a Matcher maintains an "append position", which starts at zero when the Matcher is created and is restored to zero by the reset( ) method. The appendReplacement( ) method is designed to be used after a call to find( ) that returns true. It copies all the text between the append position and the character before the start( ) position of the last match into the specified string buffer. Then it appends the specified replacement text to that string buffer (performing the same substitutions that replaceAll( ) does). Finally, it sets the append position to the end( ) of the last match, so that a subsequent call to appendReplacement( ) starts at a new character. When find( ) cannot find another match and returns false, you should complete the replacement operation by calling appendTail( ): this method copies all text between the end( ) position of the last match and the end of the character sequence into the specified StringBuffer.
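The find( ), appendReplacement( ), appendTail( ) procedure can be sketched as follows, using a replacement that is computed from each match (something replaceAll( ) cannot express with a fixed replacement string). The class name CustomReplace and the helper doubleNumbers( ) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: custom search-and-replace with a computed replacement.
class CustomReplace {
    // Replaces every integer in the text with twice its value.
    static String doubleNumbers(CharSequence text) {
        Matcher m = Pattern.compile("\\d+").matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int n = Integer.parseInt(m.group());
            // Copies the text since the append position, then the replacement.
            // The replacement here contains only digits; arbitrary text should be
            // passed through Matcher.quoteReplacement() first.
            m.appendReplacement(sb, Integer.toString(n * 2));
        }
        m.appendTail(sb);  // copies whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 10 pears"));  // 6 apples and 20 pears
    }
}
```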

The reset( ) method has been mentioned several times. It erases any saved information about the last match and restores the Matcher to its initial state, so that subsequent calls to find( ) and appendReplacement( ) start at the beginning of the character sequence. The one-argument version of reset( ) also allows you to specify an entirely new character sequence to match against. It is important to understand that several other Matcher methods also disregard the position of the previous match. The one-argument version of find( ), replaceAll( ), and replaceFirst( ) call reset( ) themselves before they perform their operation. matches( ) and lookingAt( ) always begin matching at the start of the region, regardless of any previous match.
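The effect of reset( ) on a find( ) loop can be sketched as below. The class name ResetDemo and its helper methods are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing how reset() and find(int) restart a search.
class ResetDemo {
    // Finds the first word, advances once more, resets, and finds the first word again.
    static String firstWordTwice(CharSequence text) {
        Matcher m = Pattern.compile("\\w+").matcher(text);
        m.find();
        String before = m.group();  // first word
        m.find();                   // now positioned on the second word
        m.reset();                  // back to the initial state
        m.find();
        String after = m.group();   // first word again
        return before + "," + after;
    }

    // find(int) resets the Matcher and then searches from the given index.
    static String wordFrom(CharSequence text, int index) {
        Matcher m = Pattern.compile("\\w+").matcher(text);
        return m.find(index) ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstWordTwice("one two three"));  // one,one
        System.out.println(wordFrom("one two three", 4));     // two
    }
}
```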

Prior to Java 5.0, the region of the input text that a Matcher operates on was always the entire character sequence. In Java 5.0, you can define a different region with the region( ) method, which specifies the position of the first character in the region and the position of the first character after the end of the region. regionStart( ) and regionEnd( ) return the current values of these region bounds. By default, the bounds of a region are "anchoring", which means that the start and end of the region match the ^ and $ anchors. (See Pattern for regular expression grammar details.) Call useAnchoringBounds( ) to turn anchoring bounds on or off in Java 5.0. The bounds of a region are also "opaque" by default, which means that the Matcher will not look through the bounds in an attempt to match look-ahead or look-behind assertions (see Pattern). In Java 5.0, you can make the bounds transparent with useTransparentBounds(true).
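A minimal sketch of regions and their bounds follows; it requires Java 5.0 or later. The class name RegionDemo, the helper findInRegion( ), and the sample text are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for region(), useAnchoringBounds(), and useTransparentBounds().
class RegionDemo {
    // Searches only text[start, end) with the given bound settings.
    static boolean findInRegion(String regex, CharSequence text, int start, int end,
                                boolean anchoring, boolean transparent) {
        Matcher m = Pattern.compile(regex).matcher(text);
        m.region(start, end);  // start is inclusive, end is exclusive
        m.useAnchoringBounds(anchoring);
        m.useTransparentBounds(transparent);
        return m.find();
    }

    public static void main(String[] args) {
        String text = "xxabcxx";  // the region 2..5 covers "abc"

        // Anchoring bounds (the default): ^ and $ match at the region edges.
        System.out.println(findInRegion("^abc$", text, 2, 5, true, false));      // true
        // Non-anchoring bounds: ^ and $ refer to the whole input only.
        System.out.println(findInRegion("^abc$", text, 2, 5, false, false));     // false

        // Opaque bounds (the default): look-behind cannot see the 'x' before the region.
        System.out.println(findInRegion("(?<=x)abc", text, 2, 5, true, false));  // false
        // Transparent bounds: look-behind may examine text outside the region.
        System.out.println(findInRegion("(?<=x)abc", text, 2, 5, true, true));   // true
    }
}
```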

Matcher is not threadsafe, and should not be used by more than one thread concurrently.

Figure 16-131. java.util.regex.Matcher

public final class Matcher implements MatchResult {
// No Constructor
// Public Class Methods
5.0  public static String quoteReplacement(String s);
// Public Instance Methods
     public Matcher appendReplacement(StringBuffer sb, String replacement);
     public StringBuffer appendTail(StringBuffer sb);
     public int end( );                      Implements:MatchResult
     public int end(int group);              Implements:MatchResult
     public boolean find( );
     public boolean find(int start);
     public String group( );                 Implements:MatchResult
     public String group(int group);         Implements:MatchResult
     public int groupCount( );               Implements:MatchResult
5.0  public boolean hasAnchoringBounds( );
5.0  public boolean hasTransparentBounds( );
5.0  public boolean hitEnd( );
     public boolean lookingAt( );
     public boolean matches( );
     public Pattern pattern( );
5.0  public Matcher region(int start, int end);
5.0  public int regionEnd( );
5.0  public int regionStart( );
     public String replaceAll(String replacement);
     public String replaceFirst(String replacement);
5.0  public boolean requireEnd( );
     public Matcher reset( );
     public Matcher reset(CharSequence input);
     public int start( );                    Implements:MatchResult
     public int start(int group);            Implements:MatchResult
5.0  public MatchResult toMatchResult( );
5.0  public Matcher useAnchoringBounds(boolean b);
5.0  public Matcher usePattern(Pattern newPattern);
5.0  public Matcher useTransparentBounds(boolean b);
// Methods Implementing MatchResult
     public int end( );
     public int end(int group);
     public String group( );
     public String group(int group);
     public int groupCount( );
     public int start( );
     public int start(int group);
// Public Methods Overriding Object
5.0  public String toString( );
}

Returned By

Pattern.matcher( )