Better Faster Lighter Java [Electronic resources] (text version)

Justin Gehtland; Bruce A. Tate

9.7 The Search Service


The search service uses the same collected object pattern as the
crawler/indexer. Our two classes this time are the
QueryBean, which is the main entry point into the
search service, and the HitBean, a representation
of a single result from the result set. To perform a search,
we need to know the location of the index to search, the search query
itself, and which field of the indexed documents to search:

private String query;
private String index;
private String field;

We also need an extensible collection to store our search results:

private List results = new ArrayList( );

We must provide a constructor for the class, which will take three
values:

public QueryBean(String index, String query, String field)
{
    this.field = field;
    this.index = index;
    this.query = query;
}

The field variable contains the name of the field of an indexable
document we want to search. We make this configurable so that future
versions can allow searching on any field in the document; for our
first version, the only important field is
"contents". We provide an overload of the
constructor that takes only index and
query and uses "contents" as
the default for field:

public QueryBean(String index, String query)
{
    this(index, query, "contents");
}

The search feature itself is fairly straightforward:

public void execute( ) throws IOException, ParseException {
    results.clear( );
    // A null query means there is nothing to search for; it is not an error.
    if (query == null) return;
    if (field == null) throw new IllegalArgumentException("field cannot be null");
    if (index == null) throw new IllegalArgumentException("index cannot be null");
    IndexSearcher indexSearcher = new IndexSearcher(index);
    try {
        Analyzer analyzer = new StandardAnalyzer( );
        Query q = QueryParser.parse(query, field, analyzer);
        Hits hits = indexSearcher.search(q);
        for (int n=0; n<hits.length( ); n++) {
            // Hits arrive in descending score order, so stop at the first weak one.
            if (hits.score(n) < THRESHOLD_SCORE) {
                return;
            }
            Document d = hits.doc(n);
            String title = safeGetFieldString(d, "title");
            results.add(new HitBean(d.getField("url").stringValue( ),
                title, hits.score(n)));
        }
    } finally {
        // Always release the index handle, even on early return or exception.
        indexSearcher.close( );
    }
}

First, we make sure our results collection is empty and check our
arguments: a null query simply produces no results, while a null field
or index is rejected with an exception. If the arguments pass, we
create a new instance of Lucene's
IndexSearcher, pointing it to the current version
of the search index. Next, we invoke Lucene to do the actual search
by having QueryParser build an instance of Lucene's
Query class from our search term(s), the
field we are searching, and a new instance of
Lucene's StandardAnalyzer. The
result of the IndexSearcher's
search method is a Lucene Hits object, whose entries
are sorted in descending order by score. We walk through the hits,
stopping at the first one that scores below THRESHOLD_SCORE, and grab
the values we need from each in order to create instances of our own
HitBean class. Notice we're using
the helper method safeGetFieldString to retrieve
values from the hit's document:

private String safeGetFieldString(Document d, String field) {
    Field f = d.getField(field);
    return (f == null) ? "" : f.stringValue( );
}

This ensures that a missing field yields the empty string rather than
a null as our field value. Last, but certainly not least, we
close the indexSearcher handle to the index; the call sits in the
finally block so that it runs no matter how the method exits. This
step is vital when we start exposing the service via a web service:
open handles to the index prevent other users from accessing it.
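
The interplay between the early return on a low score and the cleanup in the finally block can be tried out without Lucene. The following is a minimal sketch, not the book's code: FakeSearcher, ThresholdSketch, collectAbove, and the cutoff value 0.3f are hypothetical stand-ins for IndexSearcher, QueryBean, execute, and THRESHOLD_SCORE, and the scores are assumed to arrive in descending order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a searcher handle; this is NOT Lucene's IndexSearcher.
final class FakeSearcher {
    private final float[] scores;   // assumed to be sorted in descending order
    boolean closed = false;

    FakeSearcher(float... scores) { this.scores = scores; }

    int length() { return scores.length; }
    float score(int n) { return scores[n]; }
    void close() { closed = true; }
}

final class ThresholdSketch {
    // Assumed cutoff, chosen only for illustration.
    static final float THRESHOLD = 0.3f;

    // Collects scores until the first one below THRESHOLD, then stops.
    // Because scores are in descending order, nothing after it can qualify.
    static List<Float> collectAbove(FakeSearcher searcher) {
        List<Float> kept = new ArrayList<>();
        try {
            for (int n = 0; n < searcher.length(); n++) {
                if (searcher.score(n) < THRESHOLD) {
                    return kept;        // early exit; the finally block still runs
                }
                kept.add(searcher.score(n));
            }
            return kept;
        } finally {
            searcher.close();           // runs on every exit path
        }
    }

    public static void main(String[] args) {
        FakeSearcher s = new FakeSearcher(0.9f, 0.5f, 0.2f, 0.1f);
        System.out.println(collectAbove(s) + " closed=" + s.closed);
        // prints: [0.9, 0.5] closed=true
    }
}
```

The handle ends up closed whether the loop runs to completion, returns early, or throws, which is the property the search service relies on.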

The HitBean is primarily for storing simple result
data:

final String url;
final String title;
final float score;
// Shared formatter that limits scores to two fraction digits.
private static NumberFormat nf;
static {
    nf = NumberFormat.getNumberInstance( );
    nf.setMaximumFractionDigits(2);
}
public HitBean(String url, String title, float score) {
    this.url = url;
    this.title = title;
    this.score = score;
}
public String getScoreAsString( ) {
    return nf.format(getScore( ));
}
public String getUrl( ) {
    return url;
}
public String getTitle( ) {
    return title;
}
public float getScore( ) {
    return score;
}

Instances of the class store a full URL to the result file, the title
of that file, and a relative rank score. We provide a series of
getters to retrieve those values and a single constructor to
initialize them. The only interesting part is the use of the
java.text.NumberFormat class to create a formatter
for our result score.
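
To see what such a formatter does to a raw score, here is a small standalone sketch. The class name ScoreFormatSketch and the sample scores are illustrative assumptions. Unlike the listing above, it pins the locale to Locale.US so that the decimal separator is predictable, and it creates a formatter per call, since NumberFormat instances are not guaranteed to be safe for concurrent use.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class ScoreFormatSketch {
    // Formats a score with at most two fraction digits.
    static String format(float score) {
        NumberFormat nf = NumberFormat.getNumberInstance(Locale.US);
        nf.setMaximumFractionDigits(2);
        return nf.format(score);
    }

    public static void main(String[] args) {
        System.out.println(format(0.87654f)); // 0.88
        System.out.println(format(0.5f));     // 0.5
        System.out.println(format(1.0f));     // 1
    }
}
```

Trailing zeros are dropped, so a perfect score prints as 1 rather than 1.00. Calling setMinimumFractionDigits(2) as well would pad the output if fixed-width scores are wanted.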

Once we chose Lucene as our search tool, our code became very
straightforward. After a user supplies a search term, we simply
verify that the query will run as provided and then execute it,
compiling the results into a simple series of
HitBean instances.


9.7.1 Principles in Action


Keep it simple: simple objects representing query and results, unit
tests for search results

Choose the right tools: Lucene, JUnit

Do one thing, and do it well: QueryBean focuses on
search, HitBean is a simple data structure, and
IndexPathBean encapsulates the configurable index
property

Strive for transparency: shadow-copied index so searching and
indexing can run simultaneously

Allow for extension: none


