Lucene 1.4.3 API

Jakarta Lucene is a high-performance, full-featured text search engine library.

Core
org.apache.lucene.analysis API and code to convert text into indexable tokens.
org.apache.lucene.analysis.de Support for indexing and searching of German text.
org.apache.lucene.analysis.ru Support for indexing and searching Russian text.
org.apache.lucene.analysis.standard A grammar-based tokenizer constructed with JavaCC.
org.apache.lucene.document The Document abstraction.
org.apache.lucene.index Code to maintain and access indices.
org.apache.lucene.queryParser A simple query parser implemented with JavaCC.
org.apache.lucene.search Search over indices.
org.apache.lucene.search.spans The calculus of spans.
org.apache.lucene.store Binary i/o API, used for all index data.
org.apache.lucene.util Some utility classes.

Contributions
org.apache.lucene.search.highlight The highlight package contains classes to provide "keyword in context" features typically used to highlight search terms in the text of results pages.

Jakarta Lucene is a high-performance, full-featured text search engine library. The API is divided into several packages, which are summarized in the listing above.

To use Lucene, an application should:
  1. Create Documents by adding Fields;
  2. Create an IndexWriter and add documents to it with addDocument();
  3. Call QueryParser.parse() to build a query from a string; and
  4. Create an IndexSearcher and pass the query to its search() method.
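
The four steps above can be pictured with a toy in-memory index. The following is a conceptual sketch in plain Java collections, not the Lucene API: the class ToyIndexSketch, its methods, the doc() helper, and the lowercase-and-split tokenizer are all hypothetical names chosen for illustration, and Lucene's analyzers, scoring, and on-disk storage are left out entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Conceptual sketch only: plain Java collections, NOT the Lucene API.
class ToyIndexSketch {
    // "field:token" -> ids of the documents whose field contains that token
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int nextId = 0;

    // Step 1: a document is a set of named fields (hypothetical helper).
    static Map<String, String> doc(String path, String contents) {
        Map<String, String> fields = new HashMap<>();
        fields.put("path", path);
        fields.put("contents", contents);
        return fields;
    }

    // Step 2: adding a document records every token of every field.
    int addDocument(Map<String, String> fields) {
        int id = nextId++;
        for (Map.Entry<String, String> f : fields.entrySet()) {
            for (String token : tokenize(f.getValue())) {
                postings.computeIfAbsent(f.getKey() + ":" + token, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Steps 3 and 4: parse "field:term" (a bare term goes to defaultField)
    // and return the ids of the matching documents in ascending order.
    List<Integer> search(String query, String defaultField) {
        int colon = query.indexOf(':');
        String field = colon < 0 ? defaultField : query.substring(0, colon).trim();
        List<String> tokens = tokenize(colon < 0 ? query : query.substring(colon + 1));
        if (tokens.size() != 1) {
            throw new IllegalArgumentException("this sketch handles single-term queries only");
        }
        Set<Integer> hits =
            postings.getOrDefault(field + ":" + tokens.get(0), Collections.<Integer>emptySet());
        return new ArrayList<>(hits);
    }

    // Assumed tokenizer: lowercase, then split on anything that is not an
    // ASCII letter or digit.
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        ToyIndexSketch index = new ToyIndexSketch();
        index.addDocument(doc("soups/clam-chowder", "Manhattan clam chowder"));
        index.addDocument(doc("soups/borscht", "Beet soup, not a chowder"));
        System.out.println(index.search("chowder", "contents"));
        System.out.println(index.search("path:chowder", "contents"));
    }
}
```

In this sketch a bare term is looked up in the default field, while a "path:" prefix restricts the lookup to the path field, which mirrors the difference between the "chowder" and "path:chowder" queries in the demo session.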
Simple examples of code that does this are the IndexFiles and SearchFiles demo programs. To demonstrate these, try something like:
> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexFiles rec.food.recipes/soups
adding rec.food.recipes/soups/abalone-chowder
  [ ... ]

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.SearchFiles
Query: chowder
Searching for: chowder
34 total matching documents
0. rec.food.recipes/soups/spam-chowder
  [ ... thirty-four documents contain the word "chowder"; "spam-chowder" has the greatest density. ]

Query: path:chowder
Searching for: path:chowder
31 total matching documents
0. rec.food.recipes/soups/abalone-chowder
  [ ... only thirty-one have "chowder" in the "path" field. ]

Query: path:"clam chowder"
Searching for: path:"clam chowder"
10 total matching documents
0. rec.food.recipes/soups/clam-chowder
  [ ... only ten have "clam chowder" in the "path" field. ]

Query: path:"clam chowder" AND manhattan
Searching for: +path:"clam chowder" +manhattan
2 total matching documents
0. rec.food.recipes/soups/clam-chowder
  [ ... only two also have "manhattan" in the contents. ]
    [ Note: "+" and "-" are canonical, but "AND", "OR" and "NOT" may be used. ]
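
The "+" and "-" prefixes in the canonical form can be read as "required" and "prohibited". The sketch below illustrates that reading on a single document's set of terms. It is plain Java, not Lucene code: BooleanClauseSketch and matches() are hypothetical names, and the treatment of unprefixed (optional) terms, where at least one must match when nothing is required, is an assumption made for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Conceptual sketch only: NOT the Lucene API.
class BooleanClauseSketch {
    // Each clause is "+term" (required), "-term" (prohibited) or "term" (optional).
    static boolean matches(Set<String> docTerms, List<String> clauses) {
        boolean anyRequired = false;
        boolean optionalHit = false;
        for (String clause : clauses) {
            if (clause.startsWith("+")) {
                anyRequired = true;
                // A missing required term rejects the document.
                if (!docTerms.contains(clause.substring(1))) return false;
            } else if (clause.startsWith("-")) {
                // A present prohibited term rejects the document.
                if (docTerms.contains(clause.substring(1))) return false;
            } else if (docTerms.contains(clause)) {
                optionalHit = true;
            }
        }
        // Assumption: with no required clause, one optional term must match.
        return anyRequired || optionalHit;
    }

    public static void main(String[] args) {
        Set<String> doc = new HashSet<>(Arrays.asList("manhattan", "clam", "chowder"));
        System.out.println(matches(doc, Arrays.asList("+clam", "+manhattan")));
        System.out.println(matches(doc, Arrays.asList("+clam", "-manhattan")));
    }
}
```

Under this reading, the query "path:"clam chowder" AND manhattan" from the session above becomes two required clauses, which is why it is echoed with a "+" in front of each.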

The IndexHtml demo is more sophisticated.  It incrementally maintains an index of HTML files, adding new files as they appear, deleting old files as they disappear and re-indexing files as they change.
> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML -create java/jdk1.1.6/docs/relnotes
adding java/jdk1.1.6/docs/relnotes/SMICopyright.html
  [ ... create an index containing all the relnotes ]

> rm java/jdk1.1.6/docs/relnotes/SMICopyright.html

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML java/jdk1.1.6/docs/relnotes
deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html
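
The incremental behaviour described above (add new files, delete vanished ones, re-index changed ones) amounts to comparing what the index already holds with what is currently on disk. The sketch below shows one way to compute such a plan. It is not the IndexHTML source: IncrementalPlanSketch and plan() are hypothetical names, and detecting changes by comparing modification timestamps is an assumption, since the text does not say how IndexHTML decides that a file has changed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Conceptual sketch only: NOT the IndexHTML implementation.
class IncrementalPlanSketch {
    final List<String> toAdd = new ArrayList<>();
    final List<String> toDelete = new ArrayList<>();
    final List<String> toReindex = new ArrayList<>();

    // Both maps go from file path to last-modified timestamp (assumed
    // change signal). Paths are visited in sorted order so the plan is stable.
    static IncrementalPlanSketch plan(Map<String, Long> indexed, Map<String, Long> onDisk) {
        IncrementalPlanSketch p = new IncrementalPlanSketch();
        for (String path : new TreeSet<>(onDisk.keySet())) {
            Long indexedStamp = indexed.get(path);
            if (indexedStamp == null) {
                p.toAdd.add(path);            // new file appeared
            } else if (!indexedStamp.equals(onDisk.get(path))) {
                p.toReindex.add(path);        // file changed since it was indexed
            }
        }
        for (String path : new TreeSet<>(indexed.keySet())) {
            if (!onDisk.containsKey(path)) {
                p.toDelete.add(path);         // file disappeared
            }
        }
        return p;
    }

    public static void main(String[] args) {
        Map<String, Long> indexed = new HashMap<>();
        indexed.put("relnotes/SMICopyright.html", 100L);
        indexed.put("relnotes/changes.html", 100L);
        Map<String, Long> onDisk = new HashMap<>();
        onDisk.put("relnotes/changes.html", 200L);
        onDisk.put("relnotes/new.html", 150L);
        IncrementalPlanSketch p = plan(indexed, onDisk);
        System.out.println("add " + p.toAdd + ", delete " + p.toDelete + ", re-index " + p.toReindex);
    }
}
```

A real indexer would then apply the plan: remove the documents in toDelete and toReindex from the index, and add fresh documents for toAdd and toReindex.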

HTML indexes are searched using Sun's JavaWebServer (JWS) and Search.jhtml. Note that indexes can be updated while searches are in progress. Search.jhtml re-opens the index when it is updated, so the latest version is immediately available.

Copyright © 2000-2005 Apache Software Foundation. All Rights Reserved.