Uses of Package
org.apache.lucene.analysis

Packages that use org.apache.lucene.analysis
Package
Description
Text analysis.
Analyzer for Arabic.
Analyzer for Bulgarian.
Analyzer for Bengali.
Provides various convenience classes for creating boosts on Tokens.
Analyzer for Brazilian Portuguese.
Analyzer for Catalan.
Normalization of text before the tokenizer.
Analyzer for Chinese, Japanese, and Korean, which indexes bigrams.
Analyzer for Sorani Kurdish.
Fast, general-purpose grammar-based tokenizers.
Analyzer for Simplified Chinese, which indexes words.
Construct n-grams for frequently occurring terms and phrases.
A filter that decomposes compound words, found in many Germanic languages, into their word parts.
Basic, general-purpose analysis components.
A general-purpose Analyzer that can be created with a builder-style API.
Analyzer for Czech.
Analyzer for Danish.
Analyzer for German.
Analyzer for Greek.
Fast, general-purpose tokenizers for URLs and email addresses.
Analyzer for English.
Analyzer for Spanish.
Analyzer for Estonian.
Analyzer for Basque.
Analyzer for Persian.
Analyzer for Finnish.
Analyzer for French.
Analyzer for Irish.
Analyzer for Galician.
Analyzer for Hindi.
Analyzer for Hungarian.
A Java implementation of Hunspell stemming and spell-checking algorithms (Hunspell), and a stemming TokenFilter (HunspellStemFilter) based on it.
Analyzer for Armenian.
Analysis components based on ICU
Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm.
Analyzer for Indonesian.
Analyzer for Indian languages.
Analyzer for Italian.
Analyzer for Japanese.
Analyzer for Korean.
Analyzer for Lithuanian.
Analyzer for Latvian.
MinHash filtering (for LSH).
Miscellaneous TokenStreams.
Analyzer for Nepali.
Character n-gram tokenizers and filters.
Analyzer for Dutch.
Analyzer for Norwegian.
Analysis components for path-like strings such as filenames.
Set of components for pattern-based (regex) analysis.
Provides various convenience classes for creating payloads on Tokens.
Analysis components for phonetic search.
Analyzer for Polish.
Analyzer for Portuguese.
Automatically filter high-frequency stopwords.
Filter to reverse token text.
Analyzer for Romanian.
Analyzer for Russian.
Word n-gram filters.
TokenFilter and Analyzer implementations that use a modified version of Snowball stemmers.
Analyzer for Serbian.
Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29.
Stempel: Algorithmic Stemmer
Analyzer for Swedish.
Analysis components for Synonyms.
Analysis components for Synonyms using a Word2Vec model.
Analyzer for Tamil.
Analyzer for Telugu.
Analyzer for Thai.
Analyzer for Turkish.
Utility functions for text analysis.
Tokenizer that is aware of Wikipedia syntax.
Uses already seen data (the indexed documents) to classify an input (which can be simple text or a structured document).
Uses already seen data (the indexed documents) to classify new documents.
Utilities for evaluation, data preparation, etc.
Unicode collation support.
The logical representation of a Document for indexing and searching.
Taxonomy index implementation on top of a Directory.
Code to maintain and access indices.
High-performance single-document main memory Apache Lucene fulltext search index.
Misc extensions of the Document/Field API.
Monitoring framework
Intervals queries
Document similarity query generators.
A simple query parser implemented with JavaCC.
QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*"
Extendable QueryParser provides a simple and flexible extension mechanism by overloading query field names.
Precedence Query Parser Implementation
Lucene Flexible Query Parser Implementation
Standard Lucene Query Configuration.
Standard Lucene Query Nodes.
This package contains classes that implement interval function support for the standard syntax parser.
Lucene Query Node Processors.
A simple query parser for human-entered queries.
Parser that produces Lucene Query objects from XML streams.
XML Parser factories for different Lucene Query/Filters.
Additional queries (some may have caveats or limitations)
This package contains a flexible graph-based proximity query, TermAutomatonQuery, and geospatial queries.
Highlighting search terms.
This package contains several components useful to build a highlighter on top of the Matches API.
Analyzer based autosuggest.
Support for document suggestion
The UnifiedHighlighter -- a flexible highlighter that can get offsets from postings, term vectors, or analysis.
Some utility classes.
Utility classes for working with token streams as graphs.