#ifndef STANDARDANALYZER_H
#define STANDARDANALYZER_H
#define LUCENE_CLASS(Name)
Definition LuceneObject.h:24
An Analyzer builds TokenStreams, which analyze text. It thus represents a policy for extracting index...
Definition Analyzer.h:19
Utility template class to handle hash set collections that can be safely copied and shared.
Definition HashSet.h:17
Version
Definition Constants.h:40
Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of Englis...
Definition StandardAnalyzer.h:23
virtual ~StandardAnalyzer()
StandardAnalyzer(LuceneVersion::Version matchVersion)
Builds an analyzer with the default stop words (STOP_WORDS_SET).
void setMaxTokenLength(int32_t length)
Set maximum allowed token length. If a token is seen that exceeds this length then it is discarded....
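The discard rule described for setMaxTokenLength can be illustrated with a small standalone sketch. This is not Lucene++ code: `dropOverlongTokens` is a hypothetical helper, and the limit used in the usage note is an arbitrary value chosen for illustration, not the library default.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Lucene++): keep only the tokens whose
// length does not exceed maxTokenLength; longer tokens are discarded.
std::vector<std::string> dropOverlongTokens(const std::vector<std::string>& tokens,
                                            std::size_t maxTokenLength) {
    assert(maxTokenLength > 0);  // a zero limit would discard every token
    std::vector<std::string> kept;
    for (const std::string& token : tokens) {
        if (token.size() <= maxTokenLength) {
            kept.push_back(token);
        }
    }
    return kept;
}
```

With a limit of 3, the input {"ab", "abcdef", "abc"} would keep "ab" and "abc" and drop "abcdef". A token whose length equals the limit is kept, since only tokens that exceed it are discarded.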
LuceneVersion::Version matchVersion
Definition StandardAnalyzer.h:61
virtual TokenStreamPtr tokenStream(const String &fieldName, const ReaderPtr &reader)
Constructs a StandardTokenizer filtered by a StandardFilter, a LowerCaseFilter and a StopFilter.
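The shape of that filter chain (tokenize, lowercase, remove stop words) can be sketched without the library. The code below is a simplified stand-in, not the Lucene++ implementation: `analyzeSketch` is a hypothetical name, it splits on any non-alphanumeric ASCII character rather than following the StandardTokenizer grammar, and it omits the StandardFilter step entirely.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of Lucene++): split text on non-alphanumeric
// characters, lowercase each token, then drop any token found in stopWords.
// Stop words are expected to be lowercase already.
std::vector<std::string> analyzeSketch(const std::string& text,
                                       const std::set<std::string>& stopWords) {
    std::vector<std::string> tokens;
    std::string current;

    // Emit the token collected so far unless it is empty or a stop word.
    auto flush = [&]() {
        if (!current.empty() && stopWords.count(current) == 0) {
            tokens.push_back(current);
        }
        current.clear();
    };

    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current.push_back(static_cast<char>(std::tolower(ch)));
        } else {
            flush();  // a separator ends the current token
        }
    }
    flush();  // emit the trailing token, if any
    return tokens;
}
```

Calling `analyzeSketch("The Quick brown-fox, a DOG", {"the", "a"})` yields the tokens "quick", "brown", "fox" and "dog". Lowercasing happens before the stop-word check, which is why "The" is removed by the lowercase entry "the".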
StandardAnalyzer(LuceneVersion::Version matchVersion, HashSet< String > stopWords)
Builds an analyzer with the given stop words.
bool replaceInvalidAcronym
Specifies whether deprecated acronyms should be replaced with HOST type.
Definition StandardAnalyzer.h:58
HashSet< String > stopSet
Definition StandardAnalyzer.h:55
int32_t maxTokenLength
Definition StandardAnalyzer.h:63
bool enableStopPositionIncrements
Definition StandardAnalyzer.h:59
virtual TokenStreamPtr reusableTokenStream(const String &fieldName, const ReaderPtr &reader)
Creates a TokenStream that is allowed to be re-used from the previous time that the same thread calle...
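The per-thread reuse idea can be sketched in plain C++ as follows. This is an assumption-laden illustration, not how Lucene++ stores its streams: `SketchStream` and `reusableStreamSketch` are hypothetical names, and `thread_local` storage is one possible way to keep a per-thread instance.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a token stream that can be pointed at new input.
struct SketchStream {
    std::string input;
    void reset(const std::string& text) { input = text; }
};

// Hypothetical helper (not part of Lucene++): return the stream cached for
// the calling thread, creating it on first use and resetting it to the new
// input on every call.
std::shared_ptr<SketchStream> reusableStreamSketch(const std::string& text) {
    thread_local std::shared_ptr<SketchStream> cached;
    if (!cached) {
        cached = std::make_shared<SketchStream>();
    }
    cached->reset(text);
    assert(cached->input == text);
    return cached;
}
```

Two successive calls from the same thread return the same object, so a caller must finish consuming one stream before requesting the next. A caller that needs two live streams at once would need a fresh stream per call instead.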
StandardAnalyzer(LuceneVersion::Version matchVersion, const ReaderPtr &stopwords)
Builds an analyzer with the stop words from the given reader.
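As a rough illustration of building a stop-word set from a character source, the sketch below reads from a standard `std::istream`. It is not the Lucene++ loader: `loadStopWordsSketch` is a hypothetical name, and the one-word-per-line format with whitespace trimming is an assumption made for this example.

```cpp
#include <istream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Lucene++): read one stop word per line,
// trim surrounding spaces, tabs and carriage returns, and skip blank lines.
// The one-word-per-line format is an assumption of this sketch.
std::set<std::string> loadStopWordsSketch(std::istream& in) {
    std::set<std::string> words;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) {
            continue;  // blank or whitespace-only line
        }
        const std::string::size_type last = line.find_last_not_of(" \t\r");
        words.insert(line.substr(first, last - first + 1));
    }
    return words;
}
```

Passing a `std::istringstream` that holds "the", "a" and "of" on separate lines would produce a set of those three words. The same function accepts a `std::ifstream`, which loosely mirrors the constructor variant that takes a file name.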
void ConstructAnalyser(LuceneVersion::Version matchVersion, HashSet< String > stopWords)
Construct an analyzer with the given stop words.
StandardAnalyzer(LuceneVersion::Version matchVersion, const String &stopwords)
Builds an analyzer with the stop words from the given file.
int32_t getMaxTokenLength()
static const int32_t DEFAULT_MAX_TOKEN_LENGTH
Default maximum allowed token length.
Definition StandardAnalyzer.h:52
Definition AbstractAllTermDocs.h:12
boost::shared_ptr< TokenStream > TokenStreamPtr
Definition LuceneTypes.h:63
boost::shared_ptr< Reader > ReaderPtr
Definition LuceneTypes.h:547