Lucene++ - a full-featured, c++ search engine
API Documentation


Lucene::AbstractField Class Reference

#include <AbstractField.h>


Public Types

enum  Store { STORE_YES , STORE_NO }
 Specifies whether and how a field should be stored. More...
 
enum  Index {
  INDEX_NO , INDEX_ANALYZED , INDEX_NOT_ANALYZED , INDEX_NOT_ANALYZED_NO_NORMS ,
  INDEX_ANALYZED_NO_NORMS
}
 Specifies whether and how a field should be indexed. More...
 
enum  TermVector {
  TERM_VECTOR_NO , TERM_VECTOR_YES , TERM_VECTOR_WITH_POSITIONS , TERM_VECTOR_WITH_OFFSETS ,
  TERM_VECTOR_WITH_POSITIONS_OFFSETS
}
 Specifies whether and how a field should have term vectors. More...
 

Public Member Functions

virtual ~AbstractField ()
 
virtual String getClassName ()
 
boost::shared_ptr< AbstractField > shared_from_this ()
 
virtual void setBoost (double boost)
 Sets the boost factor for hits on this field. This value will be multiplied into the score of all hits on this field of this document.
 
virtual double getBoost ()
 Returns the boost factor for hits for this field.
 
virtual String name ()
 Returns the name of the field as an interned string. For example "date", "title", "body", ...
 
virtual bool isStored ()
 True if the value of the field is to be stored in the index for return with search hits. It is an error for this to be true if a field is Reader-valued.
 
virtual bool isIndexed ()
 True if the value of the field is to be indexed, so that it may be searched on.
 
virtual bool isTokenized ()
 True if the value of the field should be tokenized as text prior to indexing. Un-tokenized fields are indexed as a single word and may not be Reader-valued.
 
virtual bool isTermVectorStored ()
 True if the term or terms used to index this field are stored as a term vector, available from IndexReader#getTermFreqVector(int,String). These methods do not provide access to the original content of the field, only to terms used to index it. If the original content must be preserved, use the stored attribute instead.
 
virtual bool isStoreOffsetWithTermVector ()
 True if terms are stored as term vector together with their offsets (start and end position in source text).
 
virtual bool isStorePositionWithTermVector ()
 True if terms are stored as term vector together with their token positions.
 
virtual bool isBinary ()
 True if the value of the field is stored as binary.
 
virtual ByteArray getBinaryValue ()
 Returns the raw byte[] for the binary field. You must also call getBinaryLength and getBinaryOffset to know which range of bytes in the returned array belongs to the field.
 
virtual ByteArray getBinaryValue (ByteArray result)
 Returns the raw byte[] for the binary field. You must also call getBinaryLength and getBinaryOffset to know which range of bytes in the returned array belongs to the field.
 
virtual int32_t getBinaryLength ()
 Returns the length of the byte[] segment that is used as the value. If the Field is not binary, the returned value is undefined.
 
virtual int32_t getBinaryOffset ()
 Returns the offset into the byte[] segment that is used as the value. If the Field is not binary, the returned value is undefined.
 
virtual bool getOmitNorms ()
 True if norms are omitted for this indexed field.
 
virtual bool getOmitTermFreqAndPositions ()
 
virtual void setOmitNorms (bool omitNorms)
 If set, omit normalization factors associated with this indexed field. This effectively disables indexing boosts and length normalization for this field.
 
virtual void setOmitTermFreqAndPositions (bool omitTermFreqAndPositions)
 If set, omit term freq, positions and payloads from postings for this field.
 
virtual bool isLazy ()
 Indicates whether a Field is Lazy or not. The semantics of Lazy loading are such that if a Field is lazily loaded, retrieving its values via stringValue() or getBinaryValue() is only valid as long as the IndexReader that retrieved the Document is still open.
 
virtual String toString ()
 Prints a Field for human consumption.
 
- Public Member Functions inherited from Lucene::Fieldable
virtual ~Fieldable ()
 
virtual String stringValue ()=0
 The value of the field as a String, or empty.
 
virtual ReaderPtr readerValue ()=0
 The value of the field as a Reader, which can be used at index time to generate indexed tokens.
 
virtual TokenStreamPtr tokenStreamValue ()=0
 The TokenStream for this field to be used when indexing, or null.
 
- Public Member Functions inherited from Lucene::LuceneObject
virtual ~LuceneObject ()
 
virtual void initialize ()
 Called directly after instantiation to create objects that depend on this object being fully constructed.
 
virtual LuceneObjectPtr clone (const LuceneObjectPtr &other=LuceneObjectPtr())
 Return clone of this object.
 
virtual int32_t hashCode ()
 Return hash code for this object.
 
virtual bool equals (const LuceneObjectPtr &other)
 Return whether two objects are equal.
 
virtual int32_t compareTo (const LuceneObjectPtr &other)
 Compare two objects.
 
- Public Member Functions inherited from Lucene::LuceneSync
virtual ~LuceneSync ()
 
virtual SynchronizePtr getSync ()
 Return this object's synchronization lock.
 
virtual LuceneSignalPtr getSignal ()
 Return this object's signal.
 
virtual void lock (int32_t timeout=0)
 Lock this object using an optional timeout.
 
virtual void unlock ()
 Unlock this object.
 
virtual bool holdsLock ()
 Returns true if this object is currently locked by the current thread.
 
virtual void wait (int32_t timeout=0)
 Wait for signal using an optional timeout.
 
virtual void notifyAll ()
 Notify all threads waiting for signal.
 

Static Public Member Functions

static String _getClassName ()
 
- Static Public Member Functions inherited from Lucene::Fieldable
static String _getClassName ()
 

Protected Member Functions

 AbstractField ()
 
 AbstractField (const String &name, Store store, Index index, TermVector termVector)
 
void setStoreTermVector (TermVector termVector)
 
- Protected Member Functions inherited from Lucene::LuceneObject
 LuceneObject ()
 

Protected Attributes

String _name
 
bool storeTermVector
 
bool storeOffsetWithTermVector
 
bool storePositionWithTermVector
 
bool _omitNorms
 
bool _isStored
 
bool _isIndexed
 
bool _isTokenized
 
bool _isBinary
 
bool lazy
 
bool omitTermFreqAndPositions
 
double boost
 
FieldsData fieldsData
 
TokenStreamPtr tokenStream
 
int32_t binaryLength
 
int32_t binaryOffset
 
- Protected Attributes inherited from Lucene::LuceneSync
SynchronizePtr objectLock
 
LuceneSignalPtr objectSignal
 

Member Enumeration Documentation

◆ Index

Specifies whether and how a field should be indexed.

Enumerator
INDEX_NO 

Do not index the field value. The field therefore cannot be searched, but its contents can still be accessed provided it is stored.

INDEX_ANALYZED 

Index the tokens produced by running the field's value through an Analyzer. This is useful for common text.

INDEX_NOT_ANALYZED 

Index the field's value without using an Analyzer, so it can be searched. As no analyzer is used the value will be stored as a single term. This is useful for unique Ids like product numbers.

INDEX_NOT_ANALYZED_NO_NORMS 

Index the field's value without an Analyzer, and also disable the storing of norms. Note that you can also separately enable or disable norms by calling setOmitNorms. No norms means that index-time field and document boosting and field length normalization are disabled. The benefit is lower memory usage, since during searching norms take up one byte of RAM per indexed field for every document in the index. Note that once you index a given field with norms enabled, disabling norms will have no effect. In other words, for this to have the effect described above, all instances of that field must be indexed with INDEX_NOT_ANALYZED_NO_NORMS from the beginning.

INDEX_ANALYZED_NO_NORMS 

Index the tokens produced by running the field's value through an Analyzer, and also separately disable the storing of norms. See INDEX_NOT_ANALYZED_NO_NORMS for what norms are and why you may want to disable them.
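
The enumerator descriptions above imply three independent properties per Index value: whether the field is searchable, whether it is run through an Analyzer, and whether norms are dropped. The following is a minimal sketch of that mapping, not the Lucene++ implementation. The enum is redeclared locally so the example compiles on its own, and IndexFlags, flagsFor and selfCheck are hypothetical names. Treating INDEX_NO as "norms omitted" is an assumption, on the reasoning that an unindexed field has no norms to keep.

```cpp
#include <cassert>

// Local stand-in for the enum documented above (not the Lucene++ declaration).
enum Index {
    INDEX_NO,
    INDEX_ANALYZED,
    INDEX_NOT_ANALYZED,
    INDEX_NOT_ANALYZED_NO_NORMS,
    INDEX_ANALYZED_NO_NORMS
};

// Hypothetical helper type: the three properties an Index value encodes.
struct IndexFlags {
    bool indexed;    // value can be searched
    bool tokenized;  // value is run through an Analyzer
    bool omitNorms;  // norms are not stored
};

// Derive the flags from the enumerator descriptions.
inline IndexFlags flagsFor(Index index) {
    switch (index) {
        case INDEX_ANALYZED:              return {true, true, false};
        case INDEX_NOT_ANALYZED:          return {true, false, false};
        case INDEX_NOT_ANALYZED_NO_NORMS: return {true, false, true};
        case INDEX_ANALYZED_NO_NORMS:     return {true, true, true};
        case INDEX_NO:
        default:                          return {false, false, true};  // assumption, see lead-in
    }
}

// Small usage check.
inline void selfCheck() {
    assert(flagsFor(INDEX_ANALYZED).indexed && flagsFor(INDEX_ANALYZED).tokenized);
    assert(!flagsFor(INDEX_NOT_ANALYZED).tokenized);
    assert(flagsFor(INDEX_ANALYZED_NO_NORMS).omitNorms);
    assert(!flagsFor(INDEX_NO).indexed);
}
```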

◆ Store

Specifies whether and how a field should be stored.

Enumerator
STORE_YES 

Store the original field value in the index. This is useful for short texts like a document's title which should be displayed with the results. The value is stored in its original form, i.e. no analyzer is used before it is stored.

STORE_NO 

Do not store the field value in the index.

◆ TermVector

Specifies whether and how a field should have term vectors.

Enumerator
TERM_VECTOR_NO 

Do not store term vectors.

TERM_VECTOR_YES 

Store the term vectors of each document. A term vector is a list of the document's terms and their number of occurrences in that document.

TERM_VECTOR_WITH_POSITIONS 

Store the term vector + token position information.

See also
TERM_VECTOR_YES
TERM_VECTOR_WITH_OFFSETS 

Store the term vector + token offset information.

See also
TERM_VECTOR_YES
TERM_VECTOR_WITH_POSITIONS_OFFSETS 

Store the term vector + token position and offset information.

See also
TERM_VECTOR_YES
TERM_VECTOR_WITH_POSITIONS
TERM_VECTOR_WITH_OFFSETS
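
To make the difference between the TermVector options concrete, here is a minimal sketch of what each level records for one field value. It is an illustration only: whitespace splitting stands in for a real Analyzer, and TermVectorEntry, buildTermVector and selfCheck are hypothetical names that do not exist in Lucene++.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one term of a term vector.
struct TermVectorEntry {
    int frequency = 0;                                         // TERM_VECTOR_YES
    std::vector<int> positions;                                // ..._WITH_POSITIONS
    std::vector<std::pair<std::size_t, std::size_t>> offsets;  // ..._WITH_OFFSETS, as [start, end)
};

// Build term -> (frequency, token positions, character offsets) for one text.
inline std::map<std::string, TermVectorEntry> buildTermVector(const std::string& text) {
    std::map<std::string, TermVectorEntry> termVector;
    int position = 0;
    std::size_t i = 0;
    while (i < text.size()) {
        while (i < text.size() && text[i] == ' ') ++i;  // skip separators
        const std::size_t start = i;
        while (i < text.size() && text[i] != ' ') ++i;  // consume one token
        if (i > start) {
            TermVectorEntry& entry = termVector[text.substr(start, i - start)];
            ++entry.frequency;
            entry.positions.push_back(position++);
            entry.offsets.emplace_back(start, i);
        }
    }
    return termVector;
}

// Small usage check: "to" occurs twice, as token 0 and token 4.
inline void selfCheck() {
    const std::map<std::string, TermVectorEntry> tv = buildTermVector("to be or not to be");
    const TermVectorEntry& to = tv.at("to");
    assert(to.frequency == 2);
    assert(to.positions.size() == 2 && to.positions[0] == 0 && to.positions[1] == 4);
    assert(to.offsets[1].first == 13 && to.offsets[1].second == 15);
}
```

With TERM_VECTOR_YES only the frequency would be kept; the position and offset lists correspond to the richer options.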

Constructor & Destructor Documentation

◆ ~AbstractField()

virtual Lucene::AbstractField::~AbstractField ( )
virtual

◆ AbstractField() [1/2]

Lucene::AbstractField::AbstractField ( )
protected

◆ AbstractField() [2/2]

Lucene::AbstractField::AbstractField ( const String &  name,
Store  store,
Index  index,
TermVector  termVector 
)
protected

Member Function Documentation

◆ _getClassName()

static String Lucene::AbstractField::_getClassName ( )
inline static

◆ getBinaryLength()

virtual int32_t Lucene::AbstractField::getBinaryLength ( )
virtual

Returns the length of the byte[] segment that is used as the value. If the Field is not binary, the returned value is undefined.

Returns
length of byte[] segment that represents this Field value.

Implements Lucene::Fieldable.

◆ getBinaryOffset()

virtual int32_t Lucene::AbstractField::getBinaryOffset ( )
virtual

Returns the offset into the byte[] segment that is used as the value. If the Field is not binary, the returned value is undefined.

Returns
index of the first byte in the byte[] segment that represents this Field value.

Implements Lucene::Fieldable.

◆ getBinaryValue() [1/2]

virtual ByteArray Lucene::AbstractField::getBinaryValue ( )
virtual

Returns the raw byte[] for the binary field. You must also call getBinaryLength and getBinaryOffset to know which range of bytes in the returned array belongs to the field.

Returns
reference to the Field value as byte[].

Implements Lucene::Fieldable.
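
Because the returned array may be larger than the field value, a caller has to combine it with the offset and length. The sketch below shows that slicing step in isolation. It uses std::vector<std::uint8_t> as a stand-in for ByteArray, and fieldBytes and selfCheck are hypothetical helpers, not part of Lucene++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Copy only the bytes that belong to the field out of a larger backing buffer.
// `offset` and `length` play the roles of getBinaryOffset() and getBinaryLength().
inline std::vector<std::uint8_t> fieldBytes(const std::vector<std::uint8_t>& buffer,
                                            std::int32_t offset, std::int32_t length) {
    if (offset < 0 || length < 0 ||
        static_cast<std::size_t>(offset) + static_cast<std::size_t>(length) > buffer.size()) {
        throw std::out_of_range("offset/length outside the backing buffer");
    }
    return std::vector<std::uint8_t>(buffer.begin() + offset,
                                     buffer.begin() + offset + length);
}

// Small usage check: the value occupies bytes 2..4 of a six-byte buffer.
inline void selfCheck() {
    const std::vector<std::uint8_t> buffer = {9, 9, 1, 2, 3, 9};
    const std::vector<std::uint8_t> expected = {1, 2, 3};
    assert(fieldBytes(buffer, 2, 3) == expected);
}
```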

◆ getBinaryValue() [2/2]

virtual ByteArray Lucene::AbstractField::getBinaryValue ( ByteArray  result)
virtual

Returns the raw byte[] for the binary field. You must also call getBinaryLength and getBinaryOffset to know which range of bytes in the returned array belongs to the field.

Returns
reference to the Field value as byte[].

Implements Lucene::Fieldable.

Reimplemented in Lucene::LazyField, and Lucene::NumericField.

◆ getBoost()

virtual double Lucene::AbstractField::getBoost ( )
virtual

Returns the boost factor for hits for this field.

The default value is 1.0.

Note: this value is not stored directly with the document in the index. Documents returned from IndexReader#document(int) and Searcher#doc(int) may thus not have the same value present as when this field was indexed.

Implements Lucene::Fieldable.

◆ getClassName()

virtual String Lucene::AbstractField::getClassName ( )
inline virtual

Reimplemented from Lucene::Fieldable.

Reimplemented in Lucene::Field, Lucene::LazyField, and Lucene::NumericField.

◆ getOmitNorms()

virtual bool Lucene::AbstractField::getOmitNorms ( )
virtual

True if norms are omitted for this indexed field.

Implements Lucene::Fieldable.

◆ getOmitTermFreqAndPositions()

virtual bool Lucene::AbstractField::getOmitTermFreqAndPositions ( )
virtual

◆ isBinary()

virtual bool Lucene::AbstractField::isBinary ( )
virtual

True if the value of the field is stored as binary.

Implements Lucene::Fieldable.

◆ isIndexed()

virtual bool Lucene::AbstractField::isIndexed ( )
virtual

True if the value of the field is to be indexed, so that it may be searched on.

Implements Lucene::Fieldable.

Reimplemented in Lucene::Field.

◆ isLazy()

virtual bool Lucene::AbstractField::isLazy ( )
virtual

Indicates whether a Field is Lazy or not. The semantics of Lazy loading are such that if a Field is lazily loaded, retrieving its values via stringValue() or getBinaryValue() is only valid as long as the IndexReader that retrieved the Document is still open.

Returns
true if this field can be loaded lazily

Implements Lucene::Fieldable.

◆ isStored()

virtual bool Lucene::AbstractField::isStored ( )
virtual

True if the value of the field is to be stored in the index for return with search hits. It is an error for this to be true if a field is Reader-valued.

Implements Lucene::Fieldable.

Reimplemented in Lucene::Field.

◆ isStoreOffsetWithTermVector()

virtual bool Lucene::AbstractField::isStoreOffsetWithTermVector ( )
virtual

True if terms are stored as term vector together with their offsets (start and end position in source text).

Implements Lucene::Fieldable.

◆ isStorePositionWithTermVector()

virtual bool Lucene::AbstractField::isStorePositionWithTermVector ( )
virtual

True if terms are stored as term vector together with their token positions.

Implements Lucene::Fieldable.

◆ isTermVectorStored()

virtual bool Lucene::AbstractField::isTermVectorStored ( )
virtual

True if the term or terms used to index this field are stored as a term vector, available from IndexReader#getTermFreqVector(int,String). These methods do not provide access to the original content of the field, only to terms used to index it. If the original content must be preserved, use the stored attribute instead.

Implements Lucene::Fieldable.

◆ isTokenized()

virtual bool Lucene::AbstractField::isTokenized ( )
virtual

True if the value of the field should be tokenized as text prior to indexing. Un-tokenized fields are indexed as a single word and may not be Reader-valued.

Implements Lucene::Fieldable.

◆ name()

virtual String Lucene::AbstractField::name ( )
virtual

Returns the name of the field as an interned string. For example "date", "title", "body", ...

Implements Lucene::Fieldable.

◆ setBoost()

virtual void Lucene::AbstractField::setBoost ( double  boost)
virtual

Sets the boost factor for hits on this field. This value will be multiplied into the score of all hits on this field of this document.

The boost is multiplied by Document#getBoost() of the document containing this field. If a document has multiple fields with the same name, all such values are multiplied together. This product is then used to compute the norm factor for the field. By default, in the Similarity#computeNorm(String, FieldInvertState) method, the boost value is multiplied by the Similarity#lengthNorm(String,int) and then rounded by Similarity#encodeNorm(double) before it is stored in the index. One should attempt to ensure that this product does not overflow the range of that encoding.

See also
Document::setBoost(double)
Similarity::computeNorm(String, FieldInvertState)
Similarity::encodeNorm(double)

Implements Lucene::Fieldable.
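
The product described above can be written out as plain arithmetic. This is a sketch under stated assumptions rather than the library's code: the length normalization is taken to be 1/sqrt(numTerms), which is an assumption about the default Similarity, and the final one-byte encoding step is left out. The names lengthNorm, rawNorm and selfCheck are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Assumed default length normalization; requires numTerms > 0.
inline double lengthNorm(int numTerms) {
    return 1.0 / std::sqrt(static_cast<double>(numTerms));
}

// Unencoded norm: document boost, times the boosts of every field instance
// sharing the same name, times the length normalization.
inline double rawNorm(double documentBoost, const std::vector<double>& fieldBoosts, int numTerms) {
    double boost = documentBoost;
    for (double fieldBoost : fieldBoosts) {
        boost *= fieldBoost;
    }
    return boost * lengthNorm(numTerms);
}

// Small usage check: 2.0 * (1.5 * 2.0) * 1/sqrt(4) = 3.0
inline void selfCheck() {
    const std::vector<double> boosts = {1.5, 2.0};
    assert(std::fabs(rawNorm(2.0, boosts, 4) - 3.0) < 1e-12);
}
```

Since the stored value is rounded to a coarse encoding, large boosts combined with short fields are where the product is most likely to leave the representable range.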

◆ setOmitNorms()

virtual void Lucene::AbstractField::setOmitNorms ( bool  omitNorms)
virtual

If set, omit normalization factors associated with this indexed field. This effectively disables indexing boosts and length normalization for this field.

Implements Lucene::Fieldable.
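
As a rough guide to what omitting norms saves, the description of INDEX_NOT_ANALYZED_NO_NORMS states that norms cost one byte per indexed field per document while searching. The helper below just spells out that multiplication; normsBytes and selfCheck are hypothetical names, and real memory use may differ from this estimate.

```cpp
#include <cassert>
#include <cstdint>

// Estimated RAM used by norms at search time: one byte per document
// for each indexed field that keeps norms.
inline std::int64_t normsBytes(std::int64_t numDocs, std::int64_t fieldsWithNorms) {
    return numDocs * fieldsWithNorms;
}

// Small usage check: 10 million documents with 5 normed fields.
inline void selfCheck() {
    assert(normsBytes(10000000, 5) == 50000000);  // 50,000,000 bytes
}
```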

◆ setOmitTermFreqAndPositions()

virtual void Lucene::AbstractField::setOmitTermFreqAndPositions ( bool  omitTermFreqAndPositions)
virtual

If set, omit term freq, positions and payloads from postings for this field.

NOTE: While this option reduces the storage space required in the index, it also means that any query requiring positional information, such as PhraseQuery or SpanQuery subclasses, will silently fail to find results.

Implements Lucene::Fieldable.
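
The silent failure mentioned in the note follows from how phrase matching works: it needs token positions, and none are recorded. The toy model below illustrates this for a single document. It is not how Lucene++ stores postings, and Postings, indexDocument, matchesTerm, matchesPhrase and selfCheck are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy postings for one document: term -> token positions.
// When positions are omitted, each term is present but its list stays empty.
using Postings = std::map<std::string, std::vector<int>>;

inline Postings indexDocument(const std::vector<std::string>& tokens,
                              bool omitTermFreqAndPositions) {
    Postings postings;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        std::vector<int>& positions = postings[tokens[i]];
        if (!omitTermFreqAndPositions) {
            positions.push_back(static_cast<int>(i));
        }
    }
    return postings;
}

// A single-term lookup only needs to know that the term occurs.
inline bool matchesTerm(const Postings& postings, const std::string& term) {
    return postings.count(term) > 0;
}

// A two-term phrase needs a position of `second` directly after one of `first`.
inline bool matchesPhrase(const Postings& postings,
                          const std::string& first, const std::string& second) {
    const auto a = postings.find(first);
    const auto b = postings.find(second);
    if (a == postings.end() || b == postings.end()) return false;
    for (int p : a->second) {
        for (int q : b->second) {
            if (q == p + 1) return true;
        }
    }
    return false;
}

// Small usage check: the phrase matches only when positions were recorded.
inline void selfCheck() {
    const std::vector<std::string> tokens = {"quick", "brown", "fox"};
    assert(matchesPhrase(indexDocument(tokens, false), "quick", "brown"));
    assert(matchesTerm(indexDocument(tokens, true), "quick"));
    assert(!matchesPhrase(indexDocument(tokens, true), "quick", "brown"));
}
```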

◆ setStoreTermVector()

void Lucene::AbstractField::setStoreTermVector ( TermVector  termVector)
protected
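
This member carries no description, but the TermVector enumerators and the three protected flags (storeTermVector, storePositionWithTermVector, storeOffsetWithTermVector) suggest a mapping along the following lines. This is a guess at the intent, not the Lucene++ implementation. The enum is redeclared locally so the example compiles on its own, and TermVectorFlags, termVectorFlags and selfCheck are hypothetical names.

```cpp
#include <cassert>

// Local stand-in for the enum documented on this page (not the Lucene++ declaration).
enum TermVector {
    TERM_VECTOR_NO,
    TERM_VECTOR_YES,
    TERM_VECTOR_WITH_POSITIONS,
    TERM_VECTOR_WITH_OFFSETS,
    TERM_VECTOR_WITH_POSITIONS_OFFSETS
};

// Hypothetical helper type mirroring the three protected flags.
struct TermVectorFlags {
    bool stored;         // storeTermVector
    bool withPositions;  // storePositionWithTermVector
    bool withOffsets;    // storeOffsetWithTermVector
};

// Derive the flags from the enumerator descriptions.
inline TermVectorFlags termVectorFlags(TermVector termVector) {
    switch (termVector) {
        case TERM_VECTOR_YES:                    return {true, false, false};
        case TERM_VECTOR_WITH_POSITIONS:         return {true, true, false};
        case TERM_VECTOR_WITH_OFFSETS:           return {true, false, true};
        case TERM_VECTOR_WITH_POSITIONS_OFFSETS: return {true, true, true};
        case TERM_VECTOR_NO:
        default:                                 return {false, false, false};
    }
}

// Small usage check.
inline void selfCheck() {
    assert(!termVectorFlags(TERM_VECTOR_NO).stored);
    assert(termVectorFlags(TERM_VECTOR_WITH_POSITIONS).withPositions);
    assert(!termVectorFlags(TERM_VECTOR_WITH_POSITIONS).withOffsets);
}
```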

◆ shared_from_this()

boost::shared_ptr< AbstractField > Lucene::AbstractField::shared_from_this ( )
inline

◆ toString()

virtual String Lucene::AbstractField::toString ( )
virtual

Prints a Field for human consumption.

Reimplemented from Lucene::LuceneObject.

Field Documentation

◆ _isBinary

bool Lucene::AbstractField::_isBinary
protected

◆ _isIndexed

bool Lucene::AbstractField::_isIndexed
protected

◆ _isStored

bool Lucene::AbstractField::_isStored
protected

◆ _isTokenized

bool Lucene::AbstractField::_isTokenized
protected

◆ _name

String Lucene::AbstractField::_name
protected

◆ _omitNorms

bool Lucene::AbstractField::_omitNorms
protected

◆ binaryLength

int32_t Lucene::AbstractField::binaryLength
protected

◆ binaryOffset

int32_t Lucene::AbstractField::binaryOffset
protected

◆ boost

double Lucene::AbstractField::boost
protected

◆ fieldsData

FieldsData Lucene::AbstractField::fieldsData
protected

◆ lazy

bool Lucene::AbstractField::lazy
protected

◆ omitTermFreqAndPositions

bool Lucene::AbstractField::omitTermFreqAndPositions
protected

◆ storeOffsetWithTermVector

bool Lucene::AbstractField::storeOffsetWithTermVector
protected

◆ storePositionWithTermVector

bool Lucene::AbstractField::storePositionWithTermVector
protected

◆ storeTermVector

bool Lucene::AbstractField::storeTermVector
protected

◆ tokenStream

TokenStreamPtr Lucene::AbstractField::tokenStream
protected

The documentation for this class was generated from the following file:

clucene.sourceforge.net