Lucene++ - a full-featured, c++ search engine
API Documentation


Lucene::SloppyPhraseScorer Class Reference

#include <SloppyPhraseScorer.h>

Public Member Functions

 SloppyPhraseScorer (const WeightPtr &weight, Collection< TermPositionsPtr > tps, Collection< int32_t > offsets, const SimilarityPtr &similarity, int32_t slop, ByteArray norms)
 
virtual ~SloppyPhraseScorer ()
 
virtual String getClassName ()
 
boost::shared_ptr< SloppyPhraseScorer > shared_from_this ()
 
virtual double phraseFreq ()
 Score a candidate doc for all slop-valid position-combinations (matches) encountered while traversing/hopping the PhrasePositions. The score contribution of a match depends on the distance:
 
- Public Member Functions inherited from Lucene::PhraseScorer
 PhraseScorer (const WeightPtr &weight, Collection< TermPositionsPtr > tps, Collection< int32_t > offsets, const SimilarityPtr &similarity, ByteArray norms)
 
virtual ~PhraseScorer ()
 
boost::shared_ptr< PhraseScorer > shared_from_this ()
 
virtual int32_t docID ()
 Returns the following:
 
virtual int32_t nextDoc ()
 Advances to the next document in the set and returns the doc it is currently on, or NO_MORE_DOCS if there are no more docs in the set.
 
virtual double score ()
 Returns the score of the current document matching the query. Initially invalid, until nextDoc() or advance(int32_t) is called the first time, or when called from within Collector#collect.
 
virtual int32_t advance (int32_t target)
 Advances to the first document beyond the current one whose document number is greater than or equal to target. Returns the current document number or NO_MORE_DOCS if there are no more docs in the set.
 
double currentFreq ()
 Phrase frequency in current doc as computed by phraseFreq().
 
virtual float termFreq ()
 
virtual String toString ()
 Returns a string representation of the object.
 
- Public Member Functions inherited from Lucene::Scorer
 Scorer (const SimilarityPtr &similarity)
 Constructs a Scorer.
 
 Scorer (const WeightPtr &weight)
 
virtual ~Scorer ()
 
boost::shared_ptr< Scorer > shared_from_this ()
 
SimilarityPtr getSimilarity ()
 Returns the Similarity implementation used by this scorer.
 
virtual void score (const CollectorPtr &collector)
 Scores and collects all matching documents.
 
void visitSubScorers (QueryPtr parent, BooleanClause::Occur relationship, ScorerVisitor *visitor)
 
void visitScorers (ScorerVisitor *visitor)
 
- Public Member Functions inherited from Lucene::DocIdSetIterator
virtual ~DocIdSetIterator ()
 
boost::shared_ptr< DocIdSetIterator > shared_from_this ()
 
- Public Member Functions inherited from Lucene::LuceneObject
virtual ~LuceneObject ()
 
virtual void initialize ()
 Called directly after instantiation to create objects that depend on this object being fully constructed.
 
virtual LuceneObjectPtr clone (const LuceneObjectPtr &other=LuceneObjectPtr())
 Return clone of this object.
 
virtual int32_t hashCode ()
 Return hash code for this object.
 
virtual bool equals (const LuceneObjectPtr &other)
 Return whether two objects are equal.
 
virtual int32_t compareTo (const LuceneObjectPtr &other)
 Compare two objects.
 
- Public Member Functions inherited from Lucene::LuceneSync
virtual ~LuceneSync ()
 
virtual SynchronizePtr getSync ()
 Return this object's synchronize lock.
 
virtual LuceneSignalPtr getSignal ()
 Return this object's signal.
 
virtual void lock (int32_t timeout=0)
 Lock this object using an optional timeout.
 
virtual void unlock ()
 Unlock this object.
 
virtual bool holdsLock ()
 Returns true if this object is currently locked by current thread.
 
virtual void wait (int32_t timeout=0)
 Wait for signal using an optional timeout.
 
virtual void notifyAll ()
 Notify all threads waiting for signal.
 

Static Public Member Functions

static String _getClassName ()
 
- Static Public Member Functions inherited from Lucene::PhraseScorer
static String _getClassName ()
 
- Static Public Member Functions inherited from Lucene::Scorer
static String _getClassName ()
 
- Static Public Member Functions inherited from Lucene::DocIdSetIterator
static String _getClassName ()
 

Protected Member Functions

PhrasePositions * flip (PhrasePositions *pp, PhrasePositions *pp2)
 Flip pp2 and pp in the queue: pop until finding pp2, insert back all but pp2, insert pp back. Assumes: pp!=pp2, pp2 in pq, pp not in pq. Called only when there are repeating pps.
 
int32_t initPhrasePositions ()
 Init PhrasePositions in place. There is a one time initialization for this scorer:
 
PhrasePositions * termPositionsDiffer (PhrasePositions *pp)
 We disallow two pp's to have the same TermPosition, thereby verifying multiple occurrences in the query of the same word would go elsewhere in the matched doc.
 
- Protected Member Functions inherited from Lucene::PhraseScorer
bool doNext ()
 Next without initial increment.
 
void init ()
 
void sort ()
 
void pqToList ()
 
void firstToLast ()
 
- Protected Member Functions inherited from Lucene::Scorer
virtual bool score (const CollectorPtr &collector, int32_t max, int32_t firstDocID)
 Collects matching documents in a range. Hook for optimization. Note, firstDocID is added to ensure that nextDoc() was called before this method.
 
- Protected Member Functions inherited from Lucene::LuceneObject
 LuceneObject ()
 

Protected Attributes

int32_t slop
 
Collection< PhrasePositions * > repeats
 
Collection< PhrasePositions * > tmpPos
 
bool checkedRepeats
 
- Protected Attributes inherited from Lucene::PhraseScorer
WeightPtr weight
 
Weight * __weight = nullptr
 
ByteArray norms
 
double value
 
bool firstTime
 
bool more
 
PhraseQueuePtr pq
 
std::vector< PhrasePositionsPtr > _holds
 
PhrasePositions * __first = nullptr
 
PhrasePositions * __last = nullptr
 
double freq
 
- Protected Attributes inherited from Lucene::Scorer
SimilarityPtr similarity
 
- Protected Attributes inherited from Lucene::LuceneSync
SynchronizePtr objectLock
 
LuceneSignalPtr objectSignal
 

Additional Inherited Members

- Data Fields inherited from Lucene::Scorer
WeightPtr weight
 
- Static Public Attributes inherited from Lucene::DocIdSetIterator
static const int32_t NO_MORE_DOCS
 When returned by nextDoc(), advance(int) and docID() it means there are no more docs in the iterator.
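
The iteration contract described for nextDoc(), advance() and NO_MORE_DOCS can be illustrated with a toy iterator. This is only a sketch and does not use the Lucene++ classes: ToyDocIterator and drain are hypothetical names, and the value chosen for NO_MORE_DOCS (the largest int32_t) is an assumption made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sentinel; assumed here to be the largest int32_t value.
const std::int32_t NO_MORE_DOCS = std::numeric_limits<std::int32_t>::max();

// Hypothetical in-memory iterator over a sorted list of document ids.
class ToyDocIterator {
public:
    explicit ToyDocIterator(const std::vector<std::int32_t>& docs)
        : docs_(docs), next_(0) {}

    // Move to the next document, or report that the set is exhausted.
    std::int32_t nextDoc() {
        if (next_ >= docs_.size()) return NO_MORE_DOCS;
        return docs_[next_++];
    }

    // Move to the first document beyond the current one whose id is >= target.
    std::int32_t advance(std::int32_t target) {
        std::int32_t doc;
        while ((doc = nextDoc()) < target) {
            // keep stepping; NO_MORE_DOCS is never below target, so this ends
        }
        return doc;
    }

private:
    std::vector<std::int32_t> docs_;
    std::size_t next_;
};

// Collect every document id the iterator still yields.
std::vector<std::int32_t> drain(ToyDocIterator& it) {
    std::vector<std::int32_t> out;
    for (std::int32_t doc = it.nextDoc(); doc != NO_MORE_DOCS; doc = it.nextDoc()) {
        out.push_back(doc);
    }
    return out;
}
```

A caller loops on nextDoc() until the sentinel comes back, and uses advance(target) to skip ahead without visiting the intermediate documents one by one.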
 

Constructor & Destructor Documentation

◆ SloppyPhraseScorer()

Lucene::SloppyPhraseScorer::SloppyPhraseScorer ( const WeightPtr & weight,
Collection< TermPositionsPtr > tps,
Collection< int32_t > offsets,
const SimilarityPtr & similarity,
int32_t slop,
ByteArray norms
)

◆ ~SloppyPhraseScorer()

virtual Lucene::SloppyPhraseScorer::~SloppyPhraseScorer ( )
virtual

Member Function Documentation

◆ _getClassName()

static String Lucene::SloppyPhraseScorer::_getClassName ( )
inline static

◆ flip()

PhrasePositions * Lucene::SloppyPhraseScorer::flip ( PhrasePositions * pp,
PhrasePositions * pp2
)
protected

Flip pp2 and pp in the queue: pop until finding pp2, insert back all but pp2, insert pp back. Assumes: pp!=pp2, pp2 in pq, pp not in pq. Called only when there are repeating pps.
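
The pop-and-reinsert procedure described above can be sketched with a standard priority queue. This is a simplified illustration, not the Lucene++ implementation: the PP struct, the PPGreater ordering and the PPQueue alias are hypothetical stand-ins for PhrasePositions and the phrase queue, and returning pp2 is an assumption of this sketch.

```cpp
#include <queue>
#include <vector>

// Hypothetical, simplified stand-in for Lucene::PhrasePositions.
struct PP {
    int position;  // current position, used for queue ordering
    int offset;    // position of the term in the query
};

// Orders the queue so that the smallest position is on top.
struct PPGreater {
    bool operator()(const PP* a, const PP* b) const {
        if (a->position != b->position) return a->position > b->position;
        return a->offset > b->offset;
    }
};

using PPQueue = std::priority_queue<PP*, std::vector<PP*>, PPGreater>;

// Replace pp2 (in the queue) with pp (not in the queue).
// Assumption of this sketch: the removed pp2 is handed back to the caller.
PP* flip(PPQueue& pq, PP* pp, PP* pp2) {
    std::vector<PP*> popped;
    // Pop until pp2 is found, remembering everything else that came off.
    while (!pq.empty()) {
        PP* top = pq.top();
        pq.pop();
        if (top == pp2) break;
        popped.push_back(top);
    }
    // Insert back all but pp2, then insert pp.
    for (PP* p : popped) pq.push(p);
    pq.push(pp);
    return pp2;
}
```

After the call the queue holds the same number of entries as before, with pp taking the place of pp2.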

◆ getClassName()

virtual String Lucene::SloppyPhraseScorer::getClassName ( )
inline virtual

Reimplemented from Lucene::PhraseScorer.

◆ initPhrasePositions()

int32_t Lucene::SloppyPhraseScorer::initPhrasePositions ( )
protected

Init PhrasePositions in place. There is a one time initialization for this scorer:

  • Put in repeats[] each pp that has another pp with same position in the doc.
  • Also mark each such pp by pp.repeats = true. termPositionsDiffer(pp) can later consult repeats[], which makes that check efficient. In particular, queries with no repetitions are scored with no overhead from this computation.
  • Example 1 - query with no repetitions: "ho my"~2
  • Example 2 - query with repetitions: "ho my my"~2
  • Example 3 - query with repetitions: "my ho my"~2

Init per doc with repeats in query, includes propagating some repeating pp's to avoid false phrase detection.

Returns
end (max position), or -1 if any term ran out (i.e. done)
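
The one-time repeat detection described above can be sketched as follows. This is a simplified model under stated assumptions, not the library code: the PP struct and the collectRepeats function are hypothetical, and each pp is reduced to the document position it currently sits on.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for Lucene::PhrasePositions.
struct PP {
    int docPosition;  // position of the term occurrence in the document
    int offset;       // position of the term in the query
    bool repeats;     // set when another pp sits at the same doc position
};

// Collect every pp that shares its document position with another pp,
// flagging each one so later checks can skip non-repeating pps.
std::vector<PP*> collectRepeats(std::vector<PP>& pps) {
    std::vector<PP*> repeats;
    for (std::size_t i = 0; i < pps.size(); ++i) {
        for (std::size_t j = 0; j < pps.size(); ++j) {
            if (i != j && pps[i].docPosition == pps[j].docPosition) {
                pps[i].repeats = true;
                repeats.push_back(&pps[i]);
                break;  // one partner is enough to mark pps[i]
            }
        }
    }
    return repeats;
}
```

For a query like "ho my"~2 the returned list is empty, so no per-document repeat handling is needed. For "ho my my"~2 both "my" entries start on the same occurrence and are collected.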

◆ phraseFreq()

virtual double Lucene::SloppyPhraseScorer::phraseFreq ( )
virtual

Score a candidate doc for all slop-valid position-combinations (matches) encountered while traversing/hopping the PhrasePositions. The score contribution of a match depends on the distance:

  • highest score for distance=0 (exact match).
  • score gets lower as distance gets higher.

Example: for query "a b"~2, a document "x a b a y" can be scored twice: once for "a b" (distance=0), and once for "b a" (distance=2).

Possibly not all valid combinations are encountered, because for efficiency we always propagate the least PhrasePosition. This allows the scorer to rely on a PriorityQueue and move forward faster. As a result, for example, document "a b c b a" would score differently for queries "a b c"~4 and "c b a"~4, although they really are equivalent. Similarly, for doc "a b c b a f g", query "c b"~2 would get the same score as "g f"~2, although "c b"~2 could be matched twice. We may want to fix this in the future (currently not, for performance reasons).

Implements Lucene::PhraseScorer.
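
The distance-dependent contribution can be illustrated with a small sketch. The weighting 1 / (distance + 1) is an assumption modeled on Lucene's default similarity; the actual value comes from the Similarity passed to the scorer. Both function names below are hypothetical.

```cpp
#include <vector>

// Hypothetical stand-in for the similarity's sloppy-frequency function.
// Assumption: weight = 1 / (distance + 1), highest for an exact match.
double sloppyFreq(int distance) {
    return 1.0 / (distance + 1);
}

// Sum the contributions of every match distance found in one document.
double phraseFreqFromDistances(const std::vector<int>& distances) {
    double freq = 0.0;
    for (int d : distances) {
        freq += sloppyFreq(d);
    }
    return freq;
}
```

Under this assumption, the document "x a b a y" for query "a b"~2 contributes 1 for the exact match (distance=0) plus 1/3 for "b a" (distance=2).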

◆ shared_from_this()

boost::shared_ptr< SloppyPhraseScorer > Lucene::SloppyPhraseScorer::shared_from_this ( )
inline

◆ termPositionsDiffer()

PhrasePositions * Lucene::SloppyPhraseScorer::termPositionsDiffer ( PhrasePositions * pp)
protected

We disallow two pp's to have the same TermPosition, thereby verifying multiple occurrences in the query of the same word would go elsewhere in the matched doc.

Returns
null if they differ (i.e. valid); otherwise the higher-offset PhrasePositions out of the first two PPs found not to differ.
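
The check and its return value can be sketched as below. This is a simplified model, not the library code: the PP struct is a hypothetical stand-in for PhrasePositions, the repeats list is passed in explicitly rather than read from a member, and each pp is reduced to the document position it currently sits on.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for Lucene::PhrasePositions.
struct PP {
    int docPosition;  // position of the term occurrence in the document
    int offset;       // position of the term in the query
};

// Returns nullptr when pp shares its document position with no other
// repeating pp; otherwise returns the higher-offset pp of the first
// conflicting pair found.
PP* termPositionsDiffer(PP* pp, const std::vector<PP*>& repeats) {
    for (PP* other : repeats) {
        if (other == pp) continue;  // never compare pp with itself
        if (other->docPosition == pp->docPosition) {
            return pp->offset > other->offset ? pp : other;
        }
    }
    return nullptr;
}
```

A non-null result tells the caller which pp to move forward so that repeated query terms end up on different occurrences in the document.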

Field Documentation

◆ checkedRepeats

bool Lucene::SloppyPhraseScorer::checkedRepeats
protected

◆ repeats

Collection<PhrasePositions*> Lucene::SloppyPhraseScorer::repeats
protected

◆ slop

int32_t Lucene::SloppyPhraseScorer::slop
protected

◆ tmpPos

Collection<PhrasePositions*> Lucene::SloppyPhraseScorer::tmpPos
protected

The documentation for this class was generated from the following file:

clucene.sourceforge.net