Lucene++ - a full-featured, c++ search engine
API Documentation


Lucene::IndexFileDeleter Class Reference

This class keeps track of each SegmentInfos instance that is still "live", either because it corresponds to a segments_N file in the Directory (a "commit", i.e. a committed SegmentInfos) or because it's an in-memory SegmentInfos that a writer is actively updating but has not yet committed. This class uses simple reference counting to map the live SegmentInfos instances to individual files in the Directory.

#include <IndexFileDeleter.h>

Public Member Functions

 IndexFileDeleter (const DirectoryPtr &directory, const IndexDeletionPolicyPtr &policy, const SegmentInfosPtr &segmentInfos, const InfoStreamPtr &infoStream, const DocumentsWriterPtr &docWriter, HashSet< String > synced)
 Initialize the deleter: find all previous commits in the Directory, incref the files they reference, call the policy to let it delete commits. This will remove any files not referenced by any of the commits.
 
virtual ~IndexFileDeleter ()
 
virtual String getClassName ()
 
boost::shared_ptr< IndexFileDeleter > shared_from_this ()
 
void setInfoStream (const InfoStreamPtr &infoStream)
 
SegmentInfosPtr getLastSegmentInfos ()
 
void refresh (const String &segmentName)
 Writer calls this when it has hit an error and had to roll back, to tell us that there may now be unreferenced files in the filesystem. So we re-list the filesystem and delete such files. If segmentName is non-empty, we will only delete files corresponding to that segment.
 
void refresh ()
 
void close ()
 
void checkpoint (const SegmentInfosPtr &segmentInfos, bool isCommit)
 For definition of "check point" see IndexWriter comments: "Clarification: Check Points (and commits)". Writer calls this when it has made a "consistent change" to the index, meaning new files are written to the index and the in-memory SegmentInfos have been modified to point to those files.
 
void incRef (const SegmentInfosPtr &segmentInfos, bool isCommit)
 
void incRef (HashSet< String > files)
 
void incRef (const String &fileName)
 
void decRef (HashSet< String > files)
 
void decRef (const String &fileName)
 
void decRef (const SegmentInfosPtr &segmentInfos)
 
bool exists (const String &fileName)
 
void deleteFiles (HashSet< String > files)
 
void deleteNewFiles (HashSet< String > files)
 Deletes the specified files, but only if they are new (have not yet been incref'd).
 
void deleteFile (const String &fileName)
 
- Public Member Functions inherited from Lucene::LuceneObject
virtual ~LuceneObject ()
 
virtual void initialize ()
 Called directly after instantiation to create objects that depend on this object being fully constructed.
 
virtual LuceneObjectPtr clone (const LuceneObjectPtr &other=LuceneObjectPtr())
 Return clone of this object.
 
virtual int32_t hashCode ()
 Return hash code for this object.
 
virtual bool equals (const LuceneObjectPtr &other)
 Return whether two objects are equal.
 
virtual int32_t compareTo (const LuceneObjectPtr &other)
 Compare two objects.
 
virtual String toString ()
 Returns a string representation of the object.
 
- Public Member Functions inherited from Lucene::LuceneSync
virtual ~LuceneSync ()
 
virtual SynchronizePtr getSync ()
 Return this object synchronize lock.
 
virtual LuceneSignalPtr getSignal ()
 Return this object signal.
 
virtual void lock (int32_t timeout=0)
 Lock this object using an optional timeout.
 
virtual void unlock ()
 Unlock this object.
 
virtual bool holdsLock ()
 Returns true if this object is currently locked by current thread.
 
virtual void wait (int32_t timeout=0)
 Wait for signal using an optional timeout.
 
virtual void notifyAll ()
 Notify all threads waiting for signal.
 

Static Public Member Functions

static String _getClassName ()
 

Data Fields

bool startingCommitDeleted
 

Protected Member Functions

void message (const String &message)
 
void deleteCommits ()
 Remove the CommitPoints in the commitsToDelete List by DecRef'ing all files from each SegmentInfos.
 
void deletePendingFiles ()
 
RefCountPtr getRefCount (const String &fileName)
 
- Protected Member Functions inherited from Lucene::LuceneObject
 LuceneObject ()
 

Protected Attributes

HashSet< String > deletable
 Files that we tried to delete but failed (likely because they are open and we are running on Windows), so we will retry them later.
 
MapStringRefCount refCounts
 Reference count for all files in the index. Counts how many existing commits reference a file.
 
Collection< IndexCommitPtr > commits
 Holds all commits (segments_N) currently in the index. This will have just 1 commit if you are using the default delete policy (KeepOnlyLastCommitDeletionPolicy). Other policies may leave commit points live for longer in which case this list would be longer than 1.
 
Collection< HashSet< String > > lastFiles
 Holds files we had incref'd from the previous non-commit checkpoint.
 
Collection< CommitPointPtr > commitsToDelete
 Commits that the IndexDeletionPolicy has decided to delete.
 
InfoStreamPtr infoStream
 
DirectoryPtr directory
 
IndexDeletionPolicyPtr policy
 
DocumentsWriterPtr docWriter
 
SegmentInfosPtr lastSegmentInfos
 
HashSet< String > synced
 
- Protected Attributes inherited from Lucene::LuceneSync
SynchronizePtr objectLock
 
LuceneSignalPtr objectSignal
 

Static Protected Attributes

static bool VERBOSE_REF_COUNTS
 Change to true to see details of reference counts when infoStream != null.
 

Detailed Description

This class keeps track of each SegmentInfos instance that is still "live", either because it corresponds to a segments_N file in the Directory (a "commit", i.e. a committed SegmentInfos) or because it's an in-memory SegmentInfos that a writer is actively updating but has not yet committed. This class uses simple reference counting to map the live SegmentInfos instances to individual files in the Directory.

The same directory file may be referenced by more than one IndexCommit, i.e. more than one SegmentInfos. Therefore we count how many commits reference each file. When all the commits referencing a certain file have been deleted, the refcount for that file becomes zero, and the file is deleted.
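The counting scheme can be illustrated with a small, self-contained sketch. It is a toy model written for this page, not the Lucene++ implementation: `ToyDeleter`, its `directory` set and its vector-based `incRef`/`decRef` are hypothetical names, and the "directory" is only a set of file names held in memory.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model (not the Lucene++ class): a commit is a list of
// file names and the "directory" is an in-memory set of names.
struct ToyDeleter {
    std::set<std::string> directory;       // files currently "on disk"
    std::map<std::string, int> refCounts;  // file name -> number of referencing commits

    // A commit starts referencing every file in its list.
    void incRef(const std::vector<std::string>& files) {
        for (const std::string& f : files) {
            directory.insert(f);
            ++refCounts[f];
        }
    }

    // A commit goes away: drop one reference per file and remove any
    // file that no remaining commit references.
    void decRef(const std::vector<std::string>& files) {
        for (const std::string& f : files) {
            auto it = refCounts.find(f);
            assert(it != refCounts.end() && it->second > 0);
            if (--it->second == 0) {
                refCounts.erase(it);
                directory.erase(f);
            }
        }
    }
};
```

With two commits that share one segment file, releasing the older commit removes only the files that the newer commit does not reference; the shared file survives until the second commit is released as well.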

A separate deletion policy interface (IndexDeletionPolicy) is consulted on creation (onInit) and once per commit (onCommit), to decide when a commit should be removed.

It is the business of the IndexDeletionPolicy to choose when to delete commit points. The actual mechanics of file deletion, retrying, etc., that follow from the deletion of commit points are the business of the IndexFileDeleter.

The current default deletion policy is KeepOnlyLastCommitDeletionPolicy, which removes all prior commits when a new commit has completed. This matches the behavior before Lucene 2.2.
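The split between policy and deleter can be sketched as follows. This is a hedged illustration only: `ToyCommit`, `ToyDeletionPolicy` and `ToyKeepOnlyLast` are hypothetical stand-ins, not the Lucene++ `IndexDeletionPolicy` interface, and the sketch assumes the commit list is ordered oldest first so that the last element is the newest commit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a commit point; only what the sketch needs.
struct ToyCommit {
    std::string segmentsFileName;
    bool deleted;  // set by the policy, acted on later by the deleter
};

// Hypothetical policy interface with the two hooks named in the text.
struct ToyDeletionPolicy {
    virtual ~ToyDeletionPolicy() {}
    virtual void onInit(std::vector<ToyCommit>& commits) = 0;
    virtual void onCommit(std::vector<ToyCommit>& commits) = 0;
};

// Marks every commit except the newest one for deletion.
// Assumption: commits are ordered oldest first.
struct ToyKeepOnlyLast : ToyDeletionPolicy {
    void onInit(std::vector<ToyCommit>& commits) override { onCommit(commits); }
    void onCommit(std::vector<ToyCommit>& commits) override {
        for (std::size_t i = 0; i + 1 < commits.size(); ++i) {
            commits[i].deleted = true;
        }
    }
};
```

The policy only marks commits; it never touches files. In this model the deleter would then walk the marked commits and decref their files, which keeps the retry and bookkeeping logic in one place.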

Note that you must hold the write.lock before instantiating this class. It opens segments_N file(s) directly with no retry logic.

Constructor & Destructor Documentation

◆ IndexFileDeleter()

Lucene::IndexFileDeleter::IndexFileDeleter ( const DirectoryPtr & directory,
const IndexDeletionPolicyPtr & policy,
const SegmentInfosPtr & segmentInfos,
const InfoStreamPtr & infoStream,
const DocumentsWriterPtr & docWriter,
HashSet< String > synced
)

Initialize the deleter: find all previous commits in the Directory, incref the files they reference, call the policy to let it delete commits. This will remove any files not referenced by any of the commits.

◆ ~IndexFileDeleter()

virtual Lucene::IndexFileDeleter::~IndexFileDeleter ( )
virtual

Member Function Documentation

◆ _getClassName()

static String Lucene::IndexFileDeleter::_getClassName ( )
inline static

◆ checkpoint()

void Lucene::IndexFileDeleter::checkpoint ( const SegmentInfosPtr & segmentInfos,
bool  isCommit 
)

For definition of "check point" see IndexWriter comments: "Clarification: Check Points (and commits)". Writer calls this when it has made a "consistent change" to the index, meaning new files are written to the index and the in-memory SegmentInfos have been modified to point to those files.

This may or may not be a commit (segments_N may or may not have been written).

We simply incref the files referenced by the new SegmentInfos and decref the files we had previously seen (if any).

If this is a commit, we also call the policy to give it a chance to remove other commits. If any commits are removed, we decref their files as well.
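The incref-then-decref order described above can be sketched with a toy model. `ToyCheckpointer` is a hypothetical name and this is not the Lucene++ code: the policy call on commit is left out, and file sets stand in for SegmentInfos.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model of checkpoint bookkeeping (policy handling omitted).
struct ToyCheckpointer {
    std::map<std::string, int> refCounts;  // file name -> reference count
    std::set<std::string> lastFiles;       // files held by the previous non-commit checkpoint
    std::vector<std::string> deleted;      // files whose count dropped to zero

    // files: everything the new (toy) SegmentInfos references.
    void checkpoint(const std::set<std::string>& files, bool isCommit) {
        // Incref first, so files shared with the previous checkpoint
        // never pass through a count of zero.
        for (const std::string& f : files) {
            ++refCounts[f];
        }
        if (!isCommit) {
            release(lastFiles);  // drop what the previous checkpoint held
            lastFiles = files;   // remember the files for the next call
        }
        // On a commit the references stay in place until a deletion
        // policy decides to drop that commit (not modeled here).
    }

    void release(const std::set<std::string>& files) {
        for (const std::string& f : files) {
            auto it = refCounts.find(f);
            if (it != refCounts.end() && --it->second == 0) {
                refCounts.erase(it);
                deleted.push_back(f);
            }
        }
    }
};
```

Incrementing before decrementing is the design point worth noting: a file that appears in both the old and the new checkpoint stays above zero throughout and is never deleted by mistake.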

◆ close()

void Lucene::IndexFileDeleter::close ( )

◆ decRef() [1/3]

void Lucene::IndexFileDeleter::decRef ( const SegmentInfosPtr & segmentInfos)

◆ decRef() [2/3]

void Lucene::IndexFileDeleter::decRef ( const String &  fileName)

◆ decRef() [3/3]

void Lucene::IndexFileDeleter::decRef ( HashSet< String >  files)

◆ deleteCommits()

void Lucene::IndexFileDeleter::deleteCommits ( )
protected

Remove the CommitPoints in the commitsToDelete List by DecRef'ing all files from each SegmentInfos.

◆ deleteFile()

void Lucene::IndexFileDeleter::deleteFile ( const String &  fileName)

◆ deleteFiles()

void Lucene::IndexFileDeleter::deleteFiles ( HashSet< String >  files)

◆ deleteNewFiles()

void Lucene::IndexFileDeleter::deleteNewFiles ( HashSet< String >  files)

Deletes the specified files, but only if they are new (have not yet been incref'd).
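As a rough illustration of the "only if new" rule, the sketch below deletes a candidate only when no reference count is recorded for it. `deleteNewFilesSketch` and its parameters are hypothetical; the real method operates on the deleter's own state rather than on arguments.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not the Lucene++ method): removes each candidate from
// the toy directory only when it has never been incref'd.
// Returns the names that were actually removed.
std::vector<std::string> deleteNewFilesSketch(
        const std::map<std::string, int>& refCounts,
        std::set<std::string>& directory,
        const std::set<std::string>& candidates) {
    std::vector<std::string> removed;
    for (const std::string& name : candidates) {
        // A file with a recorded count is referenced elsewhere: leave it alone.
        if (refCounts.count(name) == 0 && directory.erase(name) > 0) {
            removed.push_back(name);
        }
    }
    return removed;
}
```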

◆ deletePendingFiles()

void Lucene::IndexFileDeleter::deletePendingFiles ( )
protected

◆ exists()

bool Lucene::IndexFileDeleter::exists ( const String &  fileName)

◆ getClassName()

virtual String Lucene::IndexFileDeleter::getClassName ( )
inline virtual

◆ getLastSegmentInfos()

SegmentInfosPtr Lucene::IndexFileDeleter::getLastSegmentInfos ( )

◆ getRefCount()

RefCountPtr Lucene::IndexFileDeleter::getRefCount ( const String &  fileName)
protected

◆ incRef() [1/3]

void Lucene::IndexFileDeleter::incRef ( const SegmentInfosPtr & segmentInfos,
bool  isCommit 
)

◆ incRef() [2/3]

void Lucene::IndexFileDeleter::incRef ( const String &  fileName)

◆ incRef() [3/3]

void Lucene::IndexFileDeleter::incRef ( HashSet< String >  files)

◆ message()

void Lucene::IndexFileDeleter::message ( const String &  message)
protected

◆ refresh() [1/2]

void Lucene::IndexFileDeleter::refresh ( )

◆ refresh() [2/2]

void Lucene::IndexFileDeleter::refresh ( const String &  segmentName)

Writer calls this when it has hit an error and had to roll back, to tell us that there may now be unreferenced files in the filesystem. So we re-list the filesystem and delete such files. If segmentName is non-empty, we will only delete files corresponding to that segment.
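A minimal sketch of the re-list-and-delete step is shown below. It is not the Lucene++ code. It assumes that an empty segment name means "all segments" and that a file belongs to a segment when its name starts with the segment name followed by "." or "_"; both are assumptions of this sketch, and `refreshSketch` and `startsWith` are hypothetical helpers.

```cpp
#include <map>
#include <set>
#include <string>

// True when `name` begins with `prefix`.
bool startsWith(const std::string& name, const std::string& prefix) {
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical toy refresh: walks the directory listing and removes files that
// have no reference count, optionally restricted to a single segment.
// Returns the number of files removed.
int refreshSketch(std::set<std::string>& directory,
                  const std::map<std::string, int>& refCounts,
                  const std::string& segmentName = std::string()) {
    // Assumed naming rule: "<segment>." or "<segment>_" marks the segment's files.
    const std::string dotPrefix = segmentName + ".";
    const std::string underscorePrefix = segmentName + "_";
    int removed = 0;
    for (auto it = directory.begin(); it != directory.end();) {
        const bool inScope = segmentName.empty()
            || startsWith(*it, dotPrefix)
            || startsWith(*it, underscorePrefix);
        if (inScope && refCounts.count(*it) == 0) {
            it = directory.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Matching on the name plus a separator, rather than on the bare name, keeps a refresh for segment `_1` from touching files of segment `_10`.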

◆ setInfoStream()

void Lucene::IndexFileDeleter::setInfoStream ( const InfoStreamPtr & infoStream)

◆ shared_from_this()

boost::shared_ptr< IndexFileDeleter > Lucene::IndexFileDeleter::shared_from_this ( )
inline

Field Documentation

◆ commits

Collection<IndexCommitPtr> Lucene::IndexFileDeleter::commits
protected

Holds all commits (segments_N) currently in the index. This will have just 1 commit if you are using the default delete policy (KeepOnlyLastCommitDeletionPolicy). Other policies may leave commit points live for longer in which case this list would be longer than 1.

◆ commitsToDelete

Collection<CommitPointPtr> Lucene::IndexFileDeleter::commitsToDelete
protected

Commits that the IndexDeletionPolicy has decided to delete.

◆ deletable

HashSet<String> Lucene::IndexFileDeleter::deletable
protected

Files that we tried to delete but failed (likely because they are open and we are running on Windows), so we will retry them later.
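The retry behaviour can be modeled with a short sketch. `ToyDirectory` and `ToyRetryingDeleter` are hypothetical types: a "locked" set simulates files that are still open, and the sketch makes no claim about how Lucene++ detects a failed delete.

```cpp
#include <set>
#include <string>

// Hypothetical in-memory directory where some files cannot be deleted yet.
struct ToyDirectory {
    std::set<std::string> files;
    std::set<std::string> locked;  // simulates files that are still open

    // Returns false when the file is locked and therefore stays in place.
    bool tryDelete(const std::string& name) {
        if (locked.count(name) > 0) {
            return false;
        }
        files.erase(name);
        return true;
    }
};

// Hypothetical deleter that remembers failed deletes and retries them later.
class ToyRetryingDeleter {
public:
    explicit ToyRetryingDeleter(ToyDirectory& dir) : dir_(dir) {}

    void deleteFile(const std::string& name) {
        if (!dir_.tryDelete(name)) {
            deletable.insert(name);  // keep the name for a later retry
        }
    }

    void deletePendingFiles() {
        std::set<std::string> pending;
        pending.swap(deletable);     // failures during the retry re-queue themselves
        for (const std::string& name : pending) {
            deleteFile(name);
        }
    }

    std::set<std::string> deletable;  // files whose delete failed so far

private:
    ToyDirectory& dir_;
};
```

Swapping the pending set out before retrying means a file that fails again is simply queued once more, without modifying the set that is being iterated.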

◆ directory

DirectoryPtr Lucene::IndexFileDeleter::directory
protected

◆ docWriter

DocumentsWriterPtr Lucene::IndexFileDeleter::docWriter
protected

◆ infoStream

InfoStreamPtr Lucene::IndexFileDeleter::infoStream
protected

◆ lastFiles

Collection< HashSet<String> > Lucene::IndexFileDeleter::lastFiles
protected

Holds files we had incref'd from the previous non-commit checkpoint.

◆ lastSegmentInfos

SegmentInfosPtr Lucene::IndexFileDeleter::lastSegmentInfos
protected

◆ policy

IndexDeletionPolicyPtr Lucene::IndexFileDeleter::policy
protected

◆ refCounts

MapStringRefCount Lucene::IndexFileDeleter::refCounts
protected

Reference count for all files in the index. Counts how many existing commits reference a file.

◆ startingCommitDeleted

bool Lucene::IndexFileDeleter::startingCommitDeleted

◆ synced

HashSet<String> Lucene::IndexFileDeleter::synced
protected

◆ VERBOSE_REF_COUNTS

bool Lucene::IndexFileDeleter::VERBOSE_REF_COUNTS
static protected

Change to true to see details of reference counts when infoStream != null.


The documentation for this class was generated from the following file:

clucene.sourceforge.net