Lucene++ - a full-featured, c++ search engine
API Documentation
This class keeps track of each SegmentInfos instance that is still "live", either because it corresponds to a segments_N file in the Directory (a "commit", i.e. a committed SegmentInfos) or because it's an in-memory SegmentInfos that a writer is actively updating but has not yet committed. This class uses simple reference counting to map the live SegmentInfos instances to individual files in the Directory.
#include <IndexFileDeleter.h>
Public Member Functions | |
IndexFileDeleter (const DirectoryPtr &directory, const IndexDeletionPolicyPtr &policy, const SegmentInfosPtr &segmentInfos, const InfoStreamPtr &infoStream, const DocumentsWriterPtr &docWriter, HashSet< String > synced) | |
Initialize the deleter: find all previous commits in the Directory, incref the files they reference, call the policy to let it delete commits. This will remove any files not referenced by any of the commits. | |
virtual | ~IndexFileDeleter () |
virtual String | getClassName () |
boost::shared_ptr< IndexFileDeleter > | shared_from_this () |
void | setInfoStream (const InfoStreamPtr &infoStream) |
SegmentInfosPtr | getLastSegmentInfos () |
void | refresh (const String &segmentName) |
Writer calls this when it has hit an error and had to roll back, to tell us that there may now be unreferenced files in the filesystem. So we re-list the filesystem and delete such files. If segmentName is non-empty, we will only delete files corresponding to that segment. | |
void | refresh () |
void | close () |
void | checkpoint (const SegmentInfosPtr &segmentInfos, bool isCommit) |
For definition of "check point" see IndexWriter comments: "Clarification: Check Points (and commits)". Writer calls this when it has made a "consistent change" to the index, meaning new files are written to the index and the in-memory SegmentInfos have been modified to point to those files. | |
void | incRef (const SegmentInfosPtr &segmentInfos, bool isCommit) |
void | incRef (HashSet< String > files) |
void | incRef (const String &fileName) |
void | decRef (HashSet< String > files) |
void | decRef (const String &fileName) |
void | decRef (const SegmentInfosPtr &segmentInfos) |
bool | exists (const String &fileName) |
void | deleteFiles (HashSet< String > files) |
void | deleteNewFiles (HashSet< String > files) |
Deletes the specified files, but only if they are new (have not yet been incref'd). | |
void | deleteFile (const String &fileName) |
Public Member Functions inherited from Lucene::LuceneObject | |
virtual | ~LuceneObject () |
virtual void | initialize () |
Called directly after instantiation to create objects that depend on this object being fully constructed. | |
virtual LuceneObjectPtr | clone (const LuceneObjectPtr &other=LuceneObjectPtr()) |
Return clone of this object. | |
virtual int32_t | hashCode () |
Return hash code for this object. | |
virtual bool | equals (const LuceneObjectPtr &other) |
Return whether two objects are equal. | |
virtual int32_t | compareTo (const LuceneObjectPtr &other) |
Compare two objects. | |
virtual String | toString () |
Returns a string representation of the object. | |
Public Member Functions inherited from Lucene::LuceneSync | |
virtual | ~LuceneSync () |
virtual SynchronizePtr | getSync () |
Return this object synchronize lock. | |
virtual LuceneSignalPtr | getSignal () |
Return this object signal. | |
virtual void | lock (int32_t timeout=0) |
Lock this object using an optional timeout. | |
virtual void | unlock () |
Unlock this object. | |
virtual bool | holdsLock () |
Returns true if this object is currently locked by current thread. | |
virtual void | wait (int32_t timeout=0) |
Wait for signal using an optional timeout. | |
virtual void | notifyAll () |
Notify all threads waiting for signal. | |
Static Public Member Functions | |
static String | _getClassName () |
Data Fields | |
bool | startingCommitDeleted |
Protected Member Functions | |
void | message (const String &message) |
void | deleteCommits () |
Remove the CommitPoints in the commitsToDelete List by DecRef'ing all files from each SegmentInfos. | |
void | deletePendingFiles () |
RefCountPtr | getRefCount (const String &fileName) |
Protected Member Functions inherited from Lucene::LuceneObject | |
LuceneObject () | |
Protected Attributes | |
HashSet< String > | deletable |
Files that we tried to delete but failed (likely because they are open and we are running on Windows), so we will retry them later. | |
MapStringRefCount | refCounts |
Reference count for all files in the index. Counts how many existing commits reference a file. | |
Collection< IndexCommitPtr > | commits |
Holds all commits (segments_N) currently in the index. This will have just 1 commit if you are using the default delete policy (KeepOnlyLastCommitDeletionPolicy). Other policies may leave commit points live for longer in which case this list would be longer than 1. | |
Collection< HashSet< String > > | lastFiles |
Holds files we had incref'd from the previous non-commit checkpoint. | |
Collection< CommitPointPtr > | commitsToDelete |
Commits that the IndexDeletionPolicy has decided to delete. | |
InfoStreamPtr | infoStream |
DirectoryPtr | directory |
IndexDeletionPolicyPtr | policy |
DocumentsWriterPtr | docWriter |
SegmentInfosPtr | lastSegmentInfos |
HashSet< String > | synced |
Protected Attributes inherited from Lucene::LuceneSync | |
SynchronizePtr | objectLock |
LuceneSignalPtr | objectSignal |
Static Protected Attributes | |
static bool | VERBOSE_REF_COUNTS |
Change to true to see details of reference counts when infoStream != null. | |
This class keeps track of each SegmentInfos instance that is still "live", either because it corresponds to a segments_N file in the Directory (a "commit", i.e. a committed SegmentInfos) or because it's an in-memory SegmentInfos that a writer is actively updating but has not yet committed. This class uses simple reference counting to map the live SegmentInfos instances to individual files in the Directory.
The same directory file may be referenced by more than one IndexCommit, i.e. more than one SegmentInfos. Therefore we count how many commits reference each file. When all the commits referencing a certain file have been deleted, the refcount for that file becomes zero, and the file is deleted.
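The counting scheme described above can be illustrated with a minimal, standalone sketch. It is not the Lucene++ implementation: the class name FileRefCounts and the use of std::map and std::set are assumptions made for illustration, and the sketch only reports which files became unreferenced instead of touching a Directory.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the deleter's per-file bookkeeping; not the Lucene++ class.
class FileRefCounts {
public:
    // Record one more commit (or in-memory SegmentInfos) referencing each file.
    void incRef(const std::set<std::string>& files) {
        for (const std::string& name : files) {
            ++counts_[name];
        }
    }

    // Drop one reference per file. Returns the files whose count reached zero,
    // i.e. the files that no remaining commit references.
    std::set<std::string> decRef(const std::set<std::string>& files) {
        std::set<std::string> unreferenced;
        for (const std::string& name : files) {
            auto it = counts_.find(name);
            assert(it != counts_.end() && it->second > 0);
            if (--it->second == 0) {
                counts_.erase(it);
                unreferenced.insert(name);
            }
        }
        return unreferenced;
    }

    // True while at least one reference to the file is held.
    bool exists(const std::string& name) const {
        return counts_.count(name) != 0;
    }

private:
    std::map<std::string, int> counts_;
};
```

In this sketch a file shared by two commits survives the removal of the first commit and is only reported as unreferenced once the second commit releases it as well.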
A separate deletion policy interface (IndexDeletionPolicy) is consulted on creation (onInit) and once per commit (onCommit), to decide when a commit should be removed.
It is the business of the IndexDeletionPolicy to choose when to delete commit points. The actual mechanics of file deletion, retrying, and so on that follow from the deletion of commit points are the business of the IndexFileDeleter.
The current default deletion policy is KeepOnlyLastCommitDeletionPolicy, which removes all prior commits when a new commit has completed. This matches the behavior before 2.2.
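The division of labour between policy and deleter can be sketched as follows. This is a simplified illustration, not the real IndexDeletionPolicy interface: CommitSketch and keepOnlyLastCommit are hypothetical names, and the sketch assumes the commit list is ordered oldest first. The policy only marks commits; releasing their files would be left to the deleter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical commit record; the real IndexCommit interface is richer.
struct CommitSketch {
    std::string segmentsFileName;
    bool deleted;
};

// Keep-only-last behaviour: mark every commit except the newest for deletion.
// Assumes `commits` is ordered oldest first.
void keepOnlyLastCommit(std::vector<CommitSketch>& commits) {
    for (std::size_t i = 0; i + 1 < commits.size(); ++i) {
        commits[i].deleted = true;
    }
}
```

A policy that keeps commits alive for longer would simply mark fewer entries, which is why the list of live commits can grow beyond one.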
Note that you must hold the write.lock before instantiating this class. It opens segments_N file(s) directly with no retry logic.
Lucene::IndexFileDeleter::IndexFileDeleter (const DirectoryPtr &directory, const IndexDeletionPolicyPtr &policy, const SegmentInfosPtr &segmentInfos, const InfoStreamPtr &infoStream, const DocumentsWriterPtr &docWriter, HashSet< String > synced)
Initialize the deleter: find all previous commits in the Directory, incref the files they reference, call the policy to let it delete commits. This will remove any files not referenced by any of the commits.
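virtual Lucene::IndexFileDeleter::~IndexFileDeleter () [virtual]
static String Lucene::IndexFileDeleter::_getClassName () [inline, static]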
void Lucene::IndexFileDeleter::checkpoint (const SegmentInfosPtr &segmentInfos, bool isCommit)
For definition of "check point" see IndexWriter comments: "Clarification: Check Points (and commits)". Writer calls this when it has made a "consistent change" to the index, meaning new files are written to the index and the in-memory SegmentInfos have been modified to point to those files.
This may or may not be a commit (segments_N may or may not have been written).
We simply incref the files referenced by the new SegmentInfos and decref the files we had previously seen (if any).
If this is a commit, we also call the policy to give it a chance to remove other commits. If any commits are removed, we decref their files as well.
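The checkpoint flow described above might look roughly like the following standalone sketch. All names (CheckpointSketch, FileSet, refCount) are hypothetical, file sets stand in for SegmentInfos, and the policy is hard-wired to keep only the newest commit. The sketch also assumes that the files of the previous checkpoint are released only on non-commit checkpoints, while a commit holds its own references until the policy removes it.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using FileSet = std::set<std::string>;

// Hypothetical, simplified model of the checkpoint bookkeeping; not the Lucene++ code.
class CheckpointSketch {
public:
    // Returns the files whose reference count dropped to zero during this checkpoint.
    FileSet checkpoint(const FileSet& files, bool isCommit) {
        FileSet unreferenced;
        incRef(files);  // protect the new files before releasing anything
        if (isCommit) {
            commits_.push_back(files);
            // Stand-in for the policy call: keep only the newest commit.
            while (commits_.size() > 1) {
                decRef(commits_.front(), unreferenced);
                commits_.erase(commits_.begin());
            }
        } else {
            // Release what the previous non-commit checkpoint held, then remember this one.
            decRef(lastFiles_, unreferenced);
            lastFiles_ = files;
        }
        return unreferenced;
    }

    int refCount(const std::string& name) const {
        auto it = counts_.find(name);
        return it == counts_.end() ? 0 : it->second;
    }

private:
    void incRef(const FileSet& files) {
        for (const std::string& name : files) {
            ++counts_[name];
        }
    }

    void decRef(const FileSet& files, FileSet& unreferenced) {
        for (const std::string& name : files) {
            auto it = counts_.find(name);
            assert(it != counts_.end() && it->second > 0);
            if (--it->second == 0) {
                counts_.erase(it);
                unreferenced.insert(name);
            }
        }
    }

    std::map<std::string, int> counts_;
    std::vector<FileSet> commits_;
    FileSet lastFiles_;
};
```

Incrementing before decrementing matters: a file shared between the old and the new state never passes through a count of zero, so it is never reported as deletable in between.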
void Lucene::IndexFileDeleter::close ()
void Lucene::IndexFileDeleter::decRef (const SegmentInfosPtr &segmentInfos)
void Lucene::IndexFileDeleter::decRef (const String &fileName)
void Lucene::IndexFileDeleter::decRef (HashSet< String > files)
void Lucene::IndexFileDeleter::deleteCommits () [protected]
Remove the CommitPoints in the commitsToDelete List by DecRef'ing all files from each SegmentInfos.
void Lucene::IndexFileDeleter::deleteFile (const String &fileName)
void Lucene::IndexFileDeleter::deleteFiles (HashSet< String > files)
void Lucene::IndexFileDeleter::deleteNewFiles (HashSet< String > files)
Deletes the specified files, but only if they are new (have not yet been incref'd).
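The "only if they are new" condition can be read as "no reference-count entry exists yet". A minimal sketch under that assumption, using a hypothetical free function selectNewFiles and a plain std::map in place of the deleter's refCounts member:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: out of `files`, pick only those that have never been incref'd,
// i.e. that have no entry in the reference-count map.
std::set<std::string> selectNewFiles(const std::map<std::string, int>& refCounts,
                                     const std::set<std::string>& files) {
    std::set<std::string> toDelete;
    for (const std::string& name : files) {
        if (refCounts.find(name) == refCounts.end()) {
            toDelete.insert(name);
        }
    }
    return toDelete;
}
```

Files that are already referenced are skipped, so a caller cleaning up after a failed operation cannot remove anything a commit still depends on.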
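void Lucene::IndexFileDeleter::deletePendingFiles () [protected]
bool Lucene::IndexFileDeleter::exists (const String &fileName)
virtual String Lucene::IndexFileDeleter::getClassName () [inline, virtual]
SegmentInfosPtr Lucene::IndexFileDeleter::getLastSegmentInfos ()
RefCountPtr Lucene::IndexFileDeleter::getRefCount (const String &fileName) [protected]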
void Lucene::IndexFileDeleter::incRef (const SegmentInfosPtr &segmentInfos, bool isCommit)
void Lucene::IndexFileDeleter::incRef (const String &fileName)
void Lucene::IndexFileDeleter::incRef (HashSet< String > files)
void Lucene::IndexFileDeleter::message (const String &message) [protected]
void Lucene::IndexFileDeleter::refresh ()
void Lucene::IndexFileDeleter::refresh (const String &segmentName)
Writer calls this when it has hit an error and had to roll back, to tell us that there may now be unreferenced files in the filesystem. So we re-list the filesystem and delete such files. If segmentName is non-empty, we will only delete files corresponding to that segment.
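The re-listing step can be sketched as a pure selection function. Everything here is hypothetical: selectUnreferenced is not a Lucene++ function, the directory listing is passed in as a vector, and the rule that a file belongs to a segment when its name starts with the segment name followed by "." or "_" is an assumption made for this sketch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// True if `s` begins with `prefix`.
bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Hypothetical helper: from a fresh directory listing, pick the files that no
// reference count covers. If `segmentName` is non-empty, only files that appear
// to belong to that segment are considered (assumed prefixes: "<name>." and "<name>_").
std::set<std::string> selectUnreferenced(const std::vector<std::string>& listing,
                                         const std::map<std::string, int>& refCounts,
                                         const std::string& segmentName = "") {
    std::set<std::string> toDelete;
    for (const std::string& name : listing) {
        bool inSegment = segmentName.empty() ||
                         startsWith(name, segmentName + ".") ||
                         startsWith(name, segmentName + "_");
        if (inSegment && refCounts.find(name) == refCounts.end()) {
            toDelete.insert(name);
        }
    }
    return toDelete;
}
```

Matching on the separator as well as the name keeps segment "_1" from accidentally claiming files of segment "_10".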
void Lucene::IndexFileDeleter::setInfoStream (const InfoStreamPtr &infoStream)
boost::shared_ptr< IndexFileDeleter > Lucene::IndexFileDeleter::shared_from_this () [inline]
Collection< IndexCommitPtr > Lucene::IndexFileDeleter::commits [protected]
Holds all commits (segments_N) currently in the index. This will have just 1 commit if you are using the default delete policy (KeepOnlyLastCommitDeletionPolicy). Other policies may leave commit points live for longer in which case this list would be longer than 1.
Collection< CommitPointPtr > Lucene::IndexFileDeleter::commitsToDelete [protected]
Commits that the IndexDeletionPolicy has decided to delete.
HashSet< String > Lucene::IndexFileDeleter::deletable [protected]
Files that we tried to delete but failed (likely because they are open and we are running on Windows), so we will retry them later.
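The retry behaviour can be sketched with a small standalone class. PendingDeletes and its injected tryDelete callback are hypothetical; the callback stands in for a directory delete that may fail while a file is still open.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>

// Hypothetical model of the "deletable" set: failed deletions are remembered and retried.
class PendingDeletes {
public:
    // `tryDelete` returns true if the file was removed, false if removal failed.
    explicit PendingDeletes(std::function<bool(const std::string&)> tryDelete)
        : tryDelete_(std::move(tryDelete)) {}

    // Attempt the deletion; on failure keep the name for a later retry.
    void deleteFile(const std::string& name) {
        if (tryDelete_(name)) {
            deletable_.erase(name);
        } else {
            deletable_.insert(name);
        }
    }

    // Retry everything that failed earlier; names that fail again stay pending.
    void deletePendingFiles() {
        std::set<std::string> retry;
        retry.swap(deletable_);
        for (const std::string& name : retry) {
            deleteFile(name);
        }
    }

    std::size_t pendingCount() const {
        return deletable_.size();
    }

private:
    std::function<bool(const std::string&)> tryDelete_;
    std::set<std::string> deletable_;
};
```

Swapping the pending set out before retrying keeps the loop from iterating over a container that deleteFile is modifying.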
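DirectoryPtr Lucene::IndexFileDeleter::directory [protected]
DocumentsWriterPtr Lucene::IndexFileDeleter::docWriter [protected]
InfoStreamPtr Lucene::IndexFileDeleter::infoStream [protected]
Collection< HashSet< String > > Lucene::IndexFileDeleter::lastFiles [protected]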
Holds files we had incref'd from the previous non-commit checkpoint.
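SegmentInfosPtr Lucene::IndexFileDeleter::lastSegmentInfos [protected]
IndexDeletionPolicyPtr Lucene::IndexFileDeleter::policy [protected]
MapStringRefCount Lucene::IndexFileDeleter::refCounts [protected]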
Reference count for all files in the index. Counts how many existing commits reference a file.
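bool Lucene::IndexFileDeleter::startingCommitDeleted
HashSet< String > Lucene::IndexFileDeleter::synced [protected]
static bool Lucene::IndexFileDeleter::VERBOSE_REF_COUNTS [static, protected]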
Change to true to see details of reference counts when infoStream != null.