A cache that uses multiple lists to maintain similarity/distance metrics to build a Minimum Spanning Tree (MST). More...
#include <MSTMultiListCache.h>
Public Member Functions | |
MSTMultiListCache (const int totalESTCount, const int startOwnESTidx, const int numOwnESTs, const ESTAnalyzer *analyzer, const bool repopulateCaches, const int maxCachePerEST) | |
The constructor. | |
virtual | ~MSTMultiListCache () |
The destructor. | |
virtual void | pruneCaches (const int estIdxAdded, std::vector< int > &repopulateList, const bool prefixListSize=true) |
Clear caches that contain entries for a newly added estIdx. | |
virtual void | mergeList (const int estIdx, const SMList &list) |
Merges a given SMList with current entries in the cache for a given EST. | |
void | getBestEntry (int &srcESTidx, int &destESTidx, float &metric, int &alignmentData, int &directionData) const |
Obtains the top-most similar entry from the MSTCache. | |
void | displayStats (std::ostream &os, const int MyRank) const |
Display cache usage statistics. | |
void | preprocess (SMList &list) |
Sort a Similarity Metric List (SMList) based on similarity metric and then prunes it. | |
Protected Member Functions | |
bool | pruneCache (SMList &list, const int estIdx) |
Helper method to prune a given SMList. | |
Private Attributes | |
std::vector< MSTCacheEntry > | cache |
The actual vector of cache entries for various nodes. | |
int | cacheRepopulation |
Number of times the cache needed to be repopulated. | |
int | prunedEntries |
Number of entries that were pruned from the cache. | |
long | analysisCount |
Number of times comparisons of ESTs were performed. |
A cache that uses multiple lists to maintain similarity/distance metrics to build a Minimum Spanning Tree (MST).
This class dervies off the MSTCache
base class provides the functionality to manage a cache of similarity metrics using a collection of Lists. This cache is organized as follows: The cache is essentially a collection of SMList objects, one SMList for each EST (logically) owned by this cache. The lists are maintained in a sorted fashion with the best metrics at the top of the list to facilitate rapid construction of a Minimum Spanning Tree (MST).
Definition at line 70 of file MSTMultiListCache.h.
MSTMultiListCache::MSTMultiListCache | ( | const int | totalESTCount, | |
const int | startOwnESTidx, | |||
const int | numOwnESTs, | |||
const ESTAnalyzer * | analyzer, | |||
const bool | repopulateCaches, | |||
const int | maxCachePerEST | |||
) |
The constructor.
This MST cache requires information about the number of ESTs in the system in order to effectively track the ESTs already added to the MST. These values must be passed in as the various parameters to this class. The constructor essentially passes off the parameters to the base class that saves the information (for future reference) in suitable instance variables.
[in] | totalESTCount | The total number of ESTs to be processed (that is, to be added to the MST). |
[in] | startOwnESTidx | The starting index of the EST that is owned by the process that is using this cache. This value is used to internally normalize est index values to 0 for the first EST owned by this process. |
[in] | numOwnESTs | The number of ESTs owned by the process that is using this cache. This information is used to reserve necessary space to hold SMLists for each EST owned by the manager/worker process. |
[in] | analyzer | The analyzer to be used for EST comparisons, comparing metrics, and ordering entries in this cache depending on the type of EST analyzer (whether it is similarity based or distance metric based). |
[in] | repopulateCaches | If this parameter is true then the MSTCache will request lists to be repopulated when needed. Repopulating lists guarantees that ultimately a MST will be developed. If repopulation is suppressed then the resulting spanning tree may not be a MST. |
[in] | maxCachePerEST | This parameter suggests the maximum number of cached entries that must be maintained for a given EST. This parameter may not be really used by all dervied classes (this is only a suggestion). |
Definition at line 43 of file MSTMultiListCache.cpp.
virtual MSTMultiListCache::~MSTMultiListCache | ( | ) | [inline, virtual] |
The destructor.
The instance variables in this class automatically manage all the memory associated this class. Consequently, the destructor does not have any special tasks to perform. It here present merely to adhere to coding conventions.
Definition at line 125 of file MSTMultiListCache.h.
void MSTMultiListCache::displayStats | ( | std::ostream & | os, | |
const int | MyRank | |||
) | const [virtual] |
Display cache usage statistics.
This method can be used to print the current cache usage statistics. This method prints the following information:
Number of ESTs cached by this cache.
Total number of SMEntries currently in the cache.
Average length of each list and the standard deviation.
The number of times the cache had to be repopulated.
[out] | os | The output stream to which statistics must be written. |
[in] | MyRank | The MPI rank of the process to which this cache is assocaited. This information is used merely for displaying more informative statistics and does not influence the actual values displayed by this method. |
Implements MSTCache.
Definition at line 223 of file MSTMultiListCache.cpp.
References analysisCount, cache, cacheRepopulation, and prunedEntries.
void MSTMultiListCache::getBestEntry | ( | int & | srcESTidx, | |
int & | destESTidx, | |||
float & | metric, | |||
int & | alignmentData, | |||
int & | directionData | |||
) | const [virtual] |
Obtains the top-most similar entry from the MSTCache.
This method overrides the default implementation in the base class. It searches the analysis metrics stored in this cache for various ESTs to locate the best entry with the best metric. It populates the parameters with the appropriate value. Note that the parameters are initialized to -1, -1, and -1.0f respectively.
[out] | srcESTidx | The source EST index from where the similarity metric is being measured. The srcESTidx is already present in the MST. |
[out] | destESTidx | The destination EST index that is the best choice to be added to the MST (based on the local information). |
[out] | metric | The similarity/distance metric between the srcESTidx and the destESTidx. |
[out] | alignmentData | The alignment data associated with the srcESTidx and the destESTidx. |
[out] | directionData | The direction data associated with the srcESTidx and the destESTidx. |
Implements MSTCache.
Definition at line 173 of file MSTMultiListCache.cpp.
References CachedESTInfo::alignmentData, MSTCache::analyzer, cache, ESTAnalyzer::compareMetrics(), CachedESTInfo::directionData, CachedESTInfo::estIdx, ESTAnalyzer::getInvalidMetric(), and CachedESTInfo::metric.
void MSTMultiListCache::mergeList | ( | const int | estIdx, | |
const SMList & | list | |||
) | [virtual] |
Merges a given SMList with current entries in the cache for a given EST.
This method overrides the virtual method in the base class. As per the API requirement, it merges the given list with existing cache entries for the given EST such that only maxCachePerEST
entries are retained in the cache. The retained entries are chosen so that they have the best metrics. Care is taken to ensure that the cache continues to remain sorted with the best metrics at the top of the cache.
[in] | estIdx | The index of the EST with which the given list must be merged. |
[in] | list | The SMList to be merged with any existing entries in the cache for the given estIdx. |
Implements MSTCache.
Definition at line 116 of file MSTMultiListCache.cpp.
References MSTCache::analyzer, ASSERT, cache, ESTAnalyzer::compareMetrics(), MSTCache::copy_n(), ESTAnalyzer::getInvalidMetric(), MSTCache::maxCachePerEST, and MSTCache::startOwnESTidx.
void MSTMultiListCache::preprocess | ( | SMList & | list | ) | [virtual] |
Sort a Similarity Metric List (SMList) based on similarity metric and then prunes it.
This method provides a convenient method for sorting a similarity metric list based on the similarity metric. This method simply uses the STL sort method with a suitable Functor for performing this task. Sorting is performed such that the highest similarity metric is first in the vector after sorting. Once sorting has been completed, this method ensures that the list is no longer than the specified cacheSize.
[in,out] | list | The SMList to be sorted based on the (similarity or distance) metric associated with each CachedESTInfo entry in the list. |
Reimplemented from MSTCache.
Definition at line 204 of file MSTMultiListCache.cpp.
References analysisCount, MSTCache::analyzer, MSTCache::maxCachePerEST, and prunedEntries.
bool MSTMultiListCache::pruneCache | ( | SMList & | list, | |
const int | estIdx | |||
) | [protected] |
Helper method to prune a given SMList.
This is a helper method that is called from the pruneCaches() method to remove all entries corresponding to the given estIdx.
[in,out] | list | The SMList whose entries needs to be pruned. |
[in] | estIdx | The index of the EST whose entries must be removed from this list. |
true
if the list becomes empty once all the entries have been removed. Definition at line 97 of file MSTMultiListCache.cpp.
References prunedEntries.
Referenced by pruneCaches().
void MSTMultiListCache::pruneCaches | ( | const int | estIdxAdded, | |
std::vector< int > & | repopulateList, | |||
const bool | prefixListSize = true | |||
) | [virtual] |
Clear caches that contain entries for a newly added estIdx.
This method overrides the virtual method in the base class and prunes entries corresponding to the estIdxAdded
from the various lists maintained by this cache. The pruning operation is removes unused entries and facilitates repopulation of caches (if requested).
[in] | estIdxAdded | The index of the EST that has just been added to the MST. |
[out] | repopulateList | The index values of ESTs whose cache needs to be repopulated. The list of EST indexes added to this list are only the ESTs that are owned by this process. If none of the caches need to be recomputed then this list is empty (all existing entries are removed). |
[in] | prefixListSize | If this parameter is set to true , then the number of entries added to the repopulateList is inserted at the beginning of the vector. This eases transmission of lists via MPI to the master process. |
Implements MSTCache.
Definition at line 57 of file MSTMultiListCache.cpp.
References cache, cacheRepopulation, MSTCache::nodesAdded, pruneCache(), and MSTCache::repopulateCaches.
long MSTMultiListCache::analysisCount [private] |
Number of times comparisons of ESTs were performed.
Tracks the number of times EST comparisons were performed. This instance variable is initialized to 0 (zero). The set of values sorted via teh sortAndPrune() method are used to track the number of comparisons performed and this instance variable is suitably adjusted.
Definition at line 305 of file MSTMultiListCache.h.
Referenced by displayStats(), and preprocess().
std::vector<MSTCacheEntry> MSTMultiListCache::cache [private] |
The actual vector of cache entries for various nodes.
This cache contains the list of cache entries for each node owned and managed by this MSTCache. The list is initialized to have a set of empty lists in the constructor, when the cache is instantiated for use.
Definition at line 273 of file MSTMultiListCache.h.
Referenced by displayStats(), getBestEntry(), mergeList(), and pruneCaches().
int MSTMultiListCache::cacheRepopulation [private] |
Number of times the cache needed to be repopulated.
This instance variable is used to track the number of times the cache had to be repopulated as one of the entries ran out of data. This variable is initialized to 0 (zero), incremented in the pruneCaches() method, and displayed in the displayStats() method.
Definition at line 283 of file MSTMultiListCache.h.
Referenced by displayStats(), and pruneCaches().
int MSTMultiListCache::prunedEntries [private] |
Number of entries that were pruned from the cache.
This instance variable is used to track the number of entries that were pruned from the cache by the pruneCache() method. This value indicates the number of entries in the cache that were unused. Moreover, this value also a measure of the amount of unnecessary adjacency information calculation operations that were performed for this cache by all the processes put together.
Definition at line 295 of file MSTMultiListCache.h.
Referenced by displayStats(), preprocess(), and pruneCache().