MatrixFileAnalyzer Class Reference

MatrixFileAnalyzer: EST Analyzer that simply obtains distances from a matrix file for processing.

#include <MatrixFileAnalyzer.h>

Public Member Functions

virtual ~MatrixFileAnalyzer ()
 The destructor.
virtual void showArguments (std::ostream &os)
 Display valid command line arguments for this analyzer.
virtual bool parseArguments (int &argc, char **argv)
 Process command line arguments.
int initialize ()
 Method to begin EST analysis.
virtual std::string getName () const
 Method to obtain human-readable name for this EST analyzer.
virtual int setReferenceEST (const int estIdx)
 Set the reference EST id for analysis.
virtual int analyze ()
 Method to perform a batch of EST analysis.

Protected Member Functions

virtual float getMetric (const int otherEST)
 Analyze and obtain a distance (or similarity) metric.
bool compareMetrics (const float metric1, const float metric2) const
 Method to compare two metrics generated by this class.
float getInvalidMetric () const
 Obtain an invalid (or the worst) metric generated by this analyzer.
bool isDistanceMetric () const
 Determine if this EST analyzer provides distance metrics or similarity metrics.

Protected Attributes

const std::string inputFileName
 The data file from where the distance (or similarity) metrics must be loaded.
int estCount
 The number of ESTs for which we have data in the distanceValues matrix.
float ** distanceValues
 The array of distance values.

Private Member Functions

 MatrixFileAnalyzer (const int refESTidx, const std::string &outputFileName)
std::string readLine (FILE *fp)
 Method to read a line from a given EST file.
int parseMetrics (const char *line, float *values, const int startPos, const int maxValues)
 Utility method to read metrics from a given line into a given array from a given starting position.
bool parseESTCount (const char *line)
 Helper method to process EST count information from a line.

Static Private Attributes

static arg_parser::arg_record argsList []
 The set of arguments for the MatrixFileAnalyzer.
static char * dataFileName = NULL
 The matrix data file from where distance metrics are to be read.

Friends

class ESTAnalyzerFactory

Detailed Description

MatrixFileAnalyzer: EST Analyzer that simply obtains distances from a matrix file for processing.

This analyzer provides a simple interface for using precomputed distance/similarity values from a given data file. A matrix data file must have the following format:

    # Lines starting with '#' character are assumed to be comments and
    # they are ignored. The first non-comment line must be a number
    # indicating the number of ESTs for which data is present in the
    # file. For example here is a file for 3 ESTs
    3
    
    # After the number of EST's there must be nxn matrix of values
    # where n is number of EST's.
    0.0 10   20
    10  0    15
    20  15.2 0
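The format above can be read with a few lines of standard C++. The following is a minimal sketch, not the PEACE implementation: `loadMatrix` is a hypothetical helper, and it assumes that blank lines may be skipped and that matrix values may be split across lines in any way as long as n×n values are present in total.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of PEACE): parses the matrix format
// described above from any input stream. Returns true on success.
bool loadMatrix(std::istream& in, std::vector<std::vector<float>>& matrix) {
    matrix.clear();
    std::string line;
    int count = -1;             // EST count; stays -1 until it has been read
    std::vector<float> values;  // All matrix values in row-major order
    while (std::getline(in, line)) {
        // Skip blank lines and comment lines starting with '#'.
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos || line[first] == '#') {
            continue;
        }
        std::istringstream fields(line);
        if (count < 0) {
            // The first non-comment line holds the number of ESTs.
            if (!(fields >> count) || count <= 0) {
                return false;
            }
            continue;
        }
        float value;
        while (fields >> value) {
            values.push_back(value);
        }
    }
    if (count <= 0) {
        return false;
    }
    const std::size_t n = static_cast<std::size_t>(count);
    if (values.size() != n * n) {
        return false;  // Too few or too many values for an n x n matrix
    }
    // Reshape the flat list into an n x n matrix.
    matrix.assign(n, std::vector<float>(n));
    for (std::size_t row = 0; row < n; row++) {
        for (std::size_t col = 0; col < n; col++) {
            matrix[row][col] = values[row * n + col];
        }
    }
    assert(matrix.size() == n);
    return true;
}
```

Taking a `std::istream` rather than a file name keeps the sketch easy to exercise with a `std::istringstream`; the actual class reads from a `FILE *`.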

Definition at line 69 of file MatrixFileAnalyzer.h.


Constructor & Destructor Documentation

MatrixFileAnalyzer::~MatrixFileAnalyzer (  )  [virtual]

The destructor.

The destructor frees any dynamic memory allocated by this object for its operations.

Definition at line 60 of file MatrixFileAnalyzer.cpp.

References distanceValues, and estCount.

MatrixFileAnalyzer::MatrixFileAnalyzer ( const int  refESTidx,
const std::string &  outputFileName 
) [private]

Definition at line 53 of file MatrixFileAnalyzer.cpp.

References distanceValues, and estCount.


Member Function Documentation

int MatrixFileAnalyzer::analyze (  )  [virtual]

Method to perform a batch of EST analysis.

The ESTAnalyzer base class requires this method to be overridden in the derived class(es). This method is used to perform the core tasks of EST analysis for the MatrixFileAnalyzer. This method operates in the following manner:

  1. First it loads the necessary distance information from the supplied data file using the initialize() method. If the data is not successfully loaded then this method returns right away with 1.
  2. Upon successfully loading the data, the reference EST is set via the setReferenceEST() method. If the reference EST is not correctly determined, then this method immediately returns with 2.
  3. For each EST in the list of ESTs it performs the following tasks:
       1. It logs the similarity metric using suitable methods in the ESTAnalyzer base class.
  4. If all the processing proceeds successfully, this method returns 0 (zero).
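The control flow and return codes listed above can be sketched as follows. This is an illustrative stand-in, not the PEACE source: `SketchAnalyzer` is a hypothetical type, the reference EST is passed in as a parameter for simplicity, and the "logging" is reduced to writing `index metric` pairs to a stream.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the analyzer. Method names mirror the
// documented ones, but the bodies are assumptions for illustration.
struct SketchAnalyzer {
    std::vector<std::vector<float>> distanceValues;  // Pre-filled by caller
    int refESTidx = -1;

    // Stands in for loading the data file: 0 on success, non-zero on error.
    int initialize() const { return distanceValues.empty() ? 1 : 0; }

    // 0 if estIdx is within the loaded range, 1 otherwise.
    int setReferenceEST(const int estIdx) {
        if (estIdx < 0 || estIdx >= static_cast<int>(distanceValues.size())) {
            return 1;
        }
        refESTidx = estIdx;
        return 0;
    }

    float getMetric(const int otherEST) const {
        return distanceValues[refESTidx][otherEST];
    }

    // Mirrors the documented sequence: load, set reference, log, return.
    int analyze(const int requestedRef, std::ostream& log) {
        if (initialize() != 0) {
            return 1;  // Data could not be loaded
        }
        if (setReferenceEST(requestedRef) != 0) {
            return 2;  // Reference EST could not be determined
        }
        const int estCount = static_cast<int>(distanceValues.size());
        for (int est = 0; est < estCount; est++) {
            log << est << ' ' << getMetric(est) << '\n';
        }
        return 0;
    }
};
```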

Implements ESTAnalyzer.

Definition at line 219 of file MatrixFileAnalyzer.cpp.

bool MatrixFileAnalyzer::compareMetrics ( const float  metric1,
const float  metric2 
) const [inline, protected, virtual]

Method to compare two metrics generated by this class.

This method provides the interface for comparing metrics generated by this ESTAnalyzer when comparing two different ESTs. This method returns true if metric1 is comparatively better than or equal to metric2.

Note:
EST analyzers that are based on distance measures must override this method.
Parameters:
[in] metric1 The first metric to be compared against.
[in] metric2 The second metric to be compared against.
Returns:
This method returns true if metric1 is comparatively better than metric2.

Reimplemented from ESTAnalyzer.

Definition at line 260 of file MatrixFileAnalyzer.h.

float MatrixFileAnalyzer::getInvalidMetric (  )  const [inline, protected, virtual]

Obtain an invalid (or the worst) metric generated by this analyzer.

This method can be used to obtain an invalid metric value for this analyzer. This value can be used to initialize metric values.

Note:
Derived distance-based metric classes must override this method to provide a suitable value.
Returns:
This method returns an invalid (or the worst) metric of 1e7 for this EST analyzer.
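One common use of an invalid metric is to seed a search for the best match, with compareMetrics() deciding whether a candidate improves on the current best. The sketch below assumes a distance-based analyzer (smaller is better) and a strict comparison; whether the real compareMetrics() treats equal metrics as "better" is not settled by this page. `findBest` is a hypothetical helper, and the functions are written as free functions rather than class members.

```cpp
#include <cassert>
#include <vector>

// Assumed behaviour for a distance-based analyzer: smaller is better.
// A strict comparison is used here as an assumption.
bool compareMetrics(const float metric1, const float metric2) {
    return metric1 < metric2;
}

// The documented invalid (or worst) metric value for this analyzer.
float getInvalidMetric() {
    return 1e7f;
}

// Hypothetical helper: returns the index of the best metric in a row of
// the matrix, ignoring skipIdx (typically the reference EST itself).
// Returns -1 if no entry is better than the invalid metric.
int findBest(const std::vector<float>& metrics, const int skipIdx) {
    float best = getInvalidMetric();  // Start from the worst possible value
    int bestIdx = -1;
    for (int i = 0; i < static_cast<int>(metrics.size()); i++) {
        if (i != skipIdx && compareMetrics(metrics[i], best)) {
            best = metrics[i];
            bestIdx = i;
        }
    }
    return bestIdx;
}
```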

Reimplemented from ESTAnalyzer.

Definition at line 276 of file MatrixFileAnalyzer.h.

float MatrixFileAnalyzer::getMetric ( const int  otherEST  )  [protected, virtual]

Analyze and obtain a distance (or similarity) metric.

This method can be used to compare a given EST with the reference EST (set via a call to the setReferenceEST() method).

Parameters:
[in] otherEST The index (zero based) of the EST with which the reference EST is to be compared.
Returns:
This method returns the distance (or similarity as the case may be) value loaded from the data file (loading is performed by the initialize() method).

Implements ESTAnalyzer.

Definition at line 188 of file MatrixFileAnalyzer.cpp.

References distanceValues, estCount, and ESTAnalyzer::refESTidx.

virtual std::string MatrixFileAnalyzer::getName (  )  const [inline, virtual]

Method to obtain human-readable name for this EST analyzer.

This method provides a human-readable string identifying the EST analyzer. This string is typically used for display/debugging purposes (particularly via the PEACE Interactive Console).

Returns:
This method returns the string "MatrixFile" identifying this analyzer.

Implements ESTAnalyzer.

Definition at line 146 of file MatrixFileAnalyzer.h.

int MatrixFileAnalyzer::initialize (  )  [virtual]

Method to begin EST analysis.

This method is invoked just before commencement of EST analysis. This method loads the list of distance values from the given input file and populates the matrix distanceValues for further use in the analyze method.

Returns:
If the ESTs were successfully loaded from the data file then this method returns 0. Otherwise this method returns with a non-zero error code.

Implements ESTAnalyzer.

Definition at line 96 of file MatrixFileAnalyzer.cpp.

References dataFileName, distanceValues, estCount, inputFileName, parseESTCount(), parseMetrics(), and readLine().

bool MatrixFileAnalyzer::isDistanceMetric (  )  const [inline, protected, virtual]

Determine if this EST analyzer provides distance metrics or similarity metrics.

This method can be used to determine if this EST analyzer provides distance metrics or similarity metrics. If this method returns true, then this EST analyzer returns distance metrics (smaller is better). On the other hand, if this method returns false, then this EST analyzer returns similarity metrics (bigger is better).

Returns:
This method returns true to indicate that this EST analyzer operates using distance metrics.

Reimplemented from ESTAnalyzer.

Definition at line 291 of file MatrixFileAnalyzer.h.

bool MatrixFileAnalyzer::parseArguments ( int &  argc,
char **  argv 
) [virtual]

Process command line arguments.

This method is used to process command line arguments specific to this EST analyzer. This method is typically used from the main method just after the EST analyzer has been instantiated. This method consumes all valid command line arguments. If the command line arguments were valid and successfully processed, then this method returns true.

Note:
The ESTAnalyzer base class requires that derived EST analyzer classes must override this method to process any command line arguments that are custom to their operation. When this method is overridden don't forget to call the corresponding base class implementation to process common options.
Parameters:
[in,out] argc The number of command line arguments to be processed.
[in,out] argv The array of command line arguments.
Returns:
This method returns true if the command line arguments were successfully processed. Otherwise this method returns false. This method checks to ensure that a valid frame size and a valid word size have been specified.

Reimplemented from ESTAnalyzer.

Definition at line 81 of file MatrixFileAnalyzer.cpp.

References ESTAnalyzer::analyzerName, arg_parser::check_args(), and dataFileName.

bool MatrixFileAnalyzer::parseESTCount ( const char *  line  )  [private]

Helper method to process EST count information from a line.

This method is a helper method that is used to read the number of ESTs from the file and create the distanceValues matrix.

Parameters:
[in] line The line from where the EST count information is to be read.
Returns:
This method returns true if the EST count was processed successfully. Otherwise this method returns false.
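The two responsibilities described here (reading the count, then creating the matrix) can be sketched as below, together with the matching cleanup a destructor would need. This is a minimal sketch under assumptions: `allocateMatrix` and `freeMatrix` are hypothetical free functions, and the real method stores its results in the estCount and distanceValues members instead of returning them.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>

// Hypothetical helper: parses the EST count from line and allocates a
// zero-initialized estCount x estCount matrix. Returns nullptr (and sets
// estCount to 0) if the line does not hold a positive integer.
float** allocateMatrix(const char* line, int& estCount) {
    char* end = nullptr;
    const long count = std::strtol(line, &end, 10);
    if (end == line || count <= 0 || count > INT_MAX) {
        estCount = 0;
        return nullptr;
    }
    estCount = static_cast<int>(count);
    float** matrix = new float*[estCount];
    for (int row = 0; row < estCount; row++) {
        matrix[row] = new float[estCount]();  // () zero-initializes the row
    }
    return matrix;
}

// Hypothetical helper: releases a matrix created by allocateMatrix().
void freeMatrix(float** matrix, const int estCount) {
    if (matrix == nullptr) {
        return;
    }
    for (int row = 0; row < estCount; row++) {
        delete[] matrix[row];
    }
    delete[] matrix;
}
```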

Definition at line 150 of file MatrixFileAnalyzer.cpp.

References FilterFactory::create(), distanceValues, and estCount.

Referenced by initialize().

int MatrixFileAnalyzer::parseMetrics ( const char *  line,
float *  values,
const int  startPos,
const int  maxValues 
) [private]

Utility method to read metrics from a given line into a given array from a given starting position.

This method is a utility method that is used to parse a line containing a space-separated set of values into the array values. Data is stored into the values array starting with startPos.

Parameters:
[in] line The line whose contents are to be parsed.
[out] values The array into which the values must be stored.
[in] startPos The starting position in the array from where values must be stored.
[in] maxValues The maximum number of items that must be processed from line.
Returns:
This method returns the number of values actually processed and stored into the values array.
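A function with this signature and contract could look like the following. This is a sketch based only on the description above, not the actual source: it assumes that parsing stops at the first token that is not a number, and that the caller guarantees `values` has room for `startPos + maxValues` entries.

```cpp
#include <cassert>
#include <cstdlib>

// Sketch of the documented contract: parse up to maxValues
// whitespace-separated numbers from line into values[startPos...].
// Returns the number of values actually stored.
int parseMetrics(const char* line, float* values,
                 const int startPos, const int maxValues) {
    assert(line != nullptr && values != nullptr && startPos >= 0);
    int stored = 0;
    const char* cursor = line;
    while (stored < maxValues) {
        char* end = nullptr;
        const float value = std::strtof(cursor, &end);
        if (end == cursor) {
            break;  // No further number could be parsed from the line
        }
        values[startPos + stored] = value;
        stored++;
        cursor = end;  // Continue after the number just consumed
    }
    return stored;
}
```

Returning the count lets a caller fill one matrix row from several consecutive lines by advancing `startPos` by the returned value each time.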

Definition at line 197 of file MatrixFileAnalyzer.cpp.

Referenced by initialize().

std::string MatrixFileAnalyzer::readLine ( FILE *  fp  )  [private]

Method to read a line from a given EST file.

This method is a helper method that is used to load a given line from the file.

Parameters:
[in,out] fp The file pointer from where the data is to be read.
Returns:
The line read from the file. If EOF was reached then this method returns an empty line.
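Reading a line of arbitrary length from a `FILE *` into a `std::string` is typically done by calling `fgets` in a loop. The sketch below illustrates that approach; it is not the PEACE implementation, and stripping the trailing newline is an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Sketch: reads one line of any length from fp. The trailing newline is
// dropped (an assumption). Returns an empty string once EOF is reached.
std::string readLine(FILE* fp) {
    assert(fp != nullptr);
    std::string line;
    char buffer[256];
    // fgets() reads at most sizeof(buffer) - 1 characters per call, so a
    // long line is assembled from several chunks.
    while (std::fgets(buffer, sizeof(buffer), fp) != nullptr) {
        line += buffer;
        if (!line.empty() && line.back() == '\n') {
            line.pop_back();  // Drop the newline that ended this line
            break;
        }
    }
    return line;
}
```

With this contract a blank line in the file also yields an empty string, so a caller would need `feof(fp)` to tell a blank line apart from the end of the file.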

Definition at line 224 of file MatrixFileAnalyzer.cpp.

Referenced by initialize().

int MatrixFileAnalyzer::setReferenceEST ( const int  estIdx  )  [virtual]

Set the reference EST id for analysis.

This method is invoked just before a batch of ESTs are analyzed via a call to the analyze() method. This method currently saves the index in the instance variable for further lookup.

Note:
This method must be called only after the initialize() method is called.
Returns:
This method returns 0 if the estIdx was within the range of values that were loaded from the data file. Otherwise this method returns 1 as the error code.

Implements ESTAnalyzer.

Definition at line 178 of file MatrixFileAnalyzer.cpp.

References estCount, and ESTAnalyzer::refESTidx.

void MatrixFileAnalyzer::showArguments ( std::ostream &  os  )  [virtual]

Display valid command line arguments for this analyzer.

This method must be used to display all valid command line options that are supported by this analyzer.

Note:
The ESTAnalyzer base class requires that derived EST analyzer classes must override this method to display help for their custom command line arguments. When this method is overridden don't forget to call the corresponding base class implementation to display common options.
Parameters:
[out] os The output stream to which the valid command line arguments must be written.

Reimplemented from ESTAnalyzer.

Definition at line 73 of file MatrixFileAnalyzer.cpp.


Friends And Related Function Documentation

friend class ESTAnalyzerFactory [friend]

Definition at line 70 of file MatrixFileAnalyzer.h.


Member Data Documentation

arg_parser::arg_record MatrixFileAnalyzer::argsList[] [static, private]

Initial value:
 {
    {"--dataFile", "Data file containing matrix of distance metrics",
     &MatrixFileAnalyzer::dataFileName, arg_parser::STRING},
    {NULL, NULL, NULL, arg_parser::BOOLEAN}
}

The set of arguments for the MatrixFileAnalyzer.

This instance variable contains a static list of arguments that are used by this analyzer.

Definition at line 377 of file MatrixFileAnalyzer.h.

char * MatrixFileAnalyzer::dataFileName = NULL [static, private]

The matrix data file from where distance metrics are to be read.

This member object is used to hold the file name from where all the distance matrix data is to be loaded. The value is set to the value supplied by the user via a suitable command line argument by the parseArguments() method.

Definition at line 387 of file MatrixFileAnalyzer.h.

Referenced by initialize(), and parseArguments().

float ** MatrixFileAnalyzer::distanceValues [protected]

The array of distance values.

This matrix contains the distance values between a given pair of ESTs. The zero-based index of the EST is used to look up values in this matrix. For example, given a pair of ESTs <est1, est2> distanceValues[est1][est2] provides the distance from est1 to est2 while distanceValues[est2][est1] provides the distance from est2 to est1. Note that distances do not have to be symmetric.
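The asymmetry can be seen with the sample matrix from the class description, where the entry for (1, 2) is 15 but the entry for (2, 1) is 15.2. The snippet below is a small illustration only; `kSampleDistances` and `distanceBetween` are hypothetical names, and a fixed-size array stands in for the dynamically allocated `float **`.

```cpp
#include <cassert>

// Sample values copied from the matrix file in the class description.
static const int kEstCount = 3;
static const float kSampleDistances[kEstCount][kEstCount] = {
    {0.0f, 10.0f, 20.0f},
    {10.0f, 0.0f, 15.0f},
    {20.0f, 15.2f, 0.0f}
};

// Hypothetical lookup helper: distance from est1 to est2 using
// zero-based EST indexes, as described for distanceValues.
float distanceBetween(const int est1, const int est2) {
    assert(est1 >= 0 && est1 < kEstCount);
    assert(est2 >= 0 && est2 < kEstCount);
    return kSampleDistances[est1][est2];
}
```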

Definition at line 241 of file MatrixFileAnalyzer.h.

Referenced by getMetric(), initialize(), MatrixFileAnalyzer(), parseESTCount(), and ~MatrixFileAnalyzer().

int MatrixFileAnalyzer::estCount [protected]

The number of ESTs for which we have data in the distanceValues matrix.

This instance variable's value is set by the initialize method in this class.

Definition at line 229 of file MatrixFileAnalyzer.h.

Referenced by getMetric(), initialize(), MatrixFileAnalyzer(), parseESTCount(), setReferenceEST(), and ~MatrixFileAnalyzer().

const std::string MatrixFileAnalyzer::inputFileName [protected]

The data file from where the distance (or similarity) metrics must be loaded.

This instance variable is set to the data file from where the necessary information is to be loaded. This value is set in the constructor and is never changed during the lifetime of this object.

Definition at line 221 of file MatrixFileAnalyzer.h.

Referenced by initialize().


The documentation for this class was generated from the following files:

Generated on 19 Mar 2010 for PEACE by  doxygen 1.6.1