FWAnalyzer Class Reference

FWAnalyzer: Frame and Word based Analyzer.

#include <FWAnalyzer.h>

Public Member Functions

virtual ~FWAnalyzer ()
 The destructor.
std::string getFrame (const EST *est, bool start=true)
 Obtains a frame (frameSize base pairs) from a given EST sequence.
virtual void showArguments (std::ostream &os)
 Display valid command line arguments for this analyzer.
virtual bool parseArguments (int &argc, char **argv)
 Process command line arguments.
virtual int initialize ()
 Method to load the ESTs just before EST analysis begins.
virtual int setReferenceEST (const int estIdx)
 Set the reference EST id for analysis.
virtual int analyze ()
 Method to begin EST analysis.
virtual int getPreferredDummyESTLength () const
 Determine preferred dummy EST lengths to be used with this analyzer.

Protected Member Functions

virtual float getMetric (const int otherEST)
 Analyze and obtain a similarity metric.
virtual float analyzeFrame (const std::string &refFrame, const std::string &otherFrame, const int wordSize)
 Method to compare two frames and compute similarity.
virtual void dumpHeader (ResultLog &log, const double mean, const double variance)
 Helper method to dump result log header.
virtual void dumpESTList (const std::vector< EST * > &estList, const EST *refEST, ResultLog &log)
 Helper method to dump post analysis EST list to a log.
virtual void dumpEST (ResultLog &log, const EST *est, const bool isReference=false)
 Dumps a given EST in 3-column format using R.
 FWAnalyzer (const std::string &analyzerName, const int refESTidx, const std::string &outputFile)
 The default constructor.

Protected Attributes

int frameSize
 The frame size to be used by this analyzer.
std::string referenceFrame
 The reference frame to be used for EST comparisons.

Static Protected Attributes

static int argumentFrameSize = 100
 The frame size supplied by the user as command line input.
static int wordSize = 6
 The word size to be used by this analyzer.

Static Private Attributes

static arg_parser::arg_record commonArgsList []
 The set of common arguments for all FWAnalyzer instances.

Detailed Description

FWAnalyzer: Frame and Word based Analyzer.

This analyzer provides a common base class for all EST analyzers that use a frame and word concept for analyzing ESTs. The total number of base pairs to be compared is called a frame. A frame is broken into a sequence of fixed-size words (measured in bp). The frame size and word size (in base pairs) are specified as command line arguments.
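To make the frame and word terminology concrete, the following is a minimal sketch that lists the words contained in a frame. It assumes that words are taken at every offset (a sliding window of step 1); the description above does not state whether words overlap. The helper wordsInFrame is hypothetical and is not part of FWAnalyzer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of FWAnalyzer): list every word of
// wordSize base pairs in a frame, assuming a sliding window of step 1.
std::vector<std::string> wordsInFrame(const std::string& frame,
                                      const std::size_t wordSize) {
    std::vector<std::string> words;
    if (wordSize == 0 || wordSize > frame.size()) {
        return words;  // No complete word fits in the frame.
    }
    for (std::size_t i = 0; i + wordSize <= frame.size(); ++i) {
        words.push_back(frame.substr(i, wordSize));
    }
    return words;
}
```

Under this sliding-window assumption, a frame of 100 bp with a word size of 6 bp contains 95 words.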

This class has been implemented by extending the ESTAnalyzer base class. The ESTAnalyzer base class provides most of the standard functionality involved in reading FASTA files and generating formatted output. This class adds functionality to compare ESTs using the concept of frames and words.

Note:
This class is never directly instantiated. Instead, one of the derived classes is instantiated (via the ESTAnalyzerFactory::create() method) and used.

Definition at line 64 of file FWAnalyzer.h.


Constructor & Destructor Documentation

FWAnalyzer::~FWAnalyzer (  )  [virtual]

The destructor.

The destructor frees any dynamic memory allocated by this object for its operations.

Definition at line 65 of file FWAnalyzer.cpp.

References EST::deleteAllESTs().

FWAnalyzer::FWAnalyzer ( const std::string &  analyzerName,
const int  refESTidx,
const std::string &  outputFile 
) [protected]

The default constructor.

The default constructor for this class. The constructor is protected so that this class cannot be directly instantiated. Instead, one of the derived analyzer classes must be instantiated (via the ESTAnalyzerFactory::create() method) and used.
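The protected-constructor arrangement described above can be sketched as follows. All names here (FrameWordBase, DemoAnalyzer, createAnalyzer) are hypothetical stand-ins; the actual signature of ESTAnalyzerFactory::create() is not shown on this page, so the factory function below is only an assumption about its general shape.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for FWAnalyzer: the protected constructor
// prevents direct instantiation of the base class.
class FrameWordBase {
public:
    virtual ~FrameWordBase() {}
    const std::string& name() const { return analyzerName; }

protected:
    explicit FrameWordBase(std::string name)
        : analyzerName(std::move(name)) {}

private:
    std::string analyzerName;
};

// Hypothetical derived analyzer; its public constructor forwards the
// analyzer name to the protected base-class constructor.
class DemoAnalyzer : public FrameWordBase {
public:
    DemoAnalyzer() : FrameWordBase("demo") {}
};

// Hypothetical factory function (assumed shape, not the real
// ESTAnalyzerFactory::create()). Returns a null pointer for unknown names.
std::unique_ptr<FrameWordBase> createAnalyzer(const std::string& name) {
    if (name == "demo") {
        return std::unique_ptr<FrameWordBase>(new DemoAnalyzer());
    }
    return nullptr;
}
```

Writing FrameWordBase base("x"); outside the class hierarchy would fail to compile, which is the effect the protected constructor is meant to have.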

Parameters:
[in] analyzerName The human-readable name for this EST analyzer. This name is used when generating errors, warnings, and other output messages for this analyzer. This value is simply passed on to the base class without any checks.
[in] refESTidx The reference EST index value to be used when performing EST analysis. This parameter should be >= 0. This value is simply passed on to the base class.
[in] outputFile The name of the output file to which the EST analysis data is to be written. This parameter is ignored if this analyzer is used for clustering. If this parameter is the empty string, then output is written to standard output. This value is simply passed on to the base class.

Definition at line 58 of file FWAnalyzer.cpp.

References frameSize.


Member Function Documentation

int FWAnalyzer::analyze (  )  [virtual]

Method to begin EST analysis.

This method is used to perform the core tasks of EST analysis for all FWAnalyzer classes. This method operates in the following manner:

  1. First it loads the necessary EST information from the supplied FASTA file using the initialize() method. If the EST data is not successfully loaded, then this method returns right away with 1.
  2. Upon successfully loading the EST data, the reference EST is set via the setReferenceEST() method. If the reference EST is not correctly determined, then this method immediately returns with 2.
  3. For each EST in the list of ESTs it performs the following tasks:
    1. It extracts a frame from the reference EST and current EST.
    2. Next, it calls the polymorphic analyze() method to obtain a similarity metric.
    3. It logs the similarity metric using suitable methods in the ESTAnalyzer base class.
  4. If all the processing proceeds successfully, this method returns 0 (zero).

Returns:
This method returns zero if all the processing proceeded successfully. On errors this method returns a non-zero value.
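The control flow listed above can be sketched with a small stand-in class. Only the order of the steps and the return codes (0, 1, 2) come from the description; the class AnalyzeFlowSketch, its constructor flags, and the placeholder metric are hypothetical and exist only to make the flow observable.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an FWAnalyzer-like class. The flags passed to
// the constructor simulate success or failure of the first two steps.
class AnalyzeFlowSketch {
public:
    AnalyzeFlowSketch(bool loadSucceeds, bool refSucceeds, std::size_t count)
        : loadOk(loadSucceeds), refOk(refSucceeds), estCount(count) {}
    virtual ~AnalyzeFlowSketch() {}

    int analyze() {
        if (initialize() != 0) {
            return 1;  // Step 1 failed: EST data could not be loaded.
        }
        if (setReferenceEST(0) != 0) {
            return 2;  // Step 2 failed: reference EST could not be set.
        }
        for (std::size_t i = 0; i < estCount; ++i) {
            // Step 3: obtain a metric for each EST and record it
            // (recording stands in for logging in this sketch).
            metrics.push_back(getMetric(static_cast<int>(i)));
        }
        return 0;  // Step 4: everything succeeded.
    }

    const std::vector<float>& results() const { return metrics; }

protected:
    virtual int initialize() { return loadOk ? 0 : -1; }
    virtual int setReferenceEST(const int) { return refOk ? 0 : -1; }
    // Placeholder metric: simply echoes the EST index.
    virtual float getMetric(const int otherEST) {
        return static_cast<float>(otherEST);
    }

private:
    bool loadOk;
    bool refOk;
    std::size_t estCount;
    std::vector<float> metrics;
};
```

Because the steps are virtual, a derived class in this sketch can replace any one of them without touching the overall sequence.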

Implements ESTAnalyzer.

Reimplemented in D2, D2Zim, and TwoPassD2.

Definition at line 206 of file FWAnalyzer.cpp.

References dumpESTList(), dumpHeader(), EST::getESTList(), ESTAnalyzer::htmlLog, initialize(), ESTAnalyzer::outputFileName, ESTAnalyzer::refESTidx, and setReferenceEST().

float FWAnalyzer::analyzeFrame ( const std::string &  refFrame,
const std::string &  otherFrame,
const int  wordSize 
) [protected, virtual]

Method to compare two frames and compute similarity.

This method must be overridden by derived Frame-Word analyzers (see FMWSCA.h) to compare two frames and report a similarity metric.

Parameters:
[in] refFrame The reference frame for comparison purposes. Note that the reference frame is always constant in a given set of comparisons. Consequently, certain analyzers can pre-compute and reuse metrics to make analysis fast.
[in] otherFrame The other frame for comparison. This frame is always guaranteed to be from a different EST than the refFrame.
[in] wordSize The size of a word within the given frame. This value is always greater than 0 (zero) and less than frame size.
Returns:
This method is expected to return a similarity metric between the given frame and the refFrame.
Note:
The default implementation of this method simply returns 0. Derived FWAnalyzer-based classes must override this method to perform the necessary operations.
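As an illustration of overriding this hook, the sketch below pairs a base class whose analyzeFrame() returns 0 with a derived class that counts the distinct words occurring in both frames. The class names are hypothetical, and the shared-word count is an illustrative metric chosen for simplicity; it is not claimed to be the metric used by any derived analyzer in this code base.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for the FWAnalyzer hook; the default returns 0.
class FrameComparer {
public:
    virtual ~FrameComparer() {}
    virtual float analyzeFrame(const std::string& refFrame,
                               const std::string& otherFrame,
                               const int wordSize) {
        (void)refFrame;
        (void)otherFrame;
        (void)wordSize;
        return 0.0f;
    }
};

// Hypothetical override: counts distinct words present in both frames.
class SharedWordComparer : public FrameComparer {
public:
    float analyzeFrame(const std::string& refFrame,
                       const std::string& otherFrame,
                       const int wordSize) override {
        if (wordSize <= 0) {
            return 0.0f;
        }
        const std::size_t w = static_cast<std::size_t>(wordSize);
        // Collect every word of the reference frame.
        std::set<std::string> refWords;
        for (std::size_t i = 0; i + w <= refFrame.size(); ++i) {
            refWords.insert(refFrame.substr(i, w));
        }
        // Collect the words of the other frame that also occur in refWords.
        std::set<std::string> shared;
        for (std::size_t i = 0; i + w <= otherFrame.size(); ++i) {
            const std::string word = otherFrame.substr(i, w);
            if (refWords.count(word) > 0) {
                shared.insert(word);
            }
        }
        return static_cast<float>(shared.size());
    }
};
```

Since the reference frame is constant across a set of comparisons, a real analyzer could build refWords once and reuse it; the sketch rebuilds it on every call to stay short.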

Definition at line 293 of file FWAnalyzer.cpp.

Referenced by getMetric().

void FWAnalyzer::dumpEST ( ResultLog &  log,
const EST *  est,
const bool  isReference = false 
) [protected, virtual]

Dumps a given EST in 3-column format using R.

This method is a helper method that dumps a given EST out to the log.

Parameters:
[out] log The log to which the EST is to be dumped.
[in] est The EST to be dumped. This parameter is never NULL.
[in] isReference If this flag is true, then this EST is the reference EST to be dumped out.

Reimplemented in CLU.

Definition at line 119 of file FWAnalyzer.cpp.

References getFrame(), EST::getInfo(), EST::getSimilarity(), ESTAnalyzer::htmlLog, and ResultLog::report().

Referenced by dumpESTList().

void FWAnalyzer::dumpESTList ( const std::vector< EST * > &  estList,
const EST *  refEST,
ResultLog &  log 
) [protected, virtual]

Helper method to dump post analysis EST list to a log.

This is a helper method that is invoked from the analyze() method to dump the list of analyzed ESTs to a log. This method was introduced to keep the code clutter in the analyze() method to a minimum. In addition, it gives derived classes a chance to customize the working of the class.

This method dumps the list of ESTs to the supplied log.

Parameters:
[in] estList The list of ESTs that must be dumped out.
[in] refEST The reference EST.
[out] log The log to which the list of ESTs is to be dumped.

Definition at line 135 of file FWAnalyzer.cpp.

References dumpEST(), and ResultLog::startTable().

Referenced by analyze().

void FWAnalyzer::dumpHeader ( ResultLog &  log,
const double  mean,
const double  variance 
) [protected, virtual]

Helper method to dump result log header.

This is a helper method that is invoked from the analyze() method to dump a result log header. This method was introduced to keep the code clutter in the analyze() method to a minimum.

This method dumps some of the analysis parameters to the supplied log.

Parameters:
[out] log The log to which the header is to be dumped.
[in] mean The overall mean similarity for this set of ESTs.
[in] variance The overall variance in similarity for the given set of ESTs currently analyzed.

Definition at line 259 of file FWAnalyzer.cpp.

References ESTAnalyzer::analyzerName, ResultLog::endTable(), ESTAnalyzer::estFileName, frameSize, EST::getESTList(), getTime(), ESTAnalyzer::htmlLog, ESTAnalyzer::refESTidx, ResultLog::report(), ResultLog::reportLine(), and wordSize.

Referenced by analyze().

std::string FWAnalyzer::getFrame ( const EST *  est,
bool  start = true 
)

Obtains a frame (frameSize base pairs) from a given EST sequence.

This method is used to obtain a frame (that is, frameSize base pairs) from a given EST sequence. The frame is extracted either from the beginning or the end of an EST depending on the start parameter.

Parameters:
[in] est The EST from which a frame (frameSize bp) must be extracted.
[in] start If this flag is true then this method extracts base pairs from the beginning of the EST sequence. Otherwise a frame is extracted from the end of the EST sequence.
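The effect of the start flag can be sketched with a free function that works on the sequence string directly. extractFrame is a hypothetical helper, not the actual getFrame() implementation, and returning a short sequence whole is an assumption of this sketch; the page does not say how sequences shorter than the frame size are handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical free-function version of getFrame(): it takes the sequence
// itself instead of an EST pointer. Sequences no longer than frameSize are
// returned whole (an assumption of this sketch).
std::string extractFrame(const std::string& sequence,
                         const std::size_t frameSize,
                         const bool start = true) {
    if (sequence.size() <= frameSize) {
        return sequence;
    }
    return start ? sequence.substr(0, frameSize)
                 : sequence.substr(sequence.size() - frameSize);
}
```

For example, with a frame size of 3, the sequence "ACGTACGT" would give "ACG" from the beginning and "CGT" from the end.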

Definition at line 109 of file FWAnalyzer.cpp.

References frameSize, and EST::getSequence().

Referenced by dumpEST(), getMetric(), and setReferenceEST().

float FWAnalyzer::getMetric ( const int  otherEST  )  [protected, virtual]

Analyze and obtain a similarity metric.

This method can be used to compare a given EST with the reference EST (set via a call to the setReferenceEST() method).

Parameters:
[in] otherEST The index (zero based) of the EST with which the reference EST is to be compared.
Returns:
This method must return a similarity metric by comparing the ESTs by calling the analyzeFrame() method.

Implements ESTAnalyzer.

Reimplemented in CLU, D2, D2Zim, and TwoPassD2.

Definition at line 194 of file FWAnalyzer.cpp.

References analyzeFrame(), EST::getESTList(), getFrame(), referenceFrame, ESTAnalyzer::refESTidx, and wordSize.

virtual int FWAnalyzer::getPreferredDummyESTLength (  )  const [inline, virtual]

Determine preferred dummy EST lengths to be used with this analyzer.

Note:
For a more detailed description of the motivation for dummy ESTs, please refer to the documentation for the corresponding method in the base class, ESTAnalyzer::getPreferredDummyESTLength().
Returns:
This method overrides the default implementation in the base class to return twice the frame (aka window) size.

Reimplemented from ESTAnalyzer.

Definition at line 220 of file FWAnalyzer.h.

References frameSize.

int FWAnalyzer::initialize (  )  [virtual]

Method to load the ESTs just before EST analysis begins.

This method is invoked just before commencement of EST analysis. This method loads the list of ESTs from a given input multi-FASTA file and populates the list of ESTs.

Returns:
If the ESTs were successfully loaded from the FASTA file, then this method returns 0. Otherwise this method returns a non-zero error code.

Implements ESTAnalyzer.

Reimplemented in CLU, D2, D2Zim, and TwoPassD2.

Definition at line 157 of file FWAnalyzer.cpp.

References ESTAnalyzer::chain, ESTAnalyzer::estFileName, HeuristicChain::initialize(), and ESTAnalyzer::loadFASTAFile().

Referenced by analyze().

bool FWAnalyzer::parseArguments ( int &  argc,
char **  argv 
) [virtual]

Process command line arguments.

This method is used to process command line arguments specific to this EST analyzer. This method is typically used from the main method just after the EST analyzer has been instantiated. This method consumes all valid command line arguments. If the command line arguments were valid and successfully processed, then this method returns true.

Note:
Derived EST analyzer classes must override this method to process any command line arguments that are custom to their operation. When this method is overridden, don't forget to call the corresponding base class implementation to process common options.
Parameters:
[in,out] argc The number of command line arguments to be processed.
[in,out] argv The array of command line arguments.
Returns:
This method returns true if the command line arguments were successfully processed. Otherwise this method returns false. This method checks to ensure that a valid frame size and a valid word size have been specified.
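The consume-and-validate behaviour described above can be sketched as follows. This is not the real implementation and does not use the arg_parser API, whose interface is not shown on this page; FrameWordArgs and parseFrameWordArgs are hypothetical names. The validity rule (both sizes positive, word size smaller than frame size) is taken from the notes elsewhere on this page.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical settings holder; the defaults mirror the documented
// defaults (frame size 100, word size 6).
struct FrameWordArgs {
    int frameSize = 100;
    int wordSize = 6;
};

// Sketch only: consumes "--frame N" and "--word N" from argv, keeps all
// other arguments (compacted to the front), and validates the result.
bool parseFrameWordArgs(int& argc, char** argv, FrameWordArgs& out) {
    int kept = (argc > 0) ? 1 : 0;  // argv[0] (program name) is kept.
    for (int i = 1; i < argc; ++i) {
        const bool isFrame = std::strcmp(argv[i], "--frame") == 0;
        const bool isWord = std::strcmp(argv[i], "--word") == 0;
        if ((isFrame || isWord) && i + 1 < argc) {
            // Consume the option and its value.
            const int value = std::atoi(argv[++i]);
            (isFrame ? out.frameSize : out.wordSize) = value;
        } else {
            argv[kept++] = argv[i];  // Not ours: leave for other parsers.
        }
    }
    argc = kept;
    // Valid only if both sizes are positive and a word fits in a frame.
    return out.wordSize > 0 && out.frameSize > 0 &&
           out.wordSize < out.frameSize;
}
```

Passing argc by reference lets the caller see how many unconsumed arguments remain, which is why the documented method takes int & rather than int.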

Reimplemented from ESTAnalyzer.

Reimplemented in CLU, D2, D2Zim, FMWSCA, and TwoPassD2.

Definition at line 80 of file FWAnalyzer.cpp.

References ESTAnalyzer::analyzerName, argumentFrameSize, arg_parser::check_args(), frameSize, ESTAnalyzer::parseArguments(), and wordSize.

int FWAnalyzer::setReferenceEST ( const int  estIdx  )  [virtual]

Set the reference EST id for analysis.

This method is invoked just before a batch of ESTs is analyzed via a call to the analyze(EST *) method. This method extracts the start frame from the reference EST and stores it in the referenceFrame instance variable of this class.

Note:
This method must be called only after the initialize() method is called.
Returns:
If the extraction of the reference EST frame was successful, then this method returns 0. Otherwise this method returns an error code.

Implements ESTAnalyzer.

Reimplemented in CLU, D2, D2Zim, and TwoPassD2.

Definition at line 173 of file FWAnalyzer.cpp.

References EST::getESTList(), getFrame(), referenceFrame, and ESTAnalyzer::refESTidx.

Referenced by analyze().

void FWAnalyzer::showArguments ( std::ostream &  os  )  [virtual]

Display valid command line arguments for this analyzer.

This method must be used to display all valid command line options that are supported by this analyzer. Note that derived classes may override this method to display additional command line options that are applicable to them. This method is typically used in the main() method when displaying usage information.

Note:
Derived EST analyzer classes must override this method to display help for their custom command line arguments. When this method is overridden don't forget to call the corresponding base class implementation to display common options.
Parameters:
[out] os The output stream to which the valid command line arguments must be written.
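The override-and-chain pattern from the note above can be sketched as follows. BaseOptions and DerivedOptions are hypothetical stand-ins for the analyzer classes, the "--threshold" option is invented purely for illustration, and renderHelp is a hypothetical convenience wrapper.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the base analyzer's showArguments().
class BaseOptions {
public:
    virtual ~BaseOptions() {}
    virtual void showArguments(std::ostream& os) {
        os << "--frame  Frame size (in base pairs, default=100)\n"
           << "--word   Word size (in base pairs, default=6)\n";
    }
};

// Hypothetical derived analyzer with one invented custom option.
class DerivedOptions : public BaseOptions {
public:
    void showArguments(std::ostream& os) override {
        BaseOptions::showArguments(os);  // Common options first.
        os << "--threshold  Example custom option (invented)\n";
    }
};

// Hypothetical convenience wrapper: renders the help text to a string.
std::string renderHelp(BaseOptions& analyzer) {
    std::ostringstream os;
    analyzer.showArguments(os);
    return os.str();
}
```

Calling the base implementation explicitly keeps the common options in the usage output even when a derived class adds its own.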

Reimplemented from ESTAnalyzer.

Reimplemented in CLU, D2, D2Zim, FMWSCA, and TwoPassD2.

Definition at line 71 of file FWAnalyzer.cpp.

References ESTAnalyzer::analyzerName.


Member Data Documentation

int FWAnalyzer::argumentFrameSize = 100 [static, protected]

The frame size supplied by the user as command line input.

This is a separate, static frame size variable so that it can be set from a command line argument.

Note:
Deriving analyzers are not in any way required to use the user-supplied frame size. See the specific analyzer classes for those details.

Definition at line 343 of file FWAnalyzer.h.

Referenced by parseArguments().

arg_parser::arg_record FWAnalyzer::commonArgsList[] [static, private]

Initial value:
 {
    {"--frame", "Frame size (in base pairs, default=100)",
     &FWAnalyzer::argumentFrameSize, arg_parser::INTEGER},
    {"--word", "Word size (in base pairs, default=6)",
     &FWAnalyzer::wordSize, arg_parser::INTEGER},    
    {NULL, NULL, NULL, arg_parser::BOOLEAN}
}

The set of common arguments for all FWAnalyzer instances.

This variable contains a static list of arguments that are common to all the Frame-Word analyzers. The common argument list is statically defined and shared by all FWAnalyzer instances.

Note:
This makes FWAnalyzer class hierarchy not MT-safe.

Reimplemented from ESTAnalyzer.

Definition at line 399 of file FWAnalyzer.h.

int FWAnalyzer::frameSize [protected]

The frame size to be used by this analyzer.

The frame size (in bp) that must be used for comparisons. The default value is set to 0. However, the value is changed by the parseArguments method depending on the actual value specified by the user.

Definition at line 332 of file FWAnalyzer.h.

Referenced by D2Zim::buildFdHashMaps(), D2::buildWordTable(), dumpHeader(), FWAnalyzer(), getFrame(), D2::getInvalidMetric(), TwoPassD2::getMetric(), D2Zim::getMetric(), getPreferredDummyESTLength(), CLU::getSimilarity(), TwoPassD2::initialize(), D2Zim::initialize(), D2::initialize(), parseArguments(), D2::runD2(), and TwoPassD2::updateParameters().

std::string FWAnalyzer::referenceFrame [protected]

The reference frame to be used for EST comparisons.

This instance variable is set to the reference frame once the setReferenceEST() method is called. This referenceFrame is used in subsequent analysis methods.

Definition at line 362 of file FWAnalyzer.h.

Referenced by getMetric(), and setReferenceEST().

int FWAnalyzer::wordSize = 6 [static, protected]

The word size to be used by this analyzer.

The word size (in bp) that must be used for comparisons. The default value is 6. However, the value is changed by the parseArguments() method depending on the actual value specified by the user.

Note:
The word size must be smaller than the frame size.

Definition at line 354 of file FWAnalyzer.h.

Referenced by D2Zim::buildFdHashMaps(), TwoPassD2::buildWordTable(), D2Zim::buildWordTable(), D2::buildWordTable(), CLU::createCLUHashMap(), dumpHeader(), CLU::filterHashMap(), getMetric(), CLU::getSimilarity(), TwoPassD2::initialize(), D2Zim::initialize(), D2::initialize(), parseArguments(), D2::runD2(), TwoPassD2::runD2Asymmetric(), TwoPassD2::runD2Bounded(), and TwoPassD2::updateParameters().


The documentation for this class was generated from the following files:

FWAnalyzer.h
FWAnalyzer.cpp

Generated on 19 Mar 2010 for PEACE by  doxygen 1.6.1