org.peace_tools.data
Class ClusterNode

java.lang.Object
  extended by org.peace_tools.data.ClusterNode

public class ClusterNode
extends java.lang.Object

A class that represents a single cluster in a cluster tree. This is a pure data class that encapsulates the information pertaining to a single cluster. It is a self-referential structure, in that it contains a list of ClusterNode objects that represent the child clusters of this cluster. This definition permits a ClusterNode to contain a complete cluster collection as a part of it. ClusterNode objects are created and used by the ClusterTree class, which is the top-level class.


Field Summary
private  java.util.ArrayList<ClusterNode> children
          This array list contains the list of child clusters and EST nodes contained by this cluster node.
private  int clusterOrESTID
          A number that is assigned to this node.
(package private)  java.util.HashMap<java.lang.Integer,java.lang.Integer> estGroups
          The aggregate classification information associated with this cluster.
private  java.lang.String name
          A name set to identify filtered clusters.
private  ClusterNode parent
          The parent cluster for this cluster node.
 
Constructor Summary
ClusterNode(ClusterNode parent, boolean estNode, int id, java.lang.String name)
          Constructor to create a Cluster.
 
Method Summary
 void addChild(ClusterNode node)
          Add another cluster as the child cluster of this cluster node.
protected  void classify(ESTList estList, javax.swing.ProgressMonitor pm)
          Method to compute classification statistics for this cluster node.
 double[] computeStatistics()
          Method to compute and return statistics about this cluster.
 java.util.ArrayList<ClusterNode> getChildren()
          Obtain the child nodes associated with this node.
 int getClusterId()
          Obtain the cluster ID associated with this cluster node.
 java.util.HashMap<java.lang.Integer,java.lang.Integer> getESTClasses()
          Obtain the aggregate classification information for this cluster.
 int getESTCount(boolean recursive)
          Method to determine number of ESTs contained in this cluster node.
 int getESTId()
          Obtain the EST id associated with an EST node.
 int getLargestClusterSize()
          Determine the largest cluster in this cluster node.
 java.lang.String getName()
          Obtain the name associated with this cluster (if any). This method must be used to obtain the name associated with this cluster.
 boolean isESTNode()
          Determine if this node represents an EST entry in the cluster tree.
 boolean isLeaf()
          Determine if this cluster is a leaf cluster that has no child clusters or EST entries.
 boolean isRoot()
          Determine if this cluster node is the root cluster.
 void print(java.io.PrintStream out, ClusterNode cluster, java.lang.String indent)
          Method to recursively print the information that is stored in this cluster.
 java.lang.String toString()
          This method returns basic information about this cluster node.
 void write(ESTList estList, java.io.PrintStream os)
          Method to dump the EST data out in FASTA file format.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

parent

private ClusterNode parent
The parent cluster for this cluster node. If this node is the root cluster, then it has no parent (that is, parent is null).


clusterOrESTID

private int clusterOrESTID
A number that is assigned to this node. This value could either be the EST index/ID or a cluster ID depending on the sign. Positive values represent clusterIDs while negative values represent EST index/ID.
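
The sign convention described above can be sketched as follows. This is only an illustration, not the PEACE implementation: the class SignedIdSketch and its methods are hypothetical, the sketch assumes the EST index is recovered by negating the stored value, and it treats zero as a cluster ID because the description does not say how zero is handled.

```java
// Hypothetical helper illustrating the sign convention; not part of PEACE.
final class SignedIdSketch {
    private SignedIdSketch() {}

    // Negative values denote an EST index/ID (per the field description).
    static boolean isEst(int value) {
        return value < 0;
    }

    // Assumption: the EST index is the negation of the stored value.
    static int estId(int value) {
        if (!isEst(value)) {
            throw new IllegalArgumentException("Not an EST value: " + value);
        }
        return Math.negateExact(value); // throws on Integer.MIN_VALUE overflow
    }

    // Assumption: non-negative values are used directly as cluster IDs.
    static int clusterId(int value) {
        if (isEst(value)) {
            throw new IllegalArgumentException("Not a cluster value: " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(isEst(-7));     // true
        System.out.println(estId(-7));     // 7
        System.out.println(clusterId(3));  // 3
    }
}
```

Checking the sign before decoding mirrors the notes on getESTId() and getClusterId(), whose return values are meaningful only for the matching kind of node.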


name

private java.lang.String name
A name set to identify filtered clusters. The name is loaded from the cluster file. Cluster names are typically set when dummy clusters are created to hold ESTs that were filtered out based on a specific condition; such named clusters are typically created by filters. By default, clusters do not have a name, which indicates a regular cluster.


children

private java.util.ArrayList<ClusterNode> children
This array list contains the list of child clusters and EST nodes contained by this cluster node. Typically EST nodes will not have any children.


estGroups

java.util.HashMap<java.lang.Integer,java.lang.Integer> estGroups
The aggregate classification information associated with this cluster. The information is used to provide additional summary information about the ESTs in this cluster.

Constructor Detail

ClusterNode

public ClusterNode(ClusterNode parent,
                   boolean estNode,
                   int id,
                   java.lang.String name)
Constructor to create a Cluster. This constructor provides a convenient mechanism to create and initialize a Cluster object with necessary information.

Parameters:
parent - The parent cluster for this cluster node (if known). The parent value is typically set when this node is added as a child to another cluster node.
estNode - If this flag is true, it indicates that this node represents an EST entry.
name - The name to be set for this cluster. This name is typically read from a cluster file. Clusters generated by filters (called dummy clusters) typically have a name. Clusters generated due to regular clustering process do not have a name associated with them.
Method Detail

addChild

public void addChild(ClusterNode node)
Add another cluster as the child cluster of this cluster node. This method adds the given cluster as a child cluster. In addition it also sets up the parent reference in the child to point to this cluster.

Parameters:
node - The cluster node to be added as a direct child of this cluster node.
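
The two effects of addChild (appending to the child list and setting the child's parent reference) can be sketched with a small stand-in class. TreeNodeSketch is hypothetical and models only the parent/child wiring; unlike the documented getChildren(), it returns an empty list rather than null when there are no children.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ClusterNode; only parent/child wiring is modeled.
class TreeNodeSketch {
    private TreeNodeSketch parent; // null until this node is added to a parent
    private final List<TreeNodeSketch> children = new ArrayList<>();

    // Adds the node as a direct child and points its parent reference here.
    void addChild(TreeNodeSketch node) {
        children.add(node);
        node.parent = this;
    }

    TreeNodeSketch getParent() {
        return parent;
    }

    List<TreeNodeSketch> getChildren() {
        return children;
    }

    public static void main(String[] args) {
        TreeNodeSketch root = new TreeNodeSketch();
        TreeNodeSketch child = new TreeNodeSketch();
        root.addChild(child);
        System.out.println(child.getParent() == root);  // true
        System.out.println(root.getChildren().size());  // 1
    }
}
```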

isRoot

public boolean isRoot()
Determine if this cluster node is the root cluster.

Note: This method is meaningful only after a complete cluster hierarchy has been built.

Returns:
This method returns true if the parent of this cluster is null, indicating this is a root cluster.

isLeaf

public boolean isLeaf()
Determine if this cluster is a leaf cluster that has no child clusters or EST entries.

Note: This method is meaningful only after a complete cluster hierarchy has been built.

Returns:
This method returns true if this cluster has no children.

isESTNode

public boolean isESTNode()
Determine if this node represents an EST entry in the cluster tree.

Returns:
This method returns true if this node represents an EST entry in the cluster tree.

getESTId

public int getESTId()
Obtain the EST id associated with an EST node.

Note: The return value from this method is meaningful only if the isESTNode() method returns true.

Returns:
The ID (typically the index of the EST) of the EST associated with this cluster node.

getClusterId

public int getClusterId()
Obtain the cluster ID associated with this cluster node.

Note: The return value from this method is meaningful only if the isESTNode() method returns false.

Returns:
The ID of this cluster node.

getName

public java.lang.String getName()
Obtain the name associated with this cluster (if any). This method must be used to obtain the name associated with this cluster. The cluster names are set for "dummy" clusters that are created by filters. Real clusters generated due to clustering of ESTs do not have a cluster name associated with them.

Returns:
The name set for this cluster. If a valid name is not set then this method returns an empty string ("").

getChildren

public java.util.ArrayList<ClusterNode> getChildren()
Obtain the child nodes associated with this node.

Note: This method returns null if the node does not have any children.

Returns:
The child nodes associated with this node.

getESTClasses

public java.util.HashMap<java.lang.Integer,java.lang.Integer> getESTClasses()
Obtain the aggregate classification information for this cluster. This method must be used to obtain the classification information associated with this cluster. This method returns a HashMap containing the classification entries for this cluster. Each key in the hash map is the index of the DB classifier associated with a given entry. Each value is the number of ESTs in this cluster that are associated with that classifier.

Note: The return value from this method may be null if a suitable classifier is not available for this cluster or if a classifier has not been computed.

Returns:
A hash map containing the classification information for the ESTs.
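
A map of this shape (classifier index to EST count) can be built and read as in the sketch below. The class ClassTallySketch and the idea of passing one classifier index per EST as an int array are assumptions made for illustration; the actual classification logic is not shown in this page. The lookup helper guards against the null return value mentioned in the note.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; shows the key/value shape of the classification map.
final class ClassTallySketch {
    private ClassTallySketch() {}

    // Counts how many ESTs map to each classifier index.
    // Assumption: the input holds one classifier index per EST.
    static HashMap<Integer, Integer> tally(int[] classifierIndexPerEst) {
        HashMap<Integer, Integer> counts = new HashMap<>();
        for (int index : classifierIndexPerEst) {
            counts.merge(index, 1, Integer::sum);
        }
        return counts;
    }

    // Null-safe lookup, since the documented return value may be null.
    static int countFor(Map<Integer, Integer> classes, int classifierIndex) {
        if (classes == null) {
            return 0;
        }
        return classes.getOrDefault(classifierIndex, 0);
    }

    public static void main(String[] args) {
        HashMap<Integer, Integer> classes = tally(new int[] {0, 2, 2, 1, 2});
        System.out.println(countFor(classes, 2));  // 3
        System.out.println(countFor(null, 2));     // 0
    }
}
```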

toString

public java.lang.String toString()
This method returns basic information about this cluster node.

Overrides:
toString in class java.lang.Object
Returns:
This method returns a simple string representation of the data stored in this cluster node.

getLargestClusterSize

public int getLargestClusterSize()
Determine the largest cluster in this cluster node. This method recursively searches the cluster hierarchy to determine the largest cluster (in terms of ESTs).

Returns:
The largest cluster under this cluster node. If the node is the root, then this return value corresponds to the largest cluster in the entire cluster file.
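
The recursive search can be sketched as below. LargestSketch is a hypothetical stand-in, and the sketch assumes that the size of a cluster is the number of EST nodes that are its direct children; the page does not state whether ESTs in sub-clusters count toward the enclosing cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ClusterNode used to illustrate the recursion.
class LargestSketch {
    final boolean isEst;
    final List<LargestSketch> children = new ArrayList<>();

    LargestSketch(boolean isEst, LargestSketch... kids) {
        this.isEst = isEst;
        for (LargestSketch kid : kids) {
            children.add(kid);
        }
    }

    static LargestSketch estNode() {
        return new LargestSketch(true);
    }

    static LargestSketch cluster(LargestSketch... kids) {
        return new LargestSketch(false, kids);
    }

    // Assumption: a cluster's size is its number of direct EST children.
    int largestClusterSize() {
        int direct = 0; // ESTs directly under this node
        int best = 0;   // largest size found among sub-clusters
        for (LargestSketch child : children) {
            if (child.isEst) {
                direct++;
            } else {
                best = Math.max(best, child.largestClusterSize());
            }
        }
        return Math.max(best, direct);
    }

    public static void main(String[] args) {
        LargestSketch root = cluster(
            estNode(),
            cluster(estNode(), estNode(), estNode()),
            cluster(estNode(), estNode()));
        System.out.println(root.largestClusterSize());  // 3
    }
}
```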

print

public void print(java.io.PrintStream out,
                  ClusterNode cluster,
                  java.lang.String indent)
Method to recursively print the information that is stored in this cluster. This method recursively prints the child clusters. This method also serves as a manual mechanism to validate the data stored in this cluster.

Parameters:
out - The output stream to which the data is to be serialized.
cluster - The cluster to be printed.
indent - The number of spaces to be used to indent the output.
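
A recursive, indented dump of a tree can be sketched as follows. PrintSketch, its label field, and the choice to deepen the indent by two spaces per level are assumptions for illustration; the actual output format of print is not described here.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ClusterNode; each node carries only a label.
class PrintSketch {
    final String label;
    final List<PrintSketch> children = new ArrayList<>();

    PrintSketch(String label, PrintSketch... kids) {
        this.label = label;
        for (PrintSketch kid : kids) {
            children.add(kid);
        }
    }

    // Prints the node, then each child with a deeper indent.
    static void print(PrintStream out, PrintSketch node, String indent) {
        out.print(indent + node.label + "\n");
        for (PrintSketch child : node.children) {
            print(out, child, indent + "  ");
        }
    }

    // Convenience wrapper that captures the printed text as a string.
    static String render(PrintSketch node) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer);
        print(out, node, "");
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        PrintSketch tree = new PrintSketch("cluster 0",
            new PrintSketch("cluster 1", new PrintSketch("EST 5")),
            new PrintSketch("EST 9"));
        System.out.print(render(tree));
    }
}
```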

write

public void write(ESTList estList,
                  java.io.PrintStream os)
Method to dump the EST data out in FASTA file format. This method can be used to dump out the ESTs in this cluster to a file in FASTA compatible format.

Parameters:
estList - The list of ESTs from which the EST data is to be obtained for writing.
os - The output stream to which the EST must be written in a FASTA format.
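
Writing entries in FASTA format (a header line starting with ">" followed by the sequence) can be sketched as below. The Est class, the estIds array, and the line-wrapping width parameter are hypothetical; the real ESTList API is not shown on this page, so this is not how write is necessarily implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of FASTA output for the ESTs of one cluster.
class FastaSketch {
    // Hypothetical minimal EST record: header text plus nucleotide sequence.
    static final class Est {
        final String info;
        final String sequence;

        Est(String info, String sequence) {
            this.info = info;
            this.sequence = sequence;
        }
    }

    // Writes the ESTs selected by estIds, wrapping sequences at width columns.
    static void write(List<Est> estList, int[] estIds, PrintStream os, int width) {
        if (width <= 0) {
            throw new IllegalArgumentException("width must be positive");
        }
        for (int id : estIds) {
            Est est = estList.get(id);
            os.print(">" + est.info + "\n");
            for (int start = 0; start < est.sequence.length(); start += width) {
                int end = Math.min(start + width, est.sequence.length());
                os.print(est.sequence.substring(start, end) + "\n");
            }
        }
    }

    // Convenience wrapper that captures the output as a string.
    static String render(List<Est> estList, int[] estIds, int width) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream os = new PrintStream(buffer);
        write(estList, estIds, os, width);
        os.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        List<Est> ests = Arrays.asList(
            new Est("est0", "ACGTACGTAC"), new Est("est1", "GG"));
        System.out.print(render(ests, new int[] {1, 0}, 4));
    }
}
```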

classify

protected void classify(ESTList estList,
                        javax.swing.ProgressMonitor pm)
Method to compute classification statistics for this cluster node. This method is typically invoked from the ClusterFile.classify() method to compute classifications for this node. This method iterates over all the entries in this cluster and collates information about each EST in the cluster. If this cluster has sub-clusters, then the classification is delegated to the sub-clusters and this node does not compute any additional classification information.

Parameters:
estList - The list of ESTs associated with the clusters. This information is used to compute aggregate classification data for each cluster.
pm - An optional progress monitor to be used to indicate progress as the data is computed.

getESTCount

public int getESTCount(boolean recursive)
Method to determine number of ESTs contained in this cluster node. This method can be used to compute the number of ESTs contained in this cluster node. This method can either recursively descend into the cluster hierarchy to determine number of ESTs or simply return number of ESTs in this specific node. This operation is controlled by the value specified for the recursive parameter.

Parameters:
recursive - If this parameter is true, then this method recursively descends down the cluster hierarchy to determine the number of ESTs in this specific node.
Returns:
The number of ESTs contained in this node.
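
The difference between the recursive and non-recursive counts can be sketched with a hypothetical stand-in class. CountSketch assumes that the non-recursive count covers only EST nodes that are direct children, while the recursive count also adds the ESTs of every sub-cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ClusterNode used to illustrate the two modes.
class CountSketch {
    final boolean isEst;
    final List<CountSketch> children = new ArrayList<>();

    CountSketch(boolean isEst, CountSketch... kids) {
        this.isEst = isEst;
        for (CountSketch kid : kids) {
            children.add(kid);
        }
    }

    static CountSketch estNode() {
        return new CountSketch(true);
    }

    static CountSketch cluster(CountSketch... kids) {
        return new CountSketch(false, kids);
    }

    // Counts direct EST children; descends into sub-clusters if recursive.
    int getESTCount(boolean recursive) {
        int count = 0;
        for (CountSketch child : children) {
            if (child.isEst) {
                count++;
            } else if (recursive) {
                count += child.getESTCount(true);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        CountSketch root = cluster(estNode(), cluster(estNode(), estNode()));
        System.out.println(root.getESTCount(false));  // 1
        System.out.println(root.getESTCount(true));   // 3
    }
}
```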

computeStatistics

public double[] computeStatistics()
Method to compute and return statistics about this cluster. This method can be used to determine aggregate statistics about this cluster node. Typically, calling this method on the root node provides summary statistics about the clusters contained in a cluster file.

Returns:
This method returns an array of doubles containing the following information about the set of clusters contained in this node: numClusters, minClusterSize, maxClusterSize, avgClusterSize, avgClusterSizeSD.
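
The layout of the returned array can be sketched as follows. StatsSketch is hypothetical: it takes the cluster sizes as a plain int array, uses the population standard deviation (the page does not say which variant is used), and returns all zeros for an empty input as an arbitrary choice.

```java
// Hypothetical sketch of the five summary values, in the documented order.
final class StatsSketch {
    private StatsSketch() {}

    // Returns {numClusters, minClusterSize, maxClusterSize, avgClusterSize, sd}.
    // Assumption: sd is the population standard deviation of the sizes.
    static double[] computeStatistics(int[] clusterSizes) {
        if (clusterSizes.length == 0) {
            return new double[] {0, 0, 0, 0, 0};
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        double sum = 0;
        for (int size : clusterSizes) {
            min = Math.min(min, size);
            max = Math.max(max, size);
            sum += size;
        }
        double avg = sum / clusterSizes.length;
        double squaredDeviations = 0;
        for (int size : clusterSizes) {
            squaredDeviations += (size - avg) * (size - avg);
        }
        double sd = Math.sqrt(squaredDeviations / clusterSizes.length);
        return new double[] {clusterSizes.length, min, max, avg, sd};
    }

    public static void main(String[] args) {
        double[] stats = computeStatistics(new int[] {2, 4, 4, 4, 5, 5, 7, 9});
        // Order: numClusters, min, max, avg, sd
        System.out.println(java.util.Arrays.toString(stats));
    }
}
```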