id3
Class DecisionTree

java.lang.Object
  |
  +--id3.CatGraph
        |
        +--id3.RootedCatGraph
              |
              +--id3.DecisionTree

public class DecisionTree
extends RootedCatGraph

DecisonTrees are RootedCatGraphs where each node other than the root has exactly one parent. The root has no parents.


Fields inherited from class id3.CatGraph
cGraph, defaultDistDisp, distDispHelp, logOptions
 
Constructor Summary
DecisionTree()
          Constructor.
DecisionTree(CGraph grph)
          Constructor.
 
Method Summary
 void assign_subtree_levels(Node node, int baseLevel)
          Creates NodeInfo objects for every Node in the branch starting at the given Node and assigns each NodeInfo its appropriate level in the tree.
 void delete_subtree(Node node, Node newNode)
          Removes a subtree recursively.
 void distribute_instances(Node subtree, InstanceList il, double pruningFactor, DoubleRef pessimisticErrors, int ldType, double leafDistParameter, double[] parentWeightDist)
          Distribute instances to a subtree.
 void distribute_instances(Node subtree, InstanceList il, double pruningFactor, DoubleRef pessimisticErrors, int ldType, double leafDistParameter, double[] parentWeightDist, boolean saveOriginalDistr)
          Distribute instances to a subtree.
 
Methods inherited from class id3.RootedCatGraph
display, get_root, get_root, num_nontrivial_leaves, num_nontrivial_leaves, num_nontrivial_nodes, num_nontrivial_nodes, set_root, set_used_attr, trivial_edge
 
Methods inherited from class id3.CatGraph
check_node_in_graph, connect, convertToDotFormat, create_node, get_categorizer, get_graph, get_log_level, get_log_options, get_log_stream, is_sparse, num_attr, num_children, num_leaves, num_nodes, process_DotGraph_display, process_DotPostscript_display, set_log_level, set_log_options, set_log_prefixes, set_log_stream
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DecisionTree

public DecisionTree()
Constructor.

DecisionTree

public DecisionTree(CGraph grph)
Constructor.
Parameters:
grph - The CGraph object to be used to maintain the DecisionTree.
Method Detail

distribute_instances

public void distribute_instances(Node subtree,
                                 InstanceList il,
                                 double pruningFactor,
                                 DoubleRef pessimisticErrors,
                                 int ldType,
                                 double leafDistParameter,
                                 double[] parentWeightDist)
Distribute instances to a subtree. This function is used whenever we replace a node with its child. The distributions of the child include only the instances there while if we replace, we must update all the counts. This function is also the backfitting function for decision trees.
Parameters:
subtree - The subtree over which Instances will be distributed.
il - InstanceList to be distributed over the DecisionTree.
pruningFactor - The amount of pruning to be done on this tree.
pessimisticErrors - Number of errors estimated for the new distribution.
ldType - Leaf Distribution Type.
leafDistParameter - The distribution of instances that reach this leaf node.
parentWeightDist - The weight distribution of the parent node.

distribute_instances

public void distribute_instances(Node subtree,
                                 InstanceList il,
                                 double pruningFactor,
                                 DoubleRef pessimisticErrors,
                                 int ldType,
                                 double leafDistParameter,
                                 double[] parentWeightDist,
                                 boolean saveOriginalDistr)
Distribute instances to a subtree. This function is used whenever we replace a node with its child. The distributions of the child include only the instances there while if we replace, we must update all the counts. This function is also the backfitting function for decision trees.
Parameters:
subtree - The subtree over which Instances will be distributed.
il - InstanceList to be distributed over the DecisionTree.
pruningFactor - The amount of pruning to be done on this tree.
pessimisticErrors - Number of errors estimated for the new distribution.
ldType - Leaf Distribution Type.
leafDistParameter - The distribution of instances that reach this leaf node.
parentWeightDist - The weight distribution of the parent node.
saveOriginalDistr - TRUE if the original instance distribution should be preserved, FALSE otherwise.

delete_subtree

public void delete_subtree(Node node,
                           Node newNode)
Removes a subtree recursively. This is used to
a) Remove a node and all nodes below it if the second parameter is NULL,
b) Remove just the nodes under a particular node, if both parameters are the same (the named node remains in the graph).
c) Replace the subtree rooted at the first parameter with the subtree rooted at the second parameter, if the two parameters are not equal, and are non-null.

We allow replacing node X with a child of node X (or a node related through comman ancestors) or, in general, replacing a subtree with another subtree. In both cases, we disconnect the parents of the new node node from the new node.

We do not allow replacing node X with an ancester (parent, etc.) of node X, as this would make no sense.

The method is as follows:

1) If 'node' is to be deleted, delete the edges connecting it to its parents.
2) If 'node' is to be replaced by 'newNode', delete the edges connecting 'newNode' to its parents.
3) Delete the edges from 'node' to all its children.
4) If 'node' is to be deleted, since it's now completely disconnected, delete it.
5) If 'node' is to be replaced by 'newNode',
5a) Connect all of 'newNode's children to 'node' (adding edges),
5b) Delete all the edges from 'newNode' to its children.
5c) Since 'newNode' is now completely disconnected, delete it.
6) For all the children discovered in step 3, recurse to delete them.

Parameters:
node - Node to be replaced.
newNode - New Node to be used for replacement.

assign_subtree_levels

public void assign_subtree_levels(Node node,
                                  int baseLevel)
Creates NodeInfo objects for every Node in the branch starting at the given Node and assigns each NodeInfo its appropriate level in the tree.
Parameters:
node - The base node where level assignment will start.
baseLevel - The initial level for the base Node.