shared
Class Categorizer

java.lang.Object
  |
  +--shared.Globals
        |
        +--shared.Categorizer
Direct Known Subclasses:
BadCategorizer, ConstCategorizer, NaiveBayesCat, NodeCategorizer, RDGCategorizer, TableCategorizer

public abstract class Categorizer
extends Globals
implements java.lang.Cloneable

Abstract base class for Categorizers. The number of categories must be strictly positive (greater than zero), and the description cannot be empty or NULL.


Field Summary
static int CATEGORIZER_ID_BASE
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_ATTR_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_ATTR_EQ_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_ATTR_SUBSET_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_BAD_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_BAGGING_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_CASCADE_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_CLUSTER_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_CONST_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_CONSTRUCT_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_DISC_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_DISC_NODE_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_DTREE_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_IB_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_LAZYDT_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_LEAF_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_LINDISCR_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_MAJORITY_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_MULTI_SPLIT_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_MULTITHRESH_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_NB_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_ODT_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_ONE_R_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_OPTION_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_PROJECT_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_RDG_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_STACKING_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_TABLE_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
static int CLASS_THRESHOLD_CATEGORIZER
          Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.
protected  LogOptions logOptions
          Logging options for this class.
 
Fields inherited from class shared.Globals
badCategorizer, CONFIDENCE_INTERVAL_Z, DBG, DEFAULT_DATA_EXT, DEFAULT_EPSILON, DEFAULT_EVAL_LIMIT, DEFAULT_LAMBDA, DEFAULT_MAX_EVALS, DEFAULT_MAX_STALE, DEFAULT_MIN_EXP_EVALS, DEFAULT_NAMES_EXT, DEFAULT_SAS_SEED, DEFAULT_SEARCH_METHOD, DEFAULT_SHOW_TEST_SET_PERF, DEFAULT_TEST_EXT, DISPLAY_NAMES, EMPTY_STRING, FIRST_CATEGORY_VAL, FIRST_NOMINAL_VAL, LEFT_NODE, MAX_NUM_CATEGORIES, Mcerr, Mcout, optionServer, optionsFileName, REAL_MAX, RIGHT_NODE, SHOW_TEST_SET_PERF_HELP, SINGLE_QUOTE, STORED_REAL_MAX, TS, UNDEFINED_INT, UNDEFINED_REAL, UNDEFINED_VARIANCE, UNKNOWN_AUG_CATEGORY, UNKNOWN_CATEGORY_VAL, UNKNOWN_NODE, UNKNOWN_NOMINAL_VAL, UNKNOWN_STORED_REAL_VAL, UNKNOWN_VAL_STR
 
Constructor Summary
Categorizer(int noCat, java.lang.String dscr, Schema sch)
          Constructs a Categorizer with the given number of categories, description, and Schema.
 
Method Summary
 void build_distr(InstanceList instList)
          Builds a weight distribution based on the given InstanceList.
abstract  AugCategory categorize(Instance IRC)
          Categorizes the given Instance.
 java.lang.Object clone()
          Clones this Categorizer.
 java.lang.String description()
          Returns the description of this Categorizer.
abstract  void display_struct(java.io.BufferedWriter stream, DisplayPref dp)
          Displays the structure of the Categorizer.
 double[] get_distr()
          Returns the weight distribution for this Categorizer.
 int get_log_level()
          Returns the logging level for this object.
 LogOptions get_log_options()
          Returns the LogOptions object for this object.
 java.io.Writer get_log_stream()
          Returns the stream to which logs for this object are written.
 Schema get_schema()
          Returns the Schema for data to be categorized.
 boolean has_distr()
          Checks if this Categorizer has a weight distribution.
 int num_categories()
          Returns the number of categories.
 CatDist score(Instance IRC)
          Returns the CatDist containing the weighted distribution score for the given Instance.
 void set_description(java.lang.String val)
          Sets the description of this Categorizer.
 void set_distr(double[] val)
          Sets the weight distribution to the given distribution.
 void set_log_level(int level)
          Sets the logging level for this object.
 void set_log_options(LogOptions opt)
          Sets the LogOptions object for this object.
 void set_log_prefixes(java.lang.String file, int line, int lvl1, int lvl2)
          Sets the logging message prefix for this object.
 void set_log_stream(java.io.Writer strm)
          Sets the stream to which logs for this object are written.
 void set_original_distr(double[] dist)
          Sets the original weight distribution to the given distribution.
 void set_used_attr(boolean[] barray)
          Sets the attributes used for this Categorizer.
 boolean supports_scoring()
          Checks if this Categorizer supports scoring.
 double total_weight()
          Returns the total weight of the Instances categorized.
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CATEGORIZER_ID_BASE

public static final int CATEGORIZER_ID_BASE
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_CONST_CATEGORIZER

public static final int CLASS_CONST_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_MULTITHRESH_CATEGORIZER

public static final int CLASS_MULTITHRESH_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_THRESHOLD_CATEGORIZER

public static final int CLASS_THRESHOLD_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_ATTR_CATEGORIZER

public static final int CLASS_ATTR_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_RDG_CATEGORIZER

public static final int CLASS_RDG_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_BAD_CATEGORIZER

public static final int CLASS_BAD_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_TABLE_CATEGORIZER

public static final int CLASS_TABLE_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_IB_CATEGORIZER

public static final int CLASS_IB_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_LAZYDT_CATEGORIZER

public static final int CLASS_LAZYDT_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_NB_CATEGORIZER

public static final int CLASS_NB_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_PROJECT_CATEGORIZER

public static final int CLASS_PROJECT_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_DISC_CATEGORIZER

public static final int CLASS_DISC_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_ATTR_EQ_CATEGORIZER

public static final int CLASS_ATTR_EQ_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_DTREE_CATEGORIZER

public static final int CLASS_DTREE_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_BAGGING_CATEGORIZER

public static final int CLASS_BAGGING_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_LINDISCR_CATEGORIZER

public static final int CLASS_LINDISCR_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_CASCADE_CATEGORIZER

public static final int CLASS_CASCADE_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_STACKING_CATEGORIZER

public static final int CLASS_STACKING_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_ATTR_SUBSET_CATEGORIZER

public static final int CLASS_ATTR_SUBSET_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_MULTI_SPLIT_CATEGORIZER

public static final int CLASS_MULTI_SPLIT_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_ONE_R_CATEGORIZER

public static final int CLASS_ONE_R_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_CONSTRUCT_CATEGORIZER

public static final int CLASS_CONSTRUCT_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_LEAF_CATEGORIZER

public static final int CLASS_LEAF_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_DISC_NODE_CATEGORIZER

public static final int CLASS_DISC_NODE_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_MAJORITY_CATEGORIZER

public static final int CLASS_MAJORITY_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_ODT_CATEGORIZER

public static final int CLASS_ODT_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_CLUSTER_CATEGORIZER

public static final int CLASS_CLUSTER_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.

CLASS_OPTION_CATEGORIZER

public static final int CLASS_OPTION_CATEGORIZER
Deprecated. The Java instanceof operator should be used instead of the numerical class identity system.

Class identification number.
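
The deprecation notes above recommend the Java instanceof operator over the numerical class identity constants. The following is a minimal sketch of that style of type check. The classes in it are empty stand-ins that only borrow names from the Direct Known Subclasses list, and the kindOf helper is hypothetical; none of it is the real shared.Categorizer hierarchy.

```java
// Sketch only: the classes below are empty stand-ins, not the real
// shared.Categorizer hierarchy.
class InstanceofDemo {
    // Hypothetical helper: returns a short label for the concrete type,
    // using instanceof rather than a numeric class identifier.
    static String kindOf(Categorizer c) {
        if (c instanceof ConstCategorizer) {
            return "const";
        } else if (c instanceof TableCategorizer) {
            return "table";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(kindOf(new ConstCategorizer())); // prints "const"
        System.out.println(kindOf(new TableCategorizer())); // prints "table"
    }
}

abstract class Categorizer { }

class ConstCategorizer extends Categorizer { }

class TableCategorizer extends Categorizer { }
```

Because instanceof also matches subclasses, a check of this kind keeps working when a new subclass of one of the tested types is added, which a comparison against a fixed integer constant would not.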

logOptions

protected LogOptions logOptions
Logging options for this class.
Constructor Detail

Categorizer

public Categorizer(int noCat,
                   java.lang.String dscr,
                   Schema sch)
Constructs a Categorizer with the given number of categories, description, and Schema.
Parameters:
noCat - The number of categories for labelling.
dscr - The description of this Categorizer object.
sch - The Schema for the data to be categorized.
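
The class description states two preconditions on these arguments: the number of categories must be strictly positive, and the description cannot be empty or NULL. The sketch below shows one way a caller could check them before constructing a Categorizer. The checkArgs and isValid helpers are hypothetical, and the use of IllegalArgumentException is an assumption; this page does not say how the real constructor reports a violation.

```java
// Sketch only: checkArgs and isValid are hypothetical helpers, and the
// exception type is an assumption, not documented Categorizer behavior.
class CategorizerArgsDemo {
    // Checks the two preconditions stated in the class description.
    static void checkArgs(int noCat, String dscr) {
        if (noCat <= 0) {
            throw new IllegalArgumentException(
                "number of categories must be > 0, got " + noCat);
        }
        if (dscr == null || dscr.isEmpty()) {
            throw new IllegalArgumentException(
                "description must not be null or empty");
        }
    }

    // Convenience wrapper that reports validity as a boolean.
    static boolean isValid(int noCat, String dscr) {
        try {
            checkArgs(noCat, dscr);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(2, "two-class categorizer")); // prints "true"
        System.out.println(isValid(0, "no categories"));         // prints "false"
        System.out.println(isValid(2, ""));                      // prints "false"
    }
}
```
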
Method Detail

set_log_level

public void set_log_level(int level)
Sets the logging level for this object.
Parameters:
level - The new logging level.

get_log_level

public int get_log_level()
Returns the logging level for this object.
Returns:
The logging level for this object.

set_log_stream

public void set_log_stream(java.io.Writer strm)
Sets the stream to which logs for this object are written.
Parameters:
strm - The stream to which logs will be written.

get_log_stream

public java.io.Writer get_log_stream()
Returns the stream to which logs for this object are written.
Returns:
The stream to which logs for this object are written.

get_log_options

public LogOptions get_log_options()
Returns the LogOptions object for this object.
Returns:
The LogOptions object for this object.

set_log_options

public void set_log_options(LogOptions opt)
Sets the LogOptions object for this object.
Parameters:
opt - The new LogOptions object.

set_log_prefixes

public void set_log_prefixes(java.lang.String file,
                             int line,
                             int lvl1,
                             int lvl2)
Sets the logging message prefix for this object.
Parameters:
file - The file name to be displayed in the prefix of log messages.
line - The line number to be displayed in the prefix of log messages.
lvl1 - The log level of the statement being logged.
lvl2 - The level of log messages being displayed.

num_categories

public int num_categories()
Returns the number of categories.
Returns:
The number of categories.

description

public java.lang.String description()
Returns the description of this Categorizer.
Returns:
The description of this Categorizer.

clone

public java.lang.Object clone()
Clones this Categorizer.
Overrides:
clone in class java.lang.Object
Returns:
The clone of this Categorizer.

set_used_attr

public void set_used_attr(boolean[] barray)
Sets the attributes used for this Categorizer. Displays an error message for this Categorizer class or subclass.
Parameters:
barray - A boolean array representing the attributes. TRUE indicates the attribute should be included in the categorization process.

set_description

public void set_description(java.lang.String val)
Sets the description of this Categorizer.
Parameters:
val - The new description.

has_distr

public boolean has_distr()
Checks if this Categorizer has a weight distribution.
Returns:
TRUE if there is a weight distribution for this Categorizer, FALSE otherwise.

get_distr

public double[] get_distr()
Returns the weight distribution for this Categorizer.
Returns:
The weight distribution for this Categorizer.

total_weight

public double total_weight()
Returns the total weight of the Instances categorized.
Returns:
The total weight of the Instances categorized.

get_schema

public Schema get_schema()
Returns the Schema for data to be categorized.
Returns:
The Schema for data to be categorized.

set_original_distr

public void set_original_distr(double[] dist)
Sets the original weight distribution to the given distribution.
Parameters:
dist - The new original weight distribution.

display_struct

public abstract void display_struct(java.io.BufferedWriter stream,
                                    DisplayPref dp)
Displays the structure of the Categorizer.
Parameters:
stream - The output stream to be written to.
dp - The preferences for display.
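
Since display_struct writes to a java.io.BufferedWriter, its output can be captured in a String by wrapping a java.io.StringWriter, provided the buffer is flushed before the result is read. The sketch below illustrates this with stand-in types: DisplayPref is an empty placeholder, TwoWayCategorizer is a hypothetical subclass, and structToString is a hypothetical helper. Only the method signature is taken from this page.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

// Sketch only: DisplayPref, Categorizer and TwoWayCategorizer are
// stand-ins that mirror the documented signature, nothing more.
class DisplayStructDemo {
    // Hypothetical helper: captures the displayed structure as a String.
    static String structToString(Categorizer c, DisplayPref dp) {
        StringWriter sink = new StringWriter();
        BufferedWriter stream = new BufferedWriter(sink);
        c.display_struct(stream, dp);
        try {
            // Without this flush the text may still sit in the buffer.
            stream.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            structToString(new TwoWayCategorizer(), new DisplayPref()));
    }
}

class DisplayPref { }

abstract class Categorizer {
    abstract void display_struct(BufferedWriter stream, DisplayPref dp);
}

class TwoWayCategorizer extends Categorizer {
    @Override
    void display_struct(BufferedWriter stream, DisplayPref dp) {
        try {
            stream.write("TwoWayCategorizer: 2 categories");
        } catch (IOException e) {
            // The documented signature declares no checked exception,
            // so the stand-in rethrows as an unchecked one.
            throw new UncheckedIOException(e);
        }
    }
}
```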

categorize

public abstract AugCategory categorize(Instance IRC)
Categorizes the given Instance.
Parameters:
IRC - The Instance to be categorized.
Returns:
The category this Instance is labelled as.

score

public CatDist score(Instance IRC)
Returns the CatDist containing the weighted distribution score for the given Instance. In the Categorizer class itself, this method displays an error message, because the base class does not support scoring (see supports_scoring).
Parameters:
IRC - The Instance to be scored.
Returns:
The CatDist containing the weighted distribution.

supports_scoring

public boolean supports_scoring()
Checks if this Categorizer supports scoring.
Returns:
FALSE for the Categorizer class.
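
Because score displays an error in the base class, a caller would typically consult supports_scoring before calling it and fall back to categorize otherwise. The sketch below shows that guard. All types in it are minimal stand-ins, the label fields and the classify helper are hypothetical, and the stand-in score throws an exception where the documented method displays an error message.

```java
// Sketch only: every class here is a minimal stand-in for the real
// shared.* types, which carry far more state.
class ScoringDemo {
    // Hypothetical helper: uses score() only when the categorizer says
    // it supports scoring, and falls back to categorize() otherwise.
    static String classify(Categorizer c, Instance inst) {
        if (c.supports_scoring()) {
            CatDist dist = c.score(inst);
            return "scored:" + dist.label;
        }
        AugCategory cat = c.categorize(inst);
        return "categorized:" + cat.label;
    }

    public static void main(String[] args) {
        Instance inst = new Instance();
        System.out.println(classify(new PlainCategorizer(), inst));   // prints "categorized:a"
        System.out.println(classify(new ScoringCategorizer(), inst)); // prints "scored:a"
    }
}

class Instance { }

class AugCategory {
    final String label;
    AugCategory(String label) { this.label = label; }
}

class CatDist {
    final String label;
    CatDist(String label) { this.label = label; }
}

abstract class Categorizer {
    abstract AugCategory categorize(Instance inst);

    // Assumption: the stand-in throws, whereas the documented base
    // class "displays an error message".
    CatDist score(Instance inst) {
        throw new UnsupportedOperationException("scoring not supported");
    }

    boolean supports_scoring() { return false; }
}

class PlainCategorizer extends Categorizer {
    @Override AugCategory categorize(Instance inst) { return new AugCategory("a"); }
}

class ScoringCategorizer extends Categorizer {
    @Override AugCategory categorize(Instance inst) { return new AugCategory("a"); }
    @Override CatDist score(Instance inst) { return new CatDist("a"); }
    @Override boolean supports_scoring() { return true; }
}
```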

build_distr

public void build_distr(InstanceList instList)
Builds a weight distribution based on the given InstanceList.
Parameters:
instList - The InstanceList whose weight distribution is to be calculated.

set_distr

public void set_distr(double[] val)
Sets the weight distribution to the given distribution.
Parameters:
val - The new distribution.
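
As an illustration of how has_distr, get_distr and set_distr might be used together, the sketch below checks for a distribution before reading it and then scales the weights to sum to 1. The Categorizer here is a non-abstract stand-in that only stores an array; the normalized helper is hypothetical, and treating the absence of a distribution as a null field, as well as copying the array in set_distr, are assumptions rather than documented behavior.

```java
import java.util.Arrays;

// Sketch only: this Categorizer is a stand-in that just stores an array.
class DistrDemo {
    // Hypothetical helper: returns the weights scaled to sum to 1, or an
    // empty array when there is no distribution or the total is not positive.
    static double[] normalized(Categorizer c) {
        if (!c.has_distr()) {
            return new double[0];
        }
        double[] distr = c.get_distr();
        double total = 0.0;
        for (double w : distr) {
            total += w;
        }
        if (total <= 0.0) {
            return new double[0];
        }
        double[] out = new double[distr.length];
        for (int i = 0; i < distr.length; i++) {
            out[i] = distr[i] / total;
        }
        return out;
    }

    public static void main(String[] args) {
        Categorizer c = new Categorizer();
        System.out.println(Arrays.toString(normalized(c))); // prints "[]"
        c.set_distr(new double[] {1.0, 3.0});
        System.out.println(Arrays.toString(normalized(c))); // prints "[0.25, 0.75]"
    }
}

class Categorizer {
    // Assumption: no distribution is represented as null.
    private double[] distr;

    boolean has_distr() { return distr != null; }

    double[] get_distr() { return distr; }

    // Assumption: the stand-in stores a copy of the given array.
    void set_distr(double[] val) { distr = val.clone(); }
}
```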