shared
Class InstanceReader

java.lang.Object
  |
  +--shared.InstanceReader

public class InstanceReader
extends java.lang.Object

Provides a set of functions for reading a list of instances from a source that supplies a single instance at a time, attribute by attribute. Supports the exclusion of nominal attributes whose number of distinct values exceeds a set limit.


Field Summary
static int mapToLabel
          Special value for mapping operations.
static int unmapped
          Special value for mapping operations.
 
Constructor Summary
InstanceReader(InstanceList ownerList)
          Constructor.
InstanceReader(InstanceList ownerList, int limit)
          Constructor.
InstanceReader(InstanceList ownerList, int limit, boolean makeUnknown)
          Constructor.
InstanceReader(InstanceList ownerList, int limit, boolean makeUnknown, boolean allowUnknownLab)
          Constructor.
 
Method Summary
 Instance add_instance()
          Adds the instance to the list.
 Schema get_schema()
          Returns the Schema being used to read data.
 boolean has_list()
          Checks if this InstanceReader has an InstanceList to store Instances in.
 boolean is_labelled()
          Checks if the Instance being read is labelled.
 void match_values(java.lang.String name, NominalAttrInfo a1, NominalAttrInfo a2)
          Attempts to match values for two fixed value set nominals.
 InstanceList release_list()
          Releases the list we're building.
 void set_from_file(int attrNum, java.io.BufferedReader dataFile)
          Sets the value of an attribute from an MLJ format data file.
 void set_from_file(int attrNum, java.io.StreamTokenizer dataFile)
          Sets the value of an attribute from an MLJ format data file.
 void set_nominal(int attrNum, java.lang.String attrVal)
          Explicitly sets a nominal value.
 void set_real(int attrNum, double attrVal)
          Explicitly sets a real value.
 void set_unknown(int attrNum)
          Sets an attribute's value to unknown.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

unmapped

public static final int unmapped
Special value for mapping operations. Any integer value is valid.

mapToLabel

public static final int mapToLabel
Special value for mapping operations. Any integer value is valid.
Constructor Detail

InstanceReader

public InstanceReader(InstanceList ownerList)
Constructor. Builds an InstanceReader which can be used to construct instances for ownerList. The ownerList MUST have a FileSchema associated with it; this defines the form of all incoming data. The data will be ASSIMILATED to the form of ownerList's schema as it is read.
The limit parameter, on the overloads that accept it, specifies an optional limit on the number of distinct values allowed for any given attribute. If this limit is exceeded, the attribute in question will be projected out, and future incoming data for that attribute will be ignored.
The makeUnknown parameter, on the overloads that accept it, will, if TRUE, cause all attribute values not present in ownerList's schema to be converted to UNKNOWN.
NOTE: for reading test data, limit should be set to 0 and makeUnknown should be TRUE.
Parameters:
ownerList - The InstanceList in which Instances will be stored.

InstanceReader

public InstanceReader(InstanceList ownerList,
                      int limit)
Constructor. Builds an InstanceReader which can be used to construct instances for ownerList. The ownerList MUST have a FileSchema associated with it; this defines the form of all incoming data. The data will be ASSIMILATED to the form of ownerList's schema as it is read.
The limit parameter specifies an optional limit on the number of distinct values allowed for any given attribute. If this limit is exceeded, the attribute in question will be projected out, and future incoming data for that attribute will be ignored.
The makeUnknown parameter, on the overloads that accept it, will, if TRUE, cause all attribute values not present in ownerList's schema to be converted to UNKNOWN.
NOTE: for reading test data, limit should be set to 0 and makeUnknown should be TRUE.
Parameters:
ownerList - The InstanceList in which Instances will be stored.
limit - The maximum number of distinct values allowed for any one attribute.
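
The effect of limit described above can be illustrated with a minimal, self-contained sketch. It does not use the shared package: ValueLimitSketch, offer and isProjectedOut are hypothetical names, and treating a limit of 0 as "no limit" is an assumption inferred from the note about test data, not something the documentation states.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in, not part of the shared package: tracks the distinct
// values seen per attribute and projects out any attribute that exceeds a limit.
final class ValueLimitSketch {
    private final int limit; // assumption: 0 means "no limit"
    private final List<Set<String>> seen = new ArrayList<>();
    private final boolean[] projectedOut;

    ValueLimitSketch(int numAttrs, int limit) {
        this.limit = limit;
        this.projectedOut = new boolean[numAttrs];
        for (int i = 0; i < numAttrs; i++) {
            seen.add(new HashSet<>());
        }
    }

    // Records one incoming value. Returns false if the value was ignored
    // because the attribute is, or has just become, projected out.
    boolean offer(int attrNum, String value) {
        if (projectedOut[attrNum]) {
            return false; // future data for this attribute is ignored
        }
        Set<String> values = seen.get(attrNum);
        values.add(value);
        if (limit > 0 && values.size() > limit) {
            projectedOut[attrNum] = true; // limit exceeded: project out
            values.clear();
            return false;
        }
        return true;
    }

    boolean isProjectedOut(int attrNum) {
        return projectedOut[attrNum];
    }

    public static void main(String[] args) {
        ValueLimitSketch sketch = new ValueLimitSketch(2, 2);
        String[][] rows = { {"red", "a"}, {"blue", "b"}, {"red", "c"} };
        for (String[] row : rows) {
            for (int attr = 0; attr < row.length; attr++) {
                sketch.offer(attr, row[attr]);
            }
        }
        System.out.println(sketch.isProjectedOut(0)); // prints "false" (2 distinct values)
        System.out.println(sketch.isProjectedOut(1)); // prints "true" (3 distinct values)
    }
}
```

In this sketch an attribute stays projected out once the limit has been exceeded, which matches the statement that future incoming data for that attribute is ignored.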

InstanceReader

public InstanceReader(InstanceList ownerList,
                      int limit,
                      boolean makeUnknown)
Constructor. Builds an InstanceReader which can be used to construct instances for ownerList. The ownerList MUST have a FileSchema associated with it; this defines the form of all incoming data. The data will be ASSIMILATED to the form of ownerList's schema as it is read.
The limit parameter specifies an optional limit on the number of distinct values allowed for any given attribute. If this limit is exceeded, the attribute in question will be projected out, and future incoming data for that attribute will be ignored.
The makeUnknown parameter, if TRUE, will cause all attribute values not present in ownerList's schema to be converted to UNKNOWN.
NOTE: for reading test data, limit should be set to 0 and makeUnknown should be TRUE.
Parameters:
ownerList - The InstanceList in which Instances will be stored.
limit - The maximum number of distinct values allowed for any one attribute.
makeUnknown - TRUE if attribute values not present in ownerList's schema should be converted to UNKNOWN, FALSE otherwise.
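
The makeUnknown behaviour can be pictured with the following minimal sketch. It is a stand-alone illustration rather than the library's code: MakeUnknownSketch, resolve and the UNKNOWN sentinel are hypothetical, and rejecting an unseen value when makeUnknown is FALSE is an assumption, since the documentation does not say what happens in that case.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not part of the shared package.
final class MakeUnknownSketch {
    static final int UNKNOWN = -1; // hypothetical sentinel for an unknown value

    // Returns the index of value in the schema's value list. A value that is
    // absent from the schema maps to UNKNOWN when makeUnknown is true and is
    // rejected otherwise (the rejection is an assumption of this sketch).
    static int resolve(List<String> schemaValues, String value, boolean makeUnknown) {
        int index = schemaValues.indexOf(value);
        if (index >= 0) {
            return index;
        }
        if (makeUnknown) {
            return UNKNOWN;
        }
        throw new IllegalArgumentException("Value not in schema: " + value);
    }

    public static void main(String[] args) {
        List<String> colours = Arrays.asList("red", "blue");
        System.out.println(resolve(colours, "blue", true));  // prints "1"
        System.out.println(resolve(colours, "green", true)); // prints "-1"
    }
}
```

This is why the note recommends makeUnknown = TRUE for test data: values that never appeared in the training schema can still be read without changing that schema.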

InstanceReader

public InstanceReader(InstanceList ownerList,
                      int limit,
                      boolean makeUnknown,
                      boolean allowUnknownLab)
Constructor. Builds an InstanceReader which can be used to construct instances for ownerList. The ownerList MUST have a FileSchema associated with it; this defines the form of all incoming data. The data will be ASSIMILATED to the form of ownerList's schema as it is read.
The limit parameter specifies an optional limit on the number of distinct values allowed for any given attribute. If this limit is exceeded, the attribute in question will be projected out, and future incoming data for that attribute will be ignored.
The makeUnknown parameter, if TRUE, will cause all attribute values not present in ownerList's schema to be converted to UNKNOWN.
NOTE: for reading test data, limit should be set to 0 and makeUnknown should be TRUE.
Parameters:
ownerList - The InstanceList in which Instances will be stored.
limit - The maximum number of distinct values allowed for any one attribute.
makeUnknown - TRUE if attribute values not present in ownerList's schema should be converted to UNKNOWN, FALSE otherwise.
allowUnknownLab - TRUE if unknown labels are allowed, FALSE otherwise.
Method Detail

match_values

public void match_values(java.lang.String name,
                         NominalAttrInfo a1,
                         NominalAttrInfo a2)
Attempts to match values for two fixed value set nominals. Prints an error message on failure.
Parameters:
name - The name of the attribute.
a1 - The first nominal being compared.
a2 - The second nominal being compared.

release_list

public InstanceList release_list()
Releases the list we're building.
Returns:
The InstanceList being built by this InstanceReader.

set_from_file

public void set_from_file(int attrNum,
                          java.io.BufferedReader dataFile)
Sets the value of an attribute from an MLJ format data file.
Parameters:
attrNum - The number of the attribute being read.
dataFile - The BufferedReader reading the file.

set_from_file

public void set_from_file(int attrNum,
                          java.io.StreamTokenizer dataFile)
Sets the value of an attribute from an MLJ format data file.
Parameters:
attrNum - The number of the attribute being read.
dataFile - The StreamTokenizer reading from the file.
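
For readers unfamiliar with java.io.StreamTokenizer, the sketch below shows how its default syntax splits a line into numeric and word tokens. It is not the implementation of set_from_file: the class TokenizerSketch, the comma separator and the use of "?" as an unknown marker are assumptions made for illustration, since this page does not describe the MLJ file format.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of the shared package.
final class TokenizerSketch {
    // Classifies each token of a comma-separated line as real, nominal or
    // unknown. The separator and the "?" marker are assumptions.
    static List<String> readValues(String line) {
        StreamTokenizer tok = new StreamTokenizer(new StringReader(line));
        List<String> values = new ArrayList<>();
        try {
            while (tok.nextToken() != StreamTokenizer.TT_EOF) {
                switch (tok.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        values.add("real:" + tok.nval);    // numeric token
                        break;
                    case StreamTokenizer.TT_WORD:
                        values.add("nominal:" + tok.sval); // word token
                        break;
                    case '?':
                        values.add("unknown");             // assumed unknown marker
                        break;
                    default:
                        break;                             // separators such as ','
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // prints "[real:2.5, nominal:red, unknown]"
        System.out.println(readValues("2.5, red, ?"));
    }
}
```

The three branches correspond loosely to the explicit setters on this class (set_real, set_nominal and set_unknown), which take the attribute number and, where applicable, the value.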

add_instance

public Instance add_instance()
Adds the instance to the list. The instance must be fully constructed and must have its label set. Also, you may not add the same instance twice.
Returns:
The Instance being added.
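
The preconditions above (fully constructed instance, no double add) can be sketched as a small buffer that is filled attribute by attribute and cleared on each add. ReaderProtocolSketch and its methods are hypothetical and are not the shared package's classes; the label is treated here as just another slot, and throwing IllegalStateException on an incomplete instance is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, not part of the shared package.
final class ReaderProtocolSketch {
    private final String[] current;              // instance under construction
    private final List<String[]> list = new ArrayList<>();

    ReaderProtocolSketch(int numAttrs) {
        current = new String[numAttrs];
    }

    void set(int attrNum, String value) {
        current[attrNum] = value;
    }

    // Appends a copy of the current instance, then clears the buffer so the
    // same instance cannot be added a second time.
    String[] addInstance() {
        for (int i = 0; i < current.length; i++) {
            if (current[i] == null) {
                throw new IllegalStateException("Attribute " + i + " has not been set");
            }
        }
        String[] added = current.clone();
        list.add(added);
        Arrays.fill(current, null);
        return added;
    }

    int size() {
        return list.size();
    }

    public static void main(String[] args) {
        ReaderProtocolSketch reader = new ReaderProtocolSketch(2);
        reader.set(0, "red");
        reader.set(1, "yes");
        reader.addInstance();
        System.out.println(reader.size()); // prints "1"
    }
}
```

Calling addInstance a second time without setting every slot again fails in this sketch, which mirrors the rule that the same instance may not be added twice.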

is_labelled

public boolean is_labelled()
Checks if the Instance being read is labelled.
Returns:
TRUE if the Instance is labelled, FALSE otherwise.

get_schema

public Schema get_schema()
Returns the Schema being used to read data.
Returns:
The Schema containing details about the file being read.

has_list

public boolean has_list()
Checks if this InstanceReader has an InstanceList to store Instances in.
Returns:
TRUE if there is an InstanceList present, FALSE otherwise.

set_nominal

public void set_nominal(int attrNum,
                        java.lang.String attrVal)
Explicitly sets a nominal value. The attribute's type must support nominal values.
Parameters:
attrNum - The number of the attribute containing the nominal value.
attrVal - The value to be set as a nominal value.

set_real

public void set_real(int attrNum,
                     double attrVal)
Explicitly sets a real value. The attribute's type must support real values.
Parameters:
attrNum - The number of the attribute containing the real value.
attrVal - The value to be set as a real value.

set_unknown

public void set_unknown(int attrNum)
Sets an attribute's value to unknown. Works on any type.
Parameters:
attrNum - The number of the attribute for which the unknown value will be inserted.