Hitachi Vantara Pentaho Community Wiki
OneClassClassifier

Package

weka.classifiers.meta

Synopsis

Performs one-class classification on a dataset.

The classifier reduces the problem to a single target class and learns from that class alone, without using any information from other classes. At test time, each instance is classified as either 'target' or 'outlier', so to calculate the outlier pass rate the dataset must contain instances from more than one class.

The output also depends on whether the label 'outlier' exists in the instances used to build the classifier. If it does, 'outlier' will be predicted; if not, the label will be treated as missing whenever the prediction does not favour the target class. Even if the dataset contains instances of the 'outlier' class, they are not used to build the model. The label serves simply as a flag, so you do not need to relabel any classes.
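The labelling rule described above can be illustrated with a small standalone sketch. It does not use the Weka API; the class name OutlierLabelSketch and the method resolveLabel are hypothetical, and null stands in for a missing prediction. It is only a reading of the behaviour described here, not Weka's actual code.

```java
import java.util.Arrays;
import java.util.List;

public class OutlierLabelSketch {

    // Hypothetical helper: returns the label to report for one prediction,
    // or null to stand for a missing prediction.
    public static String resolveLabel(boolean favoursTarget,
                                      String targetLabel,
                                      List<String> classLabels) {
        if (favoursTarget) {
            return targetLabel;
        }
        // The 'outlier' label acts purely as a flag: it is only reported
        // when it already exists among the class labels.
        return classLabels.contains("outlier") ? "outlier" : null;
    }

    public static void main(String[] args) {
        List<String> withFlag = Arrays.asList("setosa", "outlier");
        List<String> withoutFlag = Arrays.asList("setosa", "versicolor");

        System.out.println(resolveLabel(true, "setosa", withFlag));     // setosa
        System.out.println(resolveLabel(false, "setosa", withFlag));    // outlier
        System.out.println(resolveLabel(false, "setosa", withoutFlag)); // null (missing)
    }
}
```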

For more information, see:

Kathryn Hempstalk, Eibe Frank, Ian H. Witten: One-Class Classification by Combining Density and Class Probability Estimation. In: Proceedings of the 12th European Conference on Principles and Practice of Knowledge Discovery in Databases and 19th European Conference on Machine Learning, ECMLPKDD2008, Berlin, 505--519, 2008.

Options

The table below describes the options available for OneClassClassifier.

Option | Description
classifier | The base classifier to be used.
debug | If set to true, the classifier may output additional information to the console.
densityOnly | If true, the density estimate will be used by itself.
nominalGenerator | The nominal data generator to use.
numRepeats | The number of repeats for (internal) cross-validation.
numericGenerator | The numeric data generator to use.
percentageHeldout | The percentage of data that will be held out in each iteration of (internal) cross-validation.
proportionGenerated | The proportion of data that will be generated, relative to the data with the target class label.
seed | The random number seed to be used.
targetClassLabel | The class label to perform one-class classification on.
targetRejectionRate | The target rejection rate, i.e., the proportion of target class samples that will be rejected in order to build a threshold.
useInstanceWeights | If true, the weighting on instances is based on their prevalence in the data.
useLaplaceCorrection | If true, Laplace correction will be used. This reduces the number of class labels to two (target and outlier), regardless of how many class labels actually exist. It is useful for base classifiers that use the number of class labels to compute a Laplace value based on the unseen class.
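To make the targetRejectionRate option more concrete, here is a minimal sketch of how a rejection rate could be turned into a decision threshold. It assumes each held-out target instance already has a score where higher means "more target-like". The class RejectionThresholdSketch and its methods are hypothetical and do not come from Weka; the real classifier derives its threshold through internal cross-validation (see numRepeats and percentageHeldout), and its details may differ.

```java
import java.util.Arrays;

public class RejectionThresholdSketch {

    // Picks a threshold so that roughly targetRejectionRate of the given
    // target-class scores fall strictly below it (and would be rejected).
    public static double thresholdFor(double[] targetScores, double targetRejectionRate) {
        if (targetScores.length == 0) {
            throw new IllegalArgumentException("need at least one score");
        }
        double[] sorted = targetScores.clone();
        Arrays.sort(sorted);
        // Number of lowest-scoring target instances to reject.
        int rejected = (int) Math.floor(targetRejectionRate * sorted.length);
        rejected = Math.min(rejected, sorted.length - 1);
        return sorted[rejected];
    }

    // An instance is accepted as 'target' when its score reaches the threshold.
    public static boolean isTarget(double score, double threshold) {
        return score >= threshold;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.2, 0.7, 0.4, 0.8, 0.1, 0.6, 0.5};
        double threshold = thresholdFor(scores, 0.25);
        System.out.println(threshold);                // 0.4
        System.out.println(isTarget(0.2, threshold)); // false
        System.out.println(isTarget(0.6, threshold)); // true
    }
}
```

With eight scores and a rate of 0.25, the two lowest scores (0.1 and 0.2) fall below the threshold, so 25% of the target samples are rejected.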

Capabilities

The table below describes the capabilities of OneClassClassifier.

Capability | Supported
Class | Nominal class, Binary class, Missing class values
Attributes | Missing values, Binary attributes, Date attributes, Numeric attributes, Empty nominal attributes, Nominal attributes, Unary attributes
Min # of instances | 1
