Hitachi Vantara Pentaho Community Wiki

Package

weka.filters.supervised.instance

Synopsis

Produces a random subsample of a dataset using either sampling with replacement or without replacement.
The original dataset must fit entirely in memory and must have a nominal class attribute; if it does not, use the unsupervised version of the filter (weka.filters.unsupervised.instance.Resample) instead. The number of instances in the generated dataset may be specified as a percentage of the original. The filter can either maintain the class distribution in the subsample or bias it toward a uniform distribution. When used in batch mode (i.e. in the FilteredClassifier), subsequent batches are NOT resampled.
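
To make the trade-off between maintaining the class distribution and biasing it toward uniform more concrete, the sketch below blends each class's observed count with an equal share per non-empty class, using a bias factor between 0 (keep the observed distribution) and 1 (uniform). The class name `ClassBiasSketch`, the method `perClassSampleSizes`, and the rounding rule are assumptions made for illustration; this is not Weka's source code, and the filter's exact arithmetic may differ.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Weka.
final class ClassBiasSketch {

    // Returns a target number of instances to draw from each class.
    // bias = 0 keeps the observed class proportions, bias = 1 gives every
    // non-empty class the same share (assumed linear blend in between).
    static int[] perClassSampleSizes(int[] classCounts, double sampleSizePercent, double bias) {
        int total = 0;
        int nonEmpty = 0;
        for (int count : classCounts) {
            total += count;
            if (count > 0) {
                nonEmpty++;
            }
        }
        int[] sizes = new int[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            if (classCounts[i] == 0) {
                continue; // an empty class has nothing to draw from
            }
            double observed = classCounts[i];
            double uniform = (double) total / nonEmpty;
            double blended = (1.0 - bias) * observed + bias * uniform;
            sizes[i] = (int) Math.round(blended * sampleSizePercent / 100.0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        int[] counts = {90, 10};
        System.out.println(Arrays.toString(perClassSampleSizes(counts, 100.0, 0.0))); // [90, 10]
        System.out.println(Arrays.toString(perClassSampleSizes(counts, 100.0, 1.0))); // [50, 50]
        System.out.println(Arrays.toString(perClassSampleSizes(counts, 100.0, 0.5))); // [70, 30]
    }
}
```

Note that under this sketch a bias of 1 asks for 50 instances from a class that only has 10, which is only possible when sampling with replacement.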

Options

The table below describes the options available for Resample.

| Option | Description |
|---|---|
| biasToUniformClass | The bias toward a uniform class distribution. A value of 0 leaves the class distribution as-is; a value of 1 makes the class distribution uniform in the output data. |
| invertSelection | Inverts the selection (only applies when instances are drawn WITHOUT replacement). |
| noReplacement | Disables replacement, so each instance is drawn at most once. |
| randomSeed | Sets the random number seed for subsampling. |
| sampleSizePercent | The subsample size as a percentage of the original dataset. |
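
The interplay of sampleSizePercent, randomSeed, noReplacement, and invertSelection can be sketched on plain row indices, as below. This is a minimal illustration under stated assumptions, not Weka's implementation: the class `SubsampleSketch` and its methods are hypothetical, class labels are ignored, and the sample size is simply truncated to an integer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Weka.
final class SubsampleSketch {

    // Sampling WITH replacement: the same row index may be picked repeatedly.
    static List<Integer> withReplacement(int numRows, double sampleSizePercent, long seed) {
        int sampleSize = (int) (numRows * sampleSizePercent / 100.0);
        Random random = new Random(seed);
        List<Integer> picked = new ArrayList<>(sampleSize);
        for (int i = 0; i < sampleSize; i++) {
            picked.add(random.nextInt(numRows));
        }
        return picked;
    }

    // Sampling WITHOUT replacement: shuffle all indices and keep a prefix.
    // With invert = true, the rows that were NOT selected are returned instead.
    static List<Integer> withoutReplacement(int numRows, double sampleSizePercent,
                                            long seed, boolean invert) {
        // Cannot draw more distinct rows than exist.
        int sampleSize = Math.min(numRows, (int) (numRows * sampleSizePercent / 100.0));
        List<Integer> indices = new ArrayList<>(numRows);
        for (int i = 0; i < numRows; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        return invert
                ? new ArrayList<>(indices.subList(sampleSize, numRows))
                : new ArrayList<>(indices.subList(0, sampleSize));
    }

    public static void main(String[] args) {
        System.out.println("with replacement:    " + withReplacement(10, 50.0, 1L));
        System.out.println("without replacement: " + withoutReplacement(10, 30.0, 1L, false));
        System.out.println("inverted selection:  " + withoutReplacement(10, 30.0, 1L, true));
    }
}
```

Because both calls to `withoutReplacement` use the same seed, the selected and inverted lists together cover every row exactly once. This also suggests why inverting only makes sense without replacement: it is the only mode in which the rows split cleanly into a selected and an unselected set.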

Capabilities

The table below describes the capabilities of Resample.

| Capability | Supported |
|---|---|
| Class | Binary class, Nominal class |
| Attributes | Missing values, String attributes, Numeric attributes, Empty nominal attributes, Binary attributes, Unary attributes, Nominal attributes, Date attributes, Relational attributes |
| Min # of instances | 0 |
