Produces a random subsample of a dataset by sampling either with or without replacement.
The original dataset must fit entirely in memory. The size of the generated dataset can be specified as a percentage of the original. The dataset must have a nominal class attribute; if it does not, use the unsupervised version of the filter. The filter can either maintain the original class distribution in the subsample or bias the class distribution toward a uniform one. When used in batch mode (for example, inside the FilteredClassifier), only the first batch is resampled; subsequent batches are NOT resampled.
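The trade-off between keeping the class distribution and making it uniform can be illustrated with a small, self-contained sketch. It assumes the bias is a linear blend between each class's original count and an equal share of all instances; that formula, the class name `BiasToUniformSketch`, and the method `perClassSampleSizes` are illustrative assumptions, not taken from the filter's source.

```java
import java.util.Arrays;

// Hypothetical sketch: target number of instances per class for a given bias.
// The linear blend below is an assumption about how the bias could be applied.
class BiasToUniformSketch {

    static int[] perClassSampleSizes(int[] classCounts,
                                     double sampleSizePercent,
                                     double biasToUniformClass) {
        int total = 0;
        int activeClasses = 0; // classes that have at least one instance
        for (int count : classCounts) {
            total += count;
            if (count > 0) {
                activeClasses++;
            }
        }

        int[] sizes = new int[classCounts.length];
        for (int i = 0; i < classCounts.length; i++) {
            if (classCounts[i] == 0) {
                continue; // nothing to draw from an empty class
            }
            // bias = 0 keeps the original count, bias = 1 gives an equal share.
            double blended = (1.0 - biasToUniformClass) * classCounts[i]
                    + biasToUniformClass * total / activeClasses;
            sizes[i] = (int) (sampleSizePercent / 100.0 * blended);
        }
        return sizes;
    }

    public static void main(String[] args) {
        int[] counts = {90, 10};
        for (double bias : new double[] {0.0, 0.5, 1.0}) {
            System.out.println(bias + " -> "
                    + Arrays.toString(perClassSampleSizes(counts, 100.0, bias)));
        }
    }
}
```

Under this assumption, class counts of 90 and 10 at 100 percent yield targets of 90 and 10 for a bias of 0, 70 and 30 for a bias of 0.5, and 50 and 50 for a bias of 1. The sketch only computes target sizes; drawing 50 instances from a class of 10 is only possible when sampling with replacement.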
The table below describes the options available for Resample.
|Option|Description|
|---|---|
|biasToUniformClass|How strongly to bias the class distribution toward uniform. A value of 0 leaves the class distribution as-is; a value of 1 ensures the class distribution is uniform in the output data.|
|invertSelection|Inverts the selection (only if instances are drawn WITHOUT replacement).|
|noReplacement|Draws instances without replacement.|
|randomSeed|Sets the random number seed for subsampling.|
|sampleSizePercent|The subsample size as a percentage of the original set.|
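How noReplacement, invertSelection, randomSeed, and sampleSizePercent interact can be sketched in plain Java. This is a simplified illustration that ignores the class attribute and selects row indices only; the class `SubsampleSketch` and its method `sampleIndices` are hypothetical names, and the shuffle-then-split approach is an assumption rather than the filter's documented implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of index selection; not the filter's actual code.
class SubsampleSketch {

    static List<Integer> sampleIndices(int numInstances,
                                       double sampleSizePercent,
                                       boolean noReplacement,
                                       boolean invertSelection,
                                       long randomSeed) {
        Random random = new Random(randomSeed); // same seed, same subsample
        int sampleSize = (int) (numInstances * sampleSizePercent / 100.0);
        List<Integer> selected = new ArrayList<>();

        if (!noReplacement) {
            // With replacement: the same index may be drawn more than once.
            for (int i = 0; i < sampleSize; i++) {
                selected.add(random.nextInt(numInstances));
            }
            return selected;
        }

        // Without replacement: shuffle all indices and split at sampleSize.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, random);
        sampleSize = Math.min(sampleSize, numInstances);

        if (invertSelection) {
            // Inverted: keep exactly the instances that were not selected.
            selected.addAll(indices.subList(sampleSize, numInstances));
        } else {
            selected.addAll(indices.subList(0, sampleSize));
        }
        return selected;
    }

    public static void main(String[] args) {
        System.out.println(sampleIndices(10, 30.0, true, false, 1L));
        System.out.println(sampleIndices(10, 30.0, true, true, 1L));
    }
}
```

With the same seed, the normal and inverted selections in this sketch are complementary, which is one way to obtain a reproducible train/holdout split. Inversion is only meaningful without replacement, because a sample drawn with replacement has no well-defined complement.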
The table below describes the capabilities of Resample.
|Capability|Supported|
|---|---|
|Class|Binary class, Nominal class|
|Attributes|Missing values, String attributes, Numeric attributes, Empty nominal attributes, Binary attributes, Unary attributes, Nominal attributes, Date attributes, Relational attributes|
|Min # of instances|0|