Hitachi Vantara Pentaho Community Wiki

Produces a random subsample of a dataset using either sampling with replacement or without replacement. The original dataset must fit entirely in memory. The number of instances in the generated dataset may be specified. When used in batch mode, subsequent batches are NOT resampled.
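The difference between the two sampling modes can be illustrated with a minimal sketch. This is not the filter's actual implementation: the function name `resample`, its parameters, and the truncating rounding of the sample size are assumptions made for illustration only.

```python
import random

def resample(rows, sample_size_percent=100.0, seed=1, replacement=True):
    """Return a random subsample of rows (hypothetical helper, not Weka code)."""
    rng = random.Random(seed)
    # Assumption: the target size is truncated to a whole number of rows.
    n = int(len(rows) * sample_size_percent / 100.0)
    if replacement:
        # Each draw is independent, so the same row may be picked repeatedly.
        return [rows[rng.randrange(len(rows))] for _ in range(n)]
    # Without replacement, each row can appear at most once,
    # so the result can never be larger than the input.
    return rng.sample(rows, min(n, len(rows)))
```

With replacement the output may contain duplicates and may even be larger than the input (for example at 200 percent); without replacement it is always a subset of the original rows.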


The table below describes the options available for Resample.

Option               Description
invertSelection      Inverts the selection (only if instances are drawn WITHOUT replacement).
noReplacement        Disables the replacement of instances.
randomSeed           The seed used for random sampling.
sampleSizePercent    Size of the subsample as a percentage of the original dataset.
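As a rough sketch of how these four options might interact, the following hypothetical function mirrors them as keyword arguments. It is an illustration under stated assumptions, not the filter's code: the function name, the default values, the truncated sample size, and the preservation of the original row order are all assumed here.

```python
import random

def resample_with_options(rows, sample_size_percent=100.0, random_seed=1,
                          no_replacement=False, invert_selection=False):
    """Hypothetical illustration of the Resample options (not Weka code)."""
    rng = random.Random(random_seed)
    # Assumption: the target size is truncated to a whole number of rows.
    n = int(len(rows) * sample_size_percent / 100.0)
    if not no_replacement:
        # invert_selection has no effect when drawing with replacement.
        return [rows[rng.randrange(len(rows))] for _ in range(n)]
    n = min(n, len(rows))
    # Pick n distinct row indices, then keep either them or their complement.
    chosen = set(rng.sample(range(len(rows)), n))
    if invert_selection:
        return [row for i, row in enumerate(rows) if i not in chosen]
    return [row for i, row in enumerate(rows) if i in chosen]
```

In this sketch, running the function twice with the same seed and percentage, once with `invert_selection=False` and once with `invert_selection=True`, yields two disjoint subsets that together cover the whole dataset, which is one way such an option could be used to build a train/test split.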


The table below describes the capabilities of Resample.

Capability            Supported
Class                 Date class, Binary class, Missing class values, No class, Nominal class, String class, Empty nominal class, Relational class, Unary class, Numeric class
Attributes            Missing values, String attributes, Numeric attributes, Empty nominal attributes, Binary attributes, Unary attributes, Nominal attributes, Date attributes, Relational attributes
Min # of instances    0
