Hitachi Vantara Pentaho Community Wiki

Package

weka.attributeSelection

Synopsis

SubsetSizeForwardSelection:

Extension of LinearForwardSelection. The search performs an interior cross-validation (the seed and number of folds can be specified). A LinearForwardSelection is performed on each fold to determine the optimal subset size (using the given SubsetSizeEvaluator). Finally, a LinearForwardSelection up to the optimal subset size is performed on the whole data set.

For more information see:

Martin Guetlein (2006). Large Scale Attribute Selection Using Wrappers. Freiburg, Germany.
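
The procedure described in the synopsis can be sketched in plain Java without Weka on the classpath. This is a simplified illustration, not Weka's implementation: the class SubsetSizeSketch, the SubsetEvaluator interface, the negMse toy evaluator, and the toyData generator are all hypothetical names introduced here, and a plain greedy forward selection stands in for LinearForwardSelection (no initial ranking, no fixed-set or fixed-width restriction).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SubsetSizeSketch {

    /** Hypothetical stand-in for a subset evaluator: higher merit is better. */
    public interface SubsetEvaluator {
        double merit(BitSet subset, List<double[]> data);
    }

    /** Greedy forward selection up to maxSize; returns attributes in the order they were added. */
    public static List<Integer> forwardSelection(int numAttrs, int maxSize,
                                                 SubsetEvaluator eval, List<double[]> data) {
        BitSet current = new BitSet(numAttrs);
        List<Integer> order = new ArrayList<>();
        while (order.size() < maxSize) {
            int best = -1;
            double bestMerit = Double.NEGATIVE_INFINITY;
            for (int a = 0; a < numAttrs; a++) {
                if (current.get(a)) continue;
                current.set(a);                       // try adding attribute a
                double m = eval.merit(current, data);
                current.clear(a);
                if (m > bestMerit) { bestMerit = m; best = a; }
            }
            if (best < 0) break;                      // no attributes left to add
            current.set(best);
            order.add(best);
        }
        return order;
    }

    /** Interior cross-validation: sums the merit of each subset size over all folds. */
    public static int optimalSubsetSize(List<double[]> data, int numAttrs, int folds, long seed,
                                        SubsetEvaluator searchEval, SubsetEvaluator sizeEval) {
        List<double[]> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, new Random(seed));
        double[] meritBySize = new double[numAttrs + 1];
        for (int f = 0; f < folds; f++) {
            List<double[]> train = new ArrayList<>();
            List<double[]> test = new ArrayList<>();
            for (int i = 0; i < shuffled.size(); i++) {
                if (i % folds == f) test.add(shuffled.get(i)); else train.add(shuffled.get(i));
            }
            // Search on the training part, then score every prefix size on the held-out part.
            List<Integer> order = forwardSelection(numAttrs, numAttrs, searchEval, train);
            BitSet prefix = new BitSet(numAttrs);
            for (int size = 1; size <= order.size(); size++) {
                prefix.set(order.get(size - 1));
                meritBySize[size] += sizeEval.merit(prefix, test);
            }
        }
        int best = 1;
        for (int size = 2; size <= numAttrs; size++) {
            if (meritBySize[size] > meritBySize[best]) best = size;
        }
        return best;
    }

    /** Toy evaluator (assumption): negative mean squared error of predicting the last column as the sum of the selected attributes. */
    public static double negMse(BitSet subset, List<double[]> data) {
        double sum = 0;
        for (double[] row : data) {
            double pred = 0;
            for (int a = subset.nextSetBit(0); a >= 0; a = subset.nextSetBit(a + 1)) pred += row[a];
            double err = row[row.length - 1] - pred;
            sum += err * err;
        }
        return -sum / data.size();
    }

    /** Toy data (assumption): four Gaussian attributes, target = attribute 0 + attribute 2. */
    public static List<double[]> toyData(int rows, long seed) {
        Random rnd = new Random(seed);
        List<double[]> data = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            double[] r = new double[5];
            for (int a = 0; a < 4; a++) r[a] = rnd.nextGaussian();
            r[4] = r[0] + r[2];
            data.add(r);
        }
        return data;
    }

    public static void main(String[] args) {
        List<double[]> data = toyData(40, 7L);
        int size = optimalSubsetSize(data, 4, 5, 1L, SubsetSizeSketch::negMse, SubsetSizeSketch::negMse);
        // Final step: forward selection up to the chosen size on the whole data set.
        List<Integer> selected = forwardSelection(4, size, SubsetSizeSketch::negMse, data);
        System.out.println("size=" + size + " selected=" + selected);
    }
}
```

In this sketch the search evaluator and the subset-size evaluator are passed separately, mirroring the separate subsetSizeEvaluator option, although the toy example uses the same function for both.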

Options

The table below describes the options available for SubsetSizeForwardSelection.

Option                  Description

lookupCacheSize         Maximum size of the lookup cache of evaluated subsets, expressed as a multiplier of the number of attributes in the data set (default = 1).

numSubsetSizeCVFolds    Number of cross-validation folds used for subset-size determination.

numUsedAttributes       Number of top-ranked attributes that are taken into account by the search process.

performRanking          Perform an initial ranking to select the top-ranked attributes.

seed                    Seed for the cross-validation used in subset-size determination (default = 1).

subsetSizeEvaluator     Subset evaluator to use for subset-size determination.

type                    Type of the search (fixed-set or fixed-width Linear Forward Selection).

verbose                 Turn on verbose output for monitoring the search's progress.
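
To illustrate how a cache limit given as a multiplier of the number of attributes (the lookupCacheSize option) translates into an entry count, here is a minimal sketch of a bounded cache of subset merits. The class name SubsetMeritCache and the oldest-entry-first eviction are assumptions made for this example; Weka's internal cache may be organised differently.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

public class SubsetMeritCache {
    private final Map<BitSet, Double> cache;

    /** The capacity is lookupCacheSize times the number of attributes, as described for the option. */
    public SubsetMeritCache(int lookupCacheSize, int numAttributes) {
        final int capacity = lookupCacheSize * numAttributes;
        // Insertion-ordered map that drops its oldest entry once the capacity is exceeded
        // (eviction policy is an assumption of this sketch).
        this.cache = new LinkedHashMap<BitSet, Double>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<BitSet, Double> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the cached merit for the subset, or null if it has not been evaluated (or was evicted). */
    public Double get(BitSet subset) {
        return cache.get(subset);
    }

    /** Stores a copy of the subset so later changes to the caller's BitSet do not corrupt the key. */
    public void put(BitSet subset, double merit) {
        cache.put((BitSet) subset.clone(), merit);
    }

    public int size() {
        return cache.size();
    }
}
```

With lookupCacheSize = 1 and a data set of 100 attributes, this sketch would hold at most 100 evaluated subsets before older entries are discarded.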
