Hitachi Vantara Pentaho Community Wiki
PrincipalComponents (attribute transformer)


Package

weka.attributeSelection

Synopsis

Performs a principal components analysis and transformation of the data. Use in conjunction with a Ranker search. Dimensionality reduction is accomplished by choosing enough eigenvectors to account for a given proportion of the variance in the original data; the default is 0.95 (95%). Attribute noise can be filtered by transforming to the PC space, eliminating some of the worst eigenvectors (those that account for the least variance), and then transforming back to the original space.
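The variance-based cutoff described above can be sketched in plain Java. This is an illustrative sketch, not Weka's own code: the class name VarianceCoveredSketch, the method componentsToRetain, and the eigenvalues in main are hypothetical, and the eigenvalues are assumed to be non-negative.

```java
import java.util.Arrays;

final class VarianceCoveredSketch {

    /**
     * Returns the smallest number of leading components whose eigenvalues
     * account for at least the requested proportion of the total variance.
     * Eigenvalues are assumed non-negative; the input array is not modified.
     */
    static int componentsToRetain(double[] eigenvalues, double varianceCovered) {
        double[] sorted = eigenvalues.clone();
        Arrays.sort(sorted); // ascending order; walked from the end below

        double total = 0.0;
        for (double v : sorted) {
            total += v;
        }
        if (total <= 0.0) {
            return 0; // no variance to explain
        }

        double cumulative = 0.0;
        int kept = 0;
        for (int i = sorted.length - 1; i >= 0; i--) {
            cumulative += sorted[i];
            kept++;
            if (cumulative / total >= varianceCovered) {
                break; // enough variance is covered
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up eigenvalues that sum to 4.0.
        double[] eigenvalues = {3.0, 0.9, 0.08, 0.02};
        System.out.println(componentsToRetain(eigenvalues, 0.95));
        System.out.println(componentsToRetain(eigenvalues, 0.99));
    }
}
```

With these made-up eigenvalues, the first two components cover 97.5% of the variance, so a threshold of 0.95 keeps two components, while a threshold of 0.99 needs a third.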

Options

The table below describes the options available for PrincipalComponents.

Option | Description
maximumAttributeNames | The maximum number of attributes to include in transformed attribute names.
normalize | Normalize input data.
transformBackToOriginal | Transform through the PC space and back to the original space. If only the best n PCs are retained (by setting varianceCovered < 1) then this option will give a dataset in the original space but with less attribute noise.
varianceCovered | Retain enough PC attributes to account for this proportion of variance.
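The idea behind transformBackToOriginal can be illustrated with a small two-attribute sketch: project the data onto the best principal component only, then map it back to the original space. This is a simplified illustration under stated assumptions, not Weka's implementation. The class BackTransformSketch and its denoise method are hypothetical, the data is only centred (no normalization), exactly one component is retained, and at least one input point is assumed.

```java
final class BackTransformSketch {

    /** Projects 2-D points onto their first principal component and maps them back. */
    static double[][] denoise(double[][] points) {
        int n = points.length;

        // Centre the data on the attribute means.
        double meanX = 0.0, meanY = 0.0;
        for (double[] p : points) {
            meanX += p[0];
            meanY += p[1];
        }
        meanX /= n;
        meanY /= n;

        // Unscaled covariance entries; a constant factor does not change eigenvectors.
        double sxx = 0.0, sxy = 0.0, syy = 0.0;
        for (double[] p : points) {
            double dx = p[0] - meanX, dy = p[1] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }

        // Largest eigenvalue of the symmetric 2x2 matrix, in closed form.
        double half = (sxx - syy) / 2.0;
        double lambda = (sxx + syy) / 2.0 + Math.sqrt(half * half + sxy * sxy);

        // Matching eigenvector (the first principal component).
        double vx, vy;
        if (sxy != 0.0) {
            vx = sxy;
            vy = lambda - sxx;
        } else if (sxx >= syy) {
            vx = 1.0;
            vy = 0.0;
        } else {
            vx = 0.0;
            vy = 1.0;
        }
        double norm = Math.sqrt(vx * vx + vy * vy);
        vx /= norm;
        vy /= norm;

        // Transform to PC space (one score per point), then back to the original space.
        double[][] result = new double[n][2];
        for (int i = 0; i < n; i++) {
            double score = (points[i][0] - meanX) * vx + (points[i][1] - meanY) * vy;
            result[i][0] = meanX + score * vx;
            result[i][1] = meanY + score * vy;
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up points scattered around the line y = x.
        double[][] noisy = {{0.0, 0.1}, {1.0, 0.9}, {2.0, 2.1}, {3.0, 2.9}};
        for (double[] p : denoise(noisy)) {
            System.out.println(p[0] + ", " + p[1]);
        }
    }
}
```

The output keeps both original attributes, but every point now lies on a single line through the data mean: the variation along the discarded component has been removed.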

Capabilities

The table below describes the capabilities of PrincipalComponents.

Capability | Supported
Class | Date class, Numeric class, Binary class, Nominal class, Missing class values, No class
Attributes | Numeric attributes, Nominal attributes, Binary attributes, Missing values, Date attributes, Empty nominal attributes, Unary attributes
Min # of instances | 1