Hitachi Vantara Pentaho Community Wiki




Reduces the dimensionality of the data by projecting it onto a lower-dimensional subspace using a random matrix with columns of unit length. That is, it reduces the number of attributes in the data while preserving much of its variation, as PCA does, but at a much lower computational cost.
It first applies the NominalToBinary filter to convert all attributes to numeric before reducing the dimension. It preserves the class attribute.
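
To make the projection step concrete, here is a minimal sketch in plain Python. It is not Weka's implementation: the function names (`random_unit_column_matrix`, `project`), the Gaussian entries, and the matrix orientation (one row per original attribute, one column per output dimension) are assumptions made for illustration. It also assumes the data is already numeric and leaves out the class attribute handling.

```python
import math
import random


def random_unit_column_matrix(n_rows, n_cols, rng):
    """Build an n_rows x n_cols matrix of Gaussian entries whose columns have unit length.

    Hypothetical helper for illustration; not part of Weka.
    """
    columns = []
    for _ in range(n_cols):
        col = [rng.gauss(0.0, 1.0) for _ in range(n_rows)]
        norm = math.sqrt(sum(v * v for v in col))
        columns.append([v / norm for v in col])
    # Transpose the list of columns into row-major form.
    return [[columns[j][i] for j in range(n_cols)] for i in range(n_rows)]


def project(rows, matrix):
    """Multiply each data row (length d) by the d x k matrix to get k attributes."""
    k = len(matrix[0])
    return [
        [sum(x * matrix[i][j] for i, x in enumerate(row)) for j in range(k)]
        for row in rows
    ]


rng = random.Random(42)  # fixed seed, playing the role of the randomSeed option
data = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.0, -1.0, 2.0]]  # 2 instances, 4 numeric attributes
R = random_unit_column_matrix(4, 2, rng)  # reduce 4 attributes to 2
reduced = project(data, R)
print(reduced)
```

Each instance keeps its row position but ends up with two attributes instead of four. Unlike PCA, the matrix is drawn without looking at the data, which is where the lower computational cost comes from.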

For more information, see:

Dmitriy Fradkin, David Madigan: Experiments with random projections for machine learning. In: KDD '03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, New York, NY, USA, 517-522, 2003.


The table below describes the options available for RandomProjection.

Option Description
distribution The distribution to use for calculating the random matrix.
Sparse1 is:
 sqrt(3) * { -1 with prob(1/6), 
               0 with prob(2/3),  
              +1 with prob(1/6) } 
Sparse2 is:
 { -1 with prob(1/2), 
   +1 with prob(1/2) } 
numberOfAttributes The number of dimensions (attributes) the data should be reduced to.
percent The percentage of dimensions (attributes) the data should be reduced to (inclusive of the class attribute). The numberOfAttributes option is ignored if this option is present or is greater than zero.
randomSeed The random seed used by the random number generator that generates the random matrix.
replaceMissingValues If set, the filter uses weka.filters.unsupervised.attribute.ReplaceMissingValues to replace the missing values.
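
The Sparse1 and Sparse2 distributions listed above can be sampled with a few lines of plain Python. This is a sketch for illustration only: the function names `sparse1_entry` and `sparse2_entry` are hypothetical and are not Weka identifiers. Both distributions have mean 0 and variance 1, which the sample statistics at the end should approximate.

```python
import math
import random


def sparse1_entry(rng):
    """Draw sqrt(3) * {-1 w.p. 1/6, 0 w.p. 2/3, +1 w.p. 1/6}."""
    u = rng.random()
    if u < 1.0 / 6.0:
        return -math.sqrt(3.0)
    if u < 5.0 / 6.0:
        return 0.0
    return math.sqrt(3.0)


def sparse2_entry(rng):
    """Draw -1 or +1, each with probability 1/2."""
    return -1.0 if rng.random() < 0.5 else 1.0


rng = random.Random(1)  # fixed seed so the draws are reproducible
n = 20000
s1 = [sparse1_entry(rng) for _ in range(n)]
s2 = [sparse2_entry(rng) for _ in range(n)]

mean_s1 = sum(s1) / n
var_s1 = sum(v * v for v in s1) / n - mean_s1 ** 2
mean_s2 = sum(s2) / n
print(mean_s1, var_s1, mean_s2)
```

About two thirds of the Sparse1 entries are zero, so a matrix built from it lets most multiplications be skipped, while Sparse2 needs only additions and subtractions.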


The table below describes the capabilities of RandomProjection.

Capability Supported
Class Missing class values, Empty nominal class, Nominal class, String class, Date class, No class, Unary class, Relational class, Numeric class, Binary class
Attributes Unary attributes, Date attributes, Nominal attributes, Relational attributes, String attributes, Empty nominal attributes, Missing values, Numeric attributes, Binary attributes
Min # of instances 0
