Hitachi Vantara Pentaho Community Wiki

Discretizes numeric attributes using equal-frequency binning, where the number of bins is equal to the square root of the number of non-missing values.

For more information, see:

Ying Yang, Geoffrey I. Webb: Proportional k-Interval Discretization for Naive-Bayes Classifiers. In: 12th European Conference on Machine Learning, 564-575, 2001.
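
The binning rule described above can be illustrated with a minimal Python sketch. This is not the filter's actual source code: the function names `pki_cut_points` and `assign_bin` are hypothetical, missing values are represented as `None`, the square root is rounded down, and cut points are placed midway between neighbouring values. All of these are assumptions made for illustration.

```python
import bisect
import math


def pki_cut_points(values):
    """Return cut points for equal-frequency binning with sqrt(n) bins.

    n is the number of non-missing values; missing values are None here.
    """
    present = sorted(v for v in values if v is not None)
    n = len(present)
    if n == 0:
        return []
    # Assumption: the bin count is the square root of n, rounded down.
    num_bins = max(1, math.isqrt(n))
    cuts = []
    for b in range(1, num_bins):
        i = b * n // num_bins          # index of the first value in bin b
        lo, hi = present[i - 1], present[i]
        if lo != hi:                   # never split a run of equal values
            cuts.append((lo + hi) / 2)
    return cuts


def assign_bin(value, cuts):
    """Return the zero-based bin index; a value equal to a cut goes to the lower bin."""
    return bisect.bisect_left(cuts, value)


values = [float(i) for i in range(1, 17)] + [None]   # 16 non-missing values
cuts = pki_cut_points(values)                        # 4 bins, 3 cut points
```

With 16 non-missing values the sketch produces 4 bins of 4 values each. Because runs of equal values are never split, heavily tied data can yield fewer bins than the square root suggests.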


The table below describes the options available for PKIDiscretize.

Option | Description
attributeIndices | Specify the range of attributes to act on. This is a comma-separated list of attribute indices, with "first" and "last" as valid values. Specify an inclusive range with "-". E.g.: "first-3,5,6-10,last".
bins | Ignored.
desiredWeightOfInstancesPerInterval | Sets the desired weight of instances per interval for equal-frequency binning.
findNumBins | Ignored.
ignoreClass | The class index will be unset temporarily before the filter is applied.
invertSelection | Set the attribute selection mode. If false, only selected (numeric) attributes in the range will be discretized; if true, only non-selected attributes will be discretized.
makeBinary | Make the resulting attributes binary.
useEqualFrequency | Always true.
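
To make the attributeIndices and invertSelection semantics concrete, the following Python sketch expands a range string into 1-based attribute indices. The function `parse_attribute_indices` is a hypothetical helper written for this page, not part of the filter, and it assumes indices are 1-based with "first" meaning 1 and "last" meaning the final attribute.

```python
def parse_attribute_indices(spec, num_attributes, invert=False):
    """Expand a range string such as "first-3,5,6-10,last" into sorted 1-based indices.

    If invert is True, return the attributes NOT named by the range instead.
    """
    def to_index(token):
        token = token.strip()
        if token == "first":
            return 1
        if token == "last":
            return num_attributes
        return int(token)

    selected = set()
    for part in spec.split(","):
        if "-" in part:
            start, end = part.split("-", 1)   # inclusive range "a-b"
            lo, hi = to_index(start), to_index(end)
        else:
            lo = hi = to_index(part)          # single index
        if not (1 <= lo <= hi <= num_attributes):
            raise ValueError(f"invalid range: {part!r}")
        selected.update(range(lo, hi + 1))

    if invert:
        selected = set(range(1, num_attributes + 1)) - selected
    return sorted(selected)


chosen = parse_attribute_indices("first-3,5,6-10,last", 12)
others = parse_attribute_indices("first-3,5,6-10,last", 12, invert=True)
```

For a dataset with 12 attributes, `chosen` holds attributes 1 to 3, 5 to 10, and 12, while `others` holds 4 and 11. In the filter itself, only the numeric attributes among those selected are discretized.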


The table below describes the capabilities of PKIDiscretize.

Capability | Supported
Class | Relational class, Numeric class, Binary class, No class, Empty nominal class, Missing class values, Unary class, Nominal class, String class, Date class
Attributes | Binary attributes, String attributes, Nominal attributes, Missing values, Unary attributes, Relational attributes, Empty nominal attributes, Numeric attributes, Date attributes
Min # of instances | 0

