Hitachi Vantara Pentaho Community Wiki

Core Weka

  • SGDText - stochastic gradient descent for learning linear SVMs and logistic regression for text problems. Operates incrementally and directly on string attributes.
  • New incremental version of the multi-class meta classifier (MultiClassClassifierUpdateable).
  • RandomForest now supports building trees in parallel.
  • DatabaseLoader is now much faster when loading data sets with many nominal attributes.
  • Database access now allows custom property files to be set at runtime, so databases other than the default one can be used without restarting Weka.
  • TextDirectoryLoader can now operate incrementally.
  • CSVLoader now supports files without a header row.
  • Charts can now be exported to files from running Knowledge Flow processes via an offscreen rendering process.
  • RemoveUseless filter now removes attributes with all missing values.
  • Histogram visualization in the Explorer and Knowledge Flow is now faster.
  • ClassifierPerformanceEvaluator in the Knowledge Flow is now multi-threaded to allow folds to be evaluated in parallel.
  • File-based savers now support gzip compression.
  • File-based loaders now support loading files as a resource from the classpath (including jars).
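
The SGDText item above describes incremental stochastic gradient descent applied directly to string attributes. As a rough illustration of that idea only, the following plain-Java sketch trains a logistic regression model one document at a time on word counts. It does not use or reproduce Weka's SGDText code: the class name SgdTextSketch, its methods, the tokenizer, and the learning rate are all hypothetical choices made for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of incremental SGD logistic regression on raw text (not Weka's SGDText). */
class SgdTextSketch {
    private final Map<String, Double> weights = new HashMap<>(); // one weight per word seen so far
    private double bias = 0.0;
    private final double learningRate; // assumed fixed step size

    SgdTextSketch(double learningRate) {
        this.learningRate = learningRate;
    }

    /** Lower-cases the text, splits on anything that is not a-z, and counts each word. */
    static Map<String, Integer> tokenize(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Estimated probability that the text belongs to class 1. */
    double probability(String text) {
        double z = bias;
        for (Map.Entry<String, Integer> e : tokenize(text).entrySet()) {
            z += weights.getOrDefault(e.getKey(), 0.0) * e.getValue();
        }
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** One SGD step on the log loss for a single document; label must be 0 or 1. */
    void update(String text, int label) {
        double error = label - probability(text);
        for (Map.Entry<String, Integer> e : tokenize(text).entrySet()) {
            weights.merge(e.getKey(), learningRate * error * e.getValue(), Double::sum);
        }
        bias += learningRate * error;
    }

    public static void main(String[] args) {
        SgdTextSketch model = new SgdTextSketch(0.1);
        String[] docs = {"great movie, loved it", "terrible and boring film"};
        int[] labels = {1, 0};
        for (int epoch = 0; epoch < 50; epoch++) {
            for (int i = 0; i < docs.length; i++) {
                model.update(docs[i], labels[i]); // documents arrive one at a time
            }
        }
        System.out.println(model.probability("loved this great film"));
    }
}
```

Because each call to update touches only the words in the current document, the model never needs the full data set in memory, which is the property the incremental items in this list rely on. A real implementation would typically add regularization and a more careful tokenizer.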

In Packages

  • multiInstanceLearning - added MITI multi-instance tree learner and MIRI rule learner variant.
  • RerankingSearch - a feature selection meta-search algorithm that speeds up the base search algorithm, contributed by Pablo Bermejo.
  • timeseriesForecasting - now supports timestamp-based data that contains gaps in the regular time period.
  • sasLoader - SAS sas7bdat file reader.
  • CHIRP - A new classifier based on Composite Hypercubes on Iterated Random Projections, contributed by Leland Wilkinson.
  • PSOSearch - An implementation of the Particle Swarm Optimization (PSO) algorithm to explore the space of attributes, contributed by Sebastian Luna Valero.
  • wekaServer - A simple servlet-based server for executing data mining tasks (Explorer and Knowledge Flow so far).
  • jfreechartOffscreenRenderer - Offscreen (headless) chart rendering in Knowledge Flow processes using the JFreeChart library.
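
The timeseriesForecasting item above mentions timestamp-based data with gaps in the regular time period. One simple way to handle such gaps is to place the observations on a regular grid and linearly interpolate the missing points. The plain-Java sketch below shows that approach under stated assumptions: timestamps are sorted and aligned to the period, and the class and method names (GapFillSketch, fillGaps) are hypothetical. It is not taken from the package and may differ from what the package actually does.

```java
/** Hypothetical sketch: fill gaps in a regularly spaced series by linear interpolation. */
class GapFillSketch {

    /**
     * Returns one value per grid point from times[0] to the last timestamp,
     * stepping by period. Assumes times is sorted ascending and aligned to the period.
     */
    static double[] fillGaps(long[] times, double[] values, long period) {
        if (times.length == 0) {
            return new double[0];
        }
        int n = (int) ((times[times.length - 1] - times[0]) / period) + 1;
        double[] out = new double[n];
        int j = 0; // index of the latest observation at or before the current grid point
        for (int i = 0; i < n; i++) {
            long t = times[0] + i * period;
            while (j + 1 < times.length && times[j + 1] <= t) {
                j++;
            }
            if (times[j] == t || j + 1 == times.length) {
                out[i] = values[j]; // an observation exists at this grid point
            } else {
                // Missing point: interpolate between the neighbouring observations.
                double frac = (double) (t - times[j]) / (times[j + 1] - times[j]);
                out[i] = values[j] + frac * (values[j + 1] - values[j]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] times = {0, 10, 40};       // observations at 20 and 30 are missing
        double[] values = {1.0, 2.0, 5.0};
        for (double v : fillGaps(times, values, 10)) {
            System.out.println(v);
        }
    }
}
```

In the usage example the grid has five points, and the two missing ones are filled from the line between the observations at 10 and 40. Interpolated points are estimates, so it is usually worth keeping track of which values were observed and which were filled in.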

