Hitachi Vantara Pentaho Community Wiki
What's new or improved in Weka 3.7.6

Core Weka

  • Weka 3.7 is now licensed under GPL 3.0.
  • Weka releases are now available on Maven Central.
  • Logistic now has an option to use conjugate gradient descent rather than quasi-Newton with BFGS updates.
  • weka.classifiers.bayes.NaiveBayesMultinomialText - naive Bayes multinomial classifier that operates directly on string attributes.
  • Appender component for the Knowledge Flow that can append sets of instances together.
  • SubstringLabeler component for the Knowledge Flow that can use substring or regex matching on string attribute values to assign various user defined nominal values to a new "label" attribute.
  • SubstringReplacer component for the Knowledge Flow that can replace substrings or regex matches with user supplied strings in string attribute values.
  • Sorter component for the Knowledge Flow that implements a streaming merge sort that writes a sorted in-memory buffer to a file when full. Can sort descending or ascending on multiple attributes.
  • DatabaseSaver can now truncate the target table if desired.
  • Area under the precision-recall curve evaluation metric.
  • Package manager's cache refresh mechanism is now much faster.
  • Package manager now checks for new versions of existing packages on the server as well as entirely new packages.
  • Random forest now has an option to print all the ensemble trees as part of its output.
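
The Sorter item above describes an external merge sort: rows are buffered in memory, each full buffer is sorted and written to a file, and the sorted files are merged at the end. The following is a minimal sketch of that general technique in plain Java, not Weka's actual implementation. The class name ExternalSortSketch, the method names, and the use of single-line strings as rows are assumptions made for illustration; a real Knowledge Flow step would stream instances downstream rather than collect them in a list.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Weka.
public class ExternalSortSketch {

    /** One open run file plus the row currently at its head. */
    private static final class Run {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }

        void advance() throws IOException {
            head = reader.readLine();
        }
    }

    /** Sorts the buffer, writes it to a temporary run file, and clears it. */
    private static Path spill(List<String> buffer, Comparator<String> order) throws IOException {
        buffer.sort(order);
        Path run = Files.createTempFile("sorter-run", ".txt");
        Files.write(run, buffer, StandardCharsets.UTF_8);
        buffer.clear();
        return run;
    }

    /** Sorts rows (assumed to contain no line breaks) using at most bufferSize rows of buffer. */
    public static List<String> sort(Iterator<String> rows, int bufferSize,
                                    Comparator<String> order) throws IOException {
        List<Path> runs = new ArrayList<>();
        List<String> buffer = new ArrayList<>(bufferSize);
        while (rows.hasNext()) {
            buffer.add(rows.next());
            if (buffer.size() >= bufferSize) {
                runs.add(spill(buffer, order)); // buffer is full: write a sorted run
            }
        }
        if (!buffer.isEmpty()) {
            runs.add(spill(buffer, order));     // final partial run
        }

        // k-way merge: the heap always exposes the run whose head row sorts first.
        Comparator<Run> byHead = (a, b) -> order.compare(a.head, b.head);
        PriorityQueue<Run> heap = new PriorityQueue<>(Math.max(1, runs.size()), byHead);
        for (Path p : runs) {
            Run r = new Run(Files.newBufferedReader(p, StandardCharsets.UTF_8));
            if (r.head != null) heap.add(r); else r.reader.close();
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Run r = heap.poll();
            out.add(r.head);
            r.advance();
            if (r.head != null) heap.add(r); else r.reader.close();
        }
        for (Path p : runs) {
            Files.deleteIfExists(p);            // clean up the temporary run files
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        List<String> rows = Arrays.asList("d", "a", "c", "b", "e");
        // A buffer of two rows forces three run files for five rows.
        System.out.println(sort(rows.iterator(), 2, Comparator.<String>naturalOrder()));
        // prints [a, b, c, d, e]
    }
}
```

Ascending or descending order, and ordering on several attributes, can be expressed through the comparator alone, for example by passing Comparator.<String>reverseOrder() or by chaining key comparisons with thenComparing.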

In Packages

  • weka.clusterers.CascadeSimpleKMeans, contributed by Martin Guetlein.
  • weka.classifiers.functions.RBFRegressor, added to the RBFNetwork package.
  • jsonFieldExtractor package - Knowledge Flow step to extract one or more fields from repeating blocks of JSON text into new attributes.