Hitachi Vantara Pentaho Community Wiki
Child pages
  • Using the Weka Forecasting Plugin


For the step to identify where the historical priming/training data ends and the future overlay data begins in the incoming data stream, the values of the forecasted target field(s) must be missing in the future overlay data.

The following screenshot shows a preview of an input data set that contains overlay data for the future time steps to be forecasted. The data is the Australian wine data again. The model is predicting "Fortified" and expects "Dry-white" and "Sweet-white" as overlay data.
[Screenshot: preview of the input data set with overlay rows for future time steps]

In this case we are treating the data from August 1993 onwards as "future" time steps to be predicted. The values for "Fortified" (our single target in this case) are missing from this point onwards, whereas the values of "Dry-white" and "Sweet-white" (our "overlay" data) are present. Note that there are also two missing values for "Rose". These are genuinely missing in the original data, but they have no effect on the model because "Rose" is neither a target nor an input.
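The rule described above can be illustrated with a minimal Python sketch. This is not the plugin's actual implementation: the function name `split_priming_and_overlay`, the row representation (a list of dictionaries with `None` for missing values) and the numeric values are all assumptions made for illustration. Only the field names come from the wine data example.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]


def split_priming_and_overlay(
    rows: List[Row], targets: List[str], overlay_fields: List[str]
) -> Tuple[List[Row], List[Row]]:
    """Hypothetical helper: split rows at the first row where every target is missing."""
    boundary = len(rows)  # default: no overlay rows found
    for i, row in enumerate(rows):
        if all(row.get(t) is None for t in targets):
            boundary = i
            break

    priming, overlay = rows[:boundary], rows[boundary:]

    # Sanity checks on the overlay rows: targets must stay missing,
    # and every overlay field must have a value.
    for row in overlay:
        if any(row.get(t) is not None for t in targets):
            raise ValueError("target value present after the start of the overlay rows")
        if any(row.get(f) is None for f in overlay_fields):
            raise ValueError("overlay field missing in a future row")
    return priming, overlay


# Made-up values, not taken from the Australian wine data.
rows = [
    {"Fortified": 10.0, "Dry-white": 20.0, "Sweet-white": 5.0, "Rose": 1.0},
    {"Fortified": 11.0, "Dry-white": 21.0, "Sweet-white": 6.0, "Rose": None},
    {"Fortified": None, "Dry-white": 22.0, "Sweet-white": 7.0, "Rose": 2.0},
    {"Fortified": None, "Dry-white": 23.0, "Sweet-white": 8.0, "Rose": 3.0},
]

priming, overlay = split_priming_and_overlay(
    rows, targets=["Fortified"], overlay_fields=["Dry-white", "Sweet-white"]
)
print(len(priming), len(overlay))  # prints: 2 2
```

The missing "Rose" value in the second row does not affect the split, since "Rose" is neither a target nor an overlay field in this sketch.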

The following screenshot shows the output of the step on this data (no confidence intervals are being produced in this example). The forecaster has now filled in the values of "Fortified" for the future time steps in the overlay rows of the data.
[Screenshot: output of the step, with forecasted "Fortified" values in the overlay rows]