Hitachi Vantara Pentaho Community Wiki

The Categorical Dataset

A categorical dataset is a dataset in which each list of numeric values corresponds to a single category. A categorical chart has only one numeric axis; the other axis plots the categories.

A columnar (column-based) categorical dataset uses each row's value in the first column as the category and plots each of the following numeric columns as a series.

A row-based categorical dataset pivots the dataset, so that the headers of the numeric columns become the categories, each value in the first column names a series, and the numeric data is plotted by row.

This is easier to explain with an example.

An Example Dataset

Here is a query and dataset from the Pentaho sample data:

SELECT department, SUM(actual) AS actual, SUM(budget) AS budget FROM quadrant_actuals GROUP BY department

This query results in the following dataset:

DEPARTMENT                     ACTUAL       BUDGET
Sales                      11,168,773   10,973,392
Executive Management        6,299,022    6,494,166
Finance                    12,224,220   12,087,406
Human Resource             13,075,463   12,989,341
Marketing & Communication  13,910,753   13,770,267
Product Development        10,644,102   10,786,611
Professional Services      76,317,649   76,098,206

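When this dataset is handed to a chart, the chart plots the data by column by default: each department becomes a category, and ACTUAL and BUDGET become the two series.

[Figure: bar chart of this dataset, plotted by column.]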
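The column-based mapping can be sketched in a few lines of Python. This is only an illustration of the idea, not Pentaho code: the function name plot_by_column and the use of plain lists and a dict in place of a chart dataset object are assumptions made for this example.

```python
# Result of the sample query: (DEPARTMENT, ACTUAL, BUDGET) per row.
HEADERS = ["DEPARTMENT", "ACTUAL", "BUDGET"]
ROWS = [
    ("Sales", 11_168_773, 10_973_392),
    ("Executive Management", 6_299_022, 6_494_166),
    ("Finance", 12_224_220, 12_087_406),
    ("Human Resource", 13_075_463, 12_989_341),
    ("Marketing & Communication", 13_910_753, 13_770_267),
    ("Product Development", 10_644_102, 10_786_611),
    ("Professional Services", 76_317_649, 76_098_206),
]


def plot_by_column(headers, rows):
    """Hypothetical helper: map a result set to (categories, series) by column.

    The first column supplies the categories; every following column
    becomes one series, keyed by its column header.
    """
    categories = [row[0] for row in rows]
    series = {
        header: [row[i] for row in rows]
        for i, header in enumerate(headers[1:], start=1)
    }
    return categories, series


categories, series = plot_by_column(HEADERS, ROWS)
print(len(categories), "categories;", "series:", list(series))
```

Under this mapping the chart would draw seven groups of bars (one per department), each with an ACTUAL bar and a BUDGET bar.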

To plot the data by rows, you must add the by-row property to the action sequence component definition. (You can set this property in the Design Studio by checking the Chart Row Dimension checkbox.)

<by-row>true</by-row>

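With the same dataset plotted by rows, ACTUAL and BUDGET become the categories and each department becomes a series.

[Figure: bar chart of the same dataset, plotted by row.]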
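The row-based pivot can be sketched the same way. As before, this is an illustrative Python sketch rather than Pentaho code: plot_by_row is a hypothetical name, and only the first two rows of the sample dataset are used to keep the example short.

```python
# Excerpt of the sample query result: (DEPARTMENT, ACTUAL, BUDGET) per row.
HEADERS = ["DEPARTMENT", "ACTUAL", "BUDGET"]
ROWS = [
    ("Sales", 11_168_773, 10_973_392),
    ("Executive Management", 6_299_022, 6_494_166),
]


def plot_by_row(headers, rows):
    """Hypothetical helper: map a result set to (categories, series) by row.

    The numeric column headers supply the categories; every row becomes
    one series, keyed by the value in its first column.
    """
    categories = list(headers[1:])
    series = {row[0]: list(row[1:]) for row in rows}
    return categories, series


categories, series = plot_by_row(HEADERS, ROWS)
print("categories:", categories, "series:", list(series))
```

Under this mapping the chart would draw two groups of bars (ACTUAL and BUDGET), each containing one bar per department.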
