Hitachi Vantara Pentaho Community Wiki

The Categorical Dataset

A categorical dataset is a dataset that contains a list of numeric values that correspond to a single category. In a categorical chart, there is only one numeric axis. The other axis plots the categories.

A columnar categorical dataset uses each row's first-column value as the category and plots each of the remaining numeric columns as a series.

A row-based categorical dataset pivots the data: the numeric column headers become the categories, the first column's values name the series, and the numeric data is plotted row by row.

This is easier to explain with an example.

An Example Dataset

Here is a query and dataset from the Pentaho sample data:

Code Block
SELECT department, SUM(actual) AS actual, SUM(budget) AS budget FROM quadrant_actuals GROUP BY department

This query results in the following dataset:

DEPARTMENT                     ACTUAL      BUDGET
Sales                      11,168,773  10,973,392
Executive Management        6,299,022   6,494,166
Finance                    12,224,220  12,087,406
Human Resource             13,075,463  12,989,341
Marketing & Communication  13,910,753  13,770,267
Product Development        10,644,102  10,786,611
Professional Services      76,317,649  76,098,206

When this dataset is handed to a chart, the chart plots the data by column by default. Here is a bar chart example:
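The column-wise reading of the dataset can also be sketched in a few lines of Python. This is only an illustration of the mapping described above, not Pentaho code: the function name plot_by_column and the dictionary layout are assumptions made for the example, and only the first three rows of the dataset are used.

```python
# Illustrative sketch only: plot_by_column is a hypothetical helper,
# not part of the Pentaho API. Values are the first three rows of the
# example dataset.
HEADERS = ["DEPARTMENT", "ACTUAL", "BUDGET"]
ROWS = [
    ["Sales", 11168773, 10973392],
    ["Executive Management", 6299022, 6494166],
    ["Finance", 12224220, 12087406],
]


def plot_by_column(headers, rows):
    """Return {series: {category: value}} for the default column-wise layout.

    Each numeric column becomes a series; the first column's value in
    each row is the category.
    """
    series = {}
    for col, name in enumerate(headers[1:], start=1):
        series[name] = {row[0]: row[col] for row in rows}
    return series


by_column = plot_by_column(HEADERS, ROWS)
for name, points in by_column.items():
    print(name, points)
```

With this layout there are two series (ACTUAL and BUDGET), and the departments appear along the category axis.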

To plot the data by rows, you must add the by-row property to the action sequence component definition. (You can set this property in the Design Studio by checking the Chart Row Dimension checkbox.)

Code Block
XML
<by-row>true</by-row>

Here is a bar chart example, with the same dataset plotted by rows:
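The row-wise reading can be sketched the same way in Python. As before, this only illustrates the pivot described in this page and is not Pentaho code: plot_by_row is a hypothetical helper name, and only the first three rows of the example dataset are used.

```python
# Illustrative sketch only: plot_by_row is a hypothetical helper,
# not part of the Pentaho API. Values are the first three rows of the
# example dataset.
HEADERS = ["DEPARTMENT", "ACTUAL", "BUDGET"]
ROWS = [
    ["Sales", 11168773, 10973392],
    ["Executive Management", 6299022, 6494166],
    ["Finance", 12224220, 12087406],
]


def plot_by_row(headers, rows):
    """Return {series: {category: value}} for the row-wise layout.

    Each row becomes a series named by its first-column value; the
    numeric column headers become the categories.
    """
    categories = headers[1:]
    return {row[0]: dict(zip(categories, row[1:])) for row in rows}


by_row = plot_by_row(HEADERS, ROWS)
for name, points in by_row.items():
    print(name, points)
```

Here each department is its own series, and the category axis carries only two entries, ACTUAL and BUDGET.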