Hitachi Vantara Pentaho Community Wiki

Transformation Settings

Transformation Settings are a collection of properties that describe the transformation and configure its behavior. Access Transformation Settings from the main menu under Transformation|Settings. The following sections provide a detailed description of the available settings.

Transformation Tab


The Transformation tab allows you to specify general properties of the transformation, including:

Setting

Description

Transformation name

The name of the transformation. Required if you want to save the transformation to a repository.

Description

Short description of the transformation, shown in the repository explorer

Extended description

Long extended description of the transformation

Status

Draft or production status

Version

Version description

Directory

The directory in the repository where the transformation is stored

Created by

Displays the original creator of the transformation.

Created at

Displays the date and time when the transformation was created.

Last modified by

Displays the user name of the last user that modified the transformation.

Last modified at

Displays the date and time when the transformation was last modified.

Logging

The Logging tab allows you to configure how and where logging information is captured. Settings include:

Setting

Description

READ log step

Use the number of read lines from this step to write to the log table. Read means: read from source steps.

INPUT log step

Use the number of input lines from this step to write to the log table. Input means: input from file or database.

WRITE log step

Use the number of written lines from this step to write to the log table. Written means: written to target steps.

OUTPUT log step

Use the number of output lines from this step to write to the log table. Output means: output to file or database.

UPDATE log step

Use the number of updated lines from this step to write to the log table. Update means: updated in a database.

REJECTED log step

Use the number of rejected lines from this step to write to the log table. Rejected means: error record.

Log connection

The connection used to write to a log table.

Log table

Specifies the name of the log table (for example L_ETL)

Use Batch-ID?

Enable if you want to have a batch ID in the log table (for example L_ETL). Disable for backward compatibility with Spoon/Pan version < 2.0.

Use logfield to store logging in

Stores the logging text in a CLOB field in the logging table. This allows you to have the logging text together with the run results in the same table. Disable for backward compatibility with Spoon/Pan version < 2.1.
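As a rough illustration of how the log step settings above relate to a log table row, the following sketch picks one counter per configured step and writes the result to a table. It is not Pentaho code: the function names, the step names, the column names, and the use of SQLite are all assumptions made for the example.

```python
import sqlite3

# The six counters described in the Logging tab, in the order they are stored.
COUNTERS = ("read", "input", "written", "output", "updated", "rejected")

def build_log_record(step_counters, log_steps):
    """For each counter, take the value reported by the step configured for it."""
    record = {}
    for counter in COUNTERS:
        step = log_steps.get(counter)
        # No step configured for this counter: log zero.
        record[counter] = step_counters[step][counter] if step else 0
    return record

def write_log_record(conn, table, trans_name, record):
    """Insert one row into the log table (hypothetical, simplified layout)."""
    # The table name comes from trusted configuration, not from user input.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "transname TEXT, lines_read INTEGER, lines_input INTEGER, "
        "lines_written INTEGER, lines_output INTEGER, "
        "lines_updated INTEGER, lines_rejected INTEGER)"
    )
    values = (trans_name, *(record[c] for c in COUNTERS))
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?, ?, ?)", values)
    conn.commit()

# Hypothetical counters collected per step during one run.
step_counters = {
    "Table input": {"read": 0, "input": 1000, "written": 1000,
                    "output": 0, "updated": 0, "rejected": 0},
    "Filter rows": {"read": 1000, "input": 0, "written": 990,
                    "output": 0, "updated": 0, "rejected": 10},
    "Table output": {"read": 990, "input": 0, "written": 990,
                     "output": 990, "updated": 0, "rejected": 0},
}

# Which step supplies which counter (no UPDATE log step is configured here).
log_steps = {
    "read": "Filter rows",
    "input": "Table input",
    "written": "Filter rows",
    "output": "Table output",
    "rejected": "Filter rows",
}

record = build_log_record(step_counters, log_steps)
```

Each counter in the row comes from exactly one step, which is why the settings ask for a step name per counter rather than summing over all steps.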

Dates

The Dates tab allows you to configure the following date related settings:

Setting

Description

Maxdate connection

Get the upper limit for a date range on this connection.

Maxdate table

Get the upper limit for a date range in this table.

Maxdate field

Get the upper limit for a date range in this field.

Maxdate offset

Adds this amount, in seconds, to the upper date limit; use a negative value to move the limit back. For example, if the field DATE_LAST_UPD has a maximum value of 2004-05-29 23:00:00 but you know that the values for the last minute are not complete, set the offset to -60.

Maximum date difference

Sets the maximum date difference in the obtained date range. This will allow you to limit job sizes.
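The arithmetic behind the offset and the maximum date difference can be sketched as follows. This is a minimal illustration, not the Pentaho implementation: the function name is hypothetical, and it assumes that both values are expressed in seconds and that the start of the range is already known (for example, the end of the previous run).

```python
from datetime import datetime, timedelta

def compute_date_range(start_date, max_date, offset_seconds=0,
                       max_difference_seconds=0):
    """Return (start, end) after applying the offset and the range cap."""
    # Shift the upper limit by the offset; a negative offset moves it back.
    end_date = max_date + timedelta(seconds=offset_seconds)
    # Optionally cap the range so that a single run does not grow too large.
    if max_difference_seconds > 0:
        latest_allowed = start_date + timedelta(seconds=max_difference_seconds)
        if end_date > latest_allowed:
            end_date = latest_allowed
    return start_date, end_date

start = datetime(2004, 5, 28, 0, 0, 0)           # assumed start of the range
max_found = datetime(2004, 5, 29, 23, 0, 0)      # maximum of DATE_LAST_UPD

# Offset of -60 drops the incomplete last minute.
print(compute_date_range(start, max_found, offset_seconds=-60))

# Adding a one-day cap limits the range to 24 hours after the start.
print(compute_date_range(start, max_found, offset_seconds=-60,
                         max_difference_seconds=86400))
```

With the values from the example above, the first call ends the range at 2004-05-29 22:59:00, and the second call ends it at 2004-05-29 00:00:00 because of the one-day cap.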

Dependencies

The Dependencies tab allows you to enter all of the dependencies for the transformation. For example, if a dimension depends on three lookup tables, make sure that the lookup tables have not changed. If the values in these lookup tables have changed, extend the date range to force a full refresh of the dimension. Dependencies allow you to determine if a table has changed when you have a "data last changed" column in the table. Click Get dependencies to detect dependencies automatically.
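The dependency check described above can be pictured with a small sketch. It assumes that each lookup table exposes the latest value of its "data last changed" column and that a full refresh is forced by moving the start of the date range back to a fixed minimum date. The names `apply_dependencies` and `MIN_DATE` are hypothetical and only serve the illustration.

```python
from datetime import datetime

# Hypothetical "beginning of time" used to force a full refresh.
MIN_DATE = datetime(1900, 1, 1)

def apply_dependencies(start_date, end_date, last_changed_by_table):
    """Extend the date range if any dependency changed since start_date.

    last_changed_by_table maps a table name to the latest value of its
    "data last changed" column, or None if the table is empty.
    """
    changed = sorted(
        table
        for table, last_changed in last_changed_by_table.items()
        if last_changed is not None and last_changed > start_date
    )
    if changed:
        # At least one lookup table changed: refresh everything.
        return MIN_DATE, end_date, changed
    return start_date, end_date, changed

start = datetime(2024, 1, 10)
end = datetime(2024, 1, 11)
lookups = {
    "country": datetime(2024, 1, 5),
    "currency": datetime(2024, 1, 10, 12, 0),   # changed after the start date
    "region": datetime(2023, 12, 1),
}

new_start, new_end, changed_tables = apply_dependencies(start, end, lookups)
```

Here only the hypothetical `currency` table changed after the start date, so the range is widened to begin at `MIN_DATE`, which reloads the whole dimension.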

Miscellaneous

The Miscellaneous tab allows you to configure the following settings:

Setting

Description

Number of rows in rowsets

Allows you to change the size of the buffers between the connected steps in a transformation. Change this parameter only when you have a specific reason to, for example when you are running low on memory.

Show a feedback row in transformation steps?

Controls whether or not to add a feedback entry into the log file while the transformation is being executed. By default, this feature is enabled and configured to display a feedback record every 5000 rows.

The feedback size

Sets the number of rows to process before writing a feedback entry to the log. Set this higher when processing large amounts of data to reduce the amount of information in the log file.

Make the transformation database transactional
(before version 3.2: Use unique connections)

Opens one unique connection per database connection that is defined and used in the transformation. Enabling this option is required to allow a failing transformation to be rolled back completely.
Enabling this option is also necessary when you want to alter connection settings before a query by using an "Execute SQL script" step. See also the Advanced section in the database connection dialog: "Enter the SQL statements (separated ...) to execute right after connecting".
Further information can be found in Database transactions in jobs and transformations.

Note: A transformation-wide commit for all steps is done when the last step finishes. When the transformation fails, a rollback is done. You do not need to set any commit sizes because they are ignored.

Shared objects file

Specifies the location of the XML file used to store shared objects like database connections, clustering schemas, and more.

Manage thread priorities?

Allows you to enable or disable the internal logic for changing the Java thread priorities based on the number of input and output rows in the "rowset" buffers. Disabling it can be useful in situations where the cost of running this logic exceeds the benefit of the thread prioritization.
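The commit and rollback behavior described above for the "Make the transformation database transactional" setting can be illustrated with a small sketch. It uses SQLite and plain Python functions as stand-ins for steps; the function names and tables are hypothetical, and this is not how Pentaho itself is implemented. The point is only that all steps share one connection, a single commit happens after the last step, and any failure rolls everything back.

```python
import sqlite3

def run_transactional(conn, steps):
    """Run every step on one shared connection; commit once at the end."""
    try:
        for step in steps:
            step(conn)
        conn.commit()      # single commit after the last step finishes
        return True
    except Exception:
        conn.rollback()    # undo the work of every step that already ran
        return False

def make_db():
    """Create an in-memory database with two empty, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, product TEXT)")
    conn.execute("CREATE TABLE order_items (order_id INTEGER, qty INTEGER)")
    conn.commit()
    return conn

def load_orders(conn):
    conn.execute("INSERT INTO orders VALUES (1, 'widget')")

def load_items(conn):
    conn.execute("INSERT INTO order_items VALUES (1, 2)")

def failing_step(conn):
    raise RuntimeError("simulated step failure")

def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

conn = make_db()
ok = run_transactional(conn, [load_orders, failing_step])
# The insert from load_orders is rolled back together with the failure.
rows_after_failure = count(conn, "orders")
```

Because the steps never commit on their own, per-step commit sizes have no effect in this model, which mirrors the note above.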

Partitioning

The Partitioning tab provides a list of available database partitions. Click New to create a new partition. The Get Partitions button retrieves a list of available partitions that have been defined for the connection.

SQL Button

Click SQL under Transformation Properties to generate the SQL code necessary for creating the logging table. The DDL is displayed in the Simple SQL Editor, where you can execute this or any other SQL statement(s) against the logging connection.