Hitachi Vantara Pentaho Community Wiki
Automatic Documentation Output

Purpose

This step generates descriptive documentation for one or more transformations or jobs. Use it to document the purpose of jobs and transformations automatically, or to archive their behavior as they change over time.

It takes as input a list of file names and file types (transformation or job) and generates a corresponding set of documentation files optionally containing details such as the transformation name, description(s), creation date, job or transformation graph, logging configuration details and more.

Used in combination with the Get repository names step, this step can also work against file-based repositories, database repositories, or the DI repository.

Example (file based)

A sample titled 'Automatic Documentation Output - Generate Kettle HTML Documentation' is included in the <installation-directory>\data-integration\samples\transformations folder.

This example will:

  • Gather a list of .ktr and .kjb files from the samples directory (and its sub-folders)
  • Map the extension to the file type (transformation or job)
  • Remove unnecessary fields from the stream
  • Generate HTML documentation for all input rows
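
The first three steps of this example (gather files, map the extension to a file type, keep only the needed fields) can be sketched outside of Kettle as follows. This is only an illustration in Python, not the sample transformation itself; the function name `list_kettle_files`, the field names `filename` and `filetype`, and the literal type strings are assumptions made for the sketch.

```python
from pathlib import Path

# Assumed mapping from extension to file type. The exact strings the step
# expects are not stated on this page, so treat these as placeholders.
EXTENSION_TO_TYPE = {".ktr": "Transformation", ".kjb": "Job"}


def list_kettle_files(root):
    """Return one row per .ktr/.kjb file found under root (recursively)."""
    rows = []
    for path in sorted(Path(root).rglob("*")):
        file_type = EXTENSION_TO_TYPE.get(path.suffix.lower())
        if path.is_file() and file_type is not None:
            # Keep only the two fields the documentation step takes as input.
            rows.append({"filename": str(path), "filetype": file_type})
    return rows
```

Each returned row corresponds to one input row for the step: a file name plus its type.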

NOTE: Some samples in the samples directory require that you set up sample data sets before running them. If that sample data is not set up, running this sample may produce errors indicating that a MySQL database doesn't exist. You can change the Get Files list to point to another location containing the ktrs and kjbs you want to generate documentation for.

Example (Repository based with METADATA)

The following example is attached:

It will:

  • Gather a list of transformations and jobs from the repository
  • Return the number of steps and job entries and the XML representation (this is only a simplified sample; any further METADATA from the job or transformation can be accessed in the same way)
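
To make the idea of counting steps and job entries from the XML representation concrete, here is a minimal Python sketch. It does not use the Kettle API. It assumes the XML has a `transformation` root with direct `step` children, or a `job` root with `entry` elements under `entries`; verify that layout against your own files. The function name `summarize_kettle_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def summarize_kettle_xml(xml_text):
    """Count steps or job entries in a transformation/job XML string."""
    root = ET.fromstring(xml_text)
    if root.tag == "transformation":
        # Assumption: each step is a direct <step> child of the root.
        return {"type": "Transformation", "count": len(root.findall("step"))}
    if root.tag == "job":
        # Assumption: job entries are <entry> elements inside <entries>.
        return {"type": "Job", "count": len(root.findall("entries/entry"))}
    raise ValueError(f"unexpected root element: {root.tag}")
```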

Configuration options

Option | Definition
Step name | Defines the name of the step as it will appear in the transformation graph.
File name field | Select the input field containing the name of the file you are generating documentation for.
File type field | Select the input field containing the type of file (transformation or job).
Target filename | Specify the target location and filename for the generated documentation.
Output type | Select the output type for the generated documentation (PDF, HTML, DOC, Excel, CSV, or METADATA). Note: The output type METADATA returns a field called meta that is a serialized instance of the object depending on the object type, e.g. TransMeta or JobMeta.
Include the name? | Define whether or not to include the file name in the generated documentation.
Include the description? | Choose to include the description in the generated documentation (description can be modified by going to Edit->Settings).
Include the extended description? | Choose to include the extended description in the generated documentation (extended description can be modified by going to Edit->Settings).
Include the creation date and user? | Choose to include the creation date and user name for the creator in the generated documentation.
Include the modification date and user? | Choose to include the date of the last modification made to the file and user who modified it.
Include the image? | Choose to include the job or transformation graph in the generated documentation.
Include logging configuration details? | Choose to include a summary of the connections used for logging in the transformation or job.
Include the last execution result? | Choose to include a summary of the last execution results such as whether it completed successfully or ended in failure.
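
As a rough illustration of how the "Include ...?" options act as switches over sections of the generated output, the Python sketch below renders a small HTML fragment from a dictionary of metadata. It is not how the step is implemented; `DocOptions`, `render_html`, and the dictionary keys are hypothetical names chosen for this example, and only three of the options are modeled.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class DocOptions:
    """Hypothetical flags mirroring three of the 'Include ...?' options."""
    include_name: bool = True
    include_description: bool = True
    include_extended_description: bool = False


def render_html(meta, options):
    """Build an HTML document containing only the enabled sections."""
    parts = ["<html><body>"]
    if options.include_name:
        parts.append(f"<h1>{escape(meta.get('name', ''))}</h1>")
    if options.include_description:
        parts.append(f"<p>{escape(meta.get('description', ''))}</p>")
    if options.include_extended_description:
        parts.append(f"<p>{escape(meta.get('extended_description', ''))}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

Turning a flag off simply omits that section, which matches the way each option above adds or removes one detail from the documentation.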
