Hitachi Vantara Pentaho Community Wiki

Execute Transformations in Batch Mode

What is Pan?

Pan is a program that can execute transformations designed in Spoon when stored as a KTR file or in a repository.

Transformations are usually scheduled to run at regular intervals, via the PDI Enterprise Repository scheduler or third-party tools such as cron or the Windows Task Scheduler.


The first step is the installation of Oracle Java Runtime Environment version 1.7 or higher.

After this, you can simply unzip the PDI software in a directory of your choice. In the data-integration directory where you unzipped the file, you will find a number of files.

When using Unix-like environments (Solaris, Linux, OSX, ...) you will need to make the shell scripts executable. Execute these commands to make all shell scripts in the Kettle directory executable:

Code Block
cd data-integration
chmod +x *.sh

Launching Pan

The following scripts are provided to launch Pan on the different platforms:

  • Pan.bat : run Pan on the Windows platform.
  • : run Pan on Unix platforms and OSX

Pan can be run on any platform that has the Java Runtime Environment version 1.7 or higher.

Command line options

These are the command line options that you can use.

  • On Windows systems, the minus sign ("-") and the equal sign ("=") in the options cause problems. Because of this, from version 2.2.2 on, you can also use the /option:value format (as in the Windows examples further down), or any combination of /, - and :, =
  • Fields in italic represent the values that the options use.
  • If an option value contains spaces, enclose it in single or double quotes to keep it together. Take a look at the examples below for more info.
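
The quoting rule above can be illustrated without running Pan at all. The following is a minimal sketch for a Unix shell, assuming nothing about Pan itself: count_args is a hypothetical helper that only reports how many arguments the shell hands to a program. The option values are taken from the examples on this page.

```shell
#!/bin/sh
# Hypothetical helper: print the number of arguments received.
count_args() {
    echo "$#"
}

# Quoted value: the path with a space stays one argument, so 2 arguments arrive.
count_args -file="/PRD/Customer Dimension.ktr" -level=Minimal

# Unquoted value: the shell splits the path at the space, so 3 arguments arrive
# and the -file option would only see "/PRD/Customer".
count_args -file=/PRD/Customer Dimension.ktr -level=Minimal
```

The first call prints 2 and the second prints 3, which is why an unquoted path with spaces does not reach the program as a single option value.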
Code Block

Below are the valid options.

Display version information

Code Block

This option displays the version of the Kettle core library (kettle.jar).
The build version number and build date are shown as well.

Launch XML File

Code Block

This option runs the transformation defined in the XML file (.ktr: Kettle Transformation).

Named parameters

Code Block

You can set the value of a named parameter, for example: -param:FOO=value

Code Block

List the named parameters (their name, default value and description) that are defined in the specified transformation.

See also: Named Parameters.

Set the logging file

Code Block
-log=Logging Filename

Specifies the log file. The default is the standard output.

Set the logging level

Code Block
-level=Logging Level

The level option sets the log level for the transformation that's being run.
These are the possible values:

  • Error: Only show errors
  • Nothing: Don't show any output
  • Minimal: Only use minimal logging
  • Basic: This is the default basic logging level
  • Detailed: Give detailed logging output
  • Debug: For debugging purposes, very detailed output.
  • Rowlevel: Logging at a row level, this can generate a lot of data.
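
When the log level comes from a variable or a script argument, it can help to check it against the list above before passing it to Pan. The sketch below is one possible way to do that in a Unix shell. The function names are hypothetical, and matching the names exactly as spelled in the list is an assumption; this page does not say whether Pan treats the level names as case-sensitive.

```shell
#!/bin/sh
# Succeeds only for the level names listed on this page (exact spelling assumed).
is_valid_pan_level() {
    case $1 in
        Error|Nothing|Minimal|Basic|Detailed|Debug|Rowlevel) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the requested level if it is in the list, otherwise the default "Basic".
pan_level_or_default() {
    if is_valid_pan_level "$1"; then
        echo "$1"
    else
        echo "Basic"
    fi
}
```

A calling script could then build the option as -level="$(pan_level_or_default "$requested")", where requested is whatever variable holds the desired level.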

Choose a repository

Code Block
-rep=Repository name

Connect to the repository with name "Repository name".
You also need to specify the options -user, -pass and -trans.
You can also specify this option through the environment variable KETTLE_REPOSITORY.

Set the repository user name

Code Block

This is the username with which you want to connect to the repository.
You can also specify this option through the environment variable KETTLE_USER.

Set the repository password

Code Block

The password to use to connect to the repository.
You can also specify this option through the environment variable KETTLE_PASSWORD.
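
The three environment variables named in this and the two preceding sections can be set once at the top of a shell script. This is a minimal sketch for a Unix shell: the variable names come from this page and the repository name is the one used in the examples further down, but the user name and password values are hypothetical placeholders.

```shell
#!/bin/sh
# Repository connection settings read from the environment.
export KETTLE_REPOSITORY="Production Repository"
export KETTLE_USER="etl_user"          # hypothetical user name
export KETTLE_PASSWORD="change-me"     # hypothetical password

# Any Pan call made later in the same script inherits these settings.
# The commented line assumes the Unix launcher is named pan.sh, which this
# page does not spell out:
# ./pan.sh -trans="update Customer Dimension"
```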

Select the repository transformation to run

Code Block
-trans=Transformation Name

Use this option to select the transformation to run from the repository.

List the directories in the repository

Code Block

Print a listing of all the sub-directories in the repository directory specified with the option "-dir".

Set the repository directory

Code Block

Specifies the directory in the repository to use. Repository directories are specified like this:

  • The root directory: /
  • A subdirectory: /production/Dimensions/

From version 2.2.2 on, a / (slash) is used to separate directories on all platforms.

List the repository transformations

Code Block

Show a list of all the transformations in the repository directory specified with the option "-dir".

List the available repositories

Code Block

Print a listing of all the defined repositories.

Export the complete repository

Code Block

This option exports the complete repository to a single XML file.
To restore this file to a repository, please use the Repository Explorer in Spoon.
See the documentation of Spoon for more information.

Don't log in to the repository

Code Block

If you have set the environment variables KETTLE_REPOSITORY, KETTLE_USER and KETTLE_PASSWORD, this option prevents Pan from logging in to the repository, for instance when you want to launch a transformation from an XML file.


Please make sure that you are positioned in the data-integration directory before running the samples
below. If you put these scripts into a batch file or shell script, simply do a change directory to
the installation directory:

If data-integration was installed on Windows on the D:\ drive:

Code Block

cd \data-integration

If data-integration was installed in the /product directory on a Unix system:

Code Block
cd /product/data-integration/

Run a transformation from file

This example runs a transformation from file on a Windows platform:

Code Block
pan.bat /file:"D:\Transformations\Customer Dimension.ktr" /level:Basic

This example runs a transformation from file on a Linux box:

Code Block
-file="/PRD/Customer Dimension.ktr" -level=Minimal

Run a transformation from Repository

This example runs a transformation from the repository on a Windows platform (enter the command on a single line, without line breaks):

Code Block
pan.bat /rep:"Production Repository"
            /trans:"update Customer Dimension"

Redirecting output

If you want the Pan output to be written to a log file instead of appearing on the screen, you can use redirection.
This example appends the Pan output to an ever-growing log file:

Code Block
-file="/PRD/trans.ktr" -level=Minimal >> /LOG/trans.log

This example writes the Pan output to a file that gets overwritten every time:

Code Block
pan.bat /file:C:\PRD\trans.ktr /level:Basic > C:\LOG\trans.log
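
The difference between appending (>>) and overwriting (>) can be wrapped in two small helpers for a Unix shell. This is a sketch under stated assumptions: append_log and overwrite_log are hypothetical names, and also capturing error output with 2>&1 is an addition that the examples above do not use.

```shell
#!/bin/sh
# Run a command and append its output (stdout and stderr) to a log file.
append_log() {
    log_file=$1
    shift
    "$@" >> "$log_file" 2>&1
}

# Run a command and replace the log file with its output (stdout and stderr).
overwrite_log() {
    log_file=$1
    shift
    "$@" > "$log_file" 2>&1
}

# Example call, assuming the Unix launcher is named pan.sh:
# append_log /LOG/trans.log ./pan.sh -file="/PRD/trans.ktr" -level=Minimal
```

Because the command is passed as separate arguments and expanded with "$@", quoted option values that contain spaces stay intact.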

Return codes

Pan returns an error code based on how the execution went:

  • 0 : The transformation ran without a problem.
  • 1 : Errors occurred during processing
  • 2 : An unexpected error occurred during loading / running of the transformation
  • 3 : Unable to prepare and initialize this transformation
  • 7 : The transformation couldn't be loaded from XML or the Repository
  • 8 : Error loading steps or plugins (error in loading one of the plugins mostly)
  • 9 : Command line usage printing
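
A calling script can translate these return codes into readable messages. The following is a minimal sketch for a Unix shell; pan_exit_message and run_and_report are hypothetical helper names, and the messages are copied from the list above.

```shell
#!/bin/sh
# Map a Pan return code to the description given on this page.
pan_exit_message() {
    case $1 in
        0) echo "The transformation ran without a problem" ;;
        1) echo "Errors occurred during processing" ;;
        2) echo "An unexpected error occurred during loading / running of the transformation" ;;
        3) echo "Unable to prepare and initialize this transformation" ;;
        7) echo "The transformation couldn't be loaded from XML or the Repository" ;;
        8) echo "Error loading steps or plugins" ;;
        9) echo "Command line usage printing" ;;
        *) echo "Unknown return code" ;;
    esac
}

# Run any command, report its return code in words on stderr,
# and hand the same code back to the caller.
run_and_report() {
    code=0
    "$@" || code=$?
    echo "Return code $code: $(pan_exit_message "$code")" >&2
    return "$code"
}

# Example call, assuming the Unix launcher is named pan.sh:
# run_and_report ./pan.sh -file="/PRD/trans.ktr" -level=Minimal
```

Passing the code back unchanged lets a scheduler or a parent script still react to a failed run.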


Schedule a transformation on Windows

The best approach is to test the command at the command prompt first. You can then use the Windows Task Scheduler to launch it. Windows versions since Windows 2000 provide a GUI for this, accessible through the Control Panel. However, it is also possible to do this from the command line:

Code Block
at 23:30 /every:Monday,Wednesday,Friday "D:\update_dimensions.bat"

To see a list of the scheduled commands simply type:

Code Block

Schedule a transformation on Unix

First create a shell script that runs all the transformations you need. Then you can schedule
this script to run.
On Unix-like systems the easiest way to schedule a command is by using the "cron table".
You can do this by entering the following command:

Code Block
crontab -e

Then, in the text file that is presented, enter the time at which the command needs to run,
followed by the command itself, on a single line.
The first five fields are:

  • Minute: The minute of the hour, 0-59
  • Hour: The hour of the day, 0-23
  • Month day: The day of the month, 1-31
  • Month: The month of the year, 1-12
  • Weekday: The day of the week, 0-6, 0=Sunday

You can specify more than one number for each of these values. Two numbers separated by a
hyphen (-) denote an inclusive range. Numbers separated by commas (,) denote distinct
values. If you use * instead of a number, it means every possible minute, hour, day,
month or weekday.
So, if you want to update the dimensions every hour, at 15 and 45 minutes past the hour,
on weekdays only, you might enter these lines in a crontab:

Code Block
# Launches the update of the dimensions in the warehouse
15,45 * * * 1-5 /PROD/
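
The list, range and * rules described above can be sketched as a small shell function that tests whether a value matches one crontab field. This is an illustration only, not how cron itself is implemented: cron_field_matches is a hypothetical name, and the sketch handles only the three forms mentioned on this page.

```shell
#!/bin/sh
# Succeeds if VALUE matches FIELD, where FIELD is "*", a single number,
# a comma-separated list, an inclusive range "low-high", or a mix of those.
# The body runs in a subshell so the changed IFS does not leak to the caller.
cron_field_matches() (
    field=$1
    value=$2
    if [ "$field" = "*" ]; then exit 0; fi
    set -f          # no filename expansion while splitting the field
    IFS=,
    for part in $field; do
        case $part in
            *-*)
                low=${part%-*}
                high=${part#*-}
                if [ "$value" -ge "$low" ] && [ "$value" -le "$high" ]; then
                    exit 0
                fi
                ;;
            *)
                if [ "$value" -eq "$part" ]; then exit 0; fi
                ;;
        esac
    done
    exit 1
)
```

With the schedule from the example, cron_field_matches "15,45" 45 and cron_field_matches "1-5" 3 both succeed, while cron_field_matches "1-5" 0 fails because Sunday (0) is outside the Monday to Friday range.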