Execute Transformations in Batch Mode
What is Pan?
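Pan is a program that executes Transformations designed in Spoon and stored either in an XML file or in a database repository. Transformations are usually scheduled in batch mode so that they run automatically at regular intervals.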
The first step is the installation of Sun Microsystems Java Runtime Environment version 1.4
or higher. You can download a JRE for free at http://www.java.com/.
After this, you can simply unzip the zip-file: Kettle-3.0.zip in a directory of your choice.
In the Kettle directory where you unzipped the file, you will find a number of files.
Under Unix-like environments (Solaris, Linux, OSX, ...) you will need to make the shell
scripts executable. Execute these commands to make all shell scripts in the Kettle directory
executable:
cd Kettle
chmod +x *.sh
Scripts are provided to run Pan on the different platforms:
- Pan.bat : run Pan on the Windows platform.
- pan.sh : run Pan on Unix platforms and OSX
Pan can be run on any platform on which Java Runtime Environment version 1.5 or higher is installed.
Below are the options that can be used on the command line.
- On Windows systems, the minus sign ("-") in options causes problems, as does the equal sign ("="). Because of this, from version 2.2.2 on, you can also write options with a slash and a colon, as in /level:Basic, or use any combination of / or - with : or =.
- Fields in italic represent the values that the options use.
- It's important that if spaces are present in the option values, you use quotes or double quotes to keep them together. Take a look at the examples below for more info.
Below are the valid options.
Display version information
This option displays the version of the Kettle core library (kettle.jar).
The build version number and build date are shown as well.
Run an XML file
This option runs the transformation defined in the XML file. (.ktr : Kettle Transformation)
Set the log file
Specifies the log file. The default is the standard output.
Set the log level
The level option sets the log level for the transformation that's being run.
These are the possible values:
- Error: Only show errors
- Nothing: Don't show any output
- Minimal: Only use minimal logging
- Basic: This is the default basic logging level
- Detailed: Give detailed logging output
- Debug: For debugging purposes, very detailed output.
- Rowlevel: Logging at a row level, this can generate a lot of data.
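A wrapper script can reject a misspelled level before Pan is launched. The following is a minimal sketch, not part of Pan itself: the function names `is_valid_level` and `choose_level` are hypothetical, and it assumes the level names must be spelled exactly as in the list above.

```shell
# Hypothetical helper: succeeds only for the level names listed above.
is_valid_level() {
    case "$1" in
        Error|Nothing|Minimal|Basic|Detailed|Debug|Rowlevel) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the requested level, or the default (Basic) when it is not recognized.
choose_level() {
    if is_valid_level "$1"; then
        printf '%s\n' "$1"
    else
        printf '%s\n' "Basic"
    fi
}
```

A script could then build its command line with something like -level="$(choose_level "$1")".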
Connect to the repository with name "Repository name".
You also need to specify the options --user, --pass and --trans.
You can also specify this option in the form of environment variable KETTLE_REPOSITORY.
Set the repository user
This is the username with which you want to connect to the repository.
You can also specify this option in the form of environment variable KETTLE_USER.
Set the repository password
The password to use to connect to the repository.
You can also specify this option in the form of environment variable KETTLE_PASSWORD.
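As an illustration of the three environment variables, a shell script might set them once before calling Pan. This is only a sketch: the values are reused from the repository example in this document, and the commented-out Pan call assumes that -trans and -dir are still given on the command line.

```shell
# Example values only; replace them with your own repository settings.
export KETTLE_REPOSITORY="Production Repository"
export KETTLE_USER="matt"
export KETTLE_PASSWORD="somepassword123"

# With the variables exported, the repository name, user and password
# no longer need to appear on the command line:
# ./pan.sh -trans="update Customer Dimension" -dir=/Dimensions/ -level=Basic
```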
Select the transformation to run from the repository
Use this option to select the transformation to run from the repository
List the directories in the repository
Print a listing of all the sub-directories in the repository directory specified with the option "-dir".
Set the repository directory
Specifies the directory in the repository to use. Repository directories are specified like this:
- The root directory: /
- A subdirectory: /production/Dimensions/
From version 2.2.2 on, a / (slash) is used to separate directories on all platforms.
List the transformations in the repository
Show a list of all the transformations in the repository directory specified with the option "-dir".
List the available repositories
Print a listing of all the defined repositories.
Export the complete repository
This option exports the complete repository to a single XML file.
To restore this file to a repository, please use the Repository Explorer in Spoon.
See the documentation of Spoon for more information.
Do not log in to the repository
If you have set the environment variables KETTLE_REPOSITORY, KETTLE_USER and KETTLE_PASSWORD, you can use this option to prevent Pan from logging into the repository, for example when you want to launch a transformation from an XML file.
Please make sure that you are positioned in the Kettle directory before running the samples
below. If you put these scripts into a batch file or shell script, simply do a change directory to
the installation directory:
If Kettle was installed on Windows on the D:\ drive:
D:
cd \Kettle
If Kettle was installed in the /product directory on a Unix system:
Run a transformation stored in a file
This example runs a transformation from file on a Windows platform:
pan.bat /file:"D:\Transformations\Customer Dimension.ktr" /level:Basic
This example runs a transformation from file on a Linux box:
pan.sh -file="/PRD/Customer Dimension.ktr" -level=Minimal
Run a transformation stored in the repository
This example runs a transformation from the repository on a Windows platform: (Enter on a single line without returns...)
pan.bat /rep:"Production Repository" /trans:"update Customer Dimension" /dir:/Dimensions/ /user:matt /pass:somepassword123 /level:Basic
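For comparison, the same repository run on a Unix-like system might look like the sketch below. It assumes that the -option=value form shown in the Linux file example applies to the repository options as well. The `PAN` variable and the function name `run_customer_dimension` are hypothetical, introduced so that the command can be pointed at a stub for a dry run.

```shell
# PAN defaults to pan.sh in the current (Kettle) directory; override it for a dry run.
PAN="${PAN:-./pan.sh}"

# Hypothetical wrapper around the repository example above.
run_customer_dimension() {
    "$PAN" -rep="Production Repository" \
           -trans="update Customer Dimension" \
           -dir=/Dimensions/ \
           -user=matt \
           -pass=somepassword123 \
           -level=Basic
}
```

The backslashes only continue the command over several lines in the script; Pan still receives it as a single command line, and the quotes keep the values that contain spaces together.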
If you want the Pan output to be written to a log file rather than appear on the screen, you can use redirection.
This example adds the Pan output to an ever-growing log file:
pan.sh -file="/PRD/trans.ktr" -level=Minimal >> /LOG/trans.log
This example writes the Pan output to a file that gets overwritten every time:
pan.bat /file:C:\PRD\trans.ktr /level:Basic > C:\LOG\trans.log
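On Unix-like systems the appending form can be wrapped in a small helper that also captures anything written to standard error and marks the start of each run. This is a sketch under assumptions: the function name `run_logged` and the timestamp format are hypothetical and not part of Pan.

```shell
# Hypothetical helper: run a command and append all of its output to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    logfile=$1
    shift
    # Mark the start of each run so entries in the growing file can be told apart.
    printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')" >> "$logfile"
    # 2>&1 sends standard error to the same file as standard output.
    # The function returns the exit status of the command it ran.
    "$@" >> "$logfile" 2>&1
}

# Example call:
# run_logged /LOG/trans.log ./pan.sh -file="/PRD/trans.ktr" -level=Minimal
```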
Pan returns an error code based on how the execution went:
- 0 : The transformation ran without a problem.
- 1 : Errors occurred during processing
- 2 : An unexpected error occurred during loading / running of the transformation
- 3 : Unable to prepare and initialize this transformation
- 7 : The transformation couldn't be loaded from XML or the Repository
- 8 : Error loading steps or plugins (error in loading one of the plugins mostly)
- 9 : Command line usage printing
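In a shell script the return code is available in $? right after Pan finishes. The sketch below maps the codes listed above to their descriptions; the function name `describe_pan_exit` is hypothetical, and the fallback message for unlisted codes is an addition for illustration.

```shell
# Hypothetical helper: translate a Pan return code into the description listed above.
describe_pan_exit() {
    case "$1" in
        0) echo "The transformation ran without a problem" ;;
        1) echo "Errors occurred during processing" ;;
        2) echo "An unexpected error occurred during loading / running of the transformation" ;;
        3) echo "Unable to prepare and initialize this transformation" ;;
        7) echo "The transformation couldn't be loaded from XML or the Repository" ;;
        8) echo "Error loading steps or plugins" ;;
        9) echo "Command line usage printing" ;;
        *) echo "Unknown return code: $1" ;;
    esac
}

# Example use after a run:
# ./pan.sh -file="/PRD/trans.ktr" -level=Minimal
# describe_pan_exit $?
```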
Scheduling transformations on Windows
The best approach is to test the command first at the DOS prompt.
Then you can use the Windows scheduler to launch this command.
Windows versions since Windows 2000 have a GUI for doing this accessible through the
control panel. However it's also possible to use the command line to do this:
at 23:30 /every:Monday,Wednesday,Friday "D:\update_dimensions.bat"
To see a list of the scheduled commands simply type:
Scheduling transformations on Unix
First create a shell script that runs all the transformations you need. Then you can schedule
this script to run.
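Such a script might look like the following sketch. It assumes the script is started from the Kettle directory and that it should stop at the first transformation that returns a non-zero code. The `PAN` variable and the function name `run_transformations` are hypothetical.

```shell
# PAN defaults to pan.sh in the current (Kettle) directory.
PAN="${PAN:-./pan.sh}"

# Run each transformation file in turn; stop at the first non-zero return code
# and pass that code on to the caller.
run_transformations() {
    for ktr in "$@"; do
        "$PAN" -file="$ktr" -level=Minimal || return $?
    done
}

# Example call, using file names from the examples in this document:
# run_transformations "/PRD/Customer Dimension.ktr" "/PRD/trans.ktr"
```

Stopping at the first failure is a design choice; a script that should attempt every transformation regardless could record the return code and continue instead.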
On Unix-like systems the easiest way to schedule a command is by using the "cron table".
You can do this by entering the following command:
Then you can enter the time at which the command needs to be run as well as the command
on a single line in the text file that is presented.
The first five fields on the line specify the time:
- Minute: The minute of the hour, 0-59
- Hour: The hour of the day, 0-23
- Month day: The day of the month, 1-31
- Month: The month of the year, 1-12
- Weekday: The day of the week, 0-6, 0=Sunday
You can specify more than one number for each of these values. Separating two numbers with a
hyphen (-) means an inclusive number range. Separating numbers with commas (,) means
distinct values. If you use * instead of a number, it means every possible minute, hour,
day, month or weekday.
So, if you want to update the dimensions every hour, at 15 and 45 minutes past the hour
during the weekdays, you might enter these lines in a crontab:
#
# Launches the update of the dimensions in the warehouse
#
15,45 * * * 1-5 /PROD/update_dimensions.sh
#
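To make the field rules concrete, the sketch below checks whether a single value is selected by one crontab field. It is an illustration only, not how cron itself is implemented: the function name `cron_field_matches` is hypothetical, and it handles only the forms described here (*, single numbers, comma-separated values and hyphenated ranges).

```shell
# Hypothetical helper: succeed if VALUE is selected by a crontab FIELD.
# Usage: cron_field_matches FIELD VALUE
cron_field_matches() {
    field=$1
    value=$2
    if [ "$field" = "*" ]; then
        return 0                      # "*" means every possible value
    fi
    old_ifs=$IFS
    IFS=,
    set -f                            # no filename expansion while splitting
    set -- $field                     # split the field on commas
    set +f
    IFS=$old_ifs
    for part in "$@"; do
        case "$part" in
            *-*)                      # inclusive range such as 1-5
                low=${part%-*}
                high=${part#*-}
                if [ "$value" -ge "$low" ] && [ "$value" -le "$high" ]; then
                    return 0
                fi
                ;;
            *)                        # single distinct value
                if [ "$value" -eq "$part" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}
```

For the entry above, cron_field_matches "15,45" 45 and cron_field_matches "1-5" 3 both succeed, while cron_field_matches "1-5" 0 fails, so the script runs at 45 minutes past the hour on a Wednesday but not on a Sunday.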