Hitachi Vantara Pentaho Community Wiki
Child pages
  • MySQL Bulk Loader
Skip to end of metadata
Go to start of metadata

Description

The MySQL Bulk Loader streams data from inside Kettle to a named pipe using "LOAD DATA INFILE 'FIFO File' INTO TABLE ...." into the database.

Further information about the used command can be found in the MySQL reference: LOAD DATA INFILE syntax

Note: This step actually does not work within a Windows system due to the use of mkfifo for named pipes. If you like, check out to use the GNU Core Utilities for Windows (but this is not officially supported).

Options

Option

Description

Step name

Name of the step.

Connection

Name of the database connection on which the target table resides.

Target schema

The name of the Schema for the table to write data to. This is important for data sources that allow for table names with dots '.' in it.

Target table

Name of the target table.

Fifo file

This is the fifo file used as a named pipe. When it does not exist it will be created with the command mkfifo and chmod 666 (this is the reason why it is not working in Windows).

Delimiter

Delimiter for the fields. (If not given, the tabulator is default.)

Enclosure

The enclosure used for strings.

Escape character

If the enclosure is in the field, it is escaped by the escape character.

Character set

The used character set (optional).

Bulk size (rows)

To split the data loads in chunks of data after which the data load will be restarted.

Use replace clause?

If checked, "REPLACE" is added to the command. If you specify, input rows replace existing rows. In other words, rows that have the same value for a primary key or unique index as an existing row. More details can be found in the MySQL reference.

Use ignore clause?

If checked, "IGNORE" is added to the command. If you specify, input rows that duplicate an existing row on a unique key value are skipped. More details can be found in the MySQL reference.

Fields to load

With the "Get fields" button you can load the fields from the stream. With the "Edit mapping" button you can edit the mapping from the stream to the table fields when the table already exists.

Option

Description

Table field

Name of the field in the table.

Stream field

Name of the field in the stream.

Field format OK?

You can decide if the format should be kept (Don't change formatting) or changed:

  • Format as Date (yyyy-MM-dd)
  • Format as a timestamp (yyyy-MM-dd HH:mm:ss)
  • Format as Number (grouping symbol is "," - decimal is ".")
  • Escape enclosure characters when found

Metadata Injection Support (7.x and later)

All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.

  • No labels

1 Comment

  1. Format as number, what exactly does this do?

    If you have a number created with no precision specifically defined I usually expect that to mean maximum precision.

    However with don't change formatting it seems that it applies the default number mask which is #.# so the mysql table only gets 1dp.

    the workaround is to apply a format in a select values step - but that seems unnecessary. I thought the format was only ever for conversion or display purposes?