Hitachi Vantara Pentaho Community Wiki
Job checkpoints and restartability


...

The checkpoint log table contains all the fields needed to track restart behavior. For example, if we have already tried 4 times to update the data warehouse and failed each time, this is what we will find:

Code Block

select ID_JOB_RUN, JOBNAME, NAMESPACE, CHECKPOINT_NAME, ATTEMPT_NR, JOB_RUN_START_DATE, LOGDATE from LOG_CHECKPOINT;
+------------+---------------------+-----------+------------------+------------+---------------------+---------------------+
| ID_JOB_RUN | JOBNAME             | NAMESPACE | CHECKPOINT_NAME  | ATTEMPT_NR | JOB_RUN_START_DATE  | LOGDATE             |
+------------+---------------------+-----------+------------------+------------+---------------------+---------------------+
|          4 | Load data warehouse | -         | Load source data |          4 | 2012-08-30 18:05:13 | 2012-08-30 18:17:10 |
+------------+---------------------+-----------+------------------+------------+---------------------+---------------------+
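The same table can be queried to list jobs that still have a checkpoint recorded, meaning a run that has not yet completed. The following is a minimal sketch, assuming an in-memory SQLite table as a stand-in for the real logging database; the column types and the helper name `pending_restarts` are illustrative assumptions, while the column names follow the query above.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the real checkpoint
# log table. Column names follow the query above; column types are guesses.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE LOG_CHECKPOINT (
        ID_JOB_RUN INTEGER,
        JOBNAME TEXT,
        NAMESPACE TEXT,
        CHECKPOINT_NAME TEXT,
        ATTEMPT_NR INTEGER,
        JOB_RUN_START_DATE TEXT,
        LOGDATE TEXT
    )
    """
)
# The row shown in the example output above.
conn.execute(
    "INSERT INTO LOG_CHECKPOINT VALUES (?, ?, ?, ?, ?, ?, ?)",
    (4, "Load data warehouse", "-", "Load source data", 4,
     "2012-08-30 18:05:13", "2012-08-30 18:17:10"),
)


def pending_restarts(connection):
    """Return (job name, checkpoint, attempts) for rows with a checkpoint set."""
    cursor = connection.execute(
        "SELECT JOBNAME, CHECKPOINT_NAME, ATTEMPT_NR "
        "FROM LOG_CHECKPOINT "
        "WHERE CHECKPOINT_NAME IS NOT NULL "
        "ORDER BY JOBNAME"
    )
    return cursor.fetchall()


pending = pending_restarts(conn)
for jobname, checkpoint, attempts in pending:
    print(f"{jobname}: last checkpoint '{checkpoint}' after {attempts} attempt(s)")
```

Against a real logging database, only the connection setup would differ; the `IS NOT NULL` filter relies on the checkpoint name being cleared once a job finishes.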

...

Now the checkpoint log table will contain the following:

Code Block
select ID_JOB_RUN, JOBNAME, NAMESPACE, CHECKPOINT_NAME, ATTEMPT_NR, JOB_RUN_START_DATE, LOGDATE from LOG_CHECKPOINT;
+------------+---------------------+-----------+-------------------+------------+---------------------+---------------------+
| ID_JOB_RUN | JOBNAME             | NAMESPACE | CHECKPOINT_NAME   | ATTEMPT_NR | JOB_RUN_START_DATE  | LOGDATE             |
+------------+---------------------+-----------+-------------------+------------+---------------------+---------------------+
|          4 | Load data warehouse | -         | Update dimensions |          7 | 2012-08-30 18:05:13 | 2012-08-30 21:43:45 |
+------------+---------------------+-----------+-------------------+------------+---------------------+---------------------+

...

Once that happens, the checkpoint name field in the logging table is cleared:

Code Block

select ID_JOB_RUN, JOBNAME, NAMESPACE, CHECKPOINT_NAME, ATTEMPT_NR, JOB_RUN_START_DATE, LOGDATE from LOG_CHECKPOINT;
+------------+---------------------+-----------+-----------------+------------+---------------------+---------------------+
| ID_JOB_RUN | JOBNAME             | NAMESPACE | CHECKPOINT_NAME | ATTEMPT_NR | JOB_RUN_START_DATE  | LOGDATE             |
+------------+---------------------+-----------+-----------------+------------+---------------------+---------------------+
|          4 | Load data warehouse | -         | NULL            |          8 | 2012-08-30 18:05:13 | 2012-08-30 22:35:04 |
+------------+---------------------+-----------+-----------------+------------+---------------------+---------------------+

If we then run the complete job again, it starts from the beginning with a new run ID (ID_JOB_RUN):

Code Block

select ID_JOB_RUN, JOBNAME, NAMESPACE, CHECKPOINT_NAME, ATTEMPT_NR, JOB_RUN_START_DATE, LOGDATE from LOG_CHECKPOINT;
+------------+---------------------+-----------+-----------------+------------+---------------------+---------------------+
| ID_JOB_RUN | JOBNAME             | NAMESPACE | CHECKPOINT_NAME | ATTEMPT_NR | JOB_RUN_START_DATE  | LOGDATE             |
+------------+---------------------+-----------+-----------------+------------+---------------------+---------------------+
|          4 | Load data warehouse | -         | NULL            |          8 | 2012-08-30 18:05:13 | 2012-08-30 22:35:04 |
|          5 | Load data warehouse | -         | NULL            |          1 | 2012-08-31 00:08:13 | 2012-08-31 00:08:17 |
+------------+---------------------+-----------+-----------------+------------+---------------------+---------------------+

...

When you execute a job using Kitchen, you can use the following option to ignore the last reached checkpoint and force the job to start at the "Start" job entry:

Code Block

-custom:IgnoreCheckpoints=true
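For scripted runs, the option can be appended to the Kitchen argument list. This is only a sketch: the helper `build_kitchen_command`, the `./kitchen.sh` location, and the job file path are hypothetical, and the command is built but not executed.

```python
def build_kitchen_command(kitchen_path, job_file, ignore_checkpoints=False):
    """Build the argument list for a Kitchen run of a job file."""
    command = [kitchen_path, f"-file={job_file}"]
    if ignore_checkpoints:
        # Option from the text above: skip the last reached checkpoint
        # and start at the "Start" job entry.
        command.append("-custom:IgnoreCheckpoints=true")
    return command


# Hypothetical paths, for illustration only.
command = build_kitchen_command(
    "./kitchen.sh",
    "/path/to/load_data_warehouse.kjb",
    ignore_checkpoints=True,
)
print(" ".join(command))
# To actually launch it, pass the list to subprocess.run(command, check=True).
```

Keeping the arguments in a list avoids shell quoting problems when the job file path contains spaces.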

...

To clear out all checkpoints for all jobs:

Code Block

update LOG_CHECKPOINT set CHECKPOINT_NAME=null;

To clear the checkpoint for a specific job:

Code Block

update LOG_CHECKPOINT set CHECKPOINT_NAME=null where JOBNAME = 'Load data warehouse';
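When the job name comes from a script or user input, a parameterized statement is safer than pasting the name into the SQL text. The sketch below assumes an in-memory SQLite table with a reduced set of columns; the helper `clear_checkpoint` and the second sample row are hypothetical, and the job-name column is taken to be JOBNAME, as in the query output shown earlier.

```python
import sqlite3


def clear_checkpoint(connection, jobname):
    """Clear the stored checkpoint for one job; return the number of rows updated."""
    cursor = connection.execute(
        "UPDATE LOG_CHECKPOINT SET CHECKPOINT_NAME = NULL WHERE JOBNAME = ?",
        (jobname,),
    )
    connection.commit()
    return cursor.rowcount


# Assumption: a reduced in-memory stand-in for the checkpoint log table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LOG_CHECKPOINT ("
    "ID_JOB_RUN INTEGER, JOBNAME TEXT, CHECKPOINT_NAME TEXT, ATTEMPT_NR INTEGER)"
)
conn.executemany(
    "INSERT INTO LOG_CHECKPOINT VALUES (?, ?, ?, ?)",
    [
        (4, "Load data warehouse", "Update dimensions", 7),
        (9, "Other job", "Some checkpoint", 2),  # hypothetical second job
    ],
)

updated = clear_checkpoint(conn, "Load data warehouse")
remaining = conn.execute(
    "SELECT JOBNAME, CHECKPOINT_NAME FROM LOG_CHECKPOINT ORDER BY ID_JOB_RUN"
).fetchall()
print(updated, remaining)
```

Only the named job loses its checkpoint; other jobs keep theirs and will still resume from where they stopped.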

Internal Variables

In a job, internal variables are automatically set with respect to checkpoints:

...