Introduction
In versions 3.0.3 and 3.1.0-M1, we added the ability to launch job entries in parallel. This makes it easier to fire off jobs and transformations in parallel on the same machine or even remotely.
Enabling parallel execution
You can ask a job entry to launch the next job entries in parallel. To do so, right-click the job entry and select the "Launch next entries in parallel" option.
Once selected, the arrows to the next job entries will be shown in dashed lines and a check-box will appear next to the entry in the pop-up menu.
Functionality
All the job entries that directly follow the one where you enabled the "launch in parallel" option are executed in parallel. Because the entries further downstream depend on those entries, they are executed in parallel as well, as a consequence of the backtracking algorithm that the job execution engine uses.
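The behaviour can be pictured with a small Python sketch. It is only an analogy, not Kettle code: the entry names, `run_branch`, and `launch_next_in_parallel` are hypothetical, and worker threads stand in for whatever the job engine does internally. The point it illustrates is that each branch started after the parallel entry carries its own downstream entries along with it.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def run_branch(entries, log, lock):
    """Run one branch's entries in order, inside a single worker thread."""
    for entry in entries:
        with lock:
            log.append((entry, threading.current_thread().name))
    return entries


def launch_next_in_parallel(branches):
    """Submit every branch to the pool; each branch runs its followers itself."""
    log, lock = [], threading.Lock()
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = [pool.submit(run_branch, b, log, lock) for b in branches]
        for future in futures:
            future.result()  # re-raise any error from a branch
    return log


if __name__ == "__main__":
    # Hypothetical job: START -> (load_a -> index_a) and (load_b -> index_b)
    branches = [["load_a", "index_a"], ["load_b", "index_b"]]
    for entry, thread in launch_next_in_parallel(branches):
        print(entry, "ran in", thread)
```

Within a branch the order is preserved, but nothing in this model brings the branches back together, which is what the limitation below is about.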
Limitation
The execution model described above makes it harder to execute a certain number of job entries in parallel and then simply continue with something else in sequence.
To do this, we suggest you wrap up the parallel work in a separate Job.
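A sketch of that workaround, written as a plain Python analogy rather than Kettle code. It assumes that the parent job waits for a sub-job entry to finish before moving on to the next entry; `parallel_sub_job`, `parent_job`, and the entry names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_sub_job(entries, log):
    """Stand-in for a separate job whose entries all run in parallel.

    Returns only after every entry has finished.
    """
    with ThreadPoolExecutor(max_workers=len(entries)) as pool:
        # list() drains the iterator, so an error in any worker surfaces here
        list(pool.map(log.append, entries))


def parent_job():
    """Stand-in for the parent job: sequential, with one parallel sub-job."""
    log = ["start"]
    parallel_sub_job(["load_a", "load_b", "load_c"], log)  # parallel part
    log.append("report")  # runs in sequence, after the sub-job has returned
    return log


if __name__ == "__main__":
    print(parent_job())
```

The order of the three `load_*` entries in the log may vary from run to run, but `report` always comes last because the sub-job acts as the point where the parallel work is joined.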
Warning
The following warning will be shown when you enable the feature:
7 Comments
user-3fa97
I get a page full of errors when I change my job to run in parallel like this. What could be causing this?
2008/06/09 13:42:14 - kettle - ERROR (version 3.0.3, build 569 from 2008/04/25 12:30:04) : ERROR executing query: org.pentaho.di.core.exception.KettleDatabaseException:
2008/06/09 13:42:14 - kettle - ERROR (version 3.0.3, build 569 from 2008/04/25 12:30:04) : offending row : [value String], [key Integer]
2008/06/09 13:42:14 - kettle - ERROR (version 3.0.3, build 569 from 2008/04/25 12:30:04) :
2008/06/09 13:42:14 - kettle - ERROR (version 3.0.3, build 569 from 2008/04/25 12:30:04) : Error setting value #1 [value String] on prepared statement (String)
2008/06/09 13:42:14 - kettle - ERROR (version 3.0.3, build 569 from 2008/04/25 12:30:04) : java.sql.SQLException: PreparedStatement has been closed. No further operations allowed.
2008/06/09 13:42:14 - kettle - ERROR (version 3.0.3, build 569 from 2008/04/25 12:30:04) : PreparedStatement has been closed. No further operations allowed.
2008/06/09 13:42:14 - kettle - ERROR (version 3.0.3, build 569 from 2008/04/25 12:30:04) : ERROR in part: P Set values
Jens Bleuel
Mark,
if the problem still persists in 3.1 GA, please file a bug report in JIRA or discuss it on the forum (better than commenting here).
Thanks,
Jens
user-dc351
Mark, this error is caused by a concurrency problem in the repository handler. As Jens suspected, it has been fixed in the meantime.
Jens Bleuel
Just for the record: it is fixed in 3.1.1 by http://jira.pentaho.com/browse/PDI-1756
user-837c9
Where can we find 3.1.1? Only 3.1.0-826 is available on SourceForge.
In this version, 3.1.0-826, the problem still exists:
INFO 06-03 12:16:51,537 - End_Load_GTI - Starting entry [alim_evt_archi_ton_quot]
INFO 06-03 12:16:51,538 - End_Load_GTI - Launched job entry [alim_evt_archi_ton_quot] in parallel.
INFO 06-03 12:16:51,539 - End_Load_GTI - Starting entry [alim_cgo_quot]
INFO 06-03 12:16:51,541 - End_Load_GTI - Launched job entry [alim_cgo_quot] in parallel.
ERROR 06-03 12:16:51,609 - alim_cgo_quot - Error running job entry 'job' : org.pentaho.di.core.exception.KettleException:
An error occurred reading job [alim_cgo_quot] from the repository
Jens Bleuel
> Where can we find 3.1.1 ?
Fix releases are available to customers and are also covered by the Kettle Developer Support. Please see http://www.pentaho.com/products/buy_bi_suite.php
user-57bde
When 'Launch next entries in parallel' is checked just prior to a subjob that executes once for each input row, does the subjob execute in parallel or only in a single thread?