Hitachi Vantara Pentaho Community Wiki
Salesforce Input

PLEASE NOTE: This documentation applies to Pentaho 8.0 and earlier. For Pentaho 8.1 and later, see Salesforce Input on the Pentaho Enterprise Edition documentation site.


The Salesforce Input step reads data directly from Salesforce using the Salesforce Web Service. The following sections describe each of the options available for configuring the step.

Settings Tab

The Settings tab is where you configure the Salesforce Webservice URL, the login credentials, the module to query, and the query conditions:



Salesforce Webservice URL

This is the URL to the Salesforce Webservice.
Note: This URL is dependent on the API version you are using:

  • PDI 3.2.x uses API version 10.0. 
  • PDI 4.3 through 6.1 use API version 24.0.
  • PDI 7.0 and later versions use API version 37.0.
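
To make the version dependency concrete, here is a minimal Python sketch that maps a PDI version to the API version listed above and builds a matching URL. The function names are hypothetical, and the URL pattern (https://login.salesforce.com/services/Soap/u/<version>) is an assumption based on the usual Salesforce partner SOAP endpoint layout, so compare it with the default shown in your own step dialog.

```python
def api_version_for_pdi(pdi_version: str) -> str:
    """Return the Salesforce API version listed above for a PDI version string."""
    parts = pdi_version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    version = (major, minor)

    if version == (3, 2):
        return "10.0"
    if (4, 3) <= version <= (6, 1):
        return "24.0"
    if version >= (7, 0):
        return "37.0"
    # The list above does not cover this PDI version.
    raise ValueError(f"No API version is listed for PDI {pdi_version}")


def webservice_url(api_version: str, host: str = "login.salesforce.com") -> str:
    """Build a Webservice URL (assumed partner SOAP endpoint pattern)."""
    return f"https://{host}/services/Soap/u/{api_version}"


print(webservice_url(api_version_for_pdi("7.1")))
```

Versions that fall outside the three listed ranges raise an error rather than guessing an API version.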


Username

Username for authenticating to Salesforce.


Password

Password for authenticating to Salesforce.  Enter your password followed by your security token.  If your password is 'PASSWORD' and your security token is 'TOKEN', enter 'PASSWORDTOKEN' in this field.
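
The value expected in the password field can be sketched in Python as a plain concatenation with no separator. The function name is hypothetical and only illustrates the rule above.

```python
def login_password(password: str, security_token: str = "") -> str:
    """Join the password and security token with no separator in between."""
    return password + security_token


print(login_password("PASSWORD", "TOKEN"))  # prints PASSWORDTOKEN
```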


Module

Select the module you wish to retrieve data from.  Note: This list is populated once you have authenticated to Salesforce successfully using the Test Connection button.

Query Condition

Enter any query filters you wish to apply.  For example, 'fieldname=myvalue AND fieldname2=myvalue2...'
Note: You do NOT need to include WHERE in your condition statement. Variables are supported in this field.
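
As a rough illustration of why WHERE is not needed, the Python sketch below assembles a query from a module, a field list, and a condition. How the step builds its query internally is an assumption here; build_query is a hypothetical helper, and the module and field names are only examples. Stripping a leading WHERE is this sketch's own defensive choice, not documented step behavior.

```python
import re
from typing import Sequence


def build_query(module: str, fields: Sequence[str], condition: str = "") -> str:
    """Assemble a SELECT statement; the WHERE keyword is added for the caller."""
    query = f"SELECT {', '.join(fields)} FROM {module}"
    condition = condition.strip()
    # Drop a leading WHERE that was typed by mistake, so it is not doubled.
    condition = re.sub(r"^where\s+", "", condition, flags=re.IGNORECASE)
    if condition:
        query += f" WHERE {condition}"
    return query


print(build_query("Account", ["Id", "Name"], "Name='Acme' AND Industry='Energy'"))
```

An empty condition produces a query with no WHERE clause at all.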


Content Tab

The Content tab allows you to optionally include additional descriptive fields in the result set.



Include URL in Output?

Enable to add a field to the output containing the URL used to retrieve the data.

Include Module in output?

Enable to add a field to the output containing the module from which the data was retrieved.

Include SQL in output?

Enable to add a field to the output containing the SQL used to generate the result set.

Include timestamp in output?

Enable to add a field to the output containing the timestamp for when the record was retrieved.

Include Rownum in output?

Enable to add a field to the output containing a row number for each record retrieved.

Time out

Configure the timeout interval in milliseconds before the step times out.


Limit

Configure the maximum number of records to retrieve.  Note: Setting this to '0' means there is no limit on the number of records that can be retrieved.
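
The meaning of '0' for the record limit can be sketched as follows. This is a Python illustration of the semantics only, not the step's implementation, and apply_limit is a hypothetical name.

```python
from itertools import islice
from typing import Iterable, Iterator


def apply_limit(records: Iterable[dict], limit: int = 0) -> Iterator[dict]:
    """Yield at most `limit` records; a limit of 0 means no limit."""
    if limit < 0:
        raise ValueError("limit must be 0 (no limit) or a positive number")
    if limit == 0:
        return iter(records)
    return islice(records, limit)


rows = [{"Id": "1"}, {"Id": "2"}, {"Id": "3"}]
print(len(list(apply_limit(rows, 0))))  # all three rows
print(len(list(apply_limit(rows, 2))))  # first two rows only
```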


Fields Tab

The Fields tab displays the fields that will be read from the Salesforce object (the Module chosen on the Settings tab). Press the "Get Fields" button on this tab to populate the field list before you preview the rows returned.



  1. user-dac57

    Can we route the Salesforce connection through a proxy server?

    If yes, how can we do it?

  2. user-837c9

    This can be done by adding HTTP proxy properties to your JVM launcher: -Dhttp.proxyHost=<proxy server> -Dhttp.proxyPort=<proxy port> -Dhttp.proxyUser=<user> -Dhttp.proxyPassword=<pwd>

    For example, in the Spoon.bat file:

    set OPT=-Xmx256m -cp %CLASSPATH% -Djava.library.path=libswt\win32\ -DKETTLE_HOME="%KETTLE_HOME%" -DKETTLE_REPOSITORY="%KETTLE_REPOSITORY%" -DKETTLE_USER="%KETTLE_USER%" -DKETTLE_PASSWORD="%KETTLE_PASSWORD%" -DKETTLE_PLUGIN_PACKAGES="%KETTLE_PLUGIN_PACKAGES%" -DKETTLE_LOG_SIZE_LIMIT="%KETTLE_LOG_SIZE_LIMIT%" -Dhttp.proxyHost=<proxy server> -Dhttp.proxyPort=<proxy port>  -Dhttp.proxyUser=<user>  -Dhttp.proxyPassword=<pwd>

    I tested it: it works!

  3. user-d4dc7


    I'm currently trying to implement an update mechanism for our SalesForce-Pentaho-Connection. The problem is: SalesForce does not deliver results in any order... both IDs and timestamps seem to be extracted randomly. That's why I can't use a query condition like "CreatedDate > (select max of all those records which have already been extracted)".

    So: Is there any possibility to order the result set by a certain column when using the SalesForce Input? Something like "order by" in the standard SQL?

    If it helps, the idea of the job I'm trying to implement is: "get maximum date or id from extracted data" -> "set as environment variable" -> "SalesForce Input with this variable as a query condition" -> "store new results in database". The job itself works fine, but if I get a 2013 timestamp before all 2012 data is extracted, there seems to be no way to get this data into the database. I also tried just getting everything without any condition but well... this can't be a good solution (especially considering timeouts etc).

    Looking forward to responses - thanks in advance!



  4. user-f14cc

    Hi there,

    From SFDC input, I want to take only FIELD instead of NAME to the next step (transformation).

    Please suggest...



  5. user-5e79c

    I am trying to create a generic transform so that I can pass parameters including SOQL for different tables and get the data and write to text file. But even when I have parametrized it the getfields option needs to be manually clicked for populating the fields when the query changes, which defeats the purpose. Is there a way to make the SFDC input generic for different tables in sfdc and use same transform for all tables by calling it with different parameters?

    Any help is appreciated. 

  6. user-f1c37

    Hi Everyone,

    Does anyone know if there is a plug-in for Microsoft's NAVISION ERP in Pentaho/Kettle? I am trying to achieve integration between NAVISION and

    I need a plug-in for NAV to extract data out of the NAV system and then move it into Salesforce CRM. Please advise.