Hitachi Vantara Pentaho Community Wiki

Setting up the databases required by Pentaho 

Pentaho uses four databases, three of which are mandatory. These are:

Database    | Description                                                                                        | SQL Script     | Mandatory
------------|----------------------------------------------------------------------------------------------------|----------------|----------
Shark       | Used for bursting and similar tasks                                                                | shark.sql      | Yes
Quartz      | The main Pentaho scheduler (think of it as a cron-like tool that stores its jobs in the database)  | quartz.sql     | Yes
Hibernate   | The persistence layer used by Pentaho                                                              | hibernate.sql  | Yes
sampledata  | Not needed in production, but required for the demos and for seeing what is new in Pentaho         | sampledata.sql | No

Please note that the SQL files above REQUIRE editing before use, since the databases they create are owned by the tomcat user. So, if you do not have (or do not want to use) a PostgreSQL user named tomcat, you will have to replace every occurrence of tomcat with your own user before importing the files.
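If you would rather not edit the files by hand, the substitution can be scripted. A minimal sketch, assuming the four SQL files sit in the current directory and that your PostgreSQL user is called pentaho_user (an example name, not taken from the files):

```shell
# Replace every occurrence of the 'tomcat' role with your own PostgreSQL
# user (here 'pentaho_user' -- an example name) before importing the scripts.
# A .bak copy of each original file is kept in case you need to revert.
for f in shark.sql quartz.sql hibernate.sql sampledata.sql; do
    sed -i.bak 's/tomcat/pentaho_user/g' "$f"
done
```

Check the result with a quick `grep tomcat *.sql` before importing; a plain global substitution like this will also hit the word tomcat inside comments or data, which is usually harmless here but worth a glance.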

One point I would like to add is how these SQL files were created: I first converted them from HSQLDB to MySQL, and then from MySQL to PostgreSQL. It was a somewhat painful process, but Pentaho now works, which is good. Please be warned that sampledata is not 100% cleanly converted, so you may encounter some problems with it. Finalizing the sampledata conversion is a task that still needs to be done.

Loading the data into PostgreSQL:

Create the above empty databases. Assuming your user is tomcat, you would do that as follows:

postgres@anfatech:~$ createdb shark -O tomcat
postgres@anfatech:~$ createdb quartz -O tomcat
postgres@anfatech:~$ createdb hibernate -O tomcat
postgres@anfatech:~$ createdb sampledata -O tomcat

and import the scripts above into the appropriate databases as follows:

tomcat@anfatech:~$ psql -U tomcat -d hibernate < hibernate.sql
tomcat@anfatech:~$ psql -U tomcat -d quartz < quartz.sql
tomcat@anfatech:~$ psql -U tomcat -d shark < shark.sql
tomcat@anfatech:~$ psql -U tomcat -d sampledata < sampledata.sql

Now your PostgreSQL should be ready. The next step is to configure the datasources required to access the data.

These are my sample configurations for JBoss.

Sorry for the dead link (my homepage has been taken over); I have now uploaded the templates to the wiki, so they should be fine there!

Configuring Pentaho's datasources for Tomcat 5.5.x and higher, using context.xml

If you plan to deploy on Tomcat, you are better off using what Tomcat offers you:

instead of messing with config files in conf/server.xml, or even worse, creating a file under $CATALINA_HOME/conf/Catalina/localhost/yourApplication.xml,

you can put the configuration where Tomcat expects it, inside your application archive:

 pentaho
|-- META-INF
|   `-- context.xml  <---(***1***)
`-- WEB-INF
    `-- web.xml

(***1***): this file "context.xml" is processed by Tomcat at deployment time; if you define your datasources there, Tomcat
will do the necessary configuration for you.

An example of a valid context.xml file looks like the following:

<Context     path="/pentaho"
               docBase="path/to/pentaho" <---(***2***)
               debug="5"
               reloadable="true"
               crossContext="true">   
    <Resource   name="jdbc/Quartz"
                auth="Container"
                type="javax.sql.DataSource"
                maxActive="100"
                maxIdle="30"
                maxWait="10000"
                username="username"
                password="password"  
                driverClassName="org.postgresql.Driver"
                url="jdbc:postgresql://localhost:5432/target-database"/>   
<!--
    You can add as many datasources as you need here

-->
</Context>

(***2***): if you deploy your application to Tomcat's default webapps directory, your docBase should look like the following:

      docBase="${catalina.home}/webapps/pentaho"

When you have finished creating your context.xml, you can package your WAR and deploy it to Tomcat. Your application's context.xml will be converted to applicationName.xml; for the application name pentaho shown above, the resulting file will be:

 $CATALINA_HOME/conf/Catalina/localhost/pentaho.xml

If your databases are correctly set up and everything went fine, Pentaho should be able to access them using the datasources defined in context.xml.

Make sure your driver is in the right location and start up the application server where Pentaho is deployed.
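As a sketch of that last step (both the CATALINA_HOME path and the driver jar filename below are assumptions; adjust them to your installation): on Tomcat 5.5.x, jars shared by all webapps live in common/lib, so the PostgreSQL JDBC driver would be copied there before starting the server.

```shell
# Assumed locations: adjust CATALINA_HOME and the driver jar name to match
# your installation. Tomcat 5.5.x loads shared jars from common/lib
# (Tomcat 6.x uses lib/ instead).
CATALINA_HOME=${CATALINA_HOME:-/opt/tomcat}
cp postgresql-jdbc3.jar "$CATALINA_HOME/common/lib/"
```

The important point is that the driver must be visible to the container's classloader, not just to the webapp, because the datasource in context.xml is container-managed.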

Configuring the Quartz scheduler:

Open the file solution/system/quartz/quartz.properties and search for the property "org.quartz.jobStore.driverDelegateClass". Change that line so that it reads:

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
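That edit can also be done non-interactively; a minimal sketch, assuming the quartz.properties path given above:

```shell
# Point Quartz at the PostgreSQL delegate. The file path is the one given in
# the text above; a .bak backup of the original file is kept.
sed -i.bak \
  's|^org.quartz.jobStore.driverDelegateClass.*|org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate|' \
  solution/system/quartz/quartz.properties
```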

18 Comments

  1. This Page was contributed by Yassine Elassad (elassad)

  2. Anonymous

    Hi! I was looking for how to configure Pentaho to do Reporting from another DB. It would be useful to have a link for this other "Configuring Pentaho for use with PostgreSQL".

  3. Anonymous

    Hi! Thanks for your contribution.

    I think there are some errors in your scripts:

    the jboss user must be replaced by tomcat; it should be psql -U tomcat -d quartz < quartz.sql instead of psql -U tomcat -d quarz < quarz.sql, and psql -U tomcat -d sampledata < sampledata.sql instead of psql -U tomcat -d hibernate < sampledata.sql

  4. Anonymous

    Minor typo: The scheduler should use Quartz, not Quarz.

  5. Anonymous

    Hi all,

    I tried another way to get the sampledata into PostgreSQL. I used Kettle... most of the work was done by the copy tables wizard. Then I had to tweak the .ktr files a little bit (e. g. HSQLDB has uppercase "PUBLIC" schema name, PostgreSQL lowercase and PostgreSQL has problems with datatypes like "NUMERIC(19,0)"). I'll try to attach the Kettle job files to this page.

  6. If you want to try to import the sampledata with Kettle then just start Hypersonic with the script delivered with Kettle 3.0, start Kettle and open the job files.

  7. Don't forget to edit solution/system/quartz/quartz.properties and change the line to:

    org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

  8. With Pentaho 1.6 and Postgresql 8.2.5

    It seems the Shark database is not mandatory anymore, as you have to install a Shark server on your machine first and point Pentaho to it with the configuration files in pentaho/bi-server/pentaho-solutions/system/shark/..

    Don't try the db scripts on this page with Pentaho 1.6; it is better to convert the script pentaho/bi-server/data/SampleDataDump_MySql.sql to Postgresql

    I started my installation by using the easy install mode to install Pentaho with JBoss and MySQL and then change all the configuration
    to set it for Postgresql by looking for all the 'mysql' in the config files.

    Don't forget to tell Mondrian that you are using Postgresql, or else none of your OLAP queries will run and it will say that the tables don't exist.
    Modify /pentaho/bi-server/pentaho-solutions/system/olap/datasource.xml to point to Postgresql, and edit all the *.mondrian.xml files, making the table and column names lowercase

    It would have been great if you get to choose Postgresql with the standard installation.

    The good news is that with Postgresql, it's faster

  9. Anonymous

    Hello Jerome,

    Can you please list the config files to be edited and also, if possible, upload the converted SampleDataDump_Postgres.sql

  10. Anonymous

    Jerome,

     Can you please post converted SampleDataDump_Postgres.sql

    Thanks,

    Dejan 

  11. Below is a summary of what I did to make Pentaho BI work on Tomcat and Postgres

    1. Create databases in Postgres
    shark
    quartz
    hibernate
    sampledata

    Not sure if all of them are needed ...

    2. Populate databases from http://wiki.pentaho.org/display/ServerDoc1x/Configuring+for+PostgreSQL
    psql -U userName -d hibernate < hibernate.sql
    psql -U userName -d quartz < quartz.sql
    psql -U userName -d shark < shark.sql
    psql -U userName -d sampledata < sampledata.sql

    3. Do ant build on pentaho_j2ee_deployments-1.6.0.GA.863-a and deploy
    pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/tomcat/hsqldb/pentaho.war
    pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/sw-style.war
    pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/pentaho-style.war
    pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/pentaho-portal-layout.war

    4. Install Pentaho for Jboss and Mysql
    pentaho-1.6.0GA-linux-opensource-installer.bin

    5. Replace Tomcat apache-tomcat-6.0.14/webapps/pentaho/pentaho-solutions
    with Jboss/Mysql
    pentaho/bi-server/pentaho-solutions

    6. In apache-tomcat-6.0.14/webapps/pentaho replace all references to mysql with postgresql
    org.postgresql.Driver
    jdbc:postgresql://serverName/sampledata

    7. Replace all references to localhost with serverName in the pentaho dir and below

    8. In apache-tomcat-6.0.14/conf/server.xml inside Host tag add:

    <Resource name="jdbc/SampleData" auth="Container" type="javax.sql.DataSource"
    maxIdle="5" maxWait="10000" username="username" password="password"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
    url="jdbc:postgresql://serverName/sampledata" />

    <Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
    maxWait="10000" username="username" password="password"
    driverClassName="org.postgresql.Driver" url="jdbc:postgresql://serverName/hibernate" />

    <Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
    maxWait="10000" username="username" password="password"
    driverClassName="org.postgresql.Driver" url="jdbc:postgresql://serverName/quartz" />

    <Resource name="jdbc/Shark" auth="Container" type="javax.sql.DataSource"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
    maxWait="10000" username="username" password="password"
    driverClassName="org.postgresql.Driver" url="jdbc:postgresql://serverName/shark" />


    9. Set up deployment password in
    pentaho-solutions/system/publisher_config.xml

    10. Find directories with *.properties files and configure final version of properties file

    P.S. Jerome thanks for your help

  12. Note that the link in the line:

        "You can these are my samples for jboss I will add tomcat context.xml in the near future (???)"

    is dead.

  13. Once you change the solutions repo to use an RDBMS, is the filesystem still used for something? How about the storage of all xaction and report files...where does that happen?

  14. I recommend adding:

    <property name="hibernate.cache.provider_class">org.hibernate.cache.EhCacheProvider</property>

    to the hibernate.cfg.xml.

    I am creating the WARs from the j2ee deployments distribution, and Postgres is not fully configured for that build, so you need to create some files such as datasources, etc...

    And do not forget, in your solution_home/system quartz.properties, to override the following parameter:

    org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

    Cheers!

  15. I was wondering if there are similar scripts (hibernate.sql, quartz.sql) for creating these 2 schemas on Oracle 10g. Can someone share those?

  16. We have tried this method and found that it didn't work very well. We have developed this method, which works much better.

  17. Can you tell me whether it is possible to configure Pentaho for PostgreSQL+JBoss instead of PostgreSQL+Tomcat?

    10x

  18. Anonymous

    I have some problems with the xactions related to the home page, for example in

    homeDashboard/Sales_by_Territory.xaction

    the query

    SELECT OFFICES.TERRITORY, SUM(ORDERDETAILS.QUANTITYORDERED*ORDERDETAILS.PRICEEACH)  SOLD_PRICE FROM ORDERS  ...

    should be

     SELECT OFFICES.TERRITORY, SUM(ORDERDETAILS.QUANTITYORDERED*ORDERDETAILS.PRICEEACH) as  SOLD_PRICE FROM ORDERS  ...

    and there are other similar errors in Sales_by_Productline.xaction, but I don't know how to solve them.