
Setting up the databases required by Pentaho 

Pentaho needs four databases, three of which are mandatory:

Shark (shark.sql, mandatory): used for bursting and similar tasks.
Quartz (quartz.sql, mandatory): the database behind the main Pentaho scheduler (think of it as a cron-like tool that stores its jobs in the database).
Hibernate (hibernate.sql, mandatory): the persistence layer used by Pentaho.
sampledata (sampledata.sql, optional): not needed in production, but you will want it for demonstrations and for seeing what is new in Pentaho.

Please note that the SQL files above require editing: the objects they create are owned by the tomcat user, so unless tomcat is the PostgreSQL user you want to use, you will have to replace every occurrence of tomcat with your own user before importing the files.
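If you just want to swap the owner non-interactively, something like the following works (a minimal sketch, assuming GNU sed, that the four scripts sit in the current directory, and with myuser as a placeholder for your own PostgreSQL user):

# Replace every occurrence of the tomcat user with your own user; backups are kept as *.bak
sed -i.bak 's/\btomcat\b/myuser/g' shark.sql quartz.sql hibernate.sql sampledata.sql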

One point I would like to add is how I created these SQL files: I first converted them from HSQLDB to MySQL and then from MySQL to PostgreSQL. It was a somewhat painful process, but now Pentaho works, which is good. Please be warned that the sampledata conversion is not 100% clean, so you may encounter some problems with it. Finishing the sampledata conversion is a task that still needs to be done.

Loading the data into PostgreSQL:

Create the empty databases listed above. Assuming your user is tomcat, you would do that as follows:

postgres@anfatech:~$ createdb shark -O tomcat
postgres@anfatech:~$ createdb quartz -O tomcat
postgres@anfatech:~$ createdb hibernate -O tomcat
postgres@anfatech:~$ createdb sampledata -O tomcat
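If the tomcat role does not exist yet in your PostgreSQL cluster, you can create it first (a quick sketch; --pwprompt asks for the password interactively):

postgres@anfatech:~$ createuser --pwprompt tomcat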

and import the scripts above into the appropriate databases as follows:

tomcat@anfatech:~$ psql -U tomcat -d hibernate < hibernate.sql
tomcat@anfatech:~$ psql -U tomcat -d quartz < quartz.sql
tomcat@anfatech:~$ psql -U tomcat -d shark < shark.sql
tomcat@anfatech:~$ psql -U tomcat -d sampledata < sampledata.sql
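As a quick sanity check (optional), you can list the tables in each database to confirm the imports worked, for example:

tomcat@anfatech:~$ psql -U tomcat -d hibernate -c '\dt'
tomcat@anfatech:~$ psql -U tomcat -d sampledata -c '\dt'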

Now your PostgreSQL should be ready. The next step is to configure the datasources required to access the data.

These are my sample datasource files for JBoss.

Sorry for the dead link (my homepage has been taken over); I have now uploaded the templates to this wiki, so they should be fine there.

Configuring Pentaho's datasources for Tomcat 5.5.x and higher, using context.xml

If you plan to deploy on Tomcat, you may as well use what Tomcat offers: instead of editing conf/server.xml, or even worse, creating a file by hand under $CATALINA_HOME/conf/Catalina/localhost/yourApplication.xml, you can use what Tomcat expects inside your application archive:

 pentaho
|-- META-INF
|   `-- context.xml  <---(***1***)
`-- WEB-INF
    `-- web.xml

(***1***): this context.xml file is processed by Tomcat at deployment time; if you define your datasources there, Tomcat will do the necessary configuration for you.

A valid context.xml file looks like the following:

<Context     path="/pentaho"
               docBase="path/to/pentaho" <---(***2***)
               debug="5"
               reloadable="true"
               crossContext="true">   
    <Resource   name="jdbc/Quartz"
                auth="Container"
                type="javax.sql.DataSource"
                maxActive="100"
                maxIdle="30"
                maxWait="10000"
                username="username"
                password="password"  
                driverClassName="org.postgresql.Driver"
                url="jdbc:postgresql://localhost:5432/target-database"/>   
<!--
    You can add as many datasources as you need here

-->
</Context>

(***2***): if you deploy your application to Tomcat's default webapps directory, your docBase should look like this:

      docBase="${catalina.home}/webapps/pentaho"

When you have finished creating your context.xml, you can package your WAR and deploy it to Tomcat. The context.xml of your application gets converted to applicationName.xml; for the application name pentaho used above, the resulting file will be:

 $CATALINA_HOME/conf/Catalina/localhost/pentaho.xml
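A minimal sketch of the packaging and deployment step, assuming your exploded application lives in a directory named pentaho and that $CATALINA_HOME points at your Tomcat installation:

# Package the exploded directory (including META-INF/context.xml) into a WAR
cd pentaho && jar cf ../pentaho.war . && cd ..
# Deploy it by copying it into Tomcat's webapps directory
cp pentaho.war $CATALINA_HOME/webapps/
# If Tomcat copied the context as described above, this file should now exist
ls $CATALINA_HOME/conf/Catalina/localhost/pentaho.xml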

If your databases are correctly set up and everything went fine, Pentaho should be able to access them through the datasources defined in context.xml.

Make sure the PostgreSQL JDBC driver is in the right location on Tomcat's classpath, then start the application server where Pentaho is deployed.
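For example, assuming you downloaded the PostgreSQL JDBC driver as postgresql.jar (the real file name carries a version number):

cp postgresql.jar $CATALINA_HOME/common/lib/   # Tomcat 5.5.x keeps shared libraries here
cp postgresql.jar $CATALINA_HOME/lib/          # Tomcat 6.x uses lib instead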

Configuring the Quartz Scheduler:

Open the file solution/system/quartz/quartz.properties and search for the property "org.quartz.jobStore.driverDelegateClass". Change that line so that it reads:

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
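One way to make that change non-interactively (a sketch, assuming GNU sed and that you run it from the directory containing solution/):

# Rewrite the delegate class line in place, keeping a backup as quartz.properties.bak
sed -i.bak 's|^org.quartz.jobStore.driverDelegateClass.*|org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate|' solution/system/quartz/quartz.properties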

This page was contributed by Yassine Elassad (elassad).

Comment: Posted by Doug Moran at Jul 12, 2007 20:09

Hi! I was looking for how to configure Pentaho to do reporting from another DB. It would be useful to have a link for that from this "Configuring Pentaho for use with PostgreSQL" page.

Comment: Posted by Anonymous at Oct 23, 2007 13:44

Hi! Thanks for your contribution.

I think there are some errors in your scripts:

The jboss user must be replaced by tomcat; it should be psql -U tomcat -d quartz < quartz.sql instead of psql -U tomcat -d quarz < quarz.sql, and psql -U tomcat -d sampledata < sampledata.sql instead of psql -U tomcat -d hibernate < sampledata.sql.

Comment: Posted by Anonymous at Nov 08, 2007 09:27

Minor typo: The scheduler should use Quartz, not Quarz.

Comment: Posted by Anonymous at Nov 08, 2007 16:19

Hi all,

I tried another way to get the sampledata into PostgreSQL. I used Kettle... most of the work was done by the copy tables wizard. Then I had to tweak the .ktr files a little bit (e. g. HSQLDB has uppercase "PUBLIC" schema name, PostgreSQL lowercase and PostgreSQL has problems with datatypes like "NUMERIC(19,0)"). I'll try to attach the Kettle job files to this page.

Comment: Posted by Anonymous at Nov 17, 2007 15:48

If you want to try to import the sampledata with Kettle then just start Hypersonic with the script delivered with Kettle 3.0, start Kettle and open the job files.

Don't forget to edit solution/system/quartz/quartz.properties and change the line:

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

Comment: Posted by Pedro Alves at Nov 20, 2007 07:36

With Pentaho 1.6 and PostgreSQL 8.2.5:

It seems the Shark database is not mandatory anymore, as you have to install a Shark server on your machine first and point Pentaho to it with the configuration files in pentaho/bi-server/pentaho-solutions/system/shark/..

Don't try the db scripts on this page with Pentaho 1.6; it is better to convert the script pentaho/bi-server/data/SampleDataDump_MySql.sql to PostgreSQL.

I started my installation by using the easy-install mode to install Pentaho with JBoss and MySQL, and then changed all the configuration over to PostgreSQL by looking for every 'mysql' in the config files.

Don't forget to tell Mondrian that you are using PostgreSQL, or else none of your OLAP queries will run and it will report that the tables don't exist.
Modify /pentaho/bi-server/pentaho-solutions/system/olap/datasource.xml to point to PostgreSQL, and edit all the *.mondrian.xml files, making the table and column names lowercase.
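A rough sketch of how to find the remaining MySQL references (the path is just the one mentioned above; adjust it to your own layout):

# List text files under the solutions directory that still mention mysql (GNU grep)
grep -rliI mysql pentaho/bi-server/pentaho-solutions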

It would have been great if you could choose PostgreSQL with the standard installation.

The good news is that with PostgreSQL, it's faster.

Comment: Posted by Jerome Cader at Nov 28, 2007 02:22

Hello Jerome,

Can you please list the config files to be edited and also, if possible, upload the converted SampleDataDump_Postgres.sql?

Comment: Posted by Anonymous at Nov 28, 2007 09:36

Jerome,

 Can you please post converted SampleDataDump_Postgres.sql

Thanks,

Dejan 

Comment: Posted by Anonymous at Jan 15, 2008 18:24

Below is a summary of what I did to make Pentaho BI work on Tomcat and Postgres:

1. Create databases in Postgres:
shark
quartz
hibernate
sampledata

Not sure if all of them are needed ...

2. Populate databases from http://wiki.pentaho.org/display/ServerDoc1x/Configuring+for+PostgreSQL
psql -U userName -d hibernate < hibernate.sql
psql -U userName -d quartz < quartz.sql
psql -U userName -d shark < shark.sql
psql -U userName -d sampledata < sampledata.sql

3. Do ant build on pentaho_j2ee_deployments-1.6.0.GA.863-a and deploy
pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/tomcat/hsqldb/pentaho.war
pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/sw-style.war
pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/pentaho-style.war
pentaho_j2ee_deployments-1.6.0.GA.863-a/build/pentaho-wars/pentaho-portal-layout.war

4. Install Pentaho for JBoss and MySQL:
pentaho-1.6.0GA-linux-opensource-installer.bin

5. Replace apache-tomcat-6.0.14/webapps/pentaho/pentaho-solutions
with the JBoss/MySQL version from
pentaho/bi-server/pentaho-solutions

6. In apache-tomcat-6.0.14/webapps/pentaho replace all references to MySQL with PostgreSQL:
org.postgresql.Driver
jdbc:postgresql://serverName/sampledata

7. Replace all references to localhost with serverName in the pentaho dir and below (see the sketch after this list)

8. In apache-tomcat-6.0.14/conf/server.xml, inside the <Host> tag, add <Resource> elements like the following (the resource names here are examples; use whatever JNDI names your Pentaho configuration expects):

<!-- the name attributes below are illustrative -->
<Resource name="jdbc/SampleData" auth="Container" type="javax.sql.DataSource"
    maxIdle="5" maxWait="10000" username="username" password="password"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
    url="jdbc:postgresql://serverName/sampledata" />

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
    maxWait="10000" username="username" password="password"
    driverClassName="org.postgresql.Driver" url="jdbc:postgresql://serverName/hibernate" />

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
    maxWait="10000" username="username" password="password"
    driverClassName="org.postgresql.Driver" url="jdbc:postgresql://serverName/quartz" />

<Resource name="jdbc/Shark" auth="Container" type="javax.sql.DataSource"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
    maxWait="10000" username="username" password="password"
    driverClassName="org.postgresql.Driver" url="jdbc:postgresql://serverName/shark" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

maxIdle="5" maxWait="10000" username="username" password="password"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" driverClassName="org.postgresql.Driver"
url="jdbc:postgresql://serverName/sampledata" />

9. Set up deployment password in
pentaho-solutions/system/publisher_config.xml

10. Find directories with *.properties files and configure final version of properties file
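As mentioned in steps 6 and 7, here is a rough sketch of how to locate and replace those references (GNU grep/sed; the paths and serverName are the examples used above, so adjust them and review the matches before running the sed):

# Step 6: list the files that still reference MySQL
grep -rlI mysql apache-tomcat-6.0.14/webapps/pentaho
# Step 7: replace localhost with your server name in those files (crude; back up first)
grep -rlI localhost apache-tomcat-6.0.14/webapps/pentaho | xargs sed -i 's/localhost/serverName/g'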

P.S. Jerome thanks for your help

Comment: Posted by dejan miljkovic at Jan 21, 2008 18:11

Note that the link in the line:

    "You can these are my samples for jboss I will add tomcat context.xml in the near future (???)"

is dead.

Comment: Posted by Jared Pshedesky at Feb 28, 2008 16:02

Once you change the solutions repo to use an RDBMS, is the filesystem still used for something? How about the storage of all xaction and report files...where does that happen?

Comment: Posted by Sarang Deshpande at Apr 15, 2008 15:14 Updated by Sarang Deshpande

I recommend adding:

<property name="hibernate.cache.provider_class">org.hibernate.cache.EhCacheProvider</property>

to the hibernate.cfg.xml.

I am creating the WARs from the J2EE deployments distribution, and Postgres is not fully configured for that generated build, so you need to create some files yourself, such as the datasource definitions, etc.

And do not forget, in quartz.properties under your solution_home/system, to override the following parameter:

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

Cheers!

Comment: Posted by Juan Holder at Apr 21, 2008 09:48

I was wondering if there are similar scripts (hibernate.sql, quartz.sql) for creating these 2 schemas on Oracle 10g. Can someone share those?

Comment: Posted by Sarang Deshpande at Apr 30, 2008 12:18

We have tried this method and found that it didn't work very well. We have developed this method, which works much better.

Comment: Posted by Sridhar Dhanapalan at Jun 09, 2008 21:21

Can you tell me, is it possible to configure Pentaho for PostgreSQL+JBoss instead of PostgreSQL+Tomcat?

10x

Comment: Posted by new_p at Jul 16, 2008 09:43

I have some problems with the xactions related to the home page, for example in

homeDashboard/Sales_by_Territory.xaction

the query

SELECT OFFICES.TERRITORY, SUM(ORDERDETAILS.QUANTITYORDERED*ORDERDETAILS.PRICEEACH)  SOLD_PRICE FROM ORDERS  ...

should be

 SELECT OFFICES.TERRITORY, SUM(ORDERDETAILS.QUANTITYORDERED*ORDERDETAILS.PRICEEACH) as  SOLD_PRICE FROM ORDERS  ...

and there are other similar errors in Sales_by_Productline.xaction, but I don't know how to solve them.

Comment: Posted by Anonymous at Jul 17, 2008 09:22