Hitachi Vantara Pentaho Community Forums
Results 1 to 9 of 9

Thread: Loading fact, dimension table

  1. #1

    Arrow Loading fact, dimension table


    i want to load a fact and dimension table within one KETTLE transformation. During the transformation process an error appears due to an foreign key constraint. The primary key of the dimension table is the foreign key in the fact table. The error occurs, when KETTLE wants to write a dataset with the foreign key into the fact table, but the key doesn't yet exist in the dimension table.
    At the beginning of the transformation everything works fine, but at a particular time the loading of the fact table overtakes the loading of the dimension table. --> Error.

    Has anybody an idea to solve this problem?

    THX 4 reply.


  2. #2


    Hi myma,
    Well i see two ways to solve your pb:

    Way 1
    1- use 1 trans for dimension
    2- another one for Fact

    Start --> Trans load Dimension --> Trans Load fact (All in one job)

    Way 2

    As PDI execute step in different thread, "select use unique connection" in Transformation properties (Miscalleneous Tab)

    Personnaly i think the first Way is more "Clean"


  3. #3


    Hi Samatar,

    thx for your fast reply.
    I already used your way 1 to separate the loading of the 2 tables into 2 transformations and to put the transformations into 1 job. That works, but that process takes too much time. I have to process more than 9 million datasets. Using a job means, that i have to run 2 times through the millions of datasets.

    I also tried to use Java Script step before writing datasets into the fact table. The script causes a time delay of 1 second before inserting rows into the fact table. So the loading of the dimension table has an advance. But that's only an workaround, too.

    I think there must be a KETTLE feature to load a dimension and a fact table in ONE transformation having a foreign key relation.
    Or do you always use two transformations in a job to load the tables?


  4. #4


    Hi Myma,

    Have you tried to use one unique connection for your Db????
    ("select use unique connection" in Transformation properties (Miscalleneous Tab)


  5. #5


    Hi Samatar,
    yes, i enabled the option "use unique connection" in the transformation properties. But the error occurs even faster.


  6. #6

    Default try twith commit size = 0

  7. #7


    Hi again,

    i tried it with a commit size 1 (fact table) and 1000 (dimension table). So i think it would be same to use a size 0. It worked till 7 million datasets. Pretty good compared with the other attempts.

    I also tried SQL inserts for each dataset. First dimension, then fact table. There's no problem but the time. I terminated the transformation after 3 days (not finished yet).

    I think loading dimension and fact tables are so essential that everybody has to manage this issue. Or does everybody use "workarounds"?


  8. #8
    Join Date
    Oct 2011

    Default Loading facts and dimension tables

    Hi Myma

    I am new to Pentaho and am currently working on loading dimensions and fact tables. Can you provide me with your xml transformation that is used for loading dimensions and fact tables. I am particularly interested in the steps that are used to perform these tasks.


  9. #9
    Join Date
    Nov 2008


    My advice is to remove foreign key checking during fact table generation. This is pretty common practice for data warehouse applications. If you think you really need it, you can reimpose the foreign key restriction after the tables have been generated.
    Java 1.7 (64 bit)
    MySQL 5.6 (64 bit)
    Windows 7 (64 bit)

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.