Hitachi Vantara Pentaho Community Forums
Results 1 to 6 of 6

Thread: Need help understanding how to control flow, processing order in a transform

  1. #1
    Join Date
    Jan 2011
    Posts
    21

    Question Need help understanding how to control flow, processing order in a transform

    I keep running into a problem which I can't find a good solution.

    My database has a lot of primary keys which are just single numbers based on a sequence. In addition the database has many related tables which use foreign keys to these primary keys.

    The issue is when I load the tables I usually have one spreadsheet with data that needs to be processed into several tables in a specific order to meet the constraints. I thought I could use "Block this step until steps finish" but I can't get it to work and I really don't need the entire data set loaded in this manner.

    What I really want is to do the following
    Read row from sheet->Get Sequence X for Table A->Process data for table A->Insert into Table A using primary key from sequence X->Get Another Sequence Y for Table B->Process data for Table B->Insert into Table B using primary key from sequence Y and foreign key from sequence X

    The problem is after I have inserted the data into X, I have selected away the data for table Y so I don't have the correct data to insert.

    I tried spitting the stream into two but I can't control the order to assure that Table A gets loaded before Table B so the foreign key isn't found and the insert fails on the constraint. I tried setting the transform as a transaction and this doesn't work.

    In case it matters the database is Oracle 11g

    Can anyone help me understand how I can do this. I tried splitting this up into two transforms but the problem is the sequence values used for Table A are lost and I don't have a secondary key to look them up. So I need the two (and sometimes more) tables inserted in the same flow so I have the correct sequence values for the keys and foreign keys.

    Sorry for the long explanation but I think the issue needs detail to be understood.

    I am wondering if this is going to be in the upcoming cookbook. I think I need that book

    Aloha,
    Ron

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    It's a common question. The answer is that you have to make the transformation "Database transactional". You can do this in the Transformation Settings dialog on the Misc tab.

    HTH,
    Matt

  3. #3
    Join Date
    Jan 2011
    Posts
    21

    Default

    As I have mentioned "I tried setting the transform as a transaction and this doesn't work." I still get the failed foreign key constraint even though I have "Make the transaction database transactional" checked in Edit->Settings->Miscellaneous Tab. I am sure the foreign key is inserted by the end of the data stream. I changed the transform to dump the data into to files, inserted the main table first and the secondary table second with no issues so I am sure setting the transaction isn't working for me or that I am doing something wrong to maintain the transaction. Since several steps have their own database connection configuration and I am using parameters to set the database values how is a transaction maintained over the entire transform?

    Do you have to select the use one thread option to use the transaction option, that seems like it might be the case, I have not tried that yet.


    Is there a good sample or example using transaction for such an issue ?

  4. #4
    Join Date
    Jan 2011
    Posts
    21

    Default

    I tried a simple transform to test this case.

    I created to tables
    =======================
    create table PARENT ( id number,name varchar2(10),
    constraint pk_column PRIMARY KEY (id));

    create table CHILD ( id number, name varchar2(10), parent number,
    constraint pk_column2 PRIMARY KEY (id),
    constraint PARENT_FK FOREIGN KEY (parent) references PARENT(id) );
    =======================
    Then I created a simple excel sheet.
    ID Name Child ChildID
    1 BOB MARY 1
    2 JERRY BETTY 2

    The transform I used splits the excel into two streams and inserts into both tables.

    It fails when in transaction or not.

    Attached zip file contains Excel Data, image capture of transform and transform settings and the transform xml (edited to remove passwords etc)
    Attached Files Attached Files

  5. #5
    Join Date
    Jan 2011
    Posts
    21

    Default

    OK I added a "Block this step until steps finish" just before the insert into the second table waiting for the first table to finish and it seems to be working. I thought I tried this once before but I must have done it incorrectly. I don't know why transaction isn't working for me but I think the block steps might work. The next data file I have to process inserts into 8 tables which all depend on each other from one spreadsheet. Each table has it's own sequence so this should be a true test of this process flow with blocking.....

    Does "Block this step until steps finish" block one data line at a time or all data? In other words if I have 12 rows of data does the first table load all 12 and then the second table loads 12 or does the wait happen one row at a time? It's hard to tell in my tests since they load so quickly. Just so I understand how this step works.

  6. #6
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    If you are not populating tables in the correct order, no amount of trickery is going to help you.
    Good luck anyway!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.