Hitachi Vantara Pentaho Community Forums
Results 1 to 10 of 10

Thread: handling null output

  1. #1
    Join Date
    Jun 2008
    Posts
    14

    Default handling null output

    Hi All

    I have a transformation that includes a database lookup step that occasionally returns no rows (this is expected and acceptable behaviour). I'd like to know how I can test for this and use that test to influence another branch of the transformation. At the moment I have a stream lookup that breaks due to the null output.

    Any ideas gratefully accepted. I'm using 3.1.0-M1.

    Cheers
    Kyle

  2. #2
    Join Date
    Sep 2007
    Posts
    834

    Default

    try a "filter rows" step after the "database lookup" step,
    regards

  3. #3
    Join Date
    Jun 2008
    Posts
    14

    Default

    Hi Maria

    Thanks for your reply.

    I tried that but there are no rows to filter. In the step metrics report there are no rows produced from the lookup so nothing to filter on.

    At the moment I inject a dummy row with Generate Rows into the transformation to keep it from crashing but this is less than elegant and offers no control over the other branches.

    Cheers
    Kyle

  4. #4
    Join Date
    May 2006
    Posts
    4,882

    Default

    Can you attach your transformation, and the log file of what you get.

    Regards,
    Sven

  5. #5
    Join Date
    Jun 2008
    Posts
    14

    Default

    Hi Sven

    transformation and log attached. I warn you, it's not pretty

    Cheers
    Kyle
    Attached Files Attached Files

  6. #6
    Join Date
    Sep 2007
    Posts
    834

    Default

    ouch!
    before I die trying to understand....

    which one is the database lookup we're talking about?

  7. #7
    Join Date
    Jun 2008
    Posts
    14

    Default

    OK, brief(ish) explanation.

    We have a spreadsheet with barcodes and other product information at input_sheet which we want to check against our product catalogue to see whether we already stock the products or not.

    Check_for_matches does this but only brings back the first match. We have some duplicate issues with our data so Pull_Dupes queries the spreadsheet against a view of these duplicates on our database. If there are duplicates these are output at the end of the Duplicates output branch. The barcodes from these are then used in the Check_Against_Dupes step so that the same products do not appear in the Matches output step. That way we check our current data is correct for the 1:1 matches on one spreadsheet and we get a chance to sort out the duplicates on a second spreadsheet.

    If there are no duplicates, however, there are no rows produced by the Pull_Dupes step. This causes the whole process to collapse at Check_Against_Dupes. Ideally what I'd like to do is detect that there are no rows from Pull_Dupes and allow the other branch to skip the Check_Against_Dupes step and go straight from Check_For_Matches to Filter rows.

    An IF...Then statement should do the trick but I can't see anywhere to put one...

    I've attached my workaround (adding a Generate Rows step to produce a dummy record) but this is not exactly elegant and does not give me any control over the other branch.

    Cheers for any help.

    Kyle
    Attached Files Attached Files

  8. #8
    Join Date
    May 2006
    Posts
    4,882

    Default

    For the original .ktr file...

    - You didn't fill in the lookup step (drop down box) in the "check for dupes" step... that's also in the log file.
    - I would think stream lookup first reads all of the lookup data and then only starts processing real input. In your case this may work (if you fill in the drop down at least in check for dupes), but in other cases it may deadlock.

    Qua number of steps... I used to "threaten" people to physically throw something at them if they used more than 10 steps in a single transformation Things work better if you work in small "steps" than the"blobbish" approach.

    Regards,
    Sven

  9. #9
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    Kyle, while you're at it, remove that "Append streams" step from the transformation.
    Odds are high that you don't need to append the streams in a particular order and it might block your transformation.

  10. #10
    Join Date
    Jun 2008
    Posts
    14

    Default

    Hi All

    Sorry it's a bit messy but I've only be using Spoon for a week...

    I guess I ought to split up the different streams into separate transformations and bolt them all together in a job?

    I've fixed the lookup step issue (ktr attached) but still get the following error:

    2008/06/25 13:29:03 - Check_Against_Dupes.0 - ERROR (version 3.1.0-M1, build 688 from 2007/11/14 00:11:43) : Unexpected error :
    2008/06/25 13:29:03 - Check_Against_Dupes.0 - ERROR (version 3.1.0-M1, build 688 from 2007/11/14 00:11:43) : java.lang.NullPointerException

    I guess the Check_Against_Dupes step won't run without any rows. If I split the different streams up could I then check for this and set alternative streams from there? Sounds like a plan...

    I'll drop the Append streams in the working version and see what happens.

    Cheers
    Kyle
    Attached Files Attached Files

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.