Hitachi Vantara Pentaho Community Forums
Results 1 to 9 of 9

Thread: Database join parameters and NULL values

  1. #1
    Join Date
    Aug 2016
    Posts
    142

    Default Database join parameters and NULL values

    Hi,

    I am performing a database join and I am having an issue with NULL values.

    In the DB join, the sql uses nested CASE WHEN statements with one of those being something like CASE WHEN field1 IS NOT NULL THEN 1 ELSE 2

    However when the step is run all NULL values fail this step and return 1 when it should be 2

    The SQL is using parameters from a previous step so I think what is happening is that the parameter is being passed through and is not passing as null but as "something" therefore when running the condition it is no longer NULL

    Has anyone come across this before? Is there any way around it?

    I have considered creating a step to convert NULL to something that would not naturally show up but it would require a lot of rewriting which I would like to avoid if possible

    Thanks

  2. #2
    Join Date
    Aug 2016
    Posts
    142

    Default

    I should probably note that the preview output shows the values as <null> but I think this might be some sort of post processing converting them back

  3. #3
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Generally, a job (or transformation) parameter is a string of a certain length.
    A zero length isn't the same as a null value.
    So long, and thanks for all the fish.

  4. #4
    Join Date
    Aug 2016
    Posts
    142

    Default

    Thanks marabu

    However I now think the issue is something else. I tried eliminating parts of the query to see if the behaviour changed and what I have found confuses me.

    If I only change the Database Join SQL to this:

    SELECT
    CASE WHEN ? IS NOT NULL THEN 1 ELSE 2
    FROM table1

    The result is incorrect

    However if I use this

    SELECT
    CASE WHEN ? IS NOT NULL THEN 1 ELSE 2

    It returns the correct value (note no FROM part)

    Since the CASE WHEN does not pull a value from the table I thought perhaps there could be something up with the table I had specified so I changed it to a random table which does not contain data relating to the parameter.

    In this case it returned the incorrect value again

    I'm very confused why the addition of a FROM command would cause it to not work.

    All the other nested CASE WHEN statements work. It is only data where this field is NULL seems to be affected as other fields I am checking are NULL and the logic works for them.


  5. #5
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Not sure if I really understand your problem, but don't try to parameterize "SELECT WHEN field1" like "SELECT WHEN ?" - this should produce a SQL error because you can't prepare identifiers, only values.
    So long, and thanks for all the fish.

  6. #6
    Join Date
    Aug 2016
    Posts
    142

    Default

    Ah that could be it

    Basically I have a query for a database and I want to replace references to a table in the query to those from my input

    In general what I am trying to do is join two different databases together. The first database has some data and the second database has the same data but also some more

    I want to be able to get the extra data from DB2 and join it to the data in DB1

    However there is a logic to this that I want to implement so a straight merge join is not possible. I thought database join would do it as it is the only option I could see where I could put a complex condition on the join

  7. #7
    Join Date
    Aug 2016
    Posts
    142

    Default

    To give more detail. My data looks something like this

    DB1 Table
    customer_id tag1 tag2 tag3
    1 NULL 50 100
    2 111 NULL 200

    DB2 Table
    tag1 tag2 tag3 channel subchannel
    111 channel1 subchannel1
    50 channel2 subchannel2
    100 channel1 subchannel2
    200 channel3 subchannel3

    So people in DB1 can have a combination of all 3 tags but not more than 1 of each. Each row is unique.

    However each tag can correspond to a different channel/subchannel combination.

    In DB2 there is a script which assigns a channel/subchannel combination based on an order of priority. So in the above, customer 1 would have channel2 & subchannel2 and customer 2 would have channel1 & subchannel1. The order of priority is tag1,tag2,tag3.

    I wrote a SQL query which mimcs this script. It works through various nested CASE WHEN statements.

    What I would like to do is join DB1 Table and DB2 Table and apply this logic so it gives the correct channel/subchannel combination

    At first I tried a merge join but could not see how to specify any clauses. Next I tried join row (cartesian product). I thought if I could join them together I could apply the CASE WHEN logic in another step. I added the condition (t1.tag1 = t2.tag1 OR t1.tag2 = t2.tag2 OR t1.tag3 = t2.tag3) but the output was not correct

    What I would like in my output for each customer is

    t1.customer_id t1.tag1 t1.tag2 t1.tag3 t2.channel t2.subchannel

    Where channel and subchannel are joined based on the logic I have written

    Given my last approach with DB join will not work, have you any suggestions on how I could make it work?

    Thanks
    Last edited by Deanicus; 01-16-2017 at 11:18 AM.

  8. #8
    Join Date
    Jun 2012
    Posts
    5,534

    Default

    Let's use the attached transformation to identify my misunderstandings
    Attached Files Attached Files
    So long, and thanks for all the fish.

  9. #9
    Join Date
    Aug 2016
    Posts
    142

    Default

    Thank you marabu!

    That is pretty much it. The priority logic is actually a little more complex than I initially described but what you have done is 99% of what I needed.

    Thanks again for your help. I only started using PDI in the last week (normally just use PRD) so I was struggling with it.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  
Privacy Policy | Legal Notices | Safe Harbor Privacy Policy

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.