Hitachi Vantara Pentaho Community Forums

Thread: 'Group by' when no rows in stream

  1. #1
    Join Date
    Jan 2007
    Posts
    19

    'Group by' when no rows in stream

    Hello !

    I would like to run a test query before my ETL executes. If the query returns no rows, the job can continue; otherwise it stops. So I created a transformation that returns the number of rows through a "Copy rows to result" step, and my job tests this value in a JavaScript step.

    In my transformation, the query uses data from several databases, so I use a join. (I can't use count(*) in SQL.)
    At the end, my stream goes into a "Group by" step to calculate the row count.

    My problem is that if there are no rows in my stream, the "Group by" step is not executed, and neither is the "Copy rows to result" step at the end.

    Is it possible to force the "Group by" step to return 0? Is there a trick, or did I miss something?

    Thanks

  2. #2
    Join Date
    Nov 1999
    Posts
    9,729


    This is a weird one.
    Well, what you can always do is add a "Dummy" record using a Row Generator step and then subtract 1 from the count.

    That being said... if you want to evaluate the number of rows processed by a certain step, wouldn't it make more sense to evaluate that in a Job entry?
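The dummy-record trick above can be sketched in plain Python. This is not Kettle code: `count_rows` is a hypothetical stand-in that only mimics the behaviour described in the thread (an aggregate that emits nothing when no rows arrive), and `count_rows_with_dummy` is an illustrative name for the workaround.

```python
def count_rows(rows):
    """Hypothetical stand-in for a streaming count: emits a row only if input arrived."""
    count = 0
    seen_any = False
    for _ in rows:
        seen_any = True
        count += 1
    if seen_any:
        # With an empty stream this line is never reached, so nothing is emitted.
        yield {"row_count": count}


def count_rows_with_dummy(rows):
    """Append one dummy row before counting, then subtract 1 from the result."""
    dummy = {"dummy": True}  # plays the role of the single generated row
    padded = list(rows) + [dummy]
    for result in count_rows(padded):
        yield {"row_count": result["row_count"] - 1}
```

Because the padded stream always holds at least one row, the count is always emitted, and subtracting 1 restores the true value (0 for an empty stream).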

  3. #3
    Join Date
    Jan 2007
    Posts
    19


    You mean an SQL entry?
    I don't think I can do that, because my query uses 2 different database connections.
    Or maybe you are talking about something I don't know yet...

    thanks, the dummy solution works fine.

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729


    If you wrap the transformation in a job and configure the logging settings (Transformation settings), you can evaluate the result of the transformation after it has completed.
    SQL has nothing to do with that.

  5. #5
    Join Date
    Jan 2007
    Posts
    19


    I'm very sorry, I wasn't clear enough.
    I would just like to run a test like the one in your tip:

    http://kettle.pentaho.org/tips/?tip=7

    (but instead of reading a file, I ran a query with 2 database inputs and a merge)

    But I ran into the issue with the 'Group by' step when no rows were found.
    I got it working with a row generator.

    Thanks

  6. #6
    Join Date
    Nov 1999
    Posts
    9,729


    That tip shows how you can pass information to the evaluation job entry.
    However, the more trivial way of doing what you want is to use the default auditing options.
    If you have a Table Input step you can modify the Transformation logging settings (CTRL-T, Logging) to capture the number of rows read on "input".
    The parameter lines_input then gives you the result immediately, without any additional code being required, straight out of the box.
    (Even if no rows are read)

    HTH,

    Matt
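The decision the job has to make from that metric can be sketched as follows. This is a plain Python model, not Kettle's API: `lines_input` is the metric named in the post, while the `transformation_result` dict and the `should_continue` function are hypothetical names used only for illustration.

```python
def should_continue(transformation_result: dict) -> bool:
    """Return True when the check query read no rows, so the ETL job may proceed."""
    # 'lines_input' is the metric named in the post; the dict wrapper is an assumption.
    lines_input = int(transformation_result["lines_input"])
    # The original requirement: continue only when the test query returns nothing.
    return lines_input == 0
```

The metric is present even when zero rows were read, so no dummy row or count correction is needed on this path.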


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.