Hitachi Vantara Pentaho Community Forums

Thread: Does "Get Files Rows Count" cache its result?

  1. #1


    Hi all,

    In a main stream I need to know the number of rows contained in a txt file. I used a "Get Files Rows Count" step and a "Join Rows (cartesian product)" step to get the row count, and then I use this count in the "getIndex" step.
    The attached figure (JoinRows.jpg) shows the two steps in a red square.

    I would like to ask two questions:
    1. Is there another way to get this information (the row count of a file) only once, before working with the main stream?
    2. If it is not possible to read this information only once, does the "Get Files Rows Count" step cache its result, or does it read the txt file repeatedly for each row of the main stream?

    Thank you.
    Gianpiero

  2. #2

    Not sure what you're looking for: If you don't want the rowcount in a field, you can store it in a job variable (from another transformation).
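
    As a rough illustration of the variable approach, here is a plain Python analogy (not PDI code). The function names and the "ROW_COUNT" key are made up for the sketch: a first stage counts the rows once and stores the result, and a second stage only reads the stored value.

```python
def count_rows(path):
    """Count the lines of a text file in a single pass."""
    with open(path, "r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def first_transformation(path, variables):
    # Hypothetical stand-in for a transformation that counts the rows
    # and stores the result in a job-level variable.
    variables["ROW_COUNT"] = str(count_rows(path))


def second_transformation(main_rows, variables):
    # Hypothetical stand-in for the main transformation: it reads the
    # variable and never touches the file again.
    row_count = int(variables["ROW_COUNT"])
    return [(row, row_count) for row in main_rows]


def run_job(path, main_rows):
    # The "job" runs the two stages in order and shares the variables.
    variables = {}
    first_transformation(path, variables)
    return second_transformation(main_rows, variables)
```

    The file is opened exactly once, however many rows the main stream has.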

  3. #3

    Quote Originally Posted by marabu View Post
    Not sure what you're looking for: If you don't want the rowcount in a field, you can store it in a job variable (from another transformation).
    If the main stream has to iterate over 1 million rows, I want to avoid running the "Get Files Rows Count" step 1 million times: the entire transformation would take a very long time.

    If your suggestion (storing the row count in a job variable from another transformation) avoids that, I'll go that way, but how do I do it? Thank you.
    Gianpiero

  4. #4

    Get-Files-Rows-Count will be executed only once, given the transformation depicted in your opening post, so no worries.
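
    To see why, here is a small Python analogy (again not PDI code; the class and function names are invented for the sketch). The counting source is an independent input that emits a single row, and the join pairs that one row with every row of the main stream. A counter records how often the "file" is actually read.

```python
from itertools import product


class CountingSource:
    """Stand-in for an input step that emits one row: the line count."""

    def __init__(self, lines):
        self.lines = lines
        self.reads = 0  # number of full passes over the "file"

    def rows(self):
        self.reads += 1
        yield len(self.lines)


def join_rows(main_rows, count_rows):
    # itertools.product consumes each input once, buffers it, and then
    # pairs every main row with every buffered count row.
    return list(product(main_rows, count_rows))


source = CountingSource(["line 1", "line 2", "line 3"])
joined = join_rows(range(5), source.rows())

print(joined)        # each main row paired with the count
print(source.reads)  # number of times the source was read
```

    With five main rows the result has five pairs, and the source is read once, because it only ever produces one row to join against.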

  5. #5

    Quote Originally Posted by marabu View Post
    Get-Files-Rows-Count will be executed only once, given the transformation depicted in your opening post, so no worries.
    OK, that's fine then. The Cartesian product scared me (I thought it would read the file N x 1 times, where N is the number of records in the main stream generated by Mapplet_Algo_MapInSpec).
    Thank you very much.
    Gianpiero
    Last edited by gianpieropiccolo; 12-29-2016 at 10:05 AM.
