Hitachi Vantara Pentaho Community Forums

Thread: PDI 4.0: Total length of all fields of the input stream of a step in a transformation

  1. #1
    Join Date
    Jul 2010
    Posts
    12

    Hello,

    As an internal service provider, we want to "sell" our ETL activities internally to our departments, billed by the total amount of data transferred to the target databases.

    What I want to do is simple: take all the fields we send to the "Table Output" steps in the data stream of a transformation, calculate the total length of these fields, and sum that over all rows.

    Of course I want to do this in a generic way, so that I do not have to change the calculation every time a new field is added to the result.

    Sadly, I have no idea how to loop over all fields of the input stream in a JavaScript step or in a "User Defined Java Class" step (or anywhere else).

    Any hints?

    Kind Regards

  2. #2
    Join Date
    Sep 2009
    Posts
    810

    Hi Roman,

    recently I had a similar issue when I was trying to find a generic approach for logging bad rows.

    Check out this post for details:
    http://type-exit.org/adventures-with...entaho-kettle/

    The "flatten source fields" step loops over all fields in the row. Maybe something like that is going to help you.

    Edit: another thing that would work is copying your rows to a "Metadata structure of stream" step from the Utility section, followed by a "Group by" step to find the maximum of "position". I guess that would be the more elegant thing to do.

    Cheers

    Slawo
    Last edited by slawomir.chodnicki; 09-01-2010 at 09:40 AM.

  3. #3
    Join Date
    Jul 2010
    Posts
    12

    Hi Slawo,

    Thank you very much! The method getInputRowMeta() is the secret. My solution now is as follows:


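    // Metadata of the incoming row (field names, types, count)
    var meta = getInputRowMeta();

    // Concatenate the string form of every field in the current row
    var ACCOUNTING = "";
    for (var i = 0; i < meta.size(); i++) {
        ACCOUNTING += meta.getString(row, i);
    }

    // Number of characters in the concatenated row
    var ACCOUNTING_LENGTH = ACCOUNTING.length;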


    Kind Regards
    Roman
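
A possible variation on the loop above, as a minimal sketch only: it adds up each field's length directly instead of building one long string, and it skips null fields. The `makeRowMeta` helper is a hypothetical stand-in so the sketch runs outside PDI; inside a JavaScript step you would use the real `getInputRowMeta()` and `row` instead. Note that `length` counts characters, not bytes, so this is only an approximation of the transferred volume.

```javascript
// Hypothetical stand-in for Kettle's row metadata so the sketch runs outside PDI.
// In the step itself, use getInputRowMeta() and row instead.
function makeRowMeta(fieldCount) {
  return {
    size: function () { return fieldCount; },
    // Same call shape as meta.getString(row, i): string form of the field, or null.
    getString: function (row, i) {
      return row[i] === null || row[i] === undefined ? null : String(row[i]);
    }
  };
}

// Sum the string lengths of all fields in one row, skipping null fields.
function rowLength(meta, row) {
  var total = 0;
  for (var i = 0; i < meta.size(); i++) {
    var s = meta.getString(row, i);
    if (s !== null) {
      total += s.length;
    }
  }
  return total;
}

// Made-up sample rows with three fields each.
var meta = makeRowMeta(3);
var rows = [["abc", 42, null], ["de", 7, "xyz"]];

// Accumulate the per-row lengths over all rows.
var grandTotal = 0;
for (var r = 0; r < rows.length; r++) {
  grandTotal += rowLength(meta, rows[r]);
}
console.log(grandTotal);
```

Summing per field avoids holding a concatenated copy of every row in memory, which may matter for wide rows.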


  4. #4
    Join Date
    Nov 1999
    Posts
    9,729

    Our KFF project will feature a Rejects step (should arrive today) that does pretty much what is described here.

  5. #5
    Join Date
    Nov 1999
    Posts
    9,729

    This step:


  6. #6
    Join Date
    Jul 2010
    Posts
    12

    Hello Matt,

    This new step looks interesting, but I could not work out how it helps me solve my problem of calculating the size of the data transmitted to a target database. I think this step could be part of a solution to Slawo's problem. Am I right?

    Kind regards,

    Roman

  7. #7
    Join Date
    Nov 1999
    Posts
    9,729

    This step not only concatenates the fields, it also reports on an optional key in the row & captures the error handling fields if needed.
    I wouldn't know why the calculation of the "size of the transmitted data" matters in the slightest. Why would you need that?

  8. #8
    Join Date
    Jul 2010
    Posts
    12

    Hello Matt,

    We are an internal service provider for the other departments, and they have to "pay" for our services internally. Other services in our company are paid for, e.g., per user, per MB of database size, or (as we want to do) per amount of data transferred. Our goal is that our production costs (servers, licenses, administration, failure handling and so on) are paid by the departments, so we need some kind of measurement to split our costs among our customers.

    Kind Regards,
    Roman
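
The cost split described above could be sketched as a simple proportional allocation. This is an illustration only: the `allocateCost` function, the department names, and all figures are made up, and the transferred totals are assumed to come from per-row length sums collected in each transformation.

```javascript
// Split a total cost across departments in proportion to the data they transferred.
// Department names and figures are made up for illustration.
function allocateCost(totalCost, transferredByDept) {
  var names = Object.keys(transferredByDept);

  // Total transferred volume across all departments.
  var sum = 0;
  names.forEach(function (n) { sum += transferredByDept[n]; });

  // Each department pays its fraction of the total; nothing is charged if nothing moved.
  var shares = {};
  names.forEach(function (n) {
    shares[n] = sum === 0 ? 0 : totalCost * transferredByDept[n] / sum;
  });
  return shares;
}

var shares = allocateCost(1000, { sales: 600, finance: 300, hr: 100 });
console.log(shares);
```

With these sample figures, a department that moved 60% of the data carries 60% of the cost.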
