Hitachi Vantara Pentaho Community Forums

Thread: Corrupt rows after javascript step

  1. #1

    Corrupt rows after javascript step

    I am using the following code in a javascript step to fill gaps in some data:

    Code:
    currow = createRowCopy(getOutputRowMeta().size());
    
    if (prevrow == null)
    {
      prevrow = currow;
      putRow(currow);
    }
    else
    {
      if (   !(currow[0] == prevrow[0] && (currow[2] - prevrow[2] <= 60)) // same ticker, but previous row is more than 1 minute lagged
          && !(currow[0] != prevrow[0] && (prevrow[2] - currow[17] == 0)) // different ticker, previous row's time not equal to max time for
                                                                          // the running instance of this transformation
         )
      {
        var timeval = floor(prevrow[2]) + 60;
    
        while (   (currow[0] == prevrow[0] && (currow[2] - timeval > 0)) // loops until all holes prior to the current row are filled in
               || (currow[0] != prevrow[0] && timeval <= currow[17])
              )
        {
          prevrow[2] = java.lang.Long(timeval);    // prevrow gets populated at the very bottom.
          putRow(prevrow);                         // The only value that needs to be changed is
          timeval    = floor(timeval) + 60;        // the unixtime value
        }
    
        if (currow[0] == prevrow[0]) // now that gaps before current row are fixed, if the current row has the same ticker
        {                            // as the previous row, then set the session open rate equal to the previous row's
                                     // session close, and related operations
    //      println(prevrow[3] + ", " + prevrow[4] + ", " + prevrow[5] + ", " + prevrow[6] + ", " + prevrow[7] + ", " + prevrow[8]);
    //      println(currow[3]  + ", " + currow[4]  + ", " + currow[5]  + ", " + currow[6]  + ", " + currow[7]  + ", " + currow[8]);
          currow[3] = prevrow[4];             // open = previous close
          currow[5] = currow[4] - currow[3];  // calculate the delta: close - open
          currow[6] = prevrow[7];             // log(open) = previous log(close)
          currow[8] = currow[7] - currow[6];  // log(close) - log(open)
    //      println(prevrow[3] + ", " + prevrow[4] + ", " + prevrow[5] + ", " + prevrow[6] + ", " + prevrow[7] + ", " + prevrow[8]);
    //      println(currow[3]  + ", " + currow[4]  + ", " + currow[5]  + ", " + currow[6]  + ", " + currow[7]  + ", " + currow[8] + "\n");
          putRow(currow);
        }
        else
        {
          putRow(currow); // While it's obviously redundant, if I put a single putRow at the bottom, the changes in the 'true'
                          // branch fall out of scope -- not sure why
        }
      }
    
      prevrow     = currow;         // All of these are presets meant to fill a gap while being very conservative in 'guessing'
      prevrow[3]  = prevrow[4];     // what correct values should be during the missing times.  Many of the missing rows will
      prevrow[5]  = 0;              // just be times over a weekend or whatever.  Others will be true gaps in the data that may
      prevrow[6]  = prevrow[7];     // be filled in later.
      prevrow[8]  = 0;
      prevrow[9]  = prevrow[4];
      prevrow[10] = prevrow[4];
      prevrow[11] = 0;
      prevrow[12] = 0;
      prevrow[13] = java.lang.Long(0);
      prevrow[14] = 0;
      prevrow[15] = 0;
      prevrow[16] = 0;
    }
    
    trans_Status = SKIP_TRANSFORMATION;

    There are two other, smaller scripts: one fills in the last few rows when the last ticker in the set is missing data, and an init script creates the variables used by the other two scripts. If you feel they're relevant, they're in the attached example.

    To reproduce: open the attached example transformation, point the file input step at the text file in the archive, then preview the results from the javascript step. Take note of the unixtime and ticker fields in the first 12 or so lines. The specific values don't matter, just what the unixtime field generally contains and which ticker is represented.

    Now highlight the dummy step and preview it, again taking note of the first 12 or so lines. The unixtime field no longer matches what the javascript step preview showed.

    This is the behavior I'm baffled about. Anyone have a comment? Is this a bug, or am I doing something wrong?

    For reference, I've tested this on 3.0.4, 3.1, 3.2, and the latest build that passed unit tests.

    -Brian
    Last edited by Phantal; 07-29-2009 at 11:16 PM.

  2. #2
    DEinspanjer Guest


    When you assign prevrow = currow, you are just copying the reference to the array. When you go changing the elements in that array, you are in fact changing an array of data that has already been passed on to the next step and is potentially being modified there as well. I believe this might be the source of your problem.

    There is a method in the JS step to specifically copy an array. Find and use that instead and see if it helps.
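
To illustrate the aliasing described above, here is a minimal sketch in plain JavaScript, run outside Kettle. The `sent` list stands in for the next step's input buffer, and `putRowStub` and `copyRow` are hypothetical helpers, not part of the Kettle API. Real Kettle rows are Java object arrays rather than JavaScript arrays, so treat this only as a model of the behavior.

```javascript
// Stand-in for the downstream step: it keeps the rows it was handed.
var sent = [];

// Hypothetical stand-in for putRow(): stores the reference, not a copy.
function putRowStub(row) {
  sent.push(row);
}

// Hypothetical helper: element-by-element shallow copy of a row.
function copyRow(row) {
  var copy = [];
  for (var i = 0; i < row.length; i++) {
    copy[i] = row[i];
  }
  return copy;
}

// Aliased: prevrow and currow name the same array.
var currow = ["ABC", "x", 1000];
putRowStub(currow);
var prevrow = currow;
prevrow[2] = 1060;                // also changes the row already sent
var aliasedValue = sent[0][2];    // 1060

// Copied: prevrow2 is a separate array.
var currow2 = ["ABC", "x", 1000];
putRowStub(currow2);
var prevrow2 = copyRow(currow2);
prevrow2[2] = 1060;               // the sent row is untouched
var copiedValue = sent[1][2];     // 1000
```

In the aliased case the row that was already handed downstream changes after the fact, which would be consistent with the symptom in the first post: the preview of a later step shows different unixtime values than the preview of the javascript step.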

  3. #3


    Originally posted by DEinspanjer:
    When you assign prevrow = currow, you are just copying the reference to the array. When you go changing the elements in that array, you are in fact changing an array of data that has already been passed on to the next step and is potentially being modified there as well. I believe this might be the source of your problem.

    There is a method in the JS step to specifically copy an array. Find and use that instead and see if it helps.

    Thank you. I figured that out about 10 minutes ago and fixed it. I just wish there were a relatively straightforward way to reference fields by column name in a copied row.

    I didn't find a method for copying a row I already have a reference to (e.g., row.clone() throws an exception, one that the Kettle devs seem to go out of their way to give a workaround for), so instead I'm building a new array for each putRow call, in something of the form:

    Code:
      putRow([prevrow[0], prevrow[1], ..., ]);

    Plus, I cleaned it up quite a bit so it's more readable =)

    -Brian
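
As a rough sketch of how the copy-per-putRow approach could be applied to the gap-filling loop from the first post: the code below is plain JavaScript with ordinary arrays standing in for Kettle rows. `copyRow`, `fillGaps`, and the `emit` callback are hypothetical names (in the step itself `emit` would presumably be `putRow`), and only the same-ticker case is covered. It assumes the previous row has already been emitted and that times are in whole seconds.

```javascript
// Hypothetical helper: element-by-element shallow copy of a row.
function copyRow(row) {
  var copy = [];
  for (var i = 0; i < row.length; i++) {
    copy[i] = row[i];
  }
  return copy;
}

// Emit one filler row for every missing 60-second slot between prev and
// cur, then emit cur. Each emitted row is a fresh array, so later changes
// to prev or cur cannot alter rows that were already handed downstream.
// timeIndex is the position of the unixtime field (2 in the original script).
function fillGaps(prev, cur, timeIndex, emit) {
  var t = prev[timeIndex] + 60;
  while (t < cur[timeIndex]) {
    var filler = copyRow(prev);   // new array on every iteration
    filler[timeIndex] = t;
    emit(filler);
    t += 60;
  }
  emit(copyRow(cur));
}

// Usage: a three-minute jump produces two filler rows plus the current row.
var out = [];
fillGaps(["ABC", "x", 1000], ["ABC", "x", 1180], 2, function (row) {
  out.push(row);
});
// out now holds rows with unixtime 1060, 1120 and 1180.
```

Copying inside the loop is the important part. Reusing a single `prevrow` array and only changing its time field would hand the same object downstream repeatedly, which is the aliasing problem described in post #2.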


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.