Hitachi Vantara Pentaho Community Forums

Thread: Get a File with FTP results not passing to transformation with execute for every row

  1. #1
    Join Date
    Feb 2011
    Posts
    840

    Even though my last question remains with no replies, which makes me wonder how dead this forum is lately, I've got a new problem =p

    I'm running a simple START > Get a file with FTP > Transformation job. At first it ran OK, but then I hit a scenario with too many files, and the transformation aborts with a Java heap error for lack of memory. So I remembered the "Execute for every input row" option in the transformation options, and once I enabled it the job stopped working. I ran the job at Rowlevel logging and nothing shows up for the transformation, which starts with the "Get files from result" step and also copies the info to a Write to log step. Even so, nothing shows up:

    Code:
    2015/02/20 17:38:51 - Get a file with FTP - =======================================
    2015/02/20 17:38:51 - Get a file with FTP - Nr errors : 0
    2015/02/20 17:38:51 - Get a file with FTP - Nr files downloaded : 19
    2015/02/20 17:38:51 - Get a file with FTP - =======================================
    2015/02/20 17:38:51 - transformation - Starting entry [transformation]
    2015/02/20 17:38:51 - transformation - exec(2, 0, transformation.0)
    2015/02/20 17:38:51 - transformation - Starting job entry
    2015/02/20 17:38:51 - transformation - Opening transformation: [file:///G:/Pentaho/path/transformation.ktr]
    2015/02/20 17:38:51 - transformation - Loading transformation from XML file [file:///G:/Pentaho/path/transformation.ktr]
    2015/02/20 17:38:51 - transformation - Finished job entry [transformation] (result=[true])
    2015/02/20 17:38:51 - transformation - Finished job entry [Get a file with FTP] (result=[true])
    2015/02/20 17:38:51 - transformation - Job execution finished
    2015/02/20 17:38:51 - Spoon - Job has ended.
    2015/02/20 17:51:53 - Spoon - Spoon
    Join us on IRC! =)

    Twitter / Google+ / Timezone: BRT-BRST
    BI Server & PDI 5.4 / MS SQL 2012 / Learning CDE & CTools
    Windows 8 64-bit / Java 7 (jdk1.8.0_75)

    Quote Originally Posted by gutlez
    PLEASE NOTE: No forum member is going to do your work for you. We will help you sort out how to do a specific part of the work, as best we can, in the timelines that our work will allow us.

    I'm no expert.Take my comments at your own risk.

  2. #2

    Hi joao.ciocca,
    I've had a very similar problem. What I found out is that when you check the "Execute for every input row" option, the Get files from result step will still get all the files at once.
    Try passing the path to the local file through a parameter/variable; this is how I solved my issue.

  3. #3
    Join Date
    Feb 2011
    Posts
    840

    Hey Lukasz! I was checking a couple things, thought I'd try using a transformation in between... what I've figured out:

    1- put a transformation in the middle so I could have the source filename, destination filename and a wildcard with the filename, to copy the new files I got from FTP to another folder (I can't do this straight from Get FTP because I need to compare the files I'm getting with older files already downloaded).
    2- used another transformation to list only the files in this new folder. It contains only Get File Names, Write to log and Set files in result. One example:
    Code:
    2015/02/20 19:11:51 - Write to log.0 - ------------> Linenr 1------------------------------
    2015/02/20 19:11:51 - Write to log.0 - filename = G:\Pentaho\RAROC\RAROC por Produto\carga SICRS\arquivo\novos\C070054_TIPO03_16512486.ZIP
    2015/02/20 19:11:51 - Write to log.0 - short_filename = C070054_TIPO03_16512486.ZIP
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ====================
    2015/02/20 19:11:51 - Get File Names.0 - Finished processing (I=0, O=0, R=0, W=2, U=0, E=0)
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ------------> Linenr 2------------------------------
    2015/02/20 19:11:51 - Write to log.0 - filename = G:\Pentaho\RAROC\RAROC por Produto\carga SICRS\arquivo\novos\C070054_TIPO03_16531891.ZIP
    2015/02/20 19:11:51 - Write to log.0 - short_filename = C070054_TIPO03_16531891.ZIP
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ====================
    3- then I used the transformation with Execute for every input row... and here's what I got from Write to log right after Get files from result:
    Code:
    2015/02/20 19:11:51 - Write to log.0 - ------------> Linenr 1------------------------------
    2015/02/20 19:11:51 - Write to log.0 - filename = C070054_TIPO03_16512486.ZIP
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ====================
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ------------> Linenr 2------------------------------
    2015/02/20 19:11:51 - Write to log.0 - filename = C070054_TIPO03_16531891.ZIP
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ====================
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ------------> Linenr 3------------------------------
    2015/02/20 19:11:51 - Write to log.0 - filename = C070054_TIPO03_16531891.ZIP
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ====================
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ------------> Linenr 4------------------------------
    2015/02/20 19:11:51 - Write to log.0 - filename = C070054_TIPO03_16512486.ZIP
    2015/02/20 19:11:51 - Write to log.0 - 
    2015/02/20 19:11:51 - Write to log.0 - ====================
    So... yeah, files are getting messed up when being passed around with Set files in result/Get files from result and Execute for every input row. Guess I'll just switch to passing them as data rows instead of files.

    JIRA created.

  4. #4
    Join Date
    Jun 2012
    Posts
    5,534

    There's more than a single stream between transformations (and jobs).
    The advanced option Execute-for-every-input-row refers only to the general data stream, the one equivalent to the stream directed by hops. The list of result files travels separately and is not split up by that option.
    So long, and thanks for all the fish.

  5. #5
    Join Date
    Feb 2011
    Posts
    840

    I don't think I understood that, marabu... Every other time I've used "Execute for every input row", it worked as intended. If I use "Copy rows to result" instead of "Set files in result", the job works as I wanted, so the problem still seems to be with "Set files in result".

  6. #6
    Join Date
    Apr 2008
    Posts
    4,690

    Think of having two parallel streams... "Row Results" and "File Results"

    Set files in result adds entries to the "File Results" stream.
    Execute for every input row runs the job entry once for each entry in the "Row Results" stream.
    The "Clear Results" option just clears the "Row Results" stream (from what I recall).

    So if you are doing action X for every row in the "Row Results" stream which then does something on every row in the "File Results" stream... it will do it for every file in *THAT* stream.
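
    To make the two-stream picture concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not PDI's actual code: the Result class and both function names are made up for illustration, and the file names are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Toy stand-in for what one job entry hands to the next (hypothetical)."""
    rows: list = field(default_factory=list)   # "Row Results" stream
    files: list = field(default_factory=list)  # "File Results" stream


def run_transformation(result, input_row, log):
    """One run of a transformation that starts with 'Get files from result'."""
    for name in result.files:  # reads the whole file list, ignores input_row
        log.append((input_row["filename"], name))


def execute_for_every_input_row(result):
    """Run the transformation once per entry in the row stream only."""
    log = []
    for row in result.rows:
        run_transformation(result, row, log)
    return log


files = ["a.zip", "b.zip"]

# Only the file stream is filled: there are no rows, so nothing runs at all.
only_files = Result(files=files)
print(execute_for_every_input_row(only_files))  # []

# Both streams filled: 2 runs, and each run still sees both files.
both = Result(rows=[{"filename": f} for f in files], files=files)
runs = execute_for_every_input_row(both)
print(len(runs))  # 4
```

    In this model, a job that only fills the file stream never starts the transformation, and a job that fills both streams hands the full file list to every run.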

  7. #7
    Join Date
    Feb 2011
    Posts
    840

    So, if I want a transformation to run once per file, I'll have to use Copy rows to result instead of Set files in result, which is what I thought was only a workaround.

    Seems too weird.
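
    For reference, the row-based approach can be sketched like this in Python. It is a toy model and not PDI code: the function names and the FILENAME parameter are assumptions made for illustration, and the file names are placeholders.

```python
def run_transformation(params, log):
    """One run: handles only the file named in its own parameters."""
    log.append(params["FILENAME"])


def execute_for_every_input_row(rows):
    """One run per result row; the row's field is passed in as a parameter."""
    log = []
    for row in rows:
        run_transformation({"FILENAME": row["filename"]}, log)
    return log


# Rows as "Copy rows to result" would hand them over, one per file.
rows = [{"filename": "a.zip"}, {"filename": "b.zip"}]
processed = execute_for_every_input_row(rows)
print(processed)  # ['a.zip', 'b.zip']
```

    Each run only ever touches one file, which is what keeps memory use flat when the number of files grows.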


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.