Hitachi Vantara Pentaho Community Forums

Thread: loop in a job

  1. #1

    loop in a job

    Hi

    I have a job with several transformations in a loop (it has to run continuously). Because of an unexpected error, the job stopped, and the log file contained thousands of entries with the message "job entry had finished". Is it the case that a job entry does not finish until the whole job has finished?

  2. #2
    Join Date
    May 2006
    Posts
    4,882

    It's somewhere in the Spoon documentation: you can write loops, but unless you're careful the job will eventually abort with "out of memory" errors.

    If you want to loop endlessly you may have to find another way, such as an inittab entry on UNIX.

    Regards,
    Sven

  3. #3
    DEinspanjer Guest

    I had the exact same problem.
    A Kettle job keeps a lot of metadata about the sub-jobs and transformations it has invoked: when they started, when they ended, their log streams, etc.

    If you run a sub-job or transformation in a loop, and that loop gets executed thousands of times, you end up with an excessive log and you might even (as in my case) eventually run out of memory.

    Hopefully some future work will go into handling this better, but for now, go outside of Kettle and have an external process, such as a cron job, invoke Kitchen. That way everything is shut down and cleaned up between executions.
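
    As a rough illustration of the "external process" idea, here is a minimal Python sketch of a wrapper that starts a fresh Kitchen process for each run. The paths, the function names (build_command, run_repeatedly) and the pause value are assumptions made up for this example, not anything from my actual setup; the only Kitchen options used are -file and -level.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# Hypothetical paths, for illustration only; adjust to your installation.
KITCHEN = "/opt/pentaho/data-integration/kitchen.sh"
JOB_FILE = "/etl/jobs/load.kjb"


def build_command(kitchen: str, job_file: str, level: str = "Basic") -> List[str]:
    """Build the argument list for one Kitchen run of a .kjb job file."""
    return [kitchen, f"-file={job_file}", f"-level={level}"]


def run_repeatedly(
    command: Sequence[str],
    runs: int,
    pause_seconds: float,
    run: Callable[[Sequence[str]], int] = subprocess.call,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Start a fresh process per run and collect the exit codes.

    Each run is a separate process, so whatever the job accumulated in
    memory is released when that process exits. The loop stops early on
    the first non-zero exit code.
    """
    exit_codes: List[int] = []
    for i in range(runs):
        code = run(command)
        exit_codes.append(code)
        if code != 0:
            break
        if i < runs - 1:
            sleep(pause_seconds)
    return exit_codes
```

    Calling run_repeatedly(build_command(KITCHEN, JOB_FILE), runs=3, pause_seconds=30) would launch the job up to three times, 30 seconds apart. Every launch pays the full process startup cost, so this is the same trade-off as cron, just with a finer interval.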

  4. #4

    Once again, thank you for your answers. I am already using a cron job.

  5. #5
    Join Date
    May 2008
    Posts
    9

    related problem

    I have a similar problem. There's significant overhead involved in starting a job (e.g. via Kitchen), and if I need the job to run every half a minute or so, the overhead is simply too big.

    I had the idea to build my job to iterate x times and then shut down, but it would seem (according to this thread) that Kettle doesn't keep resources allocated but instead re-allocates them every time. This would mean that I would have the overhead in any case. Furthermore, I haven't been able to make the loop work.
    I've tried to set a variable and then evaluate and decrement it in a subsequent JavaScript job step, but it doesn't work.

    What is the best solution for running ETL jobs at high frequency?
