Hitachi Vantara Pentaho Community Forums

Thread: Jobs stop for no reason

  1. #1

    Jobs stop for no reason

    My client has a Windows Server 2012 R2 box with SQL Server 2016. We use PDI 7. The loads are run from the Windows scheduler. Quite often PDI stops somewhere in a job for no apparent reason. The log of the transformation shows nothing wrong, just that it ends with STOP instead of END. This causes the job to stop prematurely without running all the transformations. This is one of the main reasons why my client is now abandoning PDI and going (back) to SSIS. Who else has this issue and knows what causes it? I think it's [$#&^$#^#&^$} that it's so unreliable.

  2. #2
    Join Date
    Apr 2008
    Posts
    4,696

    I used PDI to scrape data from a web site and place it into a MS SQL database daily for over two years straight, with no issues.
    In that time, I didn't have an unscheduled "STOP" at all.

    Perhaps it has to do with your design, or conflict on the table?

    If your client has already switched to SSIS, what does it matter?

  3. #3

    How much data are we talking about with you?
    The client just chose to go to SSIS after months of this issue. For me the question is whether I can decently promote PDI when it is this unstable.

  4. #4
    Join Date
    Apr 2008
    Posts
    4,696

    About 200 MB of data a day, running once a day. I couldn't schedule it to run in smaller batches.

    You keep saying that PDI is unstable, but you seem to ignore that several large companies (pretty sure I saw Mozilla as a client in their list) use it in production.

  5. #5
    Join Date
    Aug 2016
    Posts
    290

    I use it for statistics, loading large amounts of data every 5 minutes and every hour, and aggregating to daily statistics once after midnight.

    It runs really stably. Most of the time, if something goes wrong, it is a bug introduced by the developer.

  6. #6

    gutlez, thank you for the useless comment. I can't do anything with the "(t)here it runs perfectly" smugness. Here it's UNstable and I am looking for people who can make a constructive contribution to finding the cause of this instability and how to solve it. Also, I doubt that Mozilla runs on SQL Server and Windows Server. I know I wouldn't if I had the choice.

    So who experienced this behaviour and found a solution? I will be grateful for it.

  7. #7
    Join Date
    May 2016
    Posts
    282

    Really? You might have a complex structure that is impossible to analyze from just a three-line description in a forum. @gutlez and @Sparkles have just told you that it can handle the amount and type of data you are working with, but they can't help you because there's not enough information.
    OS: Ubuntu 16.04 64 bits
    Java: Openjdk 1.8.0_131
    Pentaho 6.1 CE

  8. #8
    Join Date
    Aug 2016
    Posts
    290

    I managed to post this reply in the wrong thread:

    If something isn't behaving as expected, I suggest putting some log steps between your jobs and inside transformations.

    Just log a sequence, for example "Debug 1", "Debug 2"... etc. This way you can see between which two log outputs it stops/fails. Then you can narrow it down by moving the debug log steps closer and closer to each other. I usually set the log level to "ERROR" on these types of debug logs, so they still show up in case you forget to change the log level when executing.

  9. #9

    Thanks Sparkles for the suggestion. You mean these as steps in the transformation?

    What I see is that it often stops at the end of the transformation. It appears as if Pentaho fails to process the result properly. Any experience with that?

  10. #10
    Join Date
    Aug 2016
    Posts
    290

    Yes, I have experience with that.

    I have seen Kettle stop and hang forever. This was caused by a lot of rows queuing up in front of a wait step. I simply had a step waiting until another step finished, but it caused a "traffic jam" (if you think of rows between steps as cars on a road), so everything just came to a standstill. If that's the case, you need to replace the wait step with something that keeps rows constantly moving.
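
    The "traffic jam" can be pictured with a generic bounded buffer. The sketch below is plain Python, not PDI code, and makes no claim about how Kettle implements its row buffers; the buffer size of 3 and the helper name push_rows are made up for illustration. It only shows the general effect: once nothing is reading from a full buffer, the upstream side cannot push any more rows.

```python
import queue


def push_rows(buffer, rows, timeout=0.05):
    """Try to push rows into a bounded buffer.

    Returns how many rows were accepted before the buffer was full
    and the producer would have had to wait.
    """
    accepted = 0
    for row in rows:
        try:
            # put() blocks while the buffer is full; the timeout lets
            # this demo give up instead of hanging forever.
            buffer.put(row, timeout=timeout)
        except queue.Full:
            break
        accepted += 1
    return accepted


if __name__ == "__main__":
    # Assumption: an arbitrary, tiny buffer standing in for the rows
    # queued in front of a waiting step. Nothing reads from it.
    buffer_between_steps = queue.Queue(maxsize=3)
    accepted = push_rows(buffer_between_steps, range(10))
    print(f"accepted {accepted} of 10 rows before the producer stalled")
```

    Without the timeout, the producer would simply wait forever, which looks like a hang from the outside.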

    With debug/log steps, I mean between jobs (I thought this was a problem about jobs stopping). But also put them inside transformations. Inside transformations, you want to limit the number of rows that get logged, since you basically just want a "signal" (a log printout) that the program has reached that step. Alternatively, add some logic to capture the last row and log that.

    Just put log steps between all jobs and transformation steps where you think it might stop. Then you can see in log file something like this:

    "Debug 1"
    "Debug 2"
    "Debug 3"

    If you had a step printing "Debug 4" but this is not in the log, you know that somewhere between log 3 and log 4 it stopped. Then you can put debug log 3 and 4 closer to each other (narrowing the gap). In the end, you will have a pretty good idea where the logging (and therefore execution) stops.
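
    The narrowing-down described above can also be checked by a small script once the log is on disk. This is only a sketch in Python: it assumes the markers are written literally as "Debug N" somewhere on a log line, and the function name locate_stop and the sample lines are made up for illustration, not real PDI log output.

```python
import re

# Assumption: markers appear literally as "Debug <number>" in the log.
DEBUG_MARKER = re.compile(r"Debug (\d+)")


def locate_stop(log_lines, expected_markers):
    """Report between which two debug markers execution stopped.

    expected_markers is the ordered list of marker numbers placed in
    the job. Returns (last_seen, first_missing); first_missing is None
    when every expected marker appears in the log.
    """
    seen = {
        int(match.group(1))
        for line in log_lines
        for match in DEBUG_MARKER.finditer(line)
    }
    last_seen = None
    for marker in expected_markers:
        if marker not in seen:
            return last_seen, marker
        last_seen = marker
    return last_seen, None


if __name__ == "__main__":
    # Made-up sample lines; a real log would be read from the log file.
    sample_log = ['"Debug 1"', '"Debug 2"', '"Debug 3"']
    last_seen, first_missing = locate_stop(sample_log, [1, 2, 3, 4])
    print(f"last marker reached: {last_seen}, first missing: {first_missing}")
```

    With the sample lines, the script reports marker 3 as the last one reached and marker 4 as the first missing one, so the stop happened somewhere between those two log steps.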

    I find that debugging at the transformation level (if the problem can be re-created in Spoon) is quite rich, but debugging at the job level is really poor. I really miss breakpoints.
    Last edited by Sparkles; 04-26-2018 at 06:53 AM.

  11. #11
    Join Date
    May 2016
    Posts
    282

    Quote: Originally posted by Sparkles
    I find that debugging in transformation level (if problem can be re-created in spoon) is quite rich, but debugging on job level is really poor. I really miss breakpoints.
    I haven't played with this, so it might not suit your purpose; it's on my permanent to-check list for when I have some time. Have you taken a look at Matt Casters' PDI plugin for unit testing? I think it's available from the marketplace. Here's the GitHub page: https://github.com/mattcasters/pentaho-pdi-dataset
    It might provide some functionality for debugging jobs. For now I debug using the text output step at strategic points inside the transformations called by the job.
    Regards
