Hitachi Vantara Pentaho Community Forums
Thread: Kettle Job(s) cause memory leak w/ BI Platform

  1. #1

    Kettle Job(s) cause memory leak w/ BI Platform

    There is a pretty substantial memory leak when running jobs within the Pentaho BI Platform. This test was run on Pentaho BI 1.2.0.GA on JBoss 4.0.5.GA with updated libs for Kettle 2.5.0 and commons-vfs. The test used no commons-vfs code and practically no actual Kettle implementation (just a START step).

    Using an empty job (just a START step) with no parameters, I am able to bleed memory out of the Pentaho BI Platform. The job is called through the ViewAction REST interface.

    Since this is related to the KettleComponent as opposed to Kettle itself, I'm not sure where to post, but I'm starting here.

    Attached are a sample .xaction and .kjb to reproduce the problem. 50 invocations should show a memory leak of roughly 100 MB.
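
    For readers without the attachments, the invocation loop can be approximated from any HTTP client. This is a minimal sketch, assuming the action is reachable with a plain unauthenticated GET. The URL is taken from the command line because the thread never gives it, and the names ViewActionLoop, get and countOk are made up for illustration. It needs Java 8 or later. The leak itself has to be observed in the server JVM (with a profiler or similar), not in this client.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.concurrent.Callable;

class ViewActionLoop {

    /** Issues one GET, drains the response body, and returns the HTTP status. */
    static int get(String url) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        int status = conn.getResponseCode();
        InputStream body = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        if (body != null) {
            try (InputStream in = body) {
                byte[] buf = new byte[8192];
                while (in.read(buf) != -1) {
                    // discard the content; only the server-side effect matters
                }
            }
        }
        conn.disconnect();
        return status;
    }

    /** Runs the call sequentially 'count' times; returns how many returned 200. */
    static int countOk(int count, Callable<Integer> call) {
        int ok = 0;
        for (int i = 0; i < count; i++) {
            try {
                if (call.call() == 200) {
                    ok++;
                }
            } catch (Exception e) {
                System.err.println("invocation " + (i + 1) + " failed: " + e);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java ViewActionLoop <url> [count]");
            return;
        }
        String url = args[0]; // assumption: full ViewAction URL of the .xaction
        int count = args.length > 1 ? Integer.parseInt(args[1]) : 50;
        int ok = countOk(count, () -> get(url));
        System.out.println(ok + " of " + count + " invocations returned HTTP 200");
    }
}
```

    Run it with the ViewAction URL of the sample .xaction and a count of 50, then compare the server's heap before and after.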

    Note: there are similar memory issues for transformations, but I have not tested transformations in as controlled a way as I have jobs.

  2. #2

    The leak was not just in the platform; it was in Kettle jobs.
    It's been fixed in version 2.4.0.

  3. #3


    I'm using Kettle 2.5.0.

    Single-threaded invocations cause more problems the more often they are repeated. Concurrent (2-5 thread) calls cause a lot more memory problems.

    Looking at various memory profilers, I can't see any specific objects that are causing problems (I'm no expert though). I can say that all the memory issues seem to be in tenured (old generation) heap memory, which is not getting gc'd effectively, or at all.
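
    The tenured-generation observation can also be checked without a profiler through the standard java.lang.management API. This is a minimal sketch, assuming it runs inside the same JVM as the BI Platform (for example from a scratch JSP or servlet). Pool names differ by collector ("Tenured Gen", "PS Old Gen", "G1 Old Gen" and so on), so the code reports every heap pool rather than guessing a name. HeapPoolReport and usedBytesByHeapPool are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

class HeapPoolReport {

    /** Returns the bytes currently used in each heap memory pool, keyed by pool name. */
    static Map<String, Long> usedBytesByHeapPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage != null) {
                used.put(pool.getName(), usage.getUsed());
            }
        }
        return used;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : usedBytesByHeapPool().entrySet()) {
            System.out.println(e.getKey() + ": " + (e.getValue() / 1024) + " KB used");
        }
    }
}
```

    Logging this before and after a batch of invocations shows whether the old-generation pool keeps growing even after collections.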

  4. #4

    Well, this might be good timing, since my colleague Nick is going to debug the KettleComponent for another problem with jobs and variables.
    I'll forward him this thread.

    At roughly 2 MB per invocation, there is plenty of stuff that could fit, but perhaps it's simply the KettleVariables not being de-allocated when the KettleComponent thread is done (or something like that).

    All the best & thanks for the feedback,


  5. #5


    Just some additional information.

    I tested session timeouts in the web.xml (since I'm using the ViewAction) and those do not seem to have an impact: the memory issues persist well beyond session timeouts. I confirmed this by watching various profilers; the sessions disappear but memory is still leaking.

    I also tested ServiceAction instead of ViewAction, and it has similar memory problems.

    KettleComponent: commenting out the job.execute() line had a significant impact. Practically no memory issues (it didn't do anything, but still). It looks like the KettleComponent itself is fine, and the memory leak seems to be somewhere in the flow of Kettle's job.execute() method.
    Last edited by dhartford; 06-06-2007 at 10:17 AM.

  6. #6

    memory leak found - endProcessing()

    Looking at the Job class, I noticed it extends Thread.

    Looking at the run() method, it has:
    Result result = execute(); // Run the job
    endProcessing("end", result);

    KettleComponent calls job.execute(), but not endProcessing("end", result). Adding that call resolved the memory issues. I reran the 50-invocation test and all was good.

    I'm not sure if there was a reason this was not included, but this identifies a potential fix for the memory leak. I do not have JIRA access, so go ahead and file it.
    Last edited by dhartford; 06-06-2007 at 11:49 AM.

  7. #7

    Note: "endProcessing()" closes and removes the StringAppender of our Log4J wrapper LogWriter.
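
    Taken together with the fix in post #6, the mechanism can be modeled in a few lines. This is a sketch under stated assumptions, not Kettle's code: FakeJob and the static APPENDERS map are hypothetical stand-ins for Job and for whatever registry LogWriter keeps, the result is a plain String rather than Kettle's Result, and the thread does not say where the real appender gets registered. It only illustrates why skipping the cleanup call leaves one buffer reachable per invocation, and why a finally block is a reasonable place for that call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class EndProcessingSketch {

    /** Hypothetical stand-in for a logging registry holding one buffer per job. */
    static final Map<Object, StringBuilder> APPENDERS = new ConcurrentHashMap<>();

    /** Hypothetical stand-in for Kettle's Job; this is not the real class. */
    static class FakeJob {
        String execute() {
            StringBuilder buffer = new StringBuilder();
            APPENDERS.put(this, buffer); // buffer is now reachable from a static map
            buffer.append("job started\n");
            return "ok";
        }

        void endProcessing(String status, String result) {
            APPENDERS.remove(this); // without this the buffer is never released
        }
    }

    /** Runs 'count' jobs and returns how many buffers are still registered. */
    static int runJobs(int count, boolean callEndProcessing) {
        APPENDERS.clear();
        for (int i = 0; i < count; i++) {
            FakeJob job = new FakeJob();
            String result = null;
            try {
                result = job.execute();
            } finally {
                // Cleanup runs even if execute() throws.
                if (callEndProcessing) {
                    job.endProcessing("end", result);
                }
            }
        }
        return APPENDERS.size();
    }

    public static void main(String[] args) {
        System.out.println("without endProcessing: " + runJobs(50, false) + " buffers retained");
        System.out.println("with endProcessing: " + runJobs(50, true) + " buffers retained");
    }
}
```

    In this model, 50 runs without the cleanup call leave 50 buffers registered, and 50 runs with it leave none.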


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.