Hitachi Vantara Pentaho Community Forums

Thread: Clean up memory after a CSV input load into a database

  1. #1
    Join Date
    Oct 2014
    Posts
    7

    Default Clean up memory after a CSV input load into a database

    Hi Guys,

    I am loading a CSV file into a database using the "CSV Input" step, and I observe that Spoon's memory usage keeps growing over time, right up to the end of the transformation. When the transformation finishes, the memory does not appear to be released. My file is about 1.3 GB, with 900k rows and 90 columns.

    Following the advice in this post by Matt: http://www.ibridge.be/?p=202 my kettle.properties contains these settings:

    KETTLE_MAX_LOG_SIZE_IN_LINES=1
    KETTLE_MAX_JOB_TRACKER_SIZE=1
    KETTLE_CARTE_OBJECT_TIMEOUT_MINUTES=1
    KETTLE_MAX_JOB_ENTRIES_LOGGED=1
    KETTLE_STEP_PERFORMANCE_SNAPSHOT_LIMIT=1

    I am using Kettle version 5.1.0.

    Can someone explain how to force the cleanup of memory at the end of each transformation? Thanks in advance.
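    In case it matters: if I understand correctly, the Spoon launcher scripts honor the PENTAHO_DI_JAVA_OPTIONS environment variable, so something like the lines below should at least cap how far the process can grow (this is only my assumption, I have not verified that it is the recommended way):

    set PENTAHO_DI_JAVA_OPTIONS=-Xms1g -Xmx2g          (Windows, before starting spoon.bat)
    export PENTAHO_DI_JAVA_OPTIONS="-Xms1g -Xmx2g"     (Linux, before starting spoon.sh)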

  2. #2
    Join Date
    Oct 2014
    Posts
    7

    Default Test done on Kettle 6.0, currently available for download

    I checked the same transformation on Kettle version 6.0 and I still see that the memory is not released at the end of the transformation.

    It seems that javaw.exe keeps holding on to the memory after the transformation has finished.

    What am I doing wrong?


  3. #3
    Join Date
    Oct 2014
    Posts
    7

    Default

    Let me add more information: if I run a transformation with only two steps:

    1) read csv
    2) dummy

    the file is read and the memory used remains constant. As a consequence, it seems that the problem starts when I write to the database:

    1) read csv
    2) insert into database

  4. #4
    Join Date
    Nov 1999
    Posts
    9,729

    Default

    The Java VM doesn't clean up or garbage-collect memory unless it absolutely has to. Also make sure your transformation is gone from Carte.
    Finally, we've had database JDBC drivers with memory leaks, so make sure to try a run without loading into a database.
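
    To make the first point concrete, here is a small stand-alone illustration in plain Java (nothing PDI-specific, just a sketch): after the big objects become unreachable and a collection is requested, the used heap drops, but the heap the JVM has already claimed from the operating system usually stays with the process, which is what the task manager shows for javaw.exe.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch: used memory drops after a GC, but the committed heap is usually
    // not handed back to the OS right away.
    public class HeapStaysCommitted {
        public static void main(String[] args) {
            Runtime rt = Runtime.getRuntime();
            List<byte[]> blocks = new ArrayList<byte[]>();
            for (int i = 0; i < 20; i++) {
                blocks.add(new byte[10 * 1024 * 1024]); // roughly 200 MB in total
            }
            print(rt, "after allocation");
            blocks.clear();   // make the blocks unreachable
            System.gc();      // request a collection (only a hint to the JVM)
            print(rt, "after gc");
        }

        private static void print(Runtime rt, String label) {
            long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
            long committedMb = rt.totalMemory() / (1024 * 1024);
            System.out.println(label + ": used=" + usedMb + " MB, committed=" + committedMb + " MB");
        }
    }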

  5. #5
    Join Date
    Oct 2014
    Posts
    7

    Default

    Thanks Matt for your quick reply. As far as I know, Carte is used to distribute and coordinate job execution on a Kettle cluster, but currently I would like to run my job on a single machine with 8 GB of RAM.


    Even when I run my simple job from the command line with Kitchen instead of Spoon, the memory still appears to be in use at the end of the transformation when writing through OJDBC. So is there no way to overcome this issue on a single node?
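
    For reference, the run looks roughly like this (the job path is just a placeholder); I am assuming Kitchen honors the same PENTAHO_DI_JAVA_OPTIONS variable as Spoon, which I have not confirmed:

    export PENTAHO_DI_JAVA_OPTIONS="-Xmx2g"
    ./kitchen.sh -file=/path/to/load_job.kjb -level=Basic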
