Hitachi Vantara Pentaho Community Forums

Thread: Memory usage differences in PDI4...

  1. #1
    Join Date
    Apr 2007
    Posts
    2,009

    Hi, we're upgrading from PDI 3 to 4, and all has gone well apart from one specific job: it now runs out of heap memory, which it didn't in 3.

    So we reduced the rowset size in all the transformations, as we have to do in some of our others, but that didn't help.

    Couple of questions then:

    1. Could I use JavaScript between the steps in the job to gather some stats about memory usage, so I can try to work out which transformations are hogging the memory? Could I even force a GC between transformations?

    2. I'd expect memory to fall when a transformation finishes, and rise again when the next one starts, correct? Especially if you're not copying rows to result etc?
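
    A minimal sketch of the kind of measurement question 1 asks about, using only standard JDK calls. The class name `HeapSnapshot` and the labels are hypothetical and not part of PDI. Whether and how this can be called from a script entry in a job is an assumption that is not verified here.

```java
// Hypothetical helper, not part of PDI: reports heap usage at a named point.
final class HeapSnapshot {
    private static final long MB = 1024L * 1024L;

    /** Heap currently in use by this JVM, in bytes (committed minus free). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** One log line describing the heap at a labelled point in the job. */
    static String describe(String label) {
        Runtime rt = Runtime.getRuntime();
        return String.format("%s: used=%dMB committed=%dMB max=%dMB",
                label, usedBytes() / MB, rt.totalMemory() / MB, rt.maxMemory() / MB);
    }

    public static void main(String[] args) {
        System.out.println(describe("before gc"));
        System.gc(); // only a request; the JVM is free to ignore it
        System.out.println(describe("after gc"));
    }
}
```

    Logging one such line after each transformation would show which one leaves the heap fullest. Note that `System.gc()` is only a hint, so a drop in the "after gc" figure is likely but not guaranteed.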

    Any other ideas for investigating this thorny issue?

    Thanks,
    Dan

  2. #2
    Join Date
    Sep 2011
    Posts
    190

    Is there any reason why you don't give it some more heap space?

  3. #3
    Join Date
    Apr 2007
    Posts
    2,009

    It already has 2 GB, which should be more than enough for these particular transforms; they are not that large. I may try a little more, just in case we were on the bubble, so to speak.

  4. #4
    Join Date
    Apr 2007
    Posts
    2,009

    The plot thickens. I think this may be related to logging. I'm trying a run with it turned off now.
    Weirdly, when I ran a job with the "Minimal" logging level, it came out as Basic logging on Carte. But "Nothing" seems to be working OK.
    All very odd.

    I'd still like to know what the expected behaviour of memory usage is between transforms in a job, though. I'd surely expect memory to drop once the transform finishes and all its caches are dumped etc. Or does that only happen when the job finishes now?

  5. #5
    Join Date
    Apr 2007
    Posts
    2,009

    OK, we got to the bottom of the issue.
    It all boiled down to a wrapper we use (YAJSW or something like that), which runs Java programs as services on Windows. We use it for Carte.
    We upgraded the wrapper, and its memory settings are broken. I was able to use VisualVM to prove that Carte was in fact only getting 1G, not 2G, and that's why it bombed!
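
    The same check can be made from inside the JVM without VisualVM. This is a sketch under assumptions: the class name `EffectiveHeapCheck` is hypothetical, and it only reads what the JVM itself reports, not the wrapper's configuration file.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: shows the heap limit the running JVM actually received.
final class EffectiveHeapCheck {
    /** Maximum heap this JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap-size flags (-Xmx / -Xms) the JVM was really started with. */
    static List<String> heapFlags() {
        List<String> flags = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-Xmx") || arg.startsWith("-Xms")) {
                flags.add(arg);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println("max heap MB: " + maxHeapBytes() / (1024L * 1024L));
        System.out.println("heap flags:  " + heapFlags());
    }
}
```

    If the flags list is empty or shows a smaller `-Xmx` than the one configured, the launcher is not passing the setting through. The value from `maxMemory()` can be somewhat lower than `-Xmx` with some collectors, so compare orders of magnitude rather than exact bytes.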

    The other side of this was that this particular transform was right on the 2G limit anyway. I wrote a test transform to do something similar, and managed to crash 3.2.5 as well. So obviously, with it being on the limit anyway, it was always likely to be worse with 4.2!

    So all OK for now.

  6. #6
    Join Date
    Apr 2007
    Posts
    2,009

    There is a further point of interest, though.
    In PDI 3.2 there were various calls to .gc() in the code, i.e. various steps or operations would request a garbage collection.
    All of those have been removed in PDI 4, so the behaviour is very different. In 3 the server would sit at 20 MB usage most of the time. Now in 4 it sits at 500 MB, but will go down if you force a GC.
    It makes it tricky to work out if there's a problem or not (as always with GC issues!)
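
    One way to tell uncollected garbage from a real leak without forcing a GC is to compare current heap usage with the usage recorded after the most recent collection. A sketch using the standard `java.lang.management` API follows; the class name `LiveHeapEstimate` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper: separates "heap in use now" from "heap that survived the last GC".
final class LiveHeapEstimate {
    /** Heap bytes in use right now, uncollected garbage included. */
    static long currentUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /**
     * Sum over heap pools of the usage recorded after each pool's most recent
     * collection. Pools that do not support this report null and are skipped.
     */
    static long usedAfterLastGcBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("used now (bytes):      " + currentUsedBytes());
        System.out.println("used after GC (bytes): " + usedAfterLastGcBytes());
    }
}
```

    A large "used now" figure with a small "used after GC" figure suggests the heap is mostly holding garbage that simply has not been collected yet. An after-GC figure that keeps climbing from run to run is the more worrying sign.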

  7. #7
    Join Date
    Nov 2008
    Posts
    271

    This is a very interesting topic, codek.
    I am no Java expert, but I would say that removing .gc() from the code means that you trust the JVM more than the developer at spotting and removing unused objects in the heap. However, JVM GC tuning should still be possible.
    Andrea Torre
    twitter: @andtorg

    join the community on ##pentaho - a freenode irc channel
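
    On the GC tuning point in this post: before changing any settings it helps to see which collectors the JVM is running and how much work they do. A sketch with the standard `GarbageCollectorMXBean` API follows; the class name `GcActivity` is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one summary line per garbage collector in this JVM.
final class GcActivity {
    static List<String> summaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "undefined".
            lines.add(String.format("%s: collections=%d totalTimeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : summaries()) {
            System.out.println(line);
        }
    }
}
```

    Sampling these counters before and after a job gives a rough idea of how hard the collectors worked during it.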

  8. #8
    Join Date
    Apr 2007
    Posts
    2,009

    No, it doesn't work like that at all. It's never the dev's responsibility to control GC.
    As the comments in the code implied, the only reason they were calling gc was to work around bugs in old VMs.

Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.