Hitachi Vantara Pentaho Community Forums

Thread: [Mondrian] assumed memory leak causes oome in Mondrian 3.1.6

  1. #1
    Pl Guest

    Dear all,

    Recently we reproducibly got an OutOfMemoryError (OOME) after executing many different queries one after another.
    A heap dump, taken either on the OOME or after a certain number of queries, showed a suspicious schema object that
    retained several hundred megabytes and grew with each query. The GC does not seem to be able to remove these objects,
    although I thought they were only softly referenced in Mondrian's cache.

    Further investigation showed that the whole object tree is strongly referenced from a threadlocal variable. Since we
    run Mondrian within an application server that uses a thread pool, there were many of these dangling threadlocals, and
    the GC was never able to clean the cache.

    To test our hypothesis, we removed the threadlocal after each request/query and ran our test again. Now we
    could clearly see that the GC freed the heap, the OOME disappeared, and a heap dump no longer showed any strongly
    referenced objects from Mondrian.

    The problematic threadlocal variable is mondrian.rolap.RolapStar.localAggregations. The AggregationKey object holds a
    reference path to the RolapStar object which references the threadlocal. This reference chain prevents the GC from
    removing the threadlocal (threadlocal referenced by threadlocal's value).
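The effect described above can be reproduced in isolation. The following is a minimal sketch, not Mondrian code: `LOCAL_CACHE` is a hypothetical stand-in for a per-thread cache such as `RolapStar.localAggregations`, and a single-thread pool plays the role of the application server. Without cleanup, each request sees what earlier requests left on the pooled thread; with `ThreadLocal.remove()` in a `finally` block, each request starts empty and the old value becomes collectable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalCleanupDemo {
    // Hypothetical stand-in for a per-thread cache; not Mondrian's actual field.
    private static final ThreadLocal<List<byte[]>> LOCAL_CACHE =
        ThreadLocal.withInitial(ArrayList::new);

    /** Simulates one request: caches 1 KiB and returns how many entries this thread holds. */
    static int handleRequest(boolean cleanUp) {
        try {
            LOCAL_CACHE.get().add(new byte[1024]);
            return LOCAL_CACHE.get().size();
        } finally {
            if (cleanUp) {
                // Drop the strong reference that the pooled thread would otherwise keep.
                LOCAL_CACHE.remove();
            }
        }
    }

    /** Runs the requests on one pooled thread; returns the entry count seen by the last one. */
    static int entriesSeenByLastRequest(int requests, boolean cleanUp) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            int last = 0;
            for (int i = 0; i < requests; i++) {
                Callable<Integer> task = () -> handleRequest(cleanUp);
                last = pool.submit(task).get();
            }
            return last;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + entriesSeenByLastRequest(3, false));
        System.out.println("with cleanup:    " + entriesSeenByLastRequest(3, true));
    }
}
```

In this sketch the third request sees three entries without cleanup and one entry with it. Whether removing the real thread-local is safe for Mondrian's results is exactly the open question of this post; the sketch only illustrates the retention mechanism.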

    Is this threadlocal variable intended to be removed after each request?
    Does removing the threadlocal variable after each request change Mondrian's behavior in any other way?
    Are the query results guaranteed to be the same as before (our tests suggest so)?

    Kind regards,



  2. #2
    Julian Hyde Guest

    We put cached data into a thread-local to ensure that we do not thrash while
    a query is being processed. If a query has a working set larger than main
    memory, thrashing would be a pattern where a query loads the first 20% into
    memory, loads the second 20% which pushes out the first 20%, loads the third
    20% which pushes out the second 20%, then references the first 20%, which
    causes the first 20% to be reloaded (an expensive operation) and pushes out
    the third 20%. I call it thrashing because it is very similar to virtual
    memory thrashing. If thrashing occurs, the query might be hundreds or
    thousands of times slower than if adequate memory were available. We'd
    rather that it fail fast with an OOME than thrash, and the thread-local
    achieves this.
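To make the thrashing pattern concrete, here is a minimal sketch that counts expensive loads when a working set is scanned repeatedly through a cache that is slightly too small. The LRU eviction policy, the class name `ThrashingDemo`, and the block counts are assumptions chosen for illustration; they are not taken from Mondrian's cache implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ThrashingDemo {
    /** Counts loads when `blocks` blocks are scanned `passes` times through an LRU cache. */
    static int countLoads(int capacity, int blocks, int passes) {
        // Access-ordered LinkedHashMap that evicts the least recently used entry.
        Map<Integer, String> cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > capacity;
            }
        };
        int loads = 0;
        for (int pass = 0; pass < passes; pass++) {
            for (int block = 0; block < blocks; block++) {
                if (cache.get(block) == null) {
                    loads++; // cache miss: the block must be (re)loaded
                    cache.put(block, "segment-" + block);
                }
            }
        }
        return loads;
    }

    public static void main(String[] args) {
        // Working set of 5 blocks, i.e. 20% each, scanned three times.
        System.out.println("capacity 4: " + countLoads(4, 5, 3) + " loads");
        System.out.println("capacity 5: " + countLoads(5, 5, 3) + " loads");
    }
}
```

With room for only four of the five blocks, every one of the 15 accesses is a reload; with room for all five, only the first 5 accesses load anything. Pinning the data for the duration of a query trades this slow degradation for an immediate failure when memory runs out.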

    That said, you are correct that data should be removed from the thread-local
    after the query has stopped processing. Either
    RolapStar.pushAggregateModificationsToGlobalCache or
    RolapStar.clearCacheAggregations must be called at the end of query
    execution, whether the query succeeds or fails. If that is not happening, it
    is a bug.
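The "whether the query succeeds or fails" requirement maps naturally onto a `try`/`finally` block. The sketch below is not Mondrian's code: `StarCache` is a hypothetical interface that only borrows the two method names mentioned above, the no-argument signatures are assumed, and the choice to push on success and clear on failure is likewise an assumption made for illustration.

```java
import java.util.function.Supplier;

class QueryCleanupSketch {
    /** Hypothetical stand-in for the two RolapStar calls; the real signatures may differ. */
    interface StarCache {
        void pushAggregateModificationsToGlobalCache();
        void clearCacheAggregations();
    }

    /** Runs the query and always releases thread-local data afterwards. */
    static <T> T executeWithCleanup(StarCache star, Supplier<T> query) {
        boolean succeeded = false;
        try {
            T result = query.get();
            succeeded = true;
            return result;
        } finally {
            // Assumed policy: publish results on success, discard them on failure.
            if (succeeded) {
                star.pushAggregateModificationsToGlobalCache();
            } else {
                star.clearCacheAggregations();
            }
        }
    }

    public static void main(String[] args) {
        StarCache logging = new StarCache() {
            @Override public void pushAggregateModificationsToGlobalCache() {
                System.out.println("push");
            }
            @Override public void clearCacheAggregations() {
                System.out.println("clear");
            }
        };
        System.out.println(executeWithCleanup(logging, () -> "42 rows"));
        try {
            executeWithCleanup(logging, () -> { throw new IllegalStateException("query failed"); });
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Because the cleanup sits in `finally`, exactly one of the two calls runs per query, including when the query throws.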




