Hitachi Vantara Pentaho Community Forums

Thread: how to stop RandomForest object from saving bagging information

  1. #1
    Join Date
    Aug 2016

    Hi everyone. I'm working with an old Java program which uses weka.jar (version 3.7.1). I'm training a random forest classifier on a dataset with approximately 2500 rows. I need to save the classifier to an external file so that I can load it later, but with 750 trees the file is around 500 MB. At that size, I am running into Java memory problems.

    Looking at the object, I can see that there is a field called m_inBag which holds an array with one element per tree, and each element is itself an array the size of the dataset holding 'true' or 'false' (whether or not that piece of training data is in the bag for that tree). I don't think I need this information when I classify new data, so I want to clear this field to make my classifier smaller. Unfortunately I can't access the field directly. The RandomForest.listOptions method gives me the following: -I (num trees), -K (num features), -S (rand num seed), -depth (max depth of trees), and -D (debug mode).

    Could anyone tell me a way to stop the classifier from storing the bagging information? Or how to clear the field before I save the classifier? Thanks so much for the help.
    Last edited by m2oswald; 08-15-2016 at 06:04 PM.

  2. #2
    Join Date
    Aug 2006


    Ah yes, bummer :-) In later versions of Weka the buildClassifier() method in Bagging has

    m_inBag = null;

    at the end of the method, thus saving memory (and size on disk).
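To make the idea concrete, here is a minimal sketch of the pattern, not Weka's actual code. ToyBagger, m_model, the clearInBag flag and serializedSize are all hypothetical names made up for illustration; only m_inBag and buildClassifier come from the thread. It fills a per-tree in-bag table during training, optionally drops it at the end of buildClassifier(), and compares the serialized size of the two objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Random;

public class InBagSizeDemo {

    /** Hypothetical stand-in for a bagged ensemble; this is NOT Weka's Bagging class. */
    static class ToyBagger implements Serializable {
        private static final long serialVersionUID = 1L;

        // One row per tree, one flag per training instance.
        boolean[][] m_inBag;
        // Placeholder for whatever the trained model really needs to keep.
        double[] m_model;

        void buildClassifier(int numTrees, int numInstances, boolean clearInBag) {
            Random rng = new Random(1);
            m_inBag = new boolean[numTrees][numInstances];
            m_model = new double[numTrees];
            for (int t = 0; t < numTrees; t++) {
                // Sample with replacement and record which instances were drawn.
                for (int i = 0; i < numInstances; i++) {
                    m_inBag[t][rng.nextInt(numInstances)] = true;
                }
                m_model[t] = rng.nextDouble();
            }
            if (clearInBag) {
                // Only used while training, so drop it before the object is saved.
                m_inBag = null;
            }
        }
    }

    /** Serializes the object into memory and returns the number of bytes written. */
    static int serializedSize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        ToyBagger kept = new ToyBagger();
        kept.buildClassifier(750, 2500, false);
        ToyBagger cleared = new ToyBagger();
        cleared.buildClassifier(750, 2500, true);
        System.out.println("with m_inBag:    " + serializedSize(kept) + " bytes");
        System.out.println("without m_inBag: " + serializedSize(cleared) + " bytes");
    }
}
```

The numbers printed by this toy say nothing about a real Weka model file; measuring the actual classifier the same way, before and after the change, is the only reliable check of how much space the field accounts for.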

    Can you upgrade to a newer version of Weka? Or otherwise, can you add the line above to buildClassifier() and recompile?
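If neither upgrading nor recompiling is an option, one possible workaround is to null the field through reflection just before saving. The sketch below is an assumption-laden illustration, not something from the Weka API: ToyBagger and clearField are hypothetical names, and the stand-in class only mimics a library object with an inaccessible m_inBag field. In the real 3.7.1 classes the field may live on an inner Bagging object held by the forest rather than on RandomForest itself, so check the source for where it is declared before trying this.

```java
import java.lang.reflect.Field;

public class ClearFieldDemo {

    /** Hypothetical stand-in for a library class with an inaccessible field. */
    static class ToyBagger {
        private boolean[][] m_inBag = new boolean[3][5];

        boolean hasInBag() {
            return m_inBag != null;
        }
    }

    /**
     * Sets the named object-typed field to null, looking first in the
     * target's own class and then in each superclass in turn.
     */
    static void clearField(Object target, String fieldName)
            throws ReflectiveOperationException {
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(fieldName);
                f.setAccessible(true); // bypass private/protected access
                f.set(target, null);
                return;
            } catch (NoSuchFieldException e) {
                // Not declared on this class; keep walking up the hierarchy.
            }
        }
        throw new NoSuchFieldException(fieldName);
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        ToyBagger bagger = new ToyBagger();
        System.out.println("before: " + bagger.hasInBag());
        clearField(bagger, "m_inBag");
        System.out.println("after:  " + bagger.hasInBag());
    }
}
```

This only makes sense if nothing reads the field after training, which is what the later Weka versions rely on when they null it themselves; any code path that still expects the array would fail with a NullPointerException.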


  3. #3
    Join Date
    Aug 2016


    Thanks so much Mark, I appreciate your help. I'll try first to upgrade to a newer version.

