Hitachi Vantara Pentaho Community Forums

Thread: Weka Wrapper result interpretation

  1. #1
    Join Date
    Jan 2016

    Weka Wrapper result interpretation

    I am new to Weka. I ran a random forest with the wrapper method for attribute selection, using 5-fold cross-validation, but I am confused about how to interpret the result. The output looks like this:

    number of folds (%) attribute
    4(80%) 1 length
    2(60%) 2 word_file
    5(100%) 3 contains_pdf
    1(20%) 4 name
    3(60%) 5 contains_url

    and so on.
    What do these numbers mean, and what is the best way to select features?

    Any help is much appreciated. Thanks in advance.

  2. #2
    Join Date
    Aug 2006


    Cross-validation mode in the Select attributes panel is meant to give you an idea of how stable an attribute selection method is under small changes in the distribution of the input data. The numbers tell you, for each attribute, how many folds of the cross-validation it was selected in (this is also expressed as a percentage of the total number of folds). In your output, for example, 5(100%) for contains_pdf means it was selected in all five folds, while 1(20%) for name means it was selected in only one. Attributes that are chosen in all (or most) of the folds are likely to be more important than those that aren't.
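
    To make the "number of folds (%)" column concrete, here is a minimal sketch in plain Java (no Weka dependency) that tallies how often each attribute is picked across folds and keeps those above a threshold. The class name FoldStability, both method names, the per-fold selections, and the 80% cut-off are all hypothetical and chosen only for illustration; Weka computes this table for you.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FoldStability {
    /** Counts in how many folds each attribute was selected (each fold lists an attribute at most once). */
    static Map<String, Integer> countSelections(List<List<String>> selectedPerFold) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> fold : selectedPerFold) {
            for (String attribute : fold) {
                counts.merge(attribute, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns the attributes selected in at least minPercent percent of the folds. */
    static List<String> stableAttributes(Map<String, Integer> counts, int numFolds, int minPercent) {
        List<String> keep = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Integer arithmetic avoids floating-point rounding at the threshold.
            if (entry.getValue() * 100 >= minPercent * numFolds) {
                keep.add(entry.getKey());
            }
        }
        return keep;
    }

    public static void main(String[] args) {
        // Hypothetical selections from a 5-fold run (assumed data, not real output).
        List<List<String>> folds = List.of(
            List.of("length", "contains_pdf", "contains_url"),
            List.of("length", "contains_pdf", "name"),
            List.of("length", "contains_pdf", "contains_url"),
            List.of("length", "contains_pdf"),
            List.of("contains_pdf", "contains_url"));

        Map<String, Integer> counts = countSelections(folds);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            int percent = entry.getValue() * 100 / folds.size();
            System.out.printf("%d(%d%%) %s%n", entry.getValue(), percent, entry.getKey());
        }
        System.out.println("Selected in >= 80% of folds: "
            + stableAttributes(counts, folds.size(), 80));
    }
}
```

    A threshold like this is only a rough stability filter; it does not replace evaluating the classifier with selection done inside each training fold.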

    If you are going to use the selected features with a classifier, then the best (correct) approach is to wrap an attribute selection technique together with the base classifier in an AttributeSelectedClassifier. This ensures that attribute selection is only ever trained on the training data. In other words, no test data is used in the selection process, as it would be if, for example, you performed attribute selection globally before running an evaluation.
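
    As a rough sketch of that setup through the Weka Java API, the code below wraps a WrapperSubsetEval (with a random forest inside) and a random forest base classifier in an AttributeSelectedClassifier, then cross-validates the whole thing. It assumes weka.jar is on the classpath, that the path to an ARFF file is passed as the first argument, and that the class attribute is the last one. The class name WrapperInsideCv, the BestFirst search, and the random seed are assumptions for illustration, not settings taken from the original question.

```java
import java.util.Random;

import weka.attributeSelection.BestFirst;
import weka.attributeSelection.WrapperSubsetEval;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.AttributeSelectedClassifier;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

class WrapperInsideCv {
    /** Builds a classifier that performs wrapper selection on whatever data it is trained on. */
    static AttributeSelectedClassifier build() {
        // Wrapper evaluator: scores attribute subsets by running a random forest on them.
        WrapperSubsetEval evaluator = new WrapperSubsetEval();
        evaluator.setClassifier(new RandomForest());
        evaluator.setFolds(5); // internal folds used to estimate each subset's accuracy

        AttributeSelectedClassifier classifier = new AttributeSelectedClassifier();
        classifier.setEvaluator(evaluator);
        classifier.setSearch(new BestFirst());       // assumed search strategy
        classifier.setClassifier(new RandomForest()); // base classifier trained on the reduced data
        return classifier;
    }

    public static void main(String[] args) throws Exception {
        Instances data = DataSource.read(args[0]);    // path to an ARFF file (assumed)
        data.setClassIndex(data.numAttributes() - 1); // assumes the class is the last attribute

        // Outer 5-fold cross-validation: selection is redone on each training split,
        // so the held-out fold never influences which attributes are chosen.
        Evaluation evaluation = new Evaluation(data);
        evaluation.crossValidateModel(build(), data, 5, new Random(1));
        System.out.println(evaluation.toSummaryString());
    }
}
```

    Because the wrapper trains a random forest for every candidate subset inside every outer fold, this can be slow on larger datasets.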


  3. #3
    Join Date
    Jan 2016


    Thank you, Mark. That makes it much clearer. I appreciate your response.


