Hitachi Vantara Pentaho Community Forums

Thread: Trying to use fewer instances than the training data offers

  1. #1
    Join Date
    Nov 2016


    Hello everyone,

    It's quite a special question, but maybe there is a nice little hack I can't find to fix my problem.

    I am trying to evaluate the influence of the number of training instances on the classification.

    Here is a short example:
    I've got four classes with 30 instances each in my training data.
    The test data includes 120 instances, equally distributed across the classes.
    Now I want to reduce the number of training instances to evaluate how many instances I will need in the end to get a good trade-off between training cost and result.
    So in my first trial I would use 30 instances of every class and get e.g. a 95% detection rate.
    In my second, 29 (randomly selected) with e.g. a 94% detection rate.
    In my third, 28 (randomly selected) with e.g. a 90% detection rate.
    And so on...

    Is there a parameter or option that I can use in Weka?

    Thanks for your help!

  2. #2
    Join Date
    Aug 2006


    There isn't anything specific in Weka to do this, I'm afraid. You could probably write some scripting code in Weka (using Groovy or Jython) that uses Weka's classes to produce a stratified learning curve, which would give you something similar.
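
    As a rough illustration of the idea, here is a minimal sketch of the sampling loop in plain Python, with no Weka calls at all. The names `stratified_subsample`, `learning_curve` and `evaluate` are made up for this example; `evaluate` is a placeholder where you would train your classifier on the selected training instances and return the detection rate on the fixed test set. The sketch draws a fresh random subset at every step, which is one possible reading of "randomly selected".

```python
import random
from collections import defaultdict


def stratified_subsample(labels, n_per_class, rng):
    """Pick n_per_class random instance indices from every class.

    labels: one class label per training instance.
    rng: a random.Random instance, so runs are reproducible.
    Returns the chosen indices in ascending order.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label, indices in by_class.items():
        if n_per_class > len(indices):
            raise ValueError(
                "class %r has only %d instances" % (label, len(indices))
            )
        # Sample without replacement inside each class.
        chosen.extend(rng.sample(indices, n_per_class))
    return sorted(chosen)


def learning_curve(labels, evaluate, max_per_class, min_per_class=1, seed=42):
    """Shrink the per-class training size step by step.

    evaluate: hypothetical callback that takes the selected training
    indices and returns a score (e.g. detection rate on the test set).
    Returns a list of (instances_per_class, score) pairs.
    """
    rng = random.Random(seed)
    results = []
    for n in range(max_per_class, min_per_class - 1, -1):
        subset = stratified_subsample(labels, n, rng)
        results.append((n, evaluate(subset)))
    return results


if __name__ == "__main__":
    # Four classes with 30 training instances each, as in the question.
    labels = [c for c in "ABCD" for _ in range(30)]

    # Placeholder: just reports the subset size instead of a real score.
    def evaluate(train_indices):
        return len(train_indices)

    for n, score in learning_curve(labels, evaluate, 30, 25):
        print(n, score)
```

    Since a single random draw per size can be noisy, it may be worth repeating each size with several seeds and averaging the scores before comparing them.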

