Hitachi Vantara Pentaho Community Forums

Thread: MultiThreaded Cross Validation

  1. #1
    Join Date
    Mar 2016
    Posts
    7

    MultiThreaded Cross Validation

    I'm trying to run Weka's 10-fold cross-validation with a multi-threaded approach. I created 10 threads in my Main class and implemented the Runnable interface in another class. Inside its run() method I pasted the body of Weka's crossValidateModel() method, removed the for loop (I'm assuming each thread works on one fold), and passed the thread ID as the fold number to trainCV() and testCV() instead of the loop index i.
    My problem is that I discovered evaluateModel() keeps a cumulative sum over the predictions of each fold. Since I'm new to Java threading, I don't know how to do something like an OpenMP reduction on the double array of predictions that evaluateModel() returns. Or maybe there is some other workaround.
    Last edited by yasmen; 03-27-2016 at 06:04 PM.

  2. #2
    Join Date
    Aug 2006
    Posts
    1,741

    Have a look at weka.classifiers.evaluation.AggregateableEvaluation in Weka 3.7. Create your own cross-validation folds, use a separate Evaluation object in each thread, and then use AggregateableEvaluation to combine them into one final eval object.

    Cheers,
    Mark.

  3. #3
    Join Date
    Mar 2016
    Posts
    7

    Here's how I was trying to make each thread call evaluateModel():
    Instances test = inst.testCV(nf, t_ID);
    try {
        eval[t_ID].evaluateModel(copiedClassifier, test);
    } catch (Exception e) {
        e.printStackTrace();
    }
    But since thread execution is interleaved, t_ID sometimes starts from 1, which throws an IndexOutOfBoundsException. How can I handle that?
    Last edited by yasmen; 03-30-2016 at 06:47 PM.

  4. #4
    Join Date
    Aug 2006
    Posts
    1,741

    Doesn't your array get constructed and initialised with as many Evaluation objects as there are folds before your threads are launched? If so, I can't see how you could get an array index out of bounds.

    Cheers,
    Mark.
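
The point about allocating the array before the threads start can be illustrated with a minimal plain-Java sketch. It makes no Weka calls: the class name FoldSlots, the method fillSlots and the int placeholder results are all hypothetical stand-ins for the per-fold Evaluation objects discussed in the thread. Each thread receives its fold number when it is created and writes only to its own slot, so the order in which threads happen to run should not matter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; the int in each slot stands in for a per-fold result object.
class FoldSlots {
    static int[] fillSlots(int numFolds) {
        int[] slots = new int[numFolds];          // allocated before any thread starts
        List<Thread> threads = new ArrayList<>();
        for (int fold = 0; fold < numFolds; fold++) {
            final int f = fold;                   // fold number fixed when the thread is created
            Thread t = new Thread(() -> { slots[f] = f * f; });  // writes only its own slot
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();                         // wait for the thread; its write is visible afterwards
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fillSlots(10)));
    }
}
```

Because the array has one slot per fold before any thread runs, a thread for fold 1 finishing before the thread for fold 0 cannot cause an out-of-bounds access.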

  5. #5
    Join Date
    Mar 2016
    Posts
    7

    No, my array of objects is initialized inside the run() method. I did that so I could pass each thread's test set to evaluateModel(), which is in Weka's Evaluation class. My main problem is where to call evaluateModel(): should I rewrite it inside my Runnable class, or call it from my Runnable so that Weka's Evaluation class does the work? Please check this:
    Eval_thread myThreads[] = new Eval_thread[2];
    for (int i = 0; i < 2; i++) {
        myThreads[i] = new Eval_thread(nb, FinalData, 2, new Random(1));
        myThreads[i].start(i);
    }
    This is how I create the threads in my main class (I'm testing with just 2 threads). Eval_thread is a Runnable class where I create the folds, but I stopped doing the evaluation there once I figured out it's cumulative.
    Instances test = inst.testCV(nf, t_ID);
    eval[t_ID].evaluateModel(copiedClassifier, test);
    This is my run() method. I was trying to call Weka's evaluation method, passing it the test set for each thread, but since thread execution is not sequential, the array access starts with thread 1, not 0.
    I really appreciate your help.

  6. #6
    Join Date
    Aug 2006
    Posts
    1,741

    This sounds needlessly complicated to me. What is wrong with:

    1. In the main program, create an array of size 10 (one slot per fold) to hold the Evaluation object produced by each thread.
    2. Create a Runnable class that accepts the original dataset and a fold number. Its task is to create the appropriate train/test fold, build the classifier, evaluate it on the test fold, and then store the resulting Evaluation object in the appropriate location of the global array from step 1.
    3. Launch 10 threads to execute the 10 Runnables.
    4. When they have all finished, create an AggregateableEvaluation to aggregate all the individual Evaluation objects.

    Cheers,
    Mark.
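
The four steps in the post above can be sketched in plain Java, under some loud assumptions: nothing here calls Weka, FoldResult is a hypothetical stand-in for an Evaluation object, the "classifier" is a toy majority-label rule, and folds are formed by taking every numFolds-th row rather than by Weka's trainCV()/testCV(). Only the threading and bookkeeping pattern is meant to carry over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelCvSketch {
    // Hypothetical stand-in for a per-fold Evaluation object: two counters.
    static final class FoldResult { int correct; int total; }

    // Toy "training": majority label (0 or 1) over rows NOT in the given fold.
    static int majorityOfTrainingRows(int[] labels, int fold, int numFolds) {
        int ones = 0, n = 0;
        for (int i = 0; i < labels.length; i++) {
            if (i % numFolds != fold) { ones += labels[i]; n++; }
        }
        return 2 * ones >= n ? 1 : 0;
    }

    static FoldResult crossValidate(int[] labels, int numFolds) {
        FoldResult[] results = new FoldResult[numFolds];            // step 1: one slot per fold
        ExecutorService pool = Executors.newFixedThreadPool(numFolds);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int fold = 0; fold < numFolds; fold++) {
                final int f = fold;
                pending.add(pool.submit(() -> {                      // steps 2 and 3
                    int predicted = majorityOfTrainingRows(labels, f, numFolds);
                    FoldResult r = new FoldResult();
                    for (int i = f; i < labels.length; i += numFolds) {  // test rows of fold f
                        r.total++;
                        if (labels[i] == predicted) r.correct++;
                    }
                    results[f] = r;                                  // store in this fold's slot
                }));
            }
            for (Future<?> p : pending) p.get();                     // wait for every fold
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        FoldResult all = new FoldResult();                           // step 4: combine
        for (FoldResult r : results) { all.correct += r.correct; all.total += r.total; }
        return all;
    }

    public static void main(String[] args) {
        FoldResult r = crossValidate(new int[] {1, 1, 1, 0}, 2);
        System.out.println(r.correct + " of " + r.total + " correct");
    }
}
```

In a real Weka program each task would presumably build a copy of the classifier on its training fold and fill a separate Evaluation object, and the final loop would hand each one to AggregateableEvaluation as described in the thread.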

  7. #7
    Join Date
    Mar 2016
    Posts
    7

    The problem is in step 2. When I evaluate on the test fold, I'm actually calling Weka's evaluateModel(), which returns a double[] array. How can I store that as an object in the array from step 1? Also, when I do the following, I get an error saying: "The first argument must be the class name of a classifier"
    MyEvaluation eval = new MyEvaluation(inst);
    eval.evaluateModel(copiedClassifier, test);
    I know I'm doing something very wrong here, but I can't figure out the right way.

  8. #8
    Join Date
    Aug 2006
    Posts
    1,741

    Evaluation eval = new Evaluation(trainingData);
    myClassifier.buildClassifier(trainingData);
    for (int i = 0; i < testData.numInstances(); i++) {
        eval.evaluateModelOnce(myClassifier, testData.instance(i));
    }

    All statistics computed by Evaluation are additive and incremental. The main exceptions are the areas under the various threshold curves, for which predictions have to be retained in memory. If you need these metrics, call evaluateModelOnceAndRecordPrediction() instead of evaluateModelOnce().

    Cheers,
    Mark.
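
To illustrate what "additive and incremental" means in the post above, here is a small plain-Java sketch. RunningStats and its evaluateOnce() method are hypothetical stand-ins, not Weka classes; the shape is only modelled on the description in the thread, where each call returns the prediction for one instance while the object quietly updates its own counters.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical stand-in for an evaluation object; not part of Weka.
class RunningStats {
    private int correct;
    private int incorrect;

    // Returns the prediction for one instance and updates the counters as a side effect.
    double evaluateOnce(IntUnaryOperator model, int feature, int actual) {
        int predicted = model.applyAsInt(feature);
        if (predicted == actual) { correct++; } else { incorrect++; }
        return predicted;
    }

    // Derived statistic, computed on demand from the running counters.
    double pctCorrect() {
        int n = correct + incorrect;
        return n == 0 ? Double.NaN : 100.0 * correct / n;
    }

    public static void main(String[] args) {
        IntUnaryOperator threshold = x -> x > 5 ? 1 : 0;   // toy model
        int[] features = {1, 7, 9, 3};
        int[] actual = {0, 1, 0, 0};
        RunningStats stats = new RunningStats();
        for (int i = 0; i < features.length; i++) {
            stats.evaluateOnce(threshold, features[i], actual[i]);  // return value ignored
        }
        System.out.println(stats.pctCorrect());            // prints 75.0
    }
}
```

The caller never needs the returned value to obtain the summary: the counters inside the object already hold everything the derived measure needs.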

  9. #9
    Join Date
    Mar 2016
    Posts
    7

    This is fine, but evaluateModelOnce() returns a double. When my threads finish executing and exit run(), I'm left with only a single double value returned from all 10 threads. What does this single value mean?

  10. #10
    Join Date
    Aug 2006
    Posts
    1,741

    It is the predicted value, for the test instance passed in. But you don't need this! The point is that every time you call evaluateModelOnce() the Evaluation object updates its internal statistics - i.e. it maintains a running count of correct/incorrect instances. All other statistics are derived from this. So after all your threads finish you can just aggregate all the individual Evaluation objects together and call the methods on the aggregated Evaluation to get the final measures you need (e.g. overall percent correct etc.).

    Cheers,
    Mark.

  11. #11
    Join Date
    Aug 2006
    Posts
    1,741

    You just need to iterate over your array of fold-based Evaluation objects, passing each to the aggregate() method in the AggregateableEvaluation object.

    Cheers,
    Mark.
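
The aggregation loop described in the post above might look like the following plain-Java sketch. TallyAggregator and FoldTally are hypothetical stand-ins for AggregateableEvaluation and the per-fold Evaluation objects; no Weka API is used. The sketch also shows why derived measures are read from the aggregated object rather than averaged across folds: when folds differ in size, the two give different answers.

```java
// Hypothetical stand-ins; only the aggregate-then-derive pattern is the point.
class TallyAggregator {
    static final class FoldTally {
        final int correct;
        final int total;
        FoldTally(int correct, int total) { this.correct = correct; this.total = total; }
    }

    private int correct;
    private int total;

    // Fold the counters of one per-fold tally into the running totals.
    void aggregate(FoldTally fold) {
        correct += fold.correct;
        total += fold.total;
    }

    // Derived from the pooled counters, not from per-fold percentages.
    double pctCorrect() {
        return total == 0 ? Double.NaN : 100.0 * correct / total;
    }

    static double pooledPctCorrect(FoldTally[] folds) {
        TallyAggregator agg = new TallyAggregator();
        for (FoldTally f : folds) {
            agg.aggregate(f);          // iterate over the array, passing each fold in
        }
        return agg.pctCorrect();
    }

    public static void main(String[] args) {
        FoldTally[] folds = { new FoldTally(9, 10), new FoldTally(1, 5) };
        System.out.println(pooledPctCorrect(folds));
    }
}
```

With the two toy folds above, the pooled figure is 10 correct out of 15 (about 66.7 percent), whereas averaging the per-fold percentages of 90 and 20 would give 55.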

  12. #12
    Join Date
    Mar 2016
    Posts
    7

    Really, I can't thank you enough. My code is working great. God bless you, Prof.
    Last edited by yasmen; 04-03-2016 at 08:04 AM.
