Hitachi Vantara Pentaho Community Forums

Thread: IBK Optimal K

  1. #1

    IBK Optimal K

    Hi Mark,

    I am trying to verify that the K value found by IBK (with crossValidate = True and KNN set to the maximum, i.e. the number of instances) is correct.

    I run IBK with crossValidate enabled, using "Use training set" under Classify. Say the KNN optimization gives me a correlation of 0.5 and an optimal K of 10 nearest neighbors.

    I then try to verify this by switching crossValidate to False and setting KNN to 10. When I run a LOOCV under "Cross-validation", I get a wildly different correlation from the one reported when optimizing on "Use training set".

    I ran a similar experiment using CVParameterSelection with IBK. There are 100 instances, and I set the parameter range to K 1 100 100. I run CVParameterSelection on "Use training set" with the number of folds set to the number of instances, so it is a LOOCV.

    But when I take the K value found by CVParameterSelection and run plain IBK with that value under "Cross-validation" with a LOOCV, the correlation is again very different from the one reported while CVParameterSelection was finding the optimum.

    What am I misunderstanding?
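
One possible reading of the mismatch, offered as an assumption since the thread does not state the resolution: a correlation reported under "Use training set" scores the final model on instances it already stores, so each instance can be its own nearest neighbor, whereas a LOOCV predicts each instance from the others. The sketch below is plain Python, not Weka; the synthetic data and all function names are made up for illustration. It only shows that the two numbers can differ for the same K.

```python
import math
import random


def knn_predict(train, query_x, k):
    """Mean target of the k training points nearest to query_x (1-D feature)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query_x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def resubstitution_corr(data, k):
    """Score on the training set: each point may be its own neighbor."""
    preds = [knn_predict(data, x, k) for x, _ in data]
    return pearson(preds, [y for _, y in data])


def loocv_corr(data, k):
    """Leave-one-out: each point is predicted from the other n - 1 points."""
    preds = [knn_predict(data[:i] + data[i + 1:], x, k)
             for i, (x, _) in enumerate(data)]
    return pearson(preds, [y for _, y in data])


# Synthetic data (an assumption for illustration): noisy linear target.
rng = random.Random(0)
data = [(x, 0.5 * x + rng.gauss(0, 10)) for x in range(100)]

resub = resubstitution_corr(data, k=10)
loo = loocv_corr(data, k=10)
print(f"training-set correlation: {resub:.3f}")
print(f"LOOCV correlation:        {loo:.3f}")
```

With K = 1 the effect is at its most extreme: every training point is predicted by itself, so the training-set correlation is perfect regardless of how well the model generalizes.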


  2. #2



    Never mind, I figured it out. Still, it would be nice to have a toggle on the IBK KNN optimization so that it also outputs the cross-validated value.
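
As a rough illustration of the requested behavior, here is a minimal Python sketch (not Weka code) of a K search that returns the cross-validated score together with the chosen K. The function names, the synthetic data, and the use of mean absolute error as the selection criterion are all assumptions made for this example.

```python
import random


def loocv_errors(data, max_k):
    """Mean absolute LOOCV error of k-NN regression for each k in 1..max_k."""
    n = len(data)
    if max_k > n - 1:
        raise ValueError("max_k cannot exceed n - 1 when one point is held out")
    totals = [0.0] * (max_k + 1)  # totals[k] accumulates |error| for that k
    for i, (x, y) in enumerate(data):
        others = data[:i] + data[i + 1:]
        # Sort the held-out point's neighbors once, nearest first.
        neighbors = sorted(others, key=lambda p: abs(p[0] - x))
        running_sum = 0.0
        for k in range(1, max_k + 1):
            running_sum += neighbors[k - 1][1]
            totals[k] += abs(running_sum / k - y)
    return {k: totals[k] / n for k in range(1, max_k + 1)}


def select_k(data, max_k):
    """Return (best_k, its LOOCV error, all errors) so the CV'd score is visible."""
    errors = loocv_errors(data, max_k)
    best_k = min(errors, key=errors.get)
    return best_k, errors[best_k], errors


# Synthetic data (an assumption for illustration): noisy linear target.
rng = random.Random(0)
data = [(x, 0.5 * x + rng.gauss(0, 10)) for x in range(100)]

best_k, best_err, errors = select_k(data, max_k=50)
print(f"selected K = {best_k}, LOOCV mean absolute error = {best_err:.3f}")
```

Returning the per-K errors alongside the winner makes it possible to compare the score used for selection directly against a later standalone LOOCV run with the same K.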


Copyright © 2005 - 2019 Hitachi Vantara Corporation. All Rights Reserved.