-
Serial feature selection
Dear Weka Users/Developers,
I'm trying to do serial feature selection by nesting AttributeSelectedClassifier: the first selector is a Ranker and the second is a wrapper (WrapperSubsetEval searched with LinearForwardSelection).
I am assuming that numUsedAttributes (wrapper) corresponds to k. Therefore, in the fixed-set configuration, I expect numToSelect (Ranker) to play no role as long as it is greater than or equal to numUsedAttributes. However, I get very different results when I change numToSelect (the two configurations below differ only in the Ranker's -N value, -1 versus 5). It seems I am seriously confused about either LinearForwardSelection or AttributeSelectedClassifier. Hope it is not both :-).
Thank you very much in advance,
Huseyin
***
weka.classifiers.meta.AttributeSelectedClassifier -E "weka.attributeSelection.InfoGainAttributeEval " -S "weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N -1" -W weka.classifiers.meta.AttributeSelectedClassifier -- -E "weka.attributeSelection.WrapperSubsetEval -B weka.classifiers.functions.LibSVM -F 5 -T 0.01 -R 1 -E DEFAULT -- -S 0 -K 2 -D 1 -G 0.01 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -Z -B -model C:\\ProgramData\\Weka-3-8 -seed 1" -S "weka.attributeSelection.LinearForwardSelection -D 0 -N 5 -I -K 5 -T 0" -W weka.classifiers.functions.LibSVM -- -S 0 -K 2 -D 1 -G 0.01 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -Z -B -model C:\ProgramData\Weka-3-8 -seed 1
weka.classifiers.meta.AttributeSelectedClassifier -E "weka.attributeSelection.InfoGainAttributeEval " -S "weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N 5" -W weka.classifiers.meta.AttributeSelectedClassifier -- -E "weka.attributeSelection.WrapperSubsetEval -B weka.classifiers.functions.LibSVM -F 5 -T 0.01 -R 1 -E DEFAULT -- -S 0 -K 2 -D 1 -G 0.01 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -Z -B -model C:\\ProgramData\\Weka-3-8 -seed 1" -S "weka.attributeSelection.LinearForwardSelection -D 0 -N 5 -I -K 5 -T 0" -W weka.classifiers.functions.LibSVM -- -S 0 -K 2 -D 1 -G 0.01 -R 0.0 -N 0.5 -M 40.0 -C 1.0 -E 0.001 -P 0.1 -Z -B -model C:\ProgramData\Weka-3-8 -seed 1
***
-
Hmm. This seems fairly complicated :-) Have you looked at the RankSearch method in the attributeSelectionSearchMethods package?
http://weka.sourceforge.net/packageM...ods/index.html
http://weka.sourceforge.net/doc.pack...ankSearch.html
RankSearch first generates a ranking using an attribute evaluator and then evaluates subsets of increasing size, taken from the top of the ranked list, using a subset evaluator. This effectively makes a linear-time algorithm out of methods like the Wrapper, whose runtime is typically quadratic or worse in the number of attributes.
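To make the idea concrete, here is a minimal, self-contained Java sketch of the rank-then-evaluate-prefixes scheme described above. It does not use the Weka API: the class name RankSearchSketch, the score array, and the toy merit function are all made up for illustration, standing in for an attribute evaluator and a subset evaluator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustrative only: a stand-in for the idea behind RankSearch, not Weka code.
public class RankSearchSketch {

    /** Returns attribute indices sorted by descending single-attribute score. */
    public static List<Integer> rank(double[] scores) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingDouble((Integer i) -> -scores[i]));
        return order;
    }

    /**
     * Evaluates the nested prefixes of the ranking (top 1, top 2, ...) and
     * returns the prefix with the highest merit. Only one subset evaluation
     * is made per attribute, so the number of evaluations is linear.
     * On ties the smaller prefix is kept.
     */
    public static List<Integer> bestPrefix(List<Integer> ranking,
                                           ToDoubleFunction<List<Integer>> subsetMerit) {
        List<Integer> best = new ArrayList<>();
        double bestMerit = Double.NEGATIVE_INFINITY;
        for (int size = 1; size <= ranking.size(); size++) {
            List<Integer> candidate = ranking.subList(0, size);
            double merit = subsetMerit.applyAsDouble(candidate);
            if (merit > bestMerit) {
                bestMerit = merit;
                best = new ArrayList<>(candidate);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical per-attribute scores (e.g. what a filter might report).
        double[] scores = {0.10, 0.80, 0.05, 0.60};
        List<Integer> ranking = rank(scores);

        // Toy subset merit: attributes 1 and 3 help, every attribute has a small cost.
        ToDoubleFunction<List<Integer>> merit =
            s -> (s.contains(1) ? 1.0 : 0.0) + (s.contains(3) ? 1.0 : 0.0) - 0.1 * s.size();

        System.out.println("ranking:     " + ranking);                    // [1, 3, 0, 2]
        System.out.println("best prefix: " + bestPrefix(ranking, merit)); // [1, 3]
    }
}
```

In a real run the merit function would be the expensive part (for a wrapper, a cross-validated classifier), which is why limiting it to one call per prefix matters.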
Cheers,
Mark.
-
Dear Mark,
Thank you very much for the quick answer.
performRanking was set to true, so LFS was discarding the Ranker's list and building its own. With performRanking set to false, the results come out as expected.
I am working on this because I assume the initial list given to a wrapper can play a significant role in the resulting performance, speed, and number of selected attributes. One of the rankers in particular stands to make a difference :-)
I am working with LFS in particular because k can be set explicitly, so the rankers can be compared as a function of k. May I ask another question: which parameter in the original paper does searchTermination correspond to, and what was it set to there?
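For anyone who hits the same thing, here is a toy Java model of the effect. It is not Weka code, and it rests on two assumptions I have not checked against the source: that the reduced data reaches the inner selector in the Ranker's order, and that LFS without its own ranking simply takes the first k attributes it is given. The class name ReRankSketch and both score arrays are invented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of nested selection; names and scores are hypothetical.
public class ReRankSketch {

    /** Returns a copy of attrs sorted by descending score. */
    public static List<Integer> sortByScore(List<Integer> attrs, double[] score) {
        List<Integer> out = new ArrayList<>(attrs);
        out.sort(Comparator.comparingDouble((Integer i) -> -score[i]));
        return out;
    }

    /** Outer stage: keep the top n attributes by filter score; n <= 0 keeps all. */
    public static List<Integer> outerRanker(double[] filterScore, int n) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < filterScore.length; i++) {
            all.add(i);
        }
        List<Integer> ranked = sortByScore(all, filterScore);
        return n <= 0 ? ranked : new ArrayList<>(ranked.subList(0, Math.min(n, ranked.size())));
    }

    /**
     * Inner stage: the pool of k attributes the forward search may use.
     * With performRanking the incoming order is thrown away and replaced
     * by an order based on the inner (wrapper) score.
     */
    public static List<Integer> innerPool(List<Integer> incoming, double[] wrapperScore,
                                          int k, boolean performRanking) {
        List<Integer> ordered = performRanking ? sortByScore(incoming, wrapperScore) : incoming;
        return new ArrayList<>(ordered.subList(0, Math.min(k, ordered.size())));
    }

    public static void main(String[] args) {
        double[] filterScore  = {0.9, 0.8, 0.2, 0.1};   // stand-in for InfoGain
        double[] wrapperScore = {0.3, 0.4, 0.95, 0.5};  // stand-in for single-attribute wrapper merit
        int k = 2;

        // Re-ranking on: the pool depends on how many attributes the outer stage passes.
        System.out.println(innerPool(outerRanker(filterScore, 2), wrapperScore, k, true));  // [1, 0]
        System.out.println(innerPool(outerRanker(filterScore, -1), wrapperScore, k, true)); // [2, 3]

        // Re-ranking off: the pool is the outer top k whenever n >= k.
        System.out.println(innerPool(outerRanker(filterScore, 2), wrapperScore, k, false));  // [0, 1]
        System.out.println(innerPool(outerRanker(filterScore, -1), wrapperScore, k, false)); // [0, 1]
    }
}
```

With re-ranking on, passing all attributes lets the inner stage pick attributes the outer ranking would never have put in the top k, which would explain why numToSelect changed the results.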
Cheers,
Huseyin