Problem with ARFF file loading



galib.cse
05-15-2008, 04:29 AM
Hello,
I am stuck on my project because I cannot load my ARFF file. Weka shows this message: "'...arff' not recognised as an 'arff data files' file.
Reason: unable to determine the structure as arff (Reason: java.io.IOException: premature end of line, read Token[EOL], line 5)"

Here is part of my ARFF file:


@relation TestFeatures

@attribute Rate {Positive, Negative}
@attribute TestdataID
@attribute QuestionID
@attribute fft1to5
@attribute fft1to6
@attribute fft1to7
@attribute fft2to5
@attribute fft2to6
@attribute DRDT1to5
@attribute DRDT1to6
@attribute DRDT1to7
@attribute DRDT2to5
@attribute DRDT2to6
@attribute DRDTK31to5
@attribute DRDTK31to6
@attribute DRDTK31to7
@attribute DRDTK32to5
@attribute DRDTK32to6
@attribute DRDTK51to5
@attribute DRDTK51to6
@attribute DRDTK51to7
@attribute DRDTK52to5
@attribute DRDTK52to6
@attribute D2RDT21to5
@attribute D2RDT21to6
@attribute D2RDT21to7
@attribute D2RDT22to5
@attribute D2RDT22to6
@attribute D2RDT2K31to5
@attribute D2RDT2K31to6
@attribute D2RDT2K31to7
@attribute D2RDT2K32to5
@attribute D2RDT2K32to6
@attribute D2RDT2K51to5
@attribute D2RDT2K51to6
@attribute D2RDT2K51to7
@attribute D2RDT2K52to5
@attribute D2RDT2K52to6
@attribute SpiralRating
@data
Positive,1,13,75710,1.0678e+005,1.3697e+005,4003,9379,3,3,3,0,1,3,2,2,0,1,3,2,2,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
Positive,2,13,58659,78289,1.0113e+005,6083,7662,2,5,6,0,2,2,4,5,0,2,2,4,5,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Positive,3,13,67092,92375,1.1986e+005,13182,18667,0,1,2,1,1,0,1,2,1,1,0,1,2,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Positive,4,13,67884,93210,1.1977e+005,5557,8970,11,18,24,0,2,9,16,22,0,3,9,16,22,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
Negative,5,13,71368,1.0117e+005,1.2467e+005,4359,9956,10,7,24,1,5,8,4,20,1,4,8,4,21,1,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-2
Negative,6,13,68481,94244,1.227e+005,6524,10357,6,8,6,1,1,5,7,4,1,1,5,7,4,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-2
Positive,7,13,68110,91959,1.1946e+005,3015,4628,7,15,17,3,8,6,13,15,4,8,6,13,16,3,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
Positive,8,13,66391,94315,1.2782e+005,538,5018,24,31,20,10,12,23,30,19,10,12,23,30,19,10,12,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
Positive,9,13,65643,86094,1.0923e+005,6966,7636,5,18,28,3,4,4,16,25,3,4,4,16,25,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
Negative,10,13,73053,1.0384e+005,1.2973e+005,3911,9714,8,5,14,1,3,6,4,12,1,2,6,4,12,1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-2
Negative,11,13,58754,84842,1.1542e+005,5436,11673,15,18,8,4,4,9,10,1,4,4,10,10,1,4,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1
Positive,12,13,67095,91576,1.2086e+005,3070,5424,5,7,4,2,3,5,7,4,2,3,5,7,4,2,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1

I am sorry to post such a long list of data, but I am stuck on this. Please help me.

Thank you very much.

regards,
Galib

Mark
05-15-2008, 04:58 PM
Each of your numeric-valued attributes needs to have the keyword "numeric" declared after the name of the attribute in the header of the file. E.g.

@attribute Rate {Positive, Negative}
@attribute TestdataID numeric
@attribute QuestionID numeric
@attribute fft1to5 numeric
@attribute fft1to6 numeric

...
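
If the header has many untyped attributes, the keyword can be added with a small script instead of by hand. The following is only a rough sketch in Python, not part of Weka: the function name add_numeric_type is made up for illustration, and it assumes attribute names contain no spaces or quotes and that every untyped attribute really is numeric.

```python
import re

# Matches an @attribute line that has a name but no type after it.
# Assumption: attribute names are single unquoted tokens.
UNTYPED = re.compile(r"^(@attribute\s+\S+)\s*$", re.IGNORECASE)


def add_numeric_type(arff_text):
    """Append 'numeric' to every header @attribute line that lacks a type."""
    fixed = []
    in_header = True
    for line in arff_text.splitlines():
        # Everything from @data onward is instance data and is left alone.
        if line.strip().lower() == "@data":
            in_header = False
        if in_header:
            line = UNTYPED.sub(r"\1 numeric", line)
        fixed.append(line)
    return "\n".join(fixed)


if __name__ == "__main__":
    sample = "\n".join([
        "@relation TestFeatures",
        "@attribute Rate {Positive, Negative}",
        "@attribute TestdataID",
        "@attribute fft1to5",
        "@data",
        "Positive,1,75710",
    ])
    print(add_numeric_type(sample))
```

Nominal declarations such as Rate {Positive, Negative} already have a type, so they are not changed.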

Cheers,
Mark.

galib.cse
05-16-2008, 04:07 AM
Hello Mark,
Thank you for your valuable reply; it really helped. I now have another problem, though. I was trying to select some attributes using information gain, but InfoGainAttributeEval just takes the attributes and then does nothing. Can you please help? If you need it, I can send you my ARFF file.

Thank you very much again for your help, and thanks in advance.

Regards
Galib

Mark
05-17-2008, 06:57 AM
I'm not too sure what you mean by "it does nothing". The InfoGainAttributeEval, when combined with the Ranker, produces a ranked list of attributes. The only time I can imagine this not producing any output is if the virtual machine runs out of memory - in which case you will see a pop-up from the GUI (if using the developer version of Weka 3.5.x) and an exception at the command line.
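
To illustrate what a ranked list based on information gain looks like, here is a rough sketch in plain Python. It is not Weka's implementation: the function names are made up, and it assumes both the attribute values and the class labels are already discrete. The score is the entropy of the class minus the entropy that remains after splitting on the attribute, which is only defined when the class takes a fixed set of labels.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain(values, labels):
    """Class entropy minus the weighted class entropy within each attribute value."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(columns, labels):
    """Return (attribute name, information gain) pairs, best first."""
    scores = {name: info_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = ["Positive", "Positive", "Negative", "Negative"]
    columns = {"a": [1, 1, 2, 2], "b": [1, 2, 1, 2]}
    for name, gain in rank_attributes(columns, labels):
        print(name, round(gain, 3))
```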

What version of Weka are you using, what OS and how big is your data set (# instances and # attributes)? Is there any exception generated in the GUI or on the command line? Also check the log if you are using the GUI.

Cheers,
Mark.

galib.cse
05-17-2008, 01:23 PM
Dear Mark,
Thank you again for your answer. I have 39 attributes and 272 instances at the moment (I will eventually have to use around 6000 to 6500 instances). As you suggested, I checked the log and found these messages:

19:17:40: Started weka.attributeSelection.InfoGainAttributeEval
19:17:40: Class must be nominal!

As I understand it, all of my values are numeric, and I have to work with them. Do you think I need to use nominal values for the class? At the moment I have 7 classes, from -3 to +3. I should also add that Weka is not showing any colors for the attributes, which it usually does for other files. I am going to try nominal classes now, but please let me know if you find anything.
Thank you very much again, and thanks in advance.

Regards,
Galib

galib.cse
05-17-2008, 02:13 PM
Hello Mark,
I have solved the problem by changing my target class to nominal values. Is there any way to use numeric values as the target class, though? If you know of one, please let me know.
Thank you very much for your help.
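
For anyone with the same problem, one way to make that change is to redeclare the class attribute in the header with an explicit list of labels. This is only a rough sketch in Python, not a Weka tool: the function name make_attribute_nominal is made up, and it assumes the class is the SpiralRating attribute, that its labels are the integers -3 to 3, and that the attribute name is unquoted.

```python
import re


def make_attribute_nominal(arff_text, attr_name, labels):
    """Redeclare attr_name in the ARFF header as a nominal attribute."""
    # Matches the declaration whether or not a type such as 'numeric' follows.
    pattern = re.compile(
        r"^@attribute\s+" + re.escape(attr_name) + r"(\s.*)?$",
        re.IGNORECASE,
    )
    nominal = "@attribute %s {%s}" % (attr_name, ",".join(str(v) for v in labels))
    out = []
    in_header = True
    for line in arff_text.splitlines():
        # Data rows are left untouched; only the header declaration changes.
        if line.strip().lower() == "@data":
            in_header = False
        if in_header and pattern.match(line):
            line = nominal
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "\n".join([
        "@relation TestFeatures",
        "@attribute fft1to5 numeric",
        "@attribute SpiralRating numeric",
        "@data",
        "75710,2",
        "71368,-2",
    ])
    print(make_attribute_nominal(sample, "SpiralRating", range(-3, 4)))
```

The values in the data rows stay as they are; they just have to appear in the label list.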

Best regards,
Galib

Mark
05-18-2008, 03:41 AM
Hi Galib,

Yes, there are some attribute selection methods in Weka that can work with a numeric class attribute. Try CfsSubsetEval, ReliefFAttributeEval and WrapperSubsetEval (using a base classifier that can predict a numeric target).

Cheers,
Mark.