1. Junior Member
Join Date
Jan 2009
Posts
21

## correlation coefficient

Hello All,

In WEKA, the correlation coefficient is between 0 and 1, but according to Wikipedia (http://en.wikipedia.org/wiki/Correlation_coefficient) it's between -1 and 1, while the coefficient of determination, R² (http://en.wikipedia.org/wiki/R_squared), is between 0 and 1.

Can anyone please confirm which one is right and which is wrong? I know I could compare the formulas, but I am not very confident at maths. Thank you very much.

Wen

2. Pentaho WEKA Architect
Join Date
Aug 2006
Posts
1,741

## Hi Wen,

Weka does report the correlation coefficient. Normally you do get a value that lies from approximately 0 to 1, but this is because it takes a "confused" classifier to get a significantly negative correlation coefficient :-) I.e. the model would have to systematically predict opposite to the true target in order to achieve a negative correlation coefficient. In the case of a nominal binary target, better results could be achieved by flipping the prediction!

Note, you can just take the square of the correlation coefficient to get the coefficient of determination.
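As a rough illustration of that note, here is a minimal Python sketch. The `pearson` helper and the two short lists are made up for this example and are not Weka code or data from this thread. It only shows that squaring a correlation coefficient in [-1, 1] gives a value in [0, 1].

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical actual targets and model predictions (illustration only).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.2, 3.8, 5.1]

r = pearson(predicted, actual)   # lies in [-1, 1]
r_squared = r * r                # lies in [0, 1]
print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

Because the sign is lost when squaring, r = 0.9 and r = -0.9 give the same squared value.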

Cheers,
Mark.

3. Junior Member

## Hi Mark:

Thank you for the reply, but I am still confused about what a 'confused' classifier is.

I made up the following example in Excel as a test, and I don't understand why the correlation coefficient is 1 when value2 = -2 * value1 exactly.

Thank you very much

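| value1 | value2 |
|--------|--------|
| 1      | -2     |
| 54     | -108   |
| 21     | -42    |
| 14     | -28    |
| 0.21   | -0.42  |
| 3      | -6     |
| 4      | -8     |
| 41     | -82    |
| 0.04   | -0.08  |
| 45     | -90    |
| 327    | -654   |
| 2      | -4     |
| 23     | -46    |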

=== Run information ===
Scheme: weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8
Relation: negCoefficient
Instances: 13
Attributes: 2
value1
value2
Test mode: 10-fold cross-validation
=== Classifier model (full training set) ===

Linear Regression Model
value2 =
-2 * value1 +
0
Time taken to build model: 0.08 seconds
=== Cross-validation ===
=== Summary ===
Correlation coefficient 1
Mean absolute error 0
Root mean squared error 0
Relative absolute error 0 %
Root relative squared error 0 %
Total Number of Instances 13

4. Pentaho WEKA Architect

## Hi Wen,

Can you perhaps attach your entire arff file that produced this result? You've given me 7 of the 13 instances and linear regression produces a totally different model on just these 7.

Cheers,
Mark.

5. Junior Member

## Hi Mark:

Here you go, it's an Excel file.

cheers,

Wen

6. Pentaho WEKA Architect

## Hi Wen,

Oh, since the Wiki removed all the spaces between the numbers in your first message, I got the first seven instances completely wrong anyway :-)

So the target really is -2 * value1. The correlation is computed between the *predicted* and actual target values. In the case of the linear model learned, these are identical (it is a perfect model for the data :-)), so the correlation coefficient is 1.0.
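To make the distinction concrete, here is a small Python sketch. It is not Weka code: the `pearson` and `fit_line` helpers are hypothetical, the fit uses the full data rather than 10-fold cross-validation, and the `value1` list is my reading of the 13 posted instances, assuming every `value2` is exactly `-2 * value1`.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Assumed reading of the 13 instances (value2 = -2 * value1).
value1 = [1, 54, 21, 14, 0.21, 3, 4, 41, 0.04, 45, 327, 2, 23]
value2 = [-2 * v for v in value1]

slope, intercept = fit_line(value1, value2)
predicted = [slope * v + intercept for v in value1]

# Correlation between the two attributes: strongly negative.
r_attributes = pearson(value1, value2)
# Correlation between predictions and actual targets: what the summary reports.
r_model = pearson(predicted, value2)
print(f"attribute vs target:  {r_attributes:.3f}")
print(f"predicted vs actual:  {r_model:.3f}")
```

The first number is the one Wen expected to see; the second corresponds to the "Correlation coefficient" line in the evaluation summary.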

Cheers,
Mark.

7. Junior Member

## Hi Mark:

Oh, I see! I thought it was the correlation coefficient between value1 and value2. So really, it is almost impossible to get a negative correlation coefficient?

Thank you a lot

Wen

8. Pentaho WEKA Architect

## Originally Posted by riverculture

Hi Mark:

Oh, I see! I thought it was the correlation coefficient between value1 and value2. So really, it is almost impossible to get a negative correlation coefficient?

Wen

Yes, basically. A classifier that can't fit the concept at all will probably get essentially zero correlation (it might be slightly negative due to chance effects). What I meant earlier by the "confused" classifier is a model that goes out of its way to predict in the opposite direction from the actual class. This would give you a significantly negative correlation coefficient.
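A quick way to see the three cases is the Python sketch below. All of the numbers are invented for illustration and the `pearson` helper is hypothetical; none of this comes from Weka or from the data in this thread.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical true target values.
actual = [1, 2, 3, 4, 5, 6]

# A model that cannot fit the concept: predictions unrelated to the target.
unrelated = [4, 1, 5, 4, 2, 4]
# A "confused" model: predictions that systematically move the opposite way.
confused = [7 - a for a in actual]
# Flipping the confused model's predictions recovers a useful model.
flipped = [7 - p for p in confused]

r_unrelated = pearson(unrelated, actual)
r_confused = pearson(confused, actual)
r_flipped = pearson(flipped, actual)
print(f"unrelated: {r_unrelated:+.3f}")
print(f"confused:  {r_confused:+.3f}")
print(f"flipped:   {r_flipped:+.3f}")
```

The unrelated predictions land close to zero, the confused ones come out strongly negative, and flipping them turns the sign around.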

Cheers,
Mark.

9. Junior Member

## Wen