The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
How to interpret Weka Logistic Regression output?
Posted by: Anton
Date: October 02, 2013 05:21AM

Please help me interpret the results of the logistic regression produced by weka.classifiers.functions.Logistic from the Weka library.

I use the numeric weather data set from the Weka examples:

@relation weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
sunny,72,95,FALSE,no
sunny,69,70,FALSE,yes
rainy,75,80,FALSE,yes
sunny,75,70,TRUE,yes
overcast,72,90,TRUE,yes
overcast,81,75,FALSE,yes
rainy,71,91,TRUE,no

To create the logistic regression model, I use the following command:
java -cp $WEKA_INS/weka.jar weka.classifiers.functions.Logistic -t $WEKA_INS/data/weather.numeric.arff -T $WEKA_INS/data/weather.numeric.arff -d ./weather.numeric.model.arff

Here the three arguments mean:

-t <name of training file> : Sets training file.
-T <name of test file> : Sets test file.
-d <name of output file> : Sets model output file.

Running the above command produces the following output:

Logistic Regression with ridge parameter of 1.0E-8
Coefficients...
                         Class
Variable                   yes
===============================
outlook=sunny          -6.4257
outlook=overcast       13.5922
outlook=rainy          -5.6562
temperature            -0.0776
humidity               -0.1556
windy                   3.7317
Intercept               22.234

Odds Ratios...
                         Class
Variable                   yes
===============================
outlook=sunny           0.0016
outlook=overcast   799848.4264
outlook=rainy           0.0035
temperature             0.9254
humidity                0.8559
windy                  41.7508


Time taken to build model: 0.05 seconds
Time taken to test model on training data: 0 seconds

=== Error on training data ===
Correctly Classified Instances          11               78.5714 %
Incorrectly Classified Instances         3               21.4286 %
Kappa statistic                          0.5532
Mean absolute error                      0.2066
Root mean squared error                  0.3273
Relative absolute error                 44.4963 %
Root relative squared error             68.2597 %
Total Number of Instances               14

=== Confusion Matrix ===
 a b   <-- classified as
 7 2 | a = yes
 1 4 | b = no


Questions:

1) First section of the report:

Coefficients...
                         Class
Variable                   yes
===============================
outlook=sunny          -6.4257
outlook=overcast       13.5922
outlook=rainy          -5.6562
temperature            -0.0776
humidity               -0.1556
windy                   3.7317
Intercept               22.234

1.1) Do I understand correctly that the "Coefficients" are in fact weights that are applied to each attribute
before the results are added together to produce the value for the class attribute "play" being equal to "yes"?
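
A minimal sketch of how the coefficients in question 1.1 could be applied, offered as an illustration rather than as Weka's actual code. It assumes the usual logistic model: the weighted sum z = Intercept + sum(coefficient * input) is the log-odds of play = yes, and P(yes) = 1 / (1 + exp(-z)). Two further assumptions are not stated in the report: each outlook value acts as a 0/1 indicator, and windy enters as a single 0/1 input with FALSE = 1 and TRUE = 0. The names prob_yes and confusion_matrix are hypothetical helpers, not part of Weka.

```python
import math

# Coefficients for class "yes", copied from the "Coefficients..." table above.
COEF = {
    "outlook=sunny": -6.4257,
    "outlook=overcast": 13.5922,
    "outlook=rainy": -5.6562,
    "temperature": -0.0776,
    "humidity": -0.1556,
    "windy": 3.7317,
}
INTERCEPT = 22.234

# The 14 weather instances: (outlook, temperature, humidity, windy, play).
DATA = [
    ("sunny", 85, 85, False, "no"),
    ("sunny", 80, 90, True, "no"),
    ("overcast", 83, 86, False, "yes"),
    ("rainy", 70, 96, False, "yes"),
    ("rainy", 68, 80, False, "yes"),
    ("rainy", 65, 70, True, "no"),
    ("overcast", 64, 65, True, "yes"),
    ("sunny", 72, 95, False, "no"),
    ("sunny", 69, 70, False, "yes"),
    ("rainy", 75, 80, False, "yes"),
    ("sunny", 75, 70, True, "yes"),
    ("overcast", 72, 90, True, "yes"),
    ("overcast", 81, 75, False, "yes"),
    ("rainy", 71, 91, True, "no"),
]


def prob_yes(outlook, temperature, humidity, windy):
    """Estimated P(play = yes) for one instance under the stated assumptions."""
    # Assumption: windy is encoded as FALSE -> 1, TRUE -> 0.
    windy_code = 0.0 if windy else 1.0
    # Weighted sum of the inputs plus the intercept (the log-odds of "yes").
    z = (INTERCEPT
         + COEF["outlook=" + outlook]
         + COEF["temperature"] * temperature
         + COEF["humidity"] * humidity
         + COEF["windy"] * windy_code)
    # The logistic function maps the log-odds to a probability.
    return 1.0 / (1.0 + math.exp(-z))


def confusion_matrix(rows):
    """Rows are actual classes (yes, no); columns are predicted classes (yes, no)."""
    labels = ["yes", "no"]
    matrix = [[0, 0], [0, 0]]
    for outlook, temperature, humidity, windy, actual in rows:
        p = prob_yes(outlook, temperature, humidity, windy)
        predicted = "yes" if p >= 0.5 else "no"
        matrix[labels.index(actual)][labels.index(predicted)] += 1
    return matrix


if __name__ == "__main__":
    print(confusion_matrix(DATA))
```

Under these assumptions, scoring the 14 training instances with the rounded coefficients gives the same counts as the confusion matrix in the report (7 2 / 1 4), which suggests the encoding guess is at least consistent with the output.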

2) Second section of the report:

Odds Ratios...
                         Class
Variable                   yes
===============================
outlook=sunny           0.0016
outlook=overcast   799848.4264
outlook=rainy           0.0035
temperature             0.9254
humidity                0.8559
windy                  41.7508

2.1) What is the meaning of "Odds Ratios"?
2.2) Do they all also relate to the class attribute "play" being equal to "yes"?
2.3) Why is the value of "outlook=overcast" so much bigger than the value of "outlook=sunny"?
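
As a quick check on questions 2.1 to 2.3, the sketch below compares each reported odds ratio with exp(coefficient). In logistic regression the odds ratio of an input is conventionally exp of its coefficient: the factor by which the odds of play = yes are multiplied when that input increases by one unit and the others stay fixed. The numbers are copied from the two tables in the report. The printed coefficients are rounded to four decimals, so small differences are expected. Because the relationship is exponential, a moderate gap between coefficients (13.5922 versus -6.4257) turns into a very large gap between odds ratios. In this data set all four overcast instances have play = yes, which is probably why that coefficient is so large with such a small ridge parameter.

```python
import math

# Coefficients and odds ratios copied from the Weka report (class "yes").
COEFFICIENTS = {
    "outlook=sunny": -6.4257,
    "outlook=overcast": 13.5922,
    "outlook=rainy": -5.6562,
    "temperature": -0.0776,
    "humidity": -0.1556,
    "windy": 3.7317,
}
REPORTED_ODDS_RATIOS = {
    "outlook=sunny": 0.0016,
    "outlook=overcast": 799848.4264,
    "outlook=rainy": 0.0035,
    "temperature": 0.9254,
    "humidity": 0.8559,
    "windy": 41.7508,
}


def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)


if __name__ == "__main__":
    for name, coef in COEFFICIENTS.items():
        computed = odds_ratio(coef)
        reported = REPORTED_ODDS_RATIOS[name]
        print(f"{name:<18} exp({coef:+.4f}) = {computed:.4f}   reported: {reported}")
```

Read this way, the temperature value 0.9254 would mean that each additional degree multiplies the odds of play = yes by roughly 0.93.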

3) Last section of the report:

=== Confusion Matrix ===
 a b   <-- classified as
 7 2 | a = yes
 1 4 | b = no

3.1) What is the meaning of the Confusion Matrix?
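
For question 3.1, the sketch below assumes the usual reading of the matrix: each row is an actual class, each column is the class the model predicted, so the diagonal holds the correct predictions. It recomputes the accuracy and the kappa statistic from the four counts. The function names are hypothetical, and kappa is computed with the standard Cohen's kappa formula, which is assumed here to be what the report prints.

```python
# Confusion matrix from the report.
# Rows: actual class (yes, no). Columns: predicted class (yes, no).
MATRIX = [
    [7, 2],  # 7 actual "yes" predicted as yes, 2 predicted as no
    [1, 4],  # 1 actual "no" predicted as yes, 4 predicted as no
]


def accuracy(matrix):
    """Fraction of instances on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Cohen's kappa: agreement beyond what chance alone would give."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    # Chance agreement from the row totals (actual) and column totals (predicted).
    expected = 0.0
    for i in range(len(matrix)):
        row_total = sum(matrix[i])
        col_total = sum(row[i] for row in matrix)
        expected += (row_total / total) * (col_total / total)
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    print(f"Accuracy: {accuracy(MATRIX) * 100:.4f} %")
    print(f"Kappa:    {kappa(MATRIX):.4f}")
```

With these counts, 11 of 14 instances lie on the diagonal, which matches the "Correctly Classified Instances 11 78.5714 %" line, and the kappa value agrees with the reported 0.5532.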


Thanks a lot for your help!



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).