The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
How to generate probabilities using data mining?
Posted by: some_math_guy
Date: July 13, 2012 07:42AM


Re: How to generate probabilities using data mining?
Date: July 13, 2012 07:00PM

Hi,

Welcome to the forum!

Here is my answer:

First, decision trees can produce probabilities. Each leaf of a decision tree corresponds to a set of training instances that have been classified by the decision tree. If the decision tree can exactly separate the data, every leaf will contain only instances belonging to the same class (e.g. "buy" or "not buy"). This is equivalent to a probability of 0 or 1. But there are also cases where a decision tree cannot perfectly separate the data given the attributes that you have. If this happens, the probability will be different from 0 and 1. For example, a leaf may contain 55 % "buy" and 45 % "not buy". These proportions are class frequencies observed in the training data, and you can use them as estimated probabilities for new instances that reach that leaf.
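To make the leaf idea concrete, here is a minimal sketch in Python. It is not a full decision tree learner: it assumes the tree has already split on a single attribute, so each attribute value is one leaf. The data and the function name leaf_probabilities are made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical training data, invented for illustration:
# each pair is (age_group, class label).
training = [
    ("young", "buy"), ("young", "buy"), ("young", "not buy"),
    ("old", "not buy"), ("old", "not buy"), ("old", "not buy"),
    ("young", "buy"), ("young", "not buy"),
]

def leaf_probabilities(instances):
    """Treat each attribute value as one leaf and return the
    class frequencies observed among the instances in that leaf."""
    leaves = defaultdict(Counter)
    for attribute_value, label in instances:
        leaves[attribute_value][label] += 1
    return {
        leaf: {label: count / sum(counts.values())
               for label, count in counts.items()}
        for leaf, counts in leaves.items()
    }

probs = leaf_probabilities(training)
print(probs["young"])  # mixed leaf: probabilities strictly between 0 and 1
print(probs["old"])    # pure leaf: a single class with probability 1
```

In this toy data the "old" leaf is pure, while the "young" leaf mixes both classes, which is the situation where the tree gives a probability different from 0 and 1.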

Second, you could consider using the naive Bayes classifier. This classifier is built on Bayes' theorem from the field of statistics, so its result is a probability. But you have to be careful about its underlying assumption: the attributes are assumed to be independent of each other given the class. If this assumption does not hold for your data, the probabilities can be inaccurate. You can check Wikipedia for some information about this classifier: http://en.wikipedia.org/wiki/Naive_Bayes_classifier
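As a rough illustration of how naive Bayes turns counts into a probability, here is a small sketch in Python for categorical attributes. The data, the attribute names and the functions train and predict_proba are all hypothetical, and the add-one (Laplace) smoothing is my own choice to avoid zero probabilities, not something required by the method.

```python
from collections import Counter, defaultdict

# Hypothetical training data, invented for illustration:
# each row is ((age_group, income), class label).
rows = [
    (("young", "high"), "buy"),
    (("young", "low"), "buy"),
    (("old", "high"), "buy"),
    (("old", "low"), "not buy"),
    (("young", "low"), "not buy"),
    (("old", "low"), "not buy"),
]

def train(rows):
    """Count classes, attribute values per class, and distinct values per attribute."""
    class_counts = Counter(label for _, label in rows)
    # feature_counts[(attribute_index, label)][value] = number of occurrences
    feature_counts = defaultdict(Counter)
    values = defaultdict(set)
    for features, label in rows:
        for i, v in enumerate(features):
            feature_counts[(i, label)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values

def predict_proba(model, features):
    """Return P(class | features) under the naive independence assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, v in enumerate(features):
            # P(value | class) with add-one (Laplace) smoothing
            score *= (feature_counts[(i, label)][v] + 1) / (n + len(values[i]))
        scores[label] = score
    norm = sum(scores.values())  # normalize so the probabilities sum to 1
    return {label: s / norm for label, s in scores.items()}

model = train(rows)
proba = predict_proba(model, ("young", "high"))
print(proba)
```

The multiplication of the per-attribute terms is exactly where the independence assumption comes in: each attribute contributes to the score as if the others did not exist.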

Those are the two techniques that I'm thinking about now. There might be other techniques too.

Best,

Philippe



Edited 1 time(s). Last edit at 07/13/2012 07:01PM by webmasterphilfv.



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).