The Data Mining Forum
IMPORTANT: This is the old Data Mining forum.
I keep it online so that you can read the old messages.

Please post your new messages in the new forum: https://forum2.philippe-fournier-viger.com/index.php
 
DATA MINING in FORUM
Posted by: Marcus P.
Date: April 26, 2011 11:50PM

Hello,

I would like to use data mining algorithms to analyze messages from internet forums, but I don't know which algorithm I should use. I want to do this for a school project.

My question is: which algorithm should I use?

I want to discover patterns in messages from forums. For example, I would like to filter spam messages and detect messages containing insults or other inappropriate behavior. I think that data mining would be very useful for doing this automatically.

Should I use Bayesian networks? Or something else?

Thanks

Re: DATA MINING in FORUM
Posted by: Symboldrive
Date: April 29, 2011 04:34AM

Hi,

There are a lot of data mining techniques that you could use. In the past, some researchers have tried approaches like latent semantic analysis to automatically evaluate comments in product feedback and determine whether they are positive or negative. For spam filtering, Bayesian methods (typically naive Bayes classifiers, which can be seen as a very simple Bayesian network) are used in some e-mail systems and have had very good success. Of course, you could also use another kind of approach, like an expert system with many hardcoded rules about what is a good or bad message in a forum, but that would not be data mining. I guess that for your school project you need to use a data mining technique, so I think that you should look at the work on latent semantic analysis. Good luck with your project.
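
To give an idea of the Bayesian approach to spam filtering mentioned above, here is a minimal sketch of a naive Bayes text classifier using only the Python standard library. The training messages, the labels "spam" and "ok", and the names `tokenize` and `NaiveBayes` are made up for illustration; a real project would need a much larger labeled set of forum messages and probably an existing library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()           # every word seen during training

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(text):
                counts[word] += 1
                self.vocab.add(word)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, invented for this example.
training = [
    ("buy cheap pills now", "spam"),
    ("cheap watches buy now click here", "spam"),
    ("click here to win money now", "spam"),
    ("which algorithm should I use for clustering", "ok"),
    ("thanks for the help with my school project", "ok"),
    ("I have a question about association rules", "ok"),
]

clf = NaiveBayes()
clf.train(training)
print(clf.predict("win cheap pills click now"))
print(clf.predict("question about my clustering project"))
```

The same idea could be tried for insults or other inappropriate messages by changing the labels, as long as you have examples of each class to train on.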

SymbolDrive

Re: DATA MINING in FORUM
Posted by: Max
Date: April 29, 2011 06:04PM

Hey,

IMO, this is too much work for a school project for a single course. You should try to do something smaller. Another solution would be to do a simulation. For example, you could apply some data mining algorithms to data that you have generated yourself and show the results. Or, if you have the opportunity, maybe you should choose another data mining topic, like clustering, that would be easier to do. Good luck!
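
As a rough illustration of the simulation idea, here is a small sketch that generates two groups of 2D points and clusters them with a basic k-means written in plain Python. Everything in it is an assumption made for the example: the blob centers, the number of points, the seeds, and the farthest-first choice of starting centroids (a simple way to avoid a bad random start, not part of standard k-means).

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two 2D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points, k, iterations=20, seed=0):
    rng = random.Random(seed)
    # Farthest-first initialization: start from a random point, then
    # repeatedly add the point farthest from the centroids chosen so far.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, clusters


# Generated data: 50 points around each of two made-up centers.
data_rng = random.Random(42)
true_centers = [(0.0, 0.0), (10.0, 10.0)]
points = [
    (data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0))
    for cx, cy in true_centers
    for _ in range(50)
]

centroids, clusters = kmeans(points, k=2)
for c, members in zip(centroids, clusters):
    print(f"centroid=({c[0]:.2f}, {c[1]:.2f}) size={len(members)}")
```

Because you know how the data was generated, you can check whether the centroids found are close to the centers you chose, which makes the results easy to present.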



This forum is powered by Phorum and provided by P. Fournier-Viger (© 2012).