Hi everyone, I just wanted to give a pointer to Dataconda (available here), a new piece of software (free for teaching and research purposes) that analyzes relational data. It is very useful in the data preparation phase, when you have to build a "mining table" from a database. Most of the research on data mining is about how to analyze a flat table, but very little has been done on how to build that table from the raw data, which in 99% of cases sits in a relational DB.
Say that you have a table "Customers" and a table "Purchases" such that each customer has 0:n purchases. Say that customers are divided into "good" and "bad" customers and that you want to classify them accordingly. Dataconda allows you to find, for example, that a customer's goodness is determined by the number of times that she bought products that were popular among young customers, without you having to build this attribute or even to suspect that it could be a good predictor.
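To make the example concrete, here is a rough sketch of what building that one attribute by hand might look like. This is not Dataconda's code or its API, just plain Python with an in-memory SQLite database. The column names (id, age, label, customer_id, product), the "young" cutoff of 30, and the "popular" cutoff of 2 purchases are all assumptions I made up for illustration.

```python
import sqlite3


def make_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Customers (id INTEGER PRIMARY KEY, age INTEGER, label TEXT)")
    conn.execute("CREATE TABLE Purchases (customer_id INTEGER, product TEXT)")
    conn.executemany(
        "INSERT INTO Customers VALUES (?, ?, ?)",
        [(1, 25, "good"), (2, 22, "good"), (3, 55, "bad"), (4, 40, "bad")],
    )
    conn.executemany(
        "INSERT INTO Purchases VALUES (?, ?)",
        [
            (1, "headphones"), (1, "sneakers"), (1, "garden hose"),
            (2, "headphones"), (2, "sneakers"),
            (3, "headphones"), (3, "garden hose"),
            # customer 4 has no purchases (the "0" in 0:n)
        ],
    )
    return conn


def build_mining_table(conn, young_below=30, popular_at_least=2):
    """Return one row per customer: (id, label, n_young_popular).

    n_young_popular counts the customer's purchases of products that were
    bought at least `popular_at_least` times by customers younger than
    `young_below`. Both thresholds are illustrative assumptions.
    """
    query = """
    WITH young_popular AS (
        SELECT p.product
        FROM Purchases p
        JOIN Customers c ON c.id = p.customer_id
        WHERE c.age < ?
        GROUP BY p.product
        HAVING COUNT(*) >= ?
    )
    SELECT c.id, c.label, COUNT(yp.product) AS n_young_popular
    FROM Customers c
    LEFT JOIN Purchases p ON p.customer_id = c.id
    LEFT JOIN young_popular yp ON yp.product = p.product
    GROUP BY c.id, c.label
    ORDER BY c.id
    """
    return conn.execute(query, (young_below, popular_at_least)).fetchall()


if __name__ == "__main__":
    for row in build_mining_table(make_demo_db()):
        print(row)
```

The result is a flat table with one row per customer, which is the shape most classifiers expect. The catch is that you had to think of this particular attribute first and then write the query for it.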
I have a conference paper on this topic and I would love to engage more researchers in this area. I'm also using Dataconda in my business analytics course and I can share material and slides.