Re: Class association rule induction
Posted by:
Hoa Vu
Date: February 22, 2014 11:07PM
Hi Philippe,
Thank you very much for your reply. The two steps you suggest would definitely fit my needs. However, while working on that, I found another issue: the itemsets I am interested in are very rare in the dataset (~200/55k transactions)!
If I lower minsup to get these rules, it is still very costly, even with FP-Growth. When I set minsup = 1% (mincount = 550 > 200), the output contains 6855736 frequent itemsets. So I think I will try another way:
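To make the numbers concrete, here is a small arithmetic sketch in Python. The figures are the approximate ones from above (about 200 occurrences in about 55k transactions), and the variable names are only for illustration:

```python
# Approximate figures from this post; the real counts may differ slightly.
n_transactions = 55_000   # total transactions in the dataset
rare_count = 200          # transactions containing the itemsets of interest

# Minimum support count at minsup = 1%, using integer arithmetic.
minsup_percent = 1
mincount = n_transactions * minsup_percent // 100

# Highest minsup (in percent) that would still keep the rare itemsets.
max_minsup_percent = 100 * rare_count / n_transactions

print(mincount)                      # 550, above the ~200 occurrences
print(round(max_minsup_percent, 2))  # 0.36
```

So minsup would have to drop to roughly 0.36% or lower before the rare itemsets can appear at all, well below the 1% setting that already produces millions of itemsets.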
Step 1: Divide the dataset into two parts: D1, which holds all the transactions that contain the itemsets of interest, and D2, which holds the rest. Then find all the association rules (X -> Y) in D1 that fit my needs.
Step 2: Count the frequency of X in D2 in order to calculate the confidence of each rule (X -> Y). There should not be many distinct X.
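A minimal sketch of these two steps in Python, assuming Y is the itemset of interest, transactions are sets of items, and everything fits in memory. The function names and the toy data are hypothetical, and the mining of candidate X inside D1 is left out. Since every transaction in D1 contains Y, the count of X in D1 equals the support count of X ∪ Y, so confidence(X -> Y) = count(X in D1) / (count(X in D1) + count(X in D2)).

```python
def split_dataset(transactions, target):
    """Step 1: D1 holds transactions containing the target itemset, D2 the rest."""
    target = frozenset(target)
    d1 = [t for t in transactions if target <= t]
    d2 = [t for t in transactions if not target <= t]
    return d1, d2


def count_support(transactions, itemset):
    """Count the transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def confidence(d1, d2, x):
    """Step 2: confidence of X -> Y, where every transaction in D1 contains Y."""
    in_d1 = count_support(d1, x)  # support count of X together with Y
    in_d2 = count_support(d2, x)  # occurrences of X without Y
    total = in_d1 + in_d2         # support count of X in the whole dataset
    return in_d1 / total if total else 0.0


# Toy data for illustration only; "y" plays the role of the rare itemset.
transactions = [frozenset(t) for t in (
    {"a", "b", "y"}, {"a", "y"}, {"a", "b"}, {"b", "c"}, {"a", "b", "y"},
)]
d1, d2 = split_dataset(transactions, {"y"})
conf_ab = confidence(d1, d2, {"a", "b"})
print(len(d1), len(d2), conf_ab)
```

Counting X in D2 this way is one full scan of D2 per itemset, which is the cost I would like to avoid on the large dataset.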
==> Now my problem becomes finding the frequency of a particular itemset in a very large dataset. If I am not mistaken, would your function called "algorithms for building, updating and querying an Itemset-Tree" solve this problem?
Thank you very much!
Hoa Vu