FP-Growth is a little bit complicated to explain. The best way to understand it is to take the time to read the paper. You can even draw some fp-trees on paper to make sure that you understand how it works. But I can give you a brief description of the main process.
FPGrowth takes as input a transaction database. It outputs all the frequent itemsets (itemsets that appear in at least minsup
FPGROWTH performs two steps.
STEP1: BUILD FP-TREE: First, FP-GROWTH build the FP-TREE. You should consider the FP-TREE as an intermediary product produced by the algorithm. The FP-TREE is a compact representation of the database that is convenient for discovering the frequent itemsets.
STEP2: MINE FP-TREE: Second, FP-GROWTH use the FP-TREE to discover the frequent itemsets. This is performed by recursively calling the FP-GROWTH(tree, alpha) method described in the paper. Initially the method considers itemsets containing a single item. Then, the method is called recursively with each of these items to see if each item can be extended into a frequent itemset containing 2 items. Then, for each frequent itemset containing 2 items, the method will be called recursively to see if each one can be extended for 3 items... then 4 items... and so on. Discovering frequent itemsets is a recursive process.
That is the main idea. But for the details, you need to read the paper ;-)
By the way, I have some Java source code for FP-Growth and many other algorithms. Check my website SPMF
to get it.
Edited 3 time(s). Last edit at 05/03/2011 09:53PM by webmasterphilfv.