Re: Difference between equivalence class and sequence prefix tree
Date: April 19, 2021 09:52PM
Dear Martin,
To design an algorithm for pattern mining, a key issue is to represent the search space in such way that it can be explored without missing any patterns and that the same pattern will not be visited twice (that would waste time).
Both SPAM and SPADE are similar but they do have some differences.
First, a similarity is that both SPAM and SPADE use a vertical database representation (the ID-lists). In SPAM, the difference is that the ID-lists are represented as bit vectors, while in SPADE, they are represented as list of integers. However, there also exist a version of SPADE called BITSPADE, which represents ID-lists as bit vectors just like SPAM.
So what is the main difference between BitSPADE/SPADE and SPAM?
The main difference is how they generate candidate patterns.
SPAM will recursively try to append single items to each frequent pattern to make larger patterns. For this, SPAM performs two types of extensions : i-extensions and s-extensions.
SPADE on the other hand will try to merge two frequent patterns to make larger patterns.
The approach of SPADE is better because there will be less combinations to try.....
For example, if you have two frequent patters like ABC and ABD, SPADE will combine them to make ABCD or ABDC for example.
But SPAM, maybe will try to combine ABC with many single items to generate many possibilities like:
ABCA,
ABCB
ABCC,
ABCD,
ABCE
ABCF
Maybe some of these possibilities are useless....
The idea in SPADE is to only combine two patterns if they are in the same equivalence class.
In SPAM, there is no concept of equivalence class. SPAM will just try to combine many single items with each pattern and this may make a lot more candidates...
That is the main idea. Hope this is clear.
You can also check our paper about CM-SPADE and CM-SPAM where we improved them with some techniques called CMAP to further reduced the search space and make both algorithms more efficient.
Best regards
Edited 1 time(s). Last edit at 04/19/2021 09:52PM by webmasterphilfv.