Hello Vivek,
I tried to follow the paper for implementing SPAM. In my implementation there is a few differences due to optimization and some design decisions. But the main idea of the S step is the same.
I will explain to you the
createNewBitmapSStep method in the
Bitmap class, which performs the main part of the S-Step.
The S-STEP is shown in figure 4 of the paper. It consists of doing a kind of logical AND (not exactly a logical AND) with two bitmaps : the bitmap corresponding to the current pattern (e.g. {a}) and an itemset that we want to add to the current pattern (e.g. {b}). The goal is to calculate the bitmap corresponding to the resulting pattern (e.g. {a}, {b}).
So how to do that? Let's look at the
createNewBitmapSStep method.
First I create a new bitmap to store the result.
BitSet newBitset = new BitSet(lastBitIndex);
Bitmap newBitmap = new Bitmap(newBitset);
Then, to perform the AND, the
createNewBitmapSStep method first do a loop on the bits of the bitmap corresponding to the itemset that we want to add (e.g. {b}). The index bitK represents the index of a bit that is set to 1 in this bitmap. The loop will loop on all the bit set to 1 on the bitmap.
for (int bitK = bitmap.nextSetBit(0); bitK >= 0; bitK = bitmap.nextSetBit(bitK+1)) {
For each bit set to 1, bitK is the index in the bitmap. We need to determine to what sequence this bit correspond in the original sequence database and what is the last bit that corresponds to this sequence. This is done with this:
int sid = bitToSID(bitK, sequencesSize);
int lastBitOfSID = lastBitOfSID(sid, sequencesSize, lastBitIndex);
Then, we check the other bitmap (e.g. {a}) to see if there is a bit set to 1 after the position bitK until the last bit of the sequence.
for (int bit = bitmapItem.bitmap.nextSetBit(bitK+1); bit >= 0 && bit <= lastBitOfSID; bit = bitmapItem.bitmap.nextSetBit(bit+1)) {
newBitmap.bitmap.set(bit);
match = true;
}
If there is, that means then we set the bit to 1 in the new bitmap
newBitmap.bitmap.set(bit);
match = true;
and then I do some optimizations like increasing the support count, and storing the last bit set to 1 for the new bitmap (lastSID). These are for optimizations purposes:
if(match){
// update the support
if(sid != newBitmap.lastSID){
newBitmap.support++;
}
newBitmap.lastSID = sid;
}
bitK = lastBitOfSID; // to skip the bit from the same sequence
The idea of these optimisations it to keep track of the support to avoid always counting the bits and to keep the position of the last bit set to 1 to avoid scanning the bitmap completely.
I think that I have explained the main idea of the S-Step.
Actually it is not very complicated. You take two bitmap (e.g. {a} and {b}( and do a kind of AND to generate a new bitmap (e.g. {a},{b})
Hope this helps,
Philippe
Edited 4 time(s). Last edit at 04/22/2012 06:16AM by webmasterphilfv.