Re: what is the difference between frequent subgraph mining from single large graph and multiple transactional graphs?
Date: October 28, 2018 01:33AM
Hi Maya,
From what I read about frequent subgraph mining, most frequent subgraph mining like GSpan can applied to a single graph or to a database of graphs. In other words, it is not difficult to adapt a frequent subbgraph mining algorithm so that it can be applied to these two scenarios. The main difference is how the support is calculated.
Lets say that you want to calculate the support of a subgraph:
If you have a single graph, then the support of the subgraph is usually calculated as the number of occurrences of that subgraph in that single graph. Also, you may want to decide whether you allows some overlapping occurrences or not.
If you have a database of graphs, then the support of the subgraph is the number of graphs that contains this subgraph.
Thus, the main difference between these two scenarios is how the support of a subgraph is calculated. In both cases, calculating the support of a subgraph requires to find all occurrences of the subgraph. This gives you the support in the case of a single graph. In the case of multiple graphs, you also find all occurrences, but need to also check if theses occurrences appear in a same graph. Thus you need to keep the graph ID of each occurrence to calculate the support in the case of multiple graphs. It is just some simple modification to how the support is calculated.
By the way, it is also possible to modify subgraph mining algorithms to work on graphs with directed edges or undirected edges. This is also I think a simple modification.
Best regards,
Edited 1 time(s). Last edit at 10/28/2018 01:34AM by webmasterphilfv.