Good afternoon Dr. Xiaowei,
Thanks for your interest in the paper. And happy to receive a message from someone else in China! I have been to Zhejiang and it is a nice place ;-)
This is a good question and I understand that terms can be confusing.
I will explain these terms and how I use them to make this clearer.
- Memory usage or memory consumption: the amount of RAM that is used by an algorithm at a specific time. For example, if I run an algorithm, I could check now (at 5:50 PM) and see that the algorithm is currently using 200 MB.
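To illustrate, here is a small Java sketch (just a rough illustration, not code from any particular paper) of how you could check the memory used by the JVM at one point in time:

public class CurrentMemoryExample {
    public static void main(String[] args) {
        // Memory currently allocated to the JVM minus the free part = memory in use right now
        Runtime runtime = Runtime.getRuntime();
        long usedBytes = runtime.totalMemory() - runtime.freeMemory();
        System.out.println("Current memory usage: " + usedBytes / (1024.0 * 1024.0) + " MB");
    }
}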
- Maximum memory usage or peak memory usage: since the memory usage of an algorithm can vary quite a lot over time, we are often interested in the maximum memory usage. That means that I will run an algorithm, perform many measurements of the memory usage over time, and then keep only the maximum value. For example, I can run an algorithm from 5:50 PM to 5:52 PM. Let's say that at 5:50 PM it uses 200 MB, at 5:51 PM it uses 350 MB, and at 5:52 PM it uses 100 MB. Then the maximum memory usage is 350 MB.
In general, for the papers where the algorithms are implemented in Java, we use the maximum memory usage to measure memory because in Java, memory measurements are quite inaccurate. The reason is that Java uses a garbage collector (GC) to free memory and we never know when the GC will clean the memory. By calculating the maximum memory usage, we can get results that are quite stable.
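Here is a small Java sketch of one way to record the peak memory usage: a background thread samples the used memory at regular intervals and keeps only the maximum. The runAlgorithm() method is just a placeholder for whatever algorithm you want to measure:

import java.util.concurrent.atomic.AtomicLong;

public class PeakMemoryExample {
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        AtomicLong peak = new AtomicLong(0);

        // Sampler thread: measure the used memory every 100 ms and keep the maximum
        Thread sampler = new Thread(() -> {
            Runtime rt = Runtime.getRuntime();
            while (running) {
                long used = rt.totalMemory() - rt.freeMemory();
                peak.updateAndGet(current -> Math.max(current, used));
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    return;
                }
            }
        });
        sampler.start();

        runAlgorithm();  // placeholder for the algorithm to be measured

        running = false;
        sampler.join();
        System.out.println("Peak memory usage: " + peak.get() / (1024.0 * 1024.0) + " MB");
    }

    // Dummy workload, just so there is something to measure
    private static void runAlgorithm() {
        java.util.List<int[]> data = new java.util.ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            data.add(new int[10000]);
        }
    }
}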
- Memory scalability: this refers to how much the (maximum) memory usage will increase when we increase the amount of data that is given as input to an algorithm. For example, when we increase the amount of data, the memory could increase a little, increase linearly, increase exponentially, etc. So to evaluate the memory scalability, we will vary the size of the dataset that is given to an algorithm and check how much memory is used. Then we can draw a chart or make a table with these results. If we find that the memory usage increases exponentially, this is very bad... :-) Ideally, we want to find that the memory usage increases linearly with the size of the data.
So to go back to that paper, when it is said that algorithm X has a better memory scalability than another algorithm Y, it means that if we give bigger and bigger datasets to X, its memory usage will increase more slowly than if we give bigger and bigger datasets to Y.
To make this more clear, you can think about the memory scalability as the space complexity of an algorithm. But when we talk about complexity, we generally try to analyze the complexity of an algorithm formally... When we talk about scalability, we often talk from the perspective of what we observed in some experiments!
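Here is a small Java sketch of what such a scalability experiment could look like: we run the same algorithm on bigger and bigger inputs and record roughly how much memory is used each time. Again, runAlgorithm() and the dataset sizes are just placeholders for illustration:

public class MemoryScalabilityExperiment {

    public static void main(String[] args) {
        // Run the same algorithm on bigger and bigger datasets and record the memory each time
        int[] datasetSizes = {10000, 20000, 40000, 80000};
        for (int size : datasetSizes) {
            System.gc();  // ask the GC to clean up before measuring (only a hint, not guaranteed)
            long before = usedMemory();
            int[][] result = runAlgorithm(size);  // placeholder algorithm
            long used = usedMemory() - before;    // rough estimate while the result is still in memory
            System.out.println("dataset size = " + size
                    + "  memory = " + used / (1024.0 * 1024.0) + " MB"
                    + "  (result rows: " + result.length + ")");
        }
    }

    private static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Dummy "algorithm" whose memory usage grows roughly linearly with the input size
    private static int[][] runAlgorithm(int size) {
        return new int[size][100];
    }
}

If we plot the memory values against the dataset sizes, the shape of the curve (flat, linear, exponential, etc.) tells us about the memory scalability.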
Related to that, if you are using Java, you may be interested to read my blog post where I explain how to accurately measure the memory usage in Java: How to measure the memory usage of data mining algorithms in Java?

Also, I should say that when we evaluate scalability, we can evaluate it with respect to different aspects of the data. For example, let's say that I make an algorithm to analyze graph data. I could check the scalability when we increase the number of edges, or I could check the scalability when we increase the number of labels per node, etc. So basically, we can vary different aspects of the data to evaluate the scalability.
Also, we can evaluate the scalability not just in terms of memory but also in terms of runtime ;-)
Hope this is clear!
Best regards,
Philippe