Measuring memory consumption

There are various external tools and methods that can be used to measure memory consumption. In some environments, though, the only way to measure memory consumption is to query it from within your code. BlueGene/P systems fall into this category because your program runs on dedicated compute nodes that run only your program and do not allow other programs to run alongside it.

Code instrumentation

Here are two methods of instrumenting your code to determine the amount of memory in use.

Using GetMemorySize

VisIt provides a function, GetMemorySize, declared in Utility.h, that reports the amount of memory a process is using and its resident set size (RSS), which is the portion of the process's memory that is held in RAM.

#include <Utility.h>
#include <DebugStream.h>

int size = -1, rss = -1;
GetMemorySize(size, rss);

debug5 << "Amount of memory in use: " << size << ", RSS=" << rss << endl;

A challenge in using this method is that most malloc implementations do not return freed memory to the system. For performance reasons, they tend to keep it around to satisfy later malloc calls by the same process. As a result, the usage numbers obtained this way tend only to increase. They cannot, for example, be used to bracket a block of code that allocates and/or frees data with the expectation that the difference between the returned values will quantify the memory usage changes that happened in the intervening block. On the other hand, when very large allocations are freed, malloc implementations often do return them to the system, and the usage numbers reported here would then reflect a reduction in memory usage.
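To make the two quantities concrete (total size and RSS), here is a minimal Python sketch that extracts them from the text of a Linux /proc/<pid>/status file. It assumes the Linux procfs format and is only an illustration of what the numbers mean; it is not how GetMemorySize is implemented. The helper names parse_proc_status and read_own_memory are hypothetical.

```python
# Sketch: extract virtual size and resident set size from procfs status text.
# Assumes the Linux format, e.g. "VmSize:    204800 kB". Helper names are
# hypothetical and this is not VisIt's own implementation.

def parse_proc_status(text):
    """Return the VmSize and VmRSS values (in kB) that appear in text."""
    wanted = ("VmSize", "VmRSS")
    result = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            fields = rest.split()
            if fields:
                result[key] = int(fields[0])  # procfs reports the value in kB
    return result


def read_own_memory(path="/proc/self/status"):
    """Read this process's status file; returns {} where it cannot be read."""
    try:
        with open(path) as f:
            return parse_proc_status(f.read())
    except (IOError, OSError):
        return {}


if __name__ == "__main__":
    print(read_own_memory())
```

Sampling read_own_memory() before and after a block of code is subject to the same caveat described above: the values reflect what the process holds from the system, not what the block itself allocated or freed.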

Using mallinfo

The mallinfo() function is a GNU extension (it may only exist on Linux) that lets you query statistics for memory allocated via malloc(). This also covers C++ programs because the default C++ new operators are typically implemented on top of malloc(). Here is a function that you can include in your code to log memory usage:

#include <string>
#include <malloc.h>
#include <DebugStream.h>

// Write the current malloc statistics to the debug5 log, tagged with label.
void
log_memory_usage(const std::string &label = std::string())
{
    struct mallinfo m = mallinfo();
    debug5 << "mallinfo:" << label << endl;
    debug5 << "  non-mmapped space allocated from system ="<<m.arena<<endl;
    debug5 << "  number of free chunks ="<<m.ordblks<<endl;
    debug5 << "  number of fastbin blocks ="<<m.smblks<<endl;
    debug5 << "  number of mmapped regions ="<<m.hblks<<endl;
    debug5 << "  space in mmapped regions ="<<m.hblkhd<<endl;
    debug5 << "  maximum total allocated space ="<<m.usmblks<<endl;
    debug5 << "  space available in freed fastbin blocks ="<<m.fsmblks<<endl;
    debug5 << "  total allocated space ="<<m.uordblks<<endl;
    debug5 << "  total free space ="<<m.fordblks<<endl;
}

The first number to look at is non-mmapped space allocated from system, which is the total amount of memory that malloc() has obtained from the system so far by means other than mmap(). The next important lines are total allocated space and total free space. These give the amount of that malloc'd memory that is in use by the program and the amount that is currently free, respectively.
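As a rough aid for reading these counters, the sketch below combines them into a few summary figures. It assumes glibc semantics, where the arena value excludes mmapped regions and the mmapped space is reported separately; the function name summarize_mallinfo and the returned keys are hypothetical, and the arithmetic is an interpretation rather than an official definition.

```python
# Sketch: derive summary figures from mallinfo-style counters (all in bytes).
# Assumption: glibc semantics, where arena does not include mmapped regions.
# The function name and result keys are hypothetical.

def summarize_mallinfo(arena, hblkhd, uordblks, fordblks):
    """Combine the logged counters into totals that are easier to compare."""
    total_from_system = arena + hblkhd   # heap space plus mmapped regions
    in_use = uordblks + hblkhd           # allocated chunks plus mmapped regions
    # Fraction of the non-mmapped heap that malloc holds but the program
    # is not currently using.
    free_fraction = float(fordblks) / arena if arena else 0.0
    return {
        "total_from_system": total_from_system,
        "in_use": in_use,
        "free_fraction": free_fraction,
    }


if __name__ == "__main__":
    print(summarize_mallinfo(arena=1000, hblkhd=500, uordblks=750, fordblks=250))
```

A free fraction that stays large after a phase that frees data is consistent with malloc retaining freed memory for reuse rather than returning it to the system.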

External sampling

On systems where you can run additional programs to sample a program's memory footprint, you have various options. Note that for parallel programs, you should use the code instrumentation approaches since you will likely want to profile memory consumption for several hundreds or thousands of processors. That is not typically feasible using sampling tools.

Using top

Most UNIX systems provide a program called top that lists the processes that are using the most CPU cycles. The table that it prints usually also includes columns for the memory used by each process, such as its virtual size and resident set size.
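If you want to record such samples from a script rather than read them off the screen, one option is to call ps, which reports similar per-process columns. The sketch below assumes a ps that supports the vsz and rss output fields (rss is widely available but not guaranteed everywhere); the helper names parse_ps_line and sample_process are hypothetical.

```python
# Sketch: sample a process's virtual size and RSS by running ps once.
# Assumption: ps accepts "-o pid= -o vsz= -o rss=" and reports sizes in KiB.
# Helper names are hypothetical.
import subprocess


def parse_ps_line(line):
    """Parse one 'PID VSZ RSS' line of ps output into a dict."""
    pid, vsz, rss = line.split()[:3]
    return {"pid": int(pid), "vsz_kb": int(vsz), "rss_kb": int(rss)}


def sample_process(pid):
    """Return one sample for pid, or None if ps fails or the pid is gone."""
    cmd = ["ps", "-o", "pid=", "-o", "vsz=", "-o", "rss=", "-p", str(pid)]
    try:
        out = subprocess.check_output(cmd)
    except (OSError, subprocess.CalledProcessError):
        return None
    text = out.decode().strip()
    return parse_ps_line(text) if text else None
```

Calling sample_process in a loop with a short sleep between calls gives a simple time series for a single process, which is practical for a serial run but not for a large parallel job.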

Using Totalview

Totalview has a memory profiler that can keep track of the allocations for each processor in your parallel job. Note that when you set up the process to be debugged in Totalview, there is an "Enable memory checking" check box that you must enable before running your program.

In my experience, using Totalview for memory debugging is hard to get working with VisIt in a parallel setting.

Querying VisIt

VisIt's CLI provides a function called GetProcessAttributes() that you can insert into your scripts to measure memory usage for the compute engine or viewer. It is helpful to insert code into your scripts that measures memory usage after operations that cause the compute engine to execute, such as changing time steps. You can write the samples to a curve file that you can plot in VisIt so you can see the memory behavior of the compute engine over time. The sampling is done using the GetMemorySize function mentioned above.

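def startCurve(filename, var):
    # Create (or truncate) the curve file and write its header line.
    with open(filename, "wt") as f:
        f.write("# %s\n" % var)

def appendCurve(filename, x, y):
    # Append one (x, y) sample; the file is closed when the block exits.
    with open(filename, "at") as f:
        f.write("%g %g\n" % (x,y))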

def iterate(filename):
    # Sample the engine to create a starting point.
    pa = GetProcessAttributes("engine")
    np = len(pa.memory)
    for i in range(np):
        startCurve(filename + "_memory_rank%d.curve" % i, "size")

    # Iterate over those time steps.
    for i in range(TimeSliderGetNStates()):
        # Advance to time step i so the engine re-executes before sampling.
        SetTimeSliderState(i)

        # Sample the engine again, write out samples.
        pa = GetProcessAttributes("engine")
        for j in range(np):
            appendCurve(filename + "_memory_rank%d.curve" % j, i, pa.memory[j])
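Besides plotting the curve files in VisIt, you may want a quick numeric summary, such as the time step at which a rank reached its peak memory. The sketch below parses the simple two-column format written by the script above ("# name" header lines followed by "x y" pairs). The helper names parse_curve and peak_sample are hypothetical, and the code runs in plain Python without VisIt.

```python
# Sketch: summarize a curve file in the two-column format written above.
# Helper names are hypothetical; no VisIt functions are needed here.

def parse_curve(lines):
    """Turn curve-file lines into a list of (x, y) float pairs."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# name" header lines
        x, y = line.split()[:2]
        samples.append((float(x), float(y)))
    return samples


def peak_sample(samples):
    """Return the (x, y) pair with the largest y, or None if there are none."""
    return max(samples, key=lambda s: s[1]) if samples else None


if __name__ == "__main__":
    demo = ["# size", "0 100", "1 250", "2 180"]
    print(peak_sample(parse_curve(demo)))
```

Because parse_curve accepts any iterable of lines, an open file object can be passed to it directly, for example parse_curve(open("run_memory_rank0.curve")), where the file name is only a placeholder.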