ParallelPorting

From VisItusers.org


This page describes how to make VisIt run in parallel. For information about building in parallel on a Mac, see "Building in parallel on a Mac".

Note that you will almost certainly not be able to use a download from the VisIt website, since your cluster likely has a different version of MPI, a different networking library, etc. In all likelihood you will need to build VisIt from scratch.


The game plan is five steps:

  1. Figuring out how to run a parallel program on your cluster
  2. Compiling VisIt to be parallel enabled
  3. Manually making the engine be launched in parallel
  4. Automating the launch of a parallel engine
  5. Confirming that you actually have a parallel engine running


Figuring out how to run a parallel program on your cluster

This step has nothing to do with VisIt. Most people write a hello world program, like:

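#include <iostream>
#include <mpi.h>
 
int main(int argc, char *argv[])
{
      // Start MPI before making any other MPI calls
      MPI_Init(&argc, &argv);
      // Ask MPI which rank (0 .. N-1) this process has
      int rank = -1;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      std::cerr << "My rank is " << rank << std::endl;
      // Shut MPI down cleanly before exiting
      MPI_Finalize();
      return 0;
}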

Then compile it with something like:

  g++ -I/path/to/mpi/include -o helloworld helloworld.C \
      -L/path/to/mpi/lib -lmpi

And invoke it with something like:

 mpirun -np 4 ./helloworld

or

 qsub -n 4 myscript

where myscript is a batch script that calls mpirun, ibrun, or whatever launcher your system uses.

And get output like:

  My rank is 3
  My rank is 1
  My rank is 0
  My rank is 2

Of course, the details of the compile line and invocation command will likely be very different on your machine. Again, the purpose of this section is to ensure that you understand what the compile line is and what the invocation command is.
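
One quick sanity check on the hello world output is that every rank from 0 to N-1 appears exactly once. If every process prints rank 0, the launcher started N independent serial programs rather than one parallel job; a common cause is an mpirun that does not belong to the MPI you compiled against. The Python sketch below automates that check. The function name check_ranks is made up for this illustration, and it assumes the output has been captured as text with each "My rank is" message intact on its own line.

```python
import re


def check_ranks(output, expected_size):
    """Check captured hello world output from an MPI run.

    Returns (ok, message). ok is True only if each rank
    0 .. expected_size-1 was reported exactly once.
    """
    ranks = sorted(int(r) for r in re.findall(r"My rank is (-?\d+)", output))
    if ranks == list(range(expected_size)):
        return True, "saw each rank 0..%d exactly once" % (expected_size - 1)
    if ranks and all(r == 0 for r in ranks):
        return False, "every process reported rank 0 (separate serial runs?)"
    return False, "unexpected ranks: %s" % ranks


# Sample output copied from the text above; ranks may arrive in any order.
sample = """My rank is 3
My rank is 1
My rank is 0
My rank is 2
"""
ok, message = check_ranks(sample, 4)
print(ok, message)
```

On a real machine you would paste or pipe in the output of your own mpirun or batch job in place of the sample string.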

Compiling VisIt to run in parallel

To compile VisIt to run in parallel, you need to tell the build system where the MPI include files and libraries are located, or at least where to find your mpic++ compiler if you have one. This information is typically provided in a "config-site" file. If you have used "build_visit", a config-site file was created for you. It is located in <VisIt>/src/config-site/<hostname>.cmake and contains the variables that tell VisIt's build system where to find the libraries needed to build VisIt.

Find MPI with mpic++

You can tell VisIt's CMake build system to determine your MPI compilation parameters using the mpic++ compiler, if you wish. This is a fairly automatic way for VisIt to discover how to compile MPI programs on your system, but it does not offer the fine control that setting the parameters explicitly does.

You'll want to add this to your cmake config-site file:

VISIT_OPTION_DEFAULT(VISIT_PARALLEL ON)
## (configured w/ mpi compiler wrapper)
VISIT_OPTION_DEFAULT(VISIT_MPI_COMPILER /path/to/bin/mpic++)

Setting MPI parameters explicitly

To provide the build system with information about your MPI, you can set three special variables in your <VisIt>/src/config-site/<hostname>.cmake file:

  • VISIT_MPI_CXX_FLAGS
  • VISIT_MPI_LD_FLAGS
  • VISIT_MPI_LIBS

Set them as follows:

  • Tell the build system where to find your MPI header files:
  SET(VISIT_MPI_CXX_FLAGS -I/usr/lib/mpi/include)
  • Tell the build system where to find your MPI libraries:
  SET(VISIT_MPI_LD_FLAGS -L/usr/lib/mpi/mpi_gnu/lib)
  • Or, if your system gets MPI libraries from multiple locations:
  SET(VISIT_MPI_LD_FLAGS "-L/usr/lib/mpi/mpi_gnu/lib -L/some/other/path/mpi")
  • Finally, tell the build system which libraries to use:
  SET(VISIT_MPI_LIBS mpi)
  • Or, if you require multiple libraries:
  SET(VISIT_MPI_LIBS mpi_cxx mpi open-rte open-pal)
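
If you are unsure what values to use, many MPI compiler wrappers can print the underlying compile line (for example "mpic++ --showme" with Open MPI or "mpicxx -show" with MPICH; check your wrapper's documentation). The Python sketch below sorts such a line into the three variables. It is only an illustration: the function name mpi_settings_from_wrapper and the sample compile line are made up, it assumes flags are written as single tokens such as -I/path, and it ignores everything else (for example -pthread or -Wl options), so review the result by hand.

```python
import shlex


def mpi_settings_from_wrapper(show_output):
    """Turn a wrapper's printed compile line into config-site SET lines."""
    tokens = shlex.split(show_output)[1:]  # drop the underlying compiler name
    cxx_flags = [t for t in tokens if t.startswith("-I")]   # header paths
    ld_flags = [t for t in tokens if t.startswith("-L")]    # library paths
    libs = [t[2:] for t in tokens if t.startswith("-l")]    # library names
    return [
        'SET(VISIT_MPI_CXX_FLAGS "%s")' % " ".join(cxx_flags),
        'SET(VISIT_MPI_LD_FLAGS "%s")' % " ".join(ld_flags),
        "SET(VISIT_MPI_LIBS %s)" % " ".join(libs),
    ]


# Hypothetical wrapper output, reusing the example paths from the list above.
sample = ("g++ -I/usr/lib/mpi/include -pthread "
          "-L/usr/lib/mpi/mpi_gnu/lib -lmpi_cxx -lmpi")
for line in mpi_settings_from_wrapper(sample):
    print(line)
```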

After you are done, tell CMake to build the parallel version of VisIt in addition to the serial one:

  cmake -DVISIT_PARALLEL:BOOL=ON .

Now you should have both the serial and parallel builds of VisIt.

OpenMPI

OpenMPI likes to build an MPI library that contains C++ language bindings. This can be a problem if OpenMPI was built with one compiler and VisIt is built with another, because of C++ name mangling mismatches. You can force VisIt's build to use only the C language bindings for OpenMPI (all VisIt needs anyway) by adding the following definitions to your src/config-site/<hostname>.cmake configuration file:

VISIT_OPTION_DEFAULT(VISIT_MPI_C_FLAGS "-DOMPI_SKIP_MPICXX")
VISIT_OPTION_DEFAULT(VISIT_MPI_CXX_FLAGS "-DOMPI_SKIP_MPICXX")

Manually making the engine be launched in parallel

If you are running VisIt on your Linux or Mac desktop machine and you used an MPI that provides an mpirun command then you can easily run in parallel by adding -np # to your VisIt command line.

 visit -np 4

If you instead are trying to run on a machine that requires more setup to run a parallel job (such as when you have a batch submission system) then there are two parts to launching the parallel engine:

  1. Figuring out what the right command line is to launch the engine.
  2. Creating a host profile that will create this command line.

Figuring out the right command

This section is about figuring out the right command line to launch the parallel engine.

Invoke VisIt as:

 visit -np 8 -norun engine_par

VisIt will then print out something like:

 RUN USING: /Users/childs3/dev/trunk/src/exe/engine_par -host eng-4-49.hotspot.utah.edu -timeout 480 -norun engine_par -port 5600 -dv -key 20c141c52b7b88ab3d44
 
 Be sure to set your environment variables:
   setenv LD_LIBRARY_PATH /Users/childs3/dev/trunk/src/lib::/usr/local/lib
   setenv LD_LIBRARY32_PATH /Users/childs3/dev/trunk/src/lib:
   setenv LD_LIBRARYN32_PATH /Users/childs3/dev/trunk/src/lib:
   setenv LD_LIBRARY64_PATH /Users/childs3/dev/trunk/src/lib:
   setenv LIBPATH /Users/childs3/dev/trunk/src/lib:
   setenv VISITHOME /Users/childs3/dev/trunk/src
   setenv VISITHELPHOME /Users/childs3/dev/trunk/src/help
   setenv VISITPLUGINDIR :/Users/childs3/.visit/darwin-i386/plugins:/Users/childs3/dev/trunk/src/plugins
   setenv PYTHONHOME /Users/childs3/dev/trunk/src/lib/python
   setenv TRAP_FPE 
   setenv MESA_GLX_FX disable
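
The variables above are printed in csh syntax (setenv). If your shell is bash or another sh-style shell, they need to be rewritten as export statements. The Python sketch below shows one way to do that conversion. The function name setenv_to_export is hypothetical, and it assumes each line has the form "setenv NAME value", where the value may be empty.

```python
import shlex


def setenv_to_export(lines):
    """Convert csh 'setenv NAME value' lines to sh-style export lines."""
    exports = []
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) < 2 or parts[0] != "setenv":
            continue  # not a setenv line; skip it
        name = parts[1]
        value = parts[2].strip() if len(parts) == 3 else ""
        # shlex.quote adds quotes only when the value needs them
        exports.append("export %s=%s" % (name, shlex.quote(value)))
    return exports


# Three of the lines printed above, pasted in as-is.
printed = [
    "   setenv VISITHOME /Users/childs3/dev/trunk/src",
    "   setenv TRAP_FPE ",
    "   setenv MESA_GLX_FX disable",
]
for line in setenv_to_export(printed):
    print(line)
```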

You would then try to get the engine to connect back to the viewer by:

  • Setting up the environment variables as directed
  • Invoking the printed command, modified to match the way your parallel batch system launches jobs.
    • For example, changing
 /Users/childs3/dev/trunk/src/exe/engine_par -host eng-4-49.hotspot.utah.edu -timeout 480 -norun engine_par -port 5600 -dv -key 20c141c52b7b88ab3d44
    • to
 mpirun -np 8 /Users/childs3/dev/trunk/src/exe/engine_par -host eng-4-49.hotspot.utah.edu -timeout 480 -port 5600 -dv -key 20c141c52b7b88ab3d44
    • Note that in this example we have also dropped the "-norun engine_par" from the invocation. Leaving it in causes no problems if you run the executable directly as described here, but on some machines it may be necessary to run the engine using "visit -par -engine" so that our script sets up your environment for you. While this is rare, leaving the "-norun engine_par" on in that case will prevent it from working.
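
The rewrite described above (drop "-norun engine_par", then prefix the launcher) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name parallel_command is made up, the default launcher "mpirun -np 8" is the one from the example, and your batch system may need a different prefix or a wrapper script.

```python
import shlex


def parallel_command(run_using_line, launcher=("mpirun", "-np", "8")):
    """Build a parallel launch command from VisIt's 'RUN USING:' line."""
    text = run_using_line.strip()
    prefix = "RUN USING:"
    if text.startswith(prefix):
        text = text[len(prefix):]
    args = shlex.split(text)
    cleaned = []
    i = 0
    while i < len(args):
        if args[i] == "-norun" and i + 1 < len(args):
            i += 2  # skip "-norun" and the component name that follows it
            continue
        cleaned.append(args[i])
        i += 1
    return " ".join(shlex.quote(a) for a in list(launcher) + cleaned)


# The line printed by "visit -np 8 -norun engine_par" in the example above.
printed = ("RUN USING: /Users/childs3/dev/trunk/src/exe/engine_par "
           "-host eng-4-49.hotspot.utah.edu -timeout 480 -norun engine_par "
           "-port 5600 -dv -key 20c141c52b7b88ab3d44")
print(parallel_command(printed))
```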

Automating the launch of a parallel engine

Once you have successfully launched a parallel engine (manually), the last step is to set up a host profile so that VisIt can launch the engine without manual intervention.

Host profiles are documented on page 304 of the current user's manual (V1.5), in the Host Profiles section of the "Remote Visualization" chapter.

Also, Brock Palen put together an excellent video tutorial on YouTube showing how to set up a host profile [1].

Making sure you actually have a parallel engine

Now that you can start a parallel engine, it is time to make sure that VisIt is really running a parallel job (as opposed to many serial jobs due to an MPI mishap). VisIt can color each block of data by the ID of the processor (MPI task) that processed it. So the blocks that MPI task 0 processed will be colored blue, the blocks from MPI task 1 will be cyan, and so on.

Here is a script that sets up this plot:

# you will need to change this command for your machine
OpenComputeEngine("localhost", ("-l", "qsub/mpirun", "-np", "8", "-nn", "1", "-machinefile", "$PBS_NODEFILE", "-mit-loki")) 
# you will need to change the data location for your machine
OpenDatabase("../data/multi_ucd3d.silo")
DefineScalarExpression("procid", "procid(mesh1)")
AddPlot("Pseudocolor", "procid")
DrawPlots()
SaveWindow()

Running this script should produce an image in which each block is colored according to the processor that handled it. The colors may vary depending on how many processors you use. If you don't run with multi_ucd3d.silo, make sure you do use a multi-block file.

Making your work accessible to all

If you:

  1. Set up host profiles for a parallel machine that you maintain VisIt on
  2. Have other users of this VisIt installation, especially users that connect client/server

Then: You should submit those host profiles back to the VisIt team to be incorporated in the VisIt install routine. Meaning: right now when you run visit-install, it asks which site you want to install for (LLNL, LBNL, ORNL, ANL, etc). So, if you send the VisIt team your host profiles, your site will show up on this list.

Q: Why is this good? A: When users of your supercomputer install the VisIt client, they can choose the configuration for your site, which includes the host profiles. When they fire up VisIt, it will then know about your supercomputer and they can immediately connect to it, without having to manually set up a host profile.

Similarly, you should contribute back the <machine-name>.conf file. That way building VisIt will be "cmake . ; make", meaning you don't have to run build_visit.
