Cholla-PM Tests on Summit

Brant Robertson bio photo By Brant Robertson

Cholla-PM Tests on Summit

This website documents the procedure for performing the Cholla-PM tests on Summit.

Changes to Cholla-PM for Summit Tests

Currently, we are lacking an initial conditions generator for Cholla for tests at scale. The issue is that we have been using MUSIC, which does not use MPI and therefore is limited to a single system memory.

This issue is compounded by the need to run on arbitrary even numbers of processes without keeping a cubic domain. We can subdivide a single cubic initial conditions, but that would require a large number of particles and a reduced timestep (for the hydro).

The plan instead is to simply replicate either a 128^3 or 256^3 box onto every process and use a computational domain that maps onto the domain of mpi processes determined via the usual functions in Cholla. This would make rectangular domains.

I have gone through the code and added “TILING” preprocessor definitions that alter the behavior of the code. Basically I have each process read in a single snapshot + particle file and replicate it locally, shifting appropriately for the processor location within the domain. This also requires simlinks to be created for each process to the single snapshot files, in the “0.h5.?” and “0_part.h5.?” formats.

Installation and Tests on Sparkle

First, I installed and tested Cholla-PM on sparkle. Here is the procedure

Get Cholla-PM

git clone https://github.com/bvillasen/cholla.git
cd cholla
git checkout -b particles
git pull origin particles

Note we will build from the cholla-pm.tiling.tar.gz tarball on Summit.

Install FFTW-3.3

wget http://www.fftw.org/fftw-3.3.7.tar.gz
tar -zxvf fftw-3.3.7.tar.gz
cd fftw-3.3.7/
./bootstrap.sh
./configure --enable-mpi --enable-openmp --enable-threads --disable-shared (Add a path)
make -j 20
make install

Install PFFT

unzip pfft-master.zip
cd pfft-master
autoreconf --install
./configure --disable-fortran (Add a path)
make -j 20
make install
./configure --enable-openmp --disable-fortran --disable-shared (Add a path, remake with openmp)
make -j 20
make install

Install HDF5

wget https://support.hdfgroup.org/ftp/HDF5/current18/src/hdf5-1.8.20.tar.gz
tar -zxvf hdf5-1.8.20.tar.gz
cd hdf5-1.8.20/
./configure --enable-cxx --disable-shared (Add a path)
make -j 20
make install

Compile Cholla-PM

On sparkle, I’m using /home/brant/github/bruno/makefile_tiling. Note we require libz when compiling hdf5 with –disable-shared. And some of the compiling sequences needed to be changed.

Run simple tests

I have run simple tests with the set of 256^3 ICs that Bruno provided, using 2 and 4 processes on sparkle. The weak scaling is fine, with each step taking about 5 seconds.

Installation and Tests on Summit

Below I detail the process for installing cholla-pm and running tests on Summit.

Connecting to Summit

There is connection information on the Summit website.

Currently, you have to connect to an internal OLCF system first, via, e.g., home.ccs.ornl.gov. Connection passcode is pin + RSA key. Note that summit has a different address, summit.olcf.ornl.gov.

Source code location

/ccs/proj/csc275/brantr/cholla-pm/cholla

Also copied at:

/ccs/home/brantr/code/cholla-pm.summit_scaling_tests

Modules for compilation

module load cuda
module load hdf5
module load spectrum-mpi

Compiling FFTW

Well, the FFTW module on Summit does not have mpi enabled! So we have a harder route.

wget http://www.fftw.org/fftw-3.3.7.tar.gz
tar -zxvf fftw-3.3.7.tar.gz
cd fftw-3.3.7/
./bootstrap.sh
./configure --enable-mpi --enable-openmp --enable-threads --disable-shared --prefix=/ccs/home/brantr/code/fftw
make -j 20
make install

Compiling PFFT

First, load modules. Then:

cd ~/code/pfft
unzip pfft-master.zip
cd pfft-master
./bootstrap.sh
./configure --disable-fortran --disable-shared --with-fftw3=/ccs/home/brantr/code/fftw --prefix=/ccs/home/brantr/code/pfft
make -j 20
make install

Compiling Cholla on Summit

I used the makefile at:

/ccs/home/brantr/code/cholla-pm.summit_scaling_tests/makefile_tiling_summit

which is also at:

/ccs/proj/csc275/brantr/cholla-pm/cholla/makefile_tiling_summit

I had to change N_OMP_THREADS to 6 in global.h.

Reminders about LSF

To submit a job:

bsub scaling.lsf

To check on your jobs

bjobs

To kill a job

bkill [jobid]

Running Cholla-PM on Summit

The tests were run out of the directory:

/ccs/proj/csc275/brantr/projwork

This directory contained the cholla-mp executable, the cosmo_tiling.txt file:

##########################################
# number of grid cells in the x dimension
nx=256
# number of grid cells in the y dimension
ny=256
# number of grid cells in the z dimension
nz=256
# output time
tout=10000000000
# how often to output
outstep=10000
times_output=output_list.txt
# value of gamma
gamma=1.66666667
# name of initial conditions
init=Read_Grid
nfile=0
n_parts_initFiles=1
time_max=1000000000
# domain properties
xmin=0.0
ymin=0.0
zmin=0.0
xlen=115000.0
ylen=115000.0
zlen=115000.0
# type of boundary conditions
xl_bcnd=1
xu_bcnd=1
yl_bcnd=1
yu_bcnd=1
zl_bcnd=1
zu_bcnd=1
outdir=./dat/
indir=./ics/ics_256/

The output_list.txt file:

1.000000000000000000e+00

There was a subdirectory ics/ics_256/, which contained the 0.h5 and 0_parts.h5 files. I created symlinks for the process ICs using the following script

#!/bin/bash
I=$1
while [ $I -lt $(($2+1)) ]; do
    $BASE `printf "ln -s 0.h5 0.h5.%d" $I`
    $BASE `printf "ln -s 0_parts.h5 0_parts.h5.%d" $I`
    I=$(($I+1))
done

An ICs symlink needs to be made for each process.

To run the test, I used the following LSF script:

#!/bin/bash
#BSUB -P CSC275robertson
#BSUB -W 0:30
#BSUB -nnodes 1
#BSUB -alloc_flags gpumps
#BSUB -J scaling_6
#BSUB -o scaling_6.%J
#BSUB -e scaling_6.%J

module load cuda
module load hdf5
module load spectrum-mpi

date

cd /ccs/proj/csc275/brantr/projwork
export OMP_NUM_THREADS=6
jsrun -n6 -r6 -a1 -g1 -c6 -b packed:6 -d packed -l GPU-CPU ./cholla-pm cosmo_tiling.txt
mv tiling_timing.txt tiling_timing.6.txt