Singularity v3.6

From CMASWIKI

HOST MPI Method

The Singularity container was prepared by Carlie Coats on his machine as root, then packaged as a tar.gz file. Location: /proj/ie/apps/dogwood/singularity

I copied this to /21dayscratch/scr/l/i/lizadams/singularity

I was able to get CMAQ to run using Carlie's Singularity container by modifying the script. I found the instructions here: https://sylabs.io/guides/3.3/user-guide/mpi.html

mpirun -np 16 singularity -d exec \
--bind ${HOSTDATA}:/opt/CMAQ_REPO/data \
${CONTAINER} /opt/CMAQ_REPO/CCTM/scripts/run_cctm_Bench_2016_12SE1.csh
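
To make it easier to rerun the command above with different process counts, here is a minimal Python sketch that assembles the same argument list. The helper name host_mpi_command and its parameters are my own invention for illustration; it only uses the flags that appear in the command above and does not launch anything.

```python
import shlex

# Script path inside the container, copied from the command above.
RUN_SCRIPT = "/opt/CMAQ_REPO/CCTM/scripts/run_cctm_Bench_2016_12SE1.csh"


def host_mpi_command(np_procs, hostdata, container, script=RUN_SCRIPT, debug=True):
    """Build the argv list for the host-MPI launch (hypothetical helper)."""
    argv = ["mpirun", "-np", str(np_procs), "singularity"]
    if debug:
        argv.append("-d")  # singularity debug flag, as in the original command
    # Bind the host input data onto the data directory inside the container.
    argv += ["exec", "--bind", f"{hostdata}:/opt/CMAQ_REPO/data", container, script]
    return argv


cmd = host_mpi_command(
    16,
    "/21dayscratch/scr/l/i/lizadams/CMAQv5.3.1_Benchmark_2Day_Input",
    "/21dayscratch/scr/l/i/lizadams/singularity/cmaq.sif",
)
# shlex.join needs Python 3.8 or newer; it quotes arguments for copy and paste.
print(shlex.join(cmd))
```

The printed line can be pasted into a batch script, or the list can be passed to subprocess.run if you prefer to launch from Python.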

The run_cctm_Bench_2016_12SE1.csh script calls the following script:

/21dayscratch/scr/l/i/lizadams/singularity/cmaq.sif/opt/CMAQ_REPO/CCTM/scripts/run_cctm_Bench_2016_12SE1.csh

I used the following input data, build, and container settings:

set HOSTDATA = /21dayscratch/scr/l/i/lizadams/CMAQv5.3.1_Benchmark_2Day_Input
set HOSTBUILD = /21dayscratch/scr/l/i/lizadams/CMAQv5.3.1
set CONTAINER = /21dayscratch/scr/l/i/lizadams/singularity/cmaq.sif

Log files are available in:

/21dayscratch/scr/l/i/lizadams/singularity/LOGS/


Table of run time

PE Configuration | Wall Time Openmpi | Wall Time Mvapich
1x2pe | CANCELLED after 2:10:00 Friday, July 1, 2016 (exceeded 4 hour queue timelimit) | CANCELLED after 16:30:00 Friday, July 1, 2016 (exceeded 4 hour queue timelimit)
2x4pe | 2718.05 | [cli_3]: write_line error; fd=18 buf=:cmd=init pmi_version=1 pmi_subversion=1
4x2pe | 2718.22 | [cli_3]: write_line error; fd=18
4x4pe | 1430.86 | [cli_3]: write_line error; fd=18
4x8pe | 768.16 | [cli_3]: write_line error; fd=18
8x8pe | UCX ERROR ibv_open_device(mlx5_0) failed: Bad file descriptor | [cli_3]: write_line error; fd=18
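
To get a feel for how the completed Openmpi runs scale, here is a small Python sketch that computes speedup and parallel efficiency relative to the 2x4pe run. It assumes that a configuration written as NxMpe means N times M MPI processes; that reading is my assumption, not something stated in the table. The wall times are copied from the table as given.

```python
# Wall times copied from the Openmpi row of the table (completed runs only).
wall_time = {"2x4pe": 2718.05, "4x2pe": 2718.22, "4x4pe": 1430.86, "4x8pe": 768.16}


def pe_count(config):
    """Assumption: 'NxMpe' means N * M MPI processes."""
    n, m = config[:-2].split("x")  # drop the trailing "pe", then split on "x"
    return int(n) * int(m)


def scaling(times, baseline="2x4pe"):
    """Return (config, PEs, speedup, efficiency) relative to the baseline run."""
    base_time = times[baseline]
    base_pes = pe_count(baseline)
    rows = []
    for config, t in times.items():
        speedup = base_time / t
        # Efficiency: achieved speedup divided by the increase in PE count.
        efficiency = speedup / (pe_count(config) / base_pes)
        rows.append((config, pe_count(config), speedup, efficiency))
    return rows


for config, pes, speedup, eff in scaling(wall_time):
    print(f"{config:6s} {pes:3d} PEs  speedup {speedup:5.2f}  efficiency {eff:5.2f}")
```

Under that assumption, going from 8 to 32 processes speeds the run up by roughly 3.5 times, which is somewhat less than the ideal factor of 4.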

Container MPI Method


To get mpirun working inside the container, the key was to give the full path to mpirun in the run script that lives inside the container.

   #> Executable call for multi PE, configure for your system
   set MPI = /usr/lib64/openmpi3/bin
   set MPIRUN = $MPI/mpirun
   ( /usr/bin/time -p $MPIRUN -np 8 $BLD/$EXEC ) |& tee buff_${EXECUTION_ID}.txt

To get day 2 of the run to work, I also had to convert the YESTERDAY date from Julian date format to Gregorian date format, as the Gregorian form is what is available.

   #> Calculate Yesterday's Date
   set YESTERDAY = `julshift $YYYYJJJ -1`
   set YESTERDAYG = `jul2greg ${YESTERDAY}`
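
For anyone who wants to check the date arithmetic by hand, here is a minimal Python sketch of what the two lines above compute. The functions julshift and jul2greg below are stand-ins that I wrote for illustration, not the actual command-line tools, and I am assuming that the tools take a YYYYJJJ date and that jul2greg returns YYYYMMDD.

```python
from datetime import datetime, timedelta


def julshift(yyyyjjj, days):
    """Shift a YYYYJJJ date by a number of days (stand-in for the julshift tool)."""
    shifted = datetime.strptime(str(yyyyjjj), "%Y%j") + timedelta(days=days)
    return shifted.strftime("%Y%j")


def jul2greg(yyyyjjj):
    """Convert YYYYJJJ to YYYYMMDD (assumed output format of jul2greg)."""
    return datetime.strptime(str(yyyyjjj), "%Y%j").strftime("%Y%m%d")


# Example value only: day 183 of 2016, a leap year.
yyyyjjj = "2016183"
yesterday = julshift(yyyyjjj, -1)
yesterday_g = jul2greg(yesterday)
print(yesterday, yesterday_g)
```

Going through datetime handles leap years and year boundaries, so a run starting on day 001 steps back correctly to December 31 of the previous year.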

Note that I am using the following script:

  /proj/ie/proj/CMAS/Singularity/Scripts-BATCH/cmaq_cctm.csh

I also added the restart environment variable when I needed to start the second day.

  setenv SINGULARITYENV_NEW_START     FALSE
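
Singularity passes host variables named SINGULARITYENV_X into the container as X, so the setting above shows up inside the container as NEW_START. Here is a simplified Python model of that prefix handling, just to show the mapping. The function container_env is my own illustration, not Singularity code, and it ignores everything else Singularity does with the environment.

```python
PREFIX = "SINGULARITYENV_"


def container_env(host_env, prefix=PREFIX):
    """Model only the prefix rule: SINGULARITYENV_X on the host becomes X inside."""
    return {
        name[len(prefix):]: value
        for name, value in host_env.items()
        if name.startswith(prefix) and len(name) > len(prefix)
    }


# Example host environment; HOME is included to show that unprefixed
# variables are left out of this simplified model.
host = {"SINGULARITYENV_NEW_START": "FALSE", "HOME": "/home/user"}
print(container_env(host))
```

So the run script inside the container sees NEW_START set to FALSE, which is what I wanted for restarting on the second day.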