Distributed Coupling-Mode Decomposed-AQM Modeling System

Contents


Summary

The Distributed Coupling-Mode Decomposed-AQM Modeling System uses a cooperating system of "normal" air quality models on rectangular subdomains, together with the programs aqmmaster and metserver. This approach may be thought of as a special case of the nesting approach used by the real-time MAQSIP air quality model: each nest grid has a single air quality model working on just that grid; these AQMs cooperate as a coupled parallel system, with the coarser grids providing time-dependent boundary conditions to the finer grids nested within them, and with aggregated concentrations from the finer grids potentially being fed back to the coarser grids over the regions that the finer grids cover.

The modeling process begins by decomposing the domain into rectangular subdomains that overlap properly, and then putting the descriptions of these subdomains into an I/O API-standard GRIDDESC file. The metserver program reads the full-domain grid-geometry, emissions, and meteorology files, and from them constructs windowed grid-geometry, emissions, and meteorology files for each subdomain in the decomposition. At every advection step, the aqmmaster program assembles full-domain concentrations from the outputs of the subdomain air quality models and provides time-stepped boundary conditions back to them, as well as producing full-domain concentration outputs. The whole system is tied together with the coupling-mode extensions of the Models-3 I/O API to perform distributed parallel domain-decomposed air quality modeling across a set of machines.

Notice that all of the scheduling, coordination, assembly, and extraction activities are managed by the aqmmaster program, so that the subdomain air quality models are unmodified (except for linking with the coupling-mode version of the I/O API library and the PVM library, in addition to the usual netCDF library). The source code of the AQM is unaffected by this cooperating-process parallelism. No more work writing schedulers, boundary-extractors, etc., needs to be done by the modeler!

Both aqmmaster and metserver are Fortran-90 dynamically-sized programs that adapt at run time to the sets of met and chemistry variables being modeled, and to the grids being run. They are also basically independent of the AQM being run, as long as the AQM uses the Models-3 I/O API for input and output, uses the basic Models-3 scheme for meteorology file types, and avoids deadlocks, and as long as the gridded met files have all the variables necessary for windowing to produce the subdomain boundary met files.

Back to Contents


Domain Decomposition

Example of a Domain Decomposition

In this example, we begin with a grid having 18 rows and 24 columns. We decompose this domain into three subdomains, as listed below; for subdomain modeling purposes, each subdomain will be extended by a one-cell "halo" along each internal boundary.
  1. Subdomain 1: rows 1-18, columns 1-8
  2. Subdomain 2: rows 1-9, columns 9-24
  3. Subdomain 3: rows 10-18, columns 9-24

The reason for requiring this halo is that in all existing CMAQ and MAQSIP implementations there are errors in the implementation of thickened-boundary advection. If these errors were corrected, then the halos (and the computational overhead that goes with them) would no longer be necessary.

The portion of Subdomain 1 that is actually used to generate concentration field output is as described above; however, in order to preserve the full order accuracy of the horizontal advection numerics, the air quality model for Subdomain 1 actually models a region with a "halo" one column wider (and extending into Subdomains 2 and 3). The boundary for this subdomain then "lives" partly on column 10 of the original full domain. It is the responsibility of the aqmmaster program to gather rows 1-18, columns 1-8 from Subdomain 1 for the full-domain concentration output, and to provide column 10 as part of the (time-dependent) boundary values for Subdomain 1.

Similarly, the air quality model for Subdomain 2 actually models a 10-row by 17-column region with an L-shaped halo; the boundary for Subdomain 2 includes portions of column 7 and row 11, as illustrated below. Subdomain 3 is left as an exercise for the reader :-)
[Figure: Subdomain 2, showing its L-shaped halo and boundary]
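The decomposition itself is communicated to the programs through the I/O API-standard GRIDDESC file mentioned above (accessed, by I/O API convention, through the logical name GRIDDESC). As a purely illustrative sketch, the grid-description records for this decomposition might look like the following; the coordinate-system name 'LAM_HYPOTHETICAL', the grid names, and the origin and cell-size values are hypothetical placeholders (12-km cells, with the full-domain origin placed at (-144 km, -108 km)), the coordinate-system definition and segment delimiters are omitted, and the exact record layout should be checked against the I/O API GRIDDESC documentation:

    'FULL_24X18'            !  full domain:  24 columns, 18 rows
    'LAM_HYPOTHETICAL', -144000.0, -108000.0, 12000.0, 12000.0, 24, 18, 1
    'SUB1_18X8'             !  Subdomain 1:  rows 1-18, columns 1-8
    'LAM_HYPOTHETICAL', -144000.0, -108000.0, 12000.0, 12000.0,  8, 18, 1
    'SUB2_9X16'             !  Subdomain 2:  rows 1-9,   columns 9-24
    'LAM_HYPOTHETICAL',  -48000.0, -108000.0, 12000.0, 12000.0, 16,  9, 1
    'SUB3_9X16'             !  Subdomain 3:  rows 10-18, columns 9-24
    'LAM_HYPOTHETICAL',  -48000.0,       0.0, 12000.0, 12000.0, 16,  9, 1

Each grid record gives the coordinate-system name, XORIG, YORIG, XCELL, YCELL, NCOLS, NROWS, and NTHIK; whether the halo-extended subdomain grids appear as additional entries depends on how the subdomain AQMs themselves are configured.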

Back to Contents


Setting up Coupled-Mode Cooperative Modeling

The new Coupling Mode of the Models-3 I/O API was developed as part of the MCNC Practical Parallel Computing Strategies Project, which was partially funded by the National Center for Environmental Research and Quality Assurance, Office of Research and Development, U.S. Environmental Protection Agency, under the Science To Achieve Results Grants Program in high performance computing and communications.

The basic idea was that by changing the low-level data storage layer in the I/O API so that it had an alternative communications-based implementation in addition to the existing (netCDF-based) file-storage implementation, one could use existing single-topic models to build more complex cooperating-process, multi-topic coupled modeling systems, with the choice of storage mode made at program launch on the basis of environment variables. Moreover, the individual single-topic models would not "know" (nor would they need to know!) at the source code level whether they were running stand-alone or as part of a coupled modeling system. The only requirement for such coupled modeling was that input operations (I/O API OPEN3() for input files, READ3(), INTERP3(), XTRACT3(), and DDTVAR3()) should block when they request data that is not yet available (i.e., they should put the requester to sleep until the data's producer writes it out, and then wake up the requester and allow it to continue). This is made possible by the selective direct-access nature of I/O API calls, and in fact was one of the original design goals of the Models-3 system.
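As a concrete illustration of this launch-time choice (the logical name CHEM_CONC_3D and the path and channel names below are hypothetical placeholders), the same unmodified executable can be pointed either at a physical netCDF file or at a coupling-mode virtual file simply by changing one setenv in its run script:

    #  stand-alone run:  logical name bound to a physical netCDF file
    setenv CHEM_CONC_3D   /data/run01/CONC.full.ncf

    #  coupled run:  the same logical name bound to a PVM-mailbox virtual file
    setenv CHEM_CONC_3D   "virtual CONC_FULL"

The model's source code sees only the logical name CHEM_CONC_3D in either case.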

For the particular case of domain-decomposed air quality modeling, the way this works is described in the sections on aqmmaster and metserver below.

Back to Contents


Program aqmmaster

Description

The sequence of operations for aqmmaster is as follows:
  1. Read in all the control parameters (starting date, starting time, etc.), as given in the Environment Variables section below.
  2. Open the full-domain chemical initial and boundary condition files for input. Note that this determines both the full-domain grid structure and the set of chemical species that will be modeled.
  3. Create/open all the subdomain chemical initial and boundary condition files for output.
  4. Open the subdomain chemical concentration files for input.
  5. For every subdomain and for every chemical species:
    1. Window the full-domain concentrations to the subdomain grid.
    2. Write the subdomain concentration grid to the subdomain chemical initial condition file.
  6. For every time step:
    1. For every subdomain and for every chemical species:
      1. Read in the subdomain concentration grid from the subdomain concentration file.
      2. Aggregate the subdomain concentrations into the full-domain concentration grid.
      3. Construct the subdomain boundary concentrations.
      4. Write the subdomain boundary concentrations to the subdomain chemical boundary condition file.
    2. Write the assembled full-domain concentration grid to the full-domain concentration output file.
Notice that the order of operations is carefully laid out so as to avoid deadlocks, and so as to allow aqmmaster to act as a component in a cooperating-process implementation of a nested AQM. It is also laid out so as to allow coupled aqmmaster and metserver to operate within a cooperating-process real-time environmental modeling system with additional meteorological and emissions model components (avoiding race conditions in such a system, particularly, is the role of the optional SYNCH_FILE).
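As a purely illustrative sketch of how such a run might be configured (every logical name, path, and channel name below is a hypothetical placeholder, not necessarily a name that aqmmaster actually expects), the launch script binds the full-domain files to physical files and the per-subdomain files to coupling-mode virtual files:

    #  hypothetical aqmmaster launch sketch -- all names are placeholders
    setenv GRIDDESC        /data/run01/GRIDDESC.txt    #  full- and sub-domain grid definitions
    setenv CONC_INIT_FULL  /data/run01/ICON.full.ncf   #  full-domain initial conditions (physical)
    setenv CONC_BDY_FULL   /data/run01/BCON.full.ncf   #  full-domain boundary conditions (physical)
    setenv CONC_FULL       /data/run01/CONC.full.ncf   #  full-domain concentration output (physical)
    setenv CONC_INIT_1     "virtual ICON_SUB1"         #  subdomain 1 initial conditions (virtual)
    setenv CONC_BDY_1      "virtual BCON_SUB1"         #  subdomain 1 boundary conditions (virtual)
    setenv CONC_SUB_1      "virtual CONC_SUB1"         #  subdomain 1 concentrations (virtual)
    #  ...similarly for subdomains 2 and 3, plus the date/time control variables...
    aqmmaster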

Required Environment Variables

The execution of program aqmmaster is controlled purely by environment variables, for easy scriptability. Some of these variables are control parameters; others are logical-name environment variables for the input and output files, which contain the path names for the respective files, according to Models-3 conventions. These environment variables may be set by the csh setenv command, or by the sh or ksh export or env commands. The list of environment variables for aqmmaster is the following.

Back to Contents


Program metserver

Description

The sequence of operations for metserver is as follows:
  1. Read in all the control parameters (starting date, starting time, etc.), as given in the Environment Variables section below. Options include turning on or off each of the families of files below; note that the sets of variables in each input file determine, at run time, the sets of variables in the corresponding output subdomain files:
    1. CHEM_EMIS_3D
    2. GRID_BDY_2D
    3. GRID_BDY_3D
    4. GRID_CRO_2D
    5. GRID_CRO_3D
    6. GRID_DOT_2D
    7. MET_BDY_2D
    8. MET_BDY_3D
    9. MET_CRO_2D
    10. MET_CRO_3D
    10. MET_DOT_3D
    12. MET_KF_2D
    13. MET_KF_3D
  2. Perform consistency checks:
    1. Subdomain grids fit together correctly to form the full domain grid.
    2. If both are being produced, the time step for the CHEM_EMIS_3D file must be an exact multiple of, or exactly the same as, the met time step. If the met files are not being produced, set the met time step artificially to the emissions time step, to allow the deadlock-free interleaved processing algorithm described below.
    3. If GRID_BDY_2D is turned on, then GRID_CRO_2D must be available and must contain the needed variables;
    4. If GRID_BDY_3D is turned on, then GRID_CRO_3D must be available and must contain the needed variables;
    5. If MET_BDY_2D is turned on, then MET_CRO_2D must be available and must contain the needed variables;
    6. If MET_BDY_3D is turned on, then MET_CRO_3D must be available and must contain the needed variables;
  3. If the GRID_*_2D files are being produced, then for each variable within them:
    1. Read the grid and boundary values of the variable from the input files, as appropriate (from both, if it is a boundary variable; from the gridded file only, if it is a gridded-only variable);
    2. If the variable is a boundary variable, construct an "expanded domain" grid of that variable (including both the boundary and the cross-point-grid cells).
    3. For each subdomain:
      1. If the variable is a boundary variable, extract/construct the subdomain boundary values from the "expanded domain" grid and write them to the subdomain boundary file.
      2. Extract the subdomain cross-point gridded values from either the full-domain or the "expanded domain" grid (as appropriate), and write them to the subdomain cross-point gridded file.
  4. Similarly for the GRID_*_3D files.
  5. For each output met-file time step:
    1. If the MET_*_2D files are being produced, then for each variable within them:
      1. Read the grid and boundary values of the variable from the input files, as appropriate (from both, if it is a boundary variable; from the gridded file only, if it is a gridded-only variable);
      2. If the variable is a boundary variable, construct an "expanded domain" grid of that variable (including both the boundary and the cross-point-grid cells).
      3. For each subdomain:
        1. If the variable is a boundary variable, extract/construct the subdomain boundary values from the "expanded domain" grid and write them to the subdomain boundary file.
        2. Extract the subdomain cross-point gridded values from either the full-domain or the "expanded domain" grid (as appropriate), and write them to the subdomain cross-point gridded file.

      Note about KF Files: the MET_KF_* files are always physical files (not virtual) and are written by MM5 *before* the first write to any MET_CRO* file; in the AQM, they are read after several reads from the MET_CRO* files. Sandwiching MET_KF_* processing between MET_CRO_2D and MET_CRO_3D processing guarantees synchronization in coupling-mode operation. Note also that for the first time-step iteration, we must be careful to "capture" all events currently in progress.

    2. If the MET_KF_2D file is being produced, then window it and write the result to the subdomain files.
    3. Similarly for the MET_KF_3D files.
    4. Process the MET_CRO_3D and MET_BDY_3D files in the same fashion as the MET_CRO_2D and MET_BDY_2D files.
    5. Similarly for the MET_DOT_3D files.
    6. If the CHEM_EMIS_3D file is being produced, for each emissions variable:
      1. Read the gridded values of the variable from the input file.
      2. Extract the subdomain cross-point gridded values from the full-domain grid, and write them to the subdomain cross-point gridded file.
Notice that the order of operations is carefully interleaved so as to avoid deadlocks in a cooperating process environmental modeling system, and so as to allow metserver to act as a component in a cooperating-process implementation that includes concurrent meteorological and emissions models that generate the full-domain inputs to the distributed air quality model.
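As a purely illustrative sketch (every logical name, path, and channel name below is a hypothetical placeholder suggested by the file families listed above, not necessarily a name that metserver actually expects), a coupled run might bind the full-domain inputs to physical files and the subdomain outputs to virtual files:

    #  hypothetical metserver launch sketch -- all names are placeholders
    setenv GRIDDESC       /data/run01/GRIDDESC.txt
    setenv MET_CRO_2D     /data/met/METCRO2D.full.ncf   #  full-domain met inputs (physical)
    setenv MET_CRO_3D     /data/met/METCRO3D.full.ncf
    setenv MET_DOT_3D     /data/met/METDOT3D.full.ncf
    setenv CHEM_EMIS_3D   /data/emis/EMIS3D.full.ncf    #  full-domain emissions input (physical)
    setenv MET_CRO_2D_1   "virtual METCRO2D_SUB1"       #  subdomain 1 outputs (virtual)
    setenv MET_BDY_2D_1   "virtual METBDY2D_SUB1"
    setenv CHEM_EMIS_3D_1 "virtual EMIS3D_SUB1"
    #  ...similarly for the remaining file families and for subdomains 2 and 3...
    metserver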

Required Environment Variables

Execution of metserver is completely controlled by environment variables, for easy scriptability. These may be set by the csh setenv command, or by the sh or ksh export or env commands. The list of environment variables for metserver is the following.

Back to Contents


I/O API Coupling Mode

As part of the Practical Parallel Project, MCNC has developed an extended Model Coupling Mode for the I/O API. This mode, implemented using PVM 3.4 mailboxes, allows the user to specify in the run-script whether "file" means a physical file on disk or a PVM mailbox-based communications channel (a virtual file), on the basis of the value of the file's logical name:

    setenv FOO                "virtual BAR"
    setenv IOAPI_KEEP_NSTEPS  3
    
declares that FOO is the logical name of a virtual file whose physical name (in terms of PVM mailbox names) is BAR. The additional environment variable IOAPI_KEEP_NSTEPS determines the number of time steps to keep in the PVM mailbox buffers -- if it is 3 (as here), and there are already 3 time steps of variable QUX in the mailboxes for virtual file FOO, then writing a fourth time step of QUX to FOO causes the earliest time step of QUX to be erased, leaving only time steps 2, 3, and 4. This is necessary so that the coupled modeling system does not require an unbounded amount of memory for its sustained operation. If not set, IOAPI_KEEP_NSTEPS defaults to 2 (the minimum needed to support INTERP3()'s double-buffering).

The (UNIX) environments in which the modeler launches the multiple models that read or write a given virtual file must all agree on its physical name (usually achieved by having each run script source a common csh script that contains the relevant setenv commands).
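For example (the channel names below are hypothetical placeholders), the agreement can be kept in one place by putting the shared definitions into a single script that every run script sources before launching its model:

    #  coupling_setup.csh -- sourced by the metserver, aqmmaster, and subdomain-AQM run scripts
    setenv IOAPI_KEEP_NSTEPS  3
    setenv MET_CRO_2D_1       "virtual METCRO2D_SUB1"   #  same mailbox name in every process
    setenv CONC_SUB_1         "virtual CONC_SUB1"
    #  ...one setenv per shared virtual file...

Each run script then begins with "source coupling_setup.csh", so that the producer and all consumers of a given virtual file agree on its PVM mailbox name.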

For models exchanging data via virtual files of the I/O API's coupling mode, the I/O API schedules the various processes on the basis of data availability: a process that requests data not yet written to a virtual file is put to sleep until the producing process writes that data, and is then awakened and allowed to continue.

There are three requirements on the modeler:

Using coupling mode to construct complex modeling systems has several advantages from the model-engineering point of view:

Back to Contents


Previous: MCPL I/O API output module for MM5

Next: AIRS2M3 Program

Up: I/O API Related Programs

To: Models-3/EDSS I/O API: The Help Pages


Send comments to
Carlie J. Coats, Jr.
carlie@jyarborough.com