PPP: I/O API Extensions for Coupled Models

Under Construction!!

I/O API: Paradigm

The EDSS/Models-3 I/O API is a library of routines which provide selective direct access to modeling data in modeler-oriented terms. For example, the read-and-interpolate operation is typified by the call
INTERP3( 'foo', 'bar', 1, 1988200, 120000, NCOLS*NROWS, BAR )
which says, in ordinary English,
Read the layer-1, year 1988, day 200, time 12:00:00 slice of the variable named "bar" from the file with logical name [1] "foo" and put it into array BAR having extent NCOLS*NROWS (the extent being used for error checking). NOTE: INTERP3() takes care of disk-access optimization and double buffering behind the scenes at run-time.
This paradigm of operation greatly reduces modeling complexity (as compared to models built around sequential files), since a model does not need to know the detailed structure of a file it reads (e.g., the exact list of variables in the file, their order, or even the precise time step structure). It does need to know the names, types, and dimensionality of the variables it wants to read, and it must deal with the error flags returned by requests that cannot be fulfilled. EPA's Models-3 and NCSC's MAQSIP models take advantage of this flexibility in order to construct "plug-compatible interchangeable-part" modules implementing the various relevant atmospheric processes [2]. Notice that this kind of selective direct-access I/O also makes for models that are more robust and less "brittle": if a new variable is added to a file, only those modules which write or read the new variable need to change; all others are unaffected.
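
The date and time arguments in the INTERP3() call above pack the year and day-of-year into one integer (1988200) and the hour, minute, and second into another (120000). The following is a minimal Python sketch of that decoding, assuming the YYYYDDD and HHMMSS layouts implied by the English reading of the call. The helper decode_jdate_jtime is hypothetical and is not part of the I/O API.

```python
from datetime import datetime, timedelta


def decode_jdate_jtime(jdate: int, jtime: int) -> datetime:
    """Decode a YYYYDDD date and HHMMSS time into a datetime.

    Hypothetical helper for illustration; not an I/O API routine.
    """
    year, day_of_year = divmod(jdate, 1000)      # 1988200 -> (1988, 200)
    hours, rest = divmod(jtime, 10000)           # 120000  -> (12, 0)
    minutes, seconds = divmod(rest, 100)
    # Day 1 is January 1, so add (day_of_year - 1) days to the start of the year.
    return datetime(year, 1, 1) + timedelta(
        days=day_of_year - 1, hours=hours, minutes=minutes, seconds=seconds
    )


stamp = decode_jdate_jtime(1988200, 120000)
print(stamp.isoformat())
```

Because 1988 is a leap year, day 200 falls on July 18.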

Model Coupling: Concept

Just as a model's access to disk files should not demand intimately detailed internal knowledge of the structure of the file or of its writer, coupled models composed of cooperating concurrent processes should not require that receivers possess intimately detailed internal knowledge of senders (and vice versa). Ideally, a model should use the same facilities for disk I/O and for virtual files used for model coupling, and should not "need" to know whether its inputs are coming from disk files or from other models coupled to it. Synchronization and scheduling should be done on the basis of data availability. In order for this to work, the following properties are required for model-coupling mode of the I/O API, and of the models that use it:

Model Coupling: Examples


Model Coupling: Implementation

Our implementation is built on top of "mailboxes" in PVM 3.4. There is a new library, libclp.a, that goes ahead of libioapi.a in the compile-and-link command line for producing programs, and which then overrides the standard file-oriented OPEN3(), READ3(), etc., with new polymorphic versions that can deal with both physical files and virtual files.

At run-time, the new OPEN3() looks at the value of its logical-name [1] argument to see whether the request is for a virtual file or for a physical file (and in the latter case, what physical path-name the file has). For virtual files opened for writing, it then creates a PVM mailbox containing the virtual file's description (number, names and types of variables, dimensionality, grid description, etc.)
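
The virtual-versus-physical decision can be sketched in Python as follows. This is only an illustration of the "VIRTUAL <file-name>" convention described in footnote 1; resolve_logical_name is a hypothetical helper, and the real OPEN3() is a library routine whose internals are not shown here.

```python
import os


def resolve_logical_name(logical_name, env=None):
    """Classify a logical name as a virtual or a physical file.

    Hypothetical helper: returns ("virtual", file_name) when the environment
    value has the form "VIRTUAL <file-name>", else ("physical", path).
    """
    env = os.environ if env is None else env
    value = env.get(logical_name)
    if value is None:
        raise KeyError(f"logical name {logical_name!r} is not set")
    parts = value.split(None, 1)  # split off the first word, if any
    if len(parts) == 2 and parts[0] == "VIRTUAL":
        return ("virtual", parts[1].strip())
    return ("physical", value)


# Example environment; both names and values are made up for illustration.
example_env = {"foo": "VIRTUAL met_to_chem", "bar": "/data/met.nc"}
print(resolve_logical_name("foo", example_env))
print(resolve_logical_name("bar", example_env))
```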

For a virtual file, WRITE3() allocates a buffer for the requested time step of the requested variable in the file's mailbox, puts the data into the buffer and labels it by variable-name, date, and time, and then cleans up any outstanding buffers for that variable that are more than (a default) 2 time steps old.
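
The write-then-prune behavior can be modeled with a small in-memory Python sketch. It does not use PVM; VirtualFileBuffers is a hypothetical class, and the assumption that "age" is measured in multiples of a fixed file time step is ours.

```python
from datetime import datetime, timedelta


class VirtualFileBuffers:
    """Toy stand-in for a virtual file's mailbox (not PVM)."""

    def __init__(self, time_step, max_age_steps=2):
        self.time_step = time_step
        self.max_age_steps = max_age_steps   # default of 2, as in the text
        self.buffers = {}                    # (variable, timestamp) -> data

    def write(self, variable, when, data):
        # Store the buffer under its (variable, date-and-time) label.
        self.buffers[(variable, when)] = list(data)
        # Drop buffers for this variable more than max_age_steps steps old.
        cutoff = when - self.max_age_steps * self.time_step
        stale = [key for key in self.buffers
                 if key[0] == variable and key[1] < cutoff]
        for key in stale:
            del self.buffers[key]


box = VirtualFileBuffers(time_step=timedelta(hours=1))
start = datetime(1988, 7, 18, 12)
for step in range(5):                        # hours 12, 13, 14, 15, 16
    box.write("bar", start + step * timedelta(hours=1), [float(step)])
print(sorted(when.hour for (_, when) in box.buffers))
```

After the fifth write, only the current time step and the two before it remain for the variable.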

For a virtual file, READ3(), INTERP3(), and XTRACT3() construct a label from the requested variable-name, date, and time, and request a copy of the corresponding buffer from the file's mailbox. If a buffer with the corresponding label exists, PVM copies it into the requester's output argument; otherwise, PVM puts the requesting process to sleep until such a labelled buffer is written by some other process.
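
The sleep-until-available semantics of a read can be illustrated with Python threads standing in for cooperating processes. This is a sketch under stated assumptions, not the PVM mechanism: the Mailbox class is hypothetical, and the timeout argument is our addition so that the example cannot hang.

```python
import threading


class Mailbox:
    """Toy labelled-buffer store with blocking reads (not PVM)."""

    def __init__(self):
        self._buffers = {}
        self._cond = threading.Condition()

    def write(self, variable, jdate, jtime, data):
        with self._cond:
            self._buffers[(variable, jdate, jtime)] = list(data)
            self._cond.notify_all()          # wake any sleeping readers

    def read(self, variable, jdate, jtime, timeout=None):
        label = (variable, jdate, jtime)
        with self._cond:
            # Sleep until a buffer with this label has been written.
            found = self._cond.wait_for(lambda: label in self._buffers, timeout)
            if not found:
                raise TimeoutError(f"no buffer for {label}")
            return list(self._buffers[label])  # the reader gets a copy


mailbox = Mailbox()
received = []
reader = threading.Thread(
    target=lambda: received.append(mailbox.read("bar", 1988200, 120000, timeout=5.0))
)
reader.start()                               # may block until the write below
mailbox.write("bar", 1988200, 120000, [1.0, 2.0])
reader.join()
print(received)
```

The reader obtains the same data whether it starts before or after the writer, which is the property that lets scheduling follow data availability.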

If any virtual files are open, all calls to SHUT3() wait on a barrier that ensures virtual files are not destroyed prematurely. SHUT3() then cleans up the mailboxes for all of the virtual output files its process had opened (corresponding to SHUT3()'s using ncclose() to flush and close any disk files that were opened by that process).
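
The shutdown barrier can be sketched the same way, with threads in place of processes. The names model, mailboxes, and events are hypothetical, and threading.Barrier is only a stand-in for whatever barrier the library actually uses.

```python
import threading
import time

events = []                  # ordered record of what happened
mailboxes = {}               # name -> state, one per toy model
lock = threading.Lock()


def model(name, work_seconds, barrier):
    with lock:
        mailboxes[name] = "open"
    time.sleep(work_seconds)                 # pretend to compute and do I/O
    events.append(("finished", name))
    barrier.wait()                           # wait until every model is done
    with lock:
        del mailboxes[name]                  # only now destroy our own mailbox
    events.append(("cleaned", name))


barrier = threading.Barrier(2)
threads = [
    threading.Thread(target=model, args=("fast", 0.0, barrier)),
    threading.Thread(target=model, args=("slow", 0.05, barrier)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print([kind for kind, _ in events])
```

No mailbox is removed until both toy models have finished, so the fast model cannot destroy data the slow one might still read.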


Footnotes

  1. I/O API files are referred to by environment-variable "logical names" that are properties of the calling program. The user sets the value of each logical name to the path-name of the corresponding file with a csh command of the form
    setenv <logical-name> <path>
    before executing the program, for each file the program accesses (probably in the run-script for the model being run). To say that a logical name refers to a "virtual" file used to couple models together, the corresponding command is
    setenv <logical-name> "VIRTUAL <file-name>"
  2. Consider, for example, that modules implementing Kain-Fritsch convective cloud pollutant chemistry and transport at a grid scale of 12-36 km need rather different input meteorology variables than do Kuo cloud modules at 50-120 km.
