Changes from Previous I/O API Versions


OpenMP Task-Parallel Extensions

For Version 2, the I/O API was substantially rewritten to support thread-safe task-parallel use in modeling. In general, data access operations may be called concurrently, provided that they do not access the same variable from the same file. In particular, INTERP3() and DDTVAR3() always do substantial linear-algebra computations (linear interpolation and re-scaled differencing, respectively) but only occasionally do actual I/O, because they manage behind-the-scenes double-buffering for the interpolation buffers. Substantial gains in parallel efficiency may therefore be obtained by using the OpenMP PARALLEL SECTIONS facility to perform multiple INTERP3() calls in task-parallel. Additionally, the (relatively fine-grained) KF-file access functions KFOPEN(), KFINDX(), KFREAD(), and KFWRITE() may be called in parallel even when acting on the same variable(s) of the same file(s).

Extensions for Coupling Concurrent Models

As part of the MCNC Practical Parallel Project, MCNC has developed an extended Model-Coupling Mode for the I/O API. This mode, implemented using PVM 3.4 mailboxes, allows the user to specify in the run-script whether "file" means a physical file on disk or a PVM mailbox-based communications channel (a virtual file), on the basis of the value of the file's logical name. For models exchanging data via virtual files of the I/O API's coupling mode, the I/O API schedules the various processes on the basis of data availability: There are two requirements on the modeler:

This has several advantages from the model-engineering point of view:

Time Series READ/WRITE

In order to support the needs of surface-water, lake and estuary, and bay modeling, at the request of EPA-AREAL we have added two routines to the I/O API which will access an entire time step sequence of data in a single operation ("read or write N time steps of data starting at date and time D:T with step DT for variable V to/from file F"). The new routines are READ4D() and WRITE4D(), and very much resemble READ3() and WRITE3(), except for the specification of an entire time step sequence as arguments, the restriction to single-variable operations (ALLVAR3 is not supported for these) and file types CUSTOM3, GRDDED3, BNDARY3, and TSRIES3.
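The "N time steps starting at D:T with step DT" idea can be illustrated with a small sketch. It is written in Python rather than Fortran, it does not call the I/O API, and the function names are hypothetical. It assumes dates are encoded as YYYYDDD and times and time steps as HHMMSS, and simply lists the date-and-time stamps that one time-series operation would cover.

```python
from datetime import datetime, timedelta


def to_datetime(jdate, jtime):
    """Convert YYYYDDD and HHMMSS integers into a datetime."""
    year, day_of_year = divmod(jdate, 1000)
    hours, rest = divmod(jtime, 10000)
    minutes, seconds = divmod(rest, 100)
    return datetime(year, 1, 1) + timedelta(
        days=day_of_year - 1, hours=hours, minutes=minutes, seconds=seconds
    )


def to_jdate_jtime(moment):
    """Convert a datetime back into a (YYYYDDD, HHMMSS) pair."""
    jdate = moment.year * 1000 + moment.timetuple().tm_yday
    jtime = moment.hour * 10000 + moment.minute * 100 + moment.second
    return jdate, jtime


def step_sequence(jdate, jtime, tstep, nsteps):
    """List the stamps of nsteps records starting at jdate:jtime.

    tstep is a duration encoded as HHMMSS (an assumption of this sketch).
    """
    hours, rest = divmod(tstep, 10000)
    minutes, seconds = divmod(rest, 100)
    step = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    start = to_datetime(jdate, jtime)
    return [to_jdate_jtime(start + n * step) for n in range(nsteps)]
```

For example, `step_sequence(1996365, 220000, 10000, 4)` walks hourly across a day boundary in a leap year.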

Portability Enhancements

For I/O API version 2, conditional compilation directives were modified to make it easier to port the I/O API to other platforms. Four issues are prominent: Preprocessor definitions and conditional compilation directives recognizing many of these have been provided in the code, as follows: This provides support for at least the following additional platforms and compilers:


File Extensions for KF Event Data

The Kain-Fritsch parameterization for convective clouds (as found in the MM5 and Eta meteorological models, and as adapted for air quality models by McHenry) generates data which does not fit the basic I/O API model of data which always occurs on a regular time step sequence [<date>:<time>:<time-step>:<record-count>] for all the cells in an entire array. Instead, convective cloud events happen on a cell-by-cell basis, each event having its own starting location [<column>,<row>] on a grid, as well as its own [<starting-date>:<starting-time>:<duration>] which defines its lifetime. We have constructed an additional file type KFEVNT3 with data structures appropriate for this data, together with additional operations KFOPEN(), KFINDX(), KFREAD(), and KFWRITE() in the I/O API to store, index, and retrieve this kind of data.

Modeling Conventions, September 1996

A number of changes are being made in modeling conventions for the September 1996 freeze/release of the I/O API and related models and analysis/visualization software. Most of these have to do with strict adherence to usage of MKS (SI) units.

Vertical Coordinates

Diagrams showing the relationship of the grid and its layers to the header attributes VGLVS3D, etc., are available in Postscript, X bitmap, JPEG, and GIF image formats.

Horizontal Coordinates

Temporal Coordinates

(A reiteration, not a change:) Dates and times in I/O API files are assumed to be in Greenwich Mean Time.

Standard Environment Variables

Environment variable "IOAPI_LOG_WRITE" controls whether each successful call to WRITE3() generates a log message or not. The default value of this environment variable is "Y", indicating that log messages will be written (compatible with the previous behavior).

Sample programs

A set of sample programs is now available, demonstrating some useful ways to use the I/O API, how the modeling conventions work, and how the two fit together. The programs were designed not only to be demonstrative, but also to do some useful work:

New Concepts

You must use the new INCLUDE files rather than keeping the old versions. Source code can presently be found in the directories:
/pub/storage/env/proj/ioapi on the EPA workstation cluster; and
/home/xcc/m3io on sequoia.

C bindings: There are now C include-files and C wrappers around the public I/O API routines and almost all of the utility routines.

The public routines have been changed to permit name arguments to be CHARACTER*(*) with actual length at most 16, for files and variables (internally, the I/O API copies the actual name arguments to its own CHARACTER*16 buffers). This makes the API more robust (you no longer need to pad to exactly length-16), as well as friendlier (you can use immediate-mode strings -- e.g. use 'SO4' as a variable-name).

For metadata tracking (suggested by Becky Bagdasarian): the I/O API will look for the environment variable "EXECUTION_ID", whose value is stored in file headers to identify the exact program execution which produced the file. For files opened for writing, it will record the execution ID (as a CHARACTER*80 string) in the file header, and will report it appropriately. It can be retrieved by getting the file description using routine DESC3() and then examining EXECN3D.

We introduce a new timestep-structure "circular-buffer" (or "restart") for files. The circular-buffer time step structure allows you to minimize the disk space consumed while at the same time ensuring that enough data is stored to disk to allow you to restart a computation. It is defined as follows:

We introduce "BUFFERED" virtual files to provide a mechanism that is safer and more robust than COMMON blocks for sharing data among modules within the same program. These "files" are actually an in-memory mechanism for sharing data between modules in the same program; they are created and read from and written to just as ordinary files are. Only two active time steps are kept in memory (as two active disk records are kept for circular-buffer files, above); memory allocation, etc., is handled behind the scenes by the I/O API when these virtual files are created.

You cause a file with say logical name FOO to be BUFFERED by the way you assign the logical name: setenv FOO BUFFERED instead of setenv FOO <file path name> . Since READ3() and INTERP3() check the date and time associated with the data they retrieve, the I/O API will catch and report instances when you attempt to use data in one module before it has been generated in another (unlike COMMON blocks, which will blithely let you attempt to use variables that haven't been set yet). Since the decision as to whether a file is BUFFERED or is a real disk-file is made at program-launch, on the basis of setenvs in the script, the calling program doesn't know (nor need to know) whether a file is BUFFERED or not. This provides the opportunity to save -- at will -- a program's intermediate values to disk for further analysis.
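The behavior described above can be pictured with a minimal in-memory sketch. This is an illustration in Python, not the I/O API implementation; the class and function names are hypothetical. It assumes that a logical name is treated as BUFFERED when its environment value is exactly `BUFFERED`, that only the two most recently written time steps are retained, and that a read for any other date and time fails loudly rather than returning stale data.

```python
def is_buffered(logical_name, environment):
    """Decide from the environment mapping whether a logical name is BUFFERED."""
    return environment.get(logical_name, "").strip() == "BUFFERED"


class BufferedVariable:
    """Holds at most the two most recent time steps of one variable."""

    def __init__(self):
        self._steps = []  # list of ((jdate, jtime), values), oldest first

    def write(self, jdate, jtime, values):
        self._steps.append(((jdate, jtime), list(values)))
        del self._steps[:-2]  # drop everything but the last two steps

    def read(self, jdate, jtime):
        for stamp, values in self._steps:
            if stamp == (jdate, jtime):
                return list(values)
        # Unlike a COMMON block, a request for data that is not (or is no
        # longer) present is reported instead of silently satisfied.
        raise LookupError(f"time step {jdate}:{jtime:06d} not available")
```

A consumer module that asks for a time step before the producer module has written it gets a `LookupError` in this sketch, which mirrors the date-and-time check described for READ3() and INTERP3().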

New basic data type options and descriptions are now available: individual variables may now be arrays of integers, reals, or double-precision values, instead of real only. The basic data type of each variable is indicated by the VTYPE3D array in file description data structures in the FDESC3.EXT INCLUDE-file; "magic-number" values M3INT, M3REAL, and M3DBLE (defined in PARMS3.EXT) indicate variables of types INTEGER, REAL, and DOUBLE PRECISION, respectively.

New data structure-type SMATRX3 for sparse matrices used in new emissions modeling. The sparse matrices are stored in the so-called "skyline-transpose" representation. For these matrices, the interpretation of dimensioning attributes, memory layout, and multiplication with vectors V is as follows:

    NROWS3 = number of matrix rows
    NCOLS3 = max number of nonzero columns in a row
    NLAYS3 = 1
    NVARS3 = 1 (or do we want to allow for the possibility
               of multiple matrices using the same indexing
               scheme?  -- i.e., one INDX but multiple COEF's
               in the memory layout below.)
    SINDX3 maps into variable-index for NMAX below
    LINDX3 maps into variable-index for INDX below.
    
    INTEGER  NMAX( NROWS3D )
    INTEGER  INDX( NCOLS3D, NROWS3D )
    REAL     COEF( NCOLS3D, NROWS3D )
    COMMON / ASPARSE /  NMAX, INDX, COEF        !  memory layout
    
    P( j ) = \sum_{i=1}^{NMAX(j)} COEF(i,j) V( INDX( i,j ) )
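The multiplication rule above can be sketched in a few lines. This is a Python illustration of the formula only, not code from the I/O API; the function name is hypothetical. It assumes each row is stored as a list padded to the maximum column count, and that INDX holds 1-based subscripts into V, as it would in Fortran.

```python
def sparse_multiply(nmax, indx, coef, v):
    """Compute P(j) = sum over i <= NMAX(j) of COEF(i,j) * V(INDX(i,j)).

    nmax[j]    number of nonzero entries actually used in row j
    indx[j][i] 1-based subscript into v for entry i of row j
    coef[j][i] matrix coefficient for entry i of row j
    """
    result = []
    for j in range(len(nmax)):
        total = 0.0
        for i in range(nmax[j]):  # padding beyond nmax[j] is ignored
            total += coef[j][i] * v[indx[j][i] - 1]
        result.append(total)
    return result
```

With `nmax = [2, 1]`, `indx = [[1, 3], [2, 0]]`, `coef = [[0.5, 2.0], [3.0, 0.0]]`, and `v = [10.0, 20.0, 30.0]`, the first row combines entries 1 and 3 of `v`, and the second row uses entry 2 only.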

(Internal change at the request of Kathy Pearson): internal implementation-flag array "TFLAG" |~~> "TIMESTAMP" has values which are 2-vectors containing components for the date and time for the corresponding record (using Models-3 date and time conventions): TIMESTAMP( var, rec ) = (YYYYDDD,HHMMSS).

PARMS3.EXT: Dimensioning and Constants

New or changed dimensioning parameters

New missing-value parameters in PARMS3.EXT:

The intent is to use BADVAL3, IMISS3, and CMISS3 as the standard REAL, INTEGER, and CHARACTER-string "missing" values, and always to test for BADVAL3 as X < AMISS3. Note that BADVAL3 and AMISS3 are generally-unused values, safely in range of floating-point arithmetic for all M3/EDSS machines, and BADVAL3 < AMISS3 on all such machines (i.e., the test is roundoff-safe on any reasonable hardware).
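The reason for testing against a threshold instead of testing for equality can be shown with a short sketch. It is written in Python for illustration; the numeric values below are assumptions chosen to satisfy BADVAL3 < AMISS3 and should be checked against PARMS3.EXT before being relied on.

```python
# Assumed values for illustration only; consult PARMS3.EXT for the real ones.
BADVAL3 = -9.999e36  # "missing" marker stored in REAL data
AMISS3 = -9.000e36   # threshold used when testing for "missing"


def is_missing(x):
    """Roundoff-safe test: anything below the threshold counts as missing."""
    return x < AMISS3


def mean_of_valid(values):
    """Average only the non-missing entries; return BADVAL3 if there are none."""
    valid = [x for x in values if not is_missing(x)]
    return sum(valid) / len(valid) if valid else BADVAL3
```

Because the comparison is an inequality with a wide margin, a marker value that has been slightly perturbed by format conversion or arithmetic is still recognized as missing.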

New "magic-number" parameters

IODECL3.EXT: Declaration of routines

IODECL3.EXT now declares routines INTERP3() and DDTVAR3(); it no longer declares the obsolete routine CREATE3().

FDESC3.EXT: Grid and File Descriptions

Grid description definitions were changed extensively. For horizontal grids and coordinates, two new description parameters, (XCENT3D,YCENT3D), were added to FDESC3.EXT. These give the Lat-Lon coordinates (or, for offset-UTM, the standard-UTM coordinates) of the center of the Cartesian coordinate system; i.e., Cartesian (0,0) has these as its Lat-Lon or UTM coordinates. The complete vertical grid description (previously not specified in file descriptions) was also added. Vertical grid descriptions provide the following information:

A new maximum time step number attribute, MXREC3D, was added to FDESC3.EXT for files. It allows, for example, an analysis program to determine easily not only the beginning (as it could do earlier, in terms of SDATE3D:STIME3D) but also the end of the time period for which a file contains data.

Individual variables may now be arrays of basic data type INTEGER, REAL, or DOUBLE PRECISION, instead of real only. The type of each variable is indicated by the VTYPE3D array in file description data structures in the FDESC3.EXT INCLUDE-file; it takes the "magic-number" values (defined in PARMS3.EXT):

OPEN3()/CREATE3()

The changes to the OPEN3()/CREATE3() semantics are as follows:

CREATE3() goes away.

OPEN3( FNAME, FMODE, PGNAME ) takes a new argument, FMODE, (replacing the READ-ONLY/READ-WRITE flag) which takes the following magic numbers defined in PARMS3.EXT as its values:

FSREAD3 = 1 for "old read-only"
FSRDWR3 = 2 for "old, read-write (update)"
FSNEW3 = 3 for "new, read-write"
FSUNKN3 = 4 for "unknown read-write" (create if necessary; otherwise perform consistency-check with the supplied definition).
FSCREA3 = 5 for "create/truncate read-write" (remove any existing file and create new file with the supplied definition).

For files opened "old", the file must already exist, or else OPEN3() will return FALSE (which matches the previous behavior of OPEN3()).

For files opened "new", the behavior matches the previous CREATE3(): the file must NOT exist; the caller must have supplied a file description in the FDESC3.EXT commons for use by OPEN3(), which then constructs the new file according to the caller-supplied description.

For files opened "unknown", the file may or may not exist; the caller must have supplied a file description in the FDESC3.EXT commons; and the behavior depends upon whether the file exists or not. If it does, the file is opened and the description from the file's header is checked for consistency with the description supplied by the caller. If these are consistent, OPEN3() returns TRUE; if not, it closes the file again and returns FALSE. If the file does not exist, OPEN3() will create a new file according to the caller-supplied description (just as it would if the mode had been "new").

For files opened "create/truncate", the caller must have supplied a file description in the FDESC3.EXT commons. OPEN3() first checks validity of this description (returning FALSE if IOAPI_CHECK_HEADERS is set and the file description is not consistent), then closes the file if it is already open. If the file exists, it deletes it, and then creates a new file according to the supplied file description. NOTE: Joan Novak (EPA) and Ed Bilicki (MCNC) have declared as a software standard that modeling programs may not use FSCREA3 as the mode for opening files. FSCREA3 is reserved for analysis/data extraction programs only.
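The mode semantics above amount to a small decision table, sketched here in Python. This is a restatement of the prose, not the OPEN3() implementation: the function name and the outcome strings are hypothetical, the validity check of the caller's description for FSCREA3 is left out, and only the five mode values are taken from the text.

```python
FSREAD3, FSRDWR3, FSNEW3, FSUNKN3, FSCREA3 = 1, 2, 3, 4, 5


def open3_outcome(mode, file_exists, description_consistent=True):
    """Return (success, action) for a given mode and file state."""
    if mode in (FSREAD3, FSRDWR3):  # "old": the file must already exist
        if file_exists:
            return True, "open existing file"
        return False, "fail: file does not exist"
    if mode == FSNEW3:  # "new": the file must NOT exist
        if file_exists:
            return False, "fail: file already exists"
        return True, "create file from caller's description"
    if mode == FSUNKN3:  # "unknown": create, or open and check
        if not file_exists:
            return True, "create file from caller's description"
        if description_consistent:
            return True, "open existing file"
        return False, "fail: header inconsistent with caller's description"
    if mode == FSCREA3:  # "create/truncate": always start afresh
        if file_exists:
            return True, "delete file, then create from caller's description"
        return True, "create file from caller's description"
    raise ValueError(f"unrecognized mode {mode}")
```

Reading the table this way makes the difference between FSUNKN3 and FSCREA3 explicit: the former never destroys existing data, while the latter always does.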

OPEN3() now writes significant portions of a file's description to the program log upon success at opening a file.

WRITE3()

The granularity of WRITE3() has been changed to permit writes at the level of time steps of individual variables for GRIDDED, BOUNDARY, and CUSTOM files. The argument list now looks like:
    WRITE3( <filename>, <variable-name>, <date>, <time>, <buffer> )
    
If the file type is GRIDDED, BOUNDARY, or CUSTOM, then the variable-name argument may be either a valid variable name (in which case WRITE3() will write exactly that variable from the buffer to the file), or ALLVAR3, defined to be 'ALL' in PARMS3.EXT (in which case the behavior of WRITE3() is as defined in the previous version, i.e., to write an entire time step of data from the buffer, interpreted as an array of all the variables, to the file). If the file is of any other type, the variable-name argument must be 'ALL' (and the behavior is as defined earlier).

CHECK3()

The change to WRITE3() changes the semantics of CHECK3(), so that it must have the argument list
    CHECK3( <filename>, <variable-name>, <date>, <time> )
    
and the semantics is that .TRUE. is returned iff the indicated time step is available for the indicated variable. Note that 'ALL' is accepted as a variable-name; in that case, CHECK3() returns TRUE iff all variables are present for the indicated date and time. This means it still returns FALSE even if some variables are available for that date and time, but others are not.
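The per-variable and 'ALL' cases can be illustrated with a short sketch. It is a Python model of the rule just stated, not the CHECK3() implementation. The `available` mapping (variable name to the set of time stamps written so far) is a hypothetical stand-in for the file's internal bookkeeping.

```python
ALLVAR3 = "ALL"


def check3(available, variable, jdate, jtime):
    """Return True iff the time step is present for the requested variable(s).

    available maps each variable name to a set of (jdate, jtime) stamps.
    """
    stamp = (jdate, jtime)
    if variable == ALLVAR3:
        # Every variable must have the step; one missing variable gives False.
        return bool(available) and all(stamp in steps for steps in available.values())
    return stamp in available.get(variable, set())
```

With `available = {"SO4": {(1996001, 0)}, "NO3": set()}`, a check for `"SO4"` at 1996001:000000 succeeds while a check for `"ALL"` at the same date and time does not.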

New I/O API Function CLOSE3()

A new I/O API function, CLOSE3(), has been requested so that open/close operations are more symmetric. It has the argument list
    CLOSE3( <filename> )
    
and the semantics is that .TRUE. is returned iff the file was successfully flushed to disk and closed.

New I/O API Function DDTVAR3()

For GRIDDED, BOUNDARY, or CUSTOM files, DDTVAR3() returns the mean time derivative (per second) for the indicated variable for the time step containing the indicated date and time. Note that for time independent files this derivative is of course zero.
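The arithmetic behind a mean time derivative, and behind the linear interpolation that INTERP3() performs, can be sketched as follows. This is a Python illustration under stated assumptions, not the library code: the function names are hypothetical, the two bracketing time steps are passed in directly, and the step length is given in seconds.

```python
def mean_time_derivative(start_values, end_values, step_seconds):
    """Per-second rate of change across one time step (re-scaled differencing)."""
    return [(b - a) / step_seconds for a, b in zip(start_values, end_values)]


def interpolate(start_values, end_values, fraction):
    """Linear interpolation; fraction is 0.0 at the step start, 1.0 at its end."""
    return [a + fraction * (b - a) for a, b in zip(start_values, end_values)]
```

For a one-hour step in which a cell's value rises from 0.0 to 3600.0, `mean_time_derivative` gives 1.0 per second for that cell, and a cell whose value does not change gives 0.0, as it would for every cell of a time-independent file.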

New Utility Routines and Support Structures

In order to keep track of horizontal grids and coordinate systems, and to make their definitions easily available to programs without the necessity to recompile them every time a new grid is defined, we introduce a "grid-and-coordinate-description" file GRIDDESC, and a family of utility routines as follows:

GRIDDESC is the logical name for a text file with two segments. Each segment has a 1-line header (which by convention provides titles for the columns in the data records), a sequence of data records, and a terminal record with a blank name field (i.e., ' '). The first segment is the coordinate system description segment and consists of text records giving coordinate-system name and descriptive parameters P_ALP, P_BET, P_GAM, XCENT, and YCENT. The second segment is the grid-description segment, and consists of text records giving grid name, related coordinate-system name and descriptive parameters XORIG, YORIG, XCELL, YCELL, NCOLS, NROWS, and NTHIK. Each data record is list-formatted (i.e., items are separated by either blanks or commas, and names are quoted strings) and consists of three lines, as appropriate:
    
    COORD-NAME
    P_ALP, P_BET, P_GAM
    XCENT, YCENT
    
        or
    
    GRID-NAME
    COORD-NAME, XORIG, YORIG, XCELL, YCELL
    NCOLS, NROWS, NTHIK
There are at most 32 coordinate systems and 256 grids listed in one of these files. These files are small enough to be archived easily with a study, and have a sufficiently simple format that new ones can easily be constructed "by hand."
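To make the record layout concrete, here is a hedged Python sketch of a reader for the format as described above. It is not DSCGRID(), which remains the operational definition of the format. The function names are hypothetical, and the sketch assumes that the terminal record is a single line holding only the blank quoted name and that names are enclosed in single quotes.

```python
import re

TOKEN = re.compile(r"'([^']*)'|([^\s,]+)")
COORD_FIELDS = ("P_ALP", "P_BET", "P_GAM", "XCENT", "YCENT")
GRID_REAL_FIELDS = ("XORIG", "YORIG", "XCELL", "YCELL")
GRID_INT_FIELDS = ("NCOLS", "NROWS", "NTHIK")


def tokens(line):
    """Split a list-formatted line on blanks or commas; unquote names."""
    items = []
    for match in TOKEN.finditer(line):
        quoted, bare = match.groups()
        items.append(quoted.strip() if quoted is not None else bare)
    return items


def to_float(text):
    """Accept Fortran-style D exponents such as 1.0D0."""
    return float(text.upper().replace("D", "E"))


def read_segment(lines, position):
    """Return ({name: raw values}, index of the line after the segment)."""
    position += 1  # skip the 1-line header
    records = {}
    while True:
        first = tokens(lines[position])
        if not first or first[0] == "":  # blank name field ends the segment
            return records, position + 1
        # A record is three lines: the name, then two lines of parameters.
        records[first[0]] = tokens(lines[position + 1]) + tokens(lines[position + 2])
        position += 3


def parse_griddesc(text):
    """Parse both segments into dictionaries keyed by name."""
    lines = text.splitlines()
    raw_coords, position = read_segment(lines, 0)
    raw_grids, _ = read_segment(lines, position)
    coords = {
        name: dict(zip(COORD_FIELDS, map(to_float, values)))
        for name, values in raw_coords.items()
    }
    grids = {}
    for name, values in raw_grids.items():
        grid = {"COORD-NAME": values[0]}
        grid.update(zip(GRID_REAL_FIELDS, map(to_float, values[1:5])))
        grid.update(zip(GRID_INT_FIELDS, map(int, values[5:8])))
        grids[name] = grid
    return coords, grids
```

The sketch does no error recovery: a truncated file raises an `IndexError`, which is acceptable for an illustration but not for production use.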

Logical function DSCGRID() manages access to GRIDDESC (in fact, serves as an operational definition of the GRIDDESC file format), and gets grid and coordinate system descriptive parameters COORDNAME, COORDTYPE, P_ALP, P_BET, P_GAM, XCENT, YCENT, XORIG, YORIG, XCELL, YCELL, NCOLS, NROWS, and NTHIK for the specified grid name. Returns TRUE iff the requested grid is found in the GRIDDESC file. LOGICAL ENTRY DSCOORD() of DSCGRID() gets coordinate-system descriptive parameters P_ALP, P_BET, P_GAM, XCENT, and YCENT for the specified coordinate system name (also returning TRUE iff the coordinate system is found in the GRIDDESC file).

New date-and-time functions

DAYMON: find month and day-of-month for <jdate>
DT2STR: Construct string "HH:MM:SS Month DD, YYYY" for <jdate-&-time>
GETDTTIME: get current wall-clock date and time
HHMMSS: construct string "HH:MM:SS" for <time>
JULIAN: find Julian day number for <month> <day> <year>
MMDDYY: construct string "Month DD, YYYY" for <jdate>
WKDAY: get day-of-week (1...7) for <jdate>
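What several of these date-and-time functions compute can be shown with rough Python equivalents. These are sketches, not the Fortran routines: the lower-case names, the argument order, and the choice of Monday as day 1 are assumptions. Dates are taken to be YYYYDDD integers and times HHMMSS integers.

```python
from datetime import date, timedelta


def julian(year, month, day):
    """Day number within the year (1 to 366) for a calendar date."""
    return date(year, month, day).timetuple().tm_yday


def daymon(jdate):
    """Return (month, day-of-month) for a YYYYDDD date."""
    year, day_of_year = divmod(jdate, 1000)
    calendar_date = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    return calendar_date.month, calendar_date.day


def wkday(jdate):
    """Day of week 1...7 for a YYYYDDD date (Monday = 1 is assumed here)."""
    year, day_of_year = divmod(jdate, 1000)
    calendar_date = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    return calendar_date.isoweekday()


def hhmmss(jtime):
    """Format an HHMMSS integer as the string "HH:MM:SS"."""
    hours, rest = divmod(jtime, 10000)
    minutes, seconds = divmod(rest, 100)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

For instance, `julian(1996, 9, 15)` and `daymon(1996259)` are inverses of one another for the same calendar day.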

New utility functions

DSCOORD: get description of named coordinate system
DSCGRID: get description of named grid
ENVINT: get INTEGER value of logical name from the environment
ENVREAL: get REAL value of logical name from the environment
ENVSTR: get CHARACTER-STRING value of logical name from the environment
ENVYN: get LOGICAL value of logical name from the environment
FIND1, FIND2, FIND3, FIND4: find integer key-tuple in sorted keytuple-table
GCD: greatest common divisor function
GETDFILE: open and return unit number for direct access Fortran file with specified logical name
GETEFILE: open and return unit number for sequential Fortran file with specified logical name
GETDBLE: prompt user for DOUBLE and get response, with default and range checking
GETMENU: prompt user for menu choice, etc.
GETNUM: prompt user for INTEGER, etc.
GETREAL: prompt user for REAL, etc.
GETYN: prompt user for "Yes-No" answer, etc.
GRIDOPS: select and compute various comparison operations
INDEX1: unsorted-name-table lookup for character-string key
JUNIT: return a "safe" Fortran unit number
LEN2: number of leading blanks in string
M3ERR: warning message; or error message with SHUT3() and CALL EXIT( 2 )
M3EXIT: exit message with SHUT3() and CALL EXIT( <status> )
NAMEVAL: get value of environment variable (for Fortran)
POLY: degree-d polynomial interpolation function
TRIMLEN: string length, not counting trailing blanks
UPCASE: make string into ALL CAPS
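The lookup idea behind the FIND1 through FIND4 family, and the string-length convention of TRIMLEN, can be sketched in Python. These are illustrations, not the library routines: the function names are hypothetical, and both the 1-based return value and the use of -1 for "not found" are assumptions of this sketch.

```python
from bisect import bisect_left


def find_key(key, sorted_table):
    """Binary search for an integer key (or key tuple) in a sorted table.

    Returns the 1-based position of the key, or -1 if it is absent.
    Tuples compare lexicographically, so the same code covers key pairs,
    triples, and quadruples as long as the table is sorted the same way.
    """
    position = bisect_left(sorted_table, key)
    if position < len(sorted_table) and sorted_table[position] == key:
        return position + 1
    return -1


def trimlen(text):
    """Length of a string, not counting trailing blanks."""
    return len(text.rstrip(" "))
```

For example, `find_key((2, 5), [(1, 9), (2, 5), (3, 0)])` locates a two-component key, and `trimlen("SO4" + " " * 13)` recovers the meaningful length of a name stored in a 16-character buffer.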


See also "What's New" and file conversion utility M4FILTER
