I/O API Binary-Mode Design and Implementation



Introduction

This page describes the structure of the files for a new underlying ("BINFIL3") binary mode for the EDSS/Models-3 I/O API, to supplement the existing (and default) netCDF-file mode, the in-memory BUFFERED mode, and the PVM-based virtual mode.

Because this mode uses the native machine binary representation as its underlying data layer, it should offer somewhat greater performance than the machine-independent lower layers (netCDF, PVM) for applications where I/O performance is critical. On the other hand, it is very desirable to keep the header metadata in a portable format, so that user-level programs can still read the data on binary-incompatible platforms and perform the appropriate data conversion themselves. For this reason, header metadata is stored in the portable formats described below.

The sequence of data structures in these files is modeled loosely on the structure of netCDF files, although the mechanisms used to store some of the metadata in a machine-independent fashion borrow ideas from other formats, e.g., GRIB.



Implementation Considerations

Restrictions and Limitations

Remarks on Implementation Strategy



Metadata Formats

The following representations of the primitive data types significant to the I/O API are used to store metadata portably in I/O API BINFIL3 files, so that the metadata can be interpreted on platforms other than the originating platform. In principle, this lets the application programmer use the BINFIL3 layer of the I/O API to read the data on any platform, determine the transformations necessary to interpret it on the local platform, and then apply those transformations to the data and use it.

INT4
represented by a 4-byte string, in little-endian order:
BYTE_0(X) contains (unsigned char)(X&255), i.e., the least significant byte of X
BYTE_1(X) contains (unsigned char)((X/256)&255)
BYTE_2(X) contains (unsigned char)((X/65536)&255)
BYTE_3(X) contains (unsigned char)((X/16777216)&255)
REAL
represented by a character string formatted with the equivalent of the Fortran FORMAT 1PE15.9, followed by a trailing ASCII NUL

DOUBLE
represented by a character string formatted with the equivalent of the Fortran FORMAT 1PD27.19, followed by a trailing ASCII NUL

NAME
Equivalent to a Fortran CHARACTER*16 type (fixed-length 16-byte string, padded on the right by blanks; not nul-terminated as a C string would be.)

LINE
Equivalent to a Fortran CHARACTER*80 type (fixed-length 80-byte string, padded on the right by blanks)

STRING
Equivalent to the Mac Fortran internal representation of a Fortran CHARACTER*(*) variable (with blank-padding on the right), i.e., as a C "struct hack"
struct{
      INT4 length;
      char contents[ length ];
      } ;
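
The INT4 and NAME encodings above can be sketched in C as follows. This is only an illustrative sketch, not the I/O API's own code: the function names int4_encode, int4_decode, and name_encode are hypothetical, and the treatment of negative values (two's-complement bit pattern, via the fixed-width types of <stdint.h>) is an assumption, since the description above does not say how negative X is handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store x as a portable INT4: four bytes, least significant byte first.
   Hypothetical helper; negative values use their two's-complement
   bit pattern (an assumption, not stated in the format description). */
void int4_encode(int32_t x, unsigned char out[4])
{
    uint32_t u = (uint32_t)x;
    out[0] = (unsigned char)( u        & 255u);   /* BYTE_0 */
    out[1] = (unsigned char)((u >>  8) & 255u);   /* BYTE_1 */
    out[2] = (unsigned char)((u >> 16) & 255u);   /* BYTE_2 */
    out[3] = (unsigned char)((u >> 24) & 255u);   /* BYTE_3 */
}

/* Rebuild the integer from its four portable bytes, independent of
   the byte order of the machine doing the reading. */
int32_t int4_decode(const unsigned char in[4])
{
    uint32_t u = (uint32_t)in[0]
               | ((uint32_t)in[1] <<  8)
               | ((uint32_t)in[2] << 16)
               | ((uint32_t)in[3] << 24);
    int32_t x;
    memcpy(&x, &u, sizeof x);   /* reinterpret the bit pattern */
    return x;
}

/* Fill a 16-byte NAME field: copy s, pad on the right with blanks,
   and write no terminating NUL. Names longer than 16 bytes are
   truncated here; a real writer might reject them instead. */
void name_encode(const char *s, char out[16])
{
    size_t n;
    assert(s != NULL);
    n = strlen(s);
    if (n > 16)
        n = 16;
    memcpy(out, s, n);
    memset(out + n, ' ', 16 - n);
}
```

Because int4_decode assembles the value arithmetically rather than by copying raw memory, the same code reads the header correctly on both little-endian and big-endian hosts.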



File Data Structure Design

The structure of a BINFIL3 file is as follows:

Header Section
INT4 IOAPI_VRSN: I/O API Version

Machine/Compiler Architecture Metadata

INT4 BYTE_ORDER: Byte order, i.e., the C subscripts at which BYTE_0, BYTE_1, BYTE_2, BYTE_3 would occur if we think of an integer as a C union:
union{ int idata; char cdata[4]; } ;

INT4 INTSIZE: size of Fortran "INTEGER"

INT4 REALSIZE: size of Fortran "REAL"

INT4 DBLESIZE: size of Fortran "DOUBLE PRECISION"
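
One way to obtain the subscripts described under BYTE_ORDER is to store a value whose k-th byte equals k and then look at where each byte lands in the union. The sketch below is an assumption-laden illustration: the function name byte_order_subscripts is hypothetical, it uses uint32_t and unsigned char in place of int and char so the result does not depend on the size or signedness of those types, and it returns the four subscripts as an array because the text above does not specify how they are packed into the single INT4.

```c
#include <assert.h>
#include <stdint.h>

/* On return, order[k] is the subscript of cdata[] at which BYTE_k
   (the k-th least significant byte) of a 4-byte integer is stored
   on the machine running this code. Hypothetical helper. */
void byte_order_subscripts(int order[4])
{
    union { uint32_t idata; unsigned char cdata[4]; } u;
    int i;

    assert(order != 0);
    u.idata = 0x03020100u;          /* BYTE_k of this value equals k */
    for (i = 0; i < 4; i++)
        order[u.cdata[i]] = i;      /* byte value k found at subscript i */
}
```

On a little-endian machine this yields the subscripts 0, 1, 2, 3; on a big-endian machine it yields 3, 2, 1, 0.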

Per-File Metadata

NAME GRIDNAME: grid name

NAME UPDATE_NAME: name of the last program writing to file

LINE EXECUTION: value of environment variable EXECUTION_ID

LINE FILE_DESC[ MXDESC3=60 ]: array containing file description (set by programmer during OPEN3())

LINE UPDATE_DESC[ MXDESC3=60 ]: array containing run description, from file with logical name SCENFILE

Dimension/Type Metadata

INT4 FTYPE: File data type
CUSTOM3, GRDDED3, BNDARY3, IDDATA3, PROFIL3, or SMATRX3

INT4 GDTYP: map projection type

LATGRD3=1 (Lat-Lon),
LAMGRD3=2 (Lambert conformal conic),
MERGRD3=3 (general tangent Mercator),
STEGRD3=4 (general tangent stereographic),
UTMGRD3=5 (UTM, a special case of Mercator),
POLGRD3=6 (polar secant stereographic),
EQMGRD3=7 (equatorial secant Mercator), or
TRMGRD3=8 (transverse secant Mercator)

INT4 VGTYP: vertical coordinate type

VGSGPH3=1 (hydrostatic sigma-P),
VGSGPN3=2 (nonhydrostatic sigma-P),
VGSIGZ3=3 (sigma-Z),
VGPRES3=4 (pressure (mb)),
VGZVAL3=5 (Z (m above sea level)), or
VGHVAL3=6 (H (m above ground))
INT4 NCOLS: number of grid columns

INT4 NROWS: number of grid rows

INT4 NLAYS: number of layers

INT4 NTHIK:

for BNDARY3 files, perimeter thickness (cells), or for SMATRX3 files, number of matrix-columns (unused for other file types)

Temporal Metadata

INT4 SDATE: starting date, coded YYYYDDD according to Models-3 conventions

INT4 STIME: starting time, coded HHMMSS according to Models-3 conventions

INT4 TSTEP: time step, coded HHMMSS according to Models-3 conventions

INT4 NRECS: current number of time step records in the file (1-based Fortran-style counting)
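
The YYYYDDD and HHMMSS codings used by SDATE, STIME, and TSTEP can be unpacked with integer division and remainder. The following is a minimal sketch for non-negative values only; the type and function names (m3_date, m3_time, split_date, split_time, tstep_seconds) are hypothetical and are not part of the I/O API.

```c
#include <assert.h>

typedef struct { int year; int day; } m3_date;              /* day of year, 1-based */
typedef struct { int hour; int minute; int second; } m3_time;

/* Split a date coded YYYYDDD into year and day of year. */
m3_date split_date(int yyyyddd)
{
    m3_date d;
    assert(yyyyddd >= 0);
    d.year = yyyyddd / 1000;
    d.day  = yyyyddd % 1000;
    return d;
}

/* Split a time or time step coded HHMMSS into its fields.
   For a time step the hour field may exceed 23. */
m3_time split_time(int hhmmss)
{
    m3_time t;
    assert(hhmmss >= 0);
    t.hour   = hhmmss / 10000;
    t.minute = (hhmmss / 100) % 100;
    t.second = hhmmss % 100;
    return t;
}

/* Length in seconds of a time step coded HHMMSS. */
long tstep_seconds(int hhmmss)
{
    m3_time t = split_time(hhmmss);
    return 3600L * t.hour + 60L * t.minute + t.second;
}
```

For example, SDATE = 1988200 is day 200 of 1988, and TSTEP = 13000 is a step of one hour thirty minutes, i.e., 5400 seconds.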

Spatial Metadata

DOUBLE P_ALPHA: first map projection descriptive parameter

DOUBLE P_BETA: second map projection descriptive parameter

DOUBLE P_GAMMA: third map projection descriptive parameter

DOUBLE X_CENTER: Longitude of the Cartesian map projection coordinate-origin (location where X=Y=0)

DOUBLE Y_CENTER: Latitude of the Cartesian map projection coordinate origin (map units)

DOUBLE X_ORIGIN: Cartesian X-coordinate of the lower left corner of the (1,1) grid cell (map units)

DOUBLE Y_ORIGIN: Cartesian Y-coordinate of the lower left corner of the (1,1) grid cell (map units)

DOUBLE X_CELLSIZE: X-coordinate cell dimension (map units)

DOUBLE Y_CELLSIZE: Y-coordinate cell dimension (map units)

REAL VGTOP: model-top, for sigma vertical-coordinate types

REAL VGLEVELS[0:NLAYS+1]: array of vertical coordinate level values; level 1 of the grid goes from vertical coordinate VGLEVELS[0] to VGLEVELS[1], etc.

Per-Variable Metadata

NAME VNAME[ NVARS ]: array of variable names

NAME UNITS[ NVARS ]: array of units or 'none'

LINE VDESC[ NVARS ]: array of variable descriptions

INT4 VTYPE[ NVARS ]: array of variable types:

M3BYTE = 1
M3INT = 4
M3REAL = 5
M3DBLE = 6

Additional attributes

Not implemented at this time.

Eventually: TBD, as necessary for the WRF extensions placed in I/O API Version 2.2. At this point, we anticipate that the implementation will be in terms of a sequence of <name-type-value> triplets.

Data Section

sequence of time step records

Time Step Header

INT4 FLAGS[2,NVARS]: array of data-availability flags (with Fortran-style left-major, 1-based subscripting):
FLAGS[1,V] are the dates for the data record, encoded YYYYDDD

FLAGS[2,V] are the times for the data record, encoded HHMMSS

FLAGS[1,V] and FLAGS[2,V] are in consecutive memory/disk locations.

(NOTE: This amount of data is not functionally necessary; however, it is included for historical reasons involving the convenience of visualization-system programmers.)

Time Step Contents

array of data records, subscripted by variable 1, ..., NVARS:

<type> array of data for this variable and time step. Data is in native machine binary format.

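The layout of a time step record described above can be illustrated with a few offset calculations. This sketch rests on several assumptions that the text does not confirm: each FLAGS entry occupies the 4 bytes of a portable INT4, an M3BYTE element occupies one byte, and a GRDDED3 variable holds NCOLS*NROWS*NLAYS elements per time step. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based position, counted in INT4 units, of FLAGS[k,v] within the
   time step header (k = 1 for date, 2 for time; v = 1..nvars).
   The left subscript varies fastest, so date and time are adjacent. */
size_t flag_index(int k, int v, int nvars)
{
    assert(k >= 1 && k <= 2 && v >= 1 && v <= nvars);
    return (size_t)(v - 1) * 2 + (size_t)(k - 1);
}

/* Size of the time step header, ASSUMING 4 bytes per FLAGS entry. */
size_t timestep_header_bytes(int nvars)
{
    return (size_t)nvars * 2 * 4;
}

/* Bytes per element for a VTYPE code, given the INTSIZE, REALSIZE and
   DBLESIZE values read from the file header. Returns 0 if unknown. */
size_t vtype_size(int vtype, size_t intsize, size_t realsize, size_t dblesize)
{
    switch (vtype) {
    case 1:  return 1;          /* M3BYTE: ASSUMED to be one byte */
    case 4:  return intsize;    /* M3INT  */
    case 5:  return realsize;   /* M3REAL */
    case 6:  return dblesize;   /* M3DBLE */
    default: return 0;
    }
}

/* Total bytes in one time step record of a GRDDED3 file, ASSUMING
   ncols*nrows*nlays elements per variable. */
size_t gridded_timestep_bytes(const int *vtype, int nvars,
                              size_t ncols, size_t nrows, size_t nlays,
                              size_t intsize, size_t realsize, size_t dblesize)
{
    size_t total = timestep_header_bytes(nvars);
    int v;
    assert(vtype != NULL);
    for (v = 0; v < nvars; v++)
        total += ncols * nrows * nlays
               * vtype_size(vtype[v], intsize, realsize, dblesize);
    return total;
}
```

Under these assumptions every time step record in a file has the same size, so the byte offset of record N (1-based) within the data section would be (N-1) times that size.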


New Pseudo-Attributes in I/O API Version 3

INTEGER IOAPI_VERSION

INTEGER IMPL_LAYER

INTEGER BYTE_ORDER

INTEGER INTEGER_SIZE

INTEGER REAL_SIZE

INTEGER DBLE_SIZE



Send comments to
Carlie J. Coats, Jr.
carlie@jyarborough.com