3.4. Environment Variables Used by the Software
Environment variables are the main means of controlling the Spatial Allocator programs. This page is a reference for all of the environment variables used by the programs srgcreate.exe and allocator.exe. Not all variables are used in all circumstances: the active variables depend on the processing mode selected and on the settings of the other options (e.g., a grid name and a GRIDDESC file are required only when a grid is used as an input or output).
See Section 5.1: Creating Inputs to SMOKE Biogenic Processing for information on the environment variables used by beld3smk.exe and diffioapi.exe.
3.4.1 Helper Variables Found in Scripts
The following environment variables are used to help locate files and directories in the example scripts, but they are not recognized by any of the Spatial Allocator programs:
- SA_HOME - Installation directory
- OUTPUT - Name of the output directory (location of output files)
- DATADIR - Location of shapefiles
- EXE - Path and Name of the program executable
- TIME - Name of the command used to time a program's execution on the host computer
- SRG_FILE - Path and file name of final merged surrogate file
3.4.2 Program Control Variables
The following variables control how the allocator and srgcreate programs behave:
- MIMS_PROCESSING - Controls the mode of operation for the allocator program. Valid modes are:
- ALLOCATE - Convert data from one geospatial unit (e.g., counties) to another (e.g., grid cells) by summing or averaging attributes
- OVERLAY - Use to determine whether a grid, bounding box, or set of polygons overlaps a region and to print the attributes of the overlaid shapes (points, lines, or polygons)
- FILTER_SHAPE - Filter shapefile attributes and save them as a new shapefile or comma-separated ASCII file
- CONVERT_SHAPE - Create a copy of a shapefile with a new projection
- DEBUG_OUTPUT - Specifies whether to write debug output to standard output (Y for yes/on or N for no/off). If debug output is turned off, the programs output only critical information, such as errors and I/O API log information.
- MAX_LINE_SEG - Specifies the maximum length of a line segment to use when reading in a line or polygon Shapefile or creating the polygons for a grid. Any line segments longer than the specified length will be split to be no longer than the length specified by this variable. This could be useful when converting data on one grid to another, as the spatial mapping can be done more precisely when the grid is described by more points than just the four corners. Note that applying this feature will make the program run more slowly.
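As a rough illustration, the control variables above could be set from a small Python driver before launching allocator.exe. This is a minimal sketch and not part of the Spatial Allocator distribution: the helper name allocator_env, the MAX_LINE_SEG value, and the executable path in the comment are hypothetical.

```python
import os

# Modes listed above for MIMS_PROCESSING.
VALID_MODES = {"ALLOCATE", "OVERLAY", "FILTER_SHAPE", "CONVERT_SHAPE"}


def allocator_env(mode, debug=False, max_line_seg=None):
    """Return a copy of the current environment with the control variables set.

    Hypothetical helper, for illustration only.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown MIMS_PROCESSING mode: {mode!r}")
    env = dict(os.environ)
    env["MIMS_PROCESSING"] = mode
    env["DEBUG_OUTPUT"] = "Y" if debug else "N"
    if max_line_seg is not None:
        # Set only on request: splitting long segments makes the program slower.
        env["MAX_LINE_SEG"] = str(max_line_seg)
    return env


# 1000 is an arbitrary example value.
env = allocator_env("ALLOCATE", debug=True, max_line_seg=1000)
print(env["MIMS_PROCESSING"], env["DEBUG_OUTPUT"], env["MAX_LINE_SEG"])

# To launch the program (the path is a placeholder):
#   import subprocess
#   subprocess.run(["/path/to/allocator.exe"], env=env, check=True)
```

Building the environment as a dictionary keeps each run's settings explicit and leaves the calling shell's environment untouched.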
3.4.3 Surrogate Input Specification Variables
The following variables are used by srgcreate.exe to specify information relating to the inputs used to create surrogates.
- DATA_FILE_NAME - Directory and base file name (without .shp extension for shapefiles) for file containing data polygons (used only by srgcreate.exe)
- DATA_FILE_TYPE - Type of file containing data polygons (used only by srgcreate.exe)
- DATA_ID_ATTR - Name of attribute from data polygons file that specifies a unique data polygon for surrogates (e.g., FIPS_CODE if creating surrogates on a county scale)
- WEIGHT_FILE_NAME - Directory and base file name (without .shp extension for shapefiles) for file containing weight shapes; if no weights are desired, set to NONE
- WEIGHT_FILE_TYPE - Type of file containing weight shapes (currently the only supported value is ShapeFile)
- WEIGHT_ATTR_LIST - Attribute used as the weight in the surrogate calculation.
- If you wish to perform the surrogate computations using only the area, length, or count, enter NONE.
- If multiple surrogates are desired from different attributes of the same shapefile (e.g., housing, population), specify a comma-separated list of attribute names.
- When using a weight function, enter USE_FUNCTION and be sure to then set the WEIGHT_FUNCTION variable to the function you wish to use.
- WEIGHT_FUNCTION - A mathematical function used to compute a surrogate from multiple attributes, for example (A+B+C) or 0.4*N1+0.6*N2. Any arithmetic function using the operators +, -, *, /, (, ), constants, and variable names may be used. Exponential notation and power functions are not currently supported, nor are unary negative numbers used as constants (e.g., X1 + -5 should be written as X1 - 5).
- FILTER_FILE - Name of file containing attributes on which to filter a shapefile (or NONE if no filter is to be applied).
- FILTERED_WEIGHT_SHAPES - The name of a temporary shapefile that will be created that contains only the shapes to be included in the surrogate.
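The restrictions on WEIGHT_FUNCTION described above can be pre-checked before a run. The sketch below is an assumption-laden illustration, not the parser used by srgcreate.exe: the function name check_weight_function and its messages are hypothetical, and it applies only the rules stated in this section (the four operators, parentheses, constants, variable names, no exponential notation, no power operator, no unary minus).

```python
import re

# Numbers, attribute names, the four operators, parentheses, or any other character.
TOKEN = re.compile(
    r"(?P<num>\d+\.?\d*|\.\d+)"
    r"|(?P<name>[A-Za-z_]\w*)"
    r"|(?P<op>[-+*/])"
    r"|(?P<open>\()"
    r"|(?P<close>\))"
    r"|(?P<bad>\S)"
)


def check_weight_function(expr):
    """Return a list of problems found in expr; an empty list means none found."""
    problems = []
    prev = None  # kind of the previous accepted token
    depth = 0    # parenthesis nesting depth
    for match in TOKEN.finditer(expr):
        kind, tok = match.lastgroup, match.group()
        if kind == "bad":
            problems.append(f"unsupported character {tok!r}")
            continue
        if kind in ("num", "name", "open") and prev in ("num", "name", "close"):
            # Also catches exponential notation such as 1e5 (a number followed by a name).
            problems.append(f"missing operator before {tok!r}")
        if kind in ("op", "close") and prev in (None, "op", "open"):
            # Catches a unary minus, e.g. "X1 + -5".
            problems.append(f"{tok!r} has no operand on its left")
        if kind == "open":
            depth += 1
        elif kind == "close":
            depth -= 1
            if depth < 0:
                problems.append("unbalanced ')'")
                depth = 0
        prev = kind
    if prev == "op":
        problems.append("expression ends with an operator")
    if depth > 0:
        problems.append("unbalanced '('")
    return problems


for expr in ["(A+B+C)", "0.4*N1+0.6*N2", "X1 + -5", "A^2"]:
    print(expr, "->", check_weight_function(expr) or "ok")
```

A check like this only flags syntax that this section says is unsupported; it does not confirm that the attribute names exist in the weight shapefile.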
3.4.4 Surrogate Output Specification Variables
The following variables are used by srgcreate.exe to specify information relating to the output of surrogates.
- OUTPUT_FORMAT - Currently, only the SMOKE format is supported.
- SURROGATE_ID - The integer used to designate a particular surrogate (e.g., 7 represents households). If multiple surrogates are being created from the same shapefile, specify a comma-separated list of integers that correspond to the list specified for WEIGHT_ATTR_LIST.
- SURROGATE_FILE - Directory and file name of output surrogate file (including .txt extension)
- WRITE_HEADER - (Optional) Specifies whether to write a header line to give the names of the output attributes.
- YES (default) - Displays traditional SMOKE-ready header
- NO - Suppresses header (used when running multiple surrogates for the same grid, to prevent the repetition of the same header information)
- WRITE_SRG_NUMERATOR - (Optional) Specifies whether to write the numerator used to compute the surrogate fraction as part of a comment that follows the fraction.
- YES - Adds the surrogate numerator in the surrogate file as a new column
- NO (default) - Does not add column
- WRITE_SRG_DENOMINATOR - (Optional) Specifies whether to write the denominator used to compute the surrogate fraction as part of a comment that follows the fraction.
- YES - Adds the surrogate denominator in the surrogate file as a new column
- NO (default) - Does not add column
- WRITE_QASUM - (Optional) Specifies whether to write a running sum of surrogate fractions for the county as part of a comment in the surrogates file. This helps to quality assure the surrogate fractions to make sure they do not sum up to greater than 1 for any county within the grid domain.
- YES - Sums the surrogates by the specified attribute (often by county/FIPS) and shows the sum in the surrogate file as a new column
- NO (default) - Does not add column
- DENOMINATOR_THRESHOLD - The denominator threshold below which surrogate values will not be used. If the denominator is less than the threshold, the surrogate value is instead written as a comment line beginning with a # sign. The default value is 0.00001.
- OUTPUT_FILE_NAME – Directory and name of output file (without extension). This will cause srgcreate.exe to create an output RegularGrid shapefile that contains the surrogate numerators for each grid cell.
- SAVE_DW_FILE - (Optional) The directory and file name of an intermediate file to save that contains the overlay of weight shapes on data polygons. Note: This file is of interest because it is independent of the grid. Set to NONE or leave unset to not create this file.
- USE_DW_FILE - (Optional) The directory and file name of intermediate file to use to initialize the intersection of data and weight shapes. Set to NONE or leave unset for no file.
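When several surrogates are created from one weight shapefile, the SURROGATE_ID list has to line up with WEIGHT_ATTR_LIST. The following sketch shows one way to build the two lists together so they cannot drift apart. The helper name, the attribute names POP2000 and HOUSEHOLDS, the ID 100, and the output path are hypothetical.

```python
def surrogate_output_settings(weight_attrs, surrogate_ids, surrogate_file):
    """Build srgcreate.exe settings for several surrogates from one weight shapefile.

    Hypothetical helper, for illustration only.
    """
    if len(weight_attrs) != len(surrogate_ids):
        raise ValueError("SURROGATE_ID needs one integer per WEIGHT_ATTR_LIST entry")
    return {
        "WEIGHT_ATTR_LIST": ",".join(weight_attrs),
        "SURROGATE_ID": ",".join(str(int(i)) for i in surrogate_ids),
        "SURROGATE_FILE": surrogate_file,     # includes the .txt extension
        "WRITE_HEADER": "YES",
        "WRITE_SRG_NUMERATOR": "YES",         # extra column for quality assurance
        "WRITE_SRG_DENOMINATOR": "YES",       # extra column for quality assurance
        "WRITE_QASUM": "YES",                 # running sum of fractions
        "DENOMINATOR_THRESHOLD": "0.00001",   # the documented default
    }


settings = surrogate_output_settings(
    ["POP2000", "HOUSEHOLDS"],   # hypothetical attribute names
    [100, 7],                    # hypothetical IDs, in the same order
    "output/srg_example.txt",    # hypothetical path
)
for name, value in settings.items():
    print(f"{name}={value}")
```

The resulting dictionary can be merged into the environment passed to srgcreate.exe along with the input and projection variables.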
3.4.5 Map Projection and GRID Specification Variables
The following variables are used by srgcreate.exe and allocator.exe to specify information relating to the map projections of surrogate input and output files. Note: when the name of a grid is specified for a map projection variable, the map projection information is obtained from that of the corresponding grid in the GRIDDESC file. For more information on the specification of map projections and ellipsoids, see Section 3.5: Specifying Grids, Ellipsoids, and Map Projections.
The following variables are used by srgcreate.exe only:
- DATA_FILE_ELLIPSOID - PROJ.4 ellipsoid for the data (e.g., county) polygons. Users must set this variable. It can be set to "+a=6370997.0,+b=6370997.0" for a sphere with R=6370997.0 m, "+datum=NAD83" for the GRS80 ellipsoid, or another value.
- WEIGHT_FILE_ELLIPSOID - PROJ.4 ellipsoid for the weight shapes.
- DATA_FILE_MAP_PRJN - The name of a grid or a list of PROJ.4 map projection parameters for the data (e.g., county) polygons. It can be a geographic or projected coordinate system defined by +proj=latlong, +proj=lcc, or another projection.
- WEIGHT_FILE_MAP_PRJN - The name of a grid or a list of PROJ.4 map projection parameters for the weight shapes.
The following variables are used by both allocator.exe and srgcreate.exe:
- GRIDDESC - (Optional) Directory and file name of file containing the grid descriptions (i.e. the GRIDDESC file).
- OUTPUT_GRID_NAME - The name of the output grid for surrogate processing or spatial allocation (this must exist as a grid name in the GRIDDESC file). This variable is required by srgcreate.exe. It is also required by allocator.exe when the output file type is IoapiFile or RegularGrid.
- OUTPUT_POLY_FILE - When OUTPUT_FILE_TYPE is Polygon, specifies a shapefile. When OUTPUT_FILE_TYPE is EGrid, specifies an ArcGIS polygon text file.
- OUTPUT_POLY_ATTR - When OUTPUT_FILE_TYPE is Polygon, specifies an attribute of the shapefile defined by OUTPUT_POLY_FILE.
- OUTPUT_FILE_MAP_PRJN - The name of a grid or a list of PROJ.4 map projection parameters for the output shapes. For srgcreate.exe, this is the output map projection when OUTPUT_FILE_TYPE is Polygon. It is not used when the output file type is RegularGrid, IoapiFile, or EGrid, because the map projection is then looked up in the GRIDDESC file for the grid specified by OUTPUT_GRID_NAME.
- OUTPUT_FILE_ELLIPSOID - PROJ.4 ellipsoid for the output shapes. It can be set to "+a=6370997.0,+b=6370997.0" for a sphere with R=6370997.0 m, "+datum=NAD83" for the GRS80 ellipsoid, or another value.
- USE_CURVED_LINES - (Optional) Set to YES to compute length of lines as a curve over the Earth's surface, as MapInfo does (default is NO – i.e., length = sqrt(a^2 + b^2))
The following variables are used by allocator.exe:
- INPUT_GRID_NAME – (Optional) Name of input grid (when INPUT_FILE_TYPE is RegularGrid)
- INPUT_FILE_ELLIPSOID - PROJ.4 ellipsoid for the input shapes. It can be set to "+a=6370997.0,+b=6370997.0" for a sphere with R=6370997.0 m, "+datum=NAD83" for the GRS80 ellipsoid, or another value.
- INPUT_FILE_MAP_PRJN - The name of a grid or a list of PROJ.4 map projection parameters for the input shapes. Note that this is not used when the input file is an I/O API file.
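The rule that grid-based outputs take their projection from GRIDDESC, while polygon outputs need an explicit projection, can be sketched as follows. This is an illustrative assumption about how a driver script might organize the settings for allocator.exe, not behavior enforced by the programs in this form. The helper name, the grid name EXAMPLE_GRID, and the GRIDDESC path are hypothetical.

```python
# Output types whose projection is looked up in GRIDDESC via OUTPUT_GRID_NAME.
GRID_TYPES = {"RegularGrid", "IoapiFile", "EGrid"}


def output_projection_settings(output_type, grid_name=None, griddesc=None,
                               map_prjn=None, ellipsoid=None):
    """Choose the projection-related output variables for a given output type.

    Hypothetical helper, for illustration only.
    """
    settings = {"OUTPUT_FILE_TYPE": output_type}
    if output_type in GRID_TYPES:
        if not (grid_name and griddesc):
            raise ValueError("grid outputs need OUTPUT_GRID_NAME and GRIDDESC")
        settings["OUTPUT_GRID_NAME"] = grid_name
        settings["GRIDDESC"] = griddesc
    else:
        if not map_prjn:
            raise ValueError("non-grid outputs need OUTPUT_FILE_MAP_PRJN")
        settings["OUTPUT_FILE_MAP_PRJN"] = map_prjn
    if ellipsoid:
        settings["OUTPUT_FILE_ELLIPSOID"] = ellipsoid
    return settings


grid_out = output_projection_settings(
    "RegularGrid",
    grid_name="EXAMPLE_GRID",               # must exist in the GRIDDESC file
    griddesc="/path/to/GRIDDESC.txt",       # placeholder path
    ellipsoid="+a=6370997.0,+b=6370997.0",  # sphere, as described above
)
poly_out = output_projection_settings(
    "Polygon",
    map_prjn="+proj=latlong",
    ellipsoid="+datum=NAD83",
)
print(grid_out)
print(poly_out)
```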
3.4.6 Variables Specifying Input and Output File Characteristics
The following variables are used by allocator.exe when run in various modes to specify information about input and output files.
- INPUT_FILE_TYPE – Shapefile, PointFile, IoapiFile, or RegularGrid (i.e., a special type of shapefile that contains gridded data, such as that output when creating surrogates)
- INPUT_FILE_NAME – Name of the file containing input data for spatial allocation. If the INPUT_FILE_TYPE is Shapefile or RegularGrid, the name does not include the extension; otherwise, it should include the extension.
- INPUT_FILE_DELIM – The delimiter that is used for the input PointFile (when INPUT_FILE_TYPE is PointFile); valid arguments are COMMA, PIPE, SPACE, and SEMICOLON
- INPUT_FILE_XCOL – The name of the column containing x coordinates in a PointFile (when INPUT_FILE_TYPE=PointFile)
- INPUT_FILE_YCOL – The name of the column containing y coordinates in a PointFile (when INPUT_FILE_TYPE=PointFile)
- INPUT_GRID_NAME – Name of the input grid (when INPUT_FILE_TYPE=RegularGrid)
- OUTPUT_FILE_TYPE - Three types are currently supported: RegularGrid, EGrid, and Polygon.
- OUTPUT_FILE_NAME - Directory and name of output file (without extension when OUTPUT_FILE_TYPE is ShapeFile or RegularGrid; with extension when OUTPUT_FILE_TYPE is IoapiFile).
- WRITE_HEADER – Specifies whether to write a header line to give the names of the output attributes (set to Y or N).
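The extension rule for INPUT_FILE_NAME and the extra variables needed for a PointFile are easy to get wrong by hand. The sketch below shows one possible way to derive them from a file path. The helper name, file paths, and column names are hypothetical.

```python
import os

# Delimiter keywords accepted for INPUT_FILE_DELIM, as listed above.
VALID_DELIMS = {"COMMA", "PIPE", "SPACE", "SEMICOLON"}


def input_file_settings(path, file_type, delim=None, xcol=None, ycol=None):
    """Build the input-file variables for allocator.exe.

    Hypothetical helper, for illustration only.
    """
    settings = {"INPUT_FILE_TYPE": file_type}
    if file_type in ("Shapefile", "RegularGrid"):
        # These types take the base name without the extension.
        path = os.path.splitext(path)[0]
    settings["INPUT_FILE_NAME"] = path
    if file_type == "PointFile":
        if delim not in VALID_DELIMS:
            raise ValueError(f"INPUT_FILE_DELIM must be one of {sorted(VALID_DELIMS)}")
        if not (xcol and ycol):
            raise ValueError("a PointFile needs INPUT_FILE_XCOL and INPUT_FILE_YCOL")
        settings["INPUT_FILE_DELIM"] = delim
        settings["INPUT_FILE_XCOL"] = xcol
        settings["INPUT_FILE_YCOL"] = ycol
    return settings


shp = input_file_settings("data/counties.shp", "Shapefile")
pts = input_file_settings("data/ports.csv", "PointFile",
                          delim="COMMA", xcol="LONGITUDE", ycol="LATITUDE")
print(shp)
print(pts)
```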
3.4.7 Overlay Mode Specific Variables
The following variables are used by allocator.exe when run in OVERLAY mode to specify information relating to the overlay shape.
- OVERLAY_TYPE - Specifies the type of shape that will be used as the overlay mask. Valid values are RegularGrid, ShapeFile, PolygonFile, or BoundingBox.
- OVERLAY_SHAPE - Specifies the shape that will be used as the overlay mask, based on the value of OVERLAY_TYPE. This variable can contain either a grid name, file name or, in the case of a BoundingBox, a set of coordinates.
- If OVERLAY_TYPE is RegularGrid, specify the name of a grid (when used, this requires GRIDDESC to be set)
- If OVERLAY_TYPE is ShapeFile, specify the name of a shapefile (note that the OVERLAY region is the union of all of the polygons in the shapefile).
- If OVERLAY_TYPE is BoundingBox, specify as coordinates: x1,y1,x2,y2
- If OVERLAY_TYPE is PolygonFile, specify the name of an ASCII polygon file. Each line of the file should contain two values with the following format:
xcoord ycoord
Note that the polygon will automatically be closed, and the points should be specified in a clockwise manner so that they are not interpreted as a hole.
- OVERLAY_MAP_PRJN – The map projection for the OVERLAY_SHAPE.
- OVERLAY_ELLIPSOID – The ellipsoid for the OVERLAY_SHAPE.
- OVERLAY_OUT_TYPE – Set to Stdout or DelimitedFile (Shapefile output may be supported in the future).
- OVERLAY_OUT_NAME – Set to Stdout or the name of the output file to create.
- OVERLAY_OUT_CELLID - If set to YES, OVERLAY mode will output the grid, EGrid, or polygon ID from the output file.
- OVERLAY_ATTRS – The list of attributes to read from the INPUT_FILE_NAME and print to standard output or to the OUTPUT_FILE_NAME. Set to ALL or use a comma-separated list of attribute names for which values will be printed.
- OVERLAY_OUT_DELIM – A constant that specifies the type of delimiter to use for the DelimitedFile output type - valid values are COMMA, PIPE, SPACE, and SEMICOLON. (Note that a PointFile is a special case of DelimitedFile – but DelimitedFile is used here because the output file does not need to be a PointFile since the shapes may not be points)
- MAX_INPUT_FILE_SHAPES - Currently supported only when using OVERLAY mode, this variable specifies the maximum number of output polygons to keep in memory for processing at one time (used when the OUTPUT_FILE_TYPE is Shapefile)
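For the PolygonFile overlay type, the points must be written one "xcoord ycoord" pair per line in clockwise order. The sketch below writes such a file and uses the shoelace formula to reverse a ring that was supplied counter-clockwise. It assumes ordinary Cartesian axes (x increasing to the right, y increasing upward). The helper names, file names, and the remaining overlay settings are hypothetical examples.

```python
import os
import tempfile


def signed_area(points):
    """Shoelace formula: positive for counter-clockwise rings, negative for clockwise."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def write_polygon_file(path, points):
    """Write one 'xcoord ycoord' pair per line in clockwise order; return the points written."""
    points = list(points)
    if signed_area(points) > 0:
        points.reverse()  # counter-clockwise input would be read as a hole
    with open(path, "w") as f:
        for x, y in points:
            f.write(f"{x} {y}\n")
    return points


# Unit square given counter-clockwise; the ring is not repeated-closed
# because the polygon is closed automatically.
ring = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
poly_path = os.path.join(tempfile.mkdtemp(), "overlay_poly.txt")
written = write_polygon_file(poly_path, ring)

overlay_settings = {
    "MIMS_PROCESSING": "OVERLAY",
    "OVERLAY_TYPE": "PolygonFile",
    "OVERLAY_SHAPE": poly_path,
    "OVERLAY_MAP_PRJN": "+proj=latlong",
    "OVERLAY_ELLIPSOID": "+datum=NAD83",
    "OVERLAY_ATTRS": "ALL",
    "OVERLAY_OUT_TYPE": "DelimitedFile",
    "OVERLAY_OUT_NAME": "overlay_result.csv",  # hypothetical output name
    "OVERLAY_OUT_DELIM": "COMMA",
}
print(written)
```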
3.4.8 Allocate Mode-Specific Variables
The following variables are used by allocator.exe when run in ALLOCATE mode to specify information relating to the allocation process.
- ALLOCATE_ATTRS – The attributes in the input file to be allocated. Set to ALL to allocate all attributes, or set to a comma separated list of attribute names.
- ALLOC_MODE_FILE – The name of the allocation mode file, which specifies how the attributes in the input file should be allocated (e.g., aggregate, average, discrete overlap, or discrete centroid).
- ALLOC_ATTR_TYPE - Used to specify SURF_ZONE if creating a CMAQ OCEANfile.
- OUTPUT_POLY_FILE – The name of a shapefile that specifies the geometry of the output polygons when OUTPUT_FILE_TYPE is Shapefile (note that the shapes must be polygons; points or lines are not allowed)
- OUTPUT_POLY_TYPE – The type of the file specifying the geometry of the output polygons. Currently, only Shapefile is supported.
- OUTPUT_POLY_ATTRS – A list of attributes to be carried over from the OUTPUT_POLY_FILE to the output file.
- OUTPUT_POLY_ELLIPSOID – The ellipsoid of the shapes in OUTPUT_POLY_FILE
- OUTPUT_POLY_MAP_PRJN (formerly POLY_DATA_MAP_PRJN) – The map projection of the shapes in OUTPUT_POLY_FILE
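Putting several of the preceding groups together, an ALLOCATE run from a county shapefile to a grid might be configured roughly as follows. This is a sketch under stated assumptions rather than a tested recipe: the helper names, file paths, grid name, attribute names, and the projection values chosen for the input shapefile are all hypothetical.

```python
import os
import subprocess


def allocate_to_grid_env(input_shapefile, attrs, mode_file,
                         grid_name, griddesc, output_name):
    """Return an environment for allocating shapefile attributes to a grid.

    Hypothetical helper, for illustration only. Pass attrs=None to allocate ALL.
    """
    env = dict(os.environ)
    env.update({
        "MIMS_PROCESSING": "ALLOCATE",
        "INPUT_FILE_TYPE": "Shapefile",
        "INPUT_FILE_NAME": input_shapefile,      # base name, no .shp extension
        "INPUT_FILE_MAP_PRJN": "+proj=latlong",  # assumed projection of the input
        "INPUT_FILE_ELLIPSOID": "+datum=NAD83",  # assumed ellipsoid of the input
        "ALLOCATE_ATTRS": "ALL" if attrs is None else ",".join(attrs),
        "ALLOC_MODE_FILE": mode_file,
        "OUTPUT_FILE_TYPE": "RegularGrid",
        "OUTPUT_GRID_NAME": grid_name,           # must exist in the GRIDDESC file
        "GRIDDESC": griddesc,
        "OUTPUT_FILE_NAME": output_name,         # no extension for RegularGrid
    })
    return env


def run_allocator(exe, env):
    """Run the allocator executable; raises CalledProcessError on a non-zero exit."""
    subprocess.run([exe], env=env, check=True)


env = allocate_to_grid_env(
    "data/county_pop",                 # hypothetical input shapefile
    ["POP2000", "HOUSEHOLDS"],         # hypothetical attribute names
    "data/alloc_modes.txt",            # hypothetical allocation mode file
    "EXAMPLE_GRID",
    "/path/to/GRIDDESC.txt",
    "output/gridded_pop",
)
print(env["ALLOCATE_ATTRS"])
# run_allocator("/path/to/allocator.exe", env)  # placeholder path
```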