5.4. Reports Created by Smkreport

As described in Section 5.2, “Smkreport Program”, the Smkreport program creates user-defined reports and summaries for QA purposes based on the SMOKE intermediate files (e.g., SMOKE inventory files, hourly emissions, speciation matrix, gridding matrix). Because the ASCII output format is delimited, the reports that Smkreport creates can easily be imported into a spreadsheet program or a database. Note that older versions of Excel (before Excel 2007) cannot import report files longer than 65,536 lines, which is the worksheet row limit of those versions.

Because Smkreport is very flexible, there are a large number of reports that you can generate. The “File Format(s)” section below describes the guidelines that Smkreport uses in building the output reports, based on the instructions the user provides to Smkreport. This section is followed by a large number of example reports, to help you understand the possible format options.

5.4.1. File format(s)

5.4.1.1. Multiple reports in one file

Each file output by Smkreport can contain multiple reports. The user has control over how many reports are included in each file (please see the REPCONFIG file documentation above for more information on using the /CREATE REPORT/ and /NEWFILE/ packets). The reports in a file are separated by a row of pound signs (#). If having multiple reports in a single file causes problems for importing the data into other software, then you should create a REPCONFIG file that specifies a new file for each report.
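If creating one file per report is not convenient, a multi-report file can also be split after the fact. The following is a minimal Python sketch, not part of SMOKE. It assumes that a separator is any line consisting only of pound signs and at least 10 characters long; the function name and the length threshold are illustrative choices.

```python
def split_reports(text, min_rule_length=10):
    """Split the text of one Smkreport output file into individual reports.

    Assumption: a separator line contains only '#' characters and is at
    least `min_rule_length` characters long (threshold chosen for this sketch).
    """
    reports, current = [], []

    def flush():
        # Keep the accumulated lines only if they contain some text.
        if any(line.strip() for line in current):
            reports.append("\n".join(current).strip("\n"))
        current.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if len(stripped) >= min_rule_length and set(stripped) == {"#"}:
            flush()
        else:
            current.append(line)
    flush()
    return reports
```

Each returned string can then be written to its own file or passed to further parsing.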

5.4.1.2. Basic report structure

The reports in an output file use columns and rows to structure the information. The reports are structured to permit the user to import a report into spreadsheet or database software, or some other data-processing tool for further analysis. The columns in the reports are semicolon delimited (though the delimiter can be set by the user), have labels, and have units. Where possible, we have also tried to make sure that the text versions of the reports can be viewed easily. The width of the widest entry in a column is used to set the width for the column, thereby making fixed-file format transfer to other software possible.

5.4.1.3. Advanced report structures

Using the ARRANGE instruction in the Smkreport REPCONFIG input file, you can also create reports with more advanced structures: a single report that is composed of multiple physical files; a report that is a single physical file but has multiple sections; or a report that uses a “database” format of one data value (e.g., emissions value) per line. Use of these formats is described in the REPCONFIG file documentation in Section 5.3, “REPCONFIG Input File”.
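To illustrate what “one data value per line” means, the sketch below reshapes an ordinary column-per-pollutant table into that shape in Python. It is only an illustration of the idea: the exact layout that ARRANGE produces is defined in Section 5.3, and the function name, column names, and values here are made up.

```python
def to_one_value_per_line(headers, rows, key_columns):
    """Reshape wide rows (one column per pollutant) into one value per line.

    `key_columns` names the columns that identify a row (e.g., "State");
    every other column becomes a (name, value) pair on its own output line.
    """
    key_idx = [headers.index(name) for name in key_columns]
    long_rows = []
    for row in rows:
        keys = [row[i] for i in key_idx]
        for i, name in enumerate(headers):
            if i not in key_idx:
                long_rows.append(keys + [name, row[i]])
    return long_rows


# Illustrative data only.
print(to_one_value_per_line(["State", "NOX", "VOC"], [["37", "1.0", "2.0"]], ["State"]))
```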

5.4.1.4. Structure template

The following is a general template for the Smkreport output files. It consists of a REPORTS section and a METADATA section. These two sections are discussed in detail below the template.

<REPORT1>
     <User Title(s), one per line>
     <Automatic Titles, one per line>
     <Column headers>
     <Units, for columns that need them>
     <Line of minus signs>
     <Data values for all rows>
##########################################################################
<REPORT2, if any>
     <User Title(s), one per line>
     <Automatic Titles, one per line>
     <Column headers>
     <Units, for columns that need them>
     <Line of minus signs>
     <Data values for all rows>
##########################################################################
METADATA
     <Input files>
     <Echoed group definitions from REPCONFIG file>
     <Echoed report instructions from REPCONFIG file>
<End of file>
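The template above can be read back programmatically by using the line of minus signs as a landmark. The following Python sketch is one way to do this, under several assumptions that the template does not guarantee: the units line is always present, the line of minus signs contains only minus signs, and no field contains the delimiter. The function name and the sample text are illustrative.

```python
def parse_report(report_text, delimiter=";"):
    """Split one report into titles, column headers, units, and data rows.

    Assumed layout: title lines, one header line, one units line,
    one line of minus signs, then the data rows.
    """
    lines = [ln for ln in report_text.splitlines() if ln.strip()]
    # Find the first line made up only of minus signs.
    rule = next((i for i, ln in enumerate(lines) if set(ln.strip()) == {"-"}), None)
    if rule is None or rule < 2:
        raise ValueError("no header/units/minus-sign block found")

    def split(line):
        return [field.strip() for field in line.split(delimiter)]

    return {
        "titles": [ln.strip() for ln in lines[: rule - 2]],
        "headers": split(lines[rule - 2]),
        "units": split(lines[rule - 1]),
        "rows": [split(ln) for ln in lines[rule + 1 :]],
    }
```

Quoted description fields that contain the delimiter would need a real CSV parser instead of the simple split used here.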

5.4.1.5. Elements of the REPORTS section

5.4.1.5.1. User titles

The first section of each report is a title section. The first title entries in the report are any titles specified in the /CREATE REPORT/ packets in the REPCONFIG file, created by the user. They are listed in the same order as those provided in the input packets for the report. It is recommended that user titles be no more than 80 characters wide (to more easily fit on printed documents), although they can have up to 300 characters.

5.4.1.5.2. Automatic titles

Automatic titles are inserted by Smkreport to indicate the operations that were done by Smkreport to generate the requested report. This helps the report be self-describing. The automatic titles are formatted to be no larger than 80 characters wide, in order to permit easier viewing as text files. The following automatic titles are applied, where applicable:

  • The source category processed

  • The year of the base inventory

  • The year of the grown inventory (only if a grown inventory file is input to Smkreport)

  • Whether growth factors were applied, and if so, for what year

  • Whether multiplicative control factors were applied, and if so, for what year

  • Whether a gridding matrix was applied, and if so, the grid name

  • Whether a speciation matrix was applied, and if so, whether it was mole-based or mass-based

  • Whether annual, average-day, or hourly data from SMOKE intermediate files are the basis of the emissions in the report

  • If hourly data were input, the time period processed with dates, hours, and time zones

  • Whether annual or average-day data were the basis for the report

  • Whether the emissions are divided by grid cell area (NORMALIZE CELLAREA instruction)

  • Whether the emissions are divided by population (NORMALIZE POPULATION instruction)

  • The name of the group used to select the data, if any (the group definition will be listed in the METADATA section)

  • The name of the subgrid used to select the data, if any (the subgrid definition will be listed in the METADATA section)

5.4.1.5.3. Data columns written

The particular columns of data that are written to a report are determined by the BY instructions in the /CREATE REPORT/ packet. Typically, one column is created for each BY instruction, which corresponds to the name of the BY instruction. The exceptions to this rule are the following:

  • The BY COUNTRY, BY STATE, and BY COUNTY instructions, which output the country, state, and county codes, and optionally the country, state, and county names, respectively (if NAME is included at the end of the instruction).

  • The BY SOURCE instruction, which outputs columns for all source characteristics (point sources: country/state/county, plant, char1, char2, char3, char4, char5, SIC; area sources: country/state/county, SCC; mobile sources: country/state/county, road class, vehicle type, SCC). For point sources, the BY SOURCE instruction can also output four additional columns for stack parameters, and another additional column for facility name.

In addition to the columns specified by the BY instructions, the emissions and/or activity data are written out to the report. If no speciation option is used (i.e., the SPECIATION instruction is not given), then the other data columns are the emissions and activities from the SMOKE inventory file (e.g., NOX, VOC, PM10, VMT). If the speciation option is used, then the other data columns are the speciated emissions data (e.g., NO, NO2, ALD, PAR, ISOP). The SELECT DATA instruction determines which inventory data and speciated emissions data are written to the reports. When speciated data are output, the columns for the corresponding inventory pollutants are written as well (as explained more fully in the input file description of the SELECT DATA instruction, earlier).

The order of the columns of data values is taken from the order of the data in the SMOKE intermediate files. The order of pollutants and activities is controlled by the INVTABLE file when Smkinven is run to create the SMOKE inventory files. Spcmat arranges species by the order of the pollutants, and then alphabetically for species that come from the same pollutant; this order is used to create the speciation matrix.

5.4.1.5.4. Column delimiters

The default column delimiter in most cases is a semicolon (;). However, if you select the “BY SCC10 NAME” or “BY SIC NAME” instruction, the SCC or SIC description will be included in the report and the default delimiter will be a pipe symbol (|). This is because the SCC and SIC descriptions can include semicolons. In either case, you can use the /DELIMITER/ packet to manually set your choice of delimiter, if desired.
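The delimiter rule can be summarized in a few lines of Python. This is a sketch of the rule as stated above, not Smkreport's code; the function names and the sample line are hypothetical. The second function shows why the pipe default matters: a quoted description may itself contain semicolons, so the reader must split on the pipe and honor the quotes.

```python
import csv


def pick_delimiter(by_instructions, user_delimiter=None):
    """Return the delimiter a report would use under the rule described above."""
    if user_delimiter is not None:  # a /DELIMITER/ packet overrides the default
        return user_delimiter
    with_description = {"BY SCC10 NAME", "BY SIC NAME"}
    if any(instr.strip().upper() in with_description for instr in by_instructions):
        return "|"
    return ";"


def read_fields(line, delimiter):
    """Split one report line, keeping quoted descriptions intact."""
    reader = csv.reader([line], delimiter=delimiter, quotechar='"', skipinitialspace=True)
    return [field.strip() for field in next(reader)]


# Illustrative line; the SCC and description are made-up sample values.
line = '0010100101|"External Combustion Boilers; Electric Generation"|12.5'
print(read_fields(line, pick_delimiter(["BY SCC10 NAME"])))
```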

5.4.1.5.5. Column headers

Each column of data is labeled with a header name. Where applicable, the columns also are labeled with units (discussed below), as set by the UNITS instruction or by the default units for the data used by the program. For the emissions and activity data, the header names are simply the names of the pollutants or activity variables input to SMOKE. For columns generated by the BY instructions, the following possible column headers are used by SMOKE. These are listed in the order in which they would appear.

  • Date (values of MM/DD/YYYY)
  • Hour (integer values from 1 through 24)
  • Layer number (integer values >0)
  • X cell (integer values >0)
  • Y cell (integer values >0)
  • Source ID (SMOKE record number, integer values >0)
  • Co/St/Cy (Country, state, and county code; values of NSSCCC, with SS and/or CCC replaced by zeros as necessary, depending on the BY option used)
  • Country (value is name; optional when country, state, and county code is output)
  • State (value is name; optional when country, state, and county code is output)
  • County (value is name; optional when country, state, and county code is output)
  • SCC (10-character values, with 2 leading zeros for 8-character SCCs)
  • SIC (4-character values)
  • NAICS (6-character values)
  • MACT (6-character values)
  • ORIS ID (6-character values)
  • Primary Srg (primary surrogate ID code; for area or mobile sources only)
  • Fallbk Srg (fallback surrogate ID code; for area or mobile sources only)
  • Monthly Prf (monthly temporal profile code)
  • Weekly Prf (weekly temporal profile code)
  • Diurnal Prf (diurnal temporal profile code)
  • Spec Prf (speciation profile code)
  • Plant ID (for ORL) or Facility ID (for FF10)
  • Char 1 point-source stack ID (for ORL) or ?? (for FF10)
  • Char 2 Stack ID (for ORL) or Release Point ID (for FF10)
  • Char 3 Segment ID (for ORL) or Process ID (for FF10)
  • Char 4 (for point sources only; may or may not be defined)
  • Char 5 (for point sources only; may or may not be defined)
  • Stk Ht (stack height; for point sources only)
  • Stk Dm (stack diameter; for point sources only)
  • Stk Tmp (stack exit temperature; for point sources only)
  • Stk Vel (stack exit velocity; for point sources only)
  • Elevstat (E= elevated, P= Plume-in-grid, L= low-level; for point sources only)
  • Plt Name (facility name with quotes around it; for point sources only)
  • SCC Description (in quotes)
  • SIC Description (in quotes)
  • NAICS Description (in quotes)
  • MACT Description (in quotes)
  • ORIS Description (in quotes)
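Two of the coded columns above have fixed layouts that are easy to damage when a report is imported as numbers: the Co/St/Cy code (NSSCCC) and the 10-character SCC. The Python sketch below splits the former and restores the leading zeros of the latter. The function names are illustrative, and it assumes that any missing characters are leading zeros dropped by a numeric import.

```python
def split_region_code(code):
    """Split an NSSCCC code into (country, state, county) strings."""
    text = str(code).strip().zfill(6)  # restore leading zeros lost on import
    if len(text) != 6 or not text.isdigit():
        raise ValueError(f"not an NSSCCC code: {code!r}")
    return text[0], text[1:3], text[3:]


def pad_scc(scc):
    """Left-pad an SCC to 10 characters (8-character SCCs get 2 leading zeros)."""
    return str(scc).strip().rjust(10, "0")


# Sample values for illustration only.
print(split_region_code("037001"))
print(pad_scc("10100101"))
```

A state or county part of all zeros indicates that the row was aggregated above that level, as noted in the Co/St/Cy entry.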

5.4.1.5.6. Header units

The units used in the header for each column depend on (1) the units in the input files, (2) the units set as output units in the report instructions, and (3) any normalization used. Smkreport starts with the units obtained from the input files and from the application of any matrices of interest. It then makes every unit conversion that it can. The conversions are done separately on the numerator and the denominator; if Smkreport has not been configured for one or both of the requested conversions, that conversion is skipped and a note to this effect appears in the log file. Last, the units are converted to 1/m2 if the NORMALIZE CELLAREA instruction has been given.

Below the header units, a line of minus signs will appear in the reports.
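The separate handling of numerator and denominator can be pictured with the following Python sketch. It is not Smkreport's implementation: the conversion tables are hypothetical and hold only two example factors (short tons to kilograms, and a 365-day year), and the returned notes stand in for the log-file messages.

```python
# Hypothetical factor tables for this sketch; Smkreport's own tables differ.
NUMERATOR_FACTORS = {("tons", "kg"): 907.18474}  # 1 short ton = 907.18474 kg
DENOMINATOR_FACTORS = {("yr", "day"): 365.0}     # assumes a 365-day year


def convert_units(value, from_units, to_units):
    """Convert numerator and denominator independently, skipping unknown ones."""
    from_num, from_den = from_units.split("/")
    to_num, to_den = to_units.split("/")
    out_num, out_den, notes = from_num, from_den, []

    if from_num != to_num:
        factor = NUMERATOR_FACTORS.get((from_num, to_num))
        if factor is None:
            notes.append(f"numerator conversion {from_num} -> {to_num} not configured")
        else:
            value, out_num = value * factor, to_num

    if from_den != to_den:
        factor = DENOMINATOR_FACTORS.get((from_den, to_den))
        if factor is None:
            notes.append(f"denominator conversion {from_den} -> {to_den} not configured")
        else:
            value, out_den = value / factor, to_den

    return value, f"{out_num}/{out_den}", notes
```

When a conversion is unavailable, the value keeps its original units for that part, which is why the header units in a report can differ from the units requested.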

5.4.1.5.7. Data rows

The rows of data (i.e., emissions totals or activity totals) are listed sequentially, without blank lines between the rows, in sorted order according to the BY instructions and based on the order listed in the file (explained above). Smkreport does not sort by data value; that is left to a postprocessing step in other data-manipulation software. Unlike the title rows, the length of the data rows is constrained only by the maximum number of characters per ASCII line (up to 3500), which varies with the platform you are using. The ARRANGE instruction can be used to improve the arrangement of data rows when the lines are too long or when the program is not working properly because of the large number of data values requested in a report.
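Sorting by data value is therefore a postprocessing task. As a minimal sketch, the Python function below sorts already-parsed rows by one numeric column; the function name, column names, and values are illustrative, and it assumes every value in that column can be read as a floating-point number (including exponent notation).

```python
def sort_rows_by_value(headers, rows, column, descending=True):
    """Return the rows sorted by the numeric value in the named column."""
    idx = headers.index(column)
    return sorted(rows, key=lambda row: float(row[idx]), reverse=descending)


# Illustrative data only.
headers = ["SCC", "NOX"]
rows = [["a", "1.5"], ["b", "1.2E+01"], ["c", "0.3"]]
for row in sort_rows_by_value(headers, rows, "NOX"):
    print(row)
```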

The data values in the rows are formatted based on the NUMBER instruction for the report.

The resolution of the data values (e.g., summed by SCC) depends on the BY instructions for the report. The inventory records included in the totals (i.e., which sources are included in the reported emissions) depend on the SELECT instructions for the report. More description of these instructions and how they affect the output is provided with the REPCONFIG file format information given earlier.

5.4.1.6. Elements of the METADATA section

Note: this section is not yet implemented in SMOKE. Once implemented, this section of the reports will begin with a line containing the word METADATA.

5.4.1.6.1. Input files

In this part of the METADATA section, all of the input files used in running the program are provided, one on each line. The full path and file name will be provided.

5.4.1.6.2. Echoed group definitions

The group definitions from the REPCONFIG file used to generate the file will be listed in this section. The definitions will be merely copied directly from the input file to this section of the output report.

5.4.1.6.3. Echoed report instructions

The report instructions from the REPCONFIG file used for creating each report in the file will be listed in this section. The instructions will be merely copied directly from the input file.