As described in Section 5.2, “Smkreport Program”, the Smkreport program creates user-defined reports and summaries for QA purposes based on the SMOKE intermediate files (e.g., SMOKE inventory files, hourly emissions, speciation matrix, gridding matrix). The reports that are created by Smkreport can easily be imported into a spreadsheet program or into a database because the ASCII output format is delimited. Note that report files larger than about 65,000 lines will not import into older versions of Excel, which are limited to 65,536 rows per worksheet.
Because Smkreport is very flexible, there are a large number of reports that you can generate. The “File Format(s)” section below describes the guidelines that Smkreport uses in building the output reports, based on the instructions the user provides to Smkreport. This section is followed by a large number of example reports, to help you understand the possible format options.
Each file output by Smkreport can contain multiple reports. The user has control over how many reports are included in each file (please see the REPCONFIG file documentation above for more information on using the /CREATE REPORT/ and /NEWFILE/ packets). The reports in a file are separated by a row of pound signs (#). If having multiple reports in a single file causes problems for importing the data into other software, then you should create a REPCONFIG file that specifies a new file for each report.
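If a multi-report file has already been generated, the reports can also be separated after the fact. The following Python sketch is not part of SMOKE; it assumes that a separator is any line consisting only of pound signs, and the function name `split_reports` is hypothetical.

```python
import re

# Assumption: a separator row contains nothing but '#' characters.
SEPARATOR = re.compile(r"^#+\s*$")


def split_reports(text):
    """Split the text of one Smkreport output file into report blocks."""
    blocks, current = [], []
    for line in text.splitlines():
        if SEPARATOR.match(line):
            # A separator closes the block collected so far.
            if current:
                blocks.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep any trailing text that is not followed by a separator.
    if current:
        blocks.append("\n".join(current))
    return blocks
```

Each returned block can then be written to its own file or passed to other tools one report at a time.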
The reports in an output file use columns and rows to structure the information. The reports are structured to permit the user to import a report into spreadsheet or database software, or some other data-processing tool for further analysis. The columns in the reports are semicolon delimited (though the delimiter can be set by the user), have labels, and have units. Where possible, we have also tried to make sure that the text versions of the reports can be viewed easily. The width of each column is set by the widest entry in that column, which also makes it possible to transfer the reports to other software as fixed-format files.
Using the ARRANGE instruction in the Smkreport REPCONFIG input file, you can also create reports with more advanced structures: a single report that is composed of multiple physical files; a report that is a single physical file but has multiple sections; or a report that uses a “database” format of one data value (e.g., emissions value) per line. Use of these formats is described in the REPCONFIG file documentation in Section 5.3, “REPCONFIG Input File”.
The following is a general template for the Smkreport output files. It consists of a REPORTS section and a METADATA section. These two sections are discussed in detail below the template.
<REPORT1>
<User Title(s), one per line>
<Automatic Titles, one per line>
<Column headers>
<Units, for columns that need them>
<Line of minus signs>
<Data values for all rows>
##########################################################################
<REPORT2, if any>
<User Title(s), one per line>
<Automatic Titles, one per line>
<Column headers>
<Units, for columns that need them>
<Line of minus signs>
<Data values for all rows>
##########################################################################
METADATA
<Input files>
<Echoed group definitions from REPCONFIG file>
<Echoed report instructions from REPCONFIG file>
<End of file>
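To illustrate how the template can be read programmatically, here is a minimal Python sketch that parses the text of one report (the lines between two rows of pound signs). It is not part of SMOKE. It assumes that the line of minus signs contains only "-" characters, that a non-blank units line is always present directly above it, and that the column headers sit directly above the units. The function name `parse_report` is hypothetical.

```python
def parse_report(block, delimiter=";"):
    """Parse one report block into titles, headers, units, and data rows."""
    lines = [ln for ln in block.splitlines() if ln.strip()]

    # Locate the line of minus signs that separates headers from data.
    rule = next(
        (i for i, ln in enumerate(lines) if set(ln.strip()) == {"-"}), None
    )
    if rule is None or rule < 2:
        raise ValueError("no header/units lines found above a line of minus signs")

    def fields(line):
        # Columns are padded to a common width, so strip each field.
        return [f.strip() for f in line.split(delimiter)]

    return {
        "titles": lines[: rule - 2],        # user titles, then automatic titles
        "headers": fields(lines[rule - 2]),
        "units": fields(lines[rule - 1]),
        "rows": [fields(ln) for ln in lines[rule + 1:]],
    }
```

Anchoring on the line of minus signs avoids having to guess how many title lines a report has, since that number depends on the user titles and on which automatic titles apply.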
The first section of each report is a title section. The first title entries in the report are any titles that the user specified in the /CREATE REPORT/ packets in the REPCONFIG file. They are listed in the same order as they appear in the input packets for the report. It is recommended that user titles be no more than 80 characters wide (to more easily fit on printed documents), although they can have up to 300 characters.
Automatic titles are inserted by Smkreport to indicate the operations that Smkreport performed to generate the requested report. This helps make the report self-describing. The automatic titles are formatted to be no more than 80 characters wide, so that the reports are easier to view as text files. The following automatic titles are applied, where applicable:
The source category processed
The year of the base inventory
The year of the grown inventory (only if a grown inventory file is input to Smkreport)
Whether growth factors were applied, and if so, for what year
Whether multiplicative control factors were applied, and if so, for what year
Whether a gridding matrix was applied, and if so, the grid name
Whether a speciation matrix was applied, and if so, whether it was mole-based or mass-based
Whether annual, average-day, or hourly data from SMOKE intermediate files are the basis of the emissions in the report
If hourly data were input, the time period processed with dates, hours, and time zones
Whether annual or average-day data were the basis for the report
Whether the emissions are divided by grid cell area (NORMALIZE CELLAREA instruction)
Whether the emissions are divided by population (NORMALIZE POPULATION instruction)
The name of the group used to select the data, if any (the group definition will be listed in the METADATA section)
The name of the subgrid used to select the data, if any (the subgrid definition will be listed in the METADATA section)
The particular columns of data that are written to a report are determined by the BY instructions in the /CREATE REPORT/ packet. Typically, one column is created for each BY instruction, which corresponds to the name of the BY instruction. The exceptions to this rule are the following:
The BY COUNTRY, BY STATE, and BY COUNTY instructions, which output the country, state, and county codes, and optionally the country, state, and county names, respectively (if NAME is included at the end of the instruction).
The BY SOURCE instruction, which outputs columns for all source characteristics (point sources: country/state/county, plant, char1, char2, char3, char4, char5, SIC; area sources: country/state/county, SCC; mobile sources: country/state/county, road class, vehicle type, SCC). For point sources, the BY SOURCE instruction can also output four additional columns for stack parameters, and another additional column for facility name.
In addition to the columns specified by the BY instructions, the emissions and/or activity data are written out to the report. If no speciation option is used (i.e., the SPECIATION instruction is not given), then the other data columns are the emissions and activities from the SMOKE inventory file (e.g., NOX, VOC, PM10, VMT). If the speciation option is used, then the other data columns are the speciated emissions data (e.g., NO, NO2, ALD, PAR, ISOP). The SELECT DATA instruction determines which inventory data and speciated emissions data are written to the reports. When speciated data are output, the columns for the corresponding inventory pollutants are written as well (as explained more fully in the input file description of the SELECT DATA instruction, earlier).
The order of the columns of data values is taken from the order of the data in the SMOKE intermediate files. The order of pollutants and activities is controlled by the INVTABLE file when Smkinven is run to create the SMOKE inventory files. Species are arranged by Spcmat based on the order of the pollutants, and then alphabetically for species coming from the same pollutant, and this order is used to create the speciation matrix.
The default column delimiter in most cases is a semicolon (;). However, if you select the “BY SCC10 NAME” or “BY SIC NAME” instruction, the SCC or SIC description will be included in the report and the default delimiter will be a pipe symbol (|). This is because the SCC and SIC descriptions can include semicolons. In either case, you can use the /DELIMITER/ packet to manually set your choice of delimiter, if desired.
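The delimiter rule described above can be summarized in a few lines of Python. This is only an illustrative sketch of the stated rule, not Smkreport's actual logic; the function name `default_delimiter` and its `override` argument (standing in for a /DELIMITER/ packet) are hypothetical.

```python
def default_delimiter(by_instructions, override=None):
    """Return the column delimiter implied by a report's BY instructions."""
    # A user-supplied delimiter takes precedence over the defaults.
    if override is not None:
        return override
    # Instructions that add free-text descriptions, which may contain ';'.
    named = {"BY SCC10 NAME", "BY SIC NAME"}
    normalized = {" ".join(item.upper().split()) for item in by_instructions}
    return "|" if normalized & named else ";"


# A made-up row whose description contains a semicolon still splits cleanly
# into three fields when the pipe delimiter is used.
row = "0000000000|Hypothetical description; with a semicolon|1.5"
fields = row.split(default_delimiter(["BY SCC10 NAME"]))
```

When importing a report, pass the same delimiter to the reading tool; splitting the made-up row above on semicolons would break the description into two fields.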
Each column of data is labeled with a header name. Where applicable, the columns also are labeled with units (discussed below), as set by the UNITS instruction or by the default units for the data used by the program. For the emissions and activity data, the header names are simply the names of the pollutants or activity variables input to SMOKE. For columns generated by the BY instructions, the following possible column headers are used by SMOKE. These are listed in the order in which they would appear.
The units used in the header for each column depend on (1) the units in the input files, (2) the units set as output units in the report instructions, and (3) any normalization used. Smkreport starts with the units obtained from the input files and from the application of any matrices of interest. Then every unit conversion that can be made is made. The conversions are done separately on the numerator and the denominator, but one or both of the conversions will not be done if Smkreport has not been configured for them. If this is the case, a note to this effect will appear in the log file. Last, the units are converted to 1/m2 if the NORMALIZE CELLAREA instruction has been given.
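The idea of converting the numerator and the denominator separately can be sketched as follows. This Python example is an illustration only: the conversion tables, the unit spellings, the note text, and the function name `convert_units` are assumptions and do not reflect the conversions that Smkreport is actually configured for. Cell-area normalization is not included.

```python
# Hypothetical conversion tables (tons are assumed to be short tons).
MASS_TO_KG = {"kg": 1.0, "g": 0.001, "tons": 907.18474}
TIME_TO_HR = {"hr": 1.0, "day": 24.0}


def convert_units(value, from_units, to_units):
    """Convert 'num/den' units, handling numerator and denominator separately.

    Returns (new_value, resulting_units, notes). A part that cannot be
    converted is left unchanged and reported in notes.
    """
    from_num, from_den = from_units.split("/")
    to_num, to_den = to_units.split("/")
    out_num, out_den, notes = from_num, from_den, []

    # Numerator: e.g., tons -> kg multiplies the value.
    if from_num in MASS_TO_KG and to_num in MASS_TO_KG:
        value *= MASS_TO_KG[from_num] / MASS_TO_KG[to_num]
        out_num = to_num
    else:
        notes.append(f"numerator {from_num} -> {to_num} not converted")

    # Denominator: e.g., per hr -> per day multiplies by hours per day.
    if from_den in TIME_TO_HR and to_den in TIME_TO_HR:
        value *= TIME_TO_HR[to_den] / TIME_TO_HR[from_den]
        out_den = to_den
    else:
        notes.append(f"denominator {from_den} -> {to_den} not converted")

    return value, f"{out_num}/{out_den}", notes
```

In this sketch, a request to convert moles/hr to kg/day converts only the denominator, yielding moles/day plus one note, which mirrors the partial-conversion behavior described above.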
Below the header units, a line of minus signs will appear in the reports.
The rows of data (i.e., emissions totals or activity totals) are listed sequentially, without spaces between the rows, in sorted order according to the BY instructions, and based on the order listed in the file (just explained above). Smkreport does not sort by data value; that is left to a postprocessing step in other data-manipulation software. Unlike the title rows, the length of the data rows is constrained only by the maximum number of characters per ASCII line (up to 3500), which varies based on the platform you are using. The ARRANGE instruction can be used to improve the arrangement of data rows when the lines are too long or the program is not working properly because of the large number of data values requested in a report.
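Because sorting by data value is left to postprocessing, a small script can do it once the rows have been read. The following Python sketch assumes the rows are already split into lists of strings that line up with the column headers; the function name `sort_rows_by_column` and the sample values are hypothetical.

```python
def sort_rows_by_column(headers, rows, column, descending=True):
    """Return the rows sorted by the numeric value in the named column."""
    idx = headers.index(column)
    # float() also accepts scientific notation such as '1.5E+03'.
    return sorted(rows, key=lambda row: float(row[idx]), reverse=descending)


# Made-up example: rank three regions by their NOX total, largest first.
headers = ["Region", "NOX"]
rows = [["037001", "1.5"], ["037003", "1.5E+03"], ["037005", "20"]]
ranked = sort_rows_by_column(headers, rows, "NOX")
```

The values stay as strings in the result, so the formatting set by the report is preserved when the sorted rows are written back out.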
The data values in the rows are formatted based on the NUMBER instruction for the report.
The resolution of the data values (e.g., summed by SCC) depends on the BY instructions for the report. The inventory records included in the totals (i.e., which sources are included in the reported emissions) depend on the SELECT instructions for the report. More description of these instructions and how they affect the output is provided with the REPCONFIG file format information given earlier.
Note: this section is not yet implemented in SMOKE. Once implemented, this section of the reports will begin with a line containing the word METADATA.
In this part of the METADATA section, all of the input files used in running the program are provided, one on each line. The full path and file name will be provided.
The group definitions from the REPCONFIG file used to generate the file will be listed in this section. The definitions will simply be copied directly from the input file to this section of the output report.