2.4. Cross-referencing and profiles

The emission inventories described in Section 2.3, “Emission inventories” can contain hundreds of thousands or even millions of sources. Collecting specific information for each source about its temporal allocation, chemical speciation, and spatial allocation is not practical. Therefore, a part of emissions processing involves assuming that many sources share the same factors for these major processing steps. For example, we apply monthly, day-of-week, and hourly temporal factors (called profiles) to convert from an annual emissions value to an hour-specific emissions value. A limited set of monthly, day-of-week, and hourly diurnal profiles are available from various studies, and these profiles each have their own unique profile number (also called profile code or profile ID). This limited set of profiles is assigned to the much more numerous inventory sources using an approach called cross-referencing, which is implemented using cross-reference files.

The cross-reference files assign the profiles based on source characteristics such as country, state, and county codes and/or SCCs, using the profile numbers to associate source characteristics with the profiles. While the profile numbers are unique in the profile files, they will appear many times in the cross-reference; this is how SMOKE is able to group the sources to treat them in the same manner. This approach is used for temporal allocation profiles, chemical speciation profiles and the spatial “profiles”, which are called spatial (or gridding) surrogates.

The cross-reference tables are applied to the sources in a stepwise manner, such that the most specific entry available is always applied. For example, if a cross-reference entry were available that matched a source by state, county, and SCC, SMOKE would apply that entry instead of a different cross-reference entry that matched that source only by SCC. The hierarchy that describes how each cross-reference file is applied to the inventory is described for each program in Chapter 6, SMOKE Core Programs.

Figure 2.1, “Generic example of how cross-reference files and profiles work together” provides a generic example of how cross-reference files and profile files work together. In the example, the profile to be used for most of North Carolina is profile ID 16. Durham and Orange counties, however, are assigned profile 15, which would be preferentially applied to all sources in Durham and Orange counties, instead of using the general North Carolina profile. South Carolina sources would be assigned profile 17.

Figure 2.1. Generic example of how cross-reference files and profiles work together

Generic example of how cross-reference files and profiles work together

This example does not correspond to a particular processing step (i.e., temporal allocation, chemical speciation, or spatial allocation), but rather assigns generic “factors” from profiles 15, 16, and 17 based on the state and county information in the cross-reference file. (Note that we have used the state and county names in this example, whereas real cross-reference files would use the country, state, and county codes according to the file format of the actual cross-reference files.)

SMOKE handles cross-references and profile application in a very efficient manner. In reading a cross-reference file, SMOKE first sorts the cross-reference entries using the same sort criteria as are used for the inventory sources (e.g. by country/state/county code, then by SCC, then by remaining source characteristics if any). Next, the cross-reference entries are grouped according to the “level” of matching of each of the entries. For example, all entries that could match to the inventory using only state and county codes would be in one group, while entries that could match to the inventory using only SCCs would be in another group. Once the cross-reference entries are grouped, SMOKE processes each sources in the inventory, and attempts to find a matching entry in one of the cross-reference groups. The most specific groups are searched first, and when a match is found for a particular source, the other groups are not searched. This helps increase efficiency. In addition, because the cross-reference entries are sorted within each group, an efficient searching algorithm can be used for each individual search. When a match to one of the cross-reference groups has been found, SMOKE continues to the next source in the inventory until all sources have been processed.

Cross-references and profiles are used in the following SMOKE processing steps. These steps and their associated programs (listed in parentheses) will be described in the sections to come.

The hierarchies that each SMOKE program uses to assign cross-reference entries to sources are provided in Chapter 6, SMOKE Core Programs, where the programs are described at length. The file contents and formats are described in more detail in Chapter 8, SMOKE Input Files.

Note: The use of the Environment variable AGPRO (Area spatial surrogate file)and MGPRO (Mobile spatial surrogate file) have been discontinued. Two new Environment variables have been introduced to SMOKE; SRGPRO_PATH (spatial surrogate profile file location) and SRGDESC (description file with the specific list of available surrogates located in SRGPRO_PATH) See Figure 6.4, “Grdmat input and output files”. The surrogate files located in SRGPRO_PATH are refinements of the old [A|M]GPRO files. They are of the same format as the old files, however, there now may be one or more surrogate files. Grdmat now process each surrogate separately. On domains with large cell counts, this approach limits the memory usage at the expense of slightly longer run times.