WRF-CMAQ v5.3.2 Compiler Test

These tests follow the directions in the tutorial at:
https://github.com/kmfoley/CMAQ/blob/v532_20200702/DOCS/Users_Guide/Tutorials/CMAQ_UG_tutorial_WRF-CMAQ_build_gcc.md

I made some modifications to the tutorial. They are located at:

https://github.com/lizadams/CMAQ/blob/patch-3/DOCS/Users_Guide/Tutorials/CMAQ_UG_tutorial_WRF-CMAQ_build_gcc.md
or
https://github.com/lizadams/CMAQ/blob/master/DOCS/Users_Guide/Tutorials/CMAQ_UG_tutorial_WRF-CMAQ_build_gcc.md

I will combine and submit a pull request to Kristen when the edits are complete.

I was able to get the debug version to run without a floating-point error by changing the optimization and debug flags in configure.wrf to the following:

FCOPTIM = -O0
FCDEBUG = -g $(FCNOOPT)

Modifying FCOPTIM in this way means that I don't need to edit Makefile.twoway in the CMAQ code.
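The flag change above can also be scripted rather than made by hand. The following is a minimal sketch, not part of the tutorial: the helper name `set_make_variable` is hypothetical, and the sample text is only an illustrative stand-in for the relevant lines of a real configure.wrf.

```python
import re


def set_make_variable(text, name, value):
    """Return text with the right-hand side of the `name = ...` line replaced.

    Hypothetical helper for illustration; it raises KeyError if no
    assignment to `name` is found.
    """
    pattern = re.compile(rf"^([ \t]*{re.escape(name)}[ \t]*=).*$", re.MULTILINE)
    # A function replacement keeps characters such as "$(" in value literal.
    new_text, count = pattern.subn(lambda m: f"{m.group(1)} {value}", text)
    if count == 0:
        raise KeyError(f"{name} not found")
    return new_text


# Illustrative stand-in for the relevant part of configure.wrf (assumption).
sample = (
    "FCOPTIM         =       -O2 -ftree-vectorize -funroll-loops\n"
    "FCDEBUG         =       # -g $(FCNOOPT)\n"
)

debug = set_make_variable(sample, "FCOPTIM", "-O0")
debug = set_make_variable(debug, "FCDEBUG", "-g $(FCNOOPT)")
print(debug)
```

To apply this to a real build, the text would be read from and written back to configure.wrf, ideally after saving a copy of the original file.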

When I used the following debug flags instead, the floating-point error stopped the run as before (almost before it got started):

-ggdb -fbacktrace -fcheck=bounds,do,mem,pointer -ffpe-trap=invalid,zero,overflow

The optimized version of WRF-CMAQv5.3.2 does not match the debug version. I ran WRFv4.1.1-CMAQv5.3.2 with the CMAQ driver turned off, using the following option:

setenv RUN_CMAQ_DRIVER            F   # [F]

This confirmed that the WRF output does not change when WRF is built with the optimized -O2 option.

Optimized flag in configure.wrf:

FCOPTIM = -O2 -ftree-vectorize -funroll-loops

The plots are available at the links below. For each link, click on it and then click Submit. You should be able to scroll through the variables.

The debug version matches Fahim's results for the no-feedback case:
https://dataviewer-dept-cempd.cloudapps.unc.edu/index.cfm?back_address=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_nf/plots/compiler_sens/base_nf/layer1_only

The debug 16 PE version matches Fahim's results for the shortwave feedback case (this result depends on the number of PEs):
https://dataviewer-dept-cempd.cloudapps.unc.edu/index.cfm?back_address=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_sf/plots/compiler_sens/base_sf/layer1_only

A side-by-side comparison of the no-feedback and shortwave feedback cases is available here: https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address2=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_sf/plots/compiler_sens/base_sf/layer1_only&back_address1=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_nf/plots/compiler_sens/base_nf/layer1_only

The shortwave feedback percent difference plot shows that the debug version produces different output on 16 and 32 PEs. We see this with the no-feedback runs as well, but with a smaller magnitude.

David Wong suggested that I use -O2 or a less aggressive compiler optimization to see whether that reduces the difference between the debug and optimized versions.

I tried the following compile options:

Caption: Compiler options and run times on 16 PEs
	-O2	-O1	-Og	-O0 -g (debug)
real (s)	2009.05	2230.82	2753.16	8090.41
user (s)	20597.59	35559.21	43522.31	64661.29
m3diff at 2016183:230000	A:B 1.12602E+02@( 90,30, 1) -1.05625E+01@( 90,32, 1) 3.87858E+01 1.53473E+01	A:B 4.97734E+01@( 99,65, 1) -4.92266E+01@( 99,58, 1) 4.80420E-01 5.49204E+00	A:B 0.000	B
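To make the cost of each optimization level easier to compare, the wall-clock ("real") times in the table can be expressed relative to the -O2 build. This is only a sketch of that arithmetic using the numbers above; the variable names are made up for illustration.

```python
# Wall-clock ("real") times from the table above, 16 PEs.
real_times = {
    "-O2": 2009.05,
    "-O1": 2230.82,
    "-Og": 2753.16,
    "-O0 -g": 8090.41,
}

baseline = real_times["-O2"]

# Ratio of each build's run time to the -O2 run time.
slowdown = {flags: t / baseline for flags, t in real_times.items()}

for flags, ratio in slowdown.items():
    print(f"{flags:8s} {ratio:.2f}x")
```

By this measure the debug build takes roughly four times as long as the -O2 build, while -O1 and -Og fall in between.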

Fahim and David will take a look at the domain decomposition issues. Looking at the CO variable

Note: the debug version of WRF-CMAQ with shortwave feedback does not finish on 8 PEs within the time limit of the debug queue, and it cannot run in the larger queues because 8 PEs is too small a configuration for them.

Caption: Table of plots available
	debug base vs debug	optimized base vs opt	debug base vs debug and opt	debug CMAQv5.3.2 vs debug nf
nf (no feedback)	https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address1=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_nf/compiler_sens/base_nf&back_address2=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_nf/prc_diff/base_nf	https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address1=/WRFv4.1.1-CMAQv5.3.2/fahim_opt_16pe_nf/compiler_sens/base_nf/layer1_only&back_address2=/WRFv4.1.1-CMAQv5.3.2/fahim_opt_16pe_nf/prc_diff/base_nf/layer1_only	https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address1=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_and_opt_16pe_nf/plots/compiler_sens/base_nf/layer1_only&back_address2=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_and_opt_16pe_nf/plots/prc_diff/base_nf/layer1_only
sf (shortwave feedback)	https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address1=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_sf/compiler_sens/base_sf/layer1_only&back_address2=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_16pe_sf/prc_diff/base_sf/layer1_only	https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address1=/WRFv4.1.1-CMAQv5.3.2/fahim_opt_16pe_sf/compiler_sens/base_sf&back_address2=/WRFv4.1.1-CMAQv5.3.2/fahim_opt_16pe_sf/prc_diff/base_sf	https://dataviewer-dept-cempd.cloudapps.unc.edu/compare.cfm?back_address1=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_and_opt_16pe_sf/plots/compiler_sens/base_sf/layer1_only&back_address2=/WRFv4.1.1-CMAQv5.3.2/fahim_debug_and_opt_16pe_sf/plots/prc_diff/base_sf/layer1_only

Caption: Max %Diff for ACLI
	debug	opt	Opt&Debug vs Debug
WRF-CMAQ NF	120	8	200
WRF-CMAQ SF	200	200	200
CMAQ	0	25 (2x4)	150
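This page does not state how the percent difference was computed. One common definition, the symmetric percent difference, divides the absolute difference by the mean of the two magnitudes and is therefore bounded at 200, which would be consistent with the largest values in the table. The sketch below assumes that definition; the function names are hypothetical and the plotting scripts may use a different formula.

```python
def symmetric_percent_diff(a, b):
    """Symmetric percent difference between two values (assumed definition).

    Bounded between 0 and 200; returns 0 when both values are zero.
    """
    denom = (abs(a) + abs(b)) / 2.0
    if denom == 0.0:
        return 0.0
    return 100.0 * abs(a - b) / denom


def max_percent_diff(field_a, field_b):
    """Largest symmetric percent difference over paired grid-cell values."""
    return max(symmetric_percent_diff(a, b) for a, b in zip(field_a, field_b))


# Made-up values for illustration, not model output.
run_a = [1.0, 2.0, 3.0]
run_b = [1.0, 2.0, 1.0]
print(max_percent_diff(run_a, run_b))
```

With this definition, a cell that is nonzero in one run and zero in the other gives exactly 200, so a maximum of 200 can come from a single such cell rather than from a large domain-wide difference.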
