I/O API Troubleshooting

Contents

Back to the I/O API User Manual

NOTE

If you run into troubles with I/O API related programs, it is useful to know the versions of all the software components. The CVS-related program ident can report to you versioning keywords in the various components of (binary) object, library, or executable files. For example, I can run the following sequence of commands on my desktop machine to find out the versioning information of various binary components:
% cd $HOME/apps/$BIN
% ident init3.o
    —reports the INIT3 version:  init3.F 87 2015-01-07 17:37:58Z coats
% ident libioapi.a
    —reports the INIT3 and M3UTILIO versions
% ident m3stat
    —reports the INIT3, M3UTILIO, and netCDF versions
    
Each I/O API source file will have its version embedded in the file's header-comment, e.g.
    !! Version "$Id: ERRORS.html 187 2020-09-18 13:53:19Z coats $"
    


gfortran version 10 and lots of spurious warning messages

This version of gfortran takes a particularly idiosyncratic interpretation of the (latest) Fortran-2018 Standard.

AS of July 12, 2020, the relevant ioapi/Makeinclude.${BIN} files have been modified to add Fortran compile-flag

-std=legacy
so that this interpretation does not cause a compile-error.

However, using this compiler version will cause the generation of a huge number of spurious warning-messages, as the compiler is still trying to enforce its version of the Fortran-2018 (not Fortran-90, not Fortran-95, not Fortran-2008) Standard.

Thanks to Mrs. Indumathi S Iyer, (SO/D), BARC, for pointing out this compiler-problem and help with testing the fix.—CJC

Back to "Troubleshooting" Contents


Errors after retrofitting MODULE M3UTILIO

Since MODULE M3UTILIO itself INCLUDEs the standard I/O  include-files and also has INTERFACE-blocks for (almost all of) the public I/O API functions, when you retrofit USE M3UTILIO into an old code, you must remove these INCLUDE-statements and declarations and EXTERNAL statements for the public I/O API functions. If you missed some of these, you may see compile errors like the following
...
/home/coats/ioapi-3.2/ioapi/PARMS3.EXT(66): error #6401: The attributes of this name conflict with those made accessible by a USE statement.   [NAMLEN3]
...
/home/coats/ioapi-3.2/m3tools/m3tproc.f90(102): error #6401: The attributes of this name conflict with those made accessible by a USE statement.   [GETNUM]
...
    
or
...
Error: Symbol 'getnum' at (1) conflicts with symbol from module 'm3utilio', use-associated at (2)
...
    
or... Back to "Troubleshooting" Contents

To fix these errors, remove the corresponding INCLUDE-statements, function-declarations, and EXTERNAL statements.

Back to "Troubleshooting" Contents


"Instruction... not supported" and "Program Exception—illegal instruction" issues

Thanks to Christopher G. Nolte, Ph.D., US EPA Office of Research and Development for his M3USER mailing-list comments this one.

Problem: at run-time, messages like

Please verify that both the operating system and the processor support Intel® X87, CMOV, MMX, FXSAVE, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, MOVBE, POPCNT, F16C, AVX, FMA, BMI, LZCNT and AVX2 instructions.

or

Program Exception - illegal instruction

This is probably the result of compiling either the library or the model (or both) for a different processor-model than you are running it on.

Starting with the Pentium II processor (1997), successive generations of Intel processors have introduced more and more powerful vector-style instructions (MMX, SSE, SSE2, SSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, ...) that can substantially speed up array-style calculations (including, particularly, the I/O API INTERP3()). Note that each processor generation does support all the previous generations of instructions (but not, of course, vice versa).

Well designed modeling codes will get approximately a 20-25% performance boost for using SSE4.2 instructions, a further 70-80% boost for AVX, and a further 25-30% for AVX2. Because of its sloppy coding, WRF will get less than half that much speedup, and CMAQ even less than that (due to the fact that these codes are so bottlenecked by main-memory operations that improving the arithmetic doesn't help much) . Note that Intel and AMD have also improved the memory systems of the various processor generations, giving a further 5-10% performance boost per processor generation for that reason (independent of which instruction set you're using). In fact, the degree of speedup is a good measure of how well array based calculations are coded: good CFD applications will typically get an AVX speedup factor of about 1.8, whereas the (more poorly-coded) WRF gets only about 1.3 (which can be improved substantially by re-coding the advection and diffusion routines to be less memory-system-hostile).

See https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions and https://en.wikipedia.org/wiki/Advanced_Vector_Extensions for more information about the SSE and AVX families of new instructions.

On a Linux system, you can see what instructions are supported by running the following at the command line

cat /proc/cpuinfo
and then looking at the flags sections for the instructions listed below.

Use of these instructions is typically governed by command-line directives given to the compiler; different compilers use different flags to govern this, and have different defaults. GNU and Intel compilers typically default to SSE3; PGI compilers typically default to the instruction set for the processor on which the compiler itself is being run. See your compiler's documentation on how to control this. Some examples are:

Intel ifort/icc:
-x... directives:
-xHost: Use all the instructions for this machine
-xSSE4.2: Nehalem or later
-xAVX: SandyBridge or later
-xAVX2: Haswell or later

-xCORE-AVX512: Skylake-X or later

GNU gfortran/gcc
-march=... -mtune=... directives: the first of these governs instruction set use; the second controls how the optimizer uses it
-march=native -mtune=native: this machine's architecture
-march=corei7 -mtune=corei7: Nehalem or later (SSE4.2)
-march=corei7-avx -mtune=corei7-avx: SanyBridge or later (AVX)
-march=corei7-avx2 -mtune=corei7-avx2: Haswell or later (AVX2)

Portland Group ifort/icc
Default is this machine's architecture (dangerous if you have multiple different-generation machines!)
-tp=nehalem: Nehalem or later (SSE4.2)
-tp=sandybridge: SanyBridge or later (AVX)
-tp=haswell: Haswell or later (AVX2)

Recent Intel processors and their instruction sets

Nehalem (2008)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2
Sandy Bridge (2011)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX
Ivy Bridge (2012)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX
Haswell (2013)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, FMA3
Broadwell (2015)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, FMA3
(XEON server-processor) Skylake (2015)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, FMA3
(XEON server-processor) Skylake-X (2017)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, AVX-512, FMA3
Kaby Lake (2017)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, FMA3
Coffee Lake (2018)
MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, FMA3

Back to "Troubleshooting" Contents


"Segmentation-Fault" Issues with m3tools programs

Generally, you may need to run these programs with
    limit stacksize unlimited
    
since they allocate scratch-variables "off the stack" (as is the usual/recommended practice in Fortran-90). You may possibly also need
    limit memoryuse unlimited
    

Back to "Troubleshooting" Contents


"Missing Symbol" issues

The system-utility nm is very useful for this category of problems; it can be used to list all of the symbols in a library .a, object .o or executable. When they contain machine-code for a routine, it will show up with its linker-name (as opposed to source-code name), with a U for each use and a U for the routine's definition. For example, to find out about OPEN3 in libioapi.a:
    nm $io/../$BIN/libioapi.a | grep -i open3
                 U open3_
                 U open3_
                 U open3_
                 U open3_
                 U open3_
open3.o:
0000000000000000 d open3.firstime_
0000000000000000 T open3_
open3c.o:
                 U open3_
0000000000000000 T open3c
    
says that OPEN3 is defined in open3.o and used 6 other times (including in open3c.o)

Generally, missing symbols with omp or openmp as parts of their name indicate that you may have compiled the I/O API with OpenMP parallelism enabled, but are not linking your program accordinglly. Look in the relevant ioapi/Makeinclude.$BIN for the make-variables OMPFLAGS and OMPLIBS to see what you need to add to your program's Makefile.

Many other missing symbols (especially with nf_ or nc in them) are related to netCDF-library issues (and to the libraries which netCDF assumes); see the section on netCDF Version 4 issues.

Back to "Troubleshooting" Contents


Compiler/system-library compatibility issues

In general, you are best off if you can build the whole modeling system (libnetcdf.a, libpvm3.a, libioapi.a, and your model(s) CMAQ, SMOKE, etc. with a common compiler set and common set of compile-flags. When this is not done, there are a number of compatibility issues with mixed compiler sets, and with the GNU 3.x-4.x compiler set these get worse. Some of these problems show up at link time; others at run-time. In particular, the following are known to have problems:
Back to "Troubleshooting" Contents


Warning messages with recent Intel compilers

Starting with their Version 16 compilers, Intel has introduced a new compiler directive -qopenmp to enable OpenMP, and has deprecated the previous -openmp. This previous-version flag now results in a "deprecated flag" warning from the compiler. Changing the Makeinclude.*ifort* to match this compiler-change can eliminate this compile-warning for the latest set of Intel compilers at the cost of making, for example, makes, Makeincludes, etc. incompatible with Intel-15 or earlier ones.

Back to "Troubleshooting" Contents


"Internal compiler error" Problems

At least some versions of the Intel compilers icc and ifort cannot handle the internal complexity of some routines (usually iobin3.c) when compiling with full optimization: one will see error messages like the following when running make for the I/O API (where I've used backslashes to fold the compile-line to make it readable):
cd /nas01/depts/ie/cempd/apps/CMAQ/v5.1/Linux2_x86_64ifortopenmpi;     \
  icc -c -DIOAPI_PNCF=1 -DAUTO_ARRAYS=1 -DF90=1 -DFLDMN=1 -DFSTR_L=int \
  -DIOAPI_NO_STDOUT=1 -DAVOID_FLUSH=1 -DBIT32=1  -O3 -traceback -xHost \
  -DVERSION='3.2-nocpl' /nas01/depts/ie/cempd/apps/CMAQ/v5.1/ioapi/iobin3.c
/nas01/depts/ie/cempd/apps/CMAQ/v5.1/ioapi/iobin3.c(1111) (col. 29): internal error: 0_1529

compilation aborted for /nas01/depts/ie/cempd/apps/CMAQ/v5.1/ioapi/iobin3.c (code 4)
make: *** [iobin3.o] Error 4
    

A measure that generally works is to re-do the last compile-command manually, but with a lower optimization, and then re-do the make. It is useful to cut-and-paste the last command into a sub-shell (enclosing the command by parentheses), with the "-O3" eliminated or reduced to "-O", as in the following example:

( cd /nas01/depts/ie/cempd/apps/CMAQ/v5.1/Linux2_x86_64ifortopenmpi;   \
  icc -c -DIOAPI_PNCF=1 -DAUTO_ARRAYS=1 -DF90=1 -DFLDMN=1 -DFSTR_L=int \
  -DIOAPI_NO_STDOUT=1 -DAVOID_FLUSH=1 -DBIT32=1 -traceback -xHost      \
  -DVERSION='3.2-nocpl' /nas01/depts/ie/cempd/apps/CMAQ/v5.1/ioapi/iobin3.c )
! make
    

This trick is also useful when trying to do highly-optimized builds of other models that contain large, complex routines (WRF, CMAQ, ...)

Back to "Troubleshooting" Contents


Link errors with pgf90 and ifort on Linux

General Principle: various Fortran compilers "mangle" subroutine names (etc.) for the linker in various ways.

Note: 64-bit mode under Linux adds further issues.

Note added 2/24/2009:: Aparna Vemuri of EPRI reports troubles with recent gcc compiler systems and Portland Group pgf90: the gcc Fortran name mangling system has changed, requiring a change in compile flags. For mixed pgf90/gcc builds, one can either remove the -Msecond_underscore flag from FOPTFLAGS in the Makeinclude.Linux2_x86_64pg_gcc* or else change the line CC = pgcc to CC = gcc in the Makeinclude.Linux2_x86_64pg_pgcc* files. These modifications have been made to the 2/24/2009 release of the Makeinclude.Linux2_x86_64pg_gcc*, with the older flags commented out, for use by those who need them.

In particular, Gnu Fortrans (g77 and g95) have different name mangling behavior than is the default with Portland Group pgf90. Vendor supplied NetCDF librararies libnetcdf.a always use the Gnu Fortran conventions, and as such are incompatible with the default compilation flags for SMOKE or CMAQ. For the Linux/Portland Group/SMOKE or CMAQ combination, you have two choices:

This Portland Group inconsistency is exactly why the I/O API is supplied with multiple /Makeinclude.Linux2_x86pg* files in the first place... Note that the I/O API supplies a script nm_test.csh and a make target

make nametest
to help you identify such problems.

Back to "Troubleshooting" Contents


gcc/g77 on x86_64 problems

Added 4/4/2005

Internal compiler errors have shown with gcc/g77 on the some Linux distributions for x86_64, particularly with Fedora Core 3 and Red Hat Enterprise Linux Version 3 for x86_64: the symptom is a sequence of messages such as the following:

error: unable to find a register to spill in
class `AREG'
/work/IOAPI/ioapi/currec.f:93: error: this is the insn:
(insn:HI 145 171 170 8 (parallel [
            (set (reg:SI 3 bx [95])
                (div:SI (reg/v:SI 43 r14 [orig:67 secs ] [67])
                    (reg/v:SI 2 cx [orig:68 step ] [68])))
            (set (reg:SI 1 dx [96])
                (mod:SI (reg/v:SI 43 r14 [orig:67 secs ] [67])
                    (reg/v:SI 2 cx [orig:68 step ] [68])))
            (clobber (reg:CC 17 flags))
        ]) 264 {*divmodsi4_cltd} (insn_list:REG_DEP_ANTI 92
(insn_list:REG_DEP_OUTPUT 91 (insn_list 140 (insn_list 84
(insn_list:REG_DEP_ANTI 139 (nil))))))
    (expr_list:REG_DEAD (reg/v:SI 43 r14 [orig:67 secs ] [67])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (expr_list:REG_UNUSED (reg:SI 1 dx [96])
                (nil)))))
...confused by earlier errors, bailing out
    
A workaround is to weaken architecture/optimization flags for binary type Linux2_x86_64 as described above to get around this compiler bug -- eliminating the -fschedule-insns and -march=opteron optimization flags from "Makeinclude.Linux2_x86_64" will tend to get rid of the problem. Note that this same compiler bug will bite you when trying to build lots of other stuff (TCL/TK, plplot, NCAR graphics), on FC3/gcc/g77 systems, and the same fix seems to work for many other problems as well.

Back to "Troubleshooting" Contents


relocation error link issues for x86_64 Linux

If sizes of individual arrays or of COMMON blocks exceed 2GB on the x86_64 platforms, Intel ifort and icc will give you failures, with messages about relocation errors at link-time. The problem is that the default "memory model" doesn't support huge arrays and huge code-sets properly. The "medium" memory model supports huge arrays, and the "medium" memory model supports both huge arrays and huge code-sets. To get around this, you will need to add
-mcmodel=medium -shared-intel
to your compile and link flags (for the medium model), and then recompile everything including libioapi.a and libnetcdf.a using these flags. Note that this generates a new binary type that should not be mixed with the default-model binaries. There is a new binary type BIN=Linux2_x86_64ifort_medium for this binary type, and a is a sample Makeinclude file for it, to demonstrate these flags:
Makeinclude.Linux2_x86_64ifort_medium

Other compilers and other non-Linux x86_64 platforms will have similar problems, but the solutions are compiler specific.

Back to "Troubleshooting" Contents


IRIX 7.4 Problems

Added 12/18/2003

SGI F90 compiler-flag problems: It seems that SGI version 7.4 and later Fortran compilers demand a different set of TARG flags than do 7.3.x and before. For example, for an Origin 3800 (where hinv reports

24 400 MHZ IP35 Processors CPU: MIPS R12000 Processor Chip Revision: 3.5 ...
one would use the following sets of ARCHFLAGS compiler flags in Makeinclude.${BIN} with the different Fortran-90 compiler versions:

There are a number of problems with both the I/O API and netCDF with the newer (version 7.4) SGI compilers:

Added 12/18/2003
SGI claims to have fixed this in the latest patch for F90 version 7.4.1 (bug # 895393); I haven't had time to test it yet, though. -- CJC
Back to "Troubleshooting" Contents


NetCDF Error Troubleshooting

Back to "Troubleshooting" Contents


Other Problems

Back to "Troubleshooting" Contents


To: Models-3/EDSS I/O API: The Help Pages

Send comments to

Carlie J. Coats, Jr.
carlie@jyarborough.com