Hi,
We (@sbahamondez and others) are having problems running CROCO on a new cluster of Lenovo ThinkSystem SR645 V3 servers (the CPUs are AMD EPYC).
We compile with flang, and when we run the default Benguela case it crashes with a segmentation fault. @sbahamondez noticed that, before crashing, it prints the following:
hmin hmax grdmin grdmax Cu_min Cu_max
************************* 0.197007298E+01 0.200078392E+01244.41329956319.53134155
volume= NaN open_cross= 0.000000000000000000000E+00
lonmin = -0.00 lonmax = ****** latmin = ****** latmax = ******
so it might not be reading the croco_grd.nc file correctly.
Does anyone have experience using flang? Here is the jobcomp we currently use:
jobcomp_main.txt (11.5 KB)
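In case it helps when comparing runs, here is a minimal sketch for pulling the suspicious lines out of a run log. The function name scan_log and the sample file are hypothetical; the pattern simply looks for NaN, Inf, or a run of asterisks, which Fortran prints when a value does not fit the output field width.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CROCO): print, with line numbers, the
# lines of a log that contain NaN, Inf, or three or more asterisks in a row.
scan_log() {
    local logfile="$1"
    grep -n -E 'NaN|Inf|\*{3,}' "$logfile"
}

# Example input: three lines copied from the output quoted above.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
 Spherical grid detected.
 volume= NaN open_cross= 0.000000000000000000000E+00
 lonmin = -0.00 lonmax = ****** latmin = ****** latmax = ******
EOF

scan_log "$sample_log"
```

On a real run you would pass the model's own log file instead of the sample. The pattern is deliberately loose, so a word that merely contains "Inf" would also match.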
Hi,
@AndresSepulveda @sbahamondez
Can you please try the options below in jobcomp, and send me the compilation log (> compilation-log.text)?
I think this is a floating-point issue, perhaps small-value imprecision that leads to NaNs.
FFLAGS1="-O2 -g -mcmodel=medium -march=znver3 -fno-omit-frame-pointer -ffp-contract=off -ftrapping-math -fimplicit-none -fcheck=bounds -frounding-mode=precise"
See what happens.
Best,
Subhadeep
Hi Subhadeep,
With the following options it compiles:
FFLAGS1="-O2 -g -mcmodel=medium -march=znver3 -fno-omit-frame-pointer -ffp-contract=off -ftrapping-math "
The other options gave errors at compilation time:
clang-16: error: unknown argument: '-fimplicit-none'
clang-16: error: unknown argument: '-frounding-mode=precise'
But even when it compiles, the problem persists:
NUMBER OF THREADS: 1 BLOCKING: 1 x 1.
Spherical grid detected.
hmin hmax grdmin grdmax Cu_min Cu_max
************************* 0.197007298E+01 0.200078392E+01244.41329956319.53134155
volume= NaN open_cross= 0.000000000000000000000E+00
lonmin = -0.00 lonmax = ****** latmin = ****** latmax = ******
Vertical S-coordinate System:
level S-coord Cs-curve at_hmin over_slope at_hmax
32 0.0000000 0.0000000 0.000 0.000 0.000
31 -0.0312500 -0.0001015 Inf************ -Inf
30 -0.0625000 -0.0004108 Inf************ -Inf
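Since some flags were rejected outright, one way to avoid trial and error is to test each flag on its own before putting it in jobcomp. This is only a sketch: probe_flags is a hypothetical helper, and it assumes the compiler driver exits with a non-zero status on an unknown argument, as clang-16 did above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CROCO): compile a trivial program once per
# flag and report whether the compiler driver accepted it.
# $1 = compiler command; remaining arguments = flags to test one by one.
probe_flags() {
    local fc="$1"; shift
    local workdir flag
    workdir="$(mktemp -d)"
    # Smallest possible free-form Fortran program.
    printf 'program probe\nend program probe\n' > "$workdir/probe.f90"
    for flag in "$@"; do
        if "$fc" "$flag" -c "$workdir/probe.f90" -o "$workdir/probe.o" >/dev/null 2>&1; then
            echo "accepted: $flag"
        else
            echo "rejected: $flag"
        fi
    done
    rm -rf "$workdir"
}
```

For example: probe_flags flang -ffp-contract=off -fimplicit-none -frounding-mode=precise. Note that a flag which only triggers a warning still counts as accepted here, and a compiler that is not installed makes every flag show as rejected.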
@AndresSepulveda
OK.
At the beginning of ncjoin.F, add this declaration (it may not be needed in your case):
integer :: iargc
I think your flang skipped this.
I used flang-new, installed in a conda environment.
Use the following:
CPP1="cpp -traditional -DLinux"
FFLAGS1="-O2 -mcmodel=medium -fdefault-real-8 -fdefault-double-8 -std=f2018"
The change is:
-std=legacy to -std=f2018
I hope it runs successfully.
Note: compile the parallel NetCDF-4 library (netcdf4_par) with flang/flang-new as well; it may give a speedup with OpenMP + NC4PAR.
Best,
Subhadeep
Hi @smaishal
Following your instructions, it compiled and ran successfully.
Thanks.
Best regards.
Sergio.
Thanks for the help. It compiles and runs. The compiler gives these two warnings:
clang-16: warning: argument unused during compilation: '-fdefault-real-8' [-Wunused-command-line-argument]
clang-16: warning: argument unused during compilation: '-fdefault-double-8' [-Wunused-command-line-argument]
When I define AGRIF, it fails with the following message
ld.lld: error: unable to find library -lagrif
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
but all the AGRIF code is under OCEAN, as usual…
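Before digging into the link line, it may be worth checking whether the AGRIF library file was produced at all. Below is a minimal sketch; find_lib is a hypothetical helper, and the directories you pass to it are placeholders for wherever your jobcomp points the linker with -L.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CROCO): look for lib<name>.a or
# lib<name>.so in the given directories and print every match.
# $1 = library name without the "lib" prefix; remaining arguments = directories.
# Returns 0 if at least one file was found, 1 otherwise.
find_lib() {
    local name="$1"; shift
    local dir ext found=1
    for dir in "$@"; do
        for ext in a so; do
            if [[ -f "$dir/lib$name.$ext" ]]; then
                echo "$dir/lib$name.$ext"
                found=0
            fi
        done
    done
    return "$found"
}
```

For example: find_lib agrif . AGRIF || echo "libagrif was not built". If nothing is found, the problem is in the step that builds the library rather than in the final link.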
Parallel NetCDF I/O (netcdf-par) needs to be built with flang/flang-new:
export HDF5_USE_FILE_LOCKING=FALSE
export MPICH_MPIIO_HINTS="*:romio_cb_write=enable"
You also need to link with OpenMP:
if $($CPP1 testkeys.F | grep -i -q openmp) ; then
    COMPILEOMP=TRUE
    if [[ $OS == Linux || $OS == Darwin ]] ; then
        if [[ $FC == gfortran ]] ; then
            FFLAGS1="$FFLAGS1 -fopenmp"
        elif [[ $FC == ifort || $FC == ifc ]] ; then
            INTEL_VERSION=$(ifort --version 2>&1 | grep -oP "(\d+)" | head -n1)
            # Compare the version with 18
            if [[ "$INTEL_VERSION" -gt 18 ]]; then
                FFLAGS1="$FFLAGS1 -qopenmp"
            else
                FFLAGS1="$FFLAGS1 -openmp"
            fi
        elif [[ $FC == flang || $FC == flang-new ]]; then
            FFLAGS1="$FFLAGS1 -fopenmp"
            LDFLAGS="$LDFLAGS -lomp"
        else
            FFLAGS1="$FFLAGS1 -openmp"
        fi
    elif [[ $OS == CYGWIN_NT-10.0 ]] ; then
        FFLAGS1="$FFLAGS1 -fopenmp"
    elif [[ $OS == AIX ]] ; then
        FFLAGS1="$FFLAGS1 -qsmp=omp"
        CFT1="xlf95_r"
    fi
fi
I will check AGRIF.
Best,