ABNORMAL JOB END / BLOW UP when running the model, with errors: "Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG" and "Program received signal SIGSEGV: Segmentation fault - invalid memory reference"

Dear CROCO users,

I am getting errors when I try to run the model; the details are below. I would appreciate any advice. Thanks very much in advance.

  1. Messages after I run the model:

BENGUELA TEST MODEL
8640 ntimes Total number of timesteps for 3D equations.
30.00 dt Timestep [sec] for 3D equations
60 ndtfast Number of 2D timesteps within each 3D step.
1 ninfo Number of timesteps between runtime diagnostics.

WARNING: Unrecognized keyword: time_stepping_nbq → DISREGARDED.

5.000E+00 theta_s S-coordinate surface control parameter.
8.800E-01 theta_b S-coordinate bottom control parameter.
7.000E+01 Tcline S-coordinate surface/bottom layer width used in
vertical coordinate stretching, meters.
Grid File: CROCO_FILES/croco_grd.nc
Forcing Data File: CROCO_FILES/croco_frc.nc

WARNING: Unrecognized keyword: bulk_forcing → DISREGARDED.

Climatology File: CROCO_FILES/croco_clm.nc

WARNING: Unrecognized keyword: boundary → DISREGARDED.

Initial State File: CROCO_FILES/croco_ini.nc Record: 1
Restart File: CROCO_FILES/croco_rst.nc nrst = 8640 rec/file: -1
History File: CROCO_FILES/croco_his.nc Create new: T nwrt =1728 rec/file = 0
1 ntsavg Starting timestep for the accumulation of output
time-averaged data.
864 navg Number of timesteps between writing of time-averaged
data into averages file.
Averages File: CROCO_FILES/croco_avg.nc rec/file = 0

Fields to be saved in history file: (T/F)
T write zeta free-surface.
T write UBAR 2D U-momentum component.
T write VBAR 2D V-momentum component.
T write U 3D U-momentum component.
T write V 3D V-momentum component.
T write T( 1) Tracer of index 1.
T write T( 2) Tracer of index 2.

  F  write RHO   Density anomaly.
  F  write Omega Omega vertical velocity.
  T  write W     True vertical velocity.
  F  write Akv   Vertical viscosity.
  T  write Akt   Vertical diffusivity for temperature.
  F  write Aks   Vertical diffusivity for salinity.
  F  write bvf   Brunt Vaisala Frequency.
  F  write Visc3d Horizontal diffusivity.

  T  write Hbl   Depth of model boundary layer.
  T  write Hbbl   Depth of bottom planetary boundary layer.
  T  write Bostr Bottom Stress.
  F  write Bustr U-Bottom Stress.
  F  write Bvstr V-Bottom Stress.
  T  write Wstress Wind Stress.
  T  write U-Wstress comp. U-Wind Stress.
  T  write V-Wstress comp. V-Wind Stress.

  T  write Shflx [W/m2] Surface net heat flux
  T  write Swflx [cm/day] Surface freshwater flux (E-P)
  T  write Shflx_rsw [W/m2] Short-wave surface radiation

WARNING: Unrecognized keyword: gls_history_fields → DISREGARDED.

Fields to be saved in averages file: (T/F)
T write zeta free-surface.
T write UBAR 2D U-momentum component.
T write VBAR 2D V-momentum component.
T write U 3D U-momentum component.
T write V 3D V-momentum component.
T write T( 1) Tracer of index 1.
T write T( 2) Tracer of index 2.

  F  write RHO   Density anomaly
  T  write Omega Omega vertical velocity.
  T  write W     True vertical velocity.
  F  write Akv   Vertical viscosity
  T  write Akt   Vertical diffusivity for temperature.
  F  write Aks   Vertical diffusivity for salinity.
  F  write bvf   Brunt Vaisala Frequency.
  F  write diff3d Horizontal diffusivity

  T  write Hbl   Depth of model boundary layer
  T  write Hbbl   Depth of the bottom planetary boundary layer
  T  write Bostr Bottom Stress.
  F  write Bustr U-Bottom Stress.
  F  write Bvstr V-Bottom Stress.
  T  write Wstr Wind Stress.
  T  write U-Wstress comp. U-Wind Stress.
  T  write V-Wstress comp. V-Wind Stress.

  T  write Shflx [W/m2] Surface net heat flux.
  T  write Swflx [cm/day] Surface freshwater flux (E-P)
  T  write Shflx_rsw [W/m2] Short-wave surface radiation.

WARNING: Unrecognized keyword: gls_averages → DISREGARDED.

1025.0000 rho0 Boussinesq approximation mean density, kg/m3.
0.000E+00 visc2 Horizontal Laplacian mixing coefficient [m2/s]
for momentum.
0.000E+00 tnu2( 1) Horizontal Laplacian mixing coefficient (m2/s)
for tracer 1.
0.000E+00 tnu2( 2) Horizontal Laplacian mixing coefficient (m2/s)
for tracer 2.
0.000E+00 tnu4( 1) Horizontal biharmonic mixing coefficient [m4/s]
for tracer 1.
0.000E+00 tnu4( 2) Horizontal biharmonic mixing coefficient [m4/s]
for tracer 2.

WARNING: Unrecognized keyword: vertical_mixing → DISREGARDED.

0.000E+00 rdrg Linear bottom drag coefficient (m/si).
0.000E+00 rdrg2 Quadratic bottom drag coefficient.
1.000E-02 Zobt Bottom roughness for logarithmic law (m).
1.000E-04 Cdb_min Minimum bottom drag coefficient.
1.000E-01 Cdb_max Maximum bottom drag coefficient.

  1.00  gamma2   Slipperiness parameter: free-slip +1, or no-slip -1.

SPONGE_GRID is defined: x_sponge parameter in sponge/nudging
layer is set generically in set_nudgcof.F routine

1.157E-05 tauT_in Nudging coefficients [sec^-1]
3.215E-08 tauT_out Nudging coefficients [sec^-1]
3.858E-06 tauM_in Nudging coefficients [sec^-1]
3.215E-08 tauM_out Nudging coefficients [sec^-1]

WARNING: Unrecognized keyword: diagnostics → DISREGARDED.

WARNING: Unrecognized keyword: diag_avg → DISREGARDED.

WARNING: Unrecognized keyword: diag3D_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diag2D_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diag3D_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diag2D_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagnosticsM → DISREGARDED.

WARNING: Unrecognized keyword: diagM_avg → DISREGARDED.

WARNING: Unrecognized keyword: diagM_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagM_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_vrt → DISREGARDED.

WARNING: Unrecognized keyword: diags_vrt_avg → DISREGARDED.

WARNING: Unrecognized keyword: diags_vrt_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_vrt_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_ek → DISREGARDED.

WARNING: Unrecognized keyword: diags_ek_avg → DISREGARDED.

WARNING: Unrecognized keyword: diags_ek_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_ek_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: surf → DISREGARDED.

WARNING: Unrecognized keyword: surf_avg → DISREGARDED.

WARNING: Unrecognized keyword: surf_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: surf_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_pv → DISREGARDED.

WARNING: Unrecognized keyword: diags_pv_avg → DISREGARDED.

WARNING: Unrecognized keyword: diags_pv_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_pv_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_eddy → DISREGARDED.

WARNING: Unrecognized keyword: diags_eddy_avg → DISREGARDED.

WARNING: Unrecognized keyword: diags_eddy_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diags_eddy_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagnostics_bio → DISREGARDED.

WARNING: Unrecognized keyword: diagbio_avg → DISREGARDED.

WARNING: Unrecognized keyword: diagbioFlux_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagbioVSink_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagbioGasExc_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagbioFlux_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagbioVSink_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: diagbioGasExc_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: biology → DISREGARDED.

WARNING: Unrecognized keyword: wkb_boundary → DISREGARDED.

WARNING: Unrecognized keyword: wkb_wwave → DISREGARDED.

WARNING: Unrecognized keyword: wkb_roller → DISREGARDED.

WARNING: Unrecognized keyword: wave_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: wave_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: wci_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: wci_average_fields → DISREGARDED.

WARNING: Unrecognized keyword: sediments → DISREGARDED.

WARNING: Unrecognized keyword: sediment_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: bbl_history_fields → DISREGARDED.

WARNING: Unrecognized keyword: floats → DISREGARDED.

WARNING: Unrecognized keyword: float_fields → DISREGARDED.

WARNING: Unrecognized keyword: stations → DISREGARDED.

WARNING: Unrecognized keyword: station_fields → DISREGARDED.

WARNING: Unrecognized keyword: psource → DISREGARDED.

WARNING: Unrecognized keyword: psource_ncfile → DISREGARDED.

WARNING: Unrecognized keyword: online → DISREGARDED.

Activated C-preprocessing Options:

      REGIONAL
      KUROSHIO
      OPENMP
      OBC_EAST
      OBC_WEST
      OBC_NORTH
      OBC_SOUTH
      CURVGRID
      SPHERICAL
      MASKING
      SOLVE3D
      UV_COR
      UV_ADV
      SALINITY
      NONLIN_EOS
      ANA_DIURNAL_SW
      QCORRECTION
      SFLX_CORR
      CLIMATOLOGY
      ZCLIMATOLOGY
      M2CLIMATOLOGY
      M3CLIMATOLOGY
      TCLIMATOLOGY
      ZNUDGING
      M2NUDGING
      M3NUDGING
      TNUDGING
      UV_HADV_UP3
      UV_VADV_SPLINES
      TS_HADV_RSUP3
      TS_DIF4
      TS_VADV_SPLINES
      SPONGE
      LIMIT_BSTRESS
      LMD_MIXING
      LMD_SKPP
      LMD_BKPP
      LMD_RIMIX
      LMD_CONVEC
      LMD_NONLOCAL
      ANA_BSFLUX
      ANA_BTFLUX
      OBC_M2CHARACT
      OBC_M3ORLANSKI
      OBC_TORLANSKI
      AVERAGES
      AVERAGES_K
      TS_HADV_C4
      MPI_COMM_WORLD
      M2FILTER_POWER
      TRACERS
      TEMPERATURE
      HZR
      VAR_RHO_2D
      SPLIT_EOS
      UV_MIX_S
      DIF_COEF_3D
      TS_MIX_ISO
      TS_MIX_IMP
      TS_MIX_ISO_FILT
      NTRA_T3DMIX
      SPONGE_GRID
      SPONGE_DIF2
      SPONGE_VIS2
      LMD_SKPP2005
      NF_CLOBBER

NUMBER OF THREADS: 12 BLOCKING: 1 x 12.

Spherical grid detected.

hmin hmax grdmin grdmax Cu_min Cu_max
75.000000 5000.000000 0.106406820E+05 0.117773385E+05 0.00162856 0.01471755
volume= 1.543422154656911800000E+16 open_cross= 4.116860069940998840332E+10

   lonmin = 120.00  lonmax = 180.00  latmin =  32.00  latmax =  40.08

Vertical S-coordinate System:

level S-coord Cs-curve at_hmin over_slope at_hmax

32   0.0000000  -0.0000000          -0.000      -0.000      -0.000
31  -0.0312500  -0.0024236          -2.200      -8.168     -14.136
30  -0.0625000  -0.0056323          -4.403     -18.273     -32.142
29  -0.0937500  -0.0099036          -6.612     -31.000     -55.387
28  -0.1250000  -0.0156026          -8.828     -47.250     -85.671
27  -0.1562500  -0.0232047         -11.054     -68.195    -125.337
26  -0.1875000  -0.0333182         -13.292     -95.338    -177.384
25  -0.2187500  -0.0467041         -15.546    -130.555    -245.564
24  -0.2500000  -0.0642819         -17.821    -176.115    -334.410
23  -0.2812500  -0.0871094         -20.123    -234.630    -449.137
22  -0.3125000  -0.1163145         -22.457    -308.881    -595.305
21  -0.3437500  -0.1529579         -24.827    -401.486    -778.145
20  -0.3750000  -0.1978140         -27.239    -514.356   -1001.473
19  -0.4062500  -0.2510854         -29.693    -647.991   -1266.289
18  -0.4375000  -0.3121168         -32.186    -800.773   -1569.361
17  -0.4687500  -0.3792270         -34.709    -968.555   -1902.401
16  -0.5000000  -0.4497843         -37.249   -1144.843   -2252.436
15  -0.5312500  -0.5205810         -39.790   -1321.721   -2603.652
14  -0.5625000  -0.5884150         -42.317   -1491.289   -2940.261
13  -0.5937500  -0.6506727         -44.816   -1647.097   -3249.379
12  -0.6250000  -0.7057025         -47.279   -1785.071   -3522.863
11  -0.6562500  -0.7528925         -49.702   -1903.700   -3757.697
10  -0.6875000  -0.7925021         -52.088   -2003.624   -3955.160
 9  -0.7187500  -0.8253784         -54.439   -2086.934   -4119.428
 8  -0.7500000  -0.8526718         -56.763   -2156.468   -4256.172
 7  -0.7812500  -0.8756196         -59.066   -2215.279   -4371.492
 6  -0.8125000  -0.8954108         -61.352   -2266.301   -4471.250
 5  -0.8437500  -0.9131218         -63.628   -2312.191   -4560.753
 4  -0.8750000  -0.9296993         -65.898   -2355.283   -4644.668
 3  -0.9062500  -0.9459713         -68.167   -2397.622   -4727.076
 2  -0.9375000  -0.9626718         -70.438   -2441.018   -4811.597
 1  -0.9687500  -0.9804698         -72.715   -2487.122   -4901.529
 0  -1.0000000  -1.0000000         -75.000   -2537.500   -5000.000

Time splitting: ndtfast = 60 nfast = 82

Maximum grid stiffness ratios: rx0 = 0.28116377967784584 rx1 = 29.661961444291226

GET_INITIAL – Processing data for time = 0.000 record = 1

GET_INITIAL - unable to find variable: hbl
in input NetCDF file: CROCO_FILES/croco_ini.nc ==> Initialized to zero state.
>> CAUTION in case of #define EXACT_RESTART <<
If it is the case
- OK if it is a ‘cold start’ i.e coming from a 3rd-party initial file
- otherwise if it is a ‘hot start’ i.e from a restart file produced by this code:
=> problem: run is not restartable
=> check your initial file
GET_TCLIMA – Read climatology of tracer 1 for time = 345.0
GET_TCLIMA – Read climatology of tracer 1 for time = 15.00
GET_TCLIMA – Read climatology of tracer 2 for time = 345.0
GET_TCLIMA – Read climatology of tracer 2 for time = 15.00
GET_UCLIMA – Read momentum climatology for time = 345.0
GET_UCLIMA – Read momentum climatology for time = 15.00
GET_SSH – Read SSH climatology for time = 345.0
GET_SSH – Read SSH climatology for time = 15.00
GET_SMFLUX – Read surface momentum stresses for time = 345.0
GET_SMFLUX – Read surface momentum stresses for time = 15.00
GET_STFLUX – Read surface flux of tracer 1 for time = 345.0
GET_STFLUX – Read surface flux of tracer 1 for time = 15.00
GET_SST – Read SST and dQdSST fields for time = 345.0
GET_SST – Read SST and dQdSST fields for time = 15.00
GET_STFLUX – Read surface flux of tracer 2 for time = 345.0
GET_STFLUX – Read surface flux of tracer 2 for time = 15.00
GET_SSS – Read SSS fields for time = 345.0
GET_SSS – Read SSS fields for time = 15.00
GET_SRFLUX – Read solar shortwave radiation for time = 345.0
GET_SRFLUX – Read solar shortwave radiation for time = 15.00
DEF_HIS/AVG - Created new netCDF file ‘CROCO_FILES/croco_his.nc’.
WRT_GRID – wrote grid data into file ‘CROCO_FILES/croco_his.nc’.
WRT_HIS – wrote history fields into time record = 1 / 1

MAIN: started time-stepping

STEP time[DAYS] KINETIC_ENRG POTEN_ENRG TOTAL_ENRG NET_VOLUME trd
0 0.00000 0.000000000E+00 5.0060820E+01 5.0060820E+01 1.5434222E+16 4

=======================================
= =
= STEP2D: ABNORMAL JOB END =
= BLOW UP =
= =

VMAX (M/S) =**********
IMAX JMAX = 1 79
IINT IEXT = 1 1

Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG

=======================================
= =
= STEP2D: ABNORMAL JOB END =
= BLOW UP =
= =

VMAX (M/S) =**********
IMAX JMAX = 7 11
IINT IEXT = 1 1

Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG

=======================================
= =
= STEP2D: ABNORMAL JOB END =
= BLOW UP =
= =

VMAX (M/S) =**********
IMAX JMAX = 5 18
IINT IEXT = 1 1

Note: The following floating-point exceptions are signalling: IEEE_INVALID_FLAG

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:

  2. Related files:

cppdefs.h (42.9 KB)
param.h (31.6 KB)
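
For anyone reading a long log like the one above, a small script can pull out the two things that matter most: which input-file keywords were ignored and where each blow-up was reported. This is only a sketch. The names `summarize_log` and `sample` are hypothetical, and the patterns are assumptions based on the message format shown in this post, not on the model source.

```python
import re

def summarize_log(text):
    """Collect ignored keywords and blow-up locations from a run log."""
    # Keywords reported as "Unrecognized keyword: <name>".
    keywords = re.findall(r"Unrecognized keyword:\s*(\S+)", text)
    # Grid indices printed as "IMAX JMAX = <i> <j>" under each BLOW UP banner.
    blowups = [(int(i), int(j))
               for i, j in re.findall(r"IMAX\s+JMAX\s*=\s*(\d+)\s+(\d+)", text)]
    return {"ignored_keywords": keywords, "blowup_ij": blowups}

# Hypothetical excerpt, shortened from the log in this post.
sample = """
WARNING: Unrecognized keyword: bulk_forcing → DISREGARDED.
WARNING: Unrecognized keyword: boundary → DISREGARDED.
VMAX (M/S) =**********
IMAX JMAX = 1 79
IMAX JMAX = 7 11
"""

summary = summarize_log(sample)
print(summary)
```

In the log above every blow-up is reported with IINT IEXT = 1 1, that is, in the very first step, so the listed (i, j) points are a reasonable place to start inspecting the grid and initial fields.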

Hi, does it run OK in serial mode, without OPENMP?

Hi Andres, I am getting the same error when running with MPI. It runs fine in serial mode:

ERROR:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
NUMBER OF THREADS: 1 BLOCKING: 1 x 1.

Spherical grid detected.

I am running a nested grid simulation.

Thanks
Konstantinos

After some experimentation, I found that the simulation runs with MPI up to a certain number of cores, but not with all available cores.

Is there a best practice for choosing the number of cores relative to the grid i and j dimensions? For example, does it help if the i or j dimension is an exact multiple of the number of cores?

Thanks
Konstantinos
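
One way to explore the question above is to compute how many grid points each tile would get for a given number of tiles along one direction. The sketch below is an illustration under a stated assumption: it splits the points by ceiling division, which is a common choice but is not taken from the CROCO source, so check param.h for the rule your version actually uses. The function name `tile_sizes` is hypothetical.

```python
def tile_sizes(n_points, n_tiles):
    """Split n_points interior points into n_tiles tiles (assumed ceiling division)."""
    size = -(-n_points // n_tiles)  # ceiling of n_points / n_tiles
    sizes = []
    remaining = n_points
    for _ in range(n_tiles):
        # Each tile takes a full-size chunk until the points run out.
        take = min(size, remaining)
        sizes.append(take)
        remaining -= take
    return sizes

even = tile_sizes(96, 12)     # exact multiple: all tiles equal
uneven = tile_sizes(100, 12)  # not a multiple: the last tile is much smaller
empty = tile_sizes(10, 6)     # too many tiles: the last tile gets no points
print(even, uneven, empty)
```

Under this assumption, an exact multiple gives identical tiles, while a poor match leaves the last tile very small or even empty. Very small or empty tiles are one plausible reason a run could fail only above a certain core count, although this thread does not confirm that this was the cause here.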

Hi Andres,

Sorry for the late reply, and thanks for the help.

I have found the causes (the grid configuration and NaN values in the initial conditions) and solved the problem.
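
Since NaN values in the initial conditions were one of the causes, a quick check before launching the model can save a failed run. The sketch below is minimal and uses only the standard library: it counts NaN entries in nested lists of floats. In practice the arrays would be read from the initial file with a NetCDF reader, which is not shown here. The function names and the field names and values in `fields` are illustrative assumptions.

```python
import math

def count_nans(values):
    """Count NaN entries in an arbitrarily nested list or tuple of floats."""
    if isinstance(values, (list, tuple)):
        return sum(count_nans(v) for v in values)
    return 1 if isinstance(values, float) and math.isnan(values) else 0

def nan_report(fields):
    """Return {field name: NaN count} for every field that contains NaN."""
    report = {}
    for name, data in fields.items():
        n = count_nans(data)
        if n:
            report[name] = n
    return report

# Hypothetical 2x2 fields standing in for arrays read from the initial file.
nan = float("nan")
fields = {
    "temp": [[10.0, nan], [12.0, 11.5]],
    "salt": [[35.0, 35.1], [35.2, 35.3]],
    "zeta": [[0.0, nan], [nan, 0.1]],
}

report = nan_report(fields)
print(report)
```

An empty report means no NaN was found in the fields that were checked; it says nothing about fields that were left out.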

Hi Konstantinos,

I solved this problem after switching to OpenMP, so the following may not be directly helpful, but I can share some thoughts.

You could try the following:

  1. Check the CPU core settings.
  2. Check whether you have set any environment variables related to MPI.
  3. Check the cpp keys related to the grid/nesting configuration.

Also try setting

ulimit -s unlimited

in your run script. This helps when the domain is very large.
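
To see whether the stack limit is actually in effect, you can query it from the same environment that launches the model. The sketch below uses Python's standard `resource` module (available on Unix-like systems only); the function name `stack_limit_report` is hypothetical.

```python
import resource

def stack_limit_report():
    """Report the soft and hard stack size limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def fmt(value):
        # RLIM_INFINITY is how the OS reports "unlimited".
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    return {
        "soft": fmt(soft),
        "hard": fmt(hard),
        "soft_is_unlimited": soft == resource.RLIM_INFINITY,
    }

report = stack_limit_report()
print(report)
```

If the soft limit is not reported as unlimited when this runs inside the job script, the ulimit line was probably not applied in the shell that starts the executable.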