Feedback on make_bry croco_pytools

Hello,
After the release of the new croco_pytools version, here is some personal feedback on the make_bry function.

I’m using croco_pytools to set up a new configuration over the northwestern Mediterranean Sea.
For the make_bry function, I use data from Mercator (Global Ocean Physics Reanalysis).

Here are the changes I made to make it easier to use for my case. I hope I didn’t introduce any mistakes during the process.

  1. Added units in get_uv

→ Added unit attributes to lon and lat (at the end of the function get_uv) to ensure correct vertical interpolation in get_3dfield

def get_uv(config, input_files, dico, z_rho, z_w, crocogrd, extrapolators=None,
    velbar_ogcm = (myvel_masked * dz_masked).sum(dim="depth") / dz_masked.sum(
        dim="depth"
    )

    velbar_ogcm["lon"].attrs["units"] = "degrees_east"
    velbar_ogcm["lat"].attrs["units"] = "degrees_north"

  2. Updated define_time_segments

→ Added one extra day at the end of each month to allow the creation of a restart file at midnight of the first day of the following month.

def define_time_segments(config, times):

    if date_fmt == "MONTHLY":
        current = date_start.replace(day=1, hour=config["Times"]["Hstart"])
        while current <= date_end:
            year, month = current.year, current.month
            if (year, month) not in available_pairs:
                raise ValueError(f"  Missing data for {year}-{month:02d}")

            next_month = current + relativedelta(months=1)  # + relativedelta(hours=1)
            segment_start, segment_end = search_realtimes(times, current, next_month)
            segments.append((segment_start, segment_end + relativedelta(days=2)))
            current = next_month

    elif date_fmt == "YEARLY":

def define_time_segments(config, times):
    # Add one step before if possible, else duplicate first timestep
    if start_idx == 0:
        print(f" No earlier time step for time {start}: duplicating {start}")
        full_indices = full_indices
    else:
        full_indices = np.concatenate(([start_idx - 1], full_indices))

    # Add one step after if possible, else duplicate last timestep
    if end_idx >= len(times) - 1:
        print(f"  No later time step for time {end}: duplicating {end}")
        full_indices = np.concatenate((full_indices-1, [full_indices[-2]]))
    else:
        full_indices = np.concatenate((full_indices, [end_idx + 1]))

  3. Converted dt to timedelta64

→ Allows proper calculations between seg_times and dt.

def define_time_segments(config, times):
        seg_times.append(times[ind].astype("datetime64[ms]"))
        # Adjust first time if it's a duplicate of the next one
        if full_indices[0] == full_indices[1]:
            dt_np = np.timedelta64(dt)
            seg_times[0] = seg_times[1] - dt_np
        # Adjust last time if it's a duplicate of the previous one
        if full_indices[-1] == full_indices[-2]:
            dt_np = np.timedelta64(dt)
            seg_times[-1] = seg_times[-2] + dt_np
        # convert back to numpy datetimes for later
        seg_times = np.array(seg_times, dtype="datetime64[ms]")

        segments_times.append(seg_times)

    return segments, segments_indices, segments_times

  4. Added two separate paths in ibc.ini for OBC and INI

→ Replaced config["IBC_Input_Files"]["ibc_dir"] with
config["IBC_Input_Files"]["ibc_dir_OBC"] or config["IBC_Input_Files"]["ibc_dir_INI"]
in the functions create_ini_croco2croco, create_bry_croco2croco, and read_input_file_list.

def create_ini_croco2croco(config):

    tracers = config["IBC_Options"]["tracers"]
    prefix = config["IBC_Input_Files"]["ibc_prefix"]
    input_dir = config["IBC_Input_Files"]["ibc_dir_INI"]
    input_extension = config["IBC_Input_Files"]["ibc_extension"]
    croco_dir = config["Croco_Files"]["croco_files_dir"]
    croco_grdname = config["Croco_Files"]["croco_grd_prefix"] + "_zoom_offline.nc"

def create_bry_croco2croco(config):

    tracers = config["IBC_Options"]["tracers"]
    prefix = config["IBC_Input_Files"]["ibc_prefix"]
    input_dir = config["IBC_Input_Files"]["ibc_dir_OBC"]
    input_extension = config["IBC_Input_Files"]["ibc_extension"]
    croco_dir = config["Croco_Files"]["croco_files_dir"]
    croco_grdname = config["Croco_Files"]["croco_grd_prefix"] + "_zoom_offline.nc"

def read_input_file_list(config, ibc=None):

    def select_file(config, ibc=ibc, var=None):

        if ibc == "ini":
            input_prefix = f"{config['IBC_Input_Files']['ibc_dir_INI']}/{config['IBC_Input_Files']['ibc_prefix']}"
            if config["IBC_Input_Files"]["ibc_multi_files"]:
                attr = "ibc_file_" + var
                input_prefix = input_prefix + config["IBC_Input_Files"][attr]

        elif ibc == "bry":
            input_prefix = f"{config['IBC_Input_Files']['ibc_dir_OBC']}/{config['IBC_Input_Files']['ibc_prefix']}"
            if config["IBC_Input_Files"]["ibc_multi_files"]:
                attr = "ibc_file_" + var
                input_prefix = input_prefix + config["IBC_Input_Files"][attr]

These changes helped fix several issues I encountered during the preprocessing for my regional setup.
Hopefully this feedback will be helpful for future updates of the tools.
Alice

Thanks a lot for this feedback. It is not easy to see the changes you made from the text alone; could you please attach the files you have modified?

Also, I did not understand what issues you had with the original files.
Here are some comments on what you listed, though I may not have understood all your issues:

  • Added units in get_uv: I do not see why unit attributes would ensure correct vertical interpolation.
  • Updated define_time_segments: can you detail what the problem was?
  • Converted dt to timedelta64: here again, can you detail what the problem was?
  • Added two separate paths in ibc.ini for OBC and INI: you should use the same dataset for OBC and INI. Why would you want to use different datasets?

And thanks a lot for this valuable feedback.

Here is my personal file: ibc_tools.py

  • For the added units, I have to admit that I don’t really understand why there was a problem, but it wouldn’t interpolate. The error was:
    The dataset doesn't define a longitude axis
    It occurred only for u and v. I managed to fix the error by setting the units in the get_uv function.
    Maybe the problem is specific to my dataset.

  • Updated define_time_segments: I wanted my monthly files to have one extra day compared to the original ones, because I needed the restart at midnight on the first day of the next month, which wasn’t possible if the croco_bry files didn’t end on the 2nd of the following month.

  • Converted dt to timedelta64: the error was:
    ufunc 'add' cannot use operands with types dtype('<M8[ms]') and dtype('O')
    I think it is due to this operation: seg_times[-1] = seg_times[-2] + dt
    So I think dt needs to be converted with: dt = np.timedelta64(dt)

  • Added two separate paths in ibc.ini for OBC and INI: in my case I’m using the same dataset but different files. I have to admit I’m not sure it’s the best way.
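
The dt conversion from the third point can be sketched on its own as follows. This is only a minimal illustration, not the croco_pytools code: pad_duplicate_ends is a hypothetical helper, and dt is assumed here to be a datetime.timedelta (its actual type in ibc_tools.py may differ).

```python
from datetime import timedelta

import numpy as np


def pad_duplicate_ends(seg_times, full_indices, dt):
    """Shift duplicated first/last timestamps by one dt (hypothetical helper)."""
    # Convert once so that all arithmetic stays within NumPy time types.
    # Assumption: dt is a datetime.timedelta.
    dt_np = np.timedelta64(dt).astype("timedelta64[ms]")
    seg_times = list(seg_times)
    # First entry duplicated: move it one dt before the second entry.
    if full_indices[0] == full_indices[1]:
        seg_times[0] = seg_times[1] - dt_np
    # Last entry duplicated: move it one dt after the previous entry.
    if full_indices[-1] == full_indices[-2]:
        seg_times[-1] = seg_times[-2] + dt_np
    return np.array(seg_times, dtype="datetime64[ms]")


# Made-up time axis with two daily steps, both ends duplicated.
times = np.array(["2019-01-30", "2019-01-31"], dtype="datetime64[ms]")
full_indices = [0, 0, 1, 1]
padded = pad_duplicate_ends(
    [times[i] for i in full_indices], full_indices, timedelta(days=1)
)
print(padded)
```

Converting dt once at the top keeps every operand a NumPy time type, so the subtraction and addition never involve a plain Python object.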

I hope it’s clearer now

Thanks, yes, that helps :slight_smile: I’ll have a look at the file for the time-related issues.

The dataset you use is Mercator (Global Ocean Physics Reanalysis). Did you use the download script (download_mercator.py) to retrieve it?
If not, perhaps there is indeed an issue with the longitude axis in your dataset.
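
One way to check this is to look at the coordinate attributes directly. The sketch below uses a small in-memory xarray DataArray in place of the real Mercator file. missing_units is a hypothetical helper, and the coordinate names lon/lat and the units strings are taken from the get_uv change quoted earlier in the thread. Whether the interpolation step really relies on these units is an assumption.

```python
import numpy as np
import xarray as xr


def missing_units(data, names=("lon", "lat")):
    """Return the coordinate names that carry no 'units' attribute."""
    return [name for name in names if "units" not in data[name].attrs]


# Made-up 2 x 3 field standing in for a barotropic velocity component.
ubar = xr.DataArray(
    np.zeros((2, 3)),
    dims=("lat", "lon"),
    coords={"lat": [42.0, 43.0], "lon": [3.0, 4.0, 5.0]},
    name="ubar",
)
before = missing_units(ubar)

# Same assignment as in the modified get_uv.
ubar["lon"].attrs["units"] = "degrees_east"
ubar["lat"].attrs["units"] = "degrees_north"
after = missing_units(ubar)

print("missing before:", before)
print("missing after:", after)
```

Running the same check on the u and v variables of the downloaded files would show whether the attributes are absent at the source or lost during processing.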

Thank you for your advice! To download the dataset I used download_copernicus.py (the Mercator products are in the Copernicus Marine system). But I’ll check whether the script is correct then.

Hi, thank you for your feedback. I also have a few additional questions and remarks:

  • For time segments, the script normally already adds a time step before and after the extent for the dataset if necessary. Also note that the dt of the ibc inputs is not computed automatically but is given in ibc.ini, to handle cases where only one file with one time step is provided. I am thus not sure I understand what was going wrong in your case. Could you provide your ibc.ini, your input files (Mercator) and your grid, so that we can test?
  • For your ini and obc using different files, do you mean that the files are in different directories, or that they follow a different naming convention? For make_ini the script uses the following pattern: {ibc_dir}/{ibc_prefix}_{ini_filedate}{ibc_extension} and then picks the time using {ini_idx}. This is the trick that lets you choose any time and treat it as your initial condition at the date you want. For make_bry the script uses all files matching the following pattern: {ibc_dir}/{ibc_prefix}_*{ibc_extension}, and then defines the time segments from the time axis and the dates you defined as start and end.
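
To make the two patterns concrete, here is a minimal sketch of how they could be built, with an optional fallback from the separate ibc_dir_INI / ibc_dir_OBC keys to the shared ibc_dir. input_pattern is a hypothetical helper, not a croco_pytools function, and the directory, prefix and date values are made up for illustration.

```python
def input_pattern(files_cfg, ibc, ini_filedate=None):
    """Build the file pattern for 'ini' or 'bry' (hypothetical helper).

    files_cfg stands in for config["IBC_Input_Files"].
    """
    # Dedicated directory keys as proposed earlier in the thread.
    key = {"ini": "ibc_dir_INI", "bry": "ibc_dir_OBC"}[ibc]
    # Fall back to the shared ibc_dir when no dedicated directory is set.
    directory = files_cfg.get(key, files_cfg.get("ibc_dir"))
    if directory is None:
        raise KeyError(f"neither {key} nor ibc_dir is defined")
    stem = f"{directory}/{files_cfg['ibc_prefix']}"
    ext = files_cfg["ibc_extension"]
    if ibc == "ini":
        # One file, selected by its date string.
        return f"{stem}_{ini_filedate}{ext}"
    # All available files; the time axis decides the segments later.
    return f"{stem}_*{ext}"


# Made-up configuration values, for illustration only.
cfg = {
    "ibc_dir": "/data/mercator",
    "ibc_dir_INI": "/data/mercator/ini",
    "ibc_prefix": "mercator",
    "ibc_extension": ".nc",
}
ini_path = input_pattern(cfg, "ini", ini_filedate="20190101")
bry_glob = input_pattern(cfg, "bry")
print(ini_path)
print(bry_glob)
```

With a fallback like this, an ibc.ini that only defines ibc_dir would keep working, while a setup with separate directories could still override it.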

Here is my ibc.ini file: ibc.ini
Here are the files from Copernicus I’m using. I’m processing the year 2019, but here I’m sharing only the first month: files for 2019/01

For the path, I mainly changed it because it was more convenient for me and fit better with how I originally organized my files.

  • For time segments: I had the croco_bry.nc files running until midnight of the first day of the next month, but I wanted them to end on the second day so I could have a restart file at midnight on the first day. I don’t know if it makes sense, or maybe I’m missing something.
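
The intended segment layout can be sketched with the standard library alone. This is only an illustration of the idea, not the define_time_segments code: monthly_segments and its extra_days parameter are hypothetical, and the search for matching time steps in the input files is left out.

```python
from datetime import datetime, timedelta


def monthly_segments(date_start, date_end, extra_days=1):
    """Return one (start, end) pair per month, each extended past month end."""
    segments = []
    current = date_start.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    while current <= date_end:
        # First day of the following month (December rolls over to January).
        if current.month == 12:
            next_month = current.replace(year=current.year + 1, month=1)
        else:
            next_month = current.replace(month=current.month + 1)
        # Extend the segment so it covers midnight of the 1st of next month.
        segments.append((current, next_month + timedelta(days=extra_days)))
        current = next_month
    return segments


segments = monthly_segments(datetime(2019, 1, 1), datetime(2019, 3, 31))
for start, end in segments:
    print(start.date(), "->", end.date())
```

Each segment here ends on the 2nd of the following month, so consecutive monthly files overlap by one day around the restart time.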