GEM at UQAM for Newbies!

GEM is Environment and Climate Change Canada's (ECCC) weather forecast model. GEM stands for Global Environmental Multiscale model.

GEM can run on regional/limited-area (LAM) grids, global uniform grids (up to and including version 4), and global YinYang grids (starting with version 4). A YinYang grid is made of two overlapping LAM grids, each covering a little more than half the globe, like the two panels of a tennis ball or baseball.

At UQAM we run this model for everything from weather forecasts a few hours long to climate simulations several decades long. In climate mode the model can be run for many years without any intervention by the user. In general the model runs one month at a time. At the end of each month two "jobs" get started right away: a script is launched to treat the model output of that month (post-processing, diagnostics, and archiving), and the next month is launched automatically.

GEM can be run in parallel on multiple CPUs using MPI and OpenMP.
In general OpenMP does not scale as well as MPI, with OpenMP=2 being the exception that still scales quite well. So you usually want to assign more CPUs to MPI than to OpenMP.

For example, when using 2 CPUs in the x-direction, 3 CPUs in the y-direction, and OpenMP=4, you will run on 2*3*4 = 24 CPUs.
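As a minimal sketch of the same arithmetic (plain Python, not part of GEM), assuming the MPI topology comes from the 'Ptopo_npex'/'Ptopo_npey' namelist entries and the OpenMP thread count from the launch environment (e.g. OMP_NUM_THREADS):

    # Total number of CPUs (cores) used by a run:
    #   MPI tiles in x  *  MPI tiles in y  *  OpenMP threads per MPI process
    ptopo_npex = 2    # MPI tiles in the x-direction ('Ptopo_npex')
    ptopo_npey = 3    # MPI tiles in the y-direction ('Ptopo_npey')
    omp_threads = 4   # OpenMP threads inside each tile (e.g. OMP_NUM_THREADS)

    total_cpus = ptopo_npex * ptopo_npey * omp_threads
    print(total_cpus)  # -> 24, matching the example above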

The GEM model consists of several parts: the model executable, the scripts for pre- and post-processing, and the configuration files (aka the model's config files).

  

Model

The model consists of the dynamics and the physics and can be run in parallel with MPI and/or OpenMP on multiple CPUs.

MPI

MPI stands for "Message Passing Interface".
It is a library specification for message-passing, proposed as a standard by a broadly based committee of vendors, implementors, and users. It can be used on distributed and/or shared memory machines.
To run the model with MPI you partition the domain into tiles in the x- and y-directions ('Ptopo_npex' by 'Ptopo_npey' tiles). Each CPU will then work on one tile.

     
In LAM mode, ensure that the tiles are as square as possible so that the borders over which the tiles communicate are as small as possible.
When running a global grid, uniform or stretched, ensure that you have as few divisions as possible in the x-direction. That means keeping 'Ptopo_npex' small and increasing 'Ptopo_npey' instead. Some calculations in the model work very well in the x-direction when it is not cut (near the poles).
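To see why tile shape matters, here is a rough sketch (plain Python, hypothetical grid dimensions, not GEM code) that compares the tile size and the amount of tile boundary for several decompositions of the same grid onto 6 MPI processes:

    # Compare MPI decompositions of a hypothetical 300 x 200 point grid.
    # Communication grows with the tile perimeter, so for a fixed number
    # of tiles the "squarest" decomposition exchanges the least data.
    G_ni, G_nj = 300, 200            # hypothetical grid dimensions

    def tile_stats(npex, npey):
        l_ni = G_ni // npex          # tile size in x (ignoring remainders)
        l_nj = G_nj // npey          # tile size in y
        return l_ni, l_nj, 2 * (l_ni + l_nj)

    for npex, npey in [(6, 1), (3, 2), (2, 3), (1, 6)]:
        l_ni, l_nj, perim = tile_stats(npex, npey)
        print(f"npex={npex} npey={npey}: tile {l_ni}x{l_nj}, boundary ~{perim} points")

    # Here (3, 2) gives the most compact (square) tiles.  For a global
    # grid, however, you would also keep npex small, so (2, 3) would be
    # preferred over (3, 2), as explained above.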

Also have a look at the general introduction to GEMDM.



Of course the tiles need to exchange information with each other. They do this in the halo zone.
For global grids, the size of this halo zone is variable. For LAM grids, it needs to be specified with the 'Adw_halox' and 'Adw_haloy' parameters in your 'gem_settings.nml'. (The default value for these two variables in LAM mode is '7'.)
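Conceptually, the halo is an extra frame of points stored around each tile and filled with a copy of the neighbouring tiles' border values; a purely illustrative sketch (plain Python with NumPy, not GEM code, hypothetical tile size):

    import numpy as np

    l_ni, l_nj = 10, 8        # interior tile size (hypothetical)
    halox, haloy = 7, 7       # cf. the Adw_halox / Adw_haloy default in LAM mode

    # Each tile is allocated with the halo frame around its own points.
    tile = np.zeros((l_ni + 2 * halox, l_nj + 2 * haloy))
    print(tile.shape)         # -> (24, 22): interior plus halo on each side

    # The interior points are owned and updated by this MPI process;
    # before computations that reach across tile borders, the frame
    # outside this view is filled with values received from neighbours.
    interior = tile[halox:halox + l_ni, haloy:haloy + l_nj]
    print(interior.shape)     # -> (10, 8)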



(Special thanks to Vivian Lee for the figure)


OpenMP

OpenMP can only be used within shared memory.
Running OpenMP means running in parallel inside a tile.

In the physics, the l_nj 2D slabs, each of dimension l_ni*nk (where l_ni and l_nj are defined as in the previous figures and nk is the number of vertical levels), get distributed over the different CPUs used for OpenMP. Each OpenMP CPU then performs the full set of physics calculations for all the points of its slab.

In the dynamics, the nk horizontal levels, each of dimension l_ni*l_nj, get distributed over the different CPUs used for OpenMP. It is therefore preferable to have a number of levels that is divisible by the number of OpenMP CPUs.
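The divisibility argument can be illustrated with a small sketch (plain Python, not GEM code): if the nk levels cannot be split evenly over the OpenMP threads, some threads get one extra level and every thread has to wait for the most loaded one.

    # Distribute nk levels over OpenMP threads (conceptual sketch).
    # Every thread waits for the most loaded one, so an uneven split
    # wastes part of the parallel speed-up.
    def levels_per_thread(nk, nthreads):
        base, extra = divmod(nk, nthreads)
        return [base + 1 if t < extra else base for t in range(nthreads)]

    nk = 80                                  # hypothetical number of levels
    for nthreads in (2, 4, 6):
        load = levels_per_thread(nk, nthreads)
        print(f"{nthreads} threads: {load} -> busiest does {max(load)} levels")

    # 2 and 4 threads split 80 levels evenly (40 and 20 levels each);
    # with 6 threads two threads get 14 levels and four get 13, so the
    # run advances at the pace of the 14-level threads.  The same
    # reasoning applies to the l_nj slabs distributed in the physics.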

The number of CPUs that can be used for OpenMP depends on the machine configuration.
OpenMP=2 scales fairly well; the larger the value used for OpenMP, the less well it scales.
More than 2 CPUs for OpenMP should only be used if the tiles would otherwise become too small.




Author: Katja Winger
Last update: January 2021