GEMCLIM for Newbies!

GEMCLIM is the climate version of Environment Canada's
Global Environmental Multiscale (GEM) model.

GEM offers the possibility of running global uniform, global stretched (also called variable) and limited area (LAM) grids.

In climate mode the model can be run for many years without any intervention from the user. In general the model will run one month at a time. At the end of each month two "jobs" get started right away: a script will be launched to treat the model output of that month (post-processing, diagnostics and archiving), and the next month will be launched automatically.

GEM can be run on multiple CPUs using MPI and OpenMP.
In general OpenMP does not scale as well as MPI (OpenMP=2 being the exception), so you will usually want to assign more CPUs to MPI than to OpenMP.

When using 2 CPUs in the x-direction, 3 CPUs in the y-direction and OpenMP=4, you will run on 2*3*4 = 24 CPUs.
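As an illustrative sketch of this 2x3x4 layout (the namelist group name and the exact place where the thread count is set may differ in your installation), the MPI topology goes into 'gemclim_settings.nml' via the 'Ptopo_npex' and 'Ptopo_npey' entries, while the OpenMP thread count is typically set through the standard OMP_NUM_THREADS environment variable:

```fortran
! illustrative fragment of gemclim_settings.nml -- group name may differ
&ptopo
  Ptopo_npex = 2 ,   ! MPI tiles in the x-direction
  Ptopo_npey = 3 ,   ! MPI tiles in the y-direction
/
! plus, in the job environment:  export OMP_NUM_THREADS=4
```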

The GEM model consists of several parts: the entry and the model (the executables), the scripts for pre- and post-processing, and the three configuration files (i.e. configexp.dot.cfg, gemclim_settings.nml, and outcfg.out, aka the model's config files).

Entry

The entry always runs on only 1 CPU and is usually a fairly short job in comparison to the model jobs. It prepares the geophysical fields, the initial conditions and, in LAM mode, the pilot files for the model.
For a global grid (uniform or stretched) the entry is only run at the very beginning of the run.
In LAM mode the entry is run at the beginning of each month (to create that month's pilot files).
  

Model

The model consists of the dynamics and the physics and can be run in parallel with MPI and/or OpenMP (OpenMP not yet on marvin) on multiple CPUs.

MPI

MPI stands for "Message Passing Interface".
It is a library specification for message-passing, proposed as a standard by a broadly based committee of vendors, implementors, and users. It can be used on distributed and/or shared memory machines.
To run the model in MPI you partition the domain horizontally into tiles. Each CPU will then work on one tile.
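As a minimal sketch of this partitioning (variable names follow the figures; how the real model rounds uneven divisions may differ), a global G_ni x G_nj grid split over an npex x npey processor topology gives each tile a local size l_ni x l_nj:

```python
def tile_sizes(G_ni, G_nj, npex, npey):
    """Split a G_ni x G_nj horizontal grid into npex x npey MPI tiles.

    Returns the local dimensions (l_ni, l_nj) of every tile; any
    leftover points are given to the last tiles in each direction.
    """
    base_ni, rem_ni = divmod(G_ni, npex)
    base_nj, rem_nj = divmod(G_nj, npey)
    tiles = []
    for j in range(npey):
        for i in range(npex):
            l_ni = base_ni + (1 if i >= npex - rem_ni else 0)
            l_nj = base_nj + (1 if j >= npey - rem_nj else 0)
            tiles.append((l_ni, l_nj))
    return tiles

# A 100x90 grid on a 2x3 topology: six tiles of 50x30 points each
print(tile_sizes(100, 90, 2, 3))
```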

     
In LAM mode ensure that the tiles are as square as possible so that the borders over which the tiles communicate are as small as possible.
When running a global grid, uniform or stretched, ensure that you have as few divisions as possible in the x-direction. That means keeping 'Ptopo_npex' small and rather increasing 'Ptopo_npey'. Some calculations in the model (near the poles) work much better when the x-direction is not cut.

Also have a look at the general introduction to GEMDM.



Of course the tiles need to exchange information with each other. They do this in the halo zone.
For global grids, the size of this halo zone is variable. For LAM grids it needs to be specified with the 'Adw_halox' and 'Adw_haloy' parameters in your 'gemclim_settings.nml'. (The default value for these two variables in LAM mode is 7.)
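For a LAM run, the halo settings would then appear in 'gemclim_settings.nml' along these lines (the namelist group name is illustrative and may differ in your version; the values shown are simply the defaults mentioned above):

```fortran
! illustrative fragment of gemclim_settings.nml for a LAM grid
&gem_cfgs          ! group name may differ in your version
  Adw_halox = 7 ,  ! halo width in x (default in LAM mode)
  Adw_haloy = 7 ,  ! halo width in y (default in LAM mode)
/
```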



(Special thanks to Vivian Lee for the figure)


OpenMP

OpenMP can be used with shared memory only.
Running OpenMP means running in parallel inside a tile.

In the physics the l_nj 2D slabs, each of dimension l_ni*nk (where l_ni and l_nj are defined as in the previous figures and nk is the number of vertical levels), get distributed over the different CPUs used for OpenMP. Each OpenMP CPU then does the full set of physics calculations for all of the points of its slabs.

In the dynamics the nk horizontal levels, each of dimension l_ni*l_nj, get distributed over the different CPUs used for OpenMP. It is therefore preferable to have a number of levels that is divisible by the number of CPUs.
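A sketch of why divisibility matters: with a simple block distribution of the nk levels over the OpenMP threads, any remainder leaves some threads with one extra level, and the slowest thread sets the pace (illustrative only, not the model's actual scheduler):

```python
def levels_per_thread(nk, nthreads):
    """Block-distribute nk vertical levels over nthreads OpenMP threads."""
    base, rem = divmod(nk, nthreads)
    return [base + (1 if t < rem else 0) for t in range(nthreads)]

print(levels_per_thread(60, 4))  # evenly loaded: [15, 15, 15, 15]
print(levels_per_thread(58, 4))  # uneven: [15, 15, 14, 14] -- the 14s wait on the 15s
```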

The number of CPUs that can be used for OpenMP depends on the machine configuration.
OpenMP=2 scales very well and is therefore always recommended!
More CPUs in OpenMP should only be used if the tiles would otherwise become too small.

For more information follow the little example on how to run GEMCLIM.




Author: Katja Winger
Last update: January 2010