An introduction to RPN standard files

Yves Chartier
(yves.chartier@ec.gc.ca)
Section informatique, Recherche en Prévision Numérique
Atmospheric Environment Service, Environment Canada



Last Revision: March 24, 2004

Introduction

This document gives an overview of the RPN standard file format, which has been in use at the CID/CMC/RPN complex since 1980. This file format is used to store gridded data from numerical weather prediction models, objective analyses and geophysical fields.

This document is intended as an introduction for people who need to work with this data format. It is assumed that the reader has a working knowledge of FORTRAN and has coded algorithms that manipulate gridded data (1-, 2- or 3-dimensional arrays).

After having read this document, the reader should be able to create and manipulate RPN standard files using FORTRAN.

Historical perspective

The RPN standard file format was created by Michel Valin, from the "Section Informatique" of RPN, in the late 1970s. At that time, the CID/CMC/RPN super-computer was a CYBER 7600 (2 Mbytes RAM, 36 Mips), to which users were submitting batch jobs from a CYBER-171 front-end (512K memory, 1 Mips). Punched cards were the most common way to communicate with the computer, and 300 Baud was the standard communication speed for the lucky few who had access to a terminal.

The production system of numerical forecasts was already quite complex. It was composed of dozens of applications, most of which had their own, non-standard and undocumented data format. The majority of data files were without internal structure, a record number being often the only clue that one had to locate a given data item (e.g. to compute the height thickness between 1000 and 500 mb, one had to know that records #78 and #89 from file "xyz" had to be accessed). It is left to the reader to imagine the amount of work involved in modifying the horizontal, vertical or temporal resolutions of the models, which could alter the distribution of the data fields in the files (e.g. use rec. #117 instead of #78 and rec. #125 instead of #89).

The first release of the RPN standard file software appeared in 1980. The advantages of the new data format over the previous ones were numerous:

* each meteorological record stored in a standard file had several non-ambiguous identifiers (name, a pressure level, a forecast hour, a date of origin, precise geographical location, etc...).

* files were sequential or direct access, allowing fast positioning and retrieval of data.

* any application dealing with gridded data had only a single format to deal with, thereby eliminating the cost of using dozens of other non-descriptive formats.

* record manipulation was done through high-level subroutines; the underlying structure of the files could then change without requiring users to modify their applications.

A second release of the software came in 1982, offering new enhancements. The most significant was the total transportability of files, bit for bit, to almost any type of computer.

Some features of the original standard file software needed 60 or 64-bit machines (e.g. packing 8 Hollerith characters or storing a 10-digit integer in an integer variable). But in 1989, the CID/CMC/RPN complex acquired numerous UNIX workstations and servers, all of which were 32-bit machines. A third release of the software was therefore needed to accommodate the new platforms. At the same time, several other enhancements were added to this release, such as a streamlined interface and improved internal management.

General structure

From an end-user point of view, the RPN standard file package can be compared to commercial flat-file database software. The basic unit of storage is a file, containing numerous records of gridded meteorological fields. A record contains a 1, 2 or 3-D grid representing the values of a field, with various attributes that give information about the field.

The RPN standard file flavors

The RPN standard files come in three distinct flavors: random access (RND - which is not equivalent to direct access in FORTRAN), indexable sequential (SEQ) and FORTRAN sequential (SEQ/FTN). These flavors define the way records are accessed within the file. In SEQ and SEQ/FTN files, an application that wants to access the 50th record of a file needs to read or skip the first 49 records before accessing it, whereas in a RND file, that record can be accessed directly.

An important concept that comes with these different file types is the pointer position; this has an impact on the behavior of some of the standard file functions. RND type files have a catalog of the contents of the file, which is always resident in computer memory. For these, the file pointer is an index that is attached to the catalog only. In SEQ and SEQ/FTN type files, there is no catalog of the contents of a file, and the file pointer is physically tied to the file. It points either to the beginning of the file, to the current record or to the end of the file.

Here are some of the differences found between RND and SEQ files.

* In RND files, all the attributes of a given record can be retrieved directly if the file pointer contains a valid position. In SEQ or SEQ/FTN type files, changing the value of the file pointer has no effect; the file pointer is restricted to the record currently selected (or to the end of the file).

* In RND files, the value of the file pointer becomes undefined after having written a record; in SEQ or SEQ/FTN files, the file pointer points to the end of the file after that operation.

* In RND files, most record searching functions start from the beginning of the catalog, and multiple searches do not require an explicit rewinding of the file. In SEQ or SEQ/FTN files, the same functions start their search from the current record, and multiple searches may require an explicit rewinding of the file.

The choice of the file type really depends on the application. A numerical model that reads a sequential stream of records at start-up and writes another stream of records at completion will get the most efficient input/output with SEQ/FTN or SEQ files. However, the vast majority of RPN standard files are of random access (RND) type, and this is the type recommended for beginners. SEQ type files are recommended over SEQ/FTN because they are transportable between heterogeneous systems.

The RPN standard file attributes

The RPN standard file attributes can be roughly divided into 5 categories, which are:

* field identification attributes

* time attributes

* spatial attributes

* internal representation attributes

* internal storage attributes

The usage of these attributes varies with the operation (reading, writing, querying) that is done on the file. Generally, only a subset of these attributes is needed for a given operation.

I - Classification by category

The following is a list of the attributes found in RPN standard files, as they are used in FORTRAN programs, classified by category. The ranges indicated here are for the file format FSTD89. See "III - Limitations of Size and Values" below for the new file format FSTD2000.

Field identification attributes

Data element                Suggested name    Data type         Range             
Variable name               NOMVAR            CHARACTER*2       UPPER CASE        
Type of field               TYPVAR            CHARACTER*1       UPPER CASE        
Label                       ETIKET            CHARACTER*8       UPPER CASE        
User defined index          IP3               INTEGER           0-4095            

Time attributes

Data element                Data type         Suggested name    Range             
Date of orig. analysis      INTEGER           DATEO             MMDDYYHHR         
Date of validity            INTEGER           DATEV             MMDDYYHHR         
Forecast hour               INTEGER           IP2               0-32767           
Length of time step         INTEGER           DEET              0-32767           
Time step number            INTEGER           NPAS              0-32767           

Spatial attributes

Data element                Data type         Suggested name    Range
Vertical level              INTEGER           IP1               0-32767
# of points along X         INTEGER           NI                1-32767
# of points along Y         INTEGER           NJ                1-32767           
# of points along Z         INTEGER           NK                1-4095            
Type of geographical        CHARACTER*1       GRTYP             UPPER CASE        
projection                                                                        
1st grid parameter          INTEGER           IG1               0-2047            
2nd grid parameter          INTEGER           IG2               0-2047            
3rd grid parameter          INTEGER           IG3               0-65535           
4th grid parameter          INTEGER           IG4               0-65535           

Internal representation attributes

Data element                Data type         Suggested name    Range             
Data type                   INTEGER           DATYP             0-5               
Packing ratio/ # of bits    INTEGER           NPAK              0-32 / -1 to -48  

Internal storage attributes

Data element                            Data type            Suggested name       
Erased field flag                       INTEGER              DLTF                 
Key                                     INTEGER              KEY                  
Length of record in host machine words  INTEGER              LNG                  
Starting address of rec. in host        INTEGER              SWA                  
machine words                                                                     
Unused number of bits in the last       INTEGER              UBC                  
word                                                                              
reserved for future use                 INTEGER              EXTRA1               
reserved for future use                 INTEGER              EXTRA2               
reserved for future use                 INTEGER              EXTRA3               

II - Classification by usage

The record attributes discussed above can also be divided into 3 categories:

* search attributes

* descriptive attributes

* internal attributes

The search attributes are the ones that must be used at all times when using the RPN standard file subroutines. They can be used as selection criteria for querying sets of records, and their value must be defined when writing a record into a file. The descriptive attributes need only be defined when writing a record into a file. Their value can be retrieved, but cannot be used to make queries. Finally, the values of the internal attributes are set by the standard file package; they can only be retrieved.

Here is the list of attributes, grouped according to this classification:

Search attributes

Data element                             Suggested name                           
Variable name                            NOMVAR                                   
Type of field                            TYPVAR                                   
Label                                    ETIKET                                   
Vertical level                           IP1                                      
Forecast hour                            IP2                                      
User defined index                       IP3                                      
Date of validity                         DATEV = DATEO + DEET * NPAS

Descriptive attributes

Data element                             Suggested name                           
Length of time step                      DEET                                     
Time step number                         NPAS                                     
Date of original analysis                DATEO                                    
Dimension of grid along the X-axis       NI                                       
Dimension of grid along the Y-axis       NJ                                       
Dimension of grid along the Z-axis       NK                                       
Type of geographical projection          GRTYP                                    
1st grid parameter                       IG1                                      
2nd grid parameter                       IG2                                      
3rd grid parameter                       IG3                                      
4th grid parameter                       IG4                                      
Numerical values data type               DATYP                                    
Packing ratio                            NPAK                                     

Internal attributes

Data element                             Suggested name                           
Erased field flag                        DLTF                                     
Key                                      KEY                                      
Length of record in machine words        LNG                                      
Starting address of record in machine    SWA                                      
words                                                                             
Unused number of bits in the last word   UBC                                      
reserved for future use                  EXTRA1                                   
reserved for future use                  EXTRA2                                   
reserved for future use                  EXTRA3                                   

III - Limitations of Size and Values

The record attributes discussed above have different limitations on size and values depending on the file format: old (FSTD89) or new (FSTD2000).

Attribute               FSTD89                  FSTD2000

Applies to CONVIP
IP1                     15 bits (0-32767)       28 bits (0-268435455)
IP2                     15 bits (0-32767)       28 bits (0-268435455)
IP3                     12 bits (0-4095)        28 bits (0-268435455)

Applies to CXGAIG
IG1                     11 bits (0-2047)        24 bits (0-16777215)
IG2                     11 bits (0-2047)        24 bits (0-16777215)
IG3                     16 bits (0-65535)       24 bits (0-16777215)
IG4                     16 bits (0-65535)       24 bits (0-16777215)

Applies to All FSTD Functions
NI                      16 bits (0-65535)       24 bits (0-16777215)
NJ                      16 bits (0-65535)       24 bits (0-16777215)
NK                      12 bits (0-4095)        20 bits (0-1048575)
NOMVAR                  2 chars                 4 chars (NOT case sensitive)
TYPVAR                  1 char                  2 chars (NOT case sensitive)
ETIKET                  8 chars                 12 chars (NOT case sensitive)

Maximum Record Size     2 Megabytes             128 Megabytes

Maximum File Size       250 Megabytes           32 Gigabytes
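
Each range in this table follows from the field width: an unsigned field of n bits holds values from 0 to 2**n - 1. The short function below is only an illustrative sketch of that arithmetic; the name "maxbit" is hypothetical and is not part of the RPN standard file library. It assumes default 32-bit integers, so it is meaningful for widths from 0 to 30 bits.

```fortran
c     Largest unsigned value that fits in a field of nbits bits.
c     maxbit is a hypothetical helper, not an RPN library routine.
c     Assumes 32-bit default integers (valid for nbits = 0 to 30).
      integer function maxbit(nbits)
      implicit none
      integer nbits
      maxbit = 2**nbits - 1
      return
      end
```

For instance, maxbit(15) gives the FSTD89 upper limit of IP1 and IP2, and maxbit(28) gives the FSTD2000 upper limit of IP1, IP2 and IP3.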

Detailed description of RPN standard file attributes

Variable name (NOMVAR):

This is a 2-letter code (upper case only) representing a meteorological parameter. A partial list of existing codes, which users should follow, appears in appendix B. Here are some examples: PN (sea level pressure), GZ (height), TT (air temperature), HR (relative humidity).

Type of field (TYPVAR):

This is a 1-letter code (upper case only) representing the origin of the data. As for the 2-letter name described above, a partial list of existing codes appears in appendix B. Some examples: A (analysis), C (climatology), P (forecast).

The official list of existing codes for NOMVAR and TYPVAR can be consulted on-line on the CID/CMC/RPN front-end computers using the "r.dict" command.

To get the dictionary definition of the code "GZ":

> r.dict -n gz

GZ Geopotential Height dam

To get the dictionary definition of the variable type "A":

> r.dict -t a

--A, ANALYSIS

To get the dictionary definition of all variable codes starting with "A":

> r.dict -n a.

AA Ammonium Aerosols (NH4) ppb

AL Albedo 0 to 1

AM Ammonia Gas (NH3) ppb

AP Planetary albedo 0 to 1

To get the dictionary definition of all existing variable codes:

> r.dict -n

(OUTPUT TOO LONG TO BE PRINTED HERE)

Label (ETIKET):

This is an 8-letter code (upper case only) allowing rapid identification of a field. The content of this label is normally left to the discretion of the user. It can be used to identify a numerical model, or the code of an experiment. Some examples: the label for the operational regional finite element model is 'FE OPRUN' (Finite Element OPerational RUN), the one for the spectral model is 'SEF79A21' (Spectral Elements Finis 79 waves 21 levels).

When coding the value of ETIKET in a FORTRAN program, always initialize the 8 characters of the label, such as

ETIKET = 'AA      '

instead of

ETIKET = 'AA'

because some implementations of FORTRAN may pad the remaining characters with null or random characters instead of spaces.

Vertical level (IP1):

This attribute represents a vertical level, expressed in one of several vertical coordinates.

In the FSTD89 format, the value of IP1 could be interpreted directly. However, in the FSTD2000 format this attribute is coded as a 28-bit integer that needs to be translated with the CONVIP utility.

Forecast hour (IP2):

This attribute normally represents the forecast hour (e.g. 12 hour forecast). Its value should normally be rounded to the nearest hour as given by the FORTRAN formula:

IP2 = ((NPAS * DEET+1800)/3600).
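
The rounding in this formula comes from FORTRAN integer division: adding 1800 seconds (half an hour) before dividing by 3600 turns truncation into rounding to the nearest hour. Here is a minimal sketch of the formula as a function; the name "ip2hr" is hypothetical and is not a routine of the standard file library.

```fortran
c     Forecast hour (IP2) rounded to the nearest hour, computed
c     from the time step number (npas) and the length of a time
c     step in seconds (deet).
c     ip2hr is a hypothetical helper, not an RPN library routine.
      integer function ip2hr(npas, deet)
      implicit none
      integer npas, deet
c     1800 s = half an hour; integer division then truncates
      ip2hr = (npas*deet + 1800)/3600
      return
      end
```

With a 900-second time step, step 3 (2700 seconds, or 0.75 hour) gives a forecast hour of 1, while step 1 (900 seconds, or 0.25 hour) gives 0.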

User defined identifier (IP3):

The content of this attribute is left to the user. It should be set to 0 when not used.

Length of time step (DEET):

This is the length of a time step used during a model integration, in seconds.

Time step number (NPAS):

This is the time step number at which the field was written during an integration. The number of the initial time step is 0.

Date of original analysis (DATEO) and date of validity (DATEV):

Before the 1989 release of the RPN standard file library, the date of origin (DATEO) was encoded in FORTRAN programs in the following format: WMMDDYYHHR, where

* W day of the week (1=Sunday, 7=Saturday)

* MM month(01 to 12)

* DD day of the month (01 to 31)

* YY year (00-49 = from 2000 to 2049, 50-99 = from 1950 to 1999)

* HH GMT hour (00-23)

* R Operational Run (0 to 7)

That format is still used for printout in RPN utilities, such as "voir". However, in the current release of the RPN standard file library, the 'W' part of the date-time stamp has been dropped, so that the actual format that should be used in FORTRAN programs is MMDDYYHHR (Example: 081192000 -> August 11th, 1992, at 00Z, run 0).
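
To make the MMDDYYHHR layout concrete, the following sketch splits a 9-digit stamp into its components with integer division and MOD, and applies the two-digit year rule given above. The subroutine name "datdec" and its argument list are hypothetical; this is an illustration of the encoding only, not a routine of the standard file library.

```fortran
c     Split a 9-digit MMDDYYHHR date-time stamp into its parts.
c     datdec is a hypothetical helper, not an RPN library routine.
      subroutine datdec(stamp, mm, dd, yyyy, hh, run)
      implicit none
      integer stamp, mm, dd, yyyy, hh, run
      integer yy
      run = mod(stamp, 10)
      hh  = mod(stamp/10, 100)
      yy  = mod(stamp/1000, 100)
      dd  = mod(stamp/100000, 100)
      mm  = stamp/10000000
c     two-digit year: 00-49 -> 2000-2049, 50-99 -> 1950-1999
      if (yy .lt. 50) then
         yyyy = 2000 + yy
      else
         yyyy = 1900 + yy
      endif
      return
      end
```

Calling datdec with the stamp 081192000 returns month 8, day 11, year 1992, hour 0 and run 0, which matches the example above.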

The date of validity (DATEV) of a field is closely associated with the date of origin. It is normally computed using the subroutine "incdat", which computes a date of validity from a date of original analysis and a time lapse in hours. Since deet*npas is expressed in seconds, it must first be converted to hours. The following sample of code shows how to compute datev from dateo, deet and npas.

      integer deltat, deet, npas, dateo, datev

c     elapsed time in hours, rounded to the nearest hour
      deltat = (deet*npas+1800)/3600
      call incdat(datev, dateo, deltat)

It is important to be aware of the difference between DATEO and DATEV. DATEO is used by the routines writing records into a file while DATEV is used by all routines querying records except one (FSTPRM).

Dimension of grid along the X, Y and Z axes (NI, NJ, NK):

This is the physical dimension of the grid along each spatial axis. On a geographical map, NI lies along the horizontal or X axis, NJ along the vertical or Y axis, and NK would point out of the map or along the Z axis.

Type of geographical projection (GRTYP) and grid parameters (IG1-IG2-IG3-IG4):

The usage of these parameters will be discussed extensively in appendix C, "Conventions regarding the usage of grid descriptors in RPN standard files".

Type of data (DATYP):

This is a numerical code indicating the data type of the numerical values stored in a record. This is the list of existing codes:

0: raw binary (unexportable among platforms)

1: floating point

2: integer

3: character

4: signed integer

5: IEEE style representation

Packing ratio (NPAK and NBITS):

In order to save disk space, the numerical values stored in RPN standard file records are not kept to their full precision (typically 32 bits on UNIX computers); they are usually compressed to occupy between 12 and 16 bits per floating point value. The compression can go as far as keeping only 1 bit per value.

At the time of creation of RPN standard files, users at CID/CMC/RPN counted memory and disk space in terms of words rather than bytes. At that time, the packing ratio was also expressed in items per word. A packing ratio of 4 (NPAK=4) meant that 4 floating point values could be stored with a precision of 16 bits into a 64-bit CRAY word. That precision becomes 15 bits on a CDC CYBER-720 60-bit word and 8 bits on a standard 32-bit word UNIX system. These differences in interpretation can be confusing, especially when the same code runs on different platforms. In order to get an absolute value for the arithmetic precision while keeping backward compatibility, the following standard has been adopted:

Let NBITS be the number of bits kept per floating point value.

NPAK = 0 -> No compaction, NBITS = (number of bits/word)

NPAK = 1 -> No compaction, NBITS = (number of bits/word)

NPAK > 0 -> NBITS = (number of bits/word) /NPAK

NPAK < 0 -> NBITS = -NPAK

On a 32-bit UNIX system, NPAK = 4 means that 8 bits are kept for each floating point value; NPAK = -16 means that 16 bits are kept for each value.

We recommend using a negative value of NPAK, so that the number of bits kept per item will always be absolute. As mentioned above, a value of -12 or -16 provides a good compromise between disk space and arithmetic precision. Positive values of NPAK are likely to be rejected by future versions of the software.
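
The NPAK convention given above can be written as a small function, which makes the platform dependence of positive values easy to see. This is only a sketch: the name "nbits" and the argument "nword" (the number of bits per word on the host machine) are hypothetical, and the function is not part of the standard file library.

```fortran
c     Number of bits kept per value for a given packing ratio.
c     npak  : packing ratio (>0) or minus the number of bits (<0)
c     nword : number of bits per word on the host machine
c     nbits is a hypothetical helper, not an RPN library routine.
      integer function nbits(npak, nword)
      implicit none
      integer npak, nword
      if (npak .lt. 0) then
c        negative value: absolute number of bits
         nbits = -npak
      else if (npak .le. 1) then
c        0 or 1: no compaction
         nbits = nword
      else
c        positive ratio: depends on the word size
         nbits = nword/npak
      endif
      return
      end
```

With NPAK = 4, the result is 16 for a 64-bit word, 15 for a 60-bit word and 8 for a 32-bit word, whereas NPAK = -16 gives 16 on every platform.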

The RPN standard file library

The creation and manipulation of standard files can be done either by calling a set of FORTRAN integer functions (the RPN standard file library) or by using RPN utilities such as PGSM and EDITFST. 13 of the library functions are presented in this document, the others being reserved for more specialized usage. All these functions return an INTEGER code. They can be divided into 5 categories.

5 auxiliary routines are often used in conjunction with the standard file library: the routines FNOM and FCLOS to associate and dissociate a FORTRAN unit number to a filename, the routines CXGAIG and CIGAXG to encode/decode the grid attributes IG1, IG2, IG3 and IG4, and the routine INCDAT to compute the date of validity from the date of origin and vice-versa.

Effect of standard file function calls on file pointer position

The following table lists the differences between standard file types regarding current file pointer position after the following function calls. (BOC=Beginning of catalog, BOF=Beginning of file, EOF=End of file, N/A=Not applicable)

Function name              RND type file              SEQ or SEQ/FTN type file   
FSTOUV                     BOC                        BOF                        
FSTFRM                     Undefined                  Undefined                  
FSTRWD                     N/A                        BOF                        
FSTECR                     Undefined                  EOF                        
FSTEFF                     Undefined                  N/A                        
FSTLIR                     Current record             Next record                
FSTLIS                     Current record             Next record                
FSTLUK                     Current record             Next record                
FSTINF                     Current record             Current record             
FSTINL                     Current record             EOF                        
FSTNBR                     Current record             N/A                        
FSTPRM                     Current record             Current record             
FSTSUI                     Current record             Current record             
FSTVOI                     BOC                        EOF                        

The following table lists the differences between standard file types regarding current file pointer position before calling query functions. (BOC=Beginning of catalog, BOF=Beginning of file, EOF=End of file, N/A=Not applicable)

Function name              RND type file              SEQ or SEQ/FTN type file   
FSTLIR                     BOC                        Current record             
FSTINF                     BOC                        Current record             
FSTINL                     BOC                        Current record             
FSTLIS                     Current record             Current record             
FSTSUI                     Current record             Current record             

A basic example: creating a standard file from ASCII data

The examples discussed in this section should be accessible on-line, in the directory $ARMNLIB/demo. To start working on your own, you can make a copy of this directory in your own HOME directory. Here is a suggested list of commands:

	% cd 
	% mkdir fstd
	% cp $ARMNLIB/demo/* fstd

The first example covers the conversion of data stored in ASCII format into the RPN standard file format. The file contains climatological monthly surface temperatures on a 120x60 grid, covering the globe and defined on a lat-lon projection. The data is stored in file "ts.asc" and will be converted into file "ts.fst".

The grid has the following structure: point (1,1) represents the southwest corner of the globe (-88.5 lat, 0 lon), while point (ni,nj) represents the northeast corner (88.5 lat, 357 lon). There is one grid point every 3 degrees of latitude and longitude.

                                              (NI, NJ)
   *** |---------------------------------------|
    *  |                                       |
    *  |                                       |
    *  |                                       |
    NJ |                                       |
    *  |                                       |
    *  |                                       |
    *  |                                       |
   *** |---------------------------------------|
 (1, 1) *                                     *
       |**************   NI  ******************
        *                                     *
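
The position of any grid point follows from the description of the grid: starting from (-88.5 lat, 0 lon) at point (1,1), each step along i adds 3 degrees of longitude and each step along j adds 3 degrees of latitude. The subroutine below is a minimal sketch of that relation for this particular grid; the name "gridll" is hypothetical and the routine is not part of the standard file library.

```fortran
c     Latitude and longitude, in degrees, of grid point (i,j) on
c     the 120x60 global grid of this example: 3-degree spacing,
c     point (1,1) located at (-88.5 lat, 0 lon).
c     gridll is a hypothetical helper, not an RPN library routine.
      subroutine gridll(i, j, rlat, rlon)
      implicit none
      integer i, j
      real rlat, rlon
      rlon = 3.0*real(i-1)
      rlat = -88.5 + 3.0*real(j-1)
      return
      end
```

For point (120,60) this gives 88.5 degrees of latitude and 357 degrees of longitude, the northeast corner described above.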

The input file has the following format: the first line contains the month of the year, followed by field values encoded in the following order: i, j, fld(i,j), fld(i+1,j), fld(i+2,j), fld(i+3,j), fld(i+4,j). Here are the first ten lines of the file.

            1
   1   1 -28.6152 -28.6816 -28.6015 -28.6230 -28.8769
   6   1 -28.8496 -28.7461 -28.6621 -28.5722 -28.7051
  11   1 -28.5605 -28.5742 -28.4902 -28.2500 -28.3613
  16   1 -28.1113 -28.2285 -28.2441 -28.1621 -28.2715
  21   1 -28.2070 -28.1133 -28.0410 -27.9941 -28.0879
  26   1 -28.1465 -28.2676 -28.3437 -28.3730 -28.4004
  31   1 -28.4394 -28.5703 -28.7109 -28.8008 -28.6797
  36   1 -28.5703 -28.5156 -28.4394 -28.3730 -28.4785
  41   1 -28.5664 -28.5410 -28.6367 -28.7871 -28.5468
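
As an illustration of this layout, the following sketch decodes one data line held in a character variable and stores its five values in the field array, after checking that the indices fall inside the grid. The function name "lnread" and its return convention (0 on success, -1 on error) are assumptions made for this example; the "ex1" program itself reads directly from unit 5. The sketch relies on a list-directed read from an internal file, which is accepted by Fortran 90 and later compilers and by many FORTRAN 77 compilers as an extension.

```fortran
c     Decode one data line "i j v1 v2 v3 v4 v5" and store the five
c     values in fld(i,j) to fld(i+4,j).
c     Returns 0 on success, -1 on a read error or if the indices
c     fall outside the grid.
c     lnread is a hypothetical helper, not an RPN library routine.
      integer function lnread(line, fld, ni, nj)
      implicit none
      character*(*) line
      integer ni, nj
      real fld(ni, nj)
      integer ii, jj, k, ios
      real v(5)
      lnread = -1
      read(line, *, iostat=ios) ii, jj, (v(k), k=1,5)
      if (ios .ne. 0) return
      if (ii .lt. 1 .or. ii+4 .gt. ni) return
      if (jj .lt. 1 .or. jj .gt. nj) return
      do 10 k = 1, 5
         fld(ii+k-1, jj) = v(k)
 10   continue
      lnread = 0
      return
      end
```

Applied to the third line of the listing above, the call fills fld(6,1) to fld(10,1) with the values -28.8496 to -28.7051.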

The FORTRAN code is stored in the file "ex1.f". A copy of the whole program appears in Appendix A. The "ex1" program uses 5 routines from the standard file library. Here is an overview of what is done by the program.

* Declare variables

* Associate a file name with a FORTRAN unit number (FNOM)

* Open the file (FSTOUV)

* Initialize proper standard file attributes

* Read the data contained in the ASCII file

* Write records into the standard file (FSTECR)

* Close the file (FSTFRM)

* Dissociate the file name from the FORTRAN unit number (FCLOS).

Let's look more closely at the different parts of the program.

Declare variables

In the first part we use the attribute names that were suggested in the "Detailed description of RPN standard file attributes" section.

      ---------------------------------------------------
      character*2 nomvar
      character*1 typvar, grtyp
      character*8 etiket
      
      integer key, dateo, deet, npas, ni, nj, nk, npak, datyp 
      integer ip1, ip2, ip3
      integer ig1, ig2, ig3, ig4
      logical rewrit
      ---------------------------------------------------

We then declare the functions used by the program.

      ---------------------------------------------------
      external fstecr
      external fnom, fstouv, fclos, fstfrm

      integer fstecr
      integer fnom, fstouv, fclos, fstfrm
      ---------------------------------------------------

We continue with the other variables. "ier" contains the return status of the functions used in the examples. "iun" contains the logical FORTRAN unit. "month" contains the month index that will be used to initialize the date of origin. "fld" will contain the values of surface temperature for each month, and "work" contains a working storage area for the "fstecr" function. Although there is a precise formula to compute and reduce the size of the "work" array, we will keep things simple and initialize it with the same dimensions as "fld".

      ---------------------------------------------------
      integer ier
      integer i,j,ii,jj,iun
      integer month
      real fld(120, 60), work(120, 60)
      ---------------------------------------------------

Associate a file name with a FORTRAN unit number (FNOM)

      ---------------------------------------------------
      iun = 1
      ier = fnom(iun, 'ts.fst', 'STD+RND', 0)
      if (ier.lt.0) then
         print *, 'Fatal error while opening the file'
c        nothing more can be done without the file
         stop
      endif
      ---------------------------------------------------

Open the file in random access mode (FSTOUV)

      ---------------------------------------------------
      iun = 1
      ier = fstouv(iun, 'STD+RND')
      ---------------------------------------------------

Initialize proper standard file attributes

      ---------------------------------------------------
      typvar = 'C'
      nomvar = 'TS'
      etiket = 'SFC TEMP'

      ip1 = 0
      ip2 = 0
      ip3 = 0

      ni = 120
      nj = 60
      nk = 1

      deet = 0
      npas = 0

      grtyp  = 'A'
      ig1 = 0
      ig2 = 0
      ig3 = 0
      ig4 = 0

      datyp = 1
      npak = -16
      ---------------------------------------------------

Read the data contained in the ASCII file.

We assume here that we are reading data from the console (through standard UNIX redirection) and that we know exactly the format of the input data.

      ---------------------------------------------------
      read(5, *) month
c     each line holds i, j and five consecutive values along i
      do 200 j = 1, 120*60/5
         read(5,*) ii,jj,fld(ii,jj),fld(ii+1,jj),fld(ii+2,jj),
     *             fld(ii+3,jj),fld(ii+4,jj)
 200  continue
      ---------------------------------------------------

Write records in the standard file (FSTECR).

We set the date so that each climatological average has the 1st of each month as date of origin. We then call integer function FSTECR with the attributes that were set at the beginning of the program. The last argument of the routine is a flag indicating that we don't want to replace any existing record that would have the same search attributes (ip1, ip2, ip3, typvar, nomvar, etiket). Note that the date of origin (DATEO) is not included in the search attributes.

      ---------------------------------------------------
      dateo = month * 10000000 + 0199000
      ier = fstecr(fld, work, npak, iun, dateo, deet, npas, ni, nj,
     *             nk, ip1, ip2, ip3, typvar, nomvar, etiket, grtyp,
     *             ig1, ig2, ig3, ig4, datyp, .false.)
      ---------------------------------------------------

Close the file (FSTFRM)

      ---------------------------------------------------
      ier = fstfrm(1)
      ---------------------------------------------------

Dissociate the file name from the FORTRAN unit number (FCLOS).

      ---------------------------------------------------
      ier = fclos(1)
      ---------------------------------------------------

To compile the program, type

% f77 ex1.f -o ex1 $ARMNLIB/lib/rmnxlib.a

To execute the program, type

% ex1 < ts.asc 

The program should then execute, producing the following output:

 UNIT =  1 RANDOM EST CREE
 UNIT =  1 EST OUVERT  RANDOM    
 ECRIT( 1)      0-TS C     0     0    0   120    60    1 SFC TEMP 4010199000      0     0  A    0    0     0     0 R16    1531  3604
 ECRIT( 1)      1-TS C     0     0    0   120    60    1 SFC TEMP 7020199000      0     0  A    0    0     0     0 R16    5161  3604
 ECRIT( 1)      2-TS C     0     0    0   120    60    1 SFC TEMP 1030199000      0     0  A    0    0     0     0 R16    8791  3604
 ECRIT( 1)      3-TS C     0     0    0   120    60    1 SFC TEMP 4040199000      0     0  A    0    0     0     0 R16   12421  3604
 ECRIT( 1)      4-TS C     0     0    0   120    60    1 SFC TEMP 6050199000      0     0  A    0    0     0     0 R16   16051  3604
 ECRIT( 1)      5-TS C     0     0    0   120    60    1 SFC TEMP 2060199000      0     0  A    0    0     0     0 R16   19681  3604
 ECRIT( 1)      6-TS C     0     0    0   120    60    1 SFC TEMP 4070199000      0     0  A    0    0     0     0 R16   23311  3604
 ECRIT( 1)      7-TS C     0     0    0   120    60    1 SFC TEMP 7080199000      0     0  A    0    0     0     0 R16   26941  3604
 ECRIT( 1)      8-TS C     0     0    0   120    60    1 SFC TEMP 3090199000      0     0  A    0    0     0     0 R16   30571  3604
 ECRIT( 1)      9-TS C     0     0    0   120    60    1 SFC TEMP 5100199000      0     0  A    0    0     0     0 R16   34201  3604
 ECRIT( 1)     10-TS C     0     0    0   120    60    1 SFC TEMP 1110199000      0     0  A    0    0     0     0 R16   37831  3604
 ECRIT( 1)     11-TS C     0     0    0   120    60    1 SFC TEMP 3120199000      0     0  A    0    0     0     0 R16   41461  3604
 UNITE FORTRAN IUN=   1 EST FERME

Let's look at the first two lines of the output.

 UNIT =  1 RANDOM EST CREE
 UNIT =  1 EST OUVERT  RANDOM    

These lines show that the file associated with logical unit 1 is created (UNIT = 1 RANDOM EST CREE), and then opened in random access mode (UNIT = 1 EST OUVERT RANDOM ).

We then have a message for each record that has been created.

 
ECRIT( 1)      0-TS C     0     0    0   120    60    1 SFC TEMP 4010199000      0     0  A    0    0     0     0 R16    1531  3604

The contents of the line show that a record has been written on unit 1, followed by a list of attributes. The order of the attributes is key(0), var. name(TS), field type(C), ip1(0), ip2(0), ip3(0), ni(120), nj(60), nk(1), label(SFC TEMP), date of origin(4010199000), deet(0), npas(0), grtyp(A), ig1(0), ig2(0), ig3(0), ig4(0), data type and packing ratio(R16) and finally two other attributes reserved for internal use: swa(1531) and lng(3604).

The name of the standard file produced by the program is ts.fst. We can inspect the contents of this file by invoking the RPN utility voir.

% voir -iment ts.fst

1

   ********************************************************************************************
   *                                                                                          *
   *             VOIR                                                            3.2          *
   *                                                                                          *
   *                                                                                          *
   *          Thu Aug 20 11:11:43 1999                                                        *
   *                                                                                          *
   *          BEGIN  EXECUTION                                                                *
   *                                                                                          *
   ********************************************************************************************
 UNIT = 10 EST OUVERT  RANDOM    
1 FILE=ts.fst                                                           TYPE=RANDOM        Thu Aug 20 1999  11:11:44     PAGE   1

      KEY#  ID    IP1   IP2  IP3   NI    NJ   NK  ETIQ.   DATE ORIG.   DEET  NPAS GR  IG1  IG2   IG3   IG4 DTY     SWA   LNG

        0-TS C     0     0    0   120    60    1 SFC TEMP 4010199000      0     0  A    0    0     0     0 R16    1531  3604
        1-TS C     0     0    0   120    60    1 SFC TEMP 7020199000      0     0  A    0    0     0     0 R16    5161  3604
        2-TS C     0     0    0   120    60    1 SFC TEMP 1030199000      0     0  A    0    0     0     0 R16    8791  3604
        3-TS C     0     0    0   120    60    1 SFC TEMP 4040199000      0     0  A    0    0     0     0 R16   12421  3604

        4-TS C     0     0    0   120    60    1 SFC TEMP 6050199000      0     0  A    0    0     0     0 R16   16051  3604
        5-TS C     0     0    0   120    60    1 SFC TEMP 2060199000      0     0  A    0    0     0     0 R16   19681  3604
        6-TS C     0     0    0   120    60    1 SFC TEMP 4070199000      0     0  A    0    0     0     0 R16   23311  3604
        7-TS C     0     0    0   120    60    1 SFC TEMP 7080199000      0     0  A    0    0     0     0 R16   26941  3604

        8-TS C     0     0    0   120    60    1 SFC TEMP 3090199000      0     0  A    0    0     0     0 R16   30571  3604
        9-TS C     0     0    0   120    60    1 SFC TEMP 5100199000      0     0  A    0    0     0     0 R16   34201  3604
       10-TS C     0     0    0   120    60    1 SFC TEMP 1110199000      0     0  A    0    0     0     0 R16   37831  3604
       11-TS C     0     0    0   120    60    1 SFC TEMP 3120199000      0     0  A    0    0     0     0 R16   41461  3604


 STATISTIQUES

 DIMENSION DU DIRECTEUR DISQUE       100
 NOMBRE D ENTREES UTILISEES           12
 LONGUEUR DU FICHIER               45090 MOTS
 NOMBRE D ECRITURES                   12
 NOMBRE DE RE-ECRITURES                0
 NOMBRE D EFFACAGES                    0
 NOMBRE D EXTENSIONS                   0
 NOMBRE DE CORRECTIONS                 0

 ***************************************** 
 UNITE FORTRAN IUN=  10 EST FERME

   ********************************************************************************************
   *                                                                                          *
   *             VOIR                                                           O.K.          *
   *                                                                                          *
   *          Thu Aug 20 11:11:44 1999                                                        *
   *                                                                                          *
   *          END EXECUTION                                                                   *
   *                                                                                          *
   *          CP SECS =      0.100                                                            *
   *                                                                                          *
   ********************************************************************************************


A second example: querying the contents of a standard file

Most of the RPN standard file functions (9 out of 13) are meant for querying and reading records. As mentioned in the "Classification by category" section, queries can be targeted to records using the following attributes:

Data element                             Suggested name                           
Variable name                            NOMVAR                                   
Type of field                            TYPVAR                                   
Label                                    ETIKET                                   
Vertical level                           IP1                                      
Forecast hour                            IP2                                      
User defined index                       IP3                                      
Date of validity                         DATEV = DATEO + DEET * NPAS
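
The last row of the table can be illustrated with a small calculation. The sketch below only computes the time elapsed between the date of origin and the date of validity. It assumes that DEET is the length of a time step in seconds, and the values of deet and npas are hypothetical. It does not attempt to add the offset to a date stamp, since date stamps are encoded values rather than plain counters.

```fortran
      program fcsthr
*     Sketch only: deet is assumed to be a time step in seconds
*     and the values below are hypothetical.
      integer deet, npas
      real hours
      deet = 1800
      npas = 6
*     Elapsed time between date of origin and date of validity
      hours = real(deet) * real(npas) / 3600.0
      print 10, deet, npas, hours
 10   format(' deet = ', i6, ' npas = ', i6, ' hours = ', f8.2)
      end
```

With these values the elapsed time is 3 hours. In the climatological file of example 1, deet and npas are both zero, so the date of validity is the same as the date of origin.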

Queries can be made by using precise values for the attributes, or by wildcarding some of them. Wildcarding an attribute means ignoring it when making a search. A standard file query is usually of the type "Give me the key of the first record where NOMVAR='GZ' and TYPVAR='P' and ETIKET='FE OPRUN' and IP1=1000 and IP2=3 and IP3=0 and DATEV=039217120". A query where IP1, IP2, IP3 and DATEV would be wildcarded (and thus ignored during the search) would look like "Give me the key of the first record where NOMVAR='GZ' and TYPVAR='P' and ETIKET='FE OPRUN'".

Wildcarding the integer attributes IP1, IP2, IP3 and DATEV is done by assigning them a value of -1; wildcarding the character attributes NOMVAR, TYPVAR and ETIKET is done by using a single blank, ' '.
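
The wildcard rule can be modelled in a few lines. The sketch below is not the search code of the library; imatch and cmatch are hypothetical helper functions written only to show how a query value of -1 (integer attributes) or a blank (character attributes) matches any record value.

```fortran
      program wild
*     Sketch only: a model of the wildcard rule described above,
*     not the library's search code.  imatch and cmatch are
*     hypothetical helper functions.
      logical imatch, cmatch
      external imatch, cmatch
*     ip1 = -1 in the query matches any ip1 in the record
      print *, 'ip1 -1 vs 1000  : ', imatch(-1, 1000)
      print *, 'ip1 500 vs 1000 : ', imatch(500, 1000)
*     a blank nomvar in the query matches any variable name
      print *, 'nomvar blank/GZ : ', cmatch(' ', 'GZ')
      print *, 'nomvar TT vs GZ : ', cmatch('TT', 'GZ')
      end

      logical function imatch(query, value)
      integer query, value
      imatch = (query .eq. -1) .or. (query .eq. value)
      return
      end

      logical function cmatch(query, value)
      character*(*) query, value
      cmatch = (query .eq. ' ') .or. (query .eq. value)
      return
      end
```

The first and third tests should report a match (T) and the other two a mismatch (F). A record is selected only when every attribute of the query matches in this sense.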

The program "ex2.f", printed in appendix A, computes simple statistics from the "ts.fst" file created in example 1. The program reads records meeting certain search criteria (functions FSTLIR and FSTLIS), finds their average, minimum and maximum values (subroutine STATFLD), and then prints the value of all the standard file attributes. Here is an overview of what is done by the program.

* Declare variables

* Associate a file name with a FORTRAN unit number (FNOM)

* Open the file (FSTOUV)

* Initialize proper standard file attributes needed by the FSTLIR function

* Read the first record matching search criteria (FSTLIR)

* Find minimum, maximum and average values (STATFLD - included in the main program)

* Get and print the values of all standard file attributes (FSTPRM)

* Repeat until no more records are found (FSTLIS)

* Close the standard file (FSTFRM)

* Dissociate the file name from the FORTRAN unit number (FCLOS).

The part of the code of "ex2.f" that declares variables and opens the standard file is very similar to "ex1.f". We will discuss only the parts of the program that are different.

The first part is a call to the "FSTNBR" function, which returns the number of records existing in a random standard file.

      ---------------------------------------------------
****
*     Get the number of records in the standard file
*     This function can only be used for random standard files
****
      nrecs = fstnbr(iun)
      print *, 'There are ', nrecs, ' records in that file'
      ---------------------------------------------------

The second part is a call to the "FSTVOI" function, which produces a listing (on standard output) of the attributes of all existing records in the standard file. This listing is identical to the one produced by the invocation of the "VOIR" utility.

      ---------------------------------------------------
****
*     Print the contents of the standard file directory
****
      ier = fstvoi(iun, 'STD+RND')
      if (ier.lt.0) then
         print *, '(FSTVOI) Cannot print the directory'
      endif
      ---------------------------------------------------

The next part initializes the standard file attributes to initiate a query for the first occurrence of the variable 'TS', where TYPVAR='C', ETIKET='SFC TEMP ', and where IP1, IP2 and IP3 are all zero. Note that DATEV (the date of validity) is the only wildcarded attribute in this call.

      ---------------------------------------------------
****
*     Initialize standard file variables for doing a query
****

      typvar = 'C'
      nomvar = 'TS'
      etiket = 'SFC TEMP '
      datev  = -1
      ip1 = 0
      ip2 = 0
      ip3 = 0
      ---------------------------------------------------

The FSTLIR function locates the first record meeting the search criteria, and then loads the field values into the FLD array.[2]

      
      ---------------------------------------------------
****
*     Reads the first field matching selection criteria
****
      key = fstlir(FLD, iun, NI, NJ, NK, datev, etiket, 
     *             ip1, ip2, ip3, typvar, nomvar)
      ---------------------------------------------------

The FSTLIR function returns a negative value if it cannot locate a record matching the search criteria.

      ---------------------------------------------------
 50   if (key.lt.0) then
         print *, '(FSTLIR) Invalid key number:', key
      ---------------------------------------------------

The next part of the code contains a loop that reads all the other records in the standard file meeting the search criteria (using "FSTLIS"), until the end of file is reached (i.e. until a key less than zero is returned). For each record found, we compute its minimum, maximum and average values (subroutine STATFLD), and get the values of its attributes using the "FSTPRM" function.

Here is a schematic view of the loop:

        --->    if (key.lt.0) then
        |           print error message (the loop ends here)
        |       else
        |           compute min, max, avg
        |           get and print the value of all attributes
        |           read next field matching search criteria
        |-----------    (goto)
                endif

and the real code:

      ---------------------------------------------------
 50   if (key.lt.0) then
         print *, '(FSTLIR) Invalid key number:', key
      else
****
*        Computes minimum, maximum and average value of the field
****
         call statfld(minval, maxval, avgval, fld, ni, nj)
****
*        Get all standard file parameters and print them
****
         ier  = fstprm(key, dateo, deet, npas, ni, nj, nk, 
     *                 nbits, datyp, ip1, ip2, ip3, 
     *                 typvar, nomvar, etiket, grtyp,
     *                 ig1, ig2, ig3, ig4, swa, lng, dltf, ubc, 
     *                 extra1, extra2, extra3)

         print *, '*****************************************'
         print *, ' minval = ', minval, 'maxval =', maxval, 
     *            'avgval = ', avgval

         print 10, nomvar, typvar, etiket, dateo, deet, npas,
     *             ni, nj, nk,nbits, datyp, ip1, ip2, ip3,
     *             grtyp, ig1, ig2, ig3, ig4,
     *             swa, lng, dltf, ubc, extra1, extra2, extra3

****
*        Try to read the next field matching selection criteria set
*        by the first call to FSTLIR.
****
         key  = fstlis(fld, iun, NI, NJ, NK)
         goto 50
      endif
      ---------------------------------------------------
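
The STATFLD subroutine itself is part of "ex2.f" in appendix A. As a rough idea of what such a routine involves, here is one possible version, tested on a small hypothetical 3 x 2 field. It is a sketch and may differ from the appendix. The dummy argument names fmin, fmax and favg are chosen here for illustration.

```fortran
      program tstat
*     Sketch only: one possible STATFLD, tested on a small
*     hypothetical 3 x 2 field.
      integer ni, nj
      parameter (ni = 3, nj = 2)
      real fld(ni, nj), fmin, fmax, favg
      data fld /1.0, 2.0, 3.0, 4.0, 5.0, -3.0/
      call statfld(fmin, fmax, favg, fld, ni, nj)
      print *, 'min = ', fmin, ' max = ', fmax, ' avg = ', favg
      end

      subroutine statfld(fmin, fmax, favg, fld, ni, nj)
      integer ni, nj, i, j
      real fmin, fmax, favg, fld(ni, nj), total
*     Start from the first point, then scan the whole field
      fmin  = fld(1, 1)
      fmax  = fld(1, 1)
      total = 0.0
      do 20 j = 1, nj
         do 10 i = 1, ni
            fmin  = min(fmin, fld(i, j))
            fmax  = max(fmax, fld(i, j))
            total = total + fld(i, j)
 10      continue
 20   continue
      favg = total / real(ni * nj)
      return
      end
```

For the test field the minimum is -3, the maximum is 5 and the average is 2. Note that this is a plain average over grid points, with no weighting by the area each point represents.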

Here is the output produced by the program:

 ***************************************** 
 LU( 1)      0-TS C     0     0    0   120    60    1 SFC TEMP 6010199000      0     0  A    0    0     0     0 R16    1531  3604
 *****************************************
  minval =   -48.25580    maxval =   31.51569    avgval =    3.220830    
  nomvar=        TS typvar=         C etiket=  SFC TEMP
  dateo=   10199000 deet=           0 npas=           0
  ni=           120 nj=            60 nk=             1
  nbits=         16 datyp=          1
  ip1=            0 ip2=            0 ip3=            0
  grtyp=          A ig1=            0 ig2=            0 ig3=            0 ig4=            0
  swa=         1531 lng=         3604 dltf=           0 ubc=            0
  extra1=         0 extra2=         0 extra3=         0
 LU( 1)      1-TS C     0     0    0   120    60    1 SFC TEMP 2020199000      0     0  A    0    0     0     0 R16    5161  3604
 *****************************************
  minval =   -49.15850    maxval =   30.90791    avgval =    2.926342    
  nomvar=        TS typvar=         C etiket=  SFC TEMP
  dateo=   20199000 deet=           0 npas=           0
  ni=           120 nj=            60 nk=             1
  nbits=         16 datyp=          1
  ip1=            0 ip2=            0 ip3=            0
  grtyp=          A ig1=            0 ig2=            0 ig3=            0 ig4=            0
  swa=         5161 lng=         3604 dltf=           0 ubc=            0
  extra1=         0 extra2=         0 extra3=         0
 LU( 1)      2-TS C     0     0    0   120    60    1 SFC TEMP 2030199000      0     0  A    0    0     0     0 R16    8791  3604
 *****************************************
  minval =   -63.80290    maxval =   30.91976    avgval =    2.481087    
  nomvar=        TS typvar=         C etiket=  SFC TEMP
  dateo=   30199000 deet=           0 npas=           0
  ni=           120 nj=            60 nk=             1
  nbits=         16 datyp=          1
  ip1=            0 ip2=            0 ip3=            0
  grtyp=          A ig1=            0 ig2=            0 ig3=            0 ig4=            0
  swa=         8791 lng=         3604 dltf=           0 ubc=            0
  extra1=         0 extra2=         0 extra3=         0

(...OUTPUT TOO LONG TO BE PRINTED HERE)

Other query methods

The remaining part of this document will show alternative methods of locating the records the program "ex2.f" searches for.

It may sometimes be useful to retrieve record information before reading data values. For example, it may be valuable to verify the size of a record (in terms of NI, NJ, NK) before loading it into memory. This can be done by using the "FSTINF" function. The next occurrence of a record meeting the search criteria defined by "FSTINF" can be found by using the "FSTSUI" function.

Once a record has been located by "FSTINF", it can be loaded in memory by a call to "FSTLUK", which uses the key returned by "FSTINF". In fact, the code

      key1 = fstlir(FLD, iun, NI, NJ, NK, datev, etiket,
     *              ip1, ip2, ip3, typvar, nomvar)
      key2 = fstlis(FLD, iun, NI, NJ, NK)

produces the same result as

      key1 = fstinf(iun, NI, NJ, NK, datev, etiket,
     *              ip1, ip2, ip3, typvar, nomvar)
      ier  = fstluk(FLD, key1, NI, NJ, NK)
      key2 = fstsui(iun, NI, NJ, NK)
      ier  = fstluk(FLD, key2, NI, NJ, NK)

Sample code using "FSTINF", "FSTLUK" and "FSTSUI" can be found in the program "ex3.f".

The last function to be presented in this document is "FSTINL". The "FSTINL" calling sequence is similar to "FSTINF", except that it returns a list of record keys satisfying search criteria. So, instead of executing the following loop NKEYS times,

      key = fstinf(iun, NI, NJ, NK, datev, etiket,
     *             ip1, ip2, ip3, typvar, nomvar)
 10   if (key.ge.0) then
*        do something
         key = fstsui(iun, NI, NJ, NK)
         goto 10
      endif

it is possible to use

      integer maxkeys
      parameter (maxkeys = 100)
      integer keys(maxkeys), nkeys

      (...)

      ier = fstinl(iun, NI, NJ, NK, datev, etiket,
     *             ip1, ip2, ip3, typvar, nomvar,
     *             KEYS, NKEYS, maxkeys)

      do 100 i = 1, nkeys
         ier = fstluk(FLD, keys(i), NI, NJ, NK)
*        do something
 100  continue

Sample code using "FSTINL" can be found in the file "ex4.f".


Source code examples

FORTRAN source code: