An introduction to RPN standard files
Yves Chartier
This document gives an overview of the RPN standard file format, which has been in use at the CID/CMC/RPN complex since 1980. This file format is used to store gridded data from numerical weather prediction models, objective analyses and geophysical fields.
This document is intended as an introduction for people having to work with this data format. It is assumed that the reader has a working knowledge of FORTRAN, and has coded algorithms manipulating gridded data (1, 2, or 3 dimensional arrays).
After having read this document, the reader should be able to create and manipulate RPN standard files using FORTRAN.
The RPN standard file format was created by Michel Valin, from the "Section Informatique" of RPN, in the late 1970's. At that time, the CID/CMC/RPN super-computer was a CYBER 7600 (2 Mbytes RAM, 36 Mips), to which users were submitting batch jobs from a CYBER-171 front-end (512K memory, 1 Mips). Punched cards were the most common way to communicate with the computer, and 300 Baud was the standard communication speed for the lucky few who had access to a terminal.
The production system of numerical forecasts was already quite complex. It was composed of dozens of applications, most of which had their own, non-standard and undocumented data format. The majority of data files were without internal structure, a record number being often the only clue that one had to locate a given data item (e.g. to compute the height thickness between 1000 and 500 mb, one had to know that records #78 and #89 from file "xyz" had to be accessed). It is left to the reader to imagine the amount of work involved in modifying the horizontal, vertical or temporal resolutions of the models, which could alter the distribution of the data fields in the files (e.g. use rec. #117 instead of #78 and rec. #125 instead of #89).
The first release of the RPN standard file software appeared in 1980. The advantages of the new data format over the previous ones were numerous:
* each meteorological record stored in a standard file had several non-ambiguous identifiers (name, a pressure level, a forecast hour, a date of origin, precise geographical location, etc...).
* files were sequential or direct access, allowing fast positioning and retrieval of data.
* any application dealing with gridded data only had one single format to deal with, thereby eliminating the cost of using dozens of other non-descriptive formats.
* record manipulation was done through high-level subroutines; the underlying structure of the files could then change without requiring users to modify their applications.
A second release of the software came in 1982, offering new enhancements. The most significant was the total transportability of files, bit for bit, to almost any type of computer.
Some features of the original standard file software needed 60 or 64-bit machines (e.g. packing 8 Hollerith characters or storing a 10-digit integer in an integer variable). But in 1989, the CID/CMC/RPN complex acquired numerous UNIX workstations and servers, all of which were 32-bit machines. A third release of the software was therefore needed to accommodate the new platforms. At the same time several other enhancements were added to this release, such as a streamlined interface and improved internal management.
General structure
From an end-user point of view, the RPN standard file package can be compared to commercial flat-file database software. The basic unit of storage is a file, containing numerous records of gridded meteorological fields. A record contains a 1, 2 or 3-D grid representing the values of a field, with various attributes that give information about the field.
The RPN standard files come in three distinct flavors: random access (RND - which is not equivalent to direct access in FORTRAN), indexable sequential (SEQ) and FORTRAN sequential (SEQ/FTN). These flavors define the way records are accessed within the file. In SEQ and SEQ/FTN files, an application that wants to access the 50th record of a file needs to read or skip the first 49 records before accessing it, whereas in a RND file, that record can be accessed directly.
An important concept that comes with these different file types is the pointer position; this has an impact on the behavior of some of the standard file functions. RND type files have a catalog of the contents of the file, which is always resident in computer memory. For these, the file pointer is an index that is attached to the catalog only. In SEQ and SEQ/FTN file type, there is no catalog of the contents of a file, and the file pointer is physically tied to the file. It points either to the beginning of file, to the current record or to the end of file.
Here are some of the differences found between RND and SEQ files.
* In RND files, all the attributes of a given record can be retrieved directly if the file pointer contains a valid position. In SEQ or SEQ/FTN type files, changing the value of the file pointer has no effect; the file pointer is restricted to the record currently selected (or to the end of the file).
* In RND files, the value of the file pointer becomes undefined after having written a record; in SEQ or SEQ/FTN files, the file pointer points to the end of the file after that operation.
* In RND files, most record searching functions start from the beginning of the catalog, and multiple searches do not require an explicit rewinding of the file. In SEQ or SEQ/FTN files, the same functions start their search from the current record, and multiple searches may require an explicit rewinding of the file.
The choice of the file type really depends on the application. A numerical model that reads a sequential stream of records at start-up and writes another stream of records at completion will get the most efficient input/output with SEQ/FTN or SEQ files. However, the vast majority of RPN standard files are of random access (RND) type, and this is the type recommended for beginners. SEQ type files are recommended over SEQ/FTN because they are transportable between heterogeneous systems.
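The RPN standard file attributes can be roughly divided into 5 categories, which are:
* field identification attributes
* time attributes
* spatial attributes
* internal representation attributes
* internal storage attributes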
The usage of these attributes varies with the operation (reading, writing, querying) that is done on the file. Generally, only a subset of the attributes described above is needed for a given operation.
The following is a list of the attributes found in RPN standard files, as they are used in FORTRAN programs, classified by category. The ranges indicated here are for the FSTD89 file format. See "III - Limitations of sizes and values" below for the new FSTD2000 file format.
Field identification attributes
Data element         Suggested name   Data type     Range
Variable name        NOMVAR           CHARACTER*2   UPPER CASE
Type of field        TYPVAR           CHARACTER*1   UPPER CASE
Label                ETIKET           CHARACTER*8   UPPER CASE
User defined index   IP3              INTEGER       0-4095
Time attributes
Data element             Data type   Suggested name   Range
Date of orig. analysis   INTEGER     DATEO            MMDDYYHHR
Date of validity         INTEGER     DATEV            MMDDYYHHR
Forecast hour            INTEGER     IP2              0-32767
Length of time step      INTEGER     DEET             0-32767
Time step number         INTEGER     NPAS             0-32767
Spatial attributes
Data element                      Data type     Suggested name   Range
Vertical level                    INTEGER       IP1              0-32767
# of points along X               INTEGER       NI               1-32767
# of points along Y               INTEGER       NJ               1-32767
# of points along Z               INTEGER       NK               1-4095
Type of geographical projection   CHARACTER*1   GRTYP            UPPER CASE
1st grid parameter                INTEGER       IG1              0-2047
2nd grid parameter                INTEGER       IG2              0-2047
3rd grid parameter                INTEGER       IG3              0-65535
4th grid parameter                INTEGER       IG4              0-65535
Internal representation attributes
Data element                Data type   Suggested name   Range
Data type                   INTEGER     DATYP            0-5
Packing ratio / # of bits   INTEGER     NPAK             0-32 / -1 to -48
Internal storage attributes
Data element                                     Data type   Suggested name
Erased field flag                                INTEGER     DLTF
Key                                              INTEGER     KEY
Length of record in host machine words           INTEGER     LNG
Starting address of rec. in host machine words   INTEGER     SWA
Unused number of bits in the last word           INTEGER     UBC
reserved for future use                          INTEGER     EXTRA1
reserved for future use                          INTEGER     EXTRA2
reserved for future use                          INTEGER     EXTRA3
The record attributes discussed above can also be divided in 3 categories:
* search attributes
* descriptive attributes
* internal attributes
The search attributes are the ones that must be used at all times when using the RPN standard file subroutines. They can be used as selection criteria for querying sets of records, and their value must be defined when writing a record into a file. The descriptive attributes need only to be defined when writing a record into a file. Their value can be retrieved, but cannot be used to make queries. Finally, the values of the internal attributes are set by the standard file package; they can only be retrieved.
Here is the list of attributes, grouped according to this classification:
Search attributes
Data element         Suggested name
Variable name        NOMVAR
Type of field        TYPVAR
Label                ETIKET
Vertical level       IP1
Forecast hour        IP2
User defined index   IP3
Date of validity     DATEV = DATEO + DEET * NPAS
Descriptive attributes
Data element                         Suggested name
Length of time step                  DEET
Time step number                     NPAS
Date of original analysis            DATEO
Dimension of grid along the X-axis   NI
Dimension of grid along the Y-axis   NJ
Dimension of grid along the Z-axis   NK
Type of geographical projection      GRTYP
1st grid parameter                   IG1
2nd grid parameter                   IG2
3rd grid parameter                   IG3
4th grid parameter                   IG4
Numerical values data type           DATYP
Packing ratio                        NPAK
Internal attributes
Data element                                    Suggested name
Erased field flag                               DLTF
Key                                             KEY
Length of record in machine words               LNG
Starting address of record in machine words     SWA
Unused number of bits in the last word          UBC
reserved for future use                         EXTRA1
reserved for future use                         EXTRA2
reserved for future use                         EXTRA3
The record attributes discussed above have different limitations on size and values depending on the file format: old (FSTD89) or new (FSTD2000).
                        FSTD89               FSTD2000
Applies to CONVIP
  IP1                   15 bits (0-32767)    28 bits (0-268435455)
  IP2                   15 bits (0-32767)    28 bits (0-268435455)
  IP3                   12 bits (0-4095)     28 bits (0-268435455)
Applies to CXGAIG
  IG1                   11 bits (0-2047)     24 bits (0-16777215)
  IG2                   11 bits (0-2047)     24 bits (0-16777215)
  IG3                   16 bits (0-65535)    24 bits (0-16777215)
  IG4                   16 bits (0-65535)    24 bits (0-16777215)
Applies to All FSTD Functions
  NI                    16 bits (0-65535)    24 bits (0-16777215)
  NJ                    16 bits (0-65535)    24 bits (0-16777215)
  NK                    12 bits (0-4095)     20 bits (0-1048575)
  NOMVAR                2 chars              4 chars (NOT case sensitive)
  TYPVAR                1 chars              2 chars (NOT case sensitive)
  ETIKET                8 chars              12 chars (NOT case sensitive)
  Maximum Record Size   2 Megabytes          128 Megabytes
  Maximum File Size     250 Megabytes        32 Gigabytes
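The bit widths listed above translate directly into a range check that an application could run before writing a record. The following is only a sketch: the function name "fits" and the program name are hypothetical and are not part of the RPN standard file library.

```fortran
      program chklim
      implicit none
      logical fits
      external fits
!     IP1 is 15 bits wide in FSTD89 and 28 bits wide in FSTD2000
      print *, 'ip1=40000 in FSTD89:   ', fits(40000, 15)
      print *, 'ip1=40000 in FSTD2000: ', fits(40000, 28)
!     NK is 12 bits wide in FSTD89
      print *, 'nk=4095 in FSTD89:     ', fits(4095, 12)
      end program chklim

!     Hypothetical helper, not a library routine.
!     True when 0 <= ival <= 2**nbits - 1 (nbits must be <= 30)
      logical function fits(ival, nbits)
      implicit none
      integer ival, nbits
      fits = ival .ge. 0 .and. ival .le. 2**nbits - 1
      end function fits
```

With these values the first check fails (40000 exceeds 32767) and the other two pass.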
The official list of existing codes for NOMVAR and TYPVAR can be invoked on-line on the CID/CMC/RPN front-end computers using the "r.dict" command.
To get the dictionary definition of the code "GZ":
> r.dict -n gz
GZ Geopotential Height dam
To get the dictionary definition of the variable type "A":
> r.dict -t a
--A, ANALYSIS
To get the dictionary definition of all variable codes starting with "A":
> r.dict -n a.
AA Ammonium Aerosols (NH4) ppb
AL Albedo 0 to 1
AM Ammonia Gas (NH3) ppb
AP Planetary albedo 0 to 1
To get the dictionary definition of all existing variable codes:
> r.dict -n
(OUTPUT TOO LONG TO BE PRINTED HERE)
When coding the value of ETIKET in a FORTRAN program, always initialize the 8 characters of the label, such as
ETIKET = 'AA      '
instead of
ETIKET = 'AA'
because some implementations of FORTRAN may not pad the remaining characters with spaces but pad with null or random characters.
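In the FSTD89 format the value of IP1 could be interpreted directly. However in the FSTD2000 format this attribute is coded as a 28-bit integer that needs to be translated with the CONVIP utility.
The forecast hour IP2 is normally derived from the time step length and the time step number, rounded to the nearest hour: IP2 = (NPAS * DEET + 1800) / 3600.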
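The effect of the +1800 term in the formula above is easiest to see with numbers. The sketch below assumes that DEET is expressed in seconds, which is what the division by 3600 implies; the function name "ip2hr" is hypothetical.

```fortran
      program fcsthr
      implicit none
      integer ip2hr
      external ip2hr
!     96 steps of 450 s = 43200 s = exactly 12 h
      print *, ip2hr(450, 96)
!     100 steps of 450 s = 45000 s = 12.5 h, rounded up to 13
      print *, ip2hr(450, 100)
      end program fcsthr

!     Hypothetical helper, not a library routine.
!     Forecast hour from time step length (assumed in seconds)
!     and time step number, rounded to the nearest hour
      integer function ip2hr(deet, npas)
      implicit none
      integer deet, npas
      ip2hr = (npas*deet + 1800)/3600
      end function ip2hr
```

Because FORTRAN integer division truncates, adding half an hour (1800 s) before dividing turns the truncation into rounding.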
The date stamp is built from the following fields:
* W day of the week (1=Sunday, 7=Saturday)
* MM month (01 to 12)
* DD day of the month (01 to 31)
* YY year (00-49 = from 2000 to 2049, 50-99 = from 1950 to 1999)
* HH GMT hour (00-23)
* R Operational Run (0 to 7)
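The fields listed above can be extracted from a date stamp with integer division and MOD. The sketch below assumes the 9-digit MMDDYYHHR form given in the attribute tables (without the leading W digit), which fits in a 32-bit integer; the subroutine name "splitd" is hypothetical.

```fortran
      program decdat
      implicit none
      integer mm, dd, yyyy, hh, run
!     120199000 = month 12, day 01, year 99, hour 00, run 0
      call splitd(120199000, mm, dd, yyyy, hh, run)
      print *, mm, dd, yyyy, hh, run
      end program decdat

!     Hypothetical helper, not a library routine.
!     Split a date stamp of the form MMDDYYHHR into its fields
      subroutine splitd(stamp, mm, dd, yyyy, hh, run)
      implicit none
      integer stamp, mm, dd, yyyy, hh, run, yy
      run = mod(stamp, 10)
      hh = mod(stamp/10, 100)
      yy = mod(stamp/1000, 100)
      dd = mod(stamp/100000, 100)
      mm = stamp/10000000
!     Two-digit years: 00-49 mean 2000-2049, 50-99 mean 1950-1999
      if (yy .lt. 50) then
         yyyy = 2000 + yy
      else
         yyyy = 1900 + yy
      endif
      end subroutine splitd
```

For the stamp used here the decoded fields are month 12, day 1, year 1999, hour 0 and run 0.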
The date of validity (DATEV) of a field is closely associated with the date of origin. It is normally computed using the subroutine "incdat", which computes a date of validity from a date of original analysis and a time lapse defined by deet*npas (in hours). The following sample of code shows how to compute datev from dateo, deet and npas.
      integer deltat, deet, npas, dateo, datev
      deltat = (deet*npas+1800)/3600
      call incdat(DATEV, dateo, deltat)
It is important to be aware of the difference between DATEO and DATEV. DATEO is used by the routines writing records into a file while DATEV is used by all routines querying records except one (FSTPRM).
The data type attribute (DATYP) takes one of the following values:
0: raw binary (unexportable among platforms)
1: floating point
2: integer
3: character
4: signed integer
5: IEEE style representation
At the time of creation of RPN standard files, users at CID/CMC/RPN counted memory and disk space in terms of words rather than bytes. At that time, the packing ratio was also expressed in items per word. A packing ratio of 4 (NPAK=4) meant that 4 floating point values could be stored with a precision of 16 bits into a 64-bit CRAY word. That precision becomes 15 bits on a CDC CYBER-720 60-bit word and 8 bits on a standard 32-bit word UNIX system. These differences in interpretation can be confusing, especially when the same code runs on different platforms. In order to get an absolute value for the arithmetic precision while keeping backward compatibility, the following standard has been adopted:
Let NBITS be the number of bits kept per floating point value.
NPAK = 0 -> No compaction, NBITS = (number of bits/word)
NPAK = 1 -> No compaction, NBITS = (number of bits/word)
NPAK > 0 -> NBITS = (number of bits/word) /NPAK
NPAK < 0 -> NBITS = -NPAK
On a 32-bit UNIX system, NPAK = 4 means that 8 bits are kept for each floating point value; NPAK = -16 means that 16 bits are kept for each value.
We recommend using a value of NPAK < 0, so that the number of bits kept per item will always be absolute. As mentioned above, a value of -12 or -16 provides a good compromise between disk space and arithmetic precision. Positive values of NPAK are likely to be rejected by future versions of the software.
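The NPAK convention given above can be written as a small function. This is only an illustration of the rules as stated; the name "nbitsf" and its second argument (the host word size in bits) are hypothetical and do not belong to the library.

```fortran
      program packng
      implicit none
      integer nbitsf
      external nbitsf
!     32-bit word: NPAK = 4 keeps 8 bits, NPAK = -16 keeps 16 bits
      print *, nbitsf(4, 32), nbitsf(-16, 32)
!     Same NPAK = 4 on 64-bit and 60-bit words: 16 and 15 bits
      print *, nbitsf(4, 64), nbitsf(4, 60)
      end program packng

!     Hypothetical helper, not a library routine.
!     Bits kept per value for a given NPAK and host word size
      integer function nbitsf(npak, nbword)
      implicit none
      integer npak, nbword
      if (npak .lt. 0) then
         nbitsf = -npak
      else if (npak .le. 1) then
         nbitsf = nbword
      else
         nbitsf = nbword/npak
      endif
      end function nbitsf
```

The second print statement shows why positive values are discouraged: the same NPAK yields a different precision on each word size, while a negative NPAK does not depend on the host.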
The creation and manipulation of standard files can be done either by calling a set of FORTRAN integer functions (the RPN standard file library) or by using RPN utilities such as PGSM and EDITFST. Thirteen of the library functions are presented in this document, the others being reserved for more specialized usage. All these functions return an INTEGER code. They can be divided into 5 categories.
The following table lists the differences between standard file types regarding current file pointer position after the following function calls. (BOC=Beginning of catalog, BOF=Beginning of file, EOF=End of file, N/A=Not applicable)
Function name   RND type file    SEQ or SEQ/FTN type file
FSTOUV          BOC              BOF
FSTFRM          Undefined        Undefined
FSTRWD          N/A              BOF
FSTECR          Undefined        EOF
FSTEFF          Undefined        N/A
FSTLIR          Current record   Next record
FSTLIS          Current record   Next record
FSTLUK          Current record   Next record
FSTINF          Current record   Current record
FSTINL          Current record   EOF
FSTNBR          Current record   N/A
FSTPRM          Current record   Current record
FSTSUI          Current record   Current record
FSTVOI          BOC              EOF
The following table lists the differences between standard file types regarding current file pointer position before calling query functions. (BOC=Beginning of catalog, BOF=Beginning of file, EOF=End of file, N/A=Not applicable)
Function name   RND type file    SEQ or SEQ/FTN type file
FSTLIR          BOC              Current record
FSTINF          BOC              Current record
FSTINL          BOC              Current record
FSTLIS          Current record   Current record
FSTSUI          Current record   Current record
The examples discussed in this section should be accessible on-line, in the directory $ARMNLIB/demo. To start working on your own, you can make a copy of this directory in your own HOME directory. Here is a suggested list of commands:
% cd
% mkdir fstd
% cp $ARMNLIB/demo/* fstd
The first example covers the conversion of data stored in ASCII format into the RPN standard file format. The file contains climatological monthly surface temperatures on a 120x60 grid, covering the globe and defined on a lat-lon projection. The data is stored in file "ts.asc" and will be converted into file "ts.fst".
The grid has the following structure: point (1,1) represents the southwest corner of the globe (-88.5 lat, 0 lon), while point (ni,nj) represents the northeast corner (88.5 lat, 357 lon). There is one grid point every 3 degrees of latitude and longitude.
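The grid description above fixes the position of every point. As a quick check of the two corners, here is a minimal sketch; the subroutine name "ij2ll" is hypothetical, and the origin and 3-degree spacing are taken from the paragraph above.

```fortran
      program gridpt
      implicit none
      real lat, lon
      call ij2ll(1, 1, lat, lon)
      print *, 'point (1,1):    ', lat, lon
      call ij2ll(120, 60, lat, lon)
      print *, 'point (120,60): ', lat, lon
      end program gridpt

!     Hypothetical helper, not a library routine.
!     Latitude and longitude (degrees) of grid point (i,j) on the
!     3-degree global grid whose point (1,1) is at (-88.5, 0)
      subroutine ij2ll(i, j, lat, lon)
      implicit none
      integer i, j
      real lat, lon
      lat = -88.5 + 3.0*real(j - 1)
      lon = 3.0*real(i - 1)
      end subroutine ij2ll
```

Point (120,60) comes out at 88.5 degrees of latitude and 357 degrees of longitude, in agreement with the northeast corner quoted above.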
                                              (NI, NJ)
 ***  |---------------------------------------|
  *   |                                       |
  *   |                                       |
  *   |                                       |
  NJ  |                                       |
  *   |                                       |
  *   |                                       |
  *   |                                       |
 ***  |---------------------------------------|
   (1, 1)
      *                                       *
      |***************** NI *******************
      *                                       *

The input file has the following format: the first line contains the month of the year, followed by field values encoded in the following order: i, j, fld(i,j), fld(i+1,j), fld(i+2,j), fld(i+3,j), fld(i+4,j). Here are the first ten lines of the file.
1
 1 1 -28.6152 -28.6816 -28.6015 -28.6230 -28.8769
 6 1 -28.8496 -28.7461 -28.6621 -28.5722 -28.7051
11 1 -28.5605 -28.5742 -28.4902 -28.2500 -28.3613
16 1 -28.1113 -28.2285 -28.2441 -28.1621 -28.2715
21 1 -28.2070 -28.1133 -28.0410 -27.9941 -28.0879
26 1 -28.1465 -28.2676 -28.3437 -28.3730 -28.4004
31 1 -28.4394 -28.5703 -28.7109 -28.8008 -28.6797
36 1 -28.5703 -28.5156 -28.4394 -28.3730 -28.4785
41 1 -28.5664 -28.5410 -28.6367 -28.7871 -28.5468
The FORTRAN code is stored in the file "ex1.f". A copy of the whole program appears in Appendix A. The "ex1" program uses 5 routines from the standard file library. Here is an overview of what is done by the program.
* Declare variables
* Associate a file name with a FORTRAN unit number (FNOM)
* Open the file (FSTOUV)
* Initialize proper standard file attributes
* Read the data contained in the ASCII file
* Write records into the standard file (FSTECR)
* Close the file (FSTFRM)
* Dissociate the file name from the FORTRAN unit number (FCLOS).
Let's look more closely at the different parts of the program.
Declare variables
In the first part we use the attribute names that were suggested in the "Detailed structure" section.
---------------------------------------------------
      character*2 nomvar
      character*1 typvar, grtyp
      character*8 etiket
      integer key, dateo, deet, npas, ni, nj, nk, npak, datyp
      integer ip1, ip2, ip3
      integer ig1, ig2, ig3, ig4
      logical rewrit
---------------------------------------------------
We then declare the functions used by the program.
---------------------------------------------------
      external fstecr
      external fnom, fstouv, fclos, fstfrm
      integer fstecr
      integer fnom, fstouv, fclos, fstfrm
---------------------------------------------------
We continue with the other variables. "ier" contains the return status of the functions used in the examples. "iun" contains the logical FORTRAN unit. "month" contains the month index that will be used to initialize the date of origin. "fld" will contain the values of surface temperature for each month, and "work" contains a working storage area for the "fstecr" function. Although there is a precise formula to compute and reduce the size of the "work" array, we will keep things simple and initialize it with the same dimensions as "fld".
---------------------------------------------------
      integer ier
      integer i,j,ii,jj,iun
      integer month
      real fld(120, 60), work(120, 60)
---------------------------------------------------
Associate a file name with a FORTRAN unit number (FNOM)
---------------------------------------------------
      iun = 1
      ier = fnom(iun, 'ts.fst', 'STD+RND', 0)
      if (ier.lt.0) then
         print *, 'Fatal error while opening the file'
      endif
---------------------------------------------------
Open the file in random access mode (FSTOUV)
---------------------------------------------------
      iun = 1
      ier = fstouv(iun, 'STD+RND')
---------------------------------------------------
Initialize proper standard file attributes
---------------------------------------------------
      typvar = 'C'
      nomvar = 'TS'
      etiket = 'SFC TEMP '
      ip1 = 0
      ip2 = 0
      ip3 = 0
      ni = 120
      nj = 60
      nk = 1
      deet = 0
      npas = 0
      grtyp = 'A'
      ig1 = 0
      ig2 = 0
      ig3 = 0
      ig4 = 0
      datyp = 1
      npak = -16
---------------------------------------------------
Read the data contained in the ASCII file.
We assume here that we are reading data from the console (through standard UNIX redirection) and that we know exactly the format of the input data.
---------------------------------------------------
      read(5, *) month
      do 200 j=1,120*60/5
         read(5,*) ii,jj,fld(ii,jj),fld(ii+1,jj),fld(ii+2,jj),
     *             fld(ii+3,jj),fld(ii+4,jj)
 200  continue
---------------------------------------------------
Write records in the standard file (FSTECR).
We set the date so that each climatological average has the 1st of each month as date of origin. We then call integer function FSTECR with the attributes that were set at the beginning of the program. The last argument of the routine is a flag indicating that we don't want to replace any existing record that would have the same search attributes (ip1, ip2, ip3, typvar, nomvar, etiket). Note that the date of origin (DATEO) is not included in the search attributes.
---------------------------------------------------
      dateo = month * 10000000 + 0199000
      ier = fstecr(fld, WORK, npak, iun, dateo, deet, npas, ni, nj,
     *             nk, ip1, ip2, ip3, typvar, nomvar, etiket, grtyp,
     *             ig1, ig2, ig3, ig4, datyp, .false.)
---------------------------------------------------
Close the file (FSTFRM)
---------------------------------------------------
      ier = fstfrm(1)
---------------------------------------------------
Dissociate the file name from the FORTRAN unit number (FCLOS)
---------------------------------------------------
      ier = fclos(1)
---------------------------------------------------
To compile the program, type
% f77 ex1.f -o ex1 $ARMNLIB/lib/rmnxlib.a
To execute the program, type
% ex1 < ts.asc
The program should then execute, producing the following output:
UNIT = 1 RANDOM EST CREE
UNIT = 1 EST OUVERT RANDOM
ECRIT( 1) 0-TS C 0 0 0 120 60 1 SFC TEMP 4010199000 0 0 A 0 0 0 0 R16 1531 3604
ECRIT( 1) 1-TS C 0 0 0 120 60 1 SFC TEMP 7020199000 0 0 A 0 0 0 0 R16 5161 3604
ECRIT( 1) 2-TS C 0 0 0 120 60 1 SFC TEMP 1030199000 0 0 A 0 0 0 0 R16 8791 3604
ECRIT( 1) 3-TS C 0 0 0 120 60 1 SFC TEMP 4040199000 0 0 A 0 0 0 0 R16 12421 3604
ECRIT( 1) 4-TS C 0 0 0 120 60 1 SFC TEMP 6050199000 0 0 A 0 0 0 0 R16 16051 3604
ECRIT( 1) 5-TS C 0 0 0 120 60 1 SFC TEMP 2060199000 0 0 A 0 0 0 0 R16 19681 3604
ECRIT( 1) 6-TS C 0 0 0 120 60 1 SFC TEMP 4070199000 0 0 A 0 0 0 0 R16 23311 3604
ECRIT( 1) 7-TS C 0 0 0 120 60 1 SFC TEMP 7080199000 0 0 A 0 0 0 0 R16 26941 3604
ECRIT( 1) 8-TS C 0 0 0 120 60 1 SFC TEMP 3090199000 0 0 A 0 0 0 0 R16 30571 3604
ECRIT( 1) 9-TS C 0 0 0 120 60 1 SFC TEMP 5100199000 0 0 A 0 0 0 0 R16 34201 3604
ECRIT( 1) 10-TS C 0 0 0 120 60 1 SFC TEMP 1110199000 0 0 A 0 0 0 0 R16 37831 3604
ECRIT( 1) 11-TS C 0 0 0 120 60 1 SFC TEMP 3120199000 0 0 A 0 0 0 0 R16 41461 3604
UNITE FORTRAN IUN= 1 EST FERME
Let's look at the first two lines of the output.
UNIT = 1 RANDOM EST CREE
UNIT = 1 EST OUVERT RANDOM
These lines show that the file associated with logical unit 1 is created (UNIT = 1 RANDOM EST CREE), and then opened in random access mode (UNIT = 1 EST OUVERT RANDOM).
We then have a message for each record that has been created.
ECRIT( 1) 0-TS C 0 0 0 120 60 1 SFC TEMP 4010199000 0 0 A 0 0 0 0 R16 1531 3604
The contents of the line show that a record has been written on unit 1, followed by a list of attributes. The order of the attributes is key(0), var. name(TS), field type(C), ip1(0), ip2(0), ip3(0), ni(120), nj(60), nk(1), label(SFC TEMP), date of origin(4010199000), deet(0), npas(0), grtyp(A), ig1(0), ig2(0), ig3(0), ig4(0), data type and packing ratio(R16) and finally two other attributes reserved for internal use: swa(1531) and lng(3604).
The name of the standard file produced by the program is ts.fst. We can inspect the contents of this file by invoking the RPN utility voir.
% voir -iment ts.fst
1
********************************************************************************************
*                                                                                          *
*                                         VOIR 3.2                                         *
*                                                                                          *
*                                                                                          *
*                                 Thu Aug 20 11:11:43 1999                                 *
*                                                                                          *
*                                     BEGIN EXECUTION                                      *
*                                                                                          *
********************************************************************************************
UNIT = 10 EST OUVERT RANDOM
1 FILE=ts.fst TYPE=RANDOM Thu Aug 20 1999 11:11:44 PAGE 1
KEY# ID IP1 IP2 IP3 NI NJ NK ETIQ. DATE ORIG. DEET NPAS GR IG1 IG2 IG3 IG4 DTY SWA LNG
0-TS C 0 0 0 120 60 1 SFC TEMP 4010199000 0 0 A 0 0 0 0 R16 1531 3604
1-TS C 0 0 0 120 60 1 SFC TEMP 7020199000 0 0 A 0 0 0 0 R16 5161 3604
2-TS C 0 0 0 120 60 1 SFC TEMP 1030199000 0 0 A 0 0 0 0 R16 8791 3604
3-TS C 0 0 0 120 60 1 SFC TEMP 4040199000 0 0 A 0 0 0 0 R16 12421 3604
4-TS C 0 0 0 120 60 1 SFC TEMP 6050199000 0 0 A 0 0 0 0 R16 16051 3604
5-TS C 0 0 0 120 60 1 SFC TEMP 2060199000 0 0 A 0 0 0 0 R16 19681 3604
6-TS C 0 0 0 120 60 1 SFC TEMP 4070199000 0 0 A 0 0 0 0 R16 23311 3604
7-TS C 0 0 0 120 60 1 SFC TEMP 7080199000 0 0 A 0 0 0 0 R16 26941 3604
8-TS C 0 0 0 120 60 1 SFC TEMP 3090199000 0 0 A 0 0 0 0 R16 30571 3604
9-TS C 0 0 0 120 60 1 SFC TEMP 5100199000 0 0 A 0 0 0 0 R16 34201 3604
10-TS C 0 0 0 120 60 1 SFC TEMP 1110199000 0 0 A 0 0 0 0 R16 37831 3604
11-TS C 0 0 0 120 60 1 SFC TEMP 3120199000 0 0 A 0 0 0 0 R16 41461 3604
STATISTIQUES
DIMENSION DU DIRECTEUR DISQUE     100
NOMBRE D ENTREES UTILISEES         12
LONGUEUR DU FICHIER             45090 MOTS
NOMBRE D ECRITURES                 12
NOMBRE DE RE-ECRITURES              0
NOMBRE D EFFACAGES                  0
NOMBRE D EXTENSIONS                 0
NOMBRE DE CORRECTIONS               0
*****************************************
UNITE FORTRAN IUN= 10 EST FERME
********************************************************************************************
*                                                                                          *
*                                        VOIR O.K.                                         *
*                                                                                          *
*                                 Thu Aug 20 11:11:44 1999                                 *
*                                                                                          *
*                                      END EXECUTION                                       *
*                                                                                          *
*                                     CP SECS = 0.100                                      *
*                                                                                          *
********************************************************************************************
Most of the RPN standard file functions (9 out of 13) are meant for querying and reading records. As mentioned in the "Classification by category" section, queries can be targeted to records using the following attributes:
Data element         Suggested name
Variable name        NOMVAR
Type of field        TYPVAR
Label                ETIKET
Vertical level       IP1
Forecast hour        IP2
User defined index   IP3
Date of validity     DATEV = DATEO + DEET * NPAS
Queries can be made by using precise values for the attributes, or by wildcarding some of them. Wildcarding an attribute means ignoring it when making a search. A standard file query is usually of the type "Give me the key of the first record where NOMVAR='GZ' and TYPVAR='P' and ETIKET='FE OPRUN' and IP1=1000 and IP2=3 and IP3=0 and DATEV=039217120". A query where IP1, IP2 and IP3 are wildcarded (and thus ignored during the search) would look like "Give me the key of the first record where NOMVAR='GZ' and TYPVAR='P' and ETIKET='FE OPRUN'".
Wildcarding the integer attributes IP1, IP2, IP3 and DATEV is done by assigning them a value of -1; wildcarding the character attributes NOMVAR, TYPVAR and ETIKET is done by using a single blank, ' '.
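The wildcard rule described above amounts to a simple per-attribute test. The sketch below only illustrates that rule; it is not the library's implementation, and the function names "imatch" and "cmatch" are hypothetical.

```fortran
      program wildcd
      implicit none
      logical imatch, cmatch
      external imatch, cmatch
!     IP1 wildcarded (-1): any stored level is accepted
      print *, imatch(-1, 500)
!     IP1 = 1000 requested but the record holds 500: rejected
      print *, imatch(1000, 500)
!     NOMVAR wildcarded with a single blank, then an exact match
      print *, cmatch(' ', 'GZ'), cmatch('GZ', 'GZ')
      end program wildcd

!     Hypothetical helper: integer criterion, -1 is the wildcard
      logical function imatch(iquery, ivalue)
      implicit none
      integer iquery, ivalue
      imatch = iquery .eq. -1 .or. iquery .eq. ivalue
      end function imatch

!     Hypothetical helper: character criterion, blank is the wildcard
      logical function cmatch(cquery, cvalue)
      implicit none
      character*(*) cquery, cvalue
      cmatch = cquery .eq. ' ' .or. cquery .eq. cvalue
      end function cmatch
```

A record is selected only when every search attribute passes its test, so wildcarding more attributes widens the set of records returned.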
The program "ex2.f", printed in appendix A, computes simple statistics from the "ts.fst" file created in example 1. The program reads records meeting certain search criteria (functions FSTLIR and FSTLIS), finds their average, minimum and maximum values (subroutine STATFLD), and then prints the value of all the standard file attributes. Here is an overview of what is done by the program.
* Declare variables
* Associate a file name with a FORTRAN unit number (FNOM)
* Open the file (FSTOUV)
* Initialize proper standard file attributes needed by the FSTLIR function
* Read the first record matching search criteria (FSTLIR)
* Find minimum, maximum and average values (STATFLD - included in the main program)
* Get and print the values of all standard file attributes (FSTPRM)
* Repeat until no more records are found (FSTLIS)
* Close the standard file (FSTFRM)
* Dissociate the file name from the FORTRAN unit number (FCLOS).
The first part is a call to the "FSTNBR" function, which returns the number of records existing in a random standard file.
---------------------------------------------------
****
*     Get the number of records in the standard file
*     This function can only be used for random standard files
****
      nrecs = fstnbr(iun)
      print *, 'There are ', nrecs, ' records in that file'
---------------------------------------------------
The second part is a call to the "FSTVOI" function, which produces a listing (on standard output) of the attributes of all existing records in the standard file. This listing is identical to the one produced by the invocation of the "VOIR" utility.
---------------------------------------------------
****
*     Print the contents of the standard file directory
****
      ier = fstvoi(iun, 'STD+RND')
      if (ier.lt.0) then
         print *, '(FSTVOI) Cannot print the directory'
      endif
---------------------------------------------------
The next part initializes the standard file attributes to initiate a query for the first occurrence of the variable 'TS', where TYPVAR='C', ETIKET='SFC TEMP ', and where IP1, IP2 and IP3 are all zero. Note that DATEV (the date of validity) is the only wild card attribute in this call.
---------------------------------------------------
****
*     Initialize standard file variables for doing a query
****
      typvar = 'C'
      nomvar = 'TS'
      etiket = 'SFC TEMP '
      datev = -1
      ip1 = 0
      ip2 = 0
      ip3 = 0
---------------------------------------------------
The FSTLIR function locates the first record meeting the search criteria, and then loads the field values into the FLD array.[2]
---------------------------------------------------
****
*     Reads the first field matching selection criteria
****
      key = fstlir(FLD, iun, NI, NJ, NK, datev, etiket,
     *             ip1, ip2, ip3, typvar, nomvar)
---------------------------------------------------
The FSTLIR function returns a negative value if it cannot locate a record matching the search criteria.
---------------------------------------------------
 50   if (key.lt.0) then
         print *, '(FSTLIR) Invalid key number:', key
---------------------------------------------------
The next part of the code contains a loop that reads all the other records in the standard file meeting the search criteria (using "FSTLIS"), until the end of file is reached (i.e. until a key less than zero is returned). For each record found, we compute its minimum, maximum and average values (subroutine STATFLD), and get the values of its attributes using the "FSTPRM" function.
Here is a schematic part of the loop
---> if (key.lt.0) then
|       print error message
|    else
|       compute min, max, avg
|       get and print the value of all attributes
|    endif
|    read next field matching search criteria
|-----------
and the real code
---------------------------------------------------
 50   if (key.lt.0) then
         print *, '(FSTLIR) Invalid key number:', key
      else
****
*     Computes minimum, maximum and average value of the field
****
         call statfld(minval, maxval, avgval, fld, ni, nj)
****
*     Get all standard file parameters and print them
****
         ier = fstprm(key, dateo, deet, npas, ni, nj, nk,
     *                nbits, datyp, ip1, ip2, ip3,
     *                typvar, nomvar, etiket, grtyp,
     *                ig1, ig2, ig3, ig4, swa, lng, dltf, ubc,
     *                extra1, extra2, extra3)
         print *, '*****************************************'
         print *, ' minval = ', minval, 'maxval =', maxval,
     *            'avgval = ', avgval
         print 10, nomvar, typvar, etiket, dateo, deet, npas,
     *             ni, nj, nk,nbits, datyp, ip1, ip2, ip3,
     *             grtyp, ig1, ig2, ig3, ig4,
     *             swa, lng, dltf, ubc, extra1, extra2, extra3
****
*     Try to read the next field matching selection criteria set
*     by the first call to FSTLIR.
****
         key = fstlis(fld, iun, NI, NJ, NK)
         goto 50
      endif
---------------------------------------------------
Here is the output produced by the program:
*****************************************
LU( 1) 0-TS C 0 0 0 120 60 1 SFC TEMP 6010199000 0 0 A 0 0 0 0 R16 1531 3604
*****************************************
minval = -48.25580 maxval = 31.51569 avgval = 3.220830
nomvar= TS typvar= C etiket= SFC TEMP dateo= 10199000 deet= 0 npas= 0
ni= 120 nj= 60 nk= 1 nbits= 16 datyp= 1 ip1= 0 ip2= 0 ip3= 0
grtyp= A ig1= 0 ig2= 0 ig3= 0 ig4= 0
swa= 1531 lng= 3604 dltf= 0 ubc= 0 extra1= 0 extra2= 0 extra3= 0
LU( 1) 1-TS C 0 0 0 120 60 1 SFC TEMP 2020199000 0 0 A 0 0 0 0 R16 5161 3604
*****************************************
minval = -49.15850 maxval = 30.90791 avgval = 2.926342
nomvar= TS typvar= C etiket= SFC TEMP dateo= 20199000 deet= 0 npas= 0
ni= 120 nj= 60 nk= 1 nbits= 16 datyp= 1 ip1= 0 ip2= 0 ip3= 0
grtyp= A ig1= 0 ig2= 0 ig3= 0 ig4= 0
swa= 5161 lng= 3604 dltf= 0 ubc= 0 extra1= 0 extra2= 0 extra3= 0
LU( 1) 2-TS C 0 0 0 120 60 1 SFC TEMP 2030199000 0 0 A 0 0 0 0 R16 8791 3604
*****************************************
minval = -63.80290 maxval = 30.91976 avgval = 2.481087
nomvar= TS typvar= C etiket= SFC TEMP dateo= 30199000 deet= 0 npas= 0
ni= 120 nj= 60 nk= 1 nbits= 16 datyp= 1 ip1= 0 ip2= 0 ip3= 0
grtyp= A ig1= 0 ig2= 0 ig3= 0 ig4= 0
swa= 8791 lng= 3604 dltf= 0 ubc= 0 extra1= 0 extra2= 0 extra3= 0
(...OUTPUT TOO LONG TO BE PRINTED HERE)
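The minval, maxval and avgval figures in this output come from the STATFLD subroutine, whose source is listed with "ex2.f" in Appendix A. For readers who want the idea without turning to the appendix, here is a minimal sketch of such a routine. It is not the author's code: the name "stats", the test field and the argument order (copied from the STATFLD call) are assumptions.

```fortran
      program tstat
      implicit none
      real fld(3, 2), rmin, rmax, ravg
      data fld /1.0, 2.0, 3.0, 4.0, 5.0, 9.0/
      call stats(rmin, rmax, ravg, fld, 3, 2)
      print *, rmin, rmax, ravg
      end program tstat

!     Hypothetical stand-in for STATFLD.
!     Minimum, maximum and average of a 2-D field
      subroutine stats(rmin, rmax, ravg, fld, ni, nj)
      implicit none
      integer ni, nj, i, j
      real rmin, rmax, ravg, fld(ni, nj), total
      rmin = fld(1, 1)
      rmax = fld(1, 1)
      total = 0.0
      do j = 1, nj
         do i = 1, ni
            rmin = min(rmin, fld(i, j))
            rmax = max(rmax, fld(i, j))
            total = total + fld(i, j)
         enddo
      enddo
      ravg = total/real(ni*nj)
      end subroutine stats
```

The inner loop runs over the first index so that the field is traversed in the order FORTRAN stores it in memory.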
The remaining part of this document will show alternative methods of locating the records the program "ex2.f" searches for.
It may sometimes be useful to retrieve record information before reading data values. For example it may be valuable to verify the size of a record (in terms of NI, NJ, NK) before loading it into memory. This can be done by using the "FSTINF" function. The next occurrence of a record meeting the search criteria defined by "FSTINF" can be found by using the "FSTSUI" function.
Once a record has been located by "FSTINF", it can be loaded in memory by a call to "FSTLUK", which uses the key returned by "FSTINF". In fact, the code
      key1 = fstlir(FLD, iun, NI, NJ, NK, datev, etiket,
     *              ip1, ip2, ip3, typvar, nomvar)
      key2 = fstlis(FLD, iun, NI, NJ, NK)
produces the same result as
      key1 = fstinf(iun, NI, NJ, NK, datev, etiket,
     *              ip1, ip2, ip3, typvar, nomvar)
      ier = fstluk(FLD, key1, NI, NJ, NK)
      key2 = fstsui(iun, NI, NJ, NK)
      ier = fstluk(FLD, key2, NI, NJ, NK)
Sample code using "FSTINF", "FSTLUK" and "FSTSUI" can be found in the program "ex3.f".
The last function to be presented in this document is "FSTINL". The "FSTINL" calling sequence is similar to "FSTINF", except that it returns a list of record keys satisfying search criteria. So, instead of executing the following loop NKEYS times,
      key = fstinf(iun, NI, NJ, NK, datev, etiket,
     *             ip1, ip2, ip3, typvar, nomvar)
--> if (key.ge.0) then
|      do something
|      key = fstsui(iun, NI, NJ, NK)
|   endif
------------
it is possible to use
      integer maxkeys
      parameter (maxkeys = 100)
      integer keys(maxkeys), nkeys
      (...)
      ier = fstinl(iun, NI, NJ, NK, datev, etiket, ip1, ip2, ip3,
     *             typvar, nomvar, KEYS, NKEYS, maxkeys)
      do 100 i=1, nkeys
         ier = fstluk(FLD, keys(i), NI, NJ, NK)
         do something
 100  continue
Sample code using "FSTINL" can be found in the file "ex4.f".
FORTRAN Source code :