This page summarizes the coding standards to be used in the MIDAS Fortran code. A shorter list of the "Top 10" things to keep in mind is provided here.
A long list of coding standard rules (based on the JULES land surface model coding standards, with modifications):
- Use the Fortran 90 free format syntax.
- Indent blocks by 2 characters.
- Use spaces and blank lines where appropriate to format your code to improve readability (use genuine spaces rather than tabs, as the tab character is not in the Fortran character set).
- Try to confine your line width to 80 characters. This means that your code can be viewed easily in any editor on any screen, and can be printed easily on standard paper.
- Line up your statements, where appropriate, to improve readability.
- Short and simple Fortran statements are easier to read and understand than long and complex ones. Where possible, avoid using continuation lines in a statement.
- Avoid putting multiple statements on the same line. It is not good for readability.
- Each program unit (module, subroutine, function etc.) should follow a structure similar to the examples supplied above. The intended behaviour of the unit should be clearly described in the header, and the prefix of the module should be defined.
- Each subroutine, function and module should be in a separate file. Modules may be used to group related variables, subroutines and functions. At this stage, all code except "main" programs should be inside modules.
- When naming your variables and program units, always keep in mind that Fortran is a case-insensitive language (e.g. `EditOrExit` is the same as `EditorExit`).
- Use only characters in the Fortran character set. In particular, accent characters and tabs are not allowed in code, although they are usually OK in comments. If your editor inserts tabs automatically, you should configure it to switch off the functionality when you are editing Fortran source files.
- Although Fortran has no reserved keywords, you should avoid naming your program units and variables with names that match an intrinsic `function` or `subroutine`. Similarly, you should avoid naming your program units and variables with names that match a keyword in a Fortran statement.
- To improve readability, write your code using lower case for all Fortran keywords and intrinsic functions/subroutines. The rest of the code should be written using "camelCase" with no underscores, except between a module prefix and a variable name (e.g. `variableNameList` or `gsv_getField`). The name of any public entity (variable, subroutine, or function) in a module must begin with the module prefix: e.g. `gsv_getField`, where "`gsv`" is the prefix associated with the module `gridStateVector_mod`.
- Function or subroutine arguments should be declared separately from, and before, local variables, separated by a blank line.
- As for variables, "camelCase" (https://en.wikipedia.org/wiki/Camel_case) should be used for all filenames. For modules, an underscore should appear only between the module name and the suffix "`mod`". Therefore, a module filename should end with "`_mod.f90`".
- Comments should start with a single `!` and be indented with the code. A blank line should be left before (but not after) the comment line.
- An important exception to the above rule on comments is for any comments that are intended to appear in the automatically generated on-line documentation. Refer to the documentation page for more details.
- Be generous with comments. State the reason for doing something, instead of repeating the Fortran logic in words. However, comments are not a replacement for clear code. It is better for the code itself to be easy to understand through its structure and the choice of variable and subroutine/function names. It is OK (and encouraged) to use longer names that really describe what the variable or subroutine/function represents or does (such names are also easier to grep/search for later than a variable named `i`, for example).
- Use `msg()` instead of naked `write(*,*)` to output information: provide the origin of the message (such as the caller subroutine, function or program).

      ! prints a short message on all MPI tiles when verbosity threshold >= 2
      call msg('int_tInterp_gsv', 'START', verb_opt=2)
      ...
      ! prints a short message with some numerical values on MPI tile 0 only
      call msg('int_tInterp_gsv', 'numStepIn='//str(numStepIn)&
               //',numStepOut='//str(numStepOut), mpiAll_opt=.false.)

  Optionally, a verbosity level that specifies how important the message is can be provided. The verbosity thresholds are defined as follows:
  - `msg_ALWAYS` : always printed, irrespective of threshold
  - 0 : critical, should always be printed
  - 1 : default priority, printed in operational context
  - 2 : detailed output, provides extra information
  - 3 : intended for developers, printed for debugging or specific diagnostics
  - `msg_NEVER` : never printed, irrespective of threshold
- Use the newer and clearer syntax for `logical` comparisons, i.e.:

      ==  instead of  .eq.
      /=  instead of  .ne.
      >   instead of  .gt.
      <   instead of  .lt.
      >=  instead of  .ge.
      <=  instead of  .le.
- Positive logic is usually easier to understand. When using an `if-else-end if` construct you should use positive logic in the `if` test, provided that the positive and the negative blocks are about the same length. It may be more appropriate to use negative logic if the negative block is significantly longer than the positive block.
- To improve readability, you should always use the optional space to separate the following Fortran keywords:

      else if        end do         end forall     end function
      end if         end interface  end module     end program
      end select     end subroutine end type       end where
      select case
- If you have a large or complex code block embedding other code blocks, you may consider naming some or all of them to improve readability.
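
  As a minimal sketch of the naming rule above (the program name, loop names and variables are all hypothetical, and plain `write(*,*)` is used only because the MIDAS `msg()` routine is not available in a standalone example), nested loops can be named so that `cycle` and `exit` state explicitly which loop they act on:

  ```fortran
  program namedLoopsSketch
    implicit none

    integer, parameter :: numRows = 3
    integer, parameter :: numCols = 4
    integer :: rowIndex, colIndex, numVisited

    numVisited = 0

    rowLoop: do rowIndex = 1, numRows
      colLoop: do colIndex = 1, numCols

        ! skip the diagonal element and continue with the next column
        if (colIndex == rowIndex) cycle colLoop

        ! leave both loops once the last row reaches column 2
        if (rowIndex == numRows .and. colIndex == 2) exit rowLoop

        numVisited = numVisited + 1

      end do colLoop
    end do rowLoop

    write(*,*) 'numVisited = ', numVisited

  end program namedLoopsSketch
  ```

  Without the names, a reader would have to count `end do` statements to work out which loop the `exit` leaves.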
- Improve readability by always using the full version of the `end` statement (i.e. `end subroutine <name>` or `end function <name>` instead of just `end`) at the end of each sub-program unit.
- Where possible, consider using `cycle`, `exit` or a `where`-construct to simplify complicated `do`-loops.
- When writing a `real` literal with an integer value, put a `0` after the decimal point (i.e. `1.0` as opposed to `1`) to improve readability. For double precision real literals (`real(8)`) always include "`d0`" as in `1.0d0` to ensure the correct precision.
- Where reasonable and sensible to do so, you should try to match the names of dummy and actual arguments to a `subroutine`/`function`.
- In an array assignment, it is recommended that you use array notations to improve readability, e.g.
  Avoid this:

      array1 = 1
      array2 = array1 * scalar

  Use this instead:

      array1(:,:) = 1
      array2(:,:) = array1(:,:) * scalar
- Use `implicit none` in all program units. This forces you to declare all your variables explicitly. This helps to reduce bugs in your program that will otherwise be difficult to track.
- Design any derived data types carefully and use them to group related variables. Appropriate use of derived data types will allow you to design modules and procedures with simpler and cleaner interfaces.
- Always use a `private` statement at the beginning of a module so that all module variables and procedures are by default declared private. Any public subroutines, functions or variables are then explicitly declared public using a `public` statement near the beginning of the module to make it clear to a user what is accessible from the outside world.
- Where possible, an `allocate` statement for an `allocatable` array (or a `pointer` used as a dynamic array) should be coupled with a `deallocate` within the same scope. If an `allocatable` array is a public `module` variable, it is highly desirable that its memory allocation and deallocation are only performed in procedures within the `module` in which it is declared. You may consider writing specific `subroutines` within the module to handle this memory management.
- To avoid memory fragmentation, it is desirable to `deallocate` in reverse order of `allocate`, as in:

      allocate(a(n))
      allocate(b(n))
      allocate(c(n))
      ! ... do something ...
      deallocate(c)
      deallocate(b)
      deallocate(a)
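
  The rules above on `private` modules and on keeping allocation and deallocation inside the owning module could be combined roughly as follows. This is only a sketch: the module `exampleBuffer_mod`, its prefix `exb` and all routine names are hypothetical, and the module and test program share one listing only to keep the example self-contained (in MIDAS each would be in its own file).

  ```fortran
  module exampleBuffer_mod
    ! hypothetical module, prefix "exb"
    implicit none
    private

    ! public entities, all named with the module prefix
    public :: exb_allocate, exb_deallocate, exb_isAllocated

    real(8), allocatable :: workValues(:)

  contains

    subroutine exb_allocate(numValues)
      implicit none

      ! Arguments:
      integer, intent(in) :: numValues

      if (.not. allocated(workValues)) then
        allocate(workValues(numValues))
        workValues(:) = 0.0d0
      end if

    end subroutine exb_allocate

    subroutine exb_deallocate()
      implicit none

      if (allocated(workValues)) deallocate(workValues)

    end subroutine exb_deallocate

    function exb_isAllocated() result(isAllocated)
      implicit none

      ! Result:
      logical :: isAllocated

      isAllocated = allocated(workValues)

    end function exb_isAllocated

  end module exampleBuffer_mod

  program exampleBufferSketch
    use exampleBuffer_mod
    implicit none

    call exb_allocate(10)
    write(*,*) 'allocated after exb_allocate:   ', exb_isAllocated()
    call exb_deallocate()
    write(*,*) 'allocated after exb_deallocate: ', exb_isAllocated()

  end program exampleBufferSketch
  ```

  Because `workValues` stays private, no code outside the module can allocate or deallocate it by accident.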
- Inside a function or subroutine, always define a local `pointer` before using it. Do not define a `pointer` in its declaration by pointing it to the intrinsic function `null()`, since this will invoke an implicit "`save`" attribute, which is very dangerous! Instead, make sure that your `pointer` is defined or nullified early on in the program unit. Similarly, `nullify` a `pointer` when it is no longer in use, either by using the `nullify` statement or by pointing your `pointer` to `null()`. This recommendation does not apply to a `pointer` global to a module or program. (The reason is that the initialization in the declaration statement takes effect only once. This is harmless for a module or program pointer, but for functions or subroutines, subsequent calls will not reset the pointer to `null()`, so it might still point to some value obtained in the previous call.)
- Avoid the `dimension` attribute or statement. Declare the `dimension` with the declared variables. E.g.:
  Avoid this:

      integer, dimension(10) :: array1
      integer :: array2
      dimension :: array2(20)

  Instead, do this:

      integer :: array1(10), array2(20)
- Never initialize a local variable on the declaration unless the `save` attribute is explicitly present.
  Avoid this:

      logical :: trueByDefault = .true.

  Instead, do this:

      logical :: trueByDefault
      trueByDefault = .true.

  If you actually want the local variable to keep its value after passing out of scope, be explicit:

      logical, save :: thisIsTheFirstCall = .true.
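
  The following sketch shows why initialization on the declaration is risky. The module `saveDemo_mod`, its prefix `svd` and both routines are hypothetical and exist only for illustration. The first routine initializes its local counter on the declaration (the form to avoid), while the second assigns the initial value in an executable statement.

  ```fortran
  module saveDemo_mod
    ! hypothetical module, prefix "svd"
    implicit none
    private

    public :: svd_countImplicitSave, svd_countReset

  contains

    subroutine svd_countImplicitSave(numCalls)
      implicit none

      ! Arguments:
      integer, intent(out) :: numCalls

      ! Locals (form to AVOID: the initialization implies "save"):
      integer :: counter = 0

      counter = counter + 1
      numCalls = counter

    end subroutine svd_countImplicitSave

    subroutine svd_countReset(numCalls)
      implicit none

      ! Arguments:
      integer, intent(out) :: numCalls

      ! Locals:
      integer :: counter

      counter = 0
      counter = counter + 1
      numCalls = counter

    end subroutine svd_countReset

  end module saveDemo_mod

  program saveDemoSketch
    use saveDemo_mod
    implicit none

    integer :: callIndex, numImplicit, numReset

    do callIndex = 1, 3
      call svd_countImplicitSave(numImplicit)
      call svd_countReset(numReset)
      write(*,*) 'implicit save: ', numImplicit, '  reset each call: ', numReset
    end do

  end program saveDemoSketch
  ```

  Over the three calls the first value grows (1, 2, 3) because the counter silently keeps its value between calls, while the second stays at 1. The same mechanism applies to a local `pointer` initialized to `null()` on its declaration.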
- Avoid `common` blocks and `block data` program units. Instead, use a `module` with `public` variables.
- Avoid the `equivalence` statement. Use a `pointer` or a derived data type, and the `transfer` intrinsic function to convert between types.
- Avoid the `pause` statement, as your program will hang in a batch environment. If you need to halt your program for interactive use, consider using a `read*` statement instead.
- Avoid the `entry` statement. Use a `module` or internal `subroutine`.
- Avoid the `goto` statement.
- Avoid numbered statement labels. `do ... label continue` constructs should be replaced by `do ... end do` constructs. Every `do` loop must be terminated with a corresponding `end do`.
- Never use a `format` statement: it requires the use of labels and obscures the meaning of the I/O statement. The formatting information can be placed explicitly within the `read`, `write` or `print` statement, or be assigned to a `character` variable in a `parameter` statement in the header of the routine for later use in I/O statements. Never place output text within the format specifier, i.e. only format information may be placed within the `fmt=` part of an I/O statement. All variables and literals, including any character literals, must be 'arguments' of the I/O routine itself. This improves readability by clearly separating what is to be read/written from how to read/write it.
- Avoid the `forall` statement/construct. Despite what it is supposed to do, `forall` is often difficult for compilers to optimise (see, for example, Implementing the Standards including Fortran 2003 by NAG). Stick to the equivalent `do` construct, `where` statement/construct or array assignments unless there are actual performance benefits from using `forall`.
- A `function` should be `pure`, i.e. it should have no side effects (e.g. altering an argument or module variable, or performing I/O). If you need to perform a task with side effects, you should use a `subroutine` instead.
- Declare the `intent` of all arguments to a subroutine or function. This allows checks against unintended access of variables to be done at compile time. The above point requiring functions to be pure means that all arguments of a `function` should be declared as `intent(in)`.
- Avoid `recursive` procedures if possible. `recursive` procedures are usually difficult to understand, and are always difficult to optimise in a supercomputer environment.
- Avoid using the specific names of intrinsic procedures. Use the generic names of intrinsic procedures where possible.
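
  Several of the rules above (a `pure` function with `intent(in)` arguments, generic intrinsic names, and a format string held in a `character` parameter rather than a `format` statement) can be illustrated together. This is a sketch under stated assumptions: the module `exampleNorm_mod`, its prefix `exn` and the function name are hypothetical, and plain `write` is used because the example is standalone.

  ```fortran
  module exampleNorm_mod
    ! hypothetical module, prefix "exn"
    implicit none
    private

    public :: exn_scaledNorm

  contains

    pure function exn_scaledNorm(values, scaleFactor) result(scaledNorm)
      implicit none

      ! Arguments:
      real(8), intent(in) :: values(:)
      real(8), intent(in) :: scaleFactor

      ! Result:
      real(8) :: scaledNorm

      ! generic names (abs, sqrt) rather than specific ones (dabs, dsqrt)
      scaledNorm = abs(scaleFactor) * sqrt(sum(values(:)**2))

    end function exn_scaledNorm

  end module exampleNorm_mod

  program exampleNormSketch
    use exampleNorm_mod
    implicit none

    ! only format information goes into the format string
    character(len=*), parameter :: normFormat = '(a,f10.4)'

    real(8) :: values(2)

    values(:) = (/ 3.0d0, 4.0d0 /)

    ! the label text is an I/O list item, not part of the format
    write(*,fmt=normFormat) 'scaled norm = ', exn_scaledNorm(values, -2.0d0)

  end program exampleNormSketch
  ```

  Declaring the function `pure` lets the compiler reject any later change that introduces I/O or modifies an argument.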
- It is not necessary to use the `only` clause in a `use <module>` statement, since each module should have few public symbols and all should be named beginning with the module prefix.
- The use of operator overloading is discouraged, as it can lead to confusion. The only acceptable use is to allow the standard operators (+, - etc.) to work with derived data types, where this makes sense.
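
  As an illustration of the one acceptable case described above, the sketch below extends `+` to a small derived type. The module `examplePoint_mod`, its prefix `pnt` and the type `pnt_point` are hypothetical names chosen for this example only.

  ```fortran
  module examplePoint_mod
    ! hypothetical module, prefix "pnt"
    implicit none
    private

    public :: pnt_point, operator(+)

    type pnt_point
      real(8) :: x
      real(8) :: y
    end type pnt_point

    interface operator(+)
      module procedure pnt_add
    end interface

  contains

    pure function pnt_add(pointA, pointB) result(pointSum)
      implicit none

      ! Arguments:
      type(pnt_point), intent(in) :: pointA
      type(pnt_point), intent(in) :: pointB

      ! Result:
      type(pnt_point) :: pointSum

      pointSum%x = pointA%x + pointB%x
      pointSum%y = pointA%y + pointB%y

    end function pnt_add

  end module examplePoint_mod

  program examplePointSketch
    use examplePoint_mod
    implicit none

    type(pnt_point) :: pointA, pointB, pointSum

    pointA = pnt_point(1.0d0, 2.0d0)
    pointB = pnt_point(3.0d0, 4.0d0)

    ! the standard operator keeps its usual meaning for the derived type
    pointSum = pointA + pointB

    write(*,*) 'sum = ', pointSum%x, pointSum%y

  end program examplePointSketch
  ```

  The overloaded `+` does component-wise addition, which is what a reader would expect, so it does not introduce the confusion the rule warns about.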
- Avoid using archaic Fortran 77 features and features deprecated in Fortran 90.