Porting and Parallelising Codes
Choice of Machines
The CSAR machines have very similar architectures: all are shared-memory
machines built around SGI's NUMAlink interconnect.
However, because the MIPS processors in the Origins differ from the
Itanium 2 processors in the Altix, the choice of machine can depend
strongly on the type of code that you have and on how much work you are
prepared to put in (with our help) to modify it so that it runs
efficiently. You may wish to ask the CSAR helpdesk about the suitability
of the different machines for your type of code. Once you have decided
which machine to run your code on, there are two aspects of code
development to consider, which are described below.
Types of Parallelism
The CSAR machines are all ccNUMA shared-memory machines with an underlying distributed-memory
architecture. ccNUMA stands for cache-coherent non-uniform memory access:
the processors should all have the same view of memory at any given
moment, but memory access times are not uniform across the machine, so
for any given processor some parts of memory are "closer" than others.
On a shared-memory ccNUMA machine such as the Origin or the Altix,
parallel programs can be written in one of two ways:
- Distributed Memory Model: In
the distributed memory model, programs are written as independent
processes that communicate with each other through calls, inserted into
the code, to message-passing libraries such as MPI
or SHMEM. Codes that use the MPI library are generally
highly portable between different parallel machines.
- Shared Memory Model: In the
shared memory model, programs are parallelised by inserting compiler
directives into the source files; these tell the compiler to parallelise
loops or sections of code. The most common and most portable set of these
directives is defined in the OpenMP
standard.
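As a rough illustration of the shared memory model described above, the following C sketch parallelises a single loop with an OpenMP directive. The function name sum_of_squares is hypothetical and chosen only for this example. The reduction clause gives each thread a private partial sum and combines them at the end of the loop. A compiler that has not been told to process OpenMP directives ignores the pragma and runs the loop serially, producing the same result.

```c
#include <stddef.h>

/* Hypothetical example: sum of the squares 1*1 + 2*2 + ... + n*n.
   The directive asks the compiler to share the loop iterations
   between threads; reduction(+:total) gives every thread its own
   partial sum and adds the partial sums together afterwards.
   If OpenMP is not enabled the pragma is ignored and the loop
   simply runs on one processor. */
double sum_of_squares(long n)
{
    double total = 0.0;
    long i;

    #pragma omp parallel for reduction(+:total)
    for (i = 1; i <= n; i++) {
        total += (double)i * (double)i;
    }

    return total;
}
```

Calling sum_of_squares(10) adds the squares of 1 to 10. The option that enables OpenMP differs between compilers, so check the documentation for the compiler on the machine you are using.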
Porting Codes
Apart from the programming paradigm, there are other
issues that you should be aware of when porting a code between machines.
The major ones are detailed below.
- Default Data Sizes:
Different machines can have different default sizes for intrinsic
numerical types such as a Fortran
REAL or a C float.
For example, on a Cray T3E a C float is 8 bytes long, whereas
on an SGI Origin it is 4 bytes long. If you do not take this into
account, your program may still run without any changes but
give incorrect results.
- Compilers: Compiler
behaviour and compiler options can change between different systems,
and in some cases a compiler flag can completely change its meaning.
For example, to the MIPSpro compiler on an Origin machine
-mp
indicates that there are OpenMP parallelisation directives in the
code which should be compiled, whereas to the Intel compiler on the
Altix the -mp option is an instruction to maintain precision
by reducing the optimisation level.
- Non-standard Fortran Extensions:
Often Fortran code is written to use functions or subroutines which
are not part of the Fortran standard, but which are implemented by
the compiler vendor. These routines might be timing routines, utilities
to extend the input/output capabilities or extra numerical functions.
For routines that are not part of the Fortran standard it may be necessary
to use different function or subroutine calls on different machines.
- External Libraries:
External libraries of routines which are used by the code may have
different properties on different machines or may not even exist when
moving from one architecture to another. Routine names may change,
input arguments to functions may be different or completely new routines
in different libraries may need to be selected.
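One way to guard against the default data size problem listed above is to record the sizes a code was developed with and compare them with the sizes on the new machine before trusting any results. The C sketch below is a minimal example of that idea. The structure type_sizes and the functions query_type_sizes and sizes_match are hypothetical names introduced for this illustration; they are not part of any CSAR or vendor library.

```c
#include <stddef.h>

/* Hypothetical record of the sizes, in bytes, of some default
   numerical types. */
struct type_sizes {
    size_t float_bytes;
    size_t double_bytes;
    size_t int_bytes;
    size_t long_bytes;
};

/* Report the sizes of the default types on the machine and
   compiler this file is built with. */
struct type_sizes query_type_sizes(void)
{
    struct type_sizes s;

    s.float_bytes  = sizeof(float);
    s.double_bytes = sizeof(double);
    s.int_bytes    = sizeof(int);
    s.long_bytes   = sizeof(long);
    return s;
}

/* Return 1 if every size in 'actual' equals the corresponding
   size in 'expected', and 0 otherwise. */
int sizes_match(struct type_sizes expected, struct type_sizes actual)
{
    return expected.float_bytes  == actual.float_bytes
        && expected.double_bytes == actual.double_bytes
        && expected.int_bytes    == actual.int_bytes
        && expected.long_bytes   == actual.long_bytes;
}
```

A ported code could fill in a type_sizes record with the values from the original machine and stop with an error message at start-up if sizes_match returns 0 when compared against query_type_sizes().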
CSAR Support
If you are thinking of porting a code to one of the CSAR machines,
you may wish to look into the application
support and optimisation services and the training
courses, which you are entitled to apply for when submitting an application
for CSAR resources to the research councils.