 
 

Porting Codes

4 External Libraries

Many codes will require the use of optimised libraries to provide efficient computational kernels for numerically intensive parts of the program. Some software is already designed to use a specific library, whereas other algorithms may offer a choice of library routines to improve the speed of execution.

General Numerical Algorithms

There are a variety of libraries available which address different numerical problems. These include the NAG library, which implements a diverse range of algorithms. The NAG library may not be highly refined for a particular architecture, but it provides ready access to a host of routines which would otherwise need hand-coding and the application of generic optimisation techniques. The NAG libraries are available on both Origin and Altix architectures, and further information is available on the Tools and Utilities web page.

Optimised Numerical Linear Algebra

The accepted way to carry out high-speed basic numerical linear algebra operations on modern computers is to use the BLAS library, preferably a BLAS library tuned to the machine architecture by the computer vendor. Built on top of this set of routines is the LAPACK library, which can be used for more complicated operations such as performing matrix factorisations, solving systems of simultaneous equations and finding eigenvalues. For distributed computing, there are parallel versions of these libraries called PBLAS and ScaLAPACK, which are built on a communication library called BLACS. Codes that make heavy use of numerical linear algebra will either already use the BLAS libraries on the architecture they are being ported from, or may benefit from translating hand-written algorithms in the current code into a form which can make use of these routines.
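As an illustration of the kind of hand-written kernel that is a candidate for this translation, the sketch below spells out the operation C = alpha*A*B + beta*C as an explicit loop nest. This is the operation performed by the Level 3 BLAS routine DGEMM. Python is used here only for brevity, and the function name `gemm_reference` is hypothetical; in a Fortran or C code, a loop nest of this shape would typically be replaced by a single call to the tuned library routine.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a x b) + beta * c for matrices stored as lists of rows.

    Hypothetical reference kernel: it mirrors the operation DGEMM performs,
    but it is not a BLAS call and makes no attempt at optimisation.
    """
    m, k, n = len(a), len(b), len(b[0])
    # Check that the shapes agree before entering the loop nest.
    if any(len(row) != k for row in a):
        raise ValueError("columns of a must match rows of b")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape (rows of a) x (columns of b)")

    # Start from the scaled copy of c, then accumulate the product.
    result = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            result[i][j] += alpha * acc
    return result
```

Recognising that a loop nest matches a standard operation such as this one is usually the first step; the library documentation then gives the argument order and storage layout that the call expects.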
The BLAS and LAPACK routines are primarily written for Fortran programs but they are accessible through alternative interfaces from C/C++ codes.
On both the Origin and Altix machines these algorithms are provided by SGI through the SCSL routines which may be
loaded through the

Optimised Fourier Transform Routines

For optimised FFT routines the
standard approach on the CSAR machines is to use SGI's SCSL library which
provides one-, two- and three-dimensional FFT routines, as well as extra
functionality that allows multiple one-dimensional FFTs to be performed at once,
providing extra optimisation. The routine names and arguments are the same on
both Origin and Altix architectures, but it should be noted that due to
differences in the default data sizes, codes
which are being ported from a Cray T3E would need to upgrade the data types from
The algorithms provided by the SCSL FFTs will often be similar to routines
available on other architectures such as IBM's ESSL, and the only way to port between the
different routines is to read the

Optimised Discrete Transform Routines on the Altix

Optimised DFT routines are provided for Fortran programmers on the Altix through Intel's MKL for the Itanium 2 processor. There is a clean Fortran 95 interface to these routines which allows full procedure argument checking and removes the need for the spurious work arrays that the SCSL FFT routines require. These routines may be useful for developers writing new codes, or introducing optimised routines in place of hand-coded algorithms, but they are unlikely to be a first-choice option when porting.

FFTW - A Portable Fourier Transform Library

The FFTW library is an optimised portable Fourier Transform library which is
available on a variety of different platforms. Implementation of FFT routines using the FFTW library allows
a code to be taken between different machines in the knowledge that the library
will either be readily available on the machine or can be downloaded from the FFTW website and compiled for virtually any
architecture. The FFTW library is provided on both Origin and Altix platforms
and is available through the

Portable Binary I/O Libraries

Since different architectures can output binary data in different formats, portability of data can be a troublesome issue, particularly if the data must be transferred between different machines at the pre-processing, computation and post-processing stages. One common way of producing portable data is to output it in ASCII format, but this is much slower than outputting binary data because formatting is computationally expensive. Because of the difficulty of moving binary data between architectures, there are standards such as XDR which enable programs to convert data from their native format into a portable format and vice versa, so that computers can talk to each other or read and write files in a common, machine-independent way. Many applications and libraries use XDR for data transfer, including some MPI libraries where messages are passed in a heterogeneous environment.

Two libraries provided on the CSAR machines enable codes to write machine-independent binary data: netCDF and HDF. If a program is likely to produce large quantities of output, it would be advisable to code the application so that it can use one of these libraries to generate the binary data. Further information is available on the netCDF website and the HDF website.
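The idea behind a machine-independent binary format can be sketched in a few lines. The example below packs an array of double-precision values with an explicit big-endian byte order and a leading element count, in the style used by XDR, so that the bytes mean the same thing on any machine. It is a minimal sketch using only Python's standard `struct` module; the function names are hypothetical, and it does not use the netCDF or HDF APIs and is not a complete XDR implementation.

```python
import struct


def encode_doubles_xdr_style(values):
    """Pack a 4-byte big-endian count followed by big-endian IEEE 754 doubles."""
    count = len(values)
    return struct.pack(">I", count) + struct.pack(f">{count}d", *values)


def decode_doubles_xdr_style(payload):
    """Inverse of encode_doubles_xdr_style: read the count, then the doubles."""
    (count,) = struct.unpack_from(">I", payload, 0)
    return list(struct.unpack_from(f">{count}d", payload, 4))


values = [0.1, 2.5e-7, 12345.678901234567]

# Portable binary form: byte order is fixed by the format string, not the host.
binary = encode_doubles_xdr_style(values)

# ASCII form of the same data, for comparison: every value must be formatted
# on output and parsed again on input.
ascii_text = "\n".join(repr(v) for v in values).encode("ascii")

restored = decode_doubles_xdr_style(binary)
```

The binary form round-trips each value exactly and needs no formatting or parsing step, which is the cost that makes ASCII output slow. Libraries such as netCDF and HDF add self-describing metadata on top of this basic idea.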

Page maintained by csaradvice@cfs.ac.uk
This page last updated: Tuesday, 17 Aug 2004 15:32:55 BST