UK National HPC Service


Porting Codes 4

External Libraries

Many codes rely on optimised libraries to provide efficient computational kernels for the numerically intensive parts of the program. Some software is already designed around a specific library, whereas in other cases the developer can choose a library of routines to improve the speed of execution.

General Numerical Algorithms

A variety of libraries address different numerical problems. Among them is the NAG library, which implements a diverse range of algorithms. The NAG library may not be highly tuned for a particular architecture, but it provides ready access to a host of routines which would otherwise need hand-coding and the application of generic optimisation techniques. The NAG libraries are available on both the Origin and Altix architectures, and further information is available on the Tools and Utilities web page.

Optimised Numerical Linear Algebra

The accepted way to carry out high speed basic numerical linear algebra operations on modern computers is to use the BLAS library, preferably a specific BLAS library tuned to the machine architecture by the computer vendor. Built on top of this set of routines is the LAPACK library which can be used for more complicated operations such as performing matrix factorisations, solving systems of simultaneous equations and finding eigenvalues. For distributed computing, there are parallel versions of these libraries called PBLAS and ScaLAPACK which are built on a communication library called BLACS.

Codes that make heavy use of numerical linear algebra will either already use the BLAS libraries on the architecture they are being ported from, or may benefit from recasting hand-written algorithms in a form that can call these routines. The BLAS and LAPACK routines are primarily written for Fortran programs, but they are accessible from C/C++ codes through alternative interfaces.
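As an illustration of what recasting a hand-written loop means, the sketch below spells out in plain Python the operation that the BLAS routine DGEMM performs in its no-transpose case, C := alpha*A*B + beta*C, on column-major (Fortran-ordered) arrays. The function name dgemm_reference is hypothetical and the code is a readable reference only, not an optimised kernel; a loop nest of this shape in an existing code is a candidate for replacement by a single library call.

```python
def dgemm_reference(m, n, k, alpha, a, lda, b, ldb, beta, c, ldc):
    """Compute C := alpha*A*B + beta*C in place (no-transpose case only).

    a, b and c are flat lists in column-major order, so element (i, j)
    of A is a[i + j*lda], following the Fortran storage convention.
    A is m x k, B is k x n and C is m x n.
    """
    for j in range(n):
        for i in range(m):
            acc = 0.0
            for p in range(k):
                acc += a[i + p * lda] * b[p + j * ldb]
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc]
    return c


# 2x2 example: A = [[1, 2], [3, 4]] and B = identity, stored column by column.
a = [1.0, 3.0, 2.0, 4.0]
b = [1.0, 0.0, 0.0, 1.0]
c = [0.0, 0.0, 0.0, 0.0]
dgemm_reference(2, 2, 2, 1.0, a, 2, b, 2, 0.0, c, 2)
print(c)  # C equals A, still in column-major order
```

The leading-dimension arguments (lda, ldb, ldc) are kept in the sketch because they are the part of the calling convention that most often needs care when an existing array layout is mapped onto the library routine.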

On both the Origin and Altix machines these algorithms are provided by SGI through the SCSL routines, which may be loaded through the module command. On the Altix, the algorithms may also be accessed through Intel's MKL library, which is likewise available through the module command.

Optimised Fourier Transform Routines

For optimised FFT routines the standard approach on the CSAR machines is to use SGI's SCSL library. It provides one-, two- and three-dimensional FFT routines, together with routines that perform multiple one-dimensional FFTs in a single call, which allows further optimisation. The routine names and arguments are the same on both Origin and Altix architectures. Note, however, that the default data sizes differ from those on the Cray T3E, where the default Real is 64 bits wide. Codes being ported from a Cray T3E therefore need to change their data types from Real to Double Precision, and then change the names of the corresponding FFT routines, in order to maintain the same accuracy as on the Cray.
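The effect of leaving data in the narrower default type can be seen with a short standard-library Python sketch. It rounds a 64-bit value to the nearest 32-bit IEEE value and reports the relative error introduced; the helper name to_single is hypothetical, and the example only illustrates the size of the precision gap, not the behaviour of any particular FFT routine.

```python
import struct


def to_single(x):
    """Round a 64-bit Python float to the nearest 32-bit IEEE value."""
    return struct.unpack("f", struct.pack("f", x))[0]


value = 0.1                      # held in 64 bits
single = to_single(value)        # the same number after a trip through 32 bits
rel_error = abs(single - value) / value

print(f"64-bit value  : {value!r}")
print(f"32-bit value  : {single!r}")
print(f"relative error: {rel_error:.2e}")
```

An error of this size in every input element is typically far larger than the rounding error of the transform itself, which is why the data types are changed before the routine names.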

The algorithms provided by the SCSL FFTs will often be similar to routines available in libraries on other architectures, such as IBM's ESSL. The interfaces differ, however, so the only way to port between the different routines is to read the man pages for each library.
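When moving between FFT libraries, the details that most often differ are the sign of the exponent, the scale factor applied to the result and the layout of the data. One way to check that a ported call uses the intended conventions is to compare its output on a small input against a direct evaluation of the transform. The sketch below is such a reference in plain Python; the name dft_reference and its sign and scale parameters are assumptions made for illustration and do not correspond to any library's interface.

```python
import cmath


def dft_reference(x, sign=-1, scale=1.0):
    """Direct O(n^2) evaluation of
    X[k] = scale * sum_j x[j] * exp(sign * 2*pi*i * j*k / n).
    """
    n = len(x)
    return [
        scale * sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                    for j in range(n))
        for k in range(n)
    ]


signal = [1.0, 2.0, 3.0, 4.0]

# Forward transform with a negative exponent and no scaling.
forward = dft_reference(signal, sign=-1, scale=1.0)

# Inverse transform: opposite sign, scaled by 1/n to recover the input.
roundtrip = dft_reference(forward, sign=+1, scale=1.0 / len(signal))

print(forward[0])                      # sum of the input values
print([round(v.real, 12) for v in roundtrip])
```

If a library's result differs from the reference by a constant factor or by complex conjugation, the scale or sign convention is the likely cause, and the relevant man page should say which is in use.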

Optimised Discrete Transform Routines on the Altix

Optimised DFT routines are provided for Fortran programmers on the Altix through Intel's MKL for the Itanium 2 processor. There is a clean Fortran 95 interface to these routines which allows full procedure argument checking and removes the need for the separate work arrays that the SCSL FFT routines require. These routines may be useful for developers writing new codes, or introducing optimised routines in place of hand-coded algorithms, but they are unlikely to be a first-choice option when porting.

FFTW - A Portable Fourier Transform Library

The FFTW library is an optimised portable Fourier Transform library which is available on a variety of different platforms. Implementation of FFT routines using the FFTW library allows a code to be taken between different machines in the knowledge that the library will either be readily available on the machine or can be downloaded from the FFTW website and compiled for virtually any architecture. The FFTW library is provided on both Origin and Altix platforms and is available through the module command.

Portable Binary I/O Libraries

Since different architectures can output binary data in different formats, portability of data can be troublesome, particularly if the data must be transferred between different machines at the preprocessing, computation and postprocessing stages. One common way of producing portable data is to output it in ASCII format, but this is much slower than writing binary data because the formatting is computationally expensive.

Due to the difficulty of moving binary data between different architectures, there are standards such as XDR which enable programs to convert data from their native format into the portable format and vice-versa so that computers can talk to each other or read and write files in a common standard machine-independent way. Many applications and libraries use XDR for data transfer including some MPI libraries where the messages are to be passed in a heterogeneous environment.
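To make the idea concrete, the sketch below encodes a small counted array of double-precision values in the XDR representation using only Python's struct module. XDR stores integers as 4-byte big-endian values and doubles as 8-byte big-endian IEEE values, which the ">" format prefix reproduces regardless of the byte order of the host machine. The helper names are hypothetical, and the sketch covers only these two types, not the full standard.

```python
import struct


def xdr_pack_uint(value):
    """Encode an unsigned integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_pack_doubles(values):
    """Encode a counted array: a 4-byte length followed by 8-byte doubles."""
    return xdr_pack_uint(len(values)) + struct.pack(f">{len(values)}d", *values)


def xdr_unpack_doubles(data):
    """Decode a counted array written by xdr_pack_doubles."""
    (count,) = struct.unpack(">I", data[:4])
    return list(struct.unpack(f">{count}d", data[4:4 + 8 * count]))


record = xdr_pack_doubles([1.5, -2.0, 0.25])
print(len(record))                  # 4-byte count + 3 * 8-byte doubles
print(xdr_unpack_doubles(record))   # values recovered exactly
```

Because the byte order is fixed by the format rather than by the machine, a file written this way on one architecture can be decoded by the same routine on another.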

Two libraries are provided on the CSAR machines which enable codes to write machine-independent binary data: netCDF and HDF. If a program is likely to produce large quantities of output, it is advisable to code the application so that it can use one of these libraries to generate the binary data. Further information is available on the netCDF website and the HDF website.

This page last updated: Tuesday, 17-Aug-2004 15:32:55 BST