UK National HPC Service


GOTO BLAS Library v0.95


The GOTO library is an optimised implementation of BLAS routines, developed by Kazushige Goto at the University of Texas at Austin, providing an alternative to MKL or SCSL.

From the author's web page:

"During the last decade, a number of projects have pursued the high-performance implementation of matrix multiplication. Typically, these projects organize the computation around an "inner-kernel", C = trans(A) B + C, that keeps one of the operands in the L1 cache, while streaming parts of the other operands through that cache. Variants include approaches that extend this principle to multiple levels of cache or that apply the same principle to the L2 cache while essentially ignoring the L1 cache. The purpose of the game is to optimally amortize the cost of moving data between memory layers.

Our approach is fundamentally different. It starts by observing that for current generation architectures, much of the overhead comes from Translation Look-aside Buffer (TLB) table misses. While the importance of caches is also taken into consideration, it is the minimization of such TLB misses that drives the approach. The result is a novel approach that achieves highly competitive performance on a broad spectrum of current high-performance architectures."
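
To make the operation named in the quote concrete, the following is a minimal, unblocked C sketch of the update C = trans(A) B + C for column-major matrices. It is not the GOTO implementation and does no cache or TLB blocking; the function name atb_plus_c and its argument list are assumptions chosen for illustration only.

```c
#include <stddef.h>

/*
 * Reference (unoptimised) form of the inner-kernel operation
 *     C = trans(A) * B + C
 * A is k x m, B is k x n, C is m x n, all stored column-major with
 * leading dimensions lda, ldb, ldc. The name and signature are
 * hypothetical; this is not a routine from the GOTO library.
 */
void atb_plus_c(size_t m, size_t n, size_t k,
                const double *a, size_t lda,
                const double *b, size_t ldb,
                double *c, size_t ldc)
{
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            /* Dot product of column i of A with column j of B:
               both are contiguous in column-major storage. */
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p) {
                sum += a[p + i * lda] * b[p + j * ldb];
            }
            c[i + j * ldc] += sum;
        }
    }
}
```

Because A is used transposed, every entry of C is a dot product of two contiguous columns. An optimised library would additionally split the operands into blocks sized for the cache and TLB, which this sketch does not attempt.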

Restrictions on Use

This product is only available on Newton.

Set Up Procedure

You will need to load the goto_blas module before running code linked with the GOTO library.

Running the Code

In order to link with the GOTO library you need to provide the following option to your link command:


The BLAS error-handling routine xerbla must link to I/O routines, which are compiler-dependent. You must therefore also add either xerbla.f or xerbla.c to your build.
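
For orientation, the following is a minimal sketch of what a C version of xerbla can look like. It is not the xerbla.c shipped with the library. It assumes the common Fortran calling convention in which the symbol carries a trailing underscore and the length of the character argument is passed as a hidden trailing integer; both details depend on your compiler, so check its documentation. The helper format_xerbla_message is a hypothetical name introduced here so that the message can be built separately from the I/O call.

```c
#include <stdio.h>

/*
 * Build the conventional BLAS error message in buf.
 * srname is a Fortran string: blank-padded and NOT NUL-terminated,
 * so its length is passed explicitly. Hypothetical helper.
 */
int format_xerbla_message(char *buf, size_t bufsize,
                          const char *srname, int name_len, int info)
{
    /* Drop the trailing blanks that pad a Fortran character variable. */
    while (name_len > 0 && srname[name_len - 1] == ' ') {
        --name_len;
    }
    return snprintf(buf, bufsize,
                    " ** On entry to %.*s parameter number %2d had an illegal value",
                    name_len, srname, info);
}

/*
 * Fortran-callable entry point (assumed convention: trailing underscore,
 * arguments by reference, hidden string length passed last as an int).
 * The reference Fortran routine also stops the program; this sketch
 * only reports the error.
 */
void xerbla_(const char *srname, const int *info, int name_len)
{
    char msg[128];
    format_xerbla_message(msg, sizeof msg, srname, name_len, *info);
    fprintf(stderr, "%s\n", msg);
}
```

Keeping the I/O in a small, separately compiled file like this is what lets the same library be used with different compilers: only this file has to match the I/O runtime of the compiler used for your own code.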

Further Information

Further information is available from the Goto web pages at

This page last updated: Friday, 02-Dec-2005 10:20:19 GMT