GOTO BLAS Library v0.95
The GOTO library is an optimised implementation of BLAS routines, developed by Kazushige Goto at the University of Texas at Austin,
providing an alternative to MKL or SCSL.
From the author's web page:
"During the last decade, a number of projects have pursued the
high-performance implementation of matrix multiplication. Typically,
these projects organize the computation around an "inner-kernel", C
= trans(A) B + C, that keeps one of the operands in the L1 cache, while
streaming parts of the other operands through that cache. Variants include
approaches that extend this principle to multiple levels of cache or
that apply the same principle to the L2 cache while essentially ignoring
the L1 cache. The purpose of the game is to optimally amortize the cost
of moving data between memory layers.
Our approach is fundamentally different. It starts by observing that
for current generation architectures, much of the overhead comes from
Translation Look-aside Buffer (TLB) table misses. While the importance
of caches is also taken into consideration, it is the minimization of
such TLB misses that drives the approach. The result is a novel approach
that achieves highly competitive performance on a broad spectrum of current
high-performance architectures."
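To make the "inner-kernel" operation in the quotation concrete, the following is a minimal, unoptimised sketch of C = trans(A) B + C in plain C. It is not the Goto implementation and does no cache or TLB blocking; the function name atb_kernel, the column-major storage and the leading-dimension arguments are assumptions chosen for illustration only.

```c
#include <stddef.h>

/*
 * Illustrative reference kernel (hypothetical name, not part of the library):
 * computes C = trans(A) * B + C for column-major matrices.
 *   A is k x m with leading dimension lda
 *   B is k x n with leading dimension ldb
 *   C is m x n with leading dimension ldc
 */
void atb_kernel(size_t m, size_t n, size_t k,
                const double *a, size_t lda,
                const double *b, size_t ldb,
                double *c, size_t ldc)
{
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            double sum = 0.0;
            /* Dot product of column i of A with column j of B. */
            for (size_t p = 0; p < k; ++p) {
                sum += a[p + i * lda] * b[p + j * ldb];
            }
            c[i + j * ldc] += sum;
        }
    }
}
```

Because A is accessed in transposed form, both operands are read down their columns, so the innermost loop walks contiguous memory. An optimised library applies the same arithmetic to blocks sized to fit the memory hierarchy.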
Restrictions on Use
This product is only available on Newton.
Set Up Procedure
You will need to load the goto_blas module in order to run code linked with the Goto library.
Running the Code
In order to link with the GOTO library you need to
provide the following option to your link command:
The BLAS routine xerbla must link to I/O routines, which are compiler dependent.
Thus, you must also add either xerbla.f or xerbla.c to your build.
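For orientation, the following is a minimal sketch of what a C version of xerbla might look like. It is not the xerbla.c distributed with the library. The trailing underscore in the symbol name and the hidden string-length argument are assumptions about the Fortran calling convention, which varies between compilers, and the helper format_xerbla_message is a hypothetical name introduced only to keep the message formatting separate.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds the error text in buf and returns the
 * snprintf result. srname is a Fortran string, so it is not
 * NUL-terminated and its length is passed separately. */
int format_xerbla_message(char *buf, size_t size,
                          const char *srname, int srname_len, int info)
{
    return snprintf(buf, size,
                    " ** On entry to %.*s parameter number %d had an illegal value",
                    srname_len, srname, info);
}

/* Assumed Fortran-callable entry point: trailing underscore and a
 * hidden length argument appended after the declared arguments. */
void xerbla_(const char *srname, const int *info, int srname_len)
{
    char buf[256];
    format_xerbla_message(buf, sizeof buf, srname, srname_len, *info);
    fprintf(stderr, "%s\n", buf);
    exit(EXIT_FAILURE);   /* stop the program after reporting the error */
}
```

A file like this would be compiled with the same C compiler used for the rest of the build, so that the I/O calls resolve against that compiler's runtime.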
Further information is available from the Goto web pages at