Dominique Lavenier

Dominique Lavenier

HOME RESEARCH PUBLICATIONS TEACHING SOFTWARE

SAMBA
Systolic Accelerator for Molecularbiological Applications

History

The design of SAMBA was decided in september 1993. The main decision was to choose the target technology: programmable or custom architecture. It was decided to design a systolic array with fully dedicated processors to get high performance and minimize the design of a programming environment - a task which usually requires a lot of efforts.

The design effort was estimated to 11 person/months: 1 month for functional specification and simulation, one month for architectural specification and simulation, 2 months for architectural synthesis, 2 month for chip testing, 0.5 month for board design, 4 month for interface design, and 0.5 month for final integration

Algorithm

The algorithm implemented on SAMBA belongs to the dynamic programming class. The recurrence equation comes from the Smith and Waterman algorithm. However, it has been parametrized to cover a larger scope. A similarity matrix is calculated recursively using the following equation:

H(i,j)=max(delta,E(i,j),F(i,j),H)i-1,j-1)+sub(S1i,S2j))

with:

E(i,j) = Max (H(i,j-1)-alpha,E(i,j-1)-beta)
F(i,j) = Max (H(i-1,j)-alpha,F(i-1,j)-beta)

and the initializations given by:

H(i,0) = E(i,0) = hi(i)
H(0,j) = F(0,j) = vi(j)

alpha, beta, delta, hi and vi are parameters for tuning the algorithm to local or global search, with or without gap penalty

Parallelization

The sequence comparison on a linear systolic array proceeds as follows: one sequence (the query sequence) is stored into the array (one character per processor) and the other sequence flows from the left to right through the array. During each systolic step, one elementary matrix computation is performed on each processor. The result is available on the rightmost processor when the last character of the flowing sequence is output.

Compared to a sequential machine the speed-up for computing one query sequence of size N against a database of M residues is given by:

(N x M) / N + M - 1 = N (considering that M >> N)

N = size of the query sequence (number of processors)
M = size of the database

Architecture

SAMBA comprised a workstation (with its local disk), a systolic array made out of 128 VLSI full custom processors and a FPGA-Memory interface which fills the gap between a complete hardwired array of processors and a programmable Von Neuman machine.

The array was composed of 32 full-custom identical chips, each housing 4 processors, and leading to a 128 processor array. The chip was designed at IRISA and was providing a computational power of 40 millions matrix cells per second. Hence, the complete array was able to reach a peak performance of 1.28 billions matrix cells per second.

Software

SAMBA is a hardware accelerator with a library of high level comparison C-procedures. This makes possible the choice of pre-defined programs, such as the classical programs already developed for bank-scanning, or user-defined programs tuned to a specific application.

SAMBA is controlled by a few procedure calls inside a normal C-program, without forcing the user to have specific knowledge of the structure of the accelerator and how it works. A library of basic procedures has been developed to be rapidly understood by programmers; but these procedures stay close enough to the accelerator hardware to provide efficient speed-up.

Bibliography

D. Lavenier, Speeding up genome computations with a systolic accelerator, SIAM news, 31 (8) 1998. (pdf)
P. Guerdoux-Jamet, D. Lavenier, SAMBA: Hardware Accelerator for Biological Sequence Comparison, CABIOS, 13 (6) 1997.
D. Lavenier, Dedicated Hardware for Biological Sequence Comparison, Journal of Universal Computer Science, 2 (2) 1996. (pdf)
P. Guerdoux-Jamet, D. Lavenier, C. Wagner, P. Quinton, Design and Implementation of a Parallel Architecture for Biological Sequence Comparison, EuroPar'96, European Conference on Parallelism, Lyon, France, 1996. (pdf)

Last modified: Thu Oct 01 21:26:49 2009

SAMBASystolic Accelerator for Molecularbiological Applications