MSc Dissertation: David Neil MacLeod

Citation:

Adobe-PDF-downloadMacLeod, David Neil. ROACH accelerated BLAST. MSc Dissertation. Department of Electrical Engineering, University of Cape Town, 2013.

 

Abstract:

Reconfigurable computing, in recent years, has been taking great strides in becoming part of mainstream computing largely due to the rapid growth in the size of FPGAs and their ability to adapt to certain complex applications efficiently. This dissertation investigates the reuse of application specific hardware developed for radio astronomy in accelerating a popular bioinformatics algorithm.

To showcase the abilities of reconfigurable computing the BLASTN sequence similarity search algorithm for DNA was selected and implemented on a ROACH reconfigurable computer with a Xilinx SX95T FPGA. The implementation divides the work between an x86 computer and the FPGA with communication taking place over 10GbE. The  FPGA contains processing elements to perform the seed detection and extension portions of the algorithm while trying to keep the matches as close as possible to the original BLAST.

The results show that NCBI BLAST’s runtime is highly dependent on the word size and the length of the query, while ROACH BLAST’s runtime is largely unaffected by  varying these parameters. This creates conditions where ROACH BLAST outperforms NCBI BLAST and others where NCBI BLAST outperforms ROACH BLAST. With a query length of 361 letters and a word size of 4 ROACH BLAST is 7x faster, however with a query length of 8 letters and a word size of 31 NCBI BLAST is 6x faster when compared one to one, in a production solution multiple queries would be loaded into the FPGA simultaneously.

BLAST’s heuristic nature has a clear negative impact on the design since it is unable to take effective advantage of the massive reduction in search space the algorithm  provides. However provided the query is long enough or the sensitivity high enough ROACH BLAST will overtake NCBI BLAST demonstrating the huge processing power of the FPGA. The scalability of the FPGA design shows promise with estimated easy scalability to query lengths of 30000 letters across multiple boards, with only minor changes required to support longer queries.

 

 

 

Leave a Reply