This project investigated the acceleration of an aircraft tracking algorithm using a ClearSpeed mathematical co-processor. The algorithm is based on non-linear differential correction (also known as the Gauss-Newton method) and uses Doppler and bearing data from a Passive Coherent Location (PCL) radar system. A PCL radar uses a network of receivers to track targets through their back-scatter from existing Continuous Wave (CW) transmissions, such as broadcast TV or radio. Because a PCL system has no active transmitter, its procurement, operation and maintenance costs are relatively low. This is of particular advantage for airports in third world countries, many of which do not have radar-assisted air traffic control.
Differential correction combines data from a number of radar receivers using the minimum variance rule. New data is incorporated as it is received, so that the model is continually corrected in an optimal fashion. The model produced is a 10th-degree polynomial which describes the trajectory of the aircraft in a Cartesian co-ordinate system.
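To make the idea of differential correction concrete, the following is a minimal sketch of a weighted Gauss-Newton iteration, not the implementation described in this dissertation. It makes several simplifying assumptions: the target is stationary, the state is a 2-D position rather than a 10th-degree polynomial, only bearing observations are used (no Doppler), and the function names `bearing` and `gauss_newton_bearings` are hypothetical. Each observation is weighted by the inverse of its variance, which is the role the minimum variance rule plays when data from several receivers are combined.

```python
import math

def bearing(target, receiver):
    """Bearing (radians) from a receiver to a target, both given as (x, y)."""
    return math.atan2(target[1] - receiver[1], target[0] - receiver[0])

def gauss_newton_bearings(receivers, measured, sigmas, guess, iterations=20):
    """Estimate a static 2-D position from bearings (illustrative sketch only).

    receivers: list of (x, y) receiver positions
    measured:  measured bearing at each receiver, in radians
    sigmas:    standard deviation of each bearing measurement
    guess:     initial (x, y) estimate
    """
    x, y = guess
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (H^T W H) d = H^T W r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (rx, ry), z, s in zip(receivers, measured, sigmas):
            dx, dy = x - rx, y - ry
            r2 = dx * dx + dy * dy
            # Partial derivatives of atan2(dy, dx) with respect to x and y.
            hx, hy = -dy / r2, dx / r2
            # Residual between measured and predicted bearing, wrapped to (-pi, pi].
            res = z - math.atan2(dy, dx)
            res = math.atan2(math.sin(res), math.cos(res))
            w = 1.0 / (s * s)  # inverse-variance weight
            a11 += w * hx * hx
            a12 += w * hx * hy
            a22 += w * hy * hy
            b1 += w * hx * res
            b2 += w * hy * res
        # Solve the 2x2 system directly by Cramer's rule.
        det = a11 * a22 - a12 * a12
        step_x = (a22 * b1 - a12 * b2) / det
        step_y = (a11 * b2 - a12 * b1) / det
        x += step_x
        y += step_y
        if math.hypot(step_x, step_y) < 1e-12:
            break
    return x, y

# Usage: three receivers observe a target at (4, 3) with noise-free bearings.
receivers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
truth = (4.0, 3.0)
measured = [bearing(truth, r) for r in receivers]
estimate = gauss_newton_bearings(receivers, measured, [0.01] * 3, (5.0, 5.0))
```

In this toy case the normal equations are only 2x2; in the full problem the partial-derivative evaluation and the larger matrix products are the steps that dominate the cost.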
The objectives of the dissertation were to identify the computationally intensive areas of the algorithm, then accelerate performance by offloading computation to the co-processor. The co-processor used was the ClearSpeed Advance X620 accelerator board, which fits into the PCI-X slot of a conventional computing platform. It is claimed that the Advance board range uses the fastest and most power-efficient double-precision 64-bit floating point processors in the world. Application acceleration is achieved via two methods: parallelization and hardware-optimized library routines. An investigation into acceleration of the Gauss-Newton algorithm with ClearSpeed was deemed worthwhile because the algorithm makes use of linear algebra routines supported by ClearSpeed, as well as computationally intensive double-precision arithmetic.
Profiling of the pre-existing implementation of the algorithm (written by Dr. Richard Lord in IDL) revealed that the most computationally expensive areas are the arithmetic operations that calculate partial derivatives of the observation functions, and high-dimension double-precision matrix manipulations. The Gauss-Newton algorithm was successfully implemented in C, with accuracy results that compared favourably with the original IDL implementation. The C version (with no ClearSpeed acceleration) ran 1.38 times faster than the IDL implementation, mainly due to the efficiency of the hardware-tuned ATLAS BLAS library.
It was found that the sizes of the matrices involved in the multiplications are not a good fit for direct acceleration via the provided ClearSpeed library functions. Further investigation concluded that a moderate speedup could be attained by parallelizing one of the matrix multiplications, exploiting the sparse and data-redundant nature of its input matrices. The accuracy results verified the correct functionality of the ClearSpeed-accelerated algorithm; however, the estimates produced were an order of magnitude less accurate than those of the IDL version. This can be attributed to differences in accuracy between the card-side matrix multiplication and the IDL / BLAS library routines. Offloading to the co-processor achieved a 2.22 times increase in performance over the C implementation.
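As a rough worked figure, and assuming the two reported speedups were measured on comparable workloads and compose multiplicatively (the text does not state a combined figure), the ClearSpeed-accelerated version relative to the original IDL implementation would be approximately:

```latex
S_{\text{total}} \approx S_{\text{C/IDL}} \times S_{\text{ClearSpeed/C}} = 1.38 \times 2.22 \approx 3.06
```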
Given the amount of programming effort required to achieve this moderate speedup, it can be concluded that the Gauss-Newton algorithm is not a good fit for ClearSpeed-assisted acceleration.