Thesis Type: Doctorate
Institution Of The Thesis: Orta Doğu Teknik Üniversitesi, Faculty of Engineering, Department of Computer Engineering, Turkey
Approval Date: 2017
Student: ÖZCAN DÜLGER
Supervisor: MEHMET HALİT S. OĞUZTÜZÜN
Abstract: This thesis addresses the implementation of the resampling stage of the particle filter on the graphics processing unit (GPU). Well-known sequential resampling methods include Multinomial, Stratified and Systematic resampling. Their loop structures contain dependencies that impede parallel implementation. Although such impediments have been overcome in their GPU implementations, these algorithms suffer from numerical instability due to the accumulation of rounding errors when single precision is used. Rounding errors arise in the cumulative summation over the particle weights when the weights differ widely or the number of particles is large. Other resampling algorithms, such as Metropolis and Rejection, do not suffer from numerical instability because they only compute pairwise ratios of weights rather than performing collective operations over the weights. They are therefore more suitable for the GPU implementation of the particle filter. However, they suffer from non-coalesced global memory access patterns, which cause their speed to deteriorate rapidly as the number of particles grows. In the first part of this thesis, we offer solutions to this problem for Metropolis resampling. We introduce two implementation techniques, designated Metropolis-C1 and Metropolis-C2, and compare them with the original Metropolis resampling on an NVIDIA Tesla K40 board. In the first scenario, where these two techniques achieve their fastest execution times, Metropolis-C1 is the fastest of the three but yields the worst quality, whereas Metropolis-C2 is closer to the original Metropolis resampling in quality. In the second scenario, where all three algorithms yield similar quality, Metropolis-C1 and Metropolis-C2 are slower than in the first scenario but still faster than the original Metropolis resampling.
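As an illustration of the pairwise-ratio idea described above, the following is a minimal CPU sketch of Metropolis resampling as it is commonly described in the literature, written with NumPy rather than CUDA. It is not the thesis implementation, and it does not reproduce Metropolis-C1 or Metropolis-C2, whose details the abstract does not give. The function name `metropolis_resample` and the parameter `num_iters` are hypothetical names chosen for this example. The gather `w[proposals]` reads weights at random positions, which is the kind of access that becomes non-coalesced on a GPU.

```python
import numpy as np

def metropolis_resample(weights, num_iters, rng):
    """Return one ancestor index per particle using Metropolis resampling.

    weights   : 1-D array of positive weights (normalization is not required)
    num_iters : Metropolis steps per particle (hypothetical parameter name)
    rng       : numpy.random.Generator used for all random draws
    """
    w = np.asarray(weights, dtype=np.float32)
    n = w.size
    ancestors = np.arange(n)  # chain i starts at particle i
    for _ in range(num_iters):
        # Each chain proposes a uniformly random particle index.
        proposals = rng.integers(0, n, size=n)
        u = rng.random(n, dtype=np.float32)
        # Accept when u <= w[proposal] / w[current]; the test is written
        # as a product so that no division is needed. Only two weights
        # are compared at a time, so no cumulative sum is ever formed.
        accept = u * w[ancestors] <= w[proposals]
        ancestors = np.where(accept, proposals, ancestors)
    return ancestors

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.random(1000)
    print(metropolis_resample(weights, num_iters=50, rng=rng)[:10])
```

The number of steps trades speed against quality: too few iterations leave the chains biased toward their starting particles.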
In the second part of the thesis, we introduce a new resampling method, designated Uphill resampling, which is free from numerical instability because it avoids the accumulation of rounding errors. We compare it with the Systematic, Metropolis and Rejection resampling methods with respect to quality and speed, and achieve results similar to those of Metropolis and Rejection resampling. Furthermore, we devise a coalesced version of Uphill resampling, designated Uphill-CA, which avoids non-coalesced global memory access patterns. Uphill-CA is faster than the original Uphill resampling while delivering similar quality. Thus, Uphill resampling provides users of particle filters with a spectrum of fast alternatives on the GPU that is comparable in quality to other methods.
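The accumulation of rounding errors that cumulative-sum-based methods such as Systematic resampling face in single precision can be illustrated with a small experiment. The sketch below is not the Uphill algorithm, which the abstract does not specify. It only compares the last entry of a running sum of equal weights, which should be 1, in single and double precision. The function name `cumsum_total_error` and the particle count are assumptions made for this example.

```python
import numpy as np

def cumsum_total_error(n, dtype):
    """Absolute error of the last cumulative-sum entry for n equal weights 1/n."""
    weights = np.full(n, 1.0 / n, dtype=dtype)
    # The running sum is kept in the requested precision throughout.
    cdf = np.cumsum(weights, dtype=dtype)
    return abs(float(cdf[-1]) - 1.0)

if __name__ == "__main__":
    n = 1_000_000  # illustrative particle count, not taken from the thesis
    err32 = cumsum_total_error(n, np.float32)
    err64 = cumsum_total_error(n, np.float64)
    print(f"float32 error: {err32:.3e}")
    print(f"float64 error: {err64:.3e}")
```

When the final cumulative value drifts away from 1, a resampler that searches the cumulative sum can select wrong indices or run past the end of the array, which is why methods that avoid the summation altogether are attractive in single precision.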