GPU-Accelerated Data Mining with Swarm Intelligence

Undergraduate Honors Thesis

Swarm intelligence describes the ability of groups of social animals and insects to exhibit highly organized and complex problem-solving behaviors that allow the group as a whole to accomplish tasks beyond the capabilities of any one of its constituent individuals. This natural phenomenon is the inspiration for swarm intelligence systems, a class of algorithms that utilizes the emergent patterns of swarms to solve computational problems. There have been a number of publications regarding the application of swarm intelligence to various data mining problems, but few consider multi-threaded, let alone GPU-based, implementations. In this project, we adopt the General-Purpose GPU parallel computing model and show how it can be leveraged to increase the accuracy and efficiency of two types of swarm intelligence algorithms for data mining.

To illustrate the efficacy of GPU computing for swarm intelligence, we present two swarm intelligence data mining algorithms implemented with CUDA for execution on a GPU device. These algorithms are: (1) AntMinerGPU, an ant colony optimization algorithm for rule-based classification, and (2) ClusterFlockGPU, a bird-flocking algorithm for data clustering.

Our results indicate that the AntMinerGPU algorithm is markedly faster than the sequential algorithm on which it is based, and is able to produce classification rules that are competitive with those generated by traditional methods. Additionally, we show that ClusterFlockGPU is competitive with other swarm intelligence and traditional clustering methods, and is not affected by the dimensionality of the data being clustered, making it theoretically well-suited for high-dimensional problems.
