Volume 11 - Issue 1
Efficient Distribution and Processing of Data for Parallelizing Data Mining in Mobile Clouds
- Ivan Kholod
Saint Petersburg Electrotechnical University “LETI”, Saint Petersburg, Russia
iiholod@mail.ru
- Andrey Shorov
Saint Petersburg Electrotechnical University “LETI”, Saint Petersburg, Russia
ashxz@mail.ru
- Sergei Gorlatch
University of Muenster, Muenster, Germany
gorlatch@uni-muenster.de
Keywords: mobile cloud, wireless networks, parallel algorithms, distributed algorithms, distributed data mining, parallel data mining
Abstract
We study different kinds of data distribution to enable efficient parallel implementations
of data mining algorithms in mobile cloud systems. Our formally based approach ensures the correctness of the
obtained parallel implementation. We apply our approach to the parallel implementation of data mining
algorithms in systems where a cloud is accessed via a mobile (wireless) network. Our approach
derives a parallel implementation of a data mining algorithm that performs as many computations
as possible at the local servers of the mobile network, rather than transferring data for processing to a
high-performance cluster in the cloud, as is done in current cloud systems based on MapReduce.
We implement our approach by extending the Java-based library DXelopes, and we illustrate our results
with the training algorithm of the popular Normal Bayes classifier. Our experiments on
real-world data sets confirm that our approach significantly reduces network traffic and application
run time.
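
The idea of computing at local servers and moving only results can be sketched as follows. This is a minimal illustration and not the DXelopes API: it assumes categorical attributes, horizontally partitioned records whose last field is the class label, and a count-based Bayes model; the function names `local_counts` and `merge_counts` are hypothetical.

```python
from collections import Counter


def local_counts(records):
    """Count class labels and (label, attribute index, value) triples on one node.

    Assumption: each record is a tuple whose last element is the class label.
    """
    class_counts = Counter()
    value_counts = Counter()
    for *attributes, label in records:
        class_counts[label] += 1
        for index, value in enumerate(attributes):
            value_counts[(label, index, value)] += 1
    return class_counts, value_counts


def merge_counts(partials):
    """Sum the per-node counters; only these tables need to cross the network."""
    total_classes, total_values = Counter(), Counter()
    for class_counts, value_counts in partials:
        total_classes.update(class_counts)  # Counter.update adds counts
        total_values.update(value_counts)
    return total_classes, total_values


# Hypothetical data held on two local servers.
node_a = [("sunny", "hot", "no"), ("rainy", "mild", "yes")]
node_b = [("sunny", "mild", "yes")]
merged = merge_counts([local_counts(node_a), local_counts(node_b)])
```

Because counting is additive, the merged tables equal those computed over the pooled records, so the raw data can stay where it was collected.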