Efficient Distribution and Processing of Data for Parallelizing Data Mining in Mobile Clouds
We study different kinds of data distributions to improve the efficiency of parallelized data mining in mobile cloud systems. Our formally based approach ensures the correctness of the resulting parallel implementation. We apply it to the parallel implementation of data mining algorithms in systems where a cloud is accessed via a mobile (wireless) network. The approach derives a parallel implementation of a data mining algorithm that performs as many computations as possible at the local servers of the mobile network, rather than transferring the data for processing to a high-performance cluster in the cloud, as is done in current cloud systems based on MapReduce. We implement our approach by extending the Java-based library DXelopes, and we illustrate our results with the training algorithm of the popular Normal Bayes classifier. Our experiments on real-world data sets confirm that our approach significantly reduces network traffic and application run time.
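The idea of computing at the local servers and sending only compact results to the cloud can be illustrated with a minimal Java sketch. It is not the DXelopes API and not necessarily the derivation used in this work: the class LocalBayesSketch, the types Stats and Sample, and the methods trainLocal and merge are hypothetical names introduced for illustration. The sketch assumes a single numeric attribute and a per-class normal model, so that each local server only needs to keep a count, a sum, and a sum of squares per class.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LocalBayesSketch {

    /** Mergeable per-class statistics for one numeric attribute (hypothetical type). */
    static final class Stats {
        long count;
        double sum;
        double sumSq;

        void add(double x) { count++; sum += x; sumSq += x * x; }

        /** Combining two partial results is plain addition, so merge order does not matter. */
        void merge(Stats o) { count += o.count; sum += o.sum; sumSq += o.sumSq; }

        double mean() { return sum / count; }

        /** Population variance, computed from the accumulated sums. */
        double variance() { double m = mean(); return sumSq / count - m * m; }
    }

    /** One labeled observation (hypothetical type). */
    static final class Sample {
        final String label;
        final double value;
        Sample(String label, double value) { this.label = label; this.value = value; }
    }

    /** Runs on a local server: scans only the local partition of the data. */
    static Map<String, Stats> trainLocal(List<Sample> partition) {
        Map<String, Stats> model = new HashMap<>();
        for (Sample s : partition) {
            model.computeIfAbsent(s.label, k -> new Stats()).add(s.value);
        }
        return model;
    }

    /** Runs in the cloud: receives the small partial models, never the raw data. */
    static Map<String, Stats> merge(List<Map<String, Stats>> partials) {
        Map<String, Stats> global = new HashMap<>();
        for (Map<String, Stats> p : partials) {
            p.forEach((label, st) ->
                global.computeIfAbsent(label, k -> new Stats()).merge(st));
        }
        return global;
    }

    public static void main(String[] args) {
        // Two partitions, as if stored at two different local servers.
        Map<String, Stats> a = trainLocal(List.of(new Sample("yes", 1.0), new Sample("no", 4.0)));
        Map<String, Stats> b = trainLocal(List.of(new Sample("yes", 3.0)));
        Stats yes = merge(List.of(a, b)).get("yes");
        System.out.println("yes: n=" + yes.count + " mean=" + yes.mean());
    }
}
```

In this sketch the traffic sent to the cloud grows with the number of classes rather than with the number of records, which is the kind of saving the approach aims for. Because merging is a sum, the merged model equals the one obtained by training on the union of the partitions.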