Unsupervised Feature Selection Using the Atomic Orbital Search Algorithm for Information Retrieval
Sattam Abdallah Alyusuf, Faculty of Information Science & Technology, University Kebangsaan Malaysia, Bangi, Selangor, Malaysia. alyusefsattam@gmail.com, ORCID: 0009-0009-0477-2140
Mohd Zakree Ahmad Nazri, Faculty of Information Science & Technology, University Kebangsaan Malaysia, Bangi, Selangor, Malaysia. zakree@ukm.edu.my, ORCID: 0000-0003-2267-4965
Keywords: Unsupervised Feature Selection, Atomic Orbital Search, Mechanistic Optimization, Text Mining, Information Retrieval, Dimensionality Reduction.
Abstract
Classical information retrieval methods face increasing difficulty in handling large-scale, high-dimensional datasets as digital content grows rapidly. As feature dimensionality increases, traditional retrieval techniques suffer from high computational complexity, greater sensitivity to noise, and reduced retrieval efficiency. This study introduces Atomic Orbital Search for Unsupervised Feature Selection (AOSUFS), an unsupervised feature selection (UFS) method built on the Atomic Orbital Search algorithm, which is inspired by principles of quantum mechanics. The method targets high-dimensional data for information retrieval when no labeled data are available. It explores a multi-layer search space and uses a mean absolute difference criterion to search for optimal feature subsets. AOSUFS is evaluated on the Reuters dataset, which comprises 12,152 bag-of-words features, and is compared with several optimisation algorithms: Genetic Algorithm, Harmony Search, Particle Swarm Optimisation, Simulated Annealing, and Krill Herd. In the experiments, AOSUFS reduces dimensionality by 51.4%, retaining 5,904 features. It achieves the highest mean average precision (MAP), 0.251, which is 9 percent higher than the baseline without feature selection. Mean average recall, however, falls to 0.1384, a 73 percent drop. Krill Herd ranks second with a MAP of 0.2499, and the unfiltered Harmony Search variant scores lowest. This work presents the first application of Atomic Orbital Search to unsupervised information retrieval, demonstrating improved retrieval precision, reduced computational requirements, and efficient dimensionality reduction for large, sparse datasets.
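To make the mean absolute difference criterion concrete, the following minimal sketch scores each bag-of-words feature by the average absolute deviation of its values from the feature mean, then scores a candidate subset by the average of its selected features' scores. This is an illustration under stated assumptions, not the authors' implementation: the function names, the binary-mask encoding of a subset, the subset-level averaging, and the toy matrix are all hypothetical, and the Atomic Orbital Search procedure itself is not shown.

```python
from statistics import fmean


def mean_absolute_difference(column):
    """Average absolute deviation of one feature's values from their mean."""
    values = list(column)
    mu = fmean(values)
    return fmean([abs(v - mu) for v in values])


def mad_scores(doc_term_rows):
    """Score every feature (column) of a document-term matrix given as rows."""
    columns = zip(*doc_term_rows)  # transpose: rows of documents -> columns of features
    return [mean_absolute_difference(col) for col in columns]


def subset_fitness(scores, mask):
    """Hypothetical subset objective: mean score of the features kept by a 0/1 mask."""
    selected = [s for s, keep in zip(scores, mask) if keep]
    return fmean(selected) if selected else 0.0


# Toy document-term matrix (4 documents, 3 terms); values are illustrative only.
rows = [
    [1, 0, 3],
    [1, 2, 0],
    [1, 0, 0],
    [1, 2, 3],
]

scores = mad_scores(rows)
# The first term is constant across documents, so its score is zero.
print(scores)
# Dropping the constant term raises the subset's average score.
print(subset_fitness(scores, [0, 1, 1]), subset_fitness(scores, [1, 1, 1]))
```

In this sketch a constant term receives a score of zero and is therefore a natural candidate for removal, which mirrors the intuition that terms with little variation across documents contribute little to distinguishing them. The reported reduction can also be checked directly: 1 - 5,904 / 12,152 is approximately 0.514.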