SPSA-FSR: Simultaneous Perturbation Stochastic Approximation for Feature Selection and Ranking

The recent emergence of datasets with massive numbers of features has made pattern recognition an increasingly challenging task. In particular, such high numbers of features give rise to various issues such as (1) overfitting, poor generalization, and inferior prediction performance, (2) slow and computationally expensive predictors, and (3) difficulty in comprehending the underlying process. Feature selection (FS) can be defined as selecting a subset of the available features in a dataset that are associated with the response variable, excluding irrelevant and redundant features. An effective feature selection process mitigates the problems associated with large datasets in the sense that it results in (1) better classification performance, (2) reduced storage and computational cost, and (3) more general and more interpretable models. An alternative to FS for dimensionality reduction is feature extraction (FE), wherein the original features are first combined and then projected into a new feature space of lower dimensionality. A major downside of FE is that the transformed features lose their physical meaning, which complicates further analysis of the model and makes it difficult to interpret. Thus, FS is superior to FE in terms of readability and interpretability.
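
The distinction between FS and FE drawn above can be made concrete with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the toy data, the hard-coded `selected` indices (standing in for the output of some FS method), and the use of PCA as the FE technique. It does not implement SPSA-FSR itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 100 samples, 5 features.
X = rng.normal(size=(100, 5))

# Feature selection: keep a subset of the ORIGINAL columns.
# The indices are hard-coded here; in practice an FS method would choose them.
selected = [0, 3]
X_fs = X[:, selected]

# Feature extraction (PCA via SVD, used here only as a familiar example):
# every new column is a linear combination of ALL original features.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_fe = X_centered @ Vt[:2].T

print(X_fs.shape, X_fe.shape)
```

Both results have two columns, but the columns of `X_fs` are still the original features 0 and 3, whereas each column of `X_fe` mixes all five features and no longer corresponds to any single measured quantity.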

For details, please refer to the article below on feature selection and ranking with SPSA, published in the journal Expert Systems with Applications.

K-best feature selection and ranking via stochastic approximation