Adaptive k-Nearest Neighbor Learning for Robust Modal Regression on Multimodal and Heavy-Tailed Data

Sutarman Sutarman, Netti Herawati, Adli Abdillah Nababan

Abstract


Modal regression has attracted increasing attention as an alternative to mean-based regression, particularly in settings characterized by heteroscedasticity, multimodal conditional distributions, and heavy-tailed noise. In such scenarios, estimators based on central tendency may yield predictions that fall in low-density regions of the response space. This paper proposes an adaptive k-nearest neighbor framework for modal regression that integrates entropy-guided neighborhood selection with nonparametric mode estimation, including MeanShift clustering and one-dimensional kernel density estimation. The proposed approach adjusts neighborhood size based on local uncertainty, allowing the regression model to adapt to variations in data density without relying on a globally fixed parameter. Extensive experiments on simulated datasets and real-world benchmarks demonstrate that adaptive modal regression methods generally reduce or stabilize prediction errors relative to fixed-k modal regression and classical kNN mean and median estimators, particularly under heteroscedastic and multimodal conditions, although the magnitude of improvement varies across scenarios. Statistical tests confirm significant differences in most experimental settings, with practical gains ranging from incremental to substantial depending on data complexity. In addition to accuracy, computational behavior is explicitly examined. The findings show a trade-off between computational cost and predictive robustness: entropy-guided adaptive modal regression requires additional runtime due to neighborhood adaptation and density estimation, but this overhead increases proportionally with sample size and remains manageable for medium-sized datasets. Based on these results, adaptive modal regression provides a useful and flexible alternative for regression tasks involving complex and heterogeneous data distributions where robustness is prioritized over minimal computation time.
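The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the entropy criterion, the candidate neighborhood sizes, or the bandwidth rule, so the following are all assumptions made here for illustration: the histogram-entropy rule in `choose_k`, the candidate values `(10, 20, 40)`, the normal-reference bandwidth, and every function name. The sketch uses only the one-dimensional kernel density variant of mode estimation, not MeanShift.

```python
import math

def nearest_responses(X, y, x0, k):
    """Return the responses of the k training points closest to x0."""
    order = sorted(range(len(X)), key=lambda i: math.dist(X[i], x0))
    return [y[i] for i in order[:k]]

def histogram_entropy(values, bins=5):
    """Shannon entropy (in nats) of an equal-width histogram of the values."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum(c / n * math.log(c / n) for c in counts if c > 0)

def choose_k(X, y, x0, candidates=(10, 20, 40)):
    """Assumed rule: keep the candidate k whose neighbor responses
    have the lowest histogram entropy (ties go to the first candidate)."""
    return min(
        candidates,
        key=lambda k: histogram_entropy(nearest_responses(X, y, x0, k)),
    )

def kde_mode(values, grid_size=200):
    """Mode of a 1-D Gaussian KDE, found by a grid search over the value range."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    h = 1.06 * std * n ** (-0.2)  # normal-reference bandwidth (an assumption)
    if h <= 0:
        return values[0]  # all responses identical
    lo, hi = min(values), max(values)
    grid = [lo + (hi - lo) * j / (grid_size - 1) for j in range(grid_size)]

    def density(t):
        # Unnormalized density is enough for locating the maximum.
        return sum(math.exp(-0.5 * ((t - v) / h) ** 2) for v in values)

    return max(grid, key=density)

def predict_mode(X, y, x0, candidates=(10, 20, 40)):
    """Adaptive kNN modal prediction: pick k locally, then estimate the mode."""
    k = choose_k(X, y, x0, candidates)
    return kde_mode(nearest_responses(X, y, x0, k))

# Synthetic bimodal example: about 70% of responses near 0, 30% near 10.
X = [(i / 100,) for i in range(100)]
y = [(0.0 if i % 10 < 7 else 10.0) + ((i * 7) % 11 - 5) / 10 for i in range(100)]
x0 = (0.5,)

mode_pred = predict_mode(X, y, x0)
mean_pred = sum(nearest_responses(X, y, x0, 20)) / 20
print("modal prediction:", mode_pred)
print("kNN mean prediction:", mean_pred)
```

On this synthetic data the kNN mean falls between the two clusters, in a region where no responses occur, while the modal estimate stays inside the denser cluster. This is the behavior the abstract describes for mean-based estimators under multimodal conditional distributions.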

Keywords


Adaptive K-Nearest Neighbors; Modal Regression; Entropy-Based Neighborhood Selection; Robust Nonparametric Regression; Multimodal Conditional Distributions; Heavy-Tailed Noise; Instance-Based Learning; Meanshift Clustering; Kernel Density Estimation; Loca

Journal of Applied Data Sciences

ISSN : 2723-6471 (Online)
Collaborated with : Computer Science and Systems Information Technology, King Abdulaziz University, Kingdom of Saudi Arabia.
Publisher : Bright Publisher
Website : http://bright-journal.org/JADS
Email : taqwa@amikompurwokerto.ac.id (principal contact)
    support@bright-journal.org (technical issues)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0