An Adaptive Method to Identify Outliers in Skewed Observations: Application to Assess NAACCR Cancer Registry Data Usage
Document Type
Article
Publication Date
3-22-2026
Publication Title
Stats
Abstract
Outlier detection is a fundamental component of data preprocessing and quality monitoring across diverse scientific domains, including engineering, biomedical sciences, and finance. While many variables in controlled environments approximate a normal distribution, real-world data, particularly biological, environmental, and epidemiological measures, are frequently characterized by pronounced right-skewness. To address the shortcomings of conventional methods, this study introduces the Dynamic Threshold for Outlier Detection (DTOD), which reframes outlier detection as a concrete operational workflow. The DTOD framework dynamically adjusts detection thresholds based on a functional relationship between skewness and tail morphology. Validation through large-scale simulation experiments across light-, middle-, and high-skewness levels confirms the method’s versatility. The DTOD proves particularly effective at two ends of the spectrum: enhancing sensitivity for detecting subtle anomalies in lightly skewed data while serving as a conservative, high-confidence screening tool that controls false positives in high-skewness environments. In a real-world application to North American Association of Central Cancer Registries (NAACCR) data, the method successfully identified outliers with abnormally high unknown tumor size rates in colorectal cancer and maintained a low misclassification rate in highly skewed lung cancer data. Ultimately, the DTOD provides a promising, interpretable solution for improving data quality in skewed scenarios.
Volume
9
Issue
2
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Yang, Xiaowen; Bam, Amjila; Rizvi, Nubaira; Wu, Xiao Cheng; Mercante, Donald; and Yu, Qingzhao, "An Adaptive Method to Identify Outliers in Skewed Observations: Application to Assess NAACCR Cancer Registry Data Usage" (2026). School of Public Health Faculty Publications. 579.
https://digitalscholar.lsuhsc.edu/soph_facpubs/579
DOI
10.3390/stats9020033