Consensus spectral clustering with weighted similarity functions for single-cell RNA sequencing data

Document Type

Article

Publication Date

11-16-2025

Publication Title

Communications in Statistics - Simulation and Computation

Abstract

In this paper, we explore unsupervised clustering algorithms on real-world single-cell RNA-sequencing (scRNA-seq) datasets. Single-cell RNA sequencing technologies have revolutionized the ability to profile gene expression at the resolution of individual cells, providing unprecedented insights into cellular heterogeneity and revealing previously undetectable rare cell subpopulations. However, the high dimensionality, sparsity, and noise inherent to scRNA-seq data pose significant challenges for traditional clustering methods in accurately delineating distinct cell types, states, and trajectories from transcriptomic profiles. Conventional algorithms often struggle to capture the complex structure and geometry of single-cell gene expression landscapes; they can be sensitive to the curse of dimensionality or fail to account for the non-linear manifold structures underlying the data. In this study, we propose a random sampling-based consensus clustering framework combined with manifold-based spectral clustering to predict cluster numbers. Furthermore, a geodesic distance measure and two weighting measures are integrated into the commonly used similarity kernel to enhance clustering performance on scRNA-seq data. We found that, when applied to scRNA-seq datasets with different types of variations, the geodesic distance-based similarity kernel with a discrete weighting measure performs best on datasets having discrete variations.
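
The general idea described in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes a Gaussian kernel applied to shortest-path (geodesic) distances on a k-nearest-neighbour graph, spectral clustering on repeated random subsamples, and a co-clustering (consensus) matrix. All function names and parameters (`n_neighbors`, `sigma`, `frac`, `n_runs`) are hypothetical, the two weighting measures are omitted because the abstract does not define them, and the final partition uses average-linkage hierarchical clustering on the consensus matrix as a stand-in choice.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import squareform
from sklearn.cluster import SpectralClustering
from sklearn.neighbors import kneighbors_graph


def geodesic_affinity(X, n_neighbors=10, sigma=None):
    """Gaussian kernel on graph shortest-path (geodesic) distances."""
    # k-nearest-neighbour graph whose edge weights are Euclidean distances.
    knn = kneighbors_graph(X, n_neighbors=n_neighbors, mode="distance")
    # Shortest paths over the undirected graph approximate geodesic distance.
    D = shortest_path(knn, method="D", directed=False)
    finite = np.isfinite(D)
    if sigma is None:
        # Assumption: bandwidth set to the median finite, non-zero distance.
        sigma = np.median(D[finite & (D > 0)])
    # Assumption: unreachable pairs are capped at the largest finite distance.
    D[~finite] = D[finite].max()
    return np.exp(-(D ** 2) / (2.0 * sigma ** 2))


def consensus_matrix(X, n_clusters, n_runs=20, frac=0.8, seed=0):
    """Fraction of subsamples in which each pair of cells is co-clustered."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    together = np.zeros((n, n))
    sampled = np.zeros((n, n))
    for _ in range(n_runs):
        # Draw a random subsample of cells without replacement.
        idx = rng.choice(n, size=int(frac * n), replace=False)
        A = geodesic_affinity(X[idx])
        labels = SpectralClustering(
            n_clusters=n_clusters, affinity="precomputed", random_state=0
        ).fit_predict(A)
        same = labels[:, None] == labels[None, :]
        together[np.ix_(idx, idx)] += same
        sampled[np.ix_(idx, idx)] += 1
    # Pairs never sampled together keep a consensus value of 0.
    return np.divide(together, sampled, out=np.zeros_like(together),
                     where=sampled > 0)


def consensus_labels(C, n_clusters):
    """Final partition from average-linkage clustering on 1 - consensus."""
    dist = 1.0 - C
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```

Here `consensus_matrix(X, n_clusters=k)` would be called on a cells-by-features matrix `X` (for example, after dimensionality reduction) for each candidate `k`; comparing how close the resulting matrices are to a clean block structure is one common way to choose the number of clusters, though the criterion used in the paper is not stated in the abstract.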

Rights

Rights managed by Taylor & Francis
