ePoster

UnitRefine: A community toolbox for automated spike sorting curation

Anoushka Jain, Matthias Hennig, Simon Musall, Robyn Greene, Federico Suprio, Jake Swann, Chris Halcrow, Alexander Kleinjohann, Severin Graff, Juergen Gall, Bjorn Kampa, Sonja Grun, Alessio Buccino
COSYNE 2025 (2025)
Montreal, Canada

Abstract

Electrophysiological recordings capture signals from hundreds of neurons simultaneously, but isolating single-cell activity often requires manual curation due to limitations in spike-sorting algorithms. As dataset sizes grow, the time and expertise required for accurate and consistent human curation pose a major challenge for experimental labs. To address this issue, we developed UnitRefine, a classification toolbox that leverages diverse machine-learning algorithms to minimize manual curation efforts. Using acute recordings with Neuropixels probes, we collected a large neural dataset with highly reproducible experimental conditions and had multiple expert human curators label each recording for reliable cluster identification. This carefully labeled dataset served as the foundation for our automated curation system, which learns from human annotations and replicates curator decisions. UnitRefine incorporates existing and newly developed quality metrics, including hyper-synchronous spiking events and drifts in firing rate, to automatically separate noise from neural clusters with high accuracy. To address inherent labeling imbalances between well-isolated single-cell clusters and mixed-population activity, we implemented a cascading classification system. UnitRefine uses a comprehensive hyperparameter optimization search across various classification algorithms, including deep-learning and ensemble methods, to identify optimal model parameters. Across recordings, an optimized Random Forest decoder outperformed other approaches, with up to 87% accuracy for unseen recordings. The broad applicability of UnitRefine is demonstrated by its successful performance across diverse labs and recording conditions, including high-density probes in rats, clinical recordings in epilepsy patients, and open datasets from the Allen Institute. Notably, labeling just 20% of the novel data significantly improved curation performance.
UnitRefine is developed specifically for broad community adoption: it is easy to use and fully integrated into SpikeInterface, allowing users either to apply our pre-trained models or to generate new decoders based on their own curation. New models can also be trained on custom metrics and easily shared via the Hugging Face Hub.
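
The cascading classification described in the abstract can be pictured with a minimal, dependency-free sketch. It is not the UnitRefine implementation or the SpikeInterface API. It assumes a two-stage order inferred from the abstract (first noise versus neural, then single-unit versus multi-unit among the neural clusters), uses a nearest-centroid classifier as a stand-in for the Random Forest the abstract reports as best, and runs on made-up, pre-normalized values for two hypothetical metrics (`snr`, `isi_violation`). The function names and the label names `noise`, `mua`, and `sua` are likewise illustrative.

```python
from statistics import mean


def fit_centroids(rows, labels):
    """Fit a nearest-centroid classifier: one mean feature vector per label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {
        label: [mean(column) for column in zip(*members)]
        for label, members in groups.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def fit_cascade(rows, labels):
    """Train two stages so that each one sees a simpler, less imbalanced problem."""
    # Stage 1: every unit, relabeled as noise vs. neural.
    stage1_labels = ["noise" if label == "noise" else "neural" for label in labels]
    stage1 = fit_centroids(rows, stage1_labels)
    # Stage 2: only the neural units, single-unit (sua) vs. multi-unit (mua).
    neural = [(row, label) for row, label in zip(rows, labels) if label != "noise"]
    stage2 = fit_centroids([row for row, _ in neural], [label for _, label in neural])
    return stage1, stage2


def predict_cascade(model, row):
    """Apply stage 1; only units that pass as neural reach stage 2."""
    stage1, stage2 = model
    if predict(stage1, row) == "noise":
        return "noise"
    return predict(stage2, row)


# Hypothetical, hand-made training data: [snr, isi_violation], both scaled to 0..1.
ROWS = [
    [0.10, 0.90], [0.15, 0.80], [0.20, 0.85], [0.10, 0.95],  # noise
    [0.50, 0.50], [0.55, 0.45], [0.60, 0.55],                # mua
    [0.90, 0.05], [0.95, 0.10],                              # sua
]
LABELS = ["noise"] * 4 + ["mua"] * 3 + ["sua"] * 2

model = fit_cascade(ROWS, LABELS)
for unit in ([0.12, 0.90], [0.52, 0.50], [0.92, 0.08]):
    print(unit, predict_cascade(model, unit))
```

Splitting the decision this way means the second stage is fitted only on neural clusters, so the large noise class cannot swamp the rarer well-isolated units. In practice each stage would be a tuned classifier trained on curator labels rather than the toy centroid rule used here.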

Unique ID: cosyne-25/unitrefine-community-toolbox-automated-7abb289d