Resources
Authors & Affiliations
Quilee Simeon, Anshul Kashyap, Konrad Kording, Ed Boyden
Abstract
We present a comprehensive dataset of neural activity compiled from 11 neuroimaging experiments in the nematode worm C. elegans. These source datasets, while collected under different experimental protocols, all measure neural activity via changes in calcium fluorescence in labeled subsets of the hermaphrodite worm’s 302 neurons. Our contribution is to standardize these disparate datasets into a unified framework by ordering the neural data from labeled neurons consistently, resampling traces to a common timestep, and including a feature mask that indicates which neurons were labeled in each animal. Our compilation includes data from ~900 worms and ~250 uniquely labeled neurons, providing a rich and diverse set of neural recordings for the research community. The goal of this dataset is to facilitate the training of a foundation model of the C. elegans nervous system, leveraging the unique advantage that neural activity can practically be measured from every neuron with a definitive label. Additionally, the complete wiring diagram, or “connectome”, of the C. elegans neural network is known, making this organism ideal for such computational modeling efforts. We supplement our large dataset of neural activity with a second dataset of published C. elegans connectomes that we have preprocessed into graph-based data structures, providing a resource for developing connectome-constrained models. We are open-sourcing these meta-datasets on Hugging Face:
(1) neural data: https://huggingface.co/datasets/qsimeon/celegans_neural_data
(2) connectome data: https://huggingface.co/datasets/qsimeon/celegans_connectome_data
By providing a standardized and comprehensive neural activity dataset enriched with connectivity information for C. elegans, we aim to encourage computational research towards creating a foundation model of an entire small nervous system.
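The three standardization steps named in the abstract (consistent neuron ordering, resampling to a common timestep, and a feature mask) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `standardize`, the five-neuron `NEURON_ORDER` list, the use of linear interpolation, and the zero-fill for unlabeled neurons are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical canonical ordering, shortened for illustration.
# A real ordering would list every neuron name once, in a fixed order.
NEURON_ORDER = ["ADAL", "ADAR", "AVAL", "AVAR", "RIML"]


def standardize(traces, times, labels, dt, neuron_order=NEURON_ORDER):
    """Map one worm's recording onto a shared layout.

    traces: (T, k) array of calcium signals for k labeled neurons
    times:  (T,) strictly increasing sample times in seconds
    labels: k neuron names, one per column of `traces`
    dt:     common timestep in seconds (assumed; chosen by the caller)
    """
    traces = np.asarray(traces, dtype=float)
    times = np.asarray(times, dtype=float)

    # Common time grid spanning the original recording.
    new_times = np.arange(times[0], times[-1] + 1e-9, dt)

    # Unlabeled neurons stay at zero and are flagged False in the mask.
    data = np.zeros((new_times.size, len(neuron_order)))
    mask = np.zeros(len(neuron_order), dtype=bool)

    index = {name: i for i, name in enumerate(neuron_order)}
    for col, name in enumerate(labels):
        if name not in index:
            continue  # skip names outside the canonical list
        j = index[name]
        # Linear interpolation onto the common grid (an assumed choice).
        data[:, j] = np.interp(new_times, times, traces[:, col])
        mask[j] = True
    return data, mask, new_times


# Toy recording: two labeled neurons sampled every 2 s, resampled to 1 s.
toy_times = np.array([0.0, 2.0, 4.0])
toy_traces = np.array([[0.0, 10.0], [2.0, 30.0], [4.0, 50.0]])
data, mask, new_times = standardize(toy_traces, toy_times, ["AVAR", "ADAL"], dt=1.0)
```

With every worm mapped to the same column order, recordings from different experiments can be stacked or batched directly, and the mask tells a model which columns hold measured activity rather than placeholder zeros.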