ePoster

Open-source solutions for research data management in neuroscience collaborations

Reema Gupta, Thomas Wachtler
Bernstein Conference 2024 (2024)
Goethe University, Frankfurt, Germany

Abstract

Scientific progress increasingly relies on interdisciplinary collaborations as experimental and simulation datasets grow in complexity. However, these datasets are often underutilized due to inadequate solutions for data management, sharing, and analysis. Existing solutions address different aspects of the data workflow, but an integrated approach is still missing. We address these challenges within the EU-funded In2PrimateBrains [1] project, which investigates brain networks in non-human primates, using modular open-source tools, methods, and services to balance standardization and adaptability. Metadata recording conventions vary between laboratories and are often insufficient for reuse and analysis. We leverage community standards such as BIDS [2], BEP032 [3], and openMINDS [4] to create a comprehensive metadata schema. Creating metadata instances with these rich schemas typically requires expertise and time; we simplify this with a user-friendly, form-based interface within the CEDAR [5] metadata workbench, incorporating built-in data validation, controlled vocabularies, and semantic annotations. We then use the odML metadata format [6] to store the metadata in a form that is both machine- and human-readable, enabling automated enrichment, accessibility, and contextualization [7]. The use of various recording systems with proprietary formats and multimodal data presents additional challenges for harmonization and usability. We use Neo [8], a common electrophysiological data representation that interfaces with various lab formats. For storage alongside other modalities, we use the versatile NIX format [9], which provides a coherent structure for organizing and integrating data and metadata. This facilitates efficient selection of data for analysis based on metadata and provenance. Sharing and tracking evolving data requires a dedicated scientific data platform. GIN [10] provides version control [11, 12], fine-grained access control, collaborative features, and data publication services [13], and can be accessed through a web browser, a command-line client, or DataLad [14]. These tools and solutions adhere to the FAIR [15] principles and ensure interoperability with related tools [16, 17], offering researchers an efficient means to manage, collaborate on, and work with diverse datasets without disrupting established research workflows.

Unique ID: bernstein-24/open-source-solutions-research-data-cf10fd06