ePoster

NDIF: AN OPEN TOOL ECOSYSTEM FOR PROBING REPRESENTATIONS AND CIRCUITS IN DEEP LEARNING MODELS

Gabriele Sarti and 7 co-authors

Northeastern University

FENS Forum 2026 (2026)
Barcelona, Spain
Board PS05-09AM-035

Abstract

Despite advances in high-density electrophysiology and single-cell neuroimaging, recording and modulating whole-brain activity at fine-grained resolution remains infeasible. Artificial neural networks (ANNs), such as large language models (LLMs), can serve as in silico alternatives for measuring cognitive-like behaviors using neuro-inspired methods, including activation recording, targeted perturbations, and representational analysis. However, most ANN studies focus on small models due to computational constraints, leaving the inner workings of capable frontier systems largely unexplored. The National Deep Inference Fabric (NDIF) bridges this gap by providing open tools and infrastructure for mechanistic analyses of large-scale open-source ANNs.

The NDIF ecosystem comprises three core tools. NNsight is a low-level package for recording and intervening on internal model activations, enabling causal experiments analogous to lesion studies and optogenetic manipulations. NNterp standardizes interpretability techniques across model architectures for reproducible analyses at scale. Finally, Workbench is our interpretability research platform for rapid exploration of model behaviors and AI pedagogy.
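
To make the lesion-study analogy concrete, the following is a minimal sketch of activation recording and single-unit ablation on a toy network. It is written in plain Python and does not use the NNsight API; the weights `W1` and `W2` and the helpers `forward` and `ablation_effects` are hypothetical names introduced only for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy weights: 2 inputs -> 3 hidden units (ReLU) -> 1 output.
W1: List[List[float]] = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
W2: List[float] = [1.0, -2.0, 0.5]


def relu(v: float) -> float:
    return v if v > 0.0 else 0.0


def forward(
    x: List[float],
    cache: Optional[Dict[str, List[float]]] = None,
    ablate_unit: Optional[int] = None,
) -> float:
    """Run the toy network, optionally recording or lesioning hidden units."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    if ablate_unit is not None:
        # Intervention: silence one hidden unit, loosely analogous to a lesion.
        hidden[ablate_unit] = 0.0
    if cache is not None:
        # Recording: store a copy of the (possibly lesioned) hidden activations.
        cache["hidden"] = list(hidden)
    return sum(w * h for w, h in zip(W2, hidden))


def ablation_effects(x: List[float]) -> List[float]:
    """Causal effect of each hidden unit: baseline output minus lesioned output."""
    baseline = forward(x)
    return [baseline - forward(x, ablate_unit=i) for i in range(len(W1))]


if __name__ == "__main__":
    recorded: Dict[str, List[float]] = {}
    print(forward([2.0, 1.0], cache=recorded))  # baseline output
    print(recorded["hidden"])                   # recorded hidden activations
    print(ablation_effects([2.0, 1.0]))         # per-unit ablation effects
```

Comparing the baseline output with each lesioned output attributes a causal contribution to every hidden unit. The same record-then-intervene pattern is what tools of this kind apply to the internal activations of much larger models.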

We conduct a systematic survey of 184 recent interpretability studies, finding a 40% performance gap on the Massive Multitask Language Understanding (MMLU) benchmark between commonly analyzed models and frontier systems. NDIF addresses this limitation by allowing researchers to access and conduct experiments directly on multi-billion-parameter models available on remote high-performance computing resources, enabling large-scale analyses without dedicated computational infrastructure.

Research using NDIF has already yielded insights into neuro-related questions such as multilingual concept encoding, Theory of Mind capabilities, and sentence-processing mechanisms in LLMs. We invite collaborations from the computational neuroscience community to explore parallels between artificial and biological neural computation.

Figure caption. Left: Blocks representing the hierarchy of tools composing the NDIF ecosystem. From top to bottom: Workbench, a low-code UI for experimentation and educational applications; NNsight, foundational access to interventions and model internals; NNterp, a standardized interface for reproducible interpretability analyses. NNsight communicates with the NDIF server to coordinate remote execution of large language models (LLMs). Right: Line plot showcasing the gap between the performance of frontier models and systems analyzed in interpretability research between 2019 and 2025. While some studies analyze models achieving scores >70% on the MMLU benchmark, most work still focuses on systems with performance <40% due to computational constraints.
