Mi-Crow

Python library for mechanistic interpretability research on Large Language Models


What is Mi-Crow?

Mi-Crow is a Python library designed for researchers working on mechanistic interpretability of Large Language Models (LLMs). It provides a unified interface for analyzing and controlling model behavior, making it easier to understand what is happening inside a neural network.

Key Capabilities

  • :robot: Activation Analysis

    Save and analyze model activations from any layer with minimal performance overhead

  • :brain: SAE Training

    Train sparse autoencoders to discover interpretable features and concepts

  • :bulb: Concept Discovery

    Identify and name concepts learned by SAE neurons through automated analysis

  • :video_game: Model Steering

    Manipulate model behavior through concept-based interventions and activation control

  • :hook: Hook System

    Flexible framework for intercepting and modifying activations at any layer (a bare-PyTorch sketch of the mechanism follows this list)

  • :floppy_disk: Data Persistence

    Efficient hierarchical storage for managing large-scale experiment data
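
Mi-Crow's own hook API is documented in the user guide. As an illustration of the mechanism behind the hook and steering bullets above, the sketch below uses a plain HuggingFace GPT-2 model and PyTorch's register_forward_hook to add a placeholder (all-zero) steering vector to one block's output; none of it is Mi-Crow code, and gpt2 is just a stand-in model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Placeholder "concept direction"; a real steering vector would come
# from concept discovery, e.g. a column of a trained SAE's decoder.
steer = torch.zeros(model.config.n_embd)

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden state.
    hidden = output[0] + steer
    return (hidden,) + output[1:]

handle = model.transformer.h[0].register_forward_hook(add_steering)
out = model(**tok("Hello, world!", return_tensors="pt"))
handle.remove()  # always detach hooks after the experiment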

Quick Start

Installation

pip install mi-crow

Basic Usage

from mi_crow.language_model import LanguageModel

# Initialize a language model
lm = LanguageModel(model_id="bielik")

# Run inference
outputs = lm.forwards(["Hello, world!"])

# Access activations and outputs
print(outputs.logits)
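
The exact shape of the outputs object is defined by Mi-Crow, but assuming the logits follow the usual HuggingFace (batch, sequence, vocabulary) convention, the greedy next-token prediction for the first prompt can be read off directly:

import torch

# Assumes the conventional (batch, sequence, vocabulary) logit layout.
next_token_id = torch.argmax(outputs.logits[0, -1]).item()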

Training an SAE

from mi_crow.language_model import LanguageModel
from mi_crow.mechanistic.sae import SaeTrainer
from mi_crow.mechanistic.sae.modules import TopKSae

# Load model and collect activations
lm = LanguageModel(model_id="bielik")
activations = lm.save_activations(
    dataset=["Your text data here"],
    layers=["transformer_h_0_attn_c_attn"]
)

# Train SAE
trainer = SaeTrainer(
    model=lm,
    layer="transformer_h_0_attn_c_attn",
    sae_class=TopKSae,
    hyperparams={"epochs": 10, "batch_size": 256}
)
sae = trainer.train(activations)
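
TopKSae refers to a top-k sparse autoencoder: only the k largest latent activations per sample are kept, which is what makes the learned features sparse and, ideally, interpretable. The snippet below is a minimal PyTorch sketch of that idea, not Mi-Crow's implementation; all names and sizes are placeholders.

import torch
import torch.nn as nn

class TinyTopKSae(nn.Module):
    """Minimal top-k sparse autoencoder (illustration only)."""
    def __init__(self, d_model: int, d_latent: int, k: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = torch.relu(self.encoder(x))
        # Keep only the k largest latents per sample; zero out the rest.
        topk = torch.topk(z, self.k, dim=-1)
        sparse = torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)
        return self.decoder(sparse)

# One reconstruction-loss training step on a batch of activations
sae = TinyTopKSae(d_model=768, d_latent=4096, k=32)
batch = torch.randn(256, 768)  # stand-in for saved activations
loss = nn.functional.mse_loss(sae(batch), batch)
loss.backward()

Presumably, SaeTrainer wraps a reconstruction-loss loop along these lines behind the hyperparams shown above.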

Documentation Structure

🚀 Getting Started

New to Mi-Crow? Start here:

📚 User Guide

Comprehensive guides for all features:

🧪 Experiments

Real-world experiments demonstrating Mi-Crow usage:

📖 API Reference

Complete API documentation:


Features

Unified Model Interface

Work with any HuggingFace language model through a consistent API. No need to handle model-specific initialization details.

Research-Focused Design

Built specifically for interpretability research workflows:

  • Comprehensive Testing: 85%+ code coverage requirement
  • Type Safety: Extensive use of Python type annotations
  • Documentation: Complete API reference and user guides
  • CI/CD: Automated testing and deployment
  • Minimal Overhead: Hook system introduces negligible latency during inference

Flexible Architecture

Five core modules that work independently or together:

  1. Language Model - Unified interface for any HuggingFace model
  2. Hooks - Flexible activation interception system
  3. Mechanistic - SAE training and concept manipulation
  4. Store - Hierarchical data persistence
  5. Datasets - Dataset loading and processing


Citation

If you use Mi-Crow in your research, please cite:

@thesis{kaniasty2025microw,
  title={Mechanistic Interpretability for Large Language Models: A Production-Ready Framework},
  author={Kaniasty, Adam and Kowalski, Hubert},
  year={2025},
  school={Warsaw University of Technology},
  note={Engineering Thesis}
}

Next Steps