Mi-Crow¶
Python library for mechanistic interpretability research on Large Language Models
What is Mi-Crow?¶
Mi-Crow is a Python library for researchers working on mechanistic interpretability of Large Language Models (LLMs). It provides a unified interface for analyzing and controlling model behavior, making it easier to understand what is happening inside neural networks.
Key Capabilities¶
- :robot: Activation Analysis - Save and analyze model activations from any layer with minimal performance overhead
- :brain: SAE Training - Train sparse autoencoders to discover interpretable features and concepts
- :bulb: Concept Discovery - Identify and name concepts learned by SAE neurons through automated analysis
- :video_game: Model Steering - Manipulate model behavior through concept-based interventions and activation control
- :hook: Hook System - Flexible framework for intercepting and modifying activations at any layer
- :floppy_disk: Data Persistence - Efficient hierarchical storage for managing large-scale experiment data
Quick Start¶
Installation¶
pip install mi-crow
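To try the latest development version instead, installing straight from the GitHub repository should also work (this assumes the package metadata sits at the repository root; adjust the URL if the layout differs):
pip install git+https://github.com/AdamKaniasty/Inzynierka.git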
Basic Usage¶
from mi_crow.language_model import LanguageModel
# Initialize a language model
lm = LanguageModel(model_id="bielik")
# Run inference
outputs = lm.forwards(["Hello, world!"])
# Access activations and outputs
print(outputs.logits)
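Because forwards takes a list of strings, passing several prompts in one call should batch them together (an assumption based on the signature shown above, not verified here):
# Hypothetical batched call, following the list-based signature above
outputs = lm.forwards(["Hello, world!", "Mechanistic interpretability is"])
print(outputs.logits.shape)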
Training an SAE¶
from mi_crow.language_model import LanguageModel
from mi_crow.mechanistic.sae import SaeTrainer
from mi_crow.mechanistic.sae.modules import TopKSae
# Load model and collect activations
lm = LanguageModel(model_id="bielik")
activations = lm.save_activations(
    dataset=["Your text data here"],
    layers=["transformer_h_0_attn_c_attn"]
)
# Train SAE
trainer = SaeTrainer(
    model=lm,
    layer="transformer_h_0_attn_c_attn",
    sae_class=TopKSae,
    hyperparams={"epochs": 10, "batch_size": 256}
)
sae = trainer.train(activations)
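The TopKSae class name suggests a top-k sparse autoencoder, which keeps only the k largest latent activations for each input and zeroes the rest. The snippet below is a generic PyTorch sketch of that idea, not Mi-Crow's implementation; the class name, dimensions, and parameters are illustrative assumptions.
import torch
import torch.nn as nn

class TopKSaeSketch(nn.Module):
    """Minimal top-k sparse autoencoder (illustrative sketch only)."""

    def __init__(self, d_model: int, d_latent: int, k: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Encode, then keep only the k largest latent activations per example
        latent = torch.relu(self.encoder(x))
        top = torch.topk(latent, self.k, dim=-1)
        sparse = torch.zeros_like(latent).scatter_(-1, top.indices, top.values)
        # Reconstruct the original activation from the sparse code
        return self.decoder(sparse)

# Example: reconstruct 768-dimensional activations through a 16x wider latent space
sae_sketch = TopKSaeSketch(d_model=768, d_latent=12288, k=32)
reconstruction = sae_sketch(torch.randn(4, 768))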
Documentation Structure¶
Getting Started¶
New to Mi-Crow? Start here:
- Installation Guide - Set up your environment
- Quick Start Tutorial - Run your first example in minutes
- Core Concepts - Understand the fundamentals
User Guide¶
Comprehensive guides for all features:
- Hooks System - Complete guide to the powerful hooks framework
    - Fundamentals - Core concepts
    - Detectors - Observing activations
    - Controllers - Modifying behavior
    - Registration - Hook management
    - Advanced Usage - Advanced patterns
- Workflows - Step-by-step guides for common tasks
    - Saving Activations
    - Training SAE Models
    - Concept Discovery
    - Concept Manipulation
- Best Practices - Tips for effective research
- Troubleshooting - Common issues and solutions
- Examples - Example notebooks overview
Experiments¶
Real-world experiments demonstrating Mi-Crow usage:
- Experiments Overview - Available experiments
- Verify SAE Training - Complete SAE training workflow
- SLURM Pipeline - Distributed training setup
API Reference¶
Complete API documentation:
- API Overview - API structure and organization
- Language Model - Model loading and inference
- SAE - Sparse autoencoder APIs
- Datasets - Dataset loading and processing
- Store - Persistence layer
- Hooks - Hook system APIs
Features¶
Unified Model Interface¶
Work with any HuggingFace language model through a consistent API. No need to handle model-specific initialization details.
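For instance, the constructor used in Quick Start should work the same way for other models; assuming model_id also accepts a HuggingFace Hub repository id (an assumption, not confirmed on this page), swapping models is a one-line change:
from mi_crow.language_model import LanguageModel

# Hypothetical: pass a HuggingFace Hub repository id instead of the "bielik" alias
lm = LanguageModel(model_id="gpt2")
outputs = lm.forwards(["The quick brown fox"])
print(outputs.logits.shape)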
Research-Focused Design¶
Built specifically for interpretability research workflows:
- Comprehensive Testing: 85%+ code coverage requirement
- Type Safety: Extensive use of Python type annotations
- Documentation: Complete API reference and user guides
- CI/CD: Automated testing and deployment
- Minimal Overhead: Hook system introduces negligible latency during inference
Flexible Architecture¶
Five core modules that work independently or together:
- Language Model - Unified interface for any HuggingFace model
- Hooks - Flexible activation interception system (see the sketch after this list)
- Mechanistic - SAE training and concept manipulation
- Store - Hierarchical data persistence
- Datasets - Dataset loading and processing
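To ground the Hooks module conceptually: at the PyTorch level, intercepting a layer's activations comes down to registering a forward hook on that module. The sketch below shows this underlying mechanism with raw PyTorch and a HuggingFace GPT-2 model; it illustrates the general technique, not Mi-Crow's own hook API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative only: capture one layer's output with a plain PyTorch forward hook
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
captured = {}

def save_activation(module, inputs, output):
    # Forward hooks receive the module, its inputs, and its output
    tensor = output[0] if isinstance(output, tuple) else output
    captured["h0_attn"] = tensor.detach()

handle = model.transformer.h[0].attn.register_forward_hook(save_activation)
with torch.no_grad():
    model(**tokenizer("Hello, world!", return_tensors="pt"))
handle.remove()  # always remove hooks when you are done
print(captured["h0_attn"].shape)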
Repository & Links¶
- GitHub: AdamKaniasty/Inzynierka
- PyPI: mi-crow
- Documentation: This site
Citation¶
If you use Mi-Crow in your research, please cite:
@thesis{kaniasty2025microw,
  title={Mechanistic Interpretability for Large Language Models: A Production-Ready Framework},
  author={Kaniasty, Adam and Kowalski, Hubert},
  year={2025},
  school={Warsaw University of Technology},
  note={Engineering Thesis}
}
Next Steps¶
- :rocket: Quick Start - Get up and running in minutes
- :book: User Guide - Comprehensive documentation
- :wrench: Examples - Explore example notebooks
- :test_tube: Experiments - Real-world use cases