Hooks System Overview
The hooks system is the foundation of mi-crow's interpretability capabilities. It lets you intercept, record, and modify model activations during inference without changing the model's code.
What are Hooks?
Hooks are callbacks that execute at specific points during a neural network's forward pass. In mi-crow, hooks allow you to:
- Observe activations without modifying them (Detectors)
- Modify activations to change model behavior (Controllers)
- Compose multiple hooks for complex experiments
- Manage hook lifecycle (register, enable, disable, unregister)
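To make the callback idea concrete, here is a minimal pure-Python sketch. It does not use mi-crow: `ToyLayer`, `observe`, and `zero_first` are hypothetical names invented for illustration. The rule that a hook returning `None` leaves the output unchanged is an assumption borrowed from how PyTorch forward hooks behave.

```python
from typing import Callable, List, Optional

Vector = List[float]
# A hook receives the layer output and returns None (observe only)
# or a new output (modify).
Hook = Callable[[Vector], Optional[Vector]]


class ToyLayer:
    """Hypothetical stand-in for a model layer: doubles every input value."""

    def __init__(self) -> None:
        self.hooks: List[Hook] = []

    def forward(self, x: Vector) -> Vector:
        out = [2.0 * v for v in x]
        # Run each hook on the output; a non-None return value replaces it.
        for hook in self.hooks:
            result = hook(out)
            if result is not None:
                out = result
        return out


seen: List[Vector] = []


def observe(out: Vector) -> None:
    """Detector-style hook: records a copy and returns nothing."""
    seen.append(list(out))


def zero_first(out: Vector) -> Vector:
    """Controller-style hook: returns a modified copy."""
    return [0.0] + out[1:]


layer = ToyLayer()
layer.hooks.append(observe)     # runs first, sees the unmodified output
layer.hooks.append(zero_first)  # runs second, replaces the output
result = layer.forward([1.0, 2.0, 3.0])
```

Because the hooks run in the order they were added, the observer records the layer's raw output while the caller receives the modified one. This is the simplest form of composing hooks.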
Why Hooks Matter
Hooks are central to everything mi-crow does:
- Activation Saving: Detectors capture activations for analysis
- SAE Integration: Sparse autoencoders (SAEs) work as hooks to decode activations
- Concept Discovery: Text tracking uses detectors to collect examples
- Model Steering: Controllers modify activations to change behavior
- Experimentation: Combine multiple hooks for intervention studies
Without hooks, you'd need to modify model code directly. Hooks provide a clean, non-invasive way to inspect and control models.
Hook Architecture
```mermaid
graph TD
    A[Language Model] --> B[Layer Forward Pass]
    B --> C{Hook Type?}
    C -->|FORWARD| D[Post-Forward Hook]
    C -->|PRE_FORWARD| E[Pre-Forward Hook]
    D --> F[Detector or Controller]
    E --> F
    F --> G[Process Activation]
    G -->|Detector| H[Observe Only]
    G -->|Controller| I[Modify & Return]
    H --> J[Continue Forward]
    I --> J
```
Detectors vs Controllers
Detectors
Purpose: Observe and collect data without modification
Use Cases:
- Saving activations for analysis
- Tracking statistics (mean, variance, etc.)
- Collecting top activating texts
- Debugging model behavior
Key Property: Detectors never modify activations - they're purely observational.
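As an illustration of the "tracking statistics" use case, the sketch below accumulates a running mean and variance with Welford's online algorithm. `RunningStatsDetector` is a hypothetical class written for this page, not a mi-crow detector, and it works on plain Python lists rather than tensors.

```python
from typing import List


class RunningStatsDetector:
    """Hypothetical detector: accumulates mean and variance of observed values."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def __call__(self, activations: List[float]) -> None:
        # Welford's online update; reads the values and never writes to them.
        for value in activations:
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of everything observed so far."""
        return self._m2 / self.count if self.count else 0.0


detector = RunningStatsDetector()
batch = [1.0, 2.0, 3.0, 4.0]
detector(batch)  # returns None: the batch itself is left untouched
```

The detector returns nothing and keeps its results on itself, which is what makes it purely observational.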
Controllers
Purpose: Modify activations to change model behavior
Use Cases:
- Amplifying or suppressing specific features
- Concept manipulation through SAE neurons
- Intervention experiments
- Model steering
Key Property: Controllers return modified activations that replace the original.
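The sketch below shows the "amplify or suppress a feature" idea in plain Python. `ScaleFeatureController` and its `index`, `factor`, and `enabled` attributes are hypothetical and only illustrate the return-a-replacement contract; they are not mi-crow's controller API.

```python
from typing import List


class ScaleFeatureController:
    """Hypothetical controller: multiplies one feature index by a factor."""

    def __init__(self, index: int, factor: float) -> None:
        self.index = index
        self.factor = factor
        self.enabled = True

    def __call__(self, activations: List[float]) -> List[float]:
        if not self.enabled:
            return activations  # disabled: pass the values through unchanged
        modified = list(activations)  # copy so the caller's data is untouched
        modified[self.index] *= self.factor
        return modified


original = [0.5, 2.0, -1.0]

amplify = ScaleFeatureController(index=1, factor=3.0)
steered = amplify(original)      # feature 1 is tripled

suppress = ScaleFeatureController(index=1, factor=0.0)
suppressed = suppress(original)  # feature 1 is zeroed out
```

Returning a new list instead of mutating the input keeps the original activations available for comparison, which is convenient in intervention experiments.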
Hook Types
Hooks can execute at two points:
- PRE_FORWARD: Before a layer processes its input
  - Receives: Layer inputs
  - Can modify: Inputs before processing
  - Use case: Modify what the layer receives
- FORWARD: After a layer produces its output
  - Receives: Layer outputs
  - Can modify: Outputs before passing to next layer
  - Use case: Modify what the layer produces
Most hooks are FORWARD hooks, because they operate on layer outputs (activations).
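The difference between the two hook points is easiest to see when the same modification is applied at each. The following is a toy sketch under stated assumptions: `ToyLayer`, `pre_forward_hooks`, and `forward_hooks` are hypothetical names, and the layer simply adds 1.0 to each value.

```python
from typing import Callable, List, Optional

Vector = List[float]
Hook = Callable[[Vector], Optional[Vector]]


class ToyLayer:
    """Hypothetical layer that adds 1.0 to every input value."""

    def __init__(self) -> None:
        self.pre_forward_hooks: List[Hook] = []
        self.forward_hooks: List[Hook] = []

    @staticmethod
    def _run(hooks: List[Hook], value: Vector) -> Vector:
        # A non-None return value from a hook replaces the current value.
        for hook in hooks:
            result = hook(value)
            if result is not None:
                value = result
        return value

    def __call__(self, x: Vector) -> Vector:
        x = self._run(self.pre_forward_hooks, x)   # PRE_FORWARD: sees inputs
        out = [v + 1.0 for v in x]                 # the layer's own computation
        return self._run(self.forward_hooks, out)  # FORWARD: sees outputs


def double(values: Vector) -> Vector:
    return [2.0 * v for v in values]


pre = ToyLayer()
pre.pre_forward_hooks.append(double)
pre_result = pre([1.0, 2.0])    # computes (2 * x) + 1

post = ToyLayer()
post.forward_hooks.append(double)
post_result = post([1.0, 2.0])  # computes 2 * (x + 1)
```

The same `double` function gives different results depending on where it is attached, because a PRE_FORWARD hook changes what the layer computes on, while a FORWARD hook changes what the next layer receives.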
Quick Reference
| Feature | Detector | Controller |
|---|---|---|
| Modifies activations | ❌ No | ✅ Yes |
| Can save to Store | ✅ Yes | Optional |
| Accumulates metadata | ✅ Yes | Optional |
| Returns modified value | ❌ No | ✅ Yes |
| Use case | Observation | Intervention |
Documentation Structure
This hooks guide is organized into:
- Fundamentals - Core concepts, lifecycle, and basics
- Detectors - Using detector hooks for observation
- Controllers - Using controller hooks for modification
- Registration - Managing hooks on layers
- Advanced - Advanced patterns and best practices
Getting Started
If you're new to hooks, work through the guide in this order:
1. Read Fundamentals to understand the basics
2. Try Detectors for observation tasks
3. Explore Controllers for modification tasks
4. Learn Registration for hook management
5. Check Advanced for complex patterns
Example: Simple Hook Usage
```python
from mi_crow.hooks import LayerActivationDetector
from mi_crow.language_model import LanguageModel
from mi_crow.store import LocalStore

# Setup
store = LocalStore(base_path="./store")
lm = LanguageModel.from_huggingface("gpt2", store=store)

# Create a detector hook
detector = LayerActivationDetector(
    layer_signature="transformer.h.0.attn.c_attn"
)

# Register the hook
hook_id = lm.layers.register_hook("transformer.h.0.attn.c_attn", detector)

# Run inference - hook automatically executes
outputs, encodings = lm.inference.execute_inference(["Hello, world!"])

# Access collected data
activations = detector.tensor_metadata.get("activations")

# Clean up
lm.layers.unregister_hook(hook_id)
```
This simple example demonstrates the core hook workflow: create, register, use, and cleanup.
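One way to make the cleanup step robust is to tie registration to a `with` block, so the hook is removed even if inference raises. The sketch below is generic Python: `ToyHookRegistry` and `registered` are hypothetical helpers, and the registry is only a stand-in for whatever object owns the hooks. It is not a description of mi-crow's own API.

```python
import uuid
from contextlib import contextmanager
from typing import Callable, Dict, Iterator


class ToyHookRegistry:
    """Hypothetical registry; stands in for whatever object owns the hooks."""

    def __init__(self) -> None:
        self.hooks: Dict[str, Callable] = {}

    def register_hook(self, hook: Callable) -> str:
        hook_id = uuid.uuid4().hex
        self.hooks[hook_id] = hook
        return hook_id

    def unregister_hook(self, hook_id: str) -> None:
        self.hooks.pop(hook_id, None)


@contextmanager
def registered(registry: ToyHookRegistry, hook: Callable) -> Iterator[str]:
    """Register a hook for the duration of a with-block, then always remove it."""
    hook_id = registry.register_hook(hook)
    try:
        yield hook_id
    finally:
        registry.unregister_hook(hook_id)


registry = ToyHookRegistry()
active_inside = False
try:
    with registered(registry, print) as hook_id:
        active_inside = hook_id in registry.hooks
        raise RuntimeError("simulated failure during inference")
except RuntimeError:
    pass

leftover = len(registry.hooks)  # the hook was removed despite the error
```

The same effect can be had with an explicit `try`/`finally` around the inference call. The context manager just keeps the register and unregister calls in one place.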
Integration with Other Features
Several other mi-crow features are built on hooks:
- SAEs: Work as both detectors and controllers
- Activation Saving: Uses detectors internally
- Concept Discovery: Uses detectors for text tracking
- Model Steering: Uses controllers for interventions
Understanding hooks is essential for understanding how mi-crow works under the hood.
Next Steps
- Hooks Fundamentals - Start here for detailed explanation
- Using Detectors - Learn about observation hooks
- Using Controllers - Learn about modification hooks
- Workflows - See hooks in action