Experiments¶
This section provides detailed walkthroughs of sample experiments demonstrating real-world usage of mi-crow.
Overview¶
Experiments showcase complete workflows from data collection through analysis, using real models and datasets. They demonstrate best practices and provide templates for your own research.
Available Experiments¶
Verify SAE Training¶
Complete workflow for training and validating SAE models on the Bielik model using the TinyStories dataset.
What it covers:
- Saving activations from a production model
- Training SAEs with proper hyperparameters
- Validating training success
- Concept discovery and naming
- Analysis and visualization
Time required: Several hours (depending on hardware)
Prerequisites:
- Access to the Bielik model or a similar model
- Sufficient GPU memory
- Understanding of basic SAE concepts (see the sketch below)
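If SAEs are new to you, the following is a minimal sketch of the core idea: an overcomplete encoder/decoder trained to reconstruct model activations with an L1 sparsity penalty on its hidden code. It uses plain PyTorch with placeholder dimensions and is not the mi-crow implementation.

```python
# Minimal sparse autoencoder sketch (plain PyTorch, not the mi-crow API).
# Dimensions and the L1 coefficient are illustrative placeholders.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x: torch.Tensor):
        z = torch.relu(self.encoder(x))   # sparse feature activations
        x_hat = self.decoder(z)           # reconstruction of the input activations
        return x_hat, z

sae = SparseAutoencoder(d_model=2048, d_hidden=16384)
x = torch.randn(8, 2048)                  # stand-in for a batch of saved activations
x_hat, z = sae(x)
loss = ((x_hat - x) ** 2).mean() + 1e-3 * z.abs().mean()  # reconstruction + L1 sparsity
loss.backward()
```

The experiment scripts handle batching over saved activations, logging, and checkpointing; this only illustrates the training objective.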
SLURM SAE Pipeline¶
Distributed training setup for large-scale SAE training on cluster environments.
What it covers:
- SLURM job configuration (a minimal submission sketch follows the prerequisites below)
- Distributed activation saving
- Large-scale SAE training
- Resource management
Time required: Days (cluster-dependent)
Prerequisites:
- Access to a SLURM cluster
- Understanding of cluster computing
- A large-scale dataset
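As a rough illustration of what a SLURM submission can look like, the sketch below writes and submits a job script from Python. All `#SBATCH` values and the script name are generic placeholders, not the project's actual pipeline configuration; consult your cluster's documentation for the correct partition, account, and resource limits.

```python
# Hypothetical sketch: write and submit a SLURM job script for one training run.
# Every #SBATCH directive here is a placeholder to be adapted to your cluster.
import pathlib
import subprocess
import textwrap

job_script = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --job-name=sae-train
    #SBATCH --gres=gpu:1
    #SBATCH --cpus-per-task=8
    #SBATCH --mem=64G
    #SBATCH --time=48:00:00

    python 02_train_sae.py
    """)

path = pathlib.Path("train_sae.sbatch")
path.write_text(job_script)
subprocess.run(["sbatch", str(path)], check=True)  # submit to the cluster queue
```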
Experiment Structure¶
Each experiment typically includes the following phases (an example directory layout follows this list):
- Setup: Environment and dependencies
- Data Collection: Saving activations
- Training: SAE model training
- Validation: Verifying results
- Analysis: Understanding outcomes
- Documentation: Recording findings
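As an example, the verify_sae_training experiment (whose scripts are referenced in the workflow below) follows roughly this layout; exact file names may differ between experiments.

```
experiments/verify_sae_training/
├── README.md
├── 01_save_activations.py
├── 02_train_sae.py
└── 03_analyze_training.ipynb
```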
Running Experiments¶
Prerequisites¶
```bash
# Install dependencies
pip install -e .

# Or with uv
uv sync
```
Basic Workflow¶
```bash
# 1. Navigate to experiment directory
cd experiments/verify_sae_training

# 2. Review README
cat README.md

# 3. Run scripts in order
python 01_save_activations.py
python 02_train_sae.py

# 4. Open analysis notebooks
jupyter notebook 03_analyze_training.ipynb
```
Customization¶
Experiments are designed to be customizable:
- Modify model names
- Adjust hyperparameters (see the sketch after this list)
- Change dataset sources
- Adapt to your hardware
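For example, training hyperparameters are typically gathered near the top of a training script or in a small config object so they are easy to change in one place. The names and values below are illustrative placeholders, not mi-crow's actual settings.

```python
# Illustrative training configuration; names and values are placeholders,
# not mi-crow's actual parameters. Adjust to your model, data, and hardware.
config = {
    "model_name": "speakleash/Bielik-7B-Instruct-v0.1",  # swap in your own model
    "dataset": "roneneldan/TinyStories",                  # or another text dataset
    "layer": 12,                # which layer's activations to train on
    "dictionary_size": 16384,   # number of SAE features
    "l1_coefficient": 1e-3,     # sparsity strength
    "learning_rate": 3e-4,
    "batch_size": 4096,
    "device": "cuda",           # or "cpu" for small-scale tests
}
```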
Experiment Outputs¶
Experiments produce:
- Saved activations: Organized in the store
- Trained models: SAE checkpoints
- Analysis results: Visualizations and metrics
- Documentation: Findings and observations
Best Practices¶
- Start small: Test with limited data first
- Monitor resources: Watch memory and compute usage
- Document changes: Record any modifications
- Save checkpoints: Don't lose progress (see the sketch after this list)
- Validate results: Verify outputs make sense
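Two of these practices, starting small and saving checkpoints, are easy to bake into a training loop. The sketch below uses plain PyTorch with toy stand-ins for the model and data; it is not mi-crow's actual training code.

```python
# Sketch of "start small" and "save checkpoints"; the model, data, and loss are toy stand-ins.
import torch
import torch.nn as nn

smoke_test = True
max_steps = 100 if smoke_test else 50_000      # start small before committing to a full run

model = nn.Linear(2048, 16384)                 # stand-in for the SAE being trained
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

for step in range(max_steps):
    batch = torch.randn(64, 2048)              # stand-in for a batch of saved activations
    loss = model(batch).abs().mean()           # placeholder objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 1_000 == 0:                      # periodic checkpoint so progress isn't lost
        torch.save({"step": step, "model": model.state_dict()}, f"sae_step_{step}.pt")
```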
Contributing Experiments¶
If you create a new experiment:
- Create a directory in experiments/
- Include a README with a description (a skeleton appears after this list)
- Provide runnable scripts/notebooks
- Document setup and requirements
- Share findings and observations
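One possible README skeleton covering these points (headings are suggestions, adapt as needed):

```markdown
# <Experiment name>

## What it covers
## Prerequisites
## How to run
## Findings and observations
```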
Next Steps¶
- Verify SAE Training - Start with this experiment
- User Guide - Learn fundamentals first
- Examples - Try examples before experiments