Bilinear MLPs enable weight-based mechanistic interpretability
Source
ICLR 2025 : thirteenth International Conference on Learning Representations, 24-28 April, 2025, Singapore, p. 1-28
Compositionality unlocks deep interpretable models
Source
AAAI'25 workshop on CoLoRAI - Connecting Low-Rank Representations in AI, March 4, 2025, Philadelphia, Pennsylvania, USA, p. 1-10
Tokenized SAEs : disentangling SAE reconstructions
Source
ICML 2024 Workshop on Mechanistic Interpretability, July 27, 2024, Vienna, Austria, p. 1-13
Weight-based decomposition : a case for bilinear MLPs
Source
ICML 2024 Workshop on Mechanistic Interpretability, July 27, 2024, Vienna, Austria, p. 1-20