Bilinear MLPs enable weight-based mechanistic interpretability

Source
ICLR 2025: Thirteenth International Conference on Learning Representations, 24-28 April 2025, Singapore, p. 1-28
Author(s)

Compositionality unlocks deep interpretable models

Source
AAAI'25 Workshop on CoLoRAI - Connecting Low-Rank Representations in AI, March 4, 2025, Philadelphia, Pennsylvania, USA, p. 1-10
Author(s)

Tokenized SAEs : disentangling SAE reconstructions

Source
ICML 2024 Workshop on Mechanistic Interpretability, July 27, 2024, Vienna, Austria, p. 1-13
Author(s)

Weight-based decomposition : a case for bilinear MLPs

Source
ICML 2024 Workshop on Mechanistic Interpretability, July 27, 2024, Vienna, Austria, p. 1-20
Author(s)