Anthropic (an AI safety company) published its first interpretability paper, which presents a mathematical framework for reverse engineering transformer language models: A Mathematical Framework for Transformer Circuits https://transformer-circuits.pub/2021/framework/index.html
Links for 2021-12-25