Re-Engineering the Concept of Understanding for AI
With Pierre Beckmann.
Argues that the concept of understanding needs to be re-engineered for artificial cognition, in a way that is empirically informed by mechanistic interpretability research and theoretically informed by an analysis of the concept's functions.
AI, conceptual engineering, mechanistic interpretability, understanding, conceptual change, functions
PDF coming soon
Mechanistic Indicators of Understanding in Large Language Models
Philosophical Studies. With Pierre Beckmann. doi:10.48550/arXiv.2507.08017
Draws on detailed technical evidence from research on mechanistic interpretability (MI) to argue that while LLMs differ profoundly from human cognition, they do more than tally up word co-occurrences: they form internal structures that can fruitfully be compared to different forms of human understanding, such as conceptual, factual, and principled understanding. We synthesize MI's most relevant findings to date and embed them within an integrative theoretical framework for thinking about understanding in LLMs. As the phenomenon of "parallel mechanisms" shows, however, the differences between LLMs and human cognition are as philosophically fruitful to consider as the similarities.
explainable AI, LLM, mechanistic interpretability, philosophy of AI, understanding, conceptual change
Download PDF