Philosophy of AI

Mechanistic Indicators of Understanding in Large Language Models

Philosophical Studies. With Pierre Beckmann. doi:10.48550/arXiv.2507.08017

Draws on detailed technical evidence from research on mechanistic interpretability (MI) to argue that while LLMs differ profoundly from human cognition, they do more than tally up word co-occurrences: they form internal structures that can fruitfully be compared to different forms of human understanding, such as conceptual, factual, and principled understanding. We synthesize MI’s most relevant findings to date and embed them within an integrative theoretical framework for thinking about understanding in LLMs. As the phenomenon of “parallel mechanisms” shows, however, the differences between LLMs and human cognition are as philosophically fruitful to consider as the similarities.

explainable AI, LLM, mechanistic interpretability, philosophy of AI, understanding, conceptual change


Explainability through Systematicity: The Hard Systematicity Challenge for Artificial Intelligence

Minds and Machines 35 (35): 1–39. 2025. doi:10.1007/s11023-025-09738-9

Offers a framework for thinking about “the systematicity of thought” that distinguishes four senses of the phrase, defuses the alleged tension between systematicity and connectionism that Fodor and Pylyshyn influentially diagnosed, and identifies a “hard” form of the systematicity challenge that continues to defy connectionist models.

AI, explainable AI, philosophy of AI, rationality, systematicity, conceptual change
