
Mechanistic Indicators of Understanding in Large Language Models

Philosophical Studies. With Pierre Beckmann. doi:10.48550/arXiv.2507.08017

Draws on detailed technical evidence from research on mechanistic interpretability (MI) to argue that while LLMs differ profoundly from human cognition, they do more than tally up word co-occurrences: they form internal structures that can be fruitfully compared to different forms of human understanding, such as conceptual, factual, and principled understanding. We synthesize MI’s most relevant findings to date while embedding them within an integrative theoretical framework for thinking about understanding in LLMs. As the phenomenon of “parallel mechanisms” shows, however, the differences between LLMs and human cognition are as philosophically instructive as the similarities.

conceptual change, explainable AI, LLM, mechanistic interpretability, philosophy of AI, understanding


Can AI Rely on the Systematicity of Truth? The Challenge of Modelling Normative Domains

Philosophy & Technology 38 (34): 1–27. 2025. doi:10.1007/s13347-025-00864-x

Argues that the asystematicity of normative domains, which stems from the plurality, incompatibility, and incommensurability of values, challenges AI’s ability to model these domains comprehensively and underscores the indispensable role of human agency in practical deliberation.

AI, asystematicity, LLM, normativity, philosophy of technology, systematicity
