I’m interested in how multi-agent LLM systems exhibit failure modes that are invisible to single-agent evaluation, such as emergent collusion, cross-agent goal drift, and deceptive coordination, and in building auditing infrastructure that can catch and steer these behaviors before deployment. The work runs along three threads: failure itself, studying the misalignment that emerges under conditions where standard evaluation looks clean (Betley et al., 2025); mechanistic auditing, examining what these systems are actually doing internally when they coordinate, deceive, or drift; and agents and society, asking how multi-agent systems get embedded in human institutions and what technical analysis makes that coexistence work (Reuel et al., 2024).

I’m currently a Master’s student in CS at Dartmouth College, co-advised by Nikhil Singh (Science and Art of Human-AI Systems Lab) and Soroush Vosoughi (Minds, Machines, and Society Lab). Before this, I did my undergrad at New York University, where I wrote my undergraduate thesis with the Multimodal Agentic Personalization Systems (MAPS) Group.

News