Papers on technical AI safety: auditing multi-agent systems, characterizing LLM behavior under pressure, and understanding what it takes for safety properties to survive deployment.

Auditing multi-agent systems

Tools for auditing the failure modes that only surface when LLM agents interact with each other, with adversaries, and with the world.

Behavior under pressure

Datasets, taxonomies, and training methods for the regimes where standard evaluations stop telling us anything useful.

Safety in deployment

What it takes for safety properties to survive contact with the people, institutions, and incentives that deploy AI systems.