Learning in an Agentic World

Foundations and Challenges

Theoretical foundations for learning with agents, strategic and adaptive environments, reliability, personalization, and LLM-based agentic systems.

About the Workshop

AI agents are becoming increasingly powerful and prevalent, interacting with the world on behalf of users and organizations. These systems raise foundational questions around reliability, adaptivity, personalization, strategic behavior, and learning from feedback in non-i.i.d. environments.

This workshop aims to bring together researchers in learning theory, online learning, game theory, statistical learning, and modern AI systems to develop rigorous frameworks for learning in the agentic era. Topics include learning with and from agents, strategic classification, performative prediction, multi-agent learning, verification and reliability, preference learning, personalization, and theory for LLM agents.

Abstract and Objectives

AI agents are becoming increasingly powerful and prevalent, used to interact in the world on behalf of their owners. However, they often don’t know what they don’t know, can be overconfident or convinced to behave inappropriately, and in general raise concerns around safety, reliability, personalization, and adaptivity. One challenge is that agentic systems operate in interactive, non-i.i.d. environments where decisions influence future data, users respond strategically, and objectives may evolve over time. These features call for theoretical frameworks that move beyond static prediction toward learning in dynamic, feedback-driven, multi-agent ecosystems.

This workshop aims to bring together the learning theory community to discuss work towards principled foundations for learning in the agentic era. We will explore models and guarantees for learning with and from agents, including strategic classification and performative prediction, multi-agent and game-theoretic learning, and settings with adaptive or partially aligned participants. A central theme is reliability: how can we design learning algorithms and agentic systems that are robust to distribution shift, adversarial behavior, and reasoning errors? We will highlight emerging challenges posed by large language models (LLMs), including learning to generate, verify, and refine reasoning processes in order to achieve trustworthy outputs and decisions.

We also invite work on AI alignment (including personalization and preference learning) in agentic systems, where agents must adapt to heterogeneous users while maintaining privacy, fairness, or generalization guarantees. This includes learning user preferences from interaction, designing personalized agents, and understanding the statistical and online complexity of adaptive decision-making. By synthesizing ideas from online learning, statistical learning theory, game theory, and modern AI systems, the workshop seeks to identify key open problems and catalyze new directions for rigorous, reliable, and adaptive learning in the presence of autonomous agents.

Keynote Speakers

Maria-Florina Balcan

Carnegie Mellon University

Nika Haghtalab

UC Berkeley

Yian Ma

UC San Diego

Tentative Schedule

Time Activity
8:30-8:40 Welcome remarks
8:40-9:10 Maria-Florina Balcan: Learning Verifiers for Chain of Thought Reasoning
Abstract:

Large language models (LLMs) with chain-of-thought generation have demonstrated tremendous potential for solving complex reasoning and planning tasks across many domains. However, the outputs of current LLMs are not always fully reliable or aligned with human preferences, making effective verification essential. In this talk, I will present new frameworks and algorithms for learning verifiers that detect when an LLM’s reasoning goes off track. These learned verifiers help mitigate catastrophic failure modes of LLMs (e.g., they can appear convincing while being wrong) and, more broadly, improve the reliability and reasoning capabilities of modern LLMs.

9:10-9:40 Yian Ma: On the rationality of multi-agent learning
Abstract:

I will discuss an interesting phenomenon in multi-agent learning: mixed Nash equilibria are uniformly stable if and only if they are collectively rational. This justifies the effusive use of multi-agent learning systems and resolves the 'as if' rationality problem in classical economics. If partial knowledge (about utility or algorithm) can be obtained about the opponents, then the agent can steer towards its favorable Stackelberg equilibria, surpassing Nash outcomes.

9:40-10:10 Nika Haghtalab: Distortion of AI Alignment: Does Preference Optimization optimize for preferences?
Abstract:

After pre-training, large language models are aligned with human preferences based on pairwise comparisons. State-of-the-art alignment methods (such as PPO-based RLHF and DPO) are built on the assumption of aligning with a single preference model, despite being deployed in settings where users have diverse preferences. As a result, it is not even clear that these alignment methods produce models that satisfy users on average, which is a minimal requirement. Drawing on social choice theory and modeling users’ comparisons through individual Bradley-Terry (BT) models, we introduce an alignment method’s distortion: the worst-case ratio between the optimal achievable average utility and the average utility of the learned policy. The notion of distortion helps draw sharp distinctions between alignment methods: Nash Learning from Human Feedback achieves the minimax optimal distortion of a constant. We also give a fine-grained understanding of the distortion of RLHF (PPO or DPO based), which can suffer unbounded distortion in the worst case.

12:30-14:00 Poster Session

Call for Abstracts

We invite submissions of short abstracts of at most one page, plus an optional but recommended link to a paper, describing recent results, work in progress, or open problems related to the workshop themes. Accepted abstracts will be invited for poster presentation at the in-person workshop.

Relevant topics include, but are not limited to:

Submission Instructions

Email your submissions in PDF format to law2026colt@gmail.com. Submissions should use a font size of at least 10pt and margins of at least 1 inch.

Accepted Abstracts

Titles and authors are listed below.

Distribution-Free Sequential Prediction with Abstentions Jialin Yu; Moïse Blanchard
Resource-Bounded Discovery of Reusable Actions Roshan Klein-Seetharaman
How Recursive Language Models Generalize Chenxiao Yang; Zhiyuan Li; David McAllester; Nathan Srebro
Near-Optimal Last-Iterate Convergence for Zero-Sum Games with Bandit Feedback and Opponent Actions Soumita Hait; Ping Li; Haipeng Luo; Mengxiao Zhang
Selective Rigidity: An Impossibility Result and Benchmark for Identity-Preserving Agent Learning Shubham Chakraborty; Sneh Nandu; Anupam Srivastava
Subsidy Design for Better Social Outcomes Maria-Florina Balcan; Matteo Pozzi; Dravyansh Sharma
Safe Learning of Multi-Agent Action Models from Concurrent Joint Action Observations Argaman Mordoch; Ori Karat; Lea Shmilovich; Yarin Benjamin; Brendan Juba; Roni Stern
Hair-Trigger Alignment: Black-Box Evaluation Cannot Guarantee Post-Update Alignment Yavuz Bakman; Duygu Nur Yaldiz; Salman Avestimehr; Sai Praneeth Karimireddy; Eleni Triantafillou; Peter Kairouz
Two-Sided Time-Independent Regret for Matching Markets with Limited Interviews Amirmahdi Mirfakhar; Xuchuang Wang; Mengfan Xu; Hedyeh Beyhaghi; Mohammad Hajiesmaili
On Randomized Algorithms in Online Strategic Classification Chase Hutton; Adam Melrod; Han Shao
Last-Iterate Convergence for Symmetric, General-Sum, 2 × 2 Games Under the Exponential Weights Dynamic Guanghui Wang; Krishna Acharya; Lokranjan Lakshmikanthan; Juba Ziani; Vidya Muthukumar
Reaching a Consensus in Predictive Loops Jiduan Wu; Rediet Abebe; Celestine Mendler-Dünner
PAC Learning with Improvements Idan Attias; Avrim Blum; Keziah Naggita; Donya Saless; Dravyansh Sharma; Matthew Walter
Geometry-Aware, Adaptive Risk for Agents: A Coherent-Risk and Wasserstein-Robust Foundation for Learning under Interaction and Evolving Uncertainty Deep Ganguly
Regularized Robustly Reliable Learning Avrim Blum; Donya Saless
Strategic PAC Learnability via Geometric Definability Yuval Filmus; Shay Moran; Elizaveta Nesterova; Nir Rosenfeld; Alexander Shlimovich
Learning to Price with Persuasion Maria-Florina Balcan; Tejas Pagare; Karan Singh
Important Dates

Abstract submission deadline: June 7, 2026*
Notification of acceptance: June 8, 2026
Workshop date: June 29, 2026

*Submissions after the deadline might be considered depending on space.

Organizers

Hedyeh Beyhaghi

University of Massachusetts Amherst

hbeyhaghi@umass.edu

Avrim Blum

Toyota Technological Institute at Chicago

avrim@ttic.edu

Han Shao

University of Maryland, College Park

hanshao@umd.edu

Dravyansh Sharma

Toyota Technological Institute at Chicago

dravy@ttic.edu

Registration

Participants should follow the COLT 2026 registration instructions once available.

Interested in attending?

Participation is included in COLT registration.
