- Title: Algorithmic Phase Transitions in In-Context Learning for Markov Chains
- Speaker: Wen-Ping Cui 崔文平 (Princeton University)
- Date: March 10, 2026, 10:00
- Venue: South Building, Room 6620
Modern distributed architectures, particularly transformers, exhibit an emergent ability known as in-context learning: they can infer task-specific structure from limited input data without updating their parameters. A central theoretical challenge is to identify the architectural principles and data conditions that enable this behavior. Here, we provide a mechanistic and dynamical characterization of in-context generalization in a transformer trained on discrete stationary Markov chains. We show that training gives rise to two distinct algorithmic phases: a unigram phase and a bigram phase. Mechanistically, we demonstrate that the bigram solution is implemented through a statistical induction head. We further derive an effective theory for the learning dynamics of this induction head, explain why it forms abruptly, and show that the transition time is governed by a statistical bias in the data that guides optimization toward the generalizing solution.
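To make the two phases concrete, the following is a minimal sketch of what a unigram and a bigram in-context predictor could look like for a discrete Markov chain. It is an illustration only, not the speaker's model or code: the function names (`sample_chain`, `unigram_predict`, `bigram_predict`), the add-`alpha` smoothing, and the uniform initial state are all assumptions introduced here.

```python
import random
from collections import Counter


def sample_chain(P, length, rng):
    """Sample a sequence from a Markov chain with row-stochastic matrix P."""
    k = len(P)
    # Assumption: start from a uniform state rather than the stationary distribution.
    state = rng.randrange(k)
    seq = [state]
    for _ in range(length - 1):
        state = rng.choices(range(k), weights=P[state])[0]
        seq.append(state)
    return seq


def unigram_predict(context, k, alpha=1.0):
    """Next-token probabilities from token frequencies in the context alone."""
    counts = Counter(context)
    total = len(context) + alpha * k
    return [(counts[s] + alpha) / total for s in range(k)]


def bigram_predict(context, k, alpha=1.0):
    """Next-token probabilities from the transitions observed out of the last token."""
    last = context[-1]
    counts = [0] * k
    for a, b in zip(context, context[1:]):
        if a == last:
            counts[b] += 1
    total = sum(counts) + alpha * k
    return [(c + alpha) / total for c in counts]


if __name__ == "__main__":
    context = [0, 1, 0, 1, 0, 1, 0]  # a strictly alternating context
    print(unigram_predict(context, k=2))
    print(bigram_predict(context, k=2))
```

On the alternating context above, the unigram predictor sees four 0s and three 1s and still slightly favors 0 (probability 5/9), while the bigram predictor conditions on the last token and assigns probability 4/5 to token 1. That gap is the kind of difference the two phases in the abstract refer to.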
Biography
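Wen-Ping Cui is a postdoctoral researcher at Princeton University. Cui received a bachelor's degree in astrophysics from the University of Science and Technology of China in 2011, a master's degree in physics from the University of Bonn, Germany, in 2014, and a Ph.D. in physics from Boston College, USA, in 2021. From 2021 to 2024, Cui was a postdoctoral researcher at the Kavli Institute for Theoretical Physics, University of California, Santa Barbara. Cui's research field is biophysics, in particular the use of statistical physics to explore the fundamental principles of complex systems. Research directions include ecology and evolution, cellular sensing and response mechanisms, animal behavior, and attention learning mechanisms in artificial neural networks.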
Inviter: Pan Zhang