Yuqiao Tan (谭宇乔)
M.S. Student @ CASIA
Research focuses on LLM Reasoning, LLM Interpretability, Reinforcement Learning, and Personalized Agents. Open to discussions, internships, and collaborations.
LLM Reasoning
LLM Interpretability
Reinforcement Learning
Personalized Agent
News
- [2026.01] Invited Talk at NICE: Internal Policy of LLMs and RL — [Video]
- [2025.12] Paper "Bottom-up Policy Optimization" released — Ranked #3 on Hugging Face Daily Papers!
- [2025.10] One paper accepted by NeurIPS 2025 (Efficient Reasoning Workshop).
- [2025.05] One paper accepted by ACL 2025.
- [2025.03] One paper accepted by ICMR 2025.
Publications
// 2025
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
Preprint · Code
The Zero-Step Thinking: An Empirical Study of Mode Selection as Harder Early Exit in Reasoning Models
NeurIPS 2025 Efficient Reasoning Workshop · Code
Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in LLMs
ACL 2025 · Code
RobustPT: Dynamic Disentanglement Prompt Tuning in Vision-Language Models with Missing Modalities
ICMR 2025 · Code
Projects
- Baidu-AI Differential Search Index Competition: Differential search index for advertisement search based on LLMs. Top-1 (1/3600)
Education
- M.S. — Pattern Recognition and Intelligent Systems, CASIA · 2025 – Present
- B.S. — Software Engineering, UESTC · 2021 – 2025
Internships
- Research Intern — Smart Internet Group (SIG), Tsinghua University · 2023.07 – 2024.05
- Research Intern — DCar-AI-Y, ByteDance · 2024.01 – 2024.07
Honors
- Outstanding Graduate of Sichuan Province · 2024
- First Prize, Baidu Business AI Technology Innovation Competition (CTI) · 2024
- Soong Ching Ling Scholarship, UESTC · 2023
- National Scholarship, Ministry of Education · 2022
Invited Talks
- NICE (2026.01) — Internal Policy of LLMs and Reinforcement Learning — [Video]
Services
- Reviewer: ICMR 2025, NeurIPS ER 2025