I’m a first-year master’s student at the Institute of Automation, Chinese Academy of Sciences (CASIA), supervised by Assoc. Prof. Shizhu He.

My research interests lie in Large Language Models (LLMs), LLM Interpretability, and LLM Reasoning. I am currently focusing on combining interpretability tools with RL techniques to improve LLMs. I believe “you can’t improve what you don’t understand”.

If my research interests you, please feel free to contact me at tanyuqiao2025@ia.ac.cn. I look forward to potential collaborations and internship opportunities🤗.

🔥 News

2025.12:🎉 Our paper “Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies” is released! We are the first to decompose the language model policy and conduct bottom-up RL training! 🎉 It ranked #2 of the day on Hugging Face Daily Papers.
2025.10:🎉 One paper is accepted by the NeurIPS 2025 Efficient Reasoning Workshop.
2025.05:🎉 One paper is accepted by ACL 2025 Main.
2025.03:🎉 One paper is accepted by ICMR 2025.

📝 Publications

[Preprint] New!!! Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies. Yuqiao Tan, Minzheng Wang, Shizhu He, et al.
[NeurIPS 2025] The Zero-Step Thinking: An Empirical Study of Mode Selection as Harder Early Exit in Reasoning Models. Yuqiao Tan, Shizhu He, Kang Liu, Jun Zhao
[ACL 2025] Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models. Yuqiao Tan, Shizhu He, Kang Liu, Jun Zhao
[Preprint] Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement. Yuqiao Tan, Shizhu He, Huanxuan Liao, Jun Zhao, Kang Liu
[ICMR 2025] MuAP: Multi-step Adaptive Prompt Learning for Vision-Language Model with Missing Modality. Ruiting Dai, Yuqiao Tan
[ICMR 2024] G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning. Ruiting Dai, Yuqiao Tan

🤖 Projects

2024.08 Baidu-AI Differential Search Index Competition: Differential search index for advertisement search based on large language models. Ranked 1st 🥇 (1/3600) in the 2024 Baidu Business AI Technology Innovation Competition (CTI).
2025.04 Chinese-Logic-RL: Exploring LLM Reasoning with Rule-based Reinforcement Learning in Chinese.

🎖 Honors and Awards

2022.12 National Scholarship, Ministry of Education.
2023.12 Soong Ching Ling Scholarship, University of Electronic Science and Technology of China.
2024.08 First Prize of the Baidu Business AI Technology Innovation Competition (CTI), Chinese Association for Artificial Intelligence.
2024.10 Outstanding Graduate of Sichuan Province.

📖 Education

2025.09 - Now, Pattern Recognition and Intelligent Systems, Institute of Automation, Chinese Academy of Sciences (CASIA)
2021.09 - 2025.06, Software Engineering, University of Electronic Science and Technology of China (UESTC)

💻 Internships