I’m a first-year master’s student at the Institute of Automation, Chinese Academy of Sciences (CASIA), supervised by A.P. Shizhu He.

My research interests lie in Large Language Models (LLMs), LLM interpretability, and LLM reasoning. I am currently focusing on combining interpretability tools with RL techniques to improve LLMs. I believe “you can’t improve what you don’t understand”.

If my research interests you, please feel free to contact me at tanyuqiao2025@ia.ac.cn. I look forward to potential collaborations and internship opportunities🤗.

🔥 News

  • 2025.12:🎉 Our paper “Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies” has been released! We are the first to decompose the language model policy and conduct bottom-up RL training! 🎉 It ranked #2 of the day on Hugging Face Daily Papers.

  • 2025.10:🎉 One paper was accepted to the NeurIPS 2025 Efficient Reasoning Workshop.
  • 2025.05:🎉 One paper was accepted to ACL 2025 Main.
  • 2025.03:🎉 One paper was accepted to ICMR 2025.

📝 Publications

🤖 Projects

  • 2024.08 Baidu-AI Differential Search Index Competition: Differential search index for advertisement search based on large language models. Ranked 1st 🥇 (1/3600) in the 2024 Baidu Business AI Technology Innovation Competition (CTI).
  • 2025.04 Chinese-Logic-RL: Exploring LLM Reasoning with Rule-based Reinforcement Learning in Chinese.

🎖 Honors and Awards

  • 2022.12 National Scholarship, Ministry of Education.
  • 2023.12 Soong Ching Ling Scholarship, University of Electronic Science and Technology of China.
  • 2024.08 First Prize, Baidu Business AI Technology Innovation Competition (CTI), Chinese Association for Artificial Intelligence.
  • 2024.10 Outstanding Graduate of Sichuan Province.

📖 Education

  • 2025.09 - Now, Pattern Recognition and Intelligent System, Institute of Automation, Chinese Academy of Sciences (CASIA)
  • 2021.09 - 2025.06, Software Engineering, University of Electronic Science and Technology of China (UESTC)

💻 Internships