CV
Education
- M.S. in Computer Technology, University of Chinese Academy of Sciences, Institute of Automation, 2022-2025 (expected)
- GPA: 3.68/4
- Major Courses: Information Theory (93), Advanced Operating Systems (92), Principles and Applications of Computational Game Theory
- B.S. in Biotechnology, Northwest A&F University, College of Life Sciences, 2016-2020
- Major Courses: Inorganic Chemistry (97), Genomics and Proteomics, Bioinformatics
Research Experience
- June 2024-Present: Research Intern
- Camel.ai & KAUST
- Projects:
- Created StarCraft II mini-games multimodal environment
- Implemented VLM (GPT-4o) integration for StarCraft II gameplay
- Developed real-time game state understanding approaches
- May 2024-Present: Research Collaboration
- Tencent AI Lab
- Projects:
- Leading development of large-scale LLM-RL models
- Built StarCraft II community dataset
- Applied adapter tuning techniques for DI-star
- Developed RL and language model integration approaches
- June 2023-Present: Research Assistant
- Group Decision Intelligence Laboratory, UCAS
- Projects:
- Led “Large Language Models Play StarCraft II” project (NeurIPS 2024)
- Created LLM Agent framework and TextSC2 environment
- Developed multiple high-impact GitHub projects
- Leading “Adaptive Command” project research
- March 2022-June 2023: Research Assistant
- Institute of Automation, UCAS
- Projects:
- Established RL environment for StarCraft II
- Implemented PPO and D3QN algorithms
- Created CS:GO AI system using visual imitation learning
Skills
- Programming & AI/ML
- PyTorch
- LLMs
- Reinforcement Learning
- Deep Learning
- Prompt Engineering
- Game AI Development
- StarCraft II API
- CS:GO Engine
- LLM Agent Building
- Professional Gaming Achievements
- StarCraft II Grandmaster League (Korean Server)
- WESG China Regional Finals Top 24
- NESO China Regional Finals Top 24
- NSL Challenger Division Runner-up
- Languages
Publications
- Ma, W. Y., et al. (2024). “Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach.” NeurIPS 2024.
- “Token-level Direct Preference Optimization.” ICML 2024. (Third Author)
- Ma, W. Y., et al. (2024). “Adaptive Command: Real-Time Policy Adjustment via Language Models in StarCraft II.” DAI 2024.
Awards
- First Prize in MathorCup Challenge (2023)
- Second Prize in UCAS Innovation Competition (2023)