Biography
I’m Danrui Qi, a final-year Ph.D. candidate at Simon Fraser University under the supervision of Prof. Jiannan Wang. I also work closely with Prof. Zhengjie Miao.
Recently I am passionate about agentic AI systems, a paradigm reshaping how we tackle complex, data-intensive problems across industries. My current work focuses on building agent-powered data systems and a self-improving multi-agent scaling system, designed to enable intelligent agents to reason, plan, and collaborate in automating real-world, data-centric tasks. Through recent research, I aim to advance agentic architectures and reimagine how we design data preparation workflow and interact with data systems.
Prior to this, my research centered on automating data preparation pipelines, including automatic feature augmentation and preprocessing for tabular data. I leveraged techniques from Bayesian Optimization and AutoML to systematically address the challenges of feature engineering, demonstrating significant improvements in downstream machine learning performance across diverse datasets.
Impact: In my PhD, I built the DB-GPT stack (17K+ GitHub stars) and Dataprep (2K+ GitHub stars) for democratizing data science and enabling natural language interactions with databases. These systems have been deployed across data science, business intelligence, automated ML, and database applications in diverse industries. My CleanAgent framework introduced one of the first LLM-based multi-agent systems for data standardization, demonstrating how agentic architectures can automate complex data preparation tasks that traditionally required extensive domain expertise and manual effort. My research techniques such as automatic feature augmentation (ICDE 2024) and automated feature preprocessing (EDBT 2024) have contributed to advancing AutoML workflows, while my contributions to MassGen (577 GitHub stars), a multi-agent scaling system, are advancing how intelligent systems collaborate to solve complex real-world tasks at scale.
🎓 Education
- 2020.09 - 2025.09 (expected), Ph.D. Candidate, Computer Science, Simon Fraser University, Burnaby, BC, Canada, under the supervision of Prof. Jiannan Wang.
- 2017.09 - 2020.07, Master of Engineering, School of Software, Tsinghua University, Beijing, China, under the supervision of Prof. Shaoxu Song.
- 2013.09 - 2017.07, Bachelor, School of Software, Tsinghua University, Beijing, China
📝 Publications
Published Papers
-
Arijit Khan, Yuyu Luo, M. Tamer Ozsu,
Danrui Qi, Jiannan Wang. “Second International Workshop on LLM+Vector Data: Agentic RAG Edition.” ICDE 2026. -
Danrui Qi, Jiannan Wang. “CleanAgent: Automating Data Standardization with LLM-based Agents.” DataAI@VLDB 2025. [paper] [code] -
Danrui Qi, Weiling Zheng, Jiannan Wang. “FeatAug: Automatic Feature Augmentation From One-to-Many Relationship Tables.” ICDE 2024. [paper] [code] -
Danrui Qi, Jinglin Peng, Yongjun He, Jiannan Wang. “Auto-FP: An Experimental Study of Automated Feature Preprocessing for Tabular Data.” EDBT 2024. [paper] [code] [talk] -
Siqiao Xue,
Danrui Qi, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, Hongjun Yang et al. “Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models.” Demonstration@VLDB 2024. [paper] -
Jinglin Peng, Weiyuan Wu, Jing Nathan Yan,
Danrui Qi, Jeffrey M. Rzeszotarski, Jiannan Wang. “User Interfaces for Exploratory Data Analysis: A Survey of Open-Source and Commercial Tools.” IEEE Data Eng. Bull. 45(3): 116-128, 2022. [paper] -
Danrui Qi. “On concise explanations of non-answers over big data.” SRC@SIGMOD 2017. [paper]
Preprints & Drafts
-
Aaron Xuxiang Tian, Ruofan Zhang, Jiayao Tang, Young Min Cho, Xueqian Li, Qiang Yi, Ji Wang, Zhunping Zhang,
Danrui Qi, Sharath Chandra Guntuku, Lyle Ungar, Tianyu Shi and Chi Wang. “Beyond the Strongest LLM: Multi-Turn Multi-Agent Orchestration vs. Single LLMs on Benchmarks.” arXiv 2025. -
Danrui Qiwith Microsoft Research collaborators. “TABLE-REDUCER: General-Purpose Models for Table Context Reduction in Diverse Table Task.” draft 2025. -
Fan Zhou, Siqiao Xue,
Danrui Qi, Wenhui Shi, Wang Zhao, Ganglin Wei, Hongyang Zhang et al. “DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models.” arXiv 2024. [paper] -
Siqiao Xue, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, Hongjun Yang, Zhiping Zhang, Jianshan He, Hongyang Zhang, Ganglin Wei, Wang Zhao, Fan Zhou,
Danrui Qi, Hong Yi, Shaodong Liu, Faqiang Chen. “DB-GPT: Empowering Database Interactions with Private Large Language Models.” arXiv 2023. [paper] [code] [demo]
🏛️ Experiences
- 2024.05 - 2025.05, Research Intern at Microsoft Research, worked with Dr. Yeye He
💻 Open-Source Projects
- 2020.7 - Present, Founder Member and Maintainer of Dataprep, which has
2.3k stars - 2023.09 - Present, Main Contributor of DB-GPT-Hub, which has
1.8k stars - 2023.09 - Present, Main Contributor of DB-GPT, which has
17.5k stars - 2024.01 - Present, Founder of CleanAgent, which has
70 stars - 2025.07 - Present, Founder Member and Maintainer of MassGen, which has
577 stars
🏛️ Services
- Workshop Co-Chair of LLM+Vector Data@ICDE 2026
- Shadow PC of VLDB 2026
- Program Committee Member of WWW 2026, CIKM 2025, IJCAI 2025 Survey Track, CIKM 2024, ICDE 2022
- Reviewer of TKDE, ICLR 2026, KDD 2025 Research & ADS Track, WACV 2025, ICLR 2025
- External Reviewer of DASFAA 2024, ICDE 2022, ICDE 2024, CIKM 2023
🏅 Awards
- 2024.01, Westak International Sales Inc. Scholarship at Simon Fraser University
- 2023.09, 2024.01, PhD Research Scholarship at Simon Fraser University
- 2020.09, Graduate Dean’s Entrance (GDES) at Simon Fraser University
- 2017.06, Outstanding Graduate Thesis Award at Tsinghua University
- 2016.10, National Inspirational Scholarship at Tsinghua University