Curriculum Vitae
General Information
| Full Name | Youwei Xiao (肖有为) |
| Email | shallwe@pku.edu.cn |
| Phone | +86 185 1920 4005 |
Education
-
2022 - Present Beijing, China
Ph.D. Candidate in Integrated Circuit Science and Engineering
School of Integrated Circuits, Peking University
- Advisor: Prof. Yun Liang
- Research: Compilers, DSLs, and hardware-software co-design for MLSys and EDA
-
2018 - 2022 Beijing, China
Bachelor of Science in Computer Science
School of EECS, Peking University
Research Interests
-
Hardware Synthesis & EDA
- Compiler infrastructures and domain-specific languages for hardware synthesis (Hector, Cement, SkyEgg)
- Multi-level intermediate representations for HLS
- E-graph based optimization for hardware synthesis
-
Computer Architecture
- Compiler-guided accelerator generation and instruction specialization (Cayman, ISAMORE, APS)
- Custom instruction design and reuse
- Hardware-software co-design frameworks
-
ML Compilers & Systems
- Retargetable tensor compilers, execution runtimes, and superoptimizers (HTP, PTO Runtime, EggMind, Hive)
- Distributed and mega-kernel compilation
- LLM inference optimization
Selected Publications
-
2026 ISAMORE: Finding Reusable Instructions via E-Graph Anti-Unification
ASPLOS 2026 (Best Paper Award, 5/1048) - Youwei Xiao, Chenyun Yin, Yitian Sun, Yun Liang
-
2025 APS: Open-Source Hardware-Software Co-Design Framework for Agile Processor Specialization
ICCAD 2025 (Invited Paper) - Youwei Xiao, Yuyang Zou, Yansong Xu, et al.
-
2025 Clay: High-level ASIP Framework for Flexible Microarchitecture-Aware Instruction Customization
ICCAD 2025 - Weijie Peng*, Youwei Xiao*, Yuyang Zou, et al. (*Equal contribution)
-
2025 Cayman: Custom Accelerator Generation with Control Flow and Data Access Optimization
DAC 2025 - Youwei Xiao, Fan Cui, Zizhang Luo, Weijie Peng, Yun Liang
-
2024 Cement: Streamlining FPGA Hardware Design with Cycle-Deterministic eHDL and Synthesis
FPGA 2024 - Youwei Xiao, Zizhang Luo, Kexing Zhou, Yun Liang
-
2022 HECTOR: A Multi-Level Intermediate Representation for Hardware Synthesis Methodologies
ICCAD 2022 - Ruifan Xu, Youwei Xiao, Jin Luo, Yun Liang
Honors and Awards
-
March 2026 ASPLOS 2026 Best Paper Award
"Finding Reusable Instructions via E-Graph Anti-Unification" (ISAMORE), recognized with the ASPLOS 2026 Best Paper Award (5/1048)
-
2018-2019 Yang Fuqing & Wang Yangyuan Academician Scholarship
Awarded to only 5 students at EECS, Peking University, for academic excellence
-
2019-2020 Shenzhen Stock Exchange Scholarship
Awarded to 17 students from EECS, Peking University
-
2018-2020 Merit Student
Awarded to the top 10% of students for comprehensive excellence
-
August 2020 Second Place in EDAthon 2020
IEEE CEDA Hong Kong Chapter - Programming Competition on Electronic Design Automation
-
July 2017 Second Prize in NOI 2017
National Olympiad in Informatics, China Computer Federation
-
May 2017 Second Prize in APIO 2017
Asia Pacific Informatics Olympiad (China District), China Computer Federation
Tutorials
-
January 2026 APS: An MLIR-Based Hardware-Software Co-design Framework
ASP-DAC 2026, Tokyo, Japan
-
May 2025 Agile Hardware Specialization: A toolbox for Agile Chip Front-end Design
ISEDA 2025, Xi'an, China
-
March 2025 Agile Hardware Specialization (AHS): A toolbox for Agile Chip Front-end Design
ASPLOS 2025, Rotterdam, Netherlands
-
March 2025 Agile Hardware Specialization: A toolbox for Agile Chip Front-end Design
DATE 2025, Lyon, France
-
January 2025 AHS: An EDA toolbox for Agile Chip Front-end Design
ASP-DAC 2025, Tokyo, Japan
Selected Projects
-
2025 - Present ISAMORE
Reusable instruction discovery with e-graph anti-unification
- End-to-end framework for discovering reusable custom instructions from domain applications.
- Recognized with the ASPLOS 2026 Best Paper Award.
-
2024 - Present APS / Aquas
MLIR-based hardware-software co-design framework for agile processor specialization
- Initiated and led the project toward a fully integrated co-design toolchain.
- Combines hardware synthesis, retargetable compilation, and custom instruction adoption.
-
2024 - 2025 Cayman
Automatic accelerator generation with control-flow and data-access optimization
- Built an end-to-end framework for accelerator candidate selection and synthesis.
-
2024 - 2025 SkyEgg
Joint implementation selection and scheduling using e-graphs
- Explores e-graph-based optimization for hardware synthesis implementation choices.
-
2023 - 2024 Cement
Cycle-deterministic embedded HDL and compiler framework in Rust
- Developed productive FPGA design abstractions with timing-aware control synthesis.
-
2021 - 2022 Hector
Multi-level intermediate representation for hardware synthesis on MLIR
- Built a unified two-level IR infrastructure for HLS and hardware generators.
-
2025 - Present HTP
Human- and LLM-friendly retargetable compiler substrate with inspectable IRs
- Builds interpretable compiler artifacts for heterogeneous tensor programs across NVIDIA, Qualcomm Hexagon, and Huawei Ascend.
-
2025 - Present PTO Runtime
Task-graph runtime for distributed tensor compilation and host-device execution
- Turns compiled dependency graphs into structured host-device execution for large-scale tensor workloads.
-
2025 - Present EggMind
Execution-guided agentic superoptimizer for scalable equality saturation
- Uses bounded search with diagnosis and execution feedback to discover better rewrite strategies.
-
2025 - Present Hive
Explicit orchestration substrate for programmable multi-agent systems
- Separates workflow control, state transitions, and serving boundaries for programmable multi-agent execution.
Teaching Experience
-
Spring 2022 & Fall 2022 Teaching Assistant - High-level Chip Design
Peking University
-
Fall 2020 Teaching Assistant - Introduction to Computer Systems
Peking University
Skills
-
Programming Languages
- Rust, C++, Python, Scala, Verilog, CUDA
-
Tools & Frameworks
- Compiler Infrastructure: MLIR, LLVM
- ML Frameworks: PyTorch, TileLang, Triton