Youwei Xiao

Youwei Xiao (肖有为)

School of Integrated Circuits

Peking University

Beijing, China

I am a Ph.D. candidate at the School of Integrated Circuits, Peking University, advised by Prof. Yun Liang. I received my Bachelor of Science in EECS from Peking University in 2022. My research is broadly about compiler-centered hardware-software co-design for MLSys, computer architecture, and EDA: how software workloads can guide architectural customization, how those customized capabilities can be synthesized into hardware, and how compiler and system layers can make the result usable by real applications.

One thread of my work studies hardware synthesis as the implementation layer of this stack. I build EDA software abstractions that connect high-level design intent with efficient RTL-level hardware. This includes the high-level synthesis framework HECTOR (ICCAD 2022), the Rust-based hardware description language Cement (FPGA 2024), and more recent work on e-graph-based synthesis optimization in SkyEgg. Across these projects, I have used MLIR, Rust, and compiler IR design to make hardware synthesis more programmable, analyzable, and optimizable.

A second thread focuses on architecture customization. I explore how compiler analysis, application profiling, design space exploration, and formal methods can automate the discovery of useful accelerators and custom instructions. Cayman (DAC 2025) generates domain-specific accelerators while considering control flow and data access strategies. ISAMORE (ASPLOS 2026) uses e-graph anti-unification to find reusable custom instructions from equivalent program fragments.

These two threads motivated a larger goal: a fully integrated co-design toolchain that can derive architecture design, hardware implementation, and compiler support from agile specifications or even directly from target applications. I initiated and led the APS project with lab classmates toward this goal. One long-term example is generating an optimized ML ASIC solution with full ML compiler support from target ML models, with as little manual intervention as possible. I also help organize tutorials at major EDA and architecture conferences to share our work on agile hardware specialization and co-design methodologies; see the APS tutorials.

More recently, I have been extending this agenda into LLM-era compiler and system techniques. On one side, I study LLMs and agents as new interfaces for compiler optimization and hardware-software co-design, including EggMind for LLM-guided equality-saturation strategy synthesis and agentic co-design workflows such as the next-generation APS and Spine. On the other side, I work on compiler and runtime systems for emerging ML workloads, including IntelliC, a human- and LLM-friendly infrastructure for retargetable tensor compilation and superoptimization across NVIDIA, Qualcomm Hexagon, and Huawei Ascend architectures; distributed tensor compilation and runtime work such as PTO Runtime; and Hive, an inference infrastructure for multi-agent systems with programming-surface and control-layer support.

news

Mar 31, 2026 ISAMORE wins the ASPLOS 2026 Best Paper Award (5/1048)!
Mar 19, 2026 I will present EggMind at the Architecture 2.0 Workshop and ISAMORE in the main program at ASPLOS 2026.
Jan 19, 2026 Successfully held our APS-MLIR tutorial at ASP-DAC 2026 in Hong Kong!

selected publications

  1. Preprint
    LLM-Guided Strategy Synthesis for Scalable Equality Saturation
    Chenyun Yin*, Youwei Xiao*, Yuze Luo, and 2 more authors
    2026
  2. Preprint
    Hive: A Multi-Agent Infrastructure for Algorithm- and Task-Level Scaling
    Zizhang Luo, Yuhao Luo, Youwei Xiao, and 3 more authors
    2026
  3. Arch 2.0
    EggMind: LLM-Driven Two-Dimensional Intelligence for Scalable Equality Saturation
    Youwei Xiao, Chenyun Yin, and Yun Liang
    In Architecture 2.0: Workshop on AI for Computing Systems Design, 2026
  4. ASPLOS
    Finding Reusable Instructions via E-Graph Anti-Unification
    Youwei Xiao, Chenyun Yin, Yitian Sun, and 2 more authors
    In Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 (ASPLOS ’26), 2026
    Best Paper Award at ASPLOS 2026 (5/1048 submissions).
  5. Preprint
    Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIR
    Yuyang Zou, Youwei Xiao, Yansong Xu, and 6 more authors
    2025
  6. Preprint
    SkyEgg: Joint Implementation Selection and Scheduling for Hardware Synthesis using E-graphs
    Youwei Xiao, Yuyang Zou, and Yun Liang
    2025
  7. Preprint
    Cement2: Temporal Hardware Transactions for High-Level and Efficient FPGA Programming
    Youwei Xiao, Zizhang Luo, Weijie Peng, and 2 more authors
    2025
  8. ICCAD
    Invited Paper: APS: Open-Source Hardware-Software Co-Design Framework for Agile Processor Specialization
    Youwei Xiao, Yuyang Zou, Yansong Xu, and 6 more authors
    In Proceedings of the 44th IEEE/ACM International Conference on Computer-Aided Design (ICCAD ’25), 2025
  9. ICCAD
    Clay: High-level ASIP Framework for Flexible Microarchitecture-Aware Instruction Customization
    Weijie Peng*, Youwei Xiao*, Yuyang Zou, and 2 more authors
    In Proceedings of the 44th IEEE/ACM International Conference on Computer-Aided Design (ICCAD ’25), 2025
  10. DAC
    Cayman: Custom Accelerator Generation with Control Flow and Data Access Optimization
    Youwei Xiao, Fan Cui, Zizhang Luo, and 2 more authors
    In Proceedings of the 62nd ACM/IEEE Design Automation Conference (DAC ’25), 2025
  11. LATTE
    cmt2: Rule-Based Hardware Description in Rust with Temporal Semantics
    Youwei Xiao, Zizhang Luo, and Yun Liang
    In 5th Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE ’25), 2025
  12. FPGA
    An Empirical Comparison of LLM-based Hardware Design and High-level Synthesis
    Fan Cui, Youwei Xiao, Kexing Zhou, and 1 more author
    In Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’25), 2025
  13. FPGA
    Cement: Streamlining FPGA Hardware Design with Cycle-Deterministic eHDL and Synthesis
    Youwei Xiao, Zizhang Luo, Kexing Zhou, and 1 more author
    In Proceedings of the 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA ’24), 2024
  14. ICCAD
    HECTOR: A Multi-Level Intermediate Representation for Hardware Synthesis Methodologies
    Ruifan Xu, Youwei Xiao, Jin Luo, and 1 more author
    In Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design (ICCAD ’22), 2022