About

I'm Yifan (亦凡) Yuan (袁), and I work on Meta's AI Systems Co-Design team on the architecture of future compute and AI infrastructure systems. My research focuses on hardware-software co-design for modern cloud datacenters, spanning accelerators, I/O, memory, networking, and distributed systems.

Before that, I spent two years as a research scientist at Intel Labs (Systems Architecture Lab). Earlier, I received my PhD from ECE@UIUC, advised by Prof. Nam Sung Kim. I'm interested in computer architecture and systems, especially building modern datacenters with emerging hardware and system software.

Before joining UIUC in 2017, I spent three years on my undergraduate studies at Zhejiang University (ZJU), focusing on computer architecture and VLSI design.

I grew up in Tianjin, a city in China known for its spirit of humor and optimism. I enjoy badminton, hiking, and cooking.


Publications

Full List (including US patents)
  • DCPerf: An Open-Source, Battle-Tested Performance Benchmark Suite for Datacenter Workloads
    ISCA 2025 (Industry Session)
    Tags: Cloud Computing, Performance Modeling and Projection
  • Dynamic Load Balancer in Intel Xeon Scalable Processor: Performance Analyses, Enhancements and Guidelines
    ISCA 2025
    Tags: Accelerator, Cloud Computing, In-network Computing
  • A4: Microarchitecture-Aware LLC Management for Datacenter Servers with Emerging I/O Devices
    ISCA 2025
    Tags: Memory Technology, Peripheral Device, Performance Characterization
  • M5: Mastering page migration and memory management for CXL-based tiered memory systems
    ASPLOS 2025 [paper]
    Tags: Memory Technology, OS
  • Demystifying a CXL Type-2 Device: A Heterogeneous Cooperative Computing Perspective
    MICRO 2024 [paper]
    Tags: Memory Technology, OS, Peripheral Device, Accelerator, Performance Characterization
  • Nomad: Non-Exclusive Memory Tiering via Transactional Page Migration
    OSDI 2024
    Tags: Memory Technology, OS
  • Intel Accelerator Ecosystem: An SoC-Oriented Perspective
    ISCA 2024 (Industry Session) [paper] [slides]
    Tags: On-chip Accelerator
  • A Quantitative Analysis and Guidelines of Data Streaming Accelerator in Modern Intel Xeon Scalable Processors
    ASPLOS 2024 [paper] [slides]
    Tags: On-chip Accelerator
  • BonsaiKV: Towards Fast, Scalable, and Persistent Key-Value Stores with Tiered, Heterogeneous Memory System
    VLDB 2024 [paper]
    Tags: Memory Technology
  • Demystifying CXL Memory with True CXL-Ready Systems and CXL Memory Devices
    MICRO 2023 [paper] [slides]
    Tags: Memory Technology, Peripheral Devices, Performance Characterization
  • STYX: Exploiting SmartNIC Capability to Reduce Datacenter Memory Tax
    ATC 2023 [paper]
    Tags: Accelerator, Datacenter Tax
  • RAMBDA: RDMA-driven Acceleration Framework for Memory-intensive µs-scale Datacenter Applications
    HPCA 2023 [paper] [slides]
    Tags: Accelerator, Cloud Computing
  • IDIO: Network-Driven, Inbound Network Data Orchestration on Server Processors
    MICRO 2022 [paper] [slides]
    Tags: I/O Subsystem, Peripheral Devices
  • Unlocking the Power of Inline Floating-Point Operations on Programmable Switches
    NSDI 2022
    Tags: ML Training, In-network Computing
  • Don’t Forget the I/O When Allocating Your LLC
    ISCA 2021 [paper] [slides]
    Tags: I/O Subsystem, Peripheral Devices
  • QEI: Query Acceleration Can be Generic and Efficient in the Cloud
    HPCA 2021 [paper] [slides]
    Tags: On-chip Accelerator, Cloud Computing
  • Data Direct I/O Characterization for Future I/O System Exploration
    ISPASS 2020 [paper] [slides] [video]
    Tags: I/O Subsystem, Peripheral Devices, Performance Modeling and Projection
  • HALO: Accelerating Flow Classification for Scalable Packet Processing in NFV
    ISCA 2019 [paper] [slides]
    Tags: On-chip Accelerator, Cloud Computing
  • Accelerating Distributed Reinforcement Learning with In-Switch Computing
    ISCA 2019 [paper] [slides]
    Tags: ML Training, In-network Computing
  • A Network-Centric Hardware/Algorithm Co-Design to Accelerate Distributed Training of Deep Neural Networks
    MICRO 2018 [paper] [slides]
    Tags: ML Training, In-network Computing

Professional Services and Activities