Our research centers on efficient and intelligent computing. We focus on advancing the efficiency frontier of AI, thereby broadening its practical applicability, and we emphasize a balanced integration of algorithms and hardware through automated tools. Our methodology combines a top-down approach to AI algorithm development with bottom-up innovation in AI hardware accelerators, and then bridges the two perspectives by developing automated co-exploration and co-search techniques for efficient AI algorithms and accelerators. The primary objective is to maximize AI acceleration efficiency and facilitate the rapid development of AI solutions.

  • Top-Down: Hardware-Aware Efficient AI/DNN Algorithms
    • At the algorithm level, we advocate that AI algorithms should be designed with acute awareness of the hardware of their target devices, each of which often has distinct storage and processing capabilities. Representative work includes the following: Early-Bird Tickets at ICLR'20 (spotlight paper, ranked top 3%), CPT-Train at ICLR'21 (spotlight paper, ranked top 3%), SuperTickets at ECCV'22, and Hint-Aug at CVPR'23.

  • Bottom-Up: Algorithm-Aware Efficient AI/DNN Accelerators
    • At the hardware level, we emphasize that AI accelerator design should go beyond conventional methods and exploit algorithmic opportunities. Representative work includes the following: SmartExchange at ISCA'20, ViTCoD at HPCA'23, Instant-3D at ISCA'23, and Gen-NeRF at ISCA'23.

  • Bridging: Automated Tools for Facilitating Fast Development of Efficient AI Solutions
    • Our group has been at the forefront of developing automated tools for creating efficient AI solutions, aiming both to maximize the achievable efficiency and to speed up development. Representative work includes the following: AdaDeep at MobiSys'18, HW-NAS-Bench at ICLR'21 (spotlight paper, ranked top 3%), Auto-NBA at ICML'21, and GPT4AIGChip at ICCAD'23.

  • System Integration and Demonstration Towards Real-World Applications
    • In pursuit of ubiquitous on-device intelligence and green AI, we consistently strive to validate our techniques on real-world devices by using commercial devices or designing custom FPGA/ASIC accelerators for system integration and demonstration. Representative work includes the following: First Place in the ACM/IEEE TinyML Design Contest at ICCAD'22 and EyeCoD at IEEE Micro's Top Picks of 2023.

    • i-FlatCam (ASIC): A 253 FPS, 91.49 µJ/Frame Ultra-Compact Intelligent Lensless Camera. Won 1st Place in Best University Demo at DAC'2022.
    • Gen-NeRF (FPGA): Real-time, Low-power, and Generalizable Scene Rendering and Segmentation based on NeRFs with Interactive View Control. Won 2nd Place in Best University Demo at DAC'2023.