Hi, there 👋

I have been a Ph.D. student at Nankai University since August 2022, supervised by Prof. Yaxing Wang. I obtained my master’s degree in Computer Technology from the College of Computer Science, Nankai University.

My research interests include Generative Models, Image Generation, and Image-to-image Translation.

I’m currently working on image editing and efficient inference, including:

🎨 Image editing based on Generative Models (GANs and Diffusion Models).

🚀 Acceleration of inference through training-free or data-free distillation.

🔥 News

  • [2025.02]  🥳🥳 Two papers (including one co-authored paper, MaskUNet) accepted by CVPR 2025.
    ✨One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Model. See GitHub.
  • [2025.01]  🥳🥳 Two papers (including one co-authored paper, 1Prompt1Story) accepted by ICLR 2025.
    ✨InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration. See paper and Project Page.
  • [2024.09]  🥳🥳 StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing accepted by CVMJ 2024. See paper and code.
  • [2024.09]  🥳🥳 Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference accepted by NeurIPS 2024. See paper and Project Page.
  • [2024.01]  🥳🥳 Get What You Want, Not What You Don’t: Image Content Suppression for Text-to-Image Diffusion Models accepted by ICLR 2024. See paper and code.
  • [2023.12]  🎉🎉 New work Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models. See paper and code.
  • [2023.02]  🥳🥳 3D-Aware Multi-Class Image-to-Image Translation with NeRFs accepted by CVPR 2023. See paper and code.
  • [2020.12]  🥳🥳 Low-rank Constrained Super-Resolution for Mixed-Resolution Multiview Video accepted by TIP 2020. See paper and code.

📝 Publications

CVPR 2025

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Model

Senmao Li, Lei Wang, Kai Wang, Tao Liu, Jiehang Xie, Joost van de Weijer, Fahad Shahbaz Khan, Shiqi Yang, Yaxing Wang*, Jian Yang

  • We introduce the first Time-independent Unified Encoder (TiUE) architecture, a loop-free distillation approach that eliminates the need for iterative noisy-latent processing while maintaining high sampling fidelity at a time cost comparable to previous one-step methods.
[Chinese translation] [code] [abstract] [slide] [poster]
ICLR 2025

InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

Senmao Li, Kai Wang*, Joost van de Weijer, Fahad Shahbaz Khan, Chun-Le Guo, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng

  • By treating the low-quality image as an intermediate state of a latent consistency model (LCM), we maintain better semantic consistency in face restoration.
  • InterLCM has additional advantages: few-step sampling at much faster speed, and a framework that can be integrated with the perceptual and adversarial losses commonly used in face restoration.
[paper] | [Chinese translation] [code] [abstract] [poster] [demo]
NeurIPS 2024

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference

Senmao Li, Taihang Hu, Joost van de Weijer, Fahad Shahbaz Khan, Linxuan Li, Shiqi Yang, Yaxing Wang*, Ming-Ming Cheng, Jian Yang

  • A thorough empirical study of UNet features in diffusion models, showing that encoder features vary minimally across time steps (whereas decoder features vary significantly).
  • An encoder propagation scheme that accelerates diffusion sampling without requiring any training or fine-tuning.
  • ~1.8x acceleration for Stable Diffusion with 50 DDIM steps, ~1.8x acceleration for Stable Diffusion with 20 DPM-Solver++ steps, and ~1.3x acceleration for DeepFloyd-IF.
[paper] | [Chinese translation] [code] [abstract] [slide] [poster] [Chinese commentary]
ICLR 2024

Get What You Want, Not What You Don’t: Image Content Suppression for Text-to-Image Diffusion Models

Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang*, Jian Yang

  • The [EOT] embeddings contain significant, redundant and duplicated semantic information of the whole input prompt.
  • We propose soft-weighted regularization (SWR) to eliminate the negative target information from the [EOT] embeddings.
  • We propose inference-time text embedding optimization (ITO).
[paper] | [Chinese translation] [code] [abstract] [poster] [Chinese commentary]
CVMJ 2024

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang*, Jian Yang

  • Optimizing only the input of the value linear network in the cross-attention layers is sufficient to reconstruct a real image.
  • Attention regularization preserves the object-like attention maps after reconstruction and editing, enabling accurate style editing without significant structural changes.
[paper] | [Chinese translation] [code] [abstract] [Chinese commentary]
CVPR 2023

3D-Aware Multi-Class Image-to-Image Translation with NeRFs

Senmao Li, Joost van de Weijer, Yaxing Wang*, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang

  • The first to explore 3D-aware multi-class I2I translation
  • Decouple 3D-aware I2I translation into two steps
[paper] | [Chinese translation] [code] [abstract] [slide] [poster]

📄 Academic Service

💻 Internships