Hi there 👋

I have been a Ph.D. student at Nankai University since August 2022, supervised by Prof. Yaxing Wang. I obtained my master’s degree in Computer Technology from the College of Computer Science, Nankai University.

My research interests include Generative Models, Image Generation, and Image-to-Image Translation.

I am currently conducting research on image editing and efficient inference, including:

🎨 Image editing based on Generative Models (GANs and Diffusion Models).

🚀 Acceleration of inference via training-free methods or data-free distillation.

🔥 News

  • [2025.01]  🥳🥳 Two papers (including one co-authored paper) accepted to ICLR 2025, among them InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration. See paper and Project Page.
  • [2024.09]  🥳🥳 StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing accepted to CVMJ 2024. See paper and code.
  • [2024.09]  🥳🥳 Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference accepted to NeurIPS 2024. See paper and Project Page.
  • [2024.01]  🥳🥳 Get What You Want, Not What You Don’t: Image Content Suppression for Text-to-Image Diffusion Models accepted to ICLR 2024. See paper and code.
  • [2023.12]  🎉🎉 New work Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models. See paper and code.
  • [2023.02]  🥳🥳 3D-Aware Multi-Class Image-to-Image Translation with NeRFs accepted to CVPR 2023. See paper and code.
  • [2020.12]  🥳🥳 Low-rank Constrained Super-Resolution for Mixed-Resolution Multiview Video accepted to TIP 2020. See paper and code.

📝 Publications

ICLR 2025

InterLCM: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

Senmao Li, Kai Wang*, Joost van de Weijer, Fahad Shahbaz Khan, Chun-Le Guo, Shiqi Yang, Yaxing Wang, Jian Yang, Ming-Ming Cheng

  • By treating the low-quality image as an intermediate state of the latent consistency model (LCM), we effectively maintain semantic consistency in face restoration.
  • Since the LCM maps each intermediate state back to a clean-image-level point, InterLCM offers additional advantages: few-step sampling with much faster inference, and easy integration with the perceptual and adversarial losses commonly used in face restoration (a minimal sketch of the idea follows the links below).
[paper] [code] [abstract]
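For intuition, here is a minimal PyTorch sketch of the idea, assuming toy stand-ins for the real components (`vae_encoder`, `vae_decoder`, `consistency_fn`, and the re-noising rule are illustrative placeholders, not the released implementation):

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real VAE and consistency model (illustrative only).
vae_encoder = nn.Conv2d(3, 4, 3, stride=2, padding=1)           # image -> latent
vae_decoder = nn.ConvTranspose2d(4, 3, 4, stride=2, padding=1)  # latent -> image
consistency_fn = nn.Conv2d(4, 4, 3, padding=1)                  # state -> clean-latent estimate

def interlcm_restore(lq_image, t_start=0.4, num_steps=4):
    """Treat the low-quality latent as the LCM state at time t_start,
    then take a few consistency steps toward t = 0."""
    z = vae_encoder(lq_image)                      # LQ latent plays the role of x_t
    for t in torch.linspace(t_start, 0.0, num_steps):
        z0_hat = consistency_fn(z)                 # map the state to a clean-latent point
        z = z0_hat + t * torch.randn_like(z0_hat)  # toy re-noising to the next level
    return vae_decoder(z0_hat)

restored = interlcm_restore(torch.randn(1, 3, 64, 64))
print(restored.shape)  # torch.Size([1, 3, 64, 64])
```

Because restoration starts from a partially denoised state rather than pure noise, only a handful of steps is needed, which is where the speedup comes from.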
NeurIPS 2024

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference

Senmao Li, Taihang Hu, Joost van de Weijer, Fahad Shahbaz Khan, Linxuan Li, Shiqi Yang, Yaxing Wang*, Ming-Ming Cheng, Jian Yang

  • A thorough empirical study of UNet features in diffusion models, showing that encoder features vary minimally across timesteps whereas decoder features vary significantly.
  • An encoder propagation scheme that accelerates diffusion sampling without any training or fine-tuning (see the sketch after this entry).
  • Our method can be combined with existing samplers (e.g., DDIM and DPM-Solver) to further reduce diffusion model inference time.
  • ~1.8x acceleration for Stable Diffusion with 50 DDIM steps, ~1.8x acceleration with 20 DPM-Solver++ steps, and ~1.3x acceleration for DeepFloyd IF.
[paper] | [Chinese translation] [code] [abstract]
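A rough sketch of the encoder-propagation idea under toy assumptions (the `encoder`/`decoder` split and the update rule stand in for the real UNet and scheduler):

```python
import torch
import torch.nn as nn

# Toy split of a diffusion UNet into encoder and decoder halves (illustrative).
encoder = nn.Conv2d(4, 8, 3, padding=1)
decoder = nn.Conv2d(8, 4, 3, padding=1)

def sample_with_encoder_propagation(z, num_steps=10, key_every=2):
    """Run the encoder only at 'key' timesteps; at the remaining timesteps,
    reuse (propagate) the cached encoder features and run only the decoder.
    No training or fine-tuning is involved."""
    cached_feats = None
    for step in range(num_steps):
        if step % key_every == 0 or cached_feats is None:
            cached_feats = encoder(z)   # full forward pass at key timesteps
        eps = decoder(cached_feats)     # decoder runs at every timestep
        z = z - 0.1 * eps               # toy update in place of DDIM / DPM-Solver
    return z

out = sample_with_encoder_propagation(torch.randn(1, 4, 32, 32))
print(out.shape)  # torch.Size([1, 4, 32, 32])
```

Since encoder features change little between adjacent timesteps (the paper's empirical finding), skipping the encoder at non-key steps trades a negligible accuracy loss for a sizeable speedup.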
ICLR 2024

Get What You Want, Not What You Don’t: Image Content Suppression for Text-to-Image Diffusion Model

Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang*, Jian Yang

  • The [EOT] embeddings contain significant, redundant, and duplicated semantic information about the whole input prompt.
  • We propose soft-weighted regularization (SWR) to eliminate the negative-target information from the [EOT] embeddings (a schematic sketch follows the links below).
  • We propose inference-time text embedding optimization (ITO) to further suppress the negative target and encourage the desired content.
[paper] | [Chinese translation] [code] [abstract] [Chinese explainer]
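Schematically, SWR can be pictured as an SVD re-weighting of the embedding matrix carrying the negative-target information; the exponential weighting and dimensions below are illustrative, not necessarily the paper's exact recipe:

```python
import torch

def soft_weighted_regularization(embeds):
    """embeds: (num_tokens, dim) matrix of [EOT] (and negative-target) embeddings.
    Down-weight the dominant singular directions, which carry the prompt's
    main -- here unwanted -- semantics."""
    U, S, Vh = torch.linalg.svd(embeds, full_matrices=False)
    soft_weights = torch.exp(-S)        # large singular values -> weights near 0
    return U @ torch.diag(soft_weights * S) @ Vh

eot_embeds = torch.randn(10, 768)       # toy [EOT] embeddings (CLIP-like dim)
print(soft_weighted_regularization(eot_embeds).shape)  # torch.Size([10, 768])
```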
CVMJ 2024

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang*, Jian Yang

  • Optimizing only the input of the value linear layer in the cross-attention blocks is sufficiently powerful to reconstruct a real image (see the sketch after this entry).
  • An attention regularization preserves the object-like attention maps after reconstruction and editing, enabling accurate style editing without significant structural changes.
[paper] | [Chinese translation] [code] [abstract] [Chinese explainer]
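Schematically, the inversion optimizes only the input to the value projection, while the key path keeps the frozen prompt embedding so the attention maps stay prompt-aligned. A toy single-head sketch (names and dimensions are illustrative):

```python
import torch
import torch.nn as nn

dim, n_tokens, n_pixels = 64, 8, 16             # toy sizes
to_q = nn.Linear(dim, dim, bias=False)          # query projection (frozen)
to_k = nn.Linear(dim, dim, bias=False)          # key projection (frozen)
to_v = nn.Linear(dim, dim, bias=False)          # value projection (frozen)

image_feats = torch.randn(n_pixels, dim)        # spatial features -> queries
frozen_text = torch.randn(n_tokens, dim)        # prompt embedding -> keys
learned_v_input = nn.Parameter(frozen_text.clone())  # the ONLY optimized tensor

q, k = to_q(image_feats), to_k(frozen_text)
v = to_v(learned_v_input)                       # value path uses the learned input
attn = torch.softmax(q @ k.T / dim ** 0.5, dim=-1)  # object-like attention maps
out = attn @ v                                  # reconstruction signal
print(out.shape)  # torch.Size([16, 64])
```

Because the keys come from the frozen prompt, the attention maps keep their object-like structure, which the attention regularization then preserves during editing.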
CVPR 2023

3D-Aware Multi-Class Image-to-Image Translation with NeRFs

Senmao Li, Joost van de Weijer, Yaxing Wang*, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang

  • The first work to explore 3D-aware multi-class I2I translation.
  • Decouples 3D-aware I2I translation into two steps: a multi-class 3D-aware generation step and a 3D-aware I2I translation step.
[paper] | [Chinese translation] [code] [abstract]

📄 Academic Service

  • Conference Reviewer: CVPR’25, ICLR’25, NeurIPS’24

💻 Internships