Hi there 👋

I have been a Ph.D. student at Nankai University since August 2022, supervised by Prof. Yaxing Wang. I obtained my master’s degree in Computer Technology from the College of Computer Science, Nankai University.

My research interests include Generative Models, Image Generation, and Image-to-image Translation.

I’m currently working on image editing and efficient inference, including:

🎨 Image editing based on Generative Models (GANs and Diffusion Models).

🚀 Inference acceleration via training-free or data-free distillation.

🔥 News

  • 2024.09:  🥳🥳 Our paper “Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference” accepted by NeurIPS’24.
  • 2024.01:  🥳🥳 Our paper “Get What You Want, Not What You Don’t: Image Content Suppression for Text-to-Image Diffusion Models” accepted by ICLR’24. See our paper and code.
  • 2023.12:  🎉🎉 Our new work, FasterDiffusion: Rethinking the Role of UNet Encoder in Diffusion Models. See our paper and code.
  • 2023.02:  🥳🥳 Our paper “3D-Aware Multi-Class Image-to-Image Translation with NeRFs” accepted by CVPR’23. See our paper and code.
  • 2020.12:  🥳🥳 Our paper “Low-rank Constrained Super-Resolution for Mixed-Resolution Multiview Video” accepted by TIP’20. See our paper and code.

📝 Publications

ICLR 2024

Get What You Want, Not What You Don’t: Image Content Suppression for Text-to-Image Diffusion Models

Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang

  • The [EOT] embeddings contain significant, redundant and duplicated semantic information of the whole input prompt.
  • We propose soft-weighted regularization (SWR) to eliminate the negative target information from the [EOT] embeddings.
  • We propose inference-time text embedding optimization (ITO).
[paper] [code] [abstract] [Chinese-language explainer]
arXiv

Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models

Senmao Li, Taihang Hu, Fahad Khan, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang

  • A thorough empirical study of UNet features in diffusion models, showing that encoder features vary minimally across time steps (whereas decoder features vary significantly)
  • An encoder propagation scheme that accelerates diffusion sampling without any training or fine-tuning
  • Our method can be combined with existing methods (such as DDIM and DPM-Solver) to further reduce diffusion model inference time
  • ~1.8x acceleration for Stable Diffusion with 50 DDIM steps, ~1.8x acceleration for Stable Diffusion with 20 DPM-Solver++ steps, and ~1.3x acceleration for DeepFloyd-IF
[paper] [code] [abstract]
arXiv

StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, Jian Yang

  • Optimizing only the input of the value linear network in the cross-attention layers is sufficient to reconstruct a real image
  • Attention regularization preserves the object-like attention maps after reconstruction and editing, enabling accurate style editing without introducing significant structural changes
[paper] [code] [abstract]
CVPR 2023

3D-Aware Multi-Class Image-to-Image Translation with NeRFs

Senmao Li, Joost van de Weijer, Yaxing Wang, Fahad Shahbaz Khan, Meiqin Liu, Jian Yang

  • The first to explore 3D-aware multi-class I2I translation
  • Decouple 3D-aware I2I translation into two steps
[paper] [code] [abstract]

📄 Academic Service

  • Conference Reviewer: NeurIPS’24

💻 Internships