- Results in Table 1 of the main paper on the synthetic dataset (CelebA-Test).
- Results in Table 2 of the main paper on the real-world datasets (Wider-Test, LFW-Test, WebPhoto-Test).
Diffusion models have demonstrated impressive performance in face restoration. Yet, their multi-step inference process remains computationally intensive, limiting their applicability in real-world scenarios. Moreover, existing methods often struggle to generate face images that are harmonious, realistic, and consistent with the subject’s identity. In this work, we propose OSDFace, a novel one-step diffusion model for face restoration. Specifically, we propose a visual representation embedder (VRE) to better capture prior information and understand the input face. In VRE, low-quality faces are processed by a visual tokenizer and subsequently embedded with a vector-quantized dictionary to generate visual prompts. Additionally, we incorporate a facial identity loss derived from face recognition to further ensure identity consistency. We further employ a generative adversarial network (GAN) as a guidance model to encourage distribution alignment between the restored face and the ground truth. Experimental results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics, generating high-fidelity, natural face images with high identity consistency.
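Two of the ideas above can be illustrated with a small, dependency-free sketch: looking up token features in a vector-quantized dictionary, and scoring identity consistency between two face-recognition embeddings. This is not the OSDFace implementation. The function names `quantize` and `identity_loss`, the squared-Euclidean nearest-neighbor rule, and the `1 - cosine similarity` form of the loss are all assumptions chosen for illustration; the paper's actual tokenizer, dictionary, and loss definition may differ.

```python
import math


def quantize(tokens, codebook):
    """Assign each token vector to its nearest codebook entry.

    Assumption: nearest means smallest squared Euclidean distance.
    Returns the chosen indices and the corresponding codebook vectors.
    """
    indices = []
    for token in tokens:
        dists = [
            sum((t - c) ** 2 for t, c in zip(token, entry))
            for entry in codebook
        ]
        indices.append(dists.index(min(dists)))
    return indices, [codebook[i] for i in indices]


def identity_loss(emb_restored, emb_target):
    """Hypothetical identity loss: 1 - cosine similarity of two embeddings.

    The embeddings stand in for the output of a face-recognition network,
    which is not part of this sketch.
    """
    dot = sum(a * b for a, b in zip(emb_restored, emb_target))
    norm_restored = math.sqrt(sum(a * a for a in emb_restored))
    norm_target = math.sqrt(sum(b * b for b in emb_target))
    return 1.0 - dot / (norm_restored * norm_target)
```

In this sketch, `quantize` would be called on features produced by the visual tokenizer, and the returned codebook vectors would play the role of the visual prompts. `identity_loss` is near 0 when the two embeddings point in the same direction and grows as they diverge.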
We propose OSDFace, a novel one-step diffusion model for face restoration. First, to establish a visual representation embedder (VRE), we train the autoencoder and VQ dictionary for HQ and LQ face domains using self-reconstruction and feature association loss.
@inproceedings{wang2025osdface,
title={{OSDFace}: One-Step Diffusion Model for Face Restoration},
author={Wang, Jingkai and Gong, Jue and Zhang, Lin and Chen, Zheng and Liu, Xing and Gu, Hong and Liu, Yutong and Zhang, Yulun and Yang, Xiaokang},
booktitle={CVPR},
year={2025}
}