Generative Restore
Personalized Generative Low-light Image Denoising and Enhancement

Purdue University


Abstract

Modern cameras produce remarkably high-quality images, yet their performance in low-light conditions remains suboptimal due to fundamental limits imposed by photon shot noise and sensor read noise. While generative image restoration methods have shown promising results compared to traditional approaches, they often hallucinate content when the signal-to-noise ratio (SNR) is low. Leveraging the personalized photo galleries readily available on users' smartphones, we introduce Diffusion-based Personalized Generative Denoising (DiffPGD), a novel approach that builds a customized diffusion model for each individual user. Our key innovation is an identity-consistent (ID-consistent) physical buffer that captures the person's physical attributes from the gallery. This ID-consistent physical buffer serves as a robust prior that can be seamlessly integrated into the diffusion model to restore degraded images without the need for fine-tuning. Over a wide range of low-light testing scenarios, we show that DiffPGD achieves superior image denoising and enhancement performance compared to existing diffusion-based denoising approaches.

Motivation

Figure: Framework overview. The gallery prior constrains the diffusion model's solution space.

  • Why Gallery Photos? Smartphones today store hundreds, if not thousands, of a user's photos, captured at different times, in different places, and under different lighting conditions. While these images vary widely, they all depict the same person(s). Therefore, if the imaging goal is to take a photo of this user, the gallery on the phone is the best source from which to build a prior \(p(\mathbf{x})\). The situation is summarized in the figure above: in diffusion-based image restoration, the original solution space can be large because many candidate solutions are consistent with the noisy observation. The gallery provides a strong constraint on this search problem, allowing us to recover higher-quality images while preserving the person's identity with high precision.
  • Physical Buffers to the Rescue? Given the gallery photos, what kind of prior information is useful for restoration? Advances in computer vision have made it possible to extract detailed facial physical buffers, including albedo and normal maps, from a person's gallery of images. These physical buffers capture crucial identity-defining properties such as surface geometry and skin color, effectively encoding an individual's unique identity. At the same time, they eliminate the influence of environmental lighting, pose, and other identity-independent variables. We leverage this rich prior information to improve restoration; a minimal sketch of the extract-and-fuse step follows this list.
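To make the extract-and-fuse step concrete, below is a minimal PyTorch sketch under stated assumptions: estimate_buffers is a hypothetical stand-in for a per-image estimator such as LAP, and the per-pixel confidence-weighted averaging is one plausible form of adaptive aggregation, not necessarily the one used in the paper.

import torch
import torch.nn.functional as F

def estimate_buffers(photo):
    """Hypothetical per-image estimator (stand-in for a network such as LAP).

    Returns (albedo [3,H,W], normal [3,H,W], confidence [1,H,W]); the bodies
    below are dummies so the sketch runs end to end.
    """
    albedo = photo.clamp(0, 1)                         # pretend shading was removed
    normal = F.normalize(torch.randn(3, *photo.shape[1:]), dim=0)
    conf = torch.rand(1, *photo.shape[1:])             # per-pixel reliability
    return albedo, normal, conf

def aggregate_gallery(photos):
    """Fuse per-photo buffers into one ID-consistent pair.

    Assumed aggregation: per-pixel confidence-weighted averaging.
    """
    albedos, normals, confs = zip(*(estimate_buffers(p) for p in photos))
    w = torch.stack(confs)                             # [N,1,H,W] raw weights
    w = w / w.sum(dim=0, keepdim=True).clamp_min(1e-8) # normalize across the gallery
    albedo = (torch.stack(albedos) * w).sum(dim=0)     # weighted mean albedo
    normal = (torch.stack(normals) * w).sum(dim=0)
    return albedo, F.normalize(normal, dim=0)          # re-project normals to unit length

gallery = [torch.rand(3, 128, 128) for _ in range(8)]  # stand-in for aligned face crops
albedo, normal = aggregate_gallery(gallery)
print(albedo.shape, normal.shape)                      # both torch.Size([3, 128, 128])

Weighting by a per-pixel confidence lets sharp, well-lit gallery photos dominate the fused buffers, while noisy or occluded ones contribute less.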
Methods

Figure: The overall architecture of the proposed method. Our core idea is to use ID-consistent physical buffers, extracted from gallery photos, to constrain the generative space during the diffusion model's restoration process. Given a high-quality gallery, we use LAP (Zhang et al.) to extract the albedo and normal information from each photo, then apply adaptive aggregation to fuse the estimates across the entire gallery. The extracted albedo represents base skin color and facial appearance, while the normal map captures facial geometry. In our framework, the output physical buffers isolate the intrinsic ID properties from lighting, shading, and pose, enabling the diffusion model to consistently inject only ID-related information.
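To make "seamless integration without fine-tuning" concrete, the sketch below shows one heavily simplified reverse-diffusion step in which the fused buffers act as a guidance term on the predicted clean image. Everything here is an illustrative assumption rather than the paper's exact mechanism: eps_model stands in for a pretrained denoiser, abar is a toy noise schedule, and the flat-light Lambertian re-rendering (albedo times the clamped normal z-component) is only a crude stand-in for a proper buffer-consistency loss.

import torch

@torch.enable_grad()  # guidance needs gradients even if sampling runs under no_grad
def guided_reverse_step(x_t, t, eps_model, abar, albedo, normal, scale=0.1):
    """One DDIM-style (eta = 0) reverse step with a buffer-consistency nudge."""
    x_t = x_t.detach().requires_grad_(True)
    eps = eps_model(x_t, t)                                   # base model's noise estimate
    x0 = (x_t - (1 - abar[t]).sqrt() * eps) / abar[t].sqrt()  # predicted clean image
    # Hypothetical ID-consistency term: compare x0 to a crude flat-light
    # rendering of the gallery buffers (Lambertian, frontal light l = (0, 0, 1)).
    shading = normal[2:3].clamp(min=0.0)                      # n . l with l = (0, 0, 1)
    loss = ((x0 - albedo * shading) ** 2).mean()
    grad = torch.autograd.grad(loss, x_t)[0]
    x0 = (x0 - scale * grad).detach()                         # nudge x0 toward the ID prior
    # Standard DDIM update to step t - 1 using the corrected clean estimate.
    return abar[t - 1].sqrt() * x0 + (1 - abar[t - 1]).sqrt() * eps.detach()

# Toy usage with stand-ins for the pretrained model and the fused buffers.
albedo, normal = torch.rand(3, 128, 128), torch.randn(3, 128, 128)
eps_model = lambda x, t: torch.zeros_like(x)                  # placeholder denoiser
abar = torch.linspace(0.999, 0.01, 100)                       # toy alpha-bar schedule
x = torch.randn(1, 3, 128, 128)                               # start from pure noise
for t in range(99, 0, -1):
    x = guided_reverse_step(x, t, eps_model, abar, albedo, normal)

Because the correction enters only at sampling time, the pretrained diffusion model itself is untouched, which matches the no-fine-tuning property described above.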


BibTeX

@article{wang2024genrestore,
  title={Personalized Generative Low-light Image Denoising and Enhancement},
  author={Wang, Xijun and Chennuri, Prateek and Yuan, Yu and Ma, Bole and Zhang, Xingguang and Chan, Stanley},
  journal={arXiv preprint arXiv},
  year={2024}
}