Understanding Distribution Learning of Diffusion Models via Low-Dimensional Modeling

Thursday, March 13, 2025, 11 a.m. to noon

Speaker: Dr. Peng Wang

From: University of Michigan

Abstract

Recent empirical studies have demonstrated that diffusion models can effectively learn the image distribution and generate new samples. Remarkably, these models achieve this even with a small number of training samples despite a large image dimension, circumventing the curse of dimensionality. In this work, we provide theoretical insights into this phenomenon by leveraging two key empirical observations: (i) the low intrinsic dimensionality of image datasets and (ii) the low-rank property of the denoising autoencoder in trained diffusion models. These observations motivate us to model the underlying data distribution as a mixture of low-rank Gaussians and to parameterize the denoising autoencoder as a low-rank model. With this setup, we rigorously show that optimizing the training loss of diffusion models is equivalent to solving the canonical subspace clustering problem over the training samples. This insight has practical implications for training and controlling diffusion models. Specifically, it allows us to precisely characterize the minimal number of samples needed to correctly learn the low-rank data support, shedding light on the phase transition from memorization to generalization. Moreover, we empirically establish a correspondence between the subspaces and the semantic representations of image data, which facilitates image editing. We corroborate these findings with experiments on both simulated distributions and image datasets.
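
As a rough, self-contained illustration of the two modeling assumptions in the abstract (not the speaker's implementation), the Python sketch below samples data from a mixture of low-rank Gaussians and parameterizes the denoising autoencoder as a low-rank linear map. All dimensions, the noise level sigma, the rank r, and the PCA-style fit of the subspace are illustrative assumptions, not details from the talk.

import numpy as np

rng = np.random.default_rng(0)

# Ambient dimension n; K mixture components, each supported on a d-dimensional
# subspace with d << n; N training samples.
n, K, d, N = 64, 3, 4, 300

# Mixture of low-rank Gaussians: each sample is x = U_k a, where U_k is an
# orthonormal basis of component k's subspace and a is a standard Gaussian vector.
bases = [np.linalg.qr(rng.standard_normal((n, d)))[0] for _ in range(K)]
labels = rng.integers(K, size=N)
X = np.stack([bases[k] @ rng.standard_normal(d) for k in labels])  # shape (N, n)

# Low-rank parameterization of the denoising autoencoder: x_hat(y) = U (U^T y)
# with U in R^{n x r}. Here U is fit by PCA on the noisy samples, a crude
# stand-in for minimizing an actual denoising loss under this low-rank model.
sigma, r = 0.1, K * d
Y = X + sigma * rng.standard_normal(X.shape)          # noisy training samples
U = np.linalg.svd(Y, full_matrices=False)[2][:r].T    # top-r right singular vectors, (n, r)

def denoise(y):
    # Project the noisy sample onto the learned r-dimensional subspace.
    return U @ (U.T @ y)

# Once N is large enough to identify the (K*d)-dimensional support, the
# reconstruction error of the clean data is small.
err = np.linalg.norm(denoise(Y.T) - X.T) / np.linalg.norm(X)
print(f"relative denoising error: {err:.3f}")

In this toy setting, the r = K*d-dimensional span recovered by PCA plays the role of the low-rank data support; the talk's stronger claim is that optimizing the diffusion training loss recovers the individual subspaces U_k, i.e., solves the subspace clustering problem, rather than merely their union.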

Locations:

TC2: 222

Calendar:

CS/CRCV Seminars

Category:

Speaker/Lecture/Seminar

Tags:

UCFCRCV