Measuring distribution shift using only corrupted measurements
Figure 1: Comparison of the distribution shift (dashed lines), computed using clean images, and our proposed measurement-domain KL metric (solid lines), between an InD model trained on FFHQ and OOD models trained on MetFaces, AFHQ, and Microscopy. Results are shown under inpainting masks with \( p \in \{0.2, 0.5, 0.8\} \). The vertical axis shows \( \mathrm{KL} \), evaluated using the integrands in Eqs. (9) and (4) of the paper up to diffusion noise level \( \sigma \). Right: Samples from the InD and OOD datasets. Note how the proposed metric accurately tracks the KL divergence, even under high levels of corruption (smaller values of \( p \)).
Figure 2: KL divergence plotted against the noise level \( \sigma \) for InD and OOD Gaussian mixture models (GMMs). The KL divergence is computed in the image domain (blue) and in the measurement domain (red) under inpainting corruption with probability \( p \), using \( N \) InD data examples. The measurement-domain KL divergence closely tracks its image-domain counterpart, and the approximation improves as \( N \) and \( p \) increase.
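To make the Figure 2 setup concrete, the sketch below evaluates the image-domain score-difference quantity (the blue curve) for two toy Gaussian mixtures: the density of a GMM smoothed with noise level \( \sigma \) is again a GMM whose component covariances are inflated by \( \sigma^2 \mathbf{I} \), so its score is available in closed form and the integrand can be estimated by Monte Carlo at each \( \sigma \). This is an illustrative sketch rather than the paper's code; the mixture parameters, sample size, and \( \sigma \) grid are assumptions.

```python
# Illustrative sketch (not the paper's code) of the image-domain score-based KL curve for GMMs.
import torch
from torch.distributions import MultivariateNormal, Categorical, MixtureSameFamily

def gmm_at_sigma(weights, means, covs, sigma):
    """Return the sigma-smoothed mixture: component covariances are inflated by sigma^2 I."""
    d = means.shape[-1]
    return MixtureSameFamily(Categorical(weights),
                             MultivariateNormal(means, covs + (sigma ** 2) * torch.eye(d)))

def score(dist, x):
    """Score grad_x log dist(x), obtained by autograd through the mixture log-density."""
    x = x.detach().requires_grad_(True)
    logp = dist.log_prob(x).sum()
    return torch.autograd.grad(logp, x)[0]

def image_domain_integrand(p_params, q_params, sigma, n=4096):
    """Monte Carlo estimate of E_{x ~ p_sigma} ||grad log p_sigma(x) - grad log q_sigma(x)||^2 * sigma."""
    p_sigma = gmm_at_sigma(*p_params, sigma)
    q_sigma = gmm_at_sigma(*q_params, sigma)
    x = p_sigma.sample((n,))
    diff = score(p_sigma, x) - score(q_sigma, x)
    return diff.pow(2).sum(dim=1).mean() * sigma

# Toy 2-D InD and OOD mixtures that differ in their means (parameters are assumptions).
p_params = (torch.tensor([0.5, 0.5]),
            torch.tensor([[-2., 0.], [2., 0.]]),
            torch.eye(2).repeat(2, 1, 1))
q_params = (torch.tensor([0.5, 0.5]),
            torch.tensor([[-1., 1.], [3., -1.]]),
            torch.eye(2).repeat(2, 1, 1))
sigmas = torch.logspace(-2, 1, 20)
curve = torch.stack([image_domain_integrand(p_params, q_params, s) for s in sigmas])
kl_estimate = torch.trapezoid(curve, sigmas)   # area under the image-domain curve
```

The measurement-domain curve (red) would be obtained analogously by applying random inpainting masks to the samples and rescaling the score difference, as formalized in Theorem 1 below.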
Diffusion models are widely used as priors in imaging inverse problems. However, their performance often degrades under distribution shifts between the training images and the images encountered at test time. Existing methods for identifying and quantifying distribution shifts typically require access to clean test images, which are almost never available when solving inverse problems at test time. We propose a fully unsupervised metric for estimating distribution shifts using only indirect (corrupted) measurements and score functions from diffusion models trained on different datasets. We show theoretically that this metric estimates the KL divergence between the training and test image distributions. Empirically, we show that our score-based metric, computed from corrupted measurements alone, closely approximates the KL divergence computed from clean images. Motivated by this result, we show that aligning the out-of-distribution score with the in-distribution score---using only corrupted measurements---reduces the KL divergence and leads to improved reconstruction quality across multiple inverse problems.
Theorem 1. Let \( \bar{\mathbf{y}}_\sigma = \mathbf{P} \bar{\mathbf{x}} + \bar{\mathbf{n}} \) denote the noisy projected measurements at noise level \( \sigma \) according to Eq. (8). Then, the KL divergence between the InD density \( p(\mathbf{x}) \) and the OOD density \( q(\mathbf{x}) \) can be expressed as
\[ \mathrm{KL}(p(\mathbf{x}) \| q(\mathbf{x})) = \int_0^\infty \mathbb{E} \left[ \left\| \mathbf{W} \left( \nabla \log p_\sigma(\mathbf{V} \bar{\mathbf{y}}_\sigma) - \nabla \log q_\sigma(\mathbf{V} \bar{\mathbf{y}}_\sigma) \right) \right\|_2^2 \right] \sigma~ \mathrm{d}\sigma, \]
where \( \mathbf{W} = \mathbb{E}[\mathbf{P}]^{-3/2} \) is a diagonal scaling matrix, \( \mathbf{V} \) is the matrix of right singular vectors from the SVD of \( \mathbf{H} \), and the expectation is taken over \( \mathbf{P} \) and \( \bar{\mathbf{y}} \sim p(\bar{\mathbf{y}} \mid \mathbf{P}) \).
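As a sketch of how the integral in Theorem 1 could be estimated in practice from corrupted data alone, the snippet below uses Monte Carlo averaging over noise realizations and trapezoidal quadrature over \( \sigma \). It assumes random inpainting with an i.i.d. Bernoulli(\( p \)) mask, so that \( \mathbb{E}[\mathbf{P}] = p\mathbf{I} \), \( \mathbf{W} = p^{-3/2}\mathbf{I} \), and \( \mathbf{V} = \mathbf{I} \) up to ordering of singular values; the callables `score_ind` and `score_ood` are hypothetical stand-ins for the two pretrained score networks, and \( \bar{\mathbf{y}}_\sigma \) is formed by simple additive Gaussian noise, which may differ in detail from Eq. (8).

```python
import torch

def measurement_kl(score_ind, score_ood, y, keep_prob, sigmas, n_noise=8):
    """Monte Carlo estimate of KL(p || q) via the integral in Theorem 1 (sketch).

    y         : (B, D) zero-filled inpainting measurements P x-bar
    keep_prob : Bernoulli keep-probability p of the mask, so E[P] = p I and W = p^{-3/2} I
    sigmas    : increasing 1-D tensor of diffusion noise levels used for the quadrature
    score_*   : hypothetical callables approximating the noise-level-sigma scores of p and q
    """
    w = keep_prob ** (-1.5)                       # scalar stand-in for W = E[P]^{-3/2}
    integrand = []
    for sigma in sigmas:
        acc = torch.zeros(())
        for _ in range(n_noise):
            # Noisy projected measurements (assumed additive-noise form of Eq. (8)).
            y_sigma = y + sigma * torch.randn_like(y)
            diff = score_ind(y_sigma, sigma) - score_ood(y_sigma, sigma)
            # E || W (grad log p_sigma - grad log q_sigma) ||^2, averaged over the batch.
            acc = acc + (w * diff).pow(2).sum(dim=1).mean() / n_noise
        integrand.append(acc * sigma)             # multiply by sigma as in the integrand
    # Trapezoidal quadrature truncates the integral to [sigmas[0], sigmas[-1]].
    return torch.trapezoid(torch.stack(integrand), sigmas)
```

Truncating and discretizing the integral over \( \sigma \) in this way is consistent with how Figure 1 reports the metric up to a maximum diffusion noise level \( \sigma \).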