Unsupervised Detection of Distribution Shift in Inverse Problems using Diffusion Models

Measuring distribution shift using only corrupted measurements


Shirin Shoushtari1, Edward P. Chandler1, Yuanhao Wang1,
M. Salman Asif2, Ulugbek S. Kamilov1

1WashU   2UC Riverside


Figure 1: Comparison of the distribution shift (dashed lines), computed using clean images, and our proposed measurement-domain KL metric (solid lines) between an InD model trained on FFHQ and OOD models trained on MetFaces, AFHQ, and Microscopy. Results are shown under inpainting masks with \( p \in \{0.2, 0.5, 0.8\} \). The vertical axis shows \( \mathrm{KL} \), evaluated by integrating the integrands in equations (9) and (4) of the paper up to diffusion noise level \( \sigma \). Right: Samples from InD and OOD datasets. Note how the proposed metric accurately tracks the KL divergence, even under high levels of corruption (smaller values of \( p \)).


Figure 2: KL divergence plotted against the noise level \( \sigma \) for InD and OOD Gaussian mixture models (GMMs). The KL divergence is computed in the image domain (blue) and in the measurement domain (red) under inpainting corruption with probability \( p \), using \( N \) InD data examples. The measurement-domain KL divergence closely tracks its image-domain counterpart, and the approximation improves with increasing \( N \) and \( p \).
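For GMMs, the score of the noise-smoothed density is available in closed form, so curves of this kind can be studied in a toy setting without a trained network. The sketch below is illustrative only: the one-dimensional setting, the mixture parameters, and the function names `gmm_log_density` and `gmm_score` are assumptions, not the configuration used in the paper. It checks the closed-form score against a finite-difference derivative of the log density.

```python
import numpy as np

def _log_components(x, weights, means, stds, sigma):
    """Log of w_k * N(x; mu_k, s_k^2 + sigma^2) per component, shape (n, k)."""
    var = stds**2 + sigma**2  # Gaussian noise adds sigma^2 to each component variance
    return (np.log(weights)
            - 0.5 * np.log(2 * np.pi * var)
            - 0.5 * (x[:, None] - means) ** 2 / var)

def gmm_log_density(x, weights, means, stds, sigma):
    """Log density of a 1D GMM after adding N(0, sigma^2) noise."""
    lc = _log_components(x, weights, means, stds, sigma)
    m = lc.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return (m + np.log(np.exp(lc - m).sum(axis=1, keepdims=True))).ravel()

def gmm_score(x, weights, means, stds, sigma):
    """Closed-form score d/dx log p_sigma(x) of the noise-smoothed GMM."""
    lc = _log_components(x, weights, means, stds, sigma)
    resp = np.exp(lc - lc.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)  # posterior component responsibilities
    var = stds**2 + sigma**2
    return (resp * (means - x[:, None]) / var).sum(axis=1)

# Hypothetical two-component mixture and noise level (illustration only).
weights = np.array([0.3, 0.7])
means = np.array([-1.0, 2.0])
stds = np.array([0.5, 1.0])
sigma = 0.7

x = np.linspace(-3.0, 4.0, 8)
eps = 1e-5
finite_diff = (gmm_log_density(x + eps, weights, means, stds, sigma)
               - gmm_log_density(x - eps, weights, means, stds, sigma)) / (2 * eps)
score = gmm_score(x, weights, means, stds, sigma)
max_err = np.max(np.abs(finite_diff - score))
```

Here `max_err` measures the largest disagreement between the closed-form score and the central-difference estimate on the test points.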


Abstract


Diffusion models are widely used as priors in imaging inverse problems. However, their performance often degrades under distribution shifts between the training and test-time images. Existing methods for identifying and quantifying distribution shifts typically require access to clean test images, which are almost never available at test time when solving inverse problems. We propose a fully unsupervised metric for estimating distribution shifts using only indirect (corrupted) measurements and score functions from diffusion models trained on different datasets. We theoretically show that this metric estimates the KL divergence between the training and test image distributions. Empirically, we show that our score-based metric, using only corrupted measurements, closely approximates the KL divergence computed from clean images. Motivated by this result, we show that aligning the out-of-distribution score with the in-distribution score, using only corrupted measurements, reduces the KL divergence and leads to improved reconstruction quality across multiple inverse problems.

Theoretical Result


Theorem 1. Let \( \bar{\mathbf{y}}_\sigma = \mathbf{P} \bar{\mathbf{x}} + \bar{\mathbf{n}} \) denote the noisy projected measurements at noise level \( \sigma \), as defined in Eq. (8) of the paper. Then, the KL divergence between the InD density \( p(\mathbf{x}) \) and the OOD density \( q(\mathbf{x}) \) can be expressed as

\[ \mathrm{KL}(p(\mathbf{x}) \| q(\mathbf{x})) = \int_0^\infty \mathbb{E} \left[ \left\| \mathbf{W} \left( \nabla \log p_\sigma(\mathbf{V} \bar{\mathbf{y}}_\sigma) - \nabla \log q_\sigma(\mathbf{V} \bar{\mathbf{y}}_\sigma) \right) \right\|_2^2 \right] \sigma~ \mathrm{d}\sigma, \]

where \( \mathbf{W} = \mathbb{E}[\mathbf{P}]^{-3/2} \) is a diagonal scaling matrix, \( \mathbf{V} \) is the matrix of right singular vectors from the SVD of \( \mathbf{H} \), and the expectation is taken over \( \mathbf{P} \) and \( \bar{\mathbf{y}}_\sigma \sim p(\bar{\mathbf{y}}_\sigma | \mathbf{P}) \).
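As a sanity check on the form of the integral, the sketch below evaluates its image-domain analogue: the same squared score difference with weighting \( \sigma \, \mathrm{d}\sigma \), but evaluated on noisy clean samples \( \mathbf{x} + \sigma \mathbf{n} \) rather than on projected measurements. It does not implement \( \mathbf{P} \), \( \mathbf{W} \), or \( \mathbf{V} \) from Theorem 1. The one-dimensional Gaussians, their parameters, the \( \sigma \) grid, and all function names are assumptions chosen for illustration; for Gaussians both the smoothed scores and the KL divergence are known in closed form, so the two sides can be compared directly.

```python
import numpy as np

def gaussian_score(x, mean, std, sigma):
    """Score of N(mean, std^2) after adding N(0, sigma^2) noise."""
    return -(x - mean) / (std**2 + sigma**2)

def kl_gaussians(mean_p, std_p, mean_q, std_q):
    """Closed-form KL( N(mean_p, std_p^2) || N(mean_q, std_q^2) )."""
    return (np.log(std_q / std_p)
            + (std_p**2 + (mean_p - mean_q) ** 2) / (2 * std_q**2)
            - 0.5)

def kl_from_scores(mean_p, std_p, mean_q, std_q, sigmas, n_samples=50_000, seed=0):
    """Monte Carlo estimate of int E[(score_p - score_q)^2] * sigma d(sigma)."""
    rng = np.random.default_rng(seed)
    x = mean_p + std_p * rng.standard_normal(n_samples)  # clean samples x ~ p
    n = rng.standard_normal(n_samples)                   # noise draws, reused for every sigma
    integrand = np.empty_like(sigmas)
    for i, s in enumerate(sigmas):
        x_s = x + s * n  # noisy samples, distributed as p_sigma
        diff = (gaussian_score(x_s, mean_p, std_p, s)
                - gaussian_score(x_s, mean_q, std_q, s))
        integrand[i] = np.mean(diff**2) * s
    # Trapezoidal rule over the (non-uniform) sigma grid.
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(sigmas))

# Hypothetical InD and OOD densities: p = N(0, 1), q = N(1, 1.5^2).
sigmas = np.logspace(-3, 2, 200)  # truncated stand-in for the range (0, infinity)
kl_est = kl_from_scores(0.0, 1.0, 1.0, 1.5, sigmas)
kl_true = kl_gaussians(0.0, 1.0, 1.0, 1.5)
```

Because the grid is truncated and the expectation is sampled, `kl_est` should agree with `kl_true` only up to a small truncation and Monte Carlo error.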

Distribution Shift in MRI Subsampled Measurements


Figure 3: Comparison of the distribution shift (dashed lines), computed using clean images, and our proposed measurement-domain KL metric (solid lines) between an InD model trained on Brain slices and OOD models trained on Knee and Prostate slices from the fastMRI dataset with acceleration rate \( 4 \). The vertical axis shows \( \mathrm{KL} \) up to diffusion noise level \( \sigma \). The proposed metric accurately tracks the KL divergence.

Adaptation from Corrupted Measurements: Impact on distribution shift and performance in inverse problems


Figure 4: \( \mathrm{KL} \) between the InD model (FFHQ) and the OOD model (AFHQ), as well as OOD models adapted using 64 and 128 projected measurements, measured in the image domain (dashed) and the measurement domain (solid) for inpainting with \( p = 0.8 \). Notably, adapting the network using only projected measurements significantly reduces the distributional gap.
Figure 5: Visual comparison of inpainting results (DPS) on an FFHQ image with mask rate \( p = 0.8 \) and measurement noise level \( \sigma = 0.01 \). The top row shows full reconstructions, while the bottom row displays residual maps (left) and zoomed-in regions (right). Note the performance gap between the InD and OOD models, and the improvement achieved by adapting the OOD models using only corrupted measurements.

Paper


Bibtex


@article{Shoushtari2025klmeas,
  author={Shoushtari, Shirin and Chandler, Edward P. and Wang, Yuanhao and Asif, M. Salman and Kamilov, Ulugbek S.},
  title={Unsupervised Detection of Distribution Shift in Inverse Problems using Diffusion Models},
  note={arXiv:2505.11482},
  year={2025}
}