First OOD detection framework that leverages denoising posterior covariance in diffusion models
Figure 1. We compare negative log-likelihood (NLL), score norm \(\sqrt{\sum_t \|\epsilon_\theta(\mathbf{x}_t, t)\|_2^2}\), score derivative norm \(\sqrt{\sum_t \|\partial_t \epsilon_\theta(\mathbf{x}_t, t)\|_2^2}\), and the eigenvalue sum (ours) \(\sum_{t,k}\lambda_k^t(\mathbf{x}_t)\) as OOD detection statistics. Top row: in the near-OOD task C10 (InD) vs. C100 (OOD), NLL and the score-based metrics fail to separate the two distributions, showing substantial overlap. Bottom row: for C10 (InD) vs. SVHN (OOD), the ordering of the metrics inverts—score and derivative norms assign lower values to OOD than to InD samples, making fixed thresholds unreliable. In both settings, our eigenvalue-based metric achieves clear separation and consistently assigns higher scores to OOD samples.
Out-of-distribution (OOD) detection is critical for the safe deployment of machine learning systems in safety-sensitive domains. Diffusion models have emerged as powerful generative models, capable of capturing complex data distributions through iterative denoising. Building on this progress, recent work has explored their potential for OOD detection. We propose EigenScore, a new OOD detection method that leverages the eigenvalue spectrum of the posterior covariance induced by a diffusion model. We argue that the posterior covariance provides a consistent signal of distribution shift, yielding a larger trace and larger leading eigenvalues on OOD inputs—a clear spectral signature. We further provide analysis explicitly linking posterior covariance to distribution mismatch, establishing it as a reliable signal for OOD detection. To ensure tractability, we adopt a Jacobian-free subspace iteration method that estimates the leading eigenvalues using only forward evaluations of the denoiser. Empirically, EigenScore achieves state-of-the-art performance, with up to 5% AUROC improvement over the best baseline. Notably, it remains robust in near-OOD settings such as CIFAR-10 vs. CIFAR-100, where existing diffusion-based methods often fail.
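The Jacobian-free idea mentioned above can be sketched in a few lines: Jacobian-vector products are approximated by finite differences of denoiser outputs, and subspace iteration with QR re-orthonormalization recovers the leading eigenvalues. This is a minimal NumPy illustration, not the paper's implementation; the function names (`jvp_fd`, `subspace_iteration`), the step size `eps`, and the iteration count are assumptions for the sketch.

```python
import numpy as np

def jvp_fd(denoiser, x, v, eps=1e-3):
    # Finite-difference Jacobian-vector product:
    # J(x) v ≈ (D(x + eps*v) - D(x)) / eps, using only forward evaluations.
    return (denoiser(x + eps * v) - denoiser(x)) / eps

def subspace_iteration(denoiser, x, k=3, n_iter=20, eps=1e-3, seed=0):
    # Estimate the k leading eigenvalues of the denoiser Jacobian at x
    # without ever forming the Jacobian explicitly.
    rng = np.random.default_rng(seed)
    d = x.size
    V, _ = np.linalg.qr(rng.standard_normal((d, k)))  # random orthonormal start
    for _ in range(n_iter):
        # Apply the Jacobian to each basis vector, then re-orthonormalize.
        W = np.stack(
            [jvp_fd(denoiser, x, V[:, j].reshape(x.shape), eps).ravel()
             for j in range(k)], axis=1)
        V, _ = np.linalg.qr(W)
    # Rayleigh quotients diag(V^T J V) as eigenvalue estimates.
    W = np.stack(
        [jvp_fd(denoiser, x, V[:, j].reshape(x.shape), eps).ravel()
         for j in range(k)], axis=1)
    return np.sort(np.diag(V.T @ W))[::-1]
```

For a linear "denoiser" D(x) = Ax the finite-difference product is exact, so the routine recovers the top eigenvalues of A; for a trained network the same loop runs with `denoiser` replaced by the model's forward pass at a given timestep.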
Figure 2. Denoised outputs (left), corresponding uncertainty maps (first principal component) (middle), and violin plots of the three largest eigenvalues for the CelebA dataset (right). Top: clean CelebA image and its noisy variants for varying t. Middle: InD model (trained on CelebA) applied to CelebA inputs. Bottom: OOD model (trained on C100) applied to the same inputs. InD models yield sharp reconstructions and localized uncertainty with smaller leading eigenvalues, whereas OOD models produce blurrier outputs, diffuse uncertainty, and inflated eigenvalues—highlighting the eigenvalue spectrum as an indicator of distribution shift.
Algorithm 1. Given the EigenScore feature matrix M(x), we first estimate the mean and standard deviation of each feature column on the training data. The validation set is then used to select the optimal timesteps and aggregation method. At test time, EigenScore features are extracted under the chosen configuration, standardized with the parameters obtained from the training phase, and the final detection score for each sample is computed as the sum of the standardized feature values across all columns.
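The standardize-and-sum step described above is straightforward to express in code. The sketch below assumes features are stored row-per-sample in a NumPy array; the function names `fit_standardizer` and `eigen_score` are illustrative, not from the paper.

```python
import numpy as np

def fit_standardizer(M_train):
    # Column-wise mean and std of EigenScore features on training data.
    mu = M_train.mean(axis=0)
    sigma = M_train.std(axis=0) + 1e-8  # guard against zero-variance columns
    return mu, sigma

def eigen_score(M_test, mu, sigma):
    # Standardize each feature column, then sum across columns per sample.
    Z = (M_test - mu) / sigma
    return Z.sum(axis=1)
```

Timestep selection and the choice of aggregation (here a plain sum) would be tuned on the validation set before this scoring is applied.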
Table 1. Main OOD detection results (AUROC). Comparison of EigenScore with likelihood-based, reconstruction-based, and diffusion-based baselines across multiple InD–OOD dataset pairings (CelebA, C10, C100, SVHN). The best and second-best results are highlighted. EigenScore achieves the best average performance and is either best or second best in most settings.
Table 2. Near-OOD detection results (AUROC). We evaluate on semantically related datasets, including C10 vs. C100 and TinyImageNet, which are particularly challenging due to shared low-level statistics between InD and OOD samples. The best and second best methods are highlighted. EigenScore achieves the best average performance across both tasks, with a clear margin over prior diffusion-based approaches.