arxiv:1911.07023

Effectively Unbiased FID and Inception Score and where to find them

Published on Nov 16, 2019

Authors:

Abstract

This paper shows that two commonly used evaluation metrics for generative models, the Fr\'echet Inception Distance (FID) and the Inception Score (IS), are biased -- the expected value of the score computed for a finite sample set is not the true value of the score. Worse, the paper shows that the <PRE_TAG>bias term</POST_TAG> depends on the particular model being evaluated, so model A may get a better score than model B simply because model A's <PRE_TAG>bias term</POST_TAG> is smaller. This effect cannot be fixed by evaluating at a fixed number of samples. This means all comparisons using FID or IS as currently computed are unreliable. We then show how to extrapolate the score to obtain an effectively bias-free estimate of scores computed with an infinite number of samples, which we term textrm{FID}_infty and textrm{IS}_infty. In turn, this effectively bias-free estimate requires good estimates of scores with a finite number of samples. We show that using Quasi-Monte Carlo integration notably improves estimates of FID and IS for finite sample sets. Our extrapolated scores are simple, drop-in replacements for the finite sample scores. Additionally, we show that using low discrepancy sequence in GAN training offers small improvements in the resulting generator.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 1

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/1911.07023 in a dataset README.md to link it from this page.

Spaces citing this paper 9

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.