- show how critic weight clipping can lead to pathological behavior
- propose WGAN with gradient penalty
- demonstrate stable training of many difficult GAN architectures with default settings, performance improvements over weight clipping
- loss $$ \min_G\max_{D\in \mathcal D}\mathbb E_{x\sim \mathbb P_r}[D(x)]-\mathbb E_{\hat x\sim \mathbb P_g}[D(\hat x)] $$
where
$\mathcal D$ is the set of 1-Lipschitz functions
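A minimal sketch of computing this objective, using a hypothetical linear critic and Gaussian stand-ins for $\mathbb P_r$ and $\mathbb P_g$ (all names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)  # parameters of a toy linear critic

def critic(x):
    # D(x) = w . x — a stand-in for a neural-network critic
    return x @ w

real = rng.normal(loc=1.0, size=(64, 3))   # samples standing in for P_r
fake = rng.normal(loc=-1.0, size=(64, 3))  # samples standing in for P_g

# The critic maximizes E_{x~P_r}[D(x)] - E_{x_hat~P_g}[D(x_hat)];
# in practice one minimizes the negated quantity with gradient descent.
critic_objective = critic(real).mean() - critic(fake).mean()
critic_loss = -critic_objective
```

The generator, for its part, minimizes $-\mathbb E_{\hat x\sim \mathbb P_g}[D(\hat x)]$, i.e. it tries to raise the critic's score on its samples.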
- To enforce the Lipschitz constraint on the critic, WGAN proposes clipping the critic's weights to lie within a compact space $[-c, c]$. The set of functions satisfying this constraint is a subset of the $k$-Lipschitz functions for some $k$ that depends on $c$ and the critic architecture.
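Weight clipping amounts to projecting every parameter into the box $[-c, c]$ after each critic update. A minimal numpy sketch (the parameter shapes and the helper `clip_weights` are illustrative; the paper's experiments use $c = 0.01$ as the WGAN default):

```python
import numpy as np

rng = np.random.default_rng(1)
c = 0.01  # clipping threshold; 0.01 is the default used in the WGAN paper

# Hypothetical critic parameters: one weight matrix and one bias vector.
params = [rng.normal(scale=0.1, size=(4, 4)), rng.normal(scale=0.1, size=(4,))]

def clip_weights(params, c):
    """Project every parameter element-wise into the compact box [-c, c]."""
    return [np.clip(p, -c, c) for p in params]

# Applied after every critic gradient step.
params = clip_weights(params, c)
```

It is this crude projection, rather than the Wasserstein objective itself, that the paper identifies as the source of pathological critic behavior, motivating the gradient penalty as a softer alternative.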