The problem

  • $X$, $Y$ are two image domains
  • Goal is to train a generator $G: X \rightarrow Y$
  • Dataset consists of unpaired samples $x \in X$ and $y \in Y$.
  • The output of $G$ should look like it belongs to domain $Y$

The model

  • The loss function combines an adversarial loss with a weighted self-regularisation loss

$$L_G = l_{adv}(G(x), Y) + \lambda \, l_{reg}(x, G(x))$$

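As a minimal sketch of this objective (assuming a non-saturating adversarial term and a simple L1 distance standing in for $l_{reg}$; both choices are illustrative, not stated in the notes):

```python
import numpy as np

def generator_loss(x, g_x, d_score, lam=0.5):
    """Combined generator loss: adversarial term plus a weighted
    self-regularisation term (hypothetical concrete forms)."""
    # Adversarial term: push the discriminator's score on the
    # translated image G(x) towards 1 (non-saturating form).
    l_adv = -np.log(d_score + 1e-8)
    # Self-regularisation term: an L1 distance between input and
    # output, standing in for a perceptual distance.
    l_reg = np.mean(np.abs(x - g_x))
    return l_adv + lam * l_reg
```

With a perfect discriminator score the adversarial term vanishes and only the weighted regularisation term remains.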
  • The generator has two branches

    • $G_0$:

      • Vanilla generator
      • Translates input into output domain to create a similar image
    • $G_{attn}$:

      • Attention branch
      • Outputs a probability map used as an attention mask
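A common way to combine two such branches is to use the attention map to blend the translated image with the input; the exact combination rule below is an assumption for illustration:

```python
import numpy as np

def combine_branches(x, g0_out, attn):
    """Blend the vanilla branch's translation g0_out with the input x
    using the attention branch's probability map attn."""
    # attn is in [0, 1]: 1 means "translate this pixel",
    # 0 means "copy the input pixel unchanged".
    return attn * g0_out + (1.0 - attn) * x
```

Pixels the attention branch marks as irrelevant are passed through from the input, so the generator only has to translate the attended region.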
  • Self-regularisation loss
    • Need to prevent generator from mapping input images to random permutations of images in the target domain
    • We wish to constrain the mapping such that it is meaningful
    • Require that $G$ should preserve visual characteristics of input image
    • Input and target should share perceptual similarities
    • Especially low-level features e.g. colour, edges, shape, objects, etc. should be similar
    • One way of approaching this is to note that the features extracted by the early layers of a pretrained CNN capture exactly such low-level information, so a distance between these features of the input and the output can serve as $l_{reg}$
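The idea of comparing early-layer features can be sketched as follows; here a single hand-written convolution stands in for the early layers of a pretrained network (a hypothetical extractor, chosen only to keep the example self-contained):

```python
import numpy as np

def early_features(img, kernel):
    """Stand-in for the early layers of a pretrained CNN: one valid
    2-D convolution over a single-channel image."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def self_reg_loss(x, g_x, kernel):
    """Perceptual-style self-regularisation: mean squared distance
    between early-layer features of the input and the translation."""
    fx = early_features(x, kernel)
    fg = early_features(g_x, kernel)
    return np.mean((fx - fg) ** 2)
```

The loss is zero when input and output share the same low-level feature responses, and grows as edges, colours, or shapes diverge.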