A method that learns to turn bad images into good ones without ever having seen good images.
Consider a model where inputs are corrupted images $\hat{x}_i$ and the targets are the clean images $ y_i$.
The optimisation task for a neural network mapping from $\hat{x}_i$ to $ y_i$ would typically be defined as $ \arg\min_{\theta} \sum_i L(f_\theta(\hat{x}_i), y_i)$.
Expectation of loss function over data distribution is approximately: $$\frac{1}{N}\sum_{i=1}^{n} f_\theta(\hat{x}_i)$$
In the limit of infinite data it becomes $ \arg \min_{\theta} E_{(x)}, \left[L(f_\theta(x), y)\right]$, (dropping the hat over $x$).
The optimisation problem makes it seem that there is a one-to-one mapping between the corrupted and clean images.
There is the original clean image and there is the corrupted version from which are trying to recover the original.
However this is not the case.

A simpler problem with set of noisy measurements $a$, and $b$ which is the estimate of true value which we are trying to measure.
The value of $b$ that minimises an $L_2$ loss $E[\lvert a-b\rvert^2]$, would be $b = E_{(a)}[a]$
But notice that this means that we could replace $ a$ with any set of noisy measurements with the same mean and we would get the same optimal value of $ b$.

Using the law of iterated expectations $$ \arg \min_{\theta} E_{(x, y)} \left[L(f_\theta(x), y)\right] = \arg \min_{\theta} E _{(x)}\left[ E_{(y|x)} \left[L(f_\theta(x), y)\right]\right]$$
If loss is $L_2$ then inner expectation resembles the one in point estimation problem where $a$ is $y$ and $b$ is $f_\theta(x)$.
The network could in theory minimise above loss for each example by solving the point estimation problem by finding parameters such that for each example $f_\theta(x) = E_{(y|x)}\left[y\right]$
But notice that this is saying that there are many possible $y$ that could plausibly be the clean version of $x$.
The value we end up with will be the average over all these.
Instead of considering a mapping to a single clean version of the corrupted $x$, we can consider the images $y$ drawn from $p(y|x)$ as different noisy measurements of the clean version.
We now want to solve $\arg\min_{\theta} \sum_i L(f_\theta(\hat{x}_i), \hat{y}_i)$ where the targets are also corrputed images (hence noise2noise).
If $E_{(\hat{y}_i\vert \hat{x}_i)}\left[\hat{y}_i\right] = y_i$ then in the limit of infinite data this will solve the original optimisation problem where the targets were uncorrupted and recover the 'true' clean version $y_i$.