23 Tentative solutions
Initially I analyzed the code I found in the repository and compared it to the original MedGAN paper. I quickly noticed that the Generator was incomplete, since the shortcut connections were missing.
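As a reference, here is a minimal PyTorch sketch of how I understand the missing shortcut connections: each Generator layer adds its own input to a dense → batch-norm → activation transform, with the layer width fixed by the embedding size the pretrained Decoder expects. `ShortcutBlock`, `embed_dim` and `n_layers` are illustrative names and values, not the repository's code.

```python
import torch.nn as nn

class ShortcutBlock(nn.Module):
    """One Generator layer with a residual-style shortcut connection."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim, bias=False)
        self.bn = nn.BatchNorm1d(dim)
        self.act = nn.ReLU()  # one of the attempts below swaps this for nn.SELU()

    def forward(self, x):
        # Shortcut connection: output = input + transformed input
        return x + self.act(self.bn(self.fc(x)))

class Generator(nn.Module):
    """Stack of shortcut blocks; the width is tied to the Decoder's input size."""
    def __init__(self, embed_dim=128, n_layers=2):
        super().__init__()
        self.blocks = nn.Sequential(*[ShortcutBlock(embed_dim) for _ in range(n_layers)])

    def forward(self, z):
        return self.blocks(z)
```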
In addition, I tried:
- Increasing the number of Generator layers (the layer size is fixed by the Decoder and the shortcut connections)
- Increasing the number and size of Discriminator layers (best results were obtained by first increasing the size and then decreasing it gradually)
- Decreasing the learning rate, β and weight decay of the Discriminator (and, independently, of the Generator); see the optimizer sketch after this list
- Adding a penalty term to the Binary Cross Entropy in both the Generator and Discriminator losses (see the loss sketch after this list)
- Retraining the AutoEncoder in order to allow a larger Generator
- L1 regularization (also included in the loss sketch below)
- Switching the activation function from ReLU to SELU
- Training the Generator N times each epoch and the Discriminator only once, whereas in the original paper it was the other way around (see the training-loop sketch below)
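A sketch of how the learning rate, β and weight decay can be tuned independently for the two networks, assuming Adam optimizers and reusing the `generator` from the sketch above together with an analogous `discriminator` module; the numeric values are illustrative, not the ones I settled on:

```python
import torch

# `generator` as sketched above; `discriminator` is an analogous hypothetical module.
g_optim = torch.optim.Adam(generator.parameters(),
                           lr=1e-4, betas=(0.5, 0.999), weight_decay=1e-5)
d_optim = torch.optim.Adam(discriminator.parameters(),
                           lr=5e-5, betas=(0.5, 0.999), weight_decay=1e-4)
```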
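As an example of what adding a penalty term and L1 regularization to the losses can look like: the logit-squared penalty and the weights below are illustrative assumptions, not necessarily the exact terms used in the experiments.

```python
import torch
import torch.nn.functional as F

def l1_penalty(model, weight=1e-5):
    # L1 regularization over all parameters (illustrative weight).
    return weight * sum(p.abs().sum() for p in model.parameters())

def d_loss(real_logits, fake_logits, discriminator, penalty_weight=0.1):
    # Standard BCE on the Discriminator logits for real and fake batches.
    bce = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) \
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    # Illustrative penalty that keeps the logits small (an assumption,
    # not necessarily the penalty actually used in the experiments).
    penalty = penalty_weight * (real_logits ** 2 + fake_logits ** 2).mean()
    return bce + penalty + l1_penalty(discriminator)

def g_loss(fake_logits, generator, penalty_weight=0.1):
    # Non-saturating BCE for the Generator, same style of penalty, plus L1.
    bce = F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
    return bce + penalty_weight * (fake_logits ** 2).mean() + l1_penalty(generator)
```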
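Finally, a sketch of the N-Generator-steps-per-Discriminator-step schedule, interpreting the N:1 ratio per minibatch and reusing the optimizers and losses from the sketches above; `dataloader`, `decoder`, `embed_dim` and `n_epochs` are placeholders:

```python
import torch

N_G_STEPS = 3  # illustrative: Generator updated N times per Discriminator update

for epoch in range(n_epochs):
    for real_batch in dataloader:
        # --- one Discriminator step ---
        z = torch.randn(real_batch.size(0), embed_dim)
        fake = decoder(generator(z)).detach()  # decode fake embeddings, stop gradients
        d_optim.zero_grad()
        loss_d = d_loss(discriminator(real_batch), discriminator(fake), discriminator)
        loss_d.backward()
        d_optim.step()

        # --- N Generator steps (the original paper updates the Discriminator more often) ---
        for _ in range(N_G_STEPS):
            z = torch.randn(real_batch.size(0), embed_dim)
            fake = decoder(generator(z))
            g_optim.zero_grad()
            loss_g = g_loss(discriminator(fake), generator)
            loss_g.backward()
            g_optim.step()
```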
With some combinations of hyperparameters I had the impression that training was better and more stable, but in the end I could not stop it from mode collapsing.