Lately, I have been interested in Generative Adversarial Networks (GANs) and wanted to build something fun: use a Deep Convolutional Generative Adversarial Network (DCGAN) to generate Magic cards. In this algorithm there are two neural networks: the Discriminator, which must tell fake cards from real ones, and the Generator, which must create fake cards that fool the Discriminator. The magic part is that the Generator turns a random vector into a convincing image.
First of all, we need data. Magic The Gathering has tons of fans, so it was easy to find this site and scrape it to get good-resolution images. The cards then need a bit of sorting, because some use special templates, such as the ones for Planeswalkers or for old editions.
After training on the most common card template of Magic The Gathering, I get the following results.
If you look carefully, you can see that some cards have mana costs. We can almost read Creature or Instant on the type line. We also get the power and toughness associated with a creature card, and a blurry symbol for the expansion set.
For the neural networks, I resized all the images to 224x224 px. Square images are easier to work with. Moreover, this size is small enough to fit on my GPU and to blur the text. I also scaled the pixel values to the range [-1, 1] to match the Tanh activation function used at the Generator's output.
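As a minimal sketch of that scaling step, assuming 8-bit pixel values in [0, 255] (the helper names `to_tanh_range` and `from_tanh_range` are illustrative, not taken from the original code):

```python
def to_tanh_range(pixels):
    """Map 8-bit pixel values in [0, 255] to [-1, 1], the output range of Tanh."""
    return pixels / 127.5 - 1.0


def from_tanh_range(values):
    """Map Generator outputs in [-1, 1] back to [0, 255] for display."""
    return (values + 1.0) * 127.5


# Works on plain numbers; the same arithmetic applies element-wise to a NumPy array.
print(to_tanh_range(0), to_tanh_range(127.5), to_tanh_range(255))  # -1.0 0.0 1.0
```

The inverse mapping is what you would apply to the Generator's output before saving or displaying a generated card.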
Now I will show the main parts of my DCGAN implementation in TensorFlow. We start with the Discriminator because it is much like implementing an image classifier.
Comments on the Discriminator:
I use five convolutional layers.
For GANs, we need the LeakyReLU activation in all layers to avoid sparse gradients. This is essential for training the Generator.
I also use batch normalization.
Dropout is needed to keep the Discriminator from overfitting the data.
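The point about sparse gradients can be illustrated without any framework. The sketch below is plain Python, not the TensorFlow code of this post, and the slope `alpha=0.2` is an assumed value, since the Discriminator's alpha is not stated here.

```python
def leaky_relu(x, alpha=0.2):
    """LeakyReLU: identity for positive inputs, a small slope alpha otherwise."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.2):
    """Derivative of LeakyReLU with respect to x (using alpha at x == 0)."""
    return 1.0 if x > 0 else alpha


def relu_grad(x):
    """Derivative of plain ReLU: exactly zero for every non-positive input."""
    return 1.0 if x > 0 else 0.0


inputs = [-2.0, -0.5, 1.0]
print([relu_grad(x) for x in inputs])        # [0.0, 0.0, 1.0]
print([leaky_relu_grad(x) for x in inputs])  # [0.2, 0.2, 1.0]
```

With plain ReLU, units with a negative pre-activation pass no gradient back, so the Generator receives no learning signal through them. LeakyReLU always passes at least a fraction alpha of the gradient.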
The goal of the Generator is to produce counterfeit images that look like real images. The input Z is a vector sampled from a normal distribution, and the number of output channels is the depth of the final image: three for an RGB image or one for a grayscale image.
Comments on the Generator:
I apply noise before the "deconvolutional" layers.
Then I apply batch normalization before the activation function.
LeakyReLU activation is used with a small alpha (0.1), except for the last layer, which has a Tanh activation.
Dropout helps produce more realistic images.
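To make the shapes concrete, here is a small sketch of the Generator's input and of how stride-2 "deconvolutional" layers with "same" padding grow an image. The latent size of 100, the 7x7 starting feature map, the five upsampling layers, and the helper names are all assumptions chosen so that the output reaches 224x224; the actual network may differ.

```python
import random


def sample_latent(batch_size, z_dim, seed=None):
    """Draw a batch of Z vectors from a standard normal distribution."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(z_dim)] for _ in range(batch_size)]


def upsampled_sizes(start, n_layers, stride=2):
    """Spatial size after each transposed convolution ("same" padding assumed)."""
    sizes = [start]
    for _ in range(n_layers):
        sizes.append(sizes[-1] * stride)
    return sizes


z = sample_latent(batch_size=4, z_dim=100, seed=0)
print(len(z), len(z[0]))                      # 4 100
print(upsampled_sizes(start=7, n_layers=5))   # [7, 14, 28, 56, 112, 224]
```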
Next we define the loss functions. We need labels, but they are very simple to define. For the Discriminator, we set 1 (real) for all inputs that come from our dataset and 0 (fake) for those that come from the Generator. For the Generator, the labels are set to 1 because its goal is to fool the Discriminator.
After that, I use the cross entropy between the Discriminator's predictions and the labels.
Comment on the losses:
The variables smooth1 and smooth0 simply generate random values, which are subtracted from the real labels and added to the fake labels, respectively. This technique is called Label Smoothing and is used to prevent very large gradient signals.
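The losses and the label smoothing described above can be sketched in plain Python as follows. This is an illustration under assumptions, not the post's TensorFlow code: the uniform noise of at most 0.1 (`max_noise`), the sum of the real and fake terms, the unsmoothed Generator labels, and all function names are hypothetical choices.

```python
import math
import random


def binary_cross_entropy(prediction, label, eps=1e-7):
    """Cross entropy between a predicted probability and a (possibly soft) label."""
    p = min(max(prediction, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


def smoothed_labels(batch_size, real, max_noise=0.1, rng=random):
    """Real labels become 1 - noise, fake labels become 0 + noise."""
    noise = [rng.uniform(0.0, max_noise) for _ in range(batch_size)]
    return [1.0 - n if real else n for n in noise]


def mean(values):
    return sum(values) / len(values)


def discriminator_loss(pred_real, pred_fake, rng=random):
    """Real images are labelled close to 1, generated images close to 0."""
    real_labels = smoothed_labels(len(pred_real), real=True, rng=rng)
    fake_labels = smoothed_labels(len(pred_fake), real=False, rng=rng)
    loss_real = mean([binary_cross_entropy(p, y) for p, y in zip(pred_real, real_labels)])
    loss_fake = mean([binary_cross_entropy(p, y) for p, y in zip(pred_fake, fake_labels)])
    return loss_real + loss_fake


def generator_loss(pred_fake):
    """The Generator wants its images to be labelled 1 (real)."""
    return mean([binary_cross_entropy(p, 1.0) for p in pred_fake])


# Discriminator outputs (probability of "real") for a toy batch.
pred_real = [0.9, 0.8, 0.95]
pred_fake = [0.2, 0.1, 0.3]
print(discriminator_loss(pred_real, pred_fake, rng=random.Random(0)))
print(generator_loss(pred_fake))
```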
For the training, we use the Adam optimizer. The training loop is like most machine learning training loops, except that I unbalanced the training between the Discriminator and the Generator. For each batch, I train the Discriminator until its loss is smaller than 2. I observed that if the Discriminator's loss is too big, the Generator makes no effort to create realistic images. To push the Generator toward more realistic images, for each batch I also train the Generator until its loss is smaller than the Discriminator's loss.
It is an equilibrium: if one player "wins" too often, the other doesn't want to play.
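The balancing rule can be sketched independently of TensorFlow. In the sketch below, `d_step` and `g_step` are hypothetical callables that each run one optimizer update and return the new loss, and `max_steps` is an added safety cap (an assumption, not part of the description above) so that neither loop can run forever.

```python
def train_on_batch(d_step, g_step, d_threshold=2.0, max_steps=10):
    """Balance the two players on a single batch."""
    # Train the Discriminator until its loss drops below the threshold.
    d_loss = d_step()
    for _ in range(max_steps - 1):
        if d_loss < d_threshold:
            break
        d_loss = d_step()

    # Train the Generator until its loss drops below the Discriminator's loss.
    g_loss = g_step()
    for _ in range(max_steps - 1):
        if g_loss < d_loss:
            break
        g_loss = g_step()

    return d_loss, g_loss


def make_decaying_loss(start, factor):
    """Toy stand-in for a training step: the loss shrinks at every call."""
    state = {"loss": start}

    def step():
        state["loss"] *= factor
        return state["loss"]

    return step


d_loss, g_loss = train_on_batch(make_decaying_loss(5.0, 0.7), make_decaying_loss(4.0, 0.6))
print(d_loss < 2.0, g_loss < d_loss)  # True True
```

In a real run, the two callables would wrap the Adam updates for each network on the current batch.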
Comments on the training:
To generate the images above, I trained for 100 epochs and then continued for another 50.
Beyond 150 epochs, the system loses its stability.
Additional notes to improve the training:
Some hacks recommend using an average pooling layer in the Discriminator instead of a fully connected layer.
Make the learning rate bigger for the Discriminator.
Figuring out the right training parameters is hard. Drawing inspiration from the latest papers can be a very good start.
Vary the dropout rate within [0.3, 0.5].
Vary the momentum within [0.2, 0.5].
To go further
The complete implementation can be found on my GitHub.
This post is concise because there are tons of tutorials and resources for getting a better understanding of GANs or finding practical hacks:
Magic The Gathering cards: https://scryfall.com/
GAN Architecture picture: https://twitter.com/ch402/status/793911806494261248