6 GAN Architectures You Really Should Know

Generative Adversarial Networks (GANs) were introduced in 2014 by Ian Goodfellow and his colleagues. Since then, GANs have become an active area of research. GANs are deep learning-based generative models: a model is trained to generate new data that resembles, but does not copy, the data it was trained on. Training pits two neural networks against each other in an adversarial fashion. A generator produces candidate samples, and a discriminator tries to tell them apart from real data. This competition pushes the generator to produce increasingly realistic output.
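
The adversarial game can be made concrete with the two loss values that are usually tracked during training. The snippet below is a minimal sketch, not a full training loop: it assumes the discriminator outputs a probability that a sample is real, uses binary cross-entropy, and uses the commonly seen "non-saturating" generator loss. The function names and the sample scores are made up for illustration.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    eps = 1e-12  # keep the argument of log() away from zero
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Real samples should be scored as 1, generated samples as 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """The generator is rewarded when its samples are scored as real (1)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)


# Hypothetical discriminator scores for a batch of two real and two fake images.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:", generator_loss(d_fake))
```

In this toy batch the discriminator is winning, so its loss is low and the generator's loss is high. As the generator improves, the scores in `d_fake` rise and the balance shifts the other way.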

For instance, an image of a new human face can be generated from scratch. Every generated face is unique and belongs to no person who has ever existed, yet it looks real because the model has learned from a dataset of real faces. This technique is what a Generative Adversarial Network, or GAN, is all about.

AI and ML engineers took up the idea, and within a few years the research community had produced a large body of work, both patented and unpatented, that advanced GANs further. As a result, generative models now show promising results in producing realistic images. GANs have displayed tremendous prowess in computer vision, and similar success has been reported in audio and text. Much of this work, however, is still in the developmental stage.

Here are six important GAN architectures that every AI and ML enthusiast should know about:

PixelRNN

PixelRNN predicts the pixels of an image sequentially along its two spatial dimensions. Strictly speaking, it is an autoregressive generative model rather than an adversarial one, but it is often discussed alongside GANs. It models the discrete probability distribution of the raw pixel values, with each pixel conditioned on the pixels generated before it, and in this way encodes the complete set of dependencies in the image.
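
The sequential idea can be illustrated without any neural network. In the sketch below, the probability of an image is the product of per-pixel conditional probabilities taken in raster-scan order, and sampling fills in one pixel at a time. The function `conditional_prob_one` is a hypothetical toy rule standing in for the recurrent network, and the images are binary only to keep the example small.

```python
import math
import random


def conditional_prob_one(previous):
    """Toy stand-in for the network: P(next pixel = 1 | earlier pixels)."""
    return (1 + sum(previous)) / (2 + len(previous))


def log_likelihood(image):
    """Sum of log p(x_i | x_1 .. x_{i-1}) over pixels in raster-scan order."""
    flat = [pixel for row in image for pixel in row]
    total = 0.0
    for i, pixel in enumerate(flat):
        p_one = conditional_prob_one(flat[:i])
        total += math.log(p_one if pixel == 1 else 1.0 - p_one)
    return total


def sample_image(height, width, rng):
    """Generate pixels one at a time, each conditioned on those before it."""
    flat = []
    for _ in range(height * width):
        p_one = conditional_prob_one(flat)
        flat.append(1 if rng.random() < p_one else 0)
    return [flat[r * width:(r + 1) * width] for r in range(height)]


image = sample_image(2, 3, random.Random(0))
print(image, log_likelihood(image))
```

Because every pixel's distribution is normalised given its predecessors, the probabilities of all possible images sum to one, which is what makes exact likelihood evaluation possible in this family of models.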

Text-2-image GAN

Text-to-image GANs are among the most useful. A text-to-image GAN is a deep learning model that generates images from textual descriptions. In the past, turning a verbal description into a picture was manual work, such as that done by police sketch artists. Now it can be done with the help of text-2-image GANs.

StyleGAN

StyleGAN is an extension of the progressive growing GAN (ProGAN), an approach for training generator models to synthesise large, high-quality photographs by incrementally growing both the discriminator and generator models from small to large images.
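
The incremental growth can be sketched as a resolution schedule plus a blending step. The example below is illustrative only: it assumes each stage doubles the image side length, uses the commonly cited range of 4x4 up to 1024x1024 as default values, and models the gradual introduction of a new layer as a simple linear blend. The function names are hypothetical.

```python
def resolution_schedule(start=4, final=1024):
    """Image side lengths visited when each growth stage doubles the resolution."""
    if start < 1:
        raise ValueError("start must be a positive integer")
    resolutions = [start]
    while resolutions[-1] < final:
        resolutions.append(resolutions[-1] * 2)
    return resolutions


def fade_in(upsampled_old, new_layer, alpha):
    """Blend the previous stage's upsampled output with the new layer's output.

    alpha ramps from 0.0 (old path only) to 1.0 (new layer only) during a stage.
    """
    return [(1.0 - alpha) * old + alpha * new
            for old, new in zip(upsampled_old, new_layer)]


print(resolution_schedule())  # [4, 8, 16, 32, 64, 128, 256, 512, 1024]
print(fade_in([0.0, 2.0], [1.0, 4.0], alpha=0.5))  # [0.5, 3.0]
```

Starting small lets both networks settle on coarse structure first, and the blend avoids a sudden shock to the already trained layers when a higher-resolution layer is added.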

DiscoGAN

Given an image in domain A, a DiscoGAN generates a corresponding image in domain B, for example a product image that matches the input in style and pattern. This is a powerful ability because the cross-domain relationship is learned without explicitly paired images during training, which also saves the time needed to build paired datasets.

CycleGAN

CycleGAN, or Cycle Generative Adversarial Network, is a training approach for image-to-image translation with deep convolutional neural networks. It uses an unpaired dataset to learn the mapping between input and output images.
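
Learning from unpaired data is usually explained through a cycle-consistency term: translating an image to the other domain and back should return something close to the original. The sketch below shows only that term, under simplifying assumptions. Images are flat lists of pixel values, the distance is a mean absolute (L1) error, and the two "generators" are hypothetical toy functions rather than neural networks.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g_xy, g_yx):
    """Round-trip error for X -> Y -> X plus Y -> X -> Y."""
    forward = l1_distance(g_yx(g_xy(x)), x)
    backward = l1_distance(g_xy(g_yx(y)), y)
    return forward + backward


# Toy "generators" (hypothetical): one brightens, the other darkens.
def brighten(img):
    return [p + 0.2 for p in img]


def darken(img):
    return [p - 0.2 for p in img]


def darken_too_much(img):
    return [p - 0.5 for p in img]


x = [0.1, 0.4, 0.7]  # sample from domain X
y = [0.3, 0.6, 0.9]  # unrelated sample from domain Y

print(cycle_consistency_loss(x, y, brighten, darken))           # close to zero
print(cycle_consistency_loss(x, y, brighten, darken_too_much))  # clearly positive
```

In a full model this term is added to the adversarial losses of the two generator and discriminator pairs, so that the mappings are both realistic and approximately invertible.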

LSGAN

Least Squares GAN, or LSGAN, is a GAN variant that adopts a least squares loss function for the discriminator in place of the usual cross-entropy loss. With suitable target labels, minimising the objective function of LSGAN also minimises the Pearson χ² divergence.
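
A minimal sketch of the least squares objectives follows. The labels a, b and c follow the notation of the LSGAN paper: a is the target for generated samples, b the target for real samples, and c the value the generator wants the discriminator to assign to its samples. The defaults below use the simple 0/1 coding as an illustrative assumption. The link to the Pearson χ² divergence is derived in the paper for labels satisfying b - c = 1 and b - a = 2 (for example a = -1, b = 1, c = 0), so the defaults here are not that particular setting.

```python
def lsgan_discriminator_loss(d_real, d_fake, a=0.0, b=1.0):
    """Push scores of real samples towards b and of generated samples towards a."""
    real_term = sum((d - b) ** 2 for d in d_real) / len(d_real)
    fake_term = sum((d - a) ** 2 for d in d_fake) / len(d_fake)
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake, c=1.0):
    """Push scores of generated samples towards c."""
    return 0.5 * sum((d - c) ** 2 for d in d_fake) / len(d_fake)


# Hypothetical raw discriminator outputs (no sigmoid is applied in LSGAN).
d_real = [0.9, 1.1]
d_fake = [0.2, -0.1]
print("discriminator loss:", lsgan_discriminator_loss(d_real, d_fake))
print("generator loss:", lsgan_generator_loss(d_fake))
```

Unlike cross-entropy, the squared error keeps penalising samples that are classified correctly but lie far from the target value, which is the usual argument for why this loss gives the generator more useful gradients.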

Parting Thoughts

These are some of the GAN architectures in use today, and research and development in this field is ongoing. If you want to read more AI, ML and DL blogs like this one, visit the website of E2E Networks.

