The model follows the structure of a multiclass PatchGAN.
You will give the model an image representing a certain category. Then, the model will work its magic to create a new image based on what you provided, tweaking it to better match the features of that category.
The "magic" is learned during the model's training, which uses a multiclass version of a PatchGAN discriminator. The discriminator takes the input image and the expected output, concatenates them, and splits the resulting image into patches. It then examines each patch, labeling it "True" if it believes the patch is not AI-generated or "False" if it thinks it is AI-generated. It also assigns each patch a class label.
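To make the patch idea concrete, here is a minimal NumPy sketch that takes the description literally: it concatenates two images along the channel axis and cuts the result into non-overlapping patches. The image size (64x64), the patch size (16) and the helper name `split_into_patches` are assumptions made for illustration, not values taken from this model. Convolutional PatchGAN discriminators usually get the same effect implicitly, because each output unit only sees a limited receptive field of the input.

```python
import numpy as np


def split_into_patches(image, patch_size):
    """Split an (H, W, C) array into a grid of non-overlapping patches.

    Returns an array of shape (rows, cols, patch_size, patch_size, C).
    Hypothetical helper, used only to illustrate the idea.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image size must be a multiple of patch_size")
    rows, cols = h // patch_size, w // patch_size
    # Break both spatial axes into (grid index, offset inside the patch).
    grid = image.reshape(rows, patch_size, cols, patch_size, c)
    # Bring the two grid indices to the front.
    return grid.transpose(0, 2, 1, 3, 4)


# Assumed sizes: 64x64 RGB images and 16x16 patches.
rng = np.random.default_rng(0)
input_image = rng.random((64, 64, 3))
expected_output = rng.random((64, 64, 3))

# Stack the two images along the channel axis: 3 + 3 = 6 channels.
pair = np.concatenate([input_image, expected_output], axis=-1)

# A 4x4 grid of patches, each 16x16 pixels with 6 channels.
patches = split_into_patches(pair, patch_size=16)
print(pair.shape, patches.shape)
```

In this sketch the discriminator would produce one real/fake decision and one class label for each of the 16 grid cells.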
The model has two main parts: the Generator and the Discriminator. The Generator takes an input image and its corresponding class label, then attempts to create an image that looks like a Lego® Star Wars minifigure.
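How the class label reaches the Generator is not spelled out here. One common approach in conditional image models is to one-hot encode the label and append it to the image as extra constant channels. The sketch below shows that approach under assumed values (64x64 RGB input, 5 classes); the function name `attach_class_label` is hypothetical and the actual Generator may condition on the label differently.

```python
import numpy as np


def attach_class_label(image, label, num_classes):
    """Append a one-hot class label to an (H, W, C) image as extra channels.

    Hypothetical helper: every pixel gets the same one-hot vector, so the
    result has shape (H, W, C + num_classes).
    """
    h, w, _ = image.shape
    one_hot = np.zeros(num_classes, dtype=image.dtype)
    one_hot[label] = 1.0
    # Repeat the one-hot vector at every pixel position.
    label_planes = np.broadcast_to(one_hot, (h, w, num_classes))
    return np.concatenate([image, label_planes], axis=-1)


# Assumed sizes: a 64x64 RGB image and 5 possible classes.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
conditioned = attach_class_label(image, label=2, num_classes=5)
print(conditioned.shape)
```

The channel for the chosen class is filled with ones and the other label channels with zeros, so the network can read the class at every pixel.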
The Discriminator, in turn, checks pairs of images: either the original image paired with the generated one, or the original image paired with the expected one. It decides whether the concatenation of the two images is real ("True") or AI-generated ("False"), and it assigns a class label to each patch.
Together, these two parts make up the whole model. When you give it a class label and an image, the Generator makes a new image. Then, both this new image and the original one go to the Discriminator. The Discriminator checks each patch of the combined images and reports which class it thinks the patch belongs to and whether it is AI-generated or not.
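As a rough illustration of how the Discriminator's per-patch outputs could be read, the sketch below assumes it returns two arrays: a grid of "real" probabilities and a grid of class probabilities. The array shapes, the 0.5 threshold, the example numbers and the function name `decode_patch_outputs` are all assumptions for illustration and may not match the actual model.

```python
import numpy as np


def decode_patch_outputs(real_prob, class_prob, threshold=0.5):
    """Turn per-patch probabilities into per-patch decisions.

    real_prob:  (rows, cols) probability that each patch is not AI-generated.
    class_prob: (rows, cols, num_classes) class probabilities for each patch.
    Hypothetical helper; the threshold is an assumed value.
    """
    is_real = real_prob >= threshold          # "True" / "False" for each patch
    patch_class = class_prob.argmax(axis=-1)  # most likely class for each patch
    return is_real, patch_class


# Made-up outputs for a 2x2 grid of patches and 3 classes.
real_prob = np.array([[0.9, 0.2],
                      [0.6, 0.4]])
class_prob = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.3, 0.3, 0.4], [0.6, 0.3, 0.1]],
])

is_real, patch_class = decode_patch_outputs(real_prob, class_prob)
print(is_real)
print(patch_class)
```

Each grid cell gets its own verdict and its own class, so the Discriminator's answer is a small map over the image rather than a single value.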
The internal structure of the Generator is as follows:
The internal structure of the Discriminator is as follows:
The internal structure of the Final Model is as follows: