AAI_2025_Capstone_Chronicles_Combined

model than the CNN model, and highlights how the ViT does not depend much on the noise produced by the generative process to decide whether the image is Real or Fake, and may instead rely on the overall image structure. This experiment showed the strong capability of CNNs to detect “raw” generated fake images, but it also highlighted how easily this method can be exploited by simply applying noise or blur to the images being classified. It also demonstrated that ViTs have great potential for Fake image detection, although they too may be deceived by applying different kinds of distortion to the images. Still, the fact that the two architectures are vulnerable to different types of image modification opens the door to finding some way of merging their strengths. One possibility is to build a model ensemble that, based on some characteristics of the image, such as IQA measurements, assigns different weights to the predictions made by both models before making a decision. Ham et al. (2024) propose a similar solution using model ensembles for generic image forgery detection. Another option would be to build a hybrid architecture, like the one Soudy et al. (2024) proposed. Lastly, the training dataset could be augmented with different kinds of image artifacts to improve the accuracy of the models in those scenarios, but I would expect this measure to possibly lower the overall accuracy of the models.
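To make the ensemble idea more concrete, the following is a minimal sketch of how an IQA-driven weighting could work. It is not an implementation from this study or from Ham et al. (2024). The function name `iqa_weighted_ensemble`, the `quality` score normalized to [0, 1], and the linear weighting rule are all assumptions made for illustration. The sketch assumes that a high quality score means little noise or blur, in which case the CNN prediction is trusted more, and that a low score shifts the weight toward the ViT.

```python
from dataclasses import dataclass


@dataclass
class EnsembleDecision:
    p_fake: float  # weighted probability that the image is Fake
    label: str     # "Fake" or "Real"
    w_cnn: float   # weight that was given to the CNN prediction


def iqa_weighted_ensemble(p_fake_cnn, p_fake_vit, quality, threshold=0.5):
    """Blend two P(Fake) estimates using a quality score in [0, 1].

    Hypothetical rule (an assumption, not a result of the experiment):
    the CNN weight equals the quality score and the ViT weight is its
    complement, so degraded images lean on the ViT prediction.
    """
    inputs = (
        ("p_fake_cnn", p_fake_cnn),
        ("p_fake_vit", p_fake_vit),
        ("quality", quality),
    )
    for name, value in inputs:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    w_cnn = quality
    w_vit = 1.0 - quality
    p_fake = w_cnn * p_fake_cnn + w_vit * p_fake_vit
    label = "Fake" if p_fake >= threshold else "Real"
    return EnsembleDecision(p_fake=p_fake, label=label, w_cnn=w_cnn)


# Example: a heavily blurred image (low quality score) where the CNN and
# the ViT disagree; the decision follows the ViT more closely.
decision = iqa_weighted_ensemble(p_fake_cnn=0.9, p_fake_vit=0.2, quality=0.1)
print(decision.label, round(decision.p_fake, 3))
```

In practice the weighting rule would probably need to be learned from validation data rather than fixed, since the relationship between a given IQA metric and each model's reliability is not established here.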

