You have a dataset of face images at 128×128 resolution; some are severely noisy (grainy camera shots). You want to classify each image into one of five expressions: happy, sad, angry, surprised, or neutral. You decide to build three components:

1. an autoencoder (AE) to denoise the images;
2. a CNN that classifies the AE's output;
3. a GAN for data augmentation, generating extra images in each expression category.

After some early success, you suspect domain mismatch and overfitting. Let's see what goes wrong.

You notice that many final images lose fine expression cues, such as subtle eyebrow changes, once the AE cleans them, and the CNN's accuracy on "angry" and "sad" is low. What is the most likely conceptual reason?
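To make the symptom concrete, here is a minimal NumPy sketch (not the pipeline above; a toy stand-in) in which a box blur plays the role of a reconstruction-trained denoiser. A thin bright line acts as a fine "eyebrow" cue: smoothing suppresses the noise, but it also erodes exactly the high-frequency detail that distinguishes subtle expressions.

```python
import numpy as np

def box_blur(img, k=3):
    """Toy stand-in for an AE denoiser: a k-by-k box blur.
    Like an MSE-trained autoencoder, it suppresses high-frequency content."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def detail_energy(x):
    """Sum of absolute vertical differences: a crude measure of fine detail."""
    return float(np.abs(np.diff(x, axis=0)).sum())

rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[15, :] = 1.0                      # a thin "eyebrow" line: a fine, high-frequency cue
noisy = img + rng.normal(0, 0.1, img.shape)

denoised = box_blur(noisy)

# The blur removes noise, but the fine cue's detail energy drops along with it.
print(detail_energy(noisy), detail_energy(denoised))
```

The point of the toy is that the denoiser's objective (here, smoothing; in the real pipeline, pixel-wise reconstruction) is agnostic to which high-frequency details are noise and which are expression cues, so both are attenuated together.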