AAI_2025_Capstone_Chronicles_Combined
Abstract
Data collection and storage in healthcare present significant challenges due to high costs
and resource limitations. Many hospitals struggle to gather and maintain large datasets
necessary for training machine learning models that could otherwise enhance diagnostic
accuracy. Additionally, healthcare providers often face overwhelming workloads, complex cases,
and limited resources, increasing the potential for diagnostic errors. This project addresses
these issues by employing Generative Adversarial Networks (GANs) to generate synthetic X-ray
images of patients with pneumonia. The synthetic images enrich the dataset used to train
a Convolutional Neural Network (CNN) classifier that detects pneumonia from X-ray samples. The
primary objective is to increase data availability by generating realistic synthetic images.
With this supplemented dataset, diagnostic models such as CNN classifiers are expected to
achieve higher accuracy, ultimately improving diagnostic precision.
The dataset utilized is the Labeled Optical Coherence Tomography (OCT) and Chest X-ray
Images for Classification from Kermany et al. (2018), publicly available under a Creative
Commons Attribution 4.0 International (CC BY 4.0) license. It contains 5,863 images labeled as
either pneumonia or normal, originating from retrospective cohorts of pediatric patients aged
one to five years at Guangzhou Women and Children's Medical Center, where chest X-rays were
collected as part of routine clinical care (Kermany, Zhang, & Goldbaum, 2018). In a live system,
data would be sourced from hospital imaging databases, with synthetic augmentation
continuously refining the dataset during training.
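
The augmentation step described above can be illustrated with a minimal sketch. It is not the project's implementation: the function name `augment_with_synthetic`, the `synthetic_fraction` parameter, and the stand-in `generate` callable (which in practice would wrap a trained GAN generator) are all assumptions made for illustration. The sketch only shows how synthetic pneumonia samples might be mixed into a real training set at a chosen proportion.

```python
import random
from typing import Callable, List, Tuple

# (image identifier, label); label 1 = pneumonia, 0 = normal (assumed encoding)
Sample = Tuple[str, int]


def augment_with_synthetic(
    real: List[Sample],
    generate: Callable[[int], str],
    synthetic_fraction: float = 0.3,
    seed: int = 0,
) -> List[Sample]:
    """Combine real samples with synthetic pneumonia samples.

    `generate` is a hypothetical stand-in for a trained GAN generator; it
    receives an index and returns an image identifier. Synthetic samples
    make up roughly `synthetic_fraction` of the combined set.
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")

    # Solve n_synth / (n_real + n_synth) = synthetic_fraction for n_synth.
    n_synth = round(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))

    # Every synthetic image is labeled pneumonia, matching the generator's target class.
    synthetic = [(generate(i), 1) for i in range(n_synth)]

    combined = list(real) + synthetic
    random.Random(seed).shuffle(combined)  # seeded shuffle for reproducibility
    return combined


# Usage: 70 placeholder real samples, alternating labels.
real_samples = [(f"real_{i}.jpeg", i % 2) for i in range(70)]
train_set = augment_with_synthetic(real_samples, lambda i: f"synthetic_{i}.png")
```

Fixing the synthetic share explicitly, rather than generating an arbitrary number of images, makes it easier to study how classifier accuracy changes as the proportion of synthetic data varies.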