Improving CNN Test Accuracy and Generalization with Targeted Optimization Training:

Data Augmentation, Dropout, L2 Weight Decay, and Learning Rate Scheduling

Authors

  • Jiseok Ham, Saint Johnsbury Academy Jeju

DOI:

https://doi.org/10.65397/rc.v1i01.14

Keywords:

Optimization, Image Classification, Convolutional Neural Networks

Abstract

This study evaluates whether targeted optimization (early data augmentation, dropout, L2 kernel regularization, and exponential learning-rate decay) improves the accuracy and generalization of a convolutional neural network (CNN) on a three-class image dataset (450 training images of backpacks, erasers, and pens, plus 15 test images). The optimized CNN reached 100% test accuracy (15/15), outperforming the unoptimized CNN as well as traditional models tested under the same conditions (Random Forest 46.67%, Gradient Boosting 53.33%). During training, the unoptimized model's validation loss bottomed out early and then rose erratically, a clear sign of overfitting, whereas the optimized model maintained a stable validation curve and a significantly smaller train-validation gap. The added optimizations moderately increased the CNN's runtime (optimized CNN 399 seconds, unoptimized CNN 355 seconds; for comparison, Random Forest 10 seconds, Gradient Boosting 1,946 seconds). All models failed a simple unknown-class detection test based on confidence thresholds, mainly due to the small dataset size. These results indicate that targeted optimization can notably improve CNN test accuracy and generalization, albeit with minor drawbacks.
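As a concrete illustration of the optimization stack named above, the sketch below shows how early data augmentation, dropout, L2 kernel regularization, and exponential learning-rate decay might be combined in TensorFlow/Keras. The layer sizes, augmentation ranges, dropout rate, L2 strength, and decay schedule are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal Keras sketch of the optimization stack described in the abstract.
# All hyperparameter values here are assumptions, not the paper's settings.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Early data augmentation: applied at the front of the model
augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

model = tf.keras.Sequential([
    layers.Input(shape=(128, 128, 3)),
    augment,
    layers.Conv2D(32, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),  # L2 weight decay
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),                                      # dropout
    layers.Dense(3, activation="softmax"),                    # 3 classes
])

# Exponential learning-rate decay schedule
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3, decay_steps=1000, decay_rate=0.9)

model.compile(optimizer=tf.keras.optimizers.Adam(lr_schedule),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The confidence-threshold unknown-class test mentioned above could be implemented along the lines below; the threshold value of 0.8 and the helper name predict_with_rejection are hypothetical, not taken from the paper.

```python
# Hypothetical confidence-threshold rejection of unknown-class inputs.
import numpy as np

def predict_with_rejection(model, images, threshold=0.8):
    probs = model.predict(images)            # softmax class probabilities
    labels = np.argmax(probs, axis=1)        # most likely class per image
    confident = np.max(probs, axis=1) >= threshold
    # Label -1 marks images rejected as belonging to an unknown class
    return np.where(confident, labels, -1)
```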

Published

2025-12-29