Modified MNIST [Kaggle]

This was a competition hosted on Kaggle and was a miniproject for the COMP 551: Applied Machine Learning Course. We analyze different Machine Learning models to process a modified version of the MNIST dataset and develop a supervised classification model that can predict the number with the largest numeric value that is present in an Image.

We analyze Images from a modified version of the MNIST dataset (Yann Le Cunn, 2001) . MNIST is a dataset that contains handwritten numeric digits from 0-9 and the goal is to classify which digit is present in an image. The given dataset contains 50,000 modified MNIST images.The images are grayscale images of size 128*128. Each image contains three MNIST style randomly sampled numbers on custom grayscale backgrounds each at various positions and orientations in the image. The task was to train a model in order to identify the number with the highest numerical value in the image.

We experimented numerous models with different configurations for this task. The models chosen were primarily pretrained complex neural network models, such as ResNets, VGGNets and EfficientNets . After fine-tuning the best performing models’ hyper-parameters, to further boost the classification accuracy, we used various data augmentation techniques, including Affine Transformation Mappings, Scale-Space blurring, Contrast changes and Perspective transforms. By doing so, we were able to gain a higher accuracy on the test set, as compared to before data augmentation.

The final fine-tuned model was able to achieve an accuracy of 99% on the validation data, and an accuracy of 99.166% on the test data in the public leaderboard of the competition. We finished 2nd and 4th out of 105 teams(Group 30) on the public and the private leaderboards of the competition respectively.

comments powered by Disqus

Related