This is coursework for the Deep Learning class of my Data Science master's degree. The overall goal is to train two deep neural networks that take 100x100 pixel images with a cell nucleus in the centre and classify each nucleus into one of the following types, which are shown in the figures above:
- Normal epithelial cell nuclei with label 0.
- Cancer epithelial cell nuclei with label 1.
- Muscle cell nuclei with label 2.
- Immune leukocyte cell nuclei with label 3.
This coursework was implemented in a Kaggle notebook with the GPU P100 configuration. Notable libraries used include `torch`, `torchvision`, `sklearn` and more. The overall workflow, as demonstrated in the notebook, is as follows:
- data preprocessing - dealing with class imbalance, defining a custom dataset class, transforming the data, loading the data into a dataloader
- modeling - building two CNN models (a small model built from scratch and a modified ResNet18), defining the training function and running the training
- interpreting the models - loss and accuracy curves, confusion matrix, Captum
- hyperparameter tuning - using the Ray library to tune the learning rate, weight decay and momentum, defining a new training function
- final model - brief evaluation, generating predictions using the final model
- future enhancements - suggestions include data augmentation, early stopping, the Ray library, a closer look at model1
- `train.zip` - zip file containing the training images
- `test.zip` - zip file containing the test images
- `train.csv` - csv file containing the training image filenames and ground truth labels (0, 1, 2, 3)
- `example.csv` - an example submission file in the correct format

- `Filename` - filename of an image file.
- `Label` - the type of cell nucleus in the middle of the image.
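
The file and column descriptions above suggest how the labels might be read before building a custom dataset class. The sketch below assumes that `train.csv` has a header row with exactly the columns `Filename` and `Label`. The helper `read_labels`, the `LABEL_NAMES` mapping and the sample filenames are hypothetical and are not taken from the notebook.

```python
import csv
import io

# Label meanings as listed in this README.
LABEL_NAMES = {
    0: "normal epithelial",
    1: "cancer epithelial",
    2: "muscle",
    3: "immune leukocyte",
}


def read_labels(csv_file):
    """Parse an open CSV file into a list of (filename, label) pairs.

    Assumption: the file has a header row with 'Filename' and 'Label' columns.
    """
    samples = []
    for row in csv.DictReader(csv_file):
        label = int(row["Label"])
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label {label} for {row['Filename']}")
        samples.append((row["Filename"], label))
    return samples


# In-memory stand-in for train.csv; the filenames are made up for illustration.
sample_csv = io.StringIO("Filename,Label\n1.png,0\n2.png,3\n")
samples = read_labels(sample_csv)
print(samples)
```

A custom `torch.utils.data.Dataset` could store a list like this and, in `__getitem__`, open the image for a given filename and return it together with its integer label.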