Chapter 7: Questionnaire – Ismail's Personal Blog

Q1:
What is the difference between ImageNet and Imagenette? When is it better to experiment on one versus the other?
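* ImageNet is a dataset with 1.3 million images and 1,000 categories, while Imagenette is a small subset of ImageNet with 10 classes.
* For studying, developing ideas, and prototyping, it is better to use a small dataset such as Imagenette, because experiments run much faster. The full ImageNet is better suited to checking that an idea still works at scale.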

Q2:
What is normalization?
* Normalization rescales the data so that its mean is close to 0 and its standard deviation is close to 1 (ideally mean == 0 and std == 1), by subtracting the mean and dividing by the standard deviation.
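As a minimal sketch of the idea, the snippet below normalizes a short list of pixel values with the Python standard library only. The `normalize` function and the sample values are illustrative assumptions, not fastai code; in practice this is done per channel over the whole dataset.

```python
from statistics import fmean, pstdev

def normalize(values):
    """Shift and scale `values` so that mean is ~0 and std is ~1."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation; must be non-zero
    return [(v - mean) / std for v in values]

# Hypothetical pixel intensities in the 0-255 range
pixels = [0.0, 64.0, 128.0, 192.0, 255.0]
normalized = normalize(pixels)

# Mean should be very close to 0 and std very close to 1
print(fmean(normalized), pstdev(normalized))
```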

Q3:
Why didn’t we have to care about normalization when using a pretrained model?
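* When a pretrained model is used through vision_learner, fastai automatically adds the proper normalization (using the statistics the model was pretrained with), so we don't have to set it up ourselves.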

Q4:
What is progressive resizing?
* Progressive resizing means training with small images in the early epochs, then increasing the image size a little and fine-tuning the model for more epochs, and repeating this process until we reach the final image size (for example, the original size of the images in the dataset).
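The loop described above can be sketched roughly as follows. This is not fastai code: `progressive_sizes`, `train_progressively`, and the `train_fn` callback are hypothetical names, and the linear size schedule is just one possible choice. In a real project, `train_fn` would rebuild the data loaders at the given size and train the model for a few epochs.

```python
def progressive_sizes(start, final, steps):
    """Return `steps` image sizes growing linearly from `start` to `final`."""
    if steps < 2:
        return [final]
    return [round(start + (final - start) * i / (steps - 1)) for i in range(steps)]

def train_progressively(sizes, epochs_per_size, train_fn):
    """Call `train_fn(size, epochs)` once per image size, smallest first."""
    for size in sizes:
        train_fn(size, epochs_per_size)

# Stand-in for real training: just record what would be trained
log = []
sizes = progressive_sizes(128, 224, 3)
train_progressively(sizes, 2, lambda size, epochs: log.append((size, epochs)))
print(log)
```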

Q5:
What is test time augmentation? How do you use it in fastai?
* By default, the validation set uses a center crop of each image, which can lose information near the edges. Test time augmentation (TTA) addresses this by creating several augmented versions of each image (for example, crops from different areas), computing predictions for all of them, and then taking the average (or the maximum). In fastai:
preds,targs = learn.tta()
accuracy(preds, targs).item()
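To make the averaging step concrete, here is a toy sketch in plain Python. It does not reproduce what learn.tta() does internally: the `tta_predict` function, the one-dimensional "image", the fake model, and the crop functions are all assumptions made for illustration.

```python
def tta_predict(predict, image, augmentations, use_max=False):
    """Combine the predictions for several augmented versions of `image`."""
    all_preds = [predict(aug(image)) for aug in augmentations]
    if use_max:
        combine = max
    else:
        combine = lambda col: sum(col) / len(col)
    # zip(*all_preds) groups the predictions class by class
    return [combine(col) for col in zip(*all_preds)]

# Toy "model": probability of class 0 is the mean pixel value
def toy_predict(image):
    p = sum(image) / len(image)
    return [p, 1 - p]

# Toy "augmentations": the full image plus a left and a right crop
augmentations = [
    lambda img: img,
    lambda img: img[: len(img) // 2],
    lambda img: img[len(img) // 2 :],
]

image = [0.0, 0.5, 1.0, 0.5]
avg_preds = tta_predict(toy_predict, image, augmentations)
max_preds = tta_predict(toy_predict, image, augmentations, use_max=True)
print(avg_preds, max_preds)
```

The model is called once per augmented version, which is why the cost grows with the number of augmentations.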

Q6:
Is using TTA at inference slower or faster than regular inference? Why?
* It is slower than regular inference, because the model computes predictions for each image more than once (once per augmented version).

Q7:
What is Mixup? How do you use it in fastai?
* It’s a data augmentation method that takes two images and mixes them together as a weighted average, applying the same weights to their labels. In fastai, Mixup is used as a callback passed to the Learner: cbs=MixUp()
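The mixing itself can be sketched in a few lines of plain Python. This is an illustration rather than the fastai implementation: `mixup` and `sample_lam` are hypothetical helpers, the images are flattened to short lists, and the `alpha` value for the Beta distribution is an assumed setting.

```python
import random

def mixup(x1, y1, x2, y2, lam):
    """Blend two inputs and their one-hot labels with weight `lam` in [0, 1]."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y

def sample_lam(alpha=0.4):
    """Draw a mixing weight from Beta(alpha, alpha); alpha is an assumed value."""
    return random.betavariate(alpha, alpha)

# Two tiny "images" with one-hot labels for a 2-class problem
image_a, label_a = [1.0, 0.0, 1.0], [1.0, 0.0]
image_b, label_b = [0.0, 1.0, 1.0], [0.0, 1.0]

# Fixed weight so the result is easy to check by hand
mixed_x, mixed_y = mixup(image_a, label_a, image_b, label_b, lam=0.25)
print(mixed_x, mixed_y)
```

With lam=0.25 the mixed label is 25% class 0 and 75% class 1, so the model is trained to predict the same proportions it sees in the blended image.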

Q10:
What is the idea behind label smoothing?
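* It’s a technique that replaces the hard 0 and 1 values of the one-hot-encoded targets with float values slightly above 0 and slightly below 1. This discourages the model from becoming overconfident, which tends to reduce overfitting and often improves performance.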

Q11:
When using label smoothing with five categories, what is the target associated with the index 1?

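* With a smoothing parameter of 0.1 (the value assumed here) and N = 5 categories, the target at index 1 is 1 - 0.1 + 0.1/5 = 0.92, and every other target is 0.1/5 = 0.02:

labels = [0, 1, 0, 0, 0]  # one-hot target: the true class is at index 1
param = 0.1               # smoothing parameter (epsilon)

def label_smoothing(labels, param):
    """Return the smoothed targets for a one-hot encoded label list."""
    n = len(labels)
    new_labels = []
    for label in labels:
        if label == 1:
            # True class: 1 - epsilon + epsilon / N
            new_labels.append(1 - param + param / n)
        else:
            # Every other class: epsilon / N
            new_labels.append(param / n)
    return new_labels

label_smoothing(labels, param)

[0.02, 0.92, 0.02, 0.02, 0.02]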

Q12:
What is the first step to take when you want to prototype quick experiments on a new dataset?
* First run the prototype experiment as is; if it takes more than a couple of minutes, consider working on a smaller subset of that dataset so that experiments stay quick.