Chapter 6: Questionnaire

Fastai
Pytorch
Pandas
Deep Learning
Author

Ismail TG

Published

October 29, 2022

Q1:
How could multi-label classification improve the usability of the bear classifier?

Multi-label classification lets a model in production return zero predictions when the user uploads an image that doesn't contain any of the classes the model was trained to predict. With the single-label bear classifier, if we upload an image that doesn't contain any bear, the model still predicts a class.

Q2:
How do we encode the dependent variable in a multi-label classification problem?

In multi-label classification we use one-hot encoding: each target is a list of zeros and ones with one position per class, where the ones mark the labels present in that particular image.
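As a minimal sketch of that encoding, assuming a three-class vocabulary chosen only for illustration (the `one_hot` helper and the `vocab` list are hypothetical names, not fastai API):

```python
def one_hot(labels, vocab):
    """Return a list of 0/1 flags, one per class in vocab."""
    present = set(labels)
    return [1 if c in present else 0 for c in vocab]

# Hypothetical vocabulary for a bear classifier.
vocab = ["black", "grizzly", "teddy"]

# An image with two labels gets two ones.
encoded_two = one_hot(["grizzly", "teddy"], vocab)
# An image with no known class gets all zeros.
encoded_none = one_hot([], vocab)

print(encoded_two)   # [0, 1, 1]
print(encoded_none)  # [0, 0, 0]
```

The all-zeros case is what allows a multi-label model to say "none of the above".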

Q3:
How do you access the rows and columns of a DataFrame as if it was a matrix?

We can use `.iloc` (integer positions) or `.loc` (row and column labels) to access any row or column of a DataFrame.
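A short sketch of both accessors on a small DataFrame (the column names and values here are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3], "col2": [4, 5, 6]})

first_row = df.iloc[0]        # row at integer position 0
first_col = df.iloc[:, 0]     # every row, column at integer position 0
cell = df.iloc[1, 1]          # row position 1, column position 1
by_label = df.loc[2, "col1"]  # row labelled 2, column named "col1"

print(cell, by_label)
```

With the default integer index, row labels and row positions coincide, so `.loc[2]` and `.iloc[2]` return the same row here. That stops being true once the index is reordered or replaced.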

Q4:
How do you get a column by name from a DataFrame?

import pandas as pd
dic = {'col1':[1, 2, 3], 'col2':[4, 5, 6]}
df = pd.DataFrame(dic)
df
   col1  col2
0     1     4
1     2     5
2     3     6
# access column by name:
df.col1
0    1
1    2
2    3
Name: col1, dtype: int64
# or this way
df['col2']
0    4
1    5
2    6
Name: col2, dtype: int64

Q5:
What is the difference between a Dataset and DataLoader?

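A Dataset can be any collection that returns a tuple of independent and dependent variables when we index into it (in our case it is built from a DataFrame).
A DataLoader is an iterator that provides a stream of mini-batches, optionally shuffling the data points. Each mini-batch is a tuple holding a batch of independent variables and a batch of dependent variables.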
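The distinction can be sketched in plain Python. This is a simplified stand-in, not PyTorch's `DataLoader`; the `mini_batches` function and the toy `dataset` are hypothetical:

```python
import random

# A Dataset stand-in: indexing returns an (independent, dependent) tuple.
dataset = [(x, 2 * x) for x in range(10)]

def mini_batches(ds, batch_size, shuffle=False, seed=0):
    """Yield (xs, ys) mini-batches from an indexable dataset."""
    idxs = list(range(len(ds)))
    if shuffle:
        # Shuffle the order in which items are visited, not the dataset itself.
        random.Random(seed).shuffle(idxs)
    for start in range(0, len(idxs), batch_size):
        batch = [ds[i] for i in idxs[start:start + batch_size]]
        xs, ys = zip(*batch)
        yield list(xs), list(ys)

batches = list(mini_batches(dataset, batch_size=4))
print(dataset[3])   # (3, 6)
print(batches[0])   # ([0, 1, 2, 3], [0, 2, 4, 6])
```

The dataset only knows how to return one item; batching and shuffling live in the iterator that wraps it.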

Q6:
What does a Datasets object normally contain?

A Datasets object contains a training Dataset and a validation Dataset.

Q7:
What does a DataLoaders object normally contain?

A DataLoaders object contains a training DataLoader and a validation DataLoader.

Q8:
What does lambda do in Python?

It provides a shortcut for declaring small anonymous functions inline.
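For example, the two definitions below behave the same, and a lambda is convenient when a function is needed only once, such as a sort key (the names are illustrative):

```python
# A lambda and the equivalent named function.
square = lambda x: x * x

def square_def(x):
    return x * x

# A lambda used in place as a sort key.
words = ["grizzly", "teddy", "black"]
by_length = sorted(words, key=lambda w: len(w))

print(square(4), square_def(4))  # 16 16
print(by_length)                 # ['teddy', 'black', 'grizzly']
```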

Q9:
What are the methods to customize how the independent and dependent variables are created with the data block API?

We can write functions that define how the independent and dependent variables are extracted from each item, then pass these functions to the `get_x` and `get_y` parameters of the data block.
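A minimal sketch of such functions, assuming each item is a row with a file name and a space-separated label string. The `path`, the `fname` and `labels` keys, and the sample row are assumptions for illustration, and a plain dict stands in for a DataFrame row:

```python
from pathlib import Path

path = Path("images")  # hypothetical image folder

def get_x(row):
    """Independent variable: the full path to the image file."""
    return path / row["fname"]

def get_y(row):
    """Dependent variable: the list of labels for the image."""
    return row["labels"].split(" ")

# Made-up row; in practice this would be a row of the DataFrame.
row = {"fname": "000005.jpg", "labels": "chair person"}

print(get_x(row))
print(get_y(row))  # ['chair', 'person']

# With fastai installed, these would be passed to the data block,
# for example DataBlock(get_x=get_x, get_y=get_y).
```

Named functions are used here rather than lambdas; lambdas work for quick experiments but can cause trouble if the data block later needs to be serialized.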

Q10:
Why is softmax not an appropriate output activation function when using a one hot encoded target?

Softmax forces the outputs to sum to 1, which pushes the model to pick exactly one class among the others. With a one-hot encoded target in multi-label classification, an image can have more than one class, or none at all.
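The difference shows up on a small example. This sketch uses hand-written `softmax` and `sigmoid` helpers and made-up logits, not the PyTorch functions:

```python
import math

def softmax(logits):
    """Exponentiate and normalize so the outputs sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Squash a single logit into (0, 1), independently of the others."""
    return 1 / (1 + math.exp(-z))

# Made-up logits: the model is confident about the first two classes.
logits = [3.0, 3.0, -4.0]
soft = softmax(logits)
sig = [sigmoid(z) for z in logits]

# Made-up logits: the model sees none of the classes.
none_logits = [-5.0, -5.0, -5.0]
soft_none = softmax(none_logits)
sig_none = [sigmoid(z) for z in none_logits]

print(soft, sig)
print(soft_none, sig_none)
```

With softmax the two confident classes have to share the probability mass, so neither reaches 0.5, and the "none" case still gets one third per class. With sigmoid each class is scored on its own, so two classes can both be high, or all can be low.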

Q12:
What is the difference between nn.BCELoss and nn.BCEWithLogitsLoss?

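nn.BCELoss does not apply a sigmoid, so it expects inputs that are already probabilities between 0 and 1. nn.BCEWithLogitsLoss applies the sigmoid internally, so it takes the raw activations (logits) directly.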
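A small check of that relationship, assuming PyTorch is installed (the logits and targets are made-up values):

```python
import torch
from torch import nn

# Made-up raw activations and multi-label targets for one image.
logits = torch.tensor([[2.0, -1.0, 0.5]])
targets = torch.tensor([[1.0, 0.0, 1.0]])

# BCEWithLogitsLoss takes the logits directly.
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# BCELoss needs the sigmoid applied first.
loss_plain = nn.BCELoss()(torch.sigmoid(logits), targets)

print(loss_with_logits.item(), loss_plain.item())
```

The two values agree up to floating-point error. The fused version is generally preferred because it is more numerically stable for large-magnitude logits.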

Q13:
When is it okay to tune a hyperparameter on the validation set?

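When plotting the metric against the hyperparameter gives a smooth curve. This indicates that the results change because of the hyperparameter itself, so we are not just picking a value that happens to fit the validation set by chance.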

Q14:
How is y_range implemented in fastai?

`y_range` defines the range of our targets. In fastai it is implemented using `sigmoid_range`, which applies a sigmoid to the activations and rescales the result to lie between the low and high bounds.
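The idea can be sketched for a single number as follows. This is a scalar re-implementation written for illustration under the hypothetical name `sigmoid_range_scalar`; fastai's `sigmoid_range` works on tensors:

```python
import math

def sigmoid_range_scalar(x, low, high):
    """Squash x into the open interval (low, high) with a scaled sigmoid."""
    return 1 / (1 + math.exp(-x)) * (high - low) + low

# An activation of 0 lands in the middle of the range.
mid = sigmoid_range_scalar(0.0, 0, 5.5)
# Large positive or negative activations approach, but never reach, the bounds.
near_high = sigmoid_range_scalar(20.0, 0, 5.5)
near_low = sigmoid_range_scalar(-20.0, 0, 5.5)

print(mid)  # 2.75
print(near_low, near_high)
```

Because the bounds are only approached asymptotically, it is common to set the upper bound slightly above the largest real target.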

Q16:
What is a regression problem? What loss function should you use for such a problem?

A regression problem is one where we have to predict a continuous value.
For this kind of problem the mean squared error (MSE) loss is usually used.
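As a small worked example of that loss, with a hand-written `mse` helper and made-up predictions and targets:

```python
def mse(preds, targets):
    """Mean of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

# Made-up values: two predictions are off by 0.5, one is exact.
preds = [2.5, 0.0, 2.0]
targets = [3.0, -0.5, 2.0]

loss = mse(preds, targets)
print(loss)  # (0.25 + 0.25 + 0.0) / 3
```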

Q17:
What do you need to do to make sure the fastai library applies the same data augmentation to your input images and your target point coordinates?

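The PointBlock class needs to be passed to the blocks parameter. This tells fastai that the targets are point coordinates, so the same augmentation is applied to them as to the input images.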