Q1:
Why do we first resize to a large size on the CPU, and then to a smaller size on the GPU?
____
* Resizing is done image by image on the CPU, while augmentation is done in batches on the GPU. This method is called presizing:
- First, crop the image and resize it to 460 by 460; this operation runs per image on the CPU.
- Then do the data augmentation in batches: take a random rotated crop of that 460x460 image and resize it to 224 by 224. All of these operations run at the batch level, which means on the GPU.
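A minimal sketch of this pattern, following the pet-breeds example from the book (the dataset path and regex label pattern are the ones used there):

```python
from fastai.vision.all import *

path = untar_data(URLs.PETS)/'images'
pets = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(seed=42),
    get_y=using_attr(RegexLabeller(r'(.+)_\d+.jpg$'), 'name'),
    item_tfms=Resize(460),                               # per image, on CPU
    batch_tfms=aug_transforms(size=224, min_scale=0.75)) # per batch, on GPU
dls = pets.dataloaders(path)
```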
Q2:
If you are not familiar with regular expressions, find a regular expression tutorial, and some problem sets, and complete them. Have a look on the book’s website for suggestions.
___
* Will write a series of blog posts about regular expressions.
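In the meantime, a tiny illustrative example (the filename pattern follows the pet-breeds naming used in the book):

```python
import re

fname = 'great_pyrenees_173.jpg'
# capture everything before the trailing _<digits>.jpg
print(re.findall(r'(.+)_\d+.jpg$', fname))  # ['great_pyrenees']
```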
Q3:
What are the two ways in which data is most commonly provided, for most deep learning datasets?
___
* Data is usually provided as:
- Individual files, such as images or text documents.
- Comma-separated values (CSV) files.
Q4:
Look up the documentation for L and try using a few of the new methods that it adds.
____
* Later
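In the meantime, a quick taste of fastcore's L on toy data:

```python
from fastcore.foundation import L

t = L(range(10))
print(t)                               # (#10) [0,1,2,3,4,5,6,7,8,9]
print(t.map(lambda x: x * 2))          # apply a function to each element
print(t.filter(lambda x: x % 2 == 0))  # keep only the even numbers
print(t[[1, 3, 5]])                    # index with a list of indices
```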
Q5:
Look up the documentation for the Python pathlib module and try using a few methods of the Path class.
___
* Later
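In the meantime, a few Path methods from the standard library (the paths are made up for illustration):

```python
from pathlib import Path

p = Path('images/great_pyrenees_173.jpg')
print(p.name, p.stem, p.suffix)   # great_pyrenees_173.jpg great_pyrenees_173 .jpg
print(p.parent)                   # images
print(p.exists())                 # does the file exist on disk?
q = Path('images') / 'subdir'     # the / operator joins paths
print(list(Path('.').iterdir()))  # contents of the current directory
```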
Q6:
Give two examples of ways that image transformations can degrade the quality of the data
____
- Interpolation estimates pixel values from pixel values that were themselves estimated, so the image loses a little quality with each transformation.
- Rotating an image by 45 degrees leaves empty space in the corners.
Q7:
What method does fastai provide to view the data in a DataLoaders?
___
* DataLoaders.show_batch, typically called as dls.show_batch() (see the sketch after Q8).
Q8:
What method does fastai provide to help you debug a DataBlock?
___
* DataBlock.summary (both methods are sketched below).
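A quick sketch of both, assuming the pets DataBlock, path, and dls from the presizing example in Q1:

```python
dls.show_batch(nrows=1, ncols=3)  # display a few transformed samples
pets.summary(path)                # step-by-step trace of the pipeline;
                                  # fails loudly at the first broken step
```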
Q9:
Should you hold off on training a model until you have thoroughly cleaned your data?
___
* No. It’s better to start building a model as soon as possible; we can even use the model itself as a data-cleaning tool.
Q10:
Which methods does fastai provide to see where the model is making its worst mistakes?
___
* ClassificationInterpretation.plot_confusion_matrix: displays which classes the model confuses most often.
* ClassificationInterpretation.plot_top_losses: displays the images with the highest loss values.
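A sketch, assuming a trained Learner named learn:

```python
from fastai.vision.all import *

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12, 12), dpi=60)
interp.plot_top_losses(5, nrows=1)  # the five highest-loss images
```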
Q11:
What are the two pieces that are combined into cross-entropy loss in PyTorch?
___
* The softmax function and negative log likelihood (NLL) loss.
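This decomposition can be checked directly in PyTorch (the activations and targets here are made up):

```python
import torch
import torch.nn.functional as F

acts = torch.tensor([[0.2, 4.0, -1.0], [1.5, 0.3, 0.2]])  # 2 items, 3 classes
targets = torch.tensor([1, 0])

# cross_entropy == log_softmax followed by nll_loss
a = F.cross_entropy(acts, targets)
b = F.nll_loss(F.log_softmax(acts, dim=1), targets)
print(torch.isclose(a, b))  # tensor(True)
```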
Q12:
What are the two properties of activations that softmax ensures? Why is this important?
____
* It makes sure all the activations are between 0 and 1 and add up to 1, so they can be interpreted as probabilities.
* Because of the exponential, it amplifies differences between activations, which helps the model pick one class. (See the worked example under Q14.)
Q13:
When might you want your activations to not have these two properties?
____
* When we have a multi-label classification problem, i.e. more than one label (or none) per image.
Q14:
Calculate the exp and softmax columns of the softmax example table in the book yourself (e.g. in a spreadsheet, with a calculator, or in a notebook).
___
Later
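In the meantime, a minimal worked version in a notebook, using made-up activations for a three-class problem:

```python
import torch

acts = torch.tensor([0.02, -2.49, 1.25])  # made-up activations, one per class
exps = acts.exp()
softmax = exps / exps.sum()
print(exps)
print(softmax)                     # each value is between 0 and 1...
print(softmax.sum())               # ...and they sum to 1
print(torch.softmax(acts, dim=0))  # same result from the built-in
```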
Q15:
Why can’t we use torch.where to create a loss function for datasets where our label can have more than two categories?
___
* torch.where(condition, a, b) can only select between two options, so it works as a loss only when the label has exactly two categories. With more classes, we need to index into the activations instead, which is what nll_loss does.
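For contrast, here is the two-category case where torch.where does work, the mnist_loss pattern from the book:

```python
import torch

def mnist_loss(predictions, targets):
    # picks 1-pred where the target is 1, pred where it is 0:
    # only two choices are possible
    return torch.where(targets == 1, 1 - predictions, predictions).mean()

preds = torch.tensor([0.9, 0.4, 0.2])
targets = torch.tensor([1, 1, 0])
print(mnist_loss(preds, targets))  # tensor(0.3000)
```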
Q16:
What is the value of log(-2)? Why?
____
* Undefined (for real numbers). The logarithm is the inverse of the exponential function: log(x) asks which power of e produces x. Since e raised to any real power is always positive, no power of e can produce -2.
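Python confirms this directly:

```python
import math

math.log(-2)  # raises ValueError: math domain error
```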
Q17:
What are two good rules of thumb for picking a learning rate from the learning rate finder?
___
* Pick a learning rate one order of magnitude (10x) smaller than the one where the minimum loss was reached.
* Use the learning rate at the last point where the loss was still clearly decreasing.
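A sketch, assuming a Learner named learn; in the version of fastai used by the book, lr_find returns these two suggestions directly:

```python
lr_min, lr_steep = learn.lr_find()
# lr_min is already the loss minimum divided by 10 (rule 1);
# lr_steep is the point of steepest descent, i.e. where the loss
# was still clearly decreasing (rule 2)
learn.fine_tune(2, base_lr=lr_steep)
```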
Q18:
What two steps does the fine_tune method do?
____
* Trains the randomly initialized added layers (the head) for one epoch, with the rest of the model frozen.
* Unfreezes all the layers and trains the whole model for the requested number of epochs.
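Roughly what fine_tune does, written out by hand (a simplified sketch; the real method also adjusts learning rates between the two phases):

```python
learn.fit_one_cycle(1)  # phase 1: train only the new head, body frozen
learn.unfreeze()        # make every layer trainable
learn.fit_one_cycle(4)  # phase 2: train the whole model for more epochs
```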
Q19:
In Jupyter notebook, how do you get the source code for a method or function?
____
* By putting ?? after the name of the method or function (e.g. learn.fine_tune??).
Q20:
What are discriminative learning rates?
____
* It’s a technique that lets us use a different learning rate for each part of the neural network. In a pretrained model, the earlier layers have already been trained for many epochs and capture general features, so their parameters don’t need to change much; the later layers, on the other hand, must adapt to the task at hand. That’s why it’s good to pick a smaller learning rate for the earlier layers and a bigger one for the later ones (see the sketch under Q21).
Q21:
How is a Python slice object interpreted when passed as a learning rate to fastai?
____
* The first value in the slice object sets the learning rate for the earliest layers.
* The second value sets the learning rate for the final layers.
* The layers in between get learning rates that are multiplicatively equidistant between those two values.
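A sketch combining Q20 and Q21, following the book's pet-breeds training (cnn_learner is named vision_learner in newer fastai versions):

```python
from fastai.vision.all import *

learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit_one_cycle(3, 3e-3)
learn.unfreeze()
# earliest layers train at 1e-6, the final layers at 1e-4, and the
# groups in between get multiplicatively equidistant rates:
learn.fit_one_cycle(12, lr_max=slice(1e-6, 1e-4))
```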