Update markdown instructions for Penguin classification exercise (1) #66
Conversation
" - Print ``data`` and recognise that ``load_penguins`` has returned the dataframe.\n", | ||
"- Analyse which features it might make sense to use in order to classify the species of the penguins.\n", | ||
" - You can print the column names using ``pd.DataFrame.keys()``\n", | ||
" - You can also obtain useful statical information on the dataset using ``pd.DataFrame.Series.describe()``" |
statistical
I think 'Consider' is probably fine here, but either is fine.
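For readers following along, the exploration described in this cell might look roughly like the sketch below. It assumes the data come from the `palmerpenguins` package (which provides `load_penguins`) and uses one column name purely for illustration; the exercise notebook may phrase this differently.

```python
from palmerpenguins import load_penguins  # assumed source of load_penguins

data = load_penguins()                      # returns a pandas DataFrame
print(data.head())                          # inspect the first few rows

print(data.keys())                          # column names, e.g. bill_length_mm, flipper_length_mm
print(data["bill_length_mm"].describe())    # summary statistics for a single column
```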
@@ -108,23 +108,25 @@
"source": [
"### Task 2: creating a ``torch.utils.data.Dataset``\n",
"\n",
"All PyTorch dataset objects are subclasses of the ``torch.utils.data.Dataset`` class. To make a custom dataset, create a class which inherits from the ``Dataset`` class, implement some methods (the Python magic (or dunder) methods ``__len__`` and ``__getitem__``) and supply some data.\n",
"To be able to use Pytorch functionalities, we need to make the dataset compatible with Pytorch. We do it using PyTorch's Dataset class called ``torch.utils.data.Dataset``. \n",
PyTorch
I still think we should keep the bit on `torch.utils.data.Dataset`. I really like the new comment, however. Perhaps this should be the first sentence:

"To be able to use Pytorch functionalities, we need to make the dataset compatible with Pytorch. We do it using PyTorch's Dataset class called ``torch.utils.data.Dataset``. \n",

Followed by:

"All PyTorch dataset objects are subclasses of the ``torch.utils.data.Dataset`` class. To make a custom dataset, create a class which inherits from the ``Dataset`` class, implement some methods (the Python magic (or dunder) methods ``__len__`` and ``__getitem__``) and supply some data.\n",
Sorry, I realise this is fine. It's just been restructured.
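To make that wording concrete, a minimal self-contained sketch of such a subclass is shown below. It is not the exercise's actual `PenguinDataset`; the class name, constructor arguments and data are made up for illustration.

```python
import numpy as np
import torch
from torch.utils.data import Dataset


class ToyPenguinDataset(Dataset):
    """Minimal sketch of a custom dataset; the exercise's class differs."""

    def __init__(self, features, targets):
        self.features = features   # array-like of shape (num_items, num_features)
        self.targets = targets     # array-like of integer class labels

    def __len__(self):
        # how many items the dataset holds
        return len(self.features)

    def __getitem__(self, idx):
        # return one (input, target) pair, converted to tensors
        x = torch.as_tensor(self.features[idx], dtype=torch.float32)
        y = torch.as_tensor(self.targets[idx], dtype=torch.long)
        return x, y


data_set = ToyPenguinDataset(np.random.rand(10, 4), np.random.randint(0, 3, 10))
print(len(data_set), data_set[0])
```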
Some minor comments.
These changes are helpful for understanding the material. I've made a few comments here and there.
- Your git profile configuration is a bit messed up. Can you please fix this?
- You've probably already done this, but when committing the solutions, restart the kernel, clear the outputs and run everything through once more.
- Make sure to restart the kernel and clear the outputs for the exercises before committing.
- I think adding some steps (as python comments) in task 9, task 10 and task 11 would make it easier to complete. See how I've done this in PR Simpler PenguinDataset #71 for the first set of tasks. Maybe we should open another PR for these changes?
Some other things worth mentioning either in the session itself or perhaps in a later PR:
- What are the implications of the batch size (i.e. too large and the model is prone to overfitting / too small and we get inefficient use of hardware)?
- Task 6: We could mention feature scaling / normalisation using batch norm.
- Optimiser (could include the lr here).
- What `model.train()` and `model.eval()` are doing. Just noticed there's a mention of model.train() and model.eval() already.
- Whether the train & valid routines lead to good code or not (repetition - often this is packaged in a training class / alternatively use PyTorch Lightning).
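On the `model.train()` / `model.eval()` point, a tiny illustrative snippet (with a made-up model, not taken from the exercise) showing what the switch affects:

```python
import torch
from torch import nn

# Hypothetical model, only to illustrate the mode switch.
model = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8), nn.ReLU(), nn.Linear(8, 3))

model.train()           # training mode: batch norm uses per-batch statistics, dropout is active
print(model.training)   # True

model.eval()            # evaluation mode: batch norm uses running (population) statistics
print(model.training)   # False
```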
" - We allow the model to learn directly from the training set—i.e. we fit the function to these data.\n", | ||
" - During training, we monitor the model's performance on the validation set in order to check how it's doing on unseen data. Normally, people use the validation performance to determine when to stop the training process.\n", | ||
"- Note: Here we create a training and validation set.\n", | ||
" - We allow the model to learn from the training setn i.e. we fit the function to these data.\n", |
setn
has this been fixed?
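On the training/validation split this cell describes, one simple way to produce the two splits from the raw dataframe is sketched below. This is an assumed approach for illustration only, not the exercise's actual code (which builds the split inside its dataset class).

```python
from palmerpenguins import load_penguins  # assumed source of load_penguins

data = load_penguins().dropna()

# Hold out roughly 20% of the rows as a validation set; the model never trains on these.
valid_df = data.sample(frac=0.2, random_state=42)
train_df = data.drop(valid_df.index)

print(len(train_df), len(valid_df))
```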
" - ``train``— A boolean variable determining if the model returns the training or validation split (``True`` for training).\n", | ||
" - ``x_tfms``— A ``Compose`` object with functions which will convert the raw input to a tensor. This argument is _optional_.\n", | ||
" - ``x_tfms``— A ``Compose`` object with functions which will convert the raw input to a tensor. This argument is _optional_. Remember Pytorch deals with tensors only.\n", |
How about: Recall that PyTorch deals with `torch.Tensor`s only.
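As a rough illustration of what an `x_tfms` pipeline does (assuming `Compose` behaves like `torchvision.transforms.Compose`; the callables and feature values below are made up):

```python
import pandas as pd
import torch
from torchvision.transforms import Compose  # assumption: a Compose like the one in the exercise

# Each step is a plain callable applied in order, ending in a float32 torch.Tensor,
# since PyTorch models operate on tensors only.
x_tfms = Compose([
    lambda row: row.to_numpy(dtype=float),                  # pandas row -> numpy array
    lambda arr: torch.as_tensor(arr, dtype=torch.float32),  # numpy array -> tensor
])

example_row = pd.Series([39.1, 18.7, 181.0, 3750.0])  # made-up feature values
print(x_tfms(example_row))   # tensor([  39.1000,   18.7000,  181.0000, 3750.0000])
```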
" - The number of items we supply at once is called the batch size.\n", | ||
" - The ``DataLoader`` can also randomly shuffle the data each epoch (when training).\n", | ||
" - It allows us to load different mini-batches in parallel, which can be very useful for larger datasets and images that can't all fit in memory at once.\n", | ||
" - The number of observations we pass at once is called the batch size.\n", |
I prefer 'items' and 'supply' here
@@ -323,19 +359,19 @@
"### Task 5: Creating ``DataLoaders``—and why\n",
"\n",
"- Once we have created a ``Dataset`` object, we wrap it in a ``DataLoader``.\n",
" - The ``DataLoader`` object allows us to put our inputs and targets in mini-batches, which makes for more efficient training.\n",
" - The ``DataLoader`` utility allows us to load and put inputs and targets in mini-batches, which makes iterating over the dataset during training more efficient.\n",
I think I prefer the old comment. It's slightly more succinct.
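Whichever wording is kept, the `DataLoader` usage itself is standard PyTorch; a small sketch with made-up data:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Illustrative dataset: 100 items, 4 features each, 3 classes (numbers are made up).
data_set = TensorDataset(torch.randn(100, 4), torch.randint(0, 3, (100,)))

# Wrap the dataset in a DataLoader: mini-batches of 16 items, reshuffled every epoch.
train_loader = DataLoader(data_set, batch_size=16, shuffle=True)

for batch_inputs, batch_targets in train_loader:
    print(batch_inputs.shape, batch_targets.shape)   # torch.Size([16, 4]) torch.Size([16])
    break
```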
@@ -495,14 +534,18 @@
"source": [
"### Task 8: Selecting an optimiser\n",
"\n",
"Optimiser updates the model's weights and biases on the basis of the gradient of the loss function w.r.t each weight and bias. The aim of weight updation is to reduce the loss. The weight updation goes on until the loss is minimised as much as possible. \n",
- The optimiser updates the model's weights...
- I wouldn't use the word 'updation', use 'the weight update' or weight update process/procedure.
- again, it's a little vague. "The weight update procedure proceeds until the loss is minimised subject to some stopping criterion?"
"While we talked about stochastic gradient descent in the slides, most people use the so-called [Adam optimiser](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html).\n", | ||
"\n", | ||
"You can think of it as a more complex and improved implementation of SGD." | ||
"You can think of it as a slightly more complex and improved implementation of SGD." |
I think I prefer the old comment.
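Whichever phrasing is kept, the optimiser setup follows the standard PyTorch pattern; a short sketch with a stand-in model (not the exercise's network):

```python
import torch
from torch import nn

model = nn.Linear(4, 3)   # stand-in model, for illustration only

# Adam, like SGD, updates each parameter using the gradient of the loss with respect to
# that parameter; the learning rate (lr) controls the size of each update step.
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative update step:
loss = model(torch.randn(8, 4)).sum()   # dummy "loss" just to produce gradients
loss.backward()                         # compute gradients of the loss w.r.t. the parameters
optimiser.step()                        # update the weights and biases to reduce the loss
optimiser.zero_grad()                   # reset gradients before the next step
```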
" - This activation function is good for classifcation when the result is one of ``A or B or C``.\n", | ||
" - It's bad if you even want to assign two classification to one images—say a photo of a dog _and_ a cat.\n", | ||
" - It turns the raw outputs, or logits, into \"psuedo probabilities\", and we take our prediction to be the most probable class.\n", | ||
"\n", | ||
"\n", | ||
"- The purpose of training loop is that for each epoch, all the mini-batches are sequentially passed through the model to make predictions, compute the loss, and update the model parameters and repeat it until we receive a satisfactory performance.\n", |
The purpose of the training loop
and update the model parameters. This is repeated until we receive a satisfactory performance.
This sentence is useful but it needs to be reworded.
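However the sentence ends up worded, the loop it describes follows the usual PyTorch pattern; a compact sketch with a stand-in model, data and loss (none of it taken from the exercise):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Linear(4, 3)                                     # stand-in model
loss_fn = nn.CrossEntropyLoss()                             # suits "A or B or C" classification
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 4), torch.randint(0, 3, (64,))), batch_size=16, shuffle=True
)

for epoch in range(5):                       # repeat until performance is satisfactory
    model.train()
    for inputs, targets in train_loader:     # pass each mini-batch through the model
        preds = model(inputs)                # forward pass: predictions (logits)
        loss = loss_fn(preds, targets)       # compute the loss
        optimiser.zero_grad()
        loss.backward()                      # gradients of the loss w.r.t. the parameters
        optimiser.step()                     # update the model parameters
```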
" - This activation function is good for classifcation when the result is one of ``A or B or C``.\n", | ||
" - It's bad if you even want to assign two classification to one images—say a photo of a dog _and_ a cat.\n", | ||
" - It turns the raw outputs, or logits, into \"psuedo probabilities\", and we take our prediction to be the most probable class.\n", | ||
"\n", | ||
"\n", | ||
"- The purpose of training loop is that for each epoch, all the mini-batches are sequentially passed through the model to make predictions, compute the loss, and update the model parameters and repeat it until we receive a satisfactory performance.\n", | ||
"- The validation loop works similarly, but the purpose is to validate the model training's performance without any weight updates. Hence, we are not calculating any gradient during validation. This is made sure using the ``no_grad`` decorator with validation loop.\n", |
The `@no_grad` decorator disables gradient tracking in the function. We could point out that it is optional.
We also don't need gradients in the `batch_level_accuracy()` function.
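A minimal sketch of a validation pass using `no_grad` as a decorator (the model, data and helper below are stand-ins; the exercise's `batch_level_accuracy()` is not reproduced here):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Linear(4, 3)                      # stand-in model
loss_fn = nn.CrossEntropyLoss()
valid_loader = DataLoader(
    TensorDataset(torch.randn(32, 4), torch.randint(0, 3, (32,))), batch_size=16
)


@torch.no_grad()                             # optional, but disables gradient tracking: no weight updates
def validate_one_epoch(model, loader):
    model.eval()
    total_loss = 0.0
    for inputs, targets in loader:
        preds = model(inputs)                # forward pass only
        total_loss += loss_fn(preds, targets).item()
    return total_loss / len(loader)


print(validate_one_epoch(model, valid_loader))
```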
" - Print ``data`` and recognise that ``load_penguins`` has returned the dataframe.\n", | ||
"- Analyse which features it might make sense to use in order to classify the species of the penguins.\n", | ||
" - You can print the column names using ``pd.DataFrame.keys()``\n", | ||
" - You can also obtain useful statical information on the dataset using ``pd.DataFrame.Series.describe()``" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 'Consider' is probably fine here, but either is fine.
This will also need to be rebased on #71.
Thanks for the comments @ma595. I have made the changes you highlighted.
Missed a few other changes during the first review, and resolved some other requests as they were really good additions.
"\n", | ||
"\n", | ||
"Note: we are going to use batch normalisation layers in our network, which don't work if the batch size is one. This can happen on the last batch, if we don't choose a batch size that evenly divides the number of items in the data set. To avoid this, we can set the ``drop_last`` argument to ``True``. The last batch, which will be of size ``len(data_set) % batch_size`` gets dropped, and the data are reshuffled. This is only relevant during the training process - validation will use population statistics." | ||
"Note: we are going to use batch normalisation in our network, which doesn't work if the batch size is one. This can happen on the last batch, if we don't choose a batch size that evenly divides the number of items in the data set. To avoid this, we can set the ``drop_last`` argument to ``True``. The last batch, which will be of size ``len(data_set) % batch_size`` gets dropped, and the data are reshuffled. This is only relevant during the training process - validation will use population statistics." |
What is the reason for changing from `batch normalisation layers` to `batch normalisation`?
It is just to place more focus on batch normalisation as a concept. I can revert the change if that looks better.
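Either way, the `drop_last` behaviour described in this cell can be seen with a quick made-up example:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 33 items with a batch size of 16 would leave a final batch of one item (33 % 16 = 1),
# which batch normalisation cannot handle in training mode; drop_last=True discards it.
data_set = TensorDataset(torch.randn(33, 4), torch.randint(0, 3, (33,)))
train_loader = DataLoader(data_set, batch_size=16, shuffle=True, drop_last=True)

print(len(train_loader))   # 2 batches; the leftover single item is dropped
```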
@@ -520,17 +558,20 @@
"\n",
"- Before we jump in and write these loops, we must first choose an activation function to apply to the model's outputs.\n",
" - Here we are going to use the softmax activation function: see [the PyTorch docs](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html).\n",
" - For those of you who've studied physics, you may be remininded of the partition function in thermodynamics.\n",
I quite like this comment and would like it to be kept.
Corrected
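For anyone unfamiliar with softmax, a tiny numeric illustration (made-up logits, not from the exercise):

```python
import torch

logits = torch.tensor([[2.0, 0.5, -1.0]])   # made-up raw model outputs for one item
probs = torch.softmax(logits, dim=1)        # "pseudo probabilities": non-negative, sum to 1
prediction = probs.argmax(dim=1)            # take the most probable class as the prediction

print(probs)        # roughly tensor([[0.786, 0.175, 0.039]])
print(prediction)   # tensor([0])
```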
Amendments to the Penguin classification exercise to aid readability.