diff --git a/lessons/3-NeuralNetworks/04-OwnFramework/README.md b/lessons/3-NeuralNetworks/04-OwnFramework/README.md
index 01ba21714a2351d125a9f5918e8230c620bf8b1f..b1e99df85331a8a3ddc3ba5d5fcc3e9029c082ee 100644
--- a/lessons/3-NeuralNetworks/04-OwnFramework/README.md
+++ b/lessons/3-NeuralNetworks/04-OwnFramework/README.md
@@ -38,7 +38,7 @@ There is a well-known method of function optimization called **gradient descent*
 
 During training, the optimization steps are supposed to be calculated considering the whole dataset (remember that loss is calculated as a sum over all training samples). However, in real life we take small portions of the dataset called **minibatches**, and calculate gradients based on a subset of the data. Because the subset is taken randomly each time, such a method is called **stochastic gradient descent** (SGD).
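+
+Below is a minimal sketch (our own illustration, not from the lesson notebooks) of one epoch of minibatch SGD; the `gradient` function, which returns the gradient of the loss on a batch, is a hypothetical placeholder:
+
+```python
+import numpy as np
+
+def sgd_epoch(w, X, y, gradient, lr=0.1, batch_size=32):
+    """One epoch of minibatch SGD; `gradient` is a hypothetical placeholder."""
+    indices = np.random.permutation(len(X))        # new random order every epoch
+    for start in range(0, len(X), batch_size):
+        batch = indices[start:start + batch_size]  # a random minibatch
+        w = w - lr * gradient(w, X[batch], y[batch])  # step against the gradient
+    return w
+```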
 
-## Multi-Layered Perceptrons and Back Propagation
+## Multi-Layered Perceptrons and Backpropagation
 
 A one-layer network, as we have seen above, is capable of classifying linearly separable classes. To build a richer model, we can combine several layers of the network. Mathematically it means that the function *f* will have a more complex form, computed in several steps:
 * z<sub>1</sub>=w<sub>1</sub>x+b<sub>1</sub>
diff --git a/lessons/3-NeuralNetworks/05-Frameworks/README.md b/lessons/3-NeuralNetworks/05-Frameworks/README.md
index 2bd34c537a8ee177bc9483310b3dd67559840361..211765e10944a4c2f264bc9773309a62633d4735 100644
--- a/lessons/3-NeuralNetworks/05-Frameworks/README.md
+++ b/lessons/3-NeuralNetworks/05-Frameworks/README.md
@@ -33,7 +33,9 @@ In this course, we offer most of the content both for PyTorch and TensorFlow. Yo
 
 Where possible, we will use High-Level APIs for simplicity. However, we believe it is important to understand how neural networks work from the ground up, so in the beginning we start by working with the low-level API and tensors. If you want to get going fast and do not want to spend a lot of time learning these details, you can skip those and go straight into the high-level API notebooks.
 
-## Continue into Notebooks
+## ✍️ Exercises: Frameworks
+
+Continue your learning in the following notebooks:
 
 Low-Level API | [TensorFlow+Keras Notebook](IntroKerasTF.ipynb) | [PyTorch](IntroPyTorch.ipynb)
 --------------|-------------------------------------|--------------------------------
diff --git a/lessons/4-ComputerVision/07-ConvNets/README.md b/lessons/4-ComputerVision/07-ConvNets/README.md
index 89fbe6da8831e64ee0e88520178c3ee9d34ecbbb..8e8a14fda3e4cbcfdd5d6589c9e36b531578b69d 100644
--- a/lessons/4-ComputerVision/07-ConvNets/README.md
+++ b/lessons/4-ComputerVision/07-ConvNets/README.md
@@ -34,7 +34,7 @@ The way CNNs work is based on the following important ideas:
 
 > Image from [a paper by Hislop-Lynch](https://www.semanticscholar.org/paper/Computer-vision-based-pedestrian-trajectory-Hislop-Lynch/26e6f74853fc9bbb7487b06dc2cf095d36c9021d), based on [their research](https://dl.acm.org/doi/abs/10.1145/1553374.1553453)
 
-## Continue in Notebook
+## ✍️ Exercises: Convolutional Neural Networks
 
 Let's continue exploring how convolutional neural networks work, and how we can achieve trainable filters, by working through the corresponding notebooks:
 
diff --git a/lessons/4-ComputerVision/08-TransferLearning/README.md b/lessons/4-ComputerVision/08-TransferLearning/README.md
index bb58498908f32db9e74ba7ead881ce66203fc84a..4df5f967c84b971ea84eadce606b031112437eac 100644
--- a/lessons/4-ComputerVision/08-TransferLearning/README.md
+++ b/lessons/4-ComputerVision/08-TransferLearning/README.md
@@ -26,7 +26,7 @@ Here are sample features extracted from a picture of a cat by VGG-16 network:
 
 In this example, we will use a dataset of [Cats and Dogs](https://www.microsoft.com/download/details.aspx?id=54765&WT.mc_id=academic-57639-dmitryso), which is very close to a real-life image classification scenario.
 
-## Continue in a Notebook
+## ✍️ Exercise: Transfer Learning
 
 Let's see transfer learning in action in the corresponding notebooks:
 
diff --git a/lessons/4-ComputerVision/09-Autoencoders/README.md b/lessons/4-ComputerVision/09-Autoencoders/README.md
index 0400747eb8fc31c7dd1fedd36090919a84ed9ea5..c6ad1bce9dc97c49a98a424c651b7e4591b47e0d 100644
--- a/lessons/4-ComputerVision/09-Autoencoders/README.md
+++ b/lessons/4-ComputerVision/09-Autoencoders/README.md
@@ -6,7 +6,7 @@ When training CNNs, one of the problems is that we need a lot of labeled data. I
 
 However, we might want to use raw (unlabeled) data for training CNN feature extractors, which is called **self-supervised learning**. Instead of labels, we will use training images as both network input and output. The main idea of an **autoencoder** is that we will have an **encoder network** that converts the input image into some **latent space** (normally it is just a vector of some smaller size), followed by a **decoder network**, whose goal is to reconstruct the original image.
 
-✅ An [autoencoder](https://wikipedia.org/wiki/Autoencoder) is "a type of artificial neural network used to learn efficient codings of unlabeled data."
+> ✅ An [autoencoder](https://wikipedia.org/wiki/Autoencoder) is "a type of artificial neural network used to learn efficient codings of unlabeled data."
 
 Since we are training an autoencoder to capture as much of the information from the original image as possible for accurate reconstruction, the network tries to find the best **embedding** of input images to capture the meaning.
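+
+To make the encoder/decoder idea concrete, here is a minimal PyTorch sketch (our own illustration, assuming 28x28 grayscale images such as MNIST); training would minimize, for example, `nn.MSELoss()` between the input and the reconstruction:
+
+```python
+import torch.nn as nn
+
+class AutoEncoder(nn.Module):
+    def __init__(self, latent_dim=2):
+        super().__init__()
+        self.encoder = nn.Sequential(               # image -> latent vector
+            nn.Flatten(),
+            nn.Linear(28 * 28, 128), nn.ReLU(),
+            nn.Linear(128, latent_dim),
+        )
+        self.decoder = nn.Sequential(               # latent vector -> image
+            nn.Linear(latent_dim, 128), nn.ReLU(),
+            nn.Linear(128, 28 * 28), nn.Sigmoid(),
+            nn.Unflatten(1, (1, 28, 28)),
+        )
+
+    def forward(self, x):
+        return self.decoder(self.encoder(x))
+```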
 
@@ -50,15 +50,15 @@ One important advantage of VAEs is that they allow us to generate new images rel
 
 <img alt="vaemnist" src="images/vaemnist.png" width="50%"/>
 
-> Image generated by Dmitry Soshnikov
+> Image by [Dmitry Soshnikov](http://soshnikov.com)
 
 Observe how images blend into each other, as we start taking latent vectors from different portions of the latent parameter space. We can also visualize this space in 2D:
 
 <img alt="vaemnist cluster" src="images/vaemnist-diag.png" width="50%"/> 
 
-> Image generated by Dmitry Soshnikov
+> Image by [Dmitry Soshnikov](http://soshnikov.com)
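+
+A hedged sketch of how such blending can be produced (assuming a trained `vae` whose `decoder` maps a 2-dimensional latent vector to an image; both names are placeholders):
+
+```python
+import torch
+
+z1, z2 = torch.randn(2), torch.randn(2)        # two random latent vectors
+for t in torch.linspace(0, 1, steps=10):
+    z = (1 - t) * z1 + t * z2                  # walk through latent space
+    image = vae.decoder(z.unsqueeze(0))        # decode each point to an image
+```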
 
-## Continue in a Notebook
+## ✍️ Exercises: Autoencoders
 
 Learn more about autoencoders in the corresponding notebooks:
 
@@ -73,12 +73,6 @@ Learn more about autoencoders in these corresponding notebooks:
 
 ## [Post-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/209)
 
-* [Building Autoencoders in Keras](https://blog.keras.io/building-autoencoders-in-keras.html)
-* [Blog post on NeuroHive](https://neurohive.io/ru/osnovy-data-science/variacionnyj-avtojenkoder-vae/)
-* [Variational Autoencoders Explained](https://kvfrans.com/variational-autoencoders-explained/)
-* [Conditional Variational Autoencoders](https://ijdykeman.github.io/ml/2016/12/21/cvae.html)
-
-
 ## Conclusion
 
 In this lesson, you learned about the various types of autoencoders available to the AI scientist. You learned how to build them, and how to use them to reconstruct images. You also learned about the VAE and how to use it to generate new images.
diff --git a/lessons/4-ComputerVision/10-GANs/README.md b/lessons/4-ComputerVision/10-GANs/README.md
index 76c7d6dd08a1819900da77a4b708a03fd10cd2bc..c9003b3fb51a53a6ed45020567df37d10c1c7aca 100644
--- a/lessons/4-ComputerVision/10-GANs/README.md
+++ b/lessons/4-ComputerVision/10-GANs/README.md
@@ -1,33 +1,36 @@
 # Generative Adversarial Networks
 
-In the previous section, we have learnt about **generative models** - i.e. models that can generate new images similar to the ones in the training dataset. VAE was a good example of generative model.
+In the previous section, we learned about **generative models**: models that can generate new images similar to the ones in the training dataset. A VAE was a good example of a generative model.
 
 ## [Pre-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/110)
 
-However, if we try to generate something really meaningful, like a painting at reasonable resolution, with VAE, we will see that training does not converge well. There is another architecture specifically targeted at generative models - **Generative Adversarial Networks**, or GANs.
+However, if we try to use a VAE to generate something really meaningful, like a painting at a reasonable resolution, we will see that training does not converge well. For this use case, we should learn about another architecture specifically targeted at generative models - **Generative Adversarial Networks**, or GANs.
 
-The main idea of GAN is to have two neural networks that will be trained against each other:
+The main idea of a GAN is to have two neural networks that will be trained against each other:
 
 <img src="images/gan_architecture.png" width="70%"/>
 
 > Image by [Dmitry Soshnikov](http://soshnikov.com)
 
- * **Generator** is a network that takes some random vector, and produces the image as a result
- * **Discriminator** is a network that takes an image, and it should tell whether it is a real image (from training dataset), or it was generated by a generator. It is essentially an image classifier.
+> ✅ A little vocabulary:
+> * **Generator** is a network that takes some random vector, and produces an image as a result
+> * **Discriminator** is a network that takes an image, and should tell whether it is a real image (from the training dataset) or one generated by a generator. It is essentially an image classifier.
 
 ### Discriminator
 
-The architecture of discriminator does not differ from an ordinary image classification network. In simplest case it can be fully-connected classifier, but most probably it will be a [convolutional network](../07-ConvNets/README.md).
+The architecture of the discriminator does not differ from an ordinary image classification network. In the simplest case it can be a fully-connected classifier, but most probably it will be a [convolutional network](../07-ConvNets/README.md).
 
-> GAN based on convolutional networks is called [DCGAN](https://arxiv.org/pdf/1511.06434.pdf)
+> ✅ A GAN based on convolutional networks is called a [DCGAN](https://arxiv.org/pdf/1511.06434.pdf)
 
-CNN discriminator consists of the following layers: several convolutions+poolings (with decreasing spatial size) and, one-or-more fully-connected layers to get "feature vector", final binary classifier.
+A CNN discriminator consists of the following layers: several convolution+pooling layers (with decreasing spatial size), one or more fully-connected layers to produce a "feature vector", and a final binary classifier.
+
+> ✅ A 'pooling' in this context is a technique that reduces the size of the image. "Pooling layers reduce the dimensions of data by combining the outputs of neuron clusters at one layer into a single neuron in the next layer." - [source](https://wikipedia.org/wiki/Convolutional_neural_network#Pooling_layers)
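+
+As a minimal sketch (our own illustration for 64x64 RGB images, not the notebook code), such a discriminator could look like this in PyTorch:
+
+```python
+import torch.nn as nn
+
+discriminator = nn.Sequential(
+    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
+    nn.MaxPool2d(2),                          # 64x64 -> 32x32, spatial size decreases
+    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
+    nn.MaxPool2d(2),                          # 32x32 -> 16x16
+    nn.Flatten(),
+    nn.Linear(64 * 16 * 16, 128), nn.ReLU(),  # the "feature vector"
+    nn.Linear(128, 1),                        # final binary classifier (a logit)
+)
+```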
 
 ### Generator
 
-Generator is slightly more tricky. You can consider it to be a reversed discriminator - starting from latent vector (in place of a feature vector), it has fully-connected layer to convert it into required size/shape, followed by deconvolutions+upscaling. This is similar to *decoder* part of [autoencoder](../09-Autoencoders/README.md).
+The generator is slightly more tricky. You can consider it to be a reversed discriminator. Starting from a latent vector (in place of a feature vector), it has a fully-connected layer to convert it into the required size/shape, followed by deconvolutions+upscaling. This is similar to the *decoder* part of an [autoencoder](../09-Autoencoders/README.md).
 
-> Because convolution layer is implemented as a linear filter traversing the image, deconvolution is essentially similar to convolution, and can be implemented using the same layer logic.
+> ✅ Because the convolution layer is implemented as a linear filter traversing the image, deconvolution is essentially similar to convolution, and can be implemented using the same layer logic.
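+
+A matching generator sketch (again an illustration, assuming a 100-dimensional latent vector and 64x64 RGB output):
+
+```python
+import torch.nn as nn
+
+generator = nn.Sequential(
+    nn.Linear(100, 128 * 16 * 16), nn.ReLU(),  # latent vector -> required size
+    nn.Unflatten(1, (128, 16, 16)),            # reshape into 16x16 feature maps
+    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
+    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1), nn.Tanh(),    # 32 -> 64
+)
+```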
 
 <img src="images/gan_arch_detail.png" width="70%"/>
 
@@ -35,16 +38,16 @@ Generator is slightly more tricky. You can consider it to be a reversed discrimi
 
 ### Training the GAN
 
-GANs are called **adversarial** because there is a constant competition between generator and discriminator. During this competition, both generator and discriminator improve, thus the network learns to produce better and better pictures.
+GANs are called **adversarial** because there is a constant competition between the generator and the discriminator. During this competition, both generator and discriminator improve, thus the network learns to produce better and better pictures.
 
 The training happens in two stages:
 
-* **Training the discriminator**. It is pretty straightforward: we generate a batch of images by the generator (for them label would be 0, which stands for fake image), and take a batch of images from the input dataset (with label 1, real image). We obtain some *discriminator loss*, and perform back prop.
-* **Training the generator**. This is slightly more tricky, because we do not know the expected output for the generator directly. We take the whole GAN network consisting of generator followed by discriminator, feed it with some random vectors, and expect the result to be 1 (corresponding to real images). We then freeze the parameters of the discriminator (we do not want it to be trained at this step), and perform the back prop.
+* **Training the discriminator**. This task is pretty straightforward: we generate a batch of images with the generator (labeling them 0, which stands for a fake image), and take a batch of images from the input dataset (with label 1, a real image). We obtain some *discriminator loss*, and perform backprop.
+* **Training the generator**. This is slightly more tricky, because we do not know the expected output for the generator directly. We take the whole GAN network consisting of the generator followed by the discriminator, feed it some random vectors, and expect the result to be 1 (corresponding to real images). We then freeze the parameters of the discriminator (we do not want it to be trained at this step), and perform the backprop.
 
-During this process, both generator and discriminator losses are not going down significantly. In the ideal situation, they should oscillate, corresponding to both networks improving their performance.
+During this process, neither the generator loss nor the discriminator loss goes down significantly. In the ideal situation, they should oscillate, corresponding to both networks improving their performance.
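+
+The two stages can be sketched as follows (a hedged illustration, assuming the `generator`, `discriminator` and their optimizers `g_optimizer`, `d_optimizer` are already defined):
+
+```python
+import torch
+
+bce = torch.nn.BCEWithLogitsLoss()
+
+def train_step(real_images, latent_dim=100):
+    batch = real_images.size(0)
+    noise = torch.randn(batch, latent_dim)
+
+    # Stage 1: train the discriminator on fake (label 0) and real (label 1) images
+    d_optimizer.zero_grad()
+    fake_images = generator(noise).detach()  # do not backprop into the generator here
+    d_loss = bce(discriminator(fake_images), torch.zeros(batch, 1)) + \
+             bce(discriminator(real_images), torch.ones(batch, 1))
+    d_loss.backward()
+    d_optimizer.step()
+
+    # Stage 2: train the generator to make the discriminator answer "real" (1);
+    # only g_optimizer steps, so the discriminator weights stay frozen
+    g_optimizer.zero_grad()
+    g_loss = bce(discriminator(generator(noise)), torch.ones(batch, 1))
+    g_loss.backward()
+    g_optimizer.step()
+    return d_loss.item(), g_loss.item()
+```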
 
-## Go to Notebook
+## ✍️ Exercises: GANs
 
 * [GAN Notebook in TensorFlow/Keras](GANTF.ipynb)
 * [GAN Notebook in PyTorch](GANPyTorch.ipynb)
@@ -53,17 +56,29 @@ During this process, both generator and discriminator losses are not going down
 
 GANs are known to be especially difficult to train. Here are a few problems:
 
-* **Mode Collapse**. By this term we mean that generator learns to produce one successful image that tricks the generator, and not a variety of different images.
-* **Sensitivity to hyperparameters**. Often you can see that GAN does not converge at all, and then suddenly decrease in the learning rate can lead to convergence.
-* Keeping **balance** between generator and discriminator. In many cases discriminator loss can drop to zero relatively quickly, which results in generator being unable to train further. To overcome this, we can try setting different learning rates for generator and discriminator, or skip discriminator training if the loss is already too low.
-* Training for **high resolution**. It is the same problems as with autoencoders, because reconstructing too many layers of convolutional network leads to artifacts. This problem is typically solved with so-called **progressive growing**, when first a few layers are trained on low-res images, and then layers are "unblocked" or added. Another solutions would be adding extra connections between layers and training several resolutions at once - see [Multi-Scale Gradient GANs paper](https://arxiv.org/abs/1903.06048) for details.
+* **Mode Collapse**. By this term we mean that the generator learns to produce one successful image that tricks the discriminator, rather than a variety of different images.
+* **Sensitivity to hyperparameters**. Often you can see that a GAN does not converge at all, and then a sudden decrease in the learning rate can lead to convergence.
+* Keeping a **balance** between the generator and the discriminator. In many cases the discriminator loss can drop to zero relatively quickly, which results in the generator being unable to train further. To overcome this, we can try setting different learning rates for the generator and discriminator, or skip discriminator training if its loss is already too low (see the sketch after this list).
+* Training for **high resolution**. This is the same problem as with autoencoders: reconstructing too many layers of a convolutional network leads to artifacts. It is typically solved with so-called **progressive growing**, when first a few layers are trained on low-res images, and then more layers are "unblocked" or added. Another solution would be adding extra connections between layers and training several resolutions at once - see the [Multi-Scale Gradient GANs paper](https://arxiv.org/abs/1903.06048) for details.
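+
+For example, the balancing trick could be sketched like this (a fragment reusing the names from the training-step sketch above; the learning rates and the 0.1 threshold are arbitrary placeholders):
+
+```python
+import torch
+
+d_optimizer = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
+g_optimizer = torch.optim.Adam(generator.parameters(), lr=4e-4)  # generator learns faster
+
+if d_loss.item() > 0.1:      # skip the discriminator update when its loss is too low
+    d_loss.backward()
+    d_optimizer.step()       # otherwise, let the generator catch up
+```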
 
 ## [Post-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/210)
 
-> ✅ Todo: Assignment, conclusion
+## Conclusion
+
+In this lesson, you learned about GANs and how to train them. You also learned about the special challenges that this type of neural network can face, and some strategies on how to move past them.
+
+## 🚀 Challenge
+
+Run through the [Style Transfer notebook](StyleTransfer.ipynb) using your own images.
 
-## References
+## Review & Self Study
+
+For reference, read more about GANs in these resources:
 
 * Marco Pasini, [10 Lessons I Learned Training GANs for one Year](https://towardsdatascience.com/10-lessons-i-learned-training-generative-adversarial-networks-gans-for-a-year-c9071159628)
 * [StyleGAN](https://en.wikipedia.org/wiki/StyleGAN), a *de facto* GAN architecture to consider
 * [Creating Generative Art using GANs on Azure ML](https://soshnikov.com/scienceart/creating-generative-art-using-gan-on-azureml/)
+
+## Assignment
+
+Revisit one of the two notebooks associated with this lesson and retrain the GAN on your own images. What can you create?
\ No newline at end of file