diff --git a/README.md b/README.md
index f10e688bdd4f911be26fefa6e3329df8e78358e7..b931f8706423d43ea0150f3e596a65b6adc438f8 100644
--- a/README.md
+++ b/README.md
@@ -80,7 +80,7 @@ For a gentle introduction to *AI in the Cloud* topics you may consider taking th
 <tr><td>16</td><td>Recurrent Neural Networks</td><td><a href="lessons/5-NLP/16-RNN/README.md">Text</a></td><td><a href="lessons/5-NLP/16-RNN/RNNPyTorch.ipynb">PyTorch</a></td><td><a href="lessons/5-NLP/16-RNN/RNNTF.ipynb">TensorFlow</a></td><td></td></tr>
 <tr><td>17</td><td>Generative Recurrent Networks</td><td><a href="lessons/5-NLP/17-GenerativeNetworks/README.md">Text</a></td><td><a href="lessons/5-NLP/17-GenerativeNetworks/GenerativePyTorch.md">PyTorch</a></td><td><a href="lessons/5-NLP/17-GenerativeNetworks/GenerativeTF.md">TensorFlow</a></td><td><a href="lessons/5-NLP/17-GenerativeNetworks/lab/README.md">Lab</a></td></tr>
 <tr><td>18</td><td>Transformers. BERT.</td><td><a href="lessons/5-NLP/18-Transformers/README.md">Text</a></td><td><a href="lessons/5-NLP/18-Transformers/TransformersPyTorch.md">PyTorch</a></td><td><a href="lessons/5-NLP/18-Transformers/TransformersTF.md">TensorFlow</a></td><td></td></tr>
-<tr><td>19</td><td>Named Entity Recognition</td><td>Text</td><td>PyTorch</td><td>TensorFlow</td><td></td></tr>
+<tr><td>19</td><td>Named Entity Recognition</td><td><a href="lessons/5-NLP/19-NER/README.md">Text</a></td><td></td><td><a href="lessons/5-NLP/19-NER/NER-TF.ipynb">TensorFlow</a></td><td><a href="lessons/5-NLP/19-NER/lab/README.md">Lab</a></td></tr>
 <tr><td>20</td><td>Large Language Models, Prompt Programming and Few-Shot Tasks</td><td>Text</td><td>PyTorch</td><td>TensorFlow</td><td></td></tr>
 <tr><td>VI</td><td colspan="4"><b>Other AI Techniques</b></td><td></td></tr>
 <tr><td>21</td><td>Genetic Algorithms</td><td><a href="lessons/6-Other/21-GeneticAlgorithms/README.md">Text</a><td colspan="2"><a href="lessons/6-Other/21-GeneticAlgorithms/Genetic.ipynb">Notebook</a></td><td></td></tr>
diff --git a/lessons/5-NLP/19-NER/NER-TF.ipynb b/lessons/5-NLP/19-NER/NER-TF.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..46ed324938db1f1028b47122e5d1bff0dc11afb4
--- /dev/null
+++ b/lessons/5-NLP/19-NER/NER-TF.ipynb
@@ -0,0 +1,481 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Named Entity Recognition (NER)\n",
+    "\n",
+    "This notebook is from [AI for Beginners Curriculum](http://aka.ms/ai-beginners).\n",
+    "\n",
+    "In this example, we will learn how to train NER model on [Annotated Corpus for Named Entity Recognition](https://www.kaggle.com/datasets/abhinavwalia95/entity-annotated-corpus) Dataset from Kaggle. Before procedding, please donwload [ner_dataset.csv](https://www.kaggle.com/datasets/abhinavwalia95/entity-annotated-corpus?resource=download&select=ner_dataset.csv) file into current directory."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 62,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "from tensorflow import keras\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Preparing the Dataset \n",
+    "\n",
+    "We will start by reading the dataset into a dataframe. If you want to learn more about using Pandas, visit a [lesson on data processing](https://github.com/microsoft/Data-Science-For-Beginners/tree/main/2-Working-With-Data/07-python) in our [Data Science for Beginners](http://aka.ms/datascience-beginners)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Sentence #</th>\n",
+       "      <th>Word</th>\n",
+       "      <th>POS</th>\n",
+       "      <th>Tag</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Sentence: 1</td>\n",
+       "      <td>Thousands</td>\n",
+       "      <td>NNS</td>\n",
+       "      <td>O</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>of</td>\n",
+       "      <td>IN</td>\n",
+       "      <td>O</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>demonstrators</td>\n",
+       "      <td>NNS</td>\n",
+       "      <td>O</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>have</td>\n",
+       "      <td>VBP</td>\n",
+       "      <td>O</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>NaN</td>\n",
+       "      <td>marched</td>\n",
+       "      <td>VBN</td>\n",
+       "      <td>O</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "    Sentence #           Word  POS Tag\n",
+       "0  Sentence: 1      Thousands  NNS   O\n",
+       "1          NaN             of   IN   O\n",
+       "2          NaN  demonstrators  NNS   O\n",
+       "3          NaN           have  VBP   O\n",
+       "4          NaN        marched  VBN   O"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df = pd.read_csv('ner_dataset.csv',encoding='unicode-escape')\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's get unique tags and create lookup dictionaries that we can use to convert tags into class numbers:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "array(['O', 'B-geo', 'B-gpe', 'B-per', 'I-geo', 'B-org', 'I-org', 'B-tim',\n",
+       "       'B-art', 'I-art', 'I-per', 'I-gpe', 'I-tim', 'B-nat', 'B-eve',\n",
+       "       'I-eve', 'I-nat'], dtype=object)"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "tags = df.Tag.unique()\n",
+    "tags"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'O'"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "id2tag = dict(enumerate(tags))\n",
+    "tag2id = { v : k for k,v in id2tag.items() }\n",
+    "\n",
+    "id2tag[0]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we need to do the same with vocabulary. For simplicity, we will create vocabulary without taking word frequency into account; in real life you might want to use Keras vectorizer, and limit the number of words."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "vocab = set(df['Word'].apply(lambda x: x.lower()))\n",
+    "id2word = { i+1 : v for i,v in enumerate(vocab) }\n",
+    "id2word[0] = '<UNK>'\n",
+    "vocab.add('<UNK>')\n",
+    "word2id = { v : k for k,v in id2word.items() }"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We need to create a dataset of sentences for training. Let's loop through the original dataset and separate all individual sentences into `X` (lists of words) and `Y` (list of tokens):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 41,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "X,Y = [],[]\n",
+    "s,t = [],[]\n",
+    "for i,row in df[['Sentence #','Word','Tag']].iterrows():\n",
+    "    if pd.isna(row['Sentence #']):\n",
+    "        s.append(row['Word'])\n",
+    "        t.append(row['Tag'])\n",
+    "    else:\n",
+    "        if len(s)>0:\n",
+    "            X.append(s)\n",
+    "            Y.append(t)\n",
+    "        s,t = [row['Word']],[row['Tag']]\n",
+    "X.append(s)\n",
+    "Y.append(t)\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will now vectorize all words and tokens:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 93,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "([10386,\n",
+       "  23515,\n",
+       "  4134,\n",
+       "  29620,\n",
+       "  7954,\n",
+       "  13583,\n",
+       "  21193,\n",
+       "  12222,\n",
+       "  27322,\n",
+       "  18258,\n",
+       "  5815,\n",
+       "  15880,\n",
+       "  5355,\n",
+       "  25242,\n",
+       "  31327,\n",
+       "  18258,\n",
+       "  27067,\n",
+       "  23515,\n",
+       "  26444,\n",
+       "  14412,\n",
+       "  358,\n",
+       "  26551,\n",
+       "  5011,\n",
+       "  30558],\n",
+       " [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0])"
+      ]
+     },
+     "execution_count": 93,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "def vectorize(seq):\n",
+    "    return [word2id[x.lower()] for x in seq]\n",
+    "\n",
+    "def tagify(seq):\n",
+    "    return [tag2id[x] for x in seq]\n",
+    "\n",
+    "Xv = list(map(vectorize,X))\n",
+    "Yv = list(map(tagify,Y))\n",
+    "\n",
+    "Xv[0], Yv[0]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "For simplicity, we will pad all sentences with 0 tokens to the maximum length. In real life, we might want to use more clever strategy, and pad sequences only within one minibatch."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 51,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "X_data = keras.preprocessing.sequence.pad_sequences(Xv,padding='post')\n",
+    "Y_data = keras.preprocessing.sequence.pad_sequences(Yv,padding='post')"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Defining Token Classification Network\n",
+    "\n",
+    "We will use two-layer bidirectional LSTM network for token classification. In order to apply dense classifier to each of the output of the last LSTM layer, we will use `TimeDistributed` construction, which replicates the same dense layer to each of the outputs of LSTM at each step: "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 94,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Model: \"sequential_3\"\n",
+      "_________________________________________________________________\n",
+      " Layer (type)                Output Shape              Param #   \n",
+      "=================================================================\n",
+      " embedding_4 (Embedding)     (None, 104, 300)          9545400   \n",
+      "                                                                 \n",
+      " bidirectional_6 (Bidirectio  (None, 104, 200)         320800    \n",
+      " nal)                                                            \n",
+      "                                                                 \n",
+      " bidirectional_7 (Bidirectio  (None, 104, 200)         240800    \n",
+      " nal)                                                            \n",
+      "                                                                 \n",
+      " time_distributed_3 (TimeDis  (None, 104, 17)          3417      \n",
+      " tributed)                                                       \n",
+      "                                                                 \n",
+      "=================================================================\n",
+      "Total params: 10,110,417\n",
+      "Trainable params: 10,110,417\n",
+      "Non-trainable params: 0\n",
+      "_________________________________________________________________\n"
+     ]
+    }
+   ],
+   "source": [
+    "maxlen = X_data.shape[1]\n",
+    "vocab_size = len(vocab)\n",
+    "num_tags = len(tags)\n",
+    "model = keras.models.Sequential([\n",
+    "    keras.layers.Embedding(vocab_size, 300, input_length=maxlen),\n",
+    "    keras.layers.Bidirectional(keras.layers.LSTM(units=100, activation='tanh', return_sequences=True)),\n",
+    "    keras.layers.Bidirectional(keras.layers.LSTM(units=100, activation='tanh', return_sequences=True)),\n",
+    "    keras.layers.TimeDistributed(keras.layers.Dense(num_tags, activation='softmax'))\n",
+    "])\n",
+    "model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['acc'])\n",
+    "model.summary()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note here that we are explicity specifying `maxlen` for our dataset - in case we want the network to be able to handle variable length sequences, we need to be a bit more clever when defining the network.\n",
+    "\n",
+    "Let's now train the model. For speed, we will only train for one epoch, but you may try training for longer time. Also, you may want to separate some part of the dataset as training dataset, to observe validation accuracy."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 57,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "1499/1499 [==============================] - 740s 488ms/step - loss: 0.0667 - acc: 0.9841\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "<keras.callbacks.History at 0x16f0bb2a310>"
+      ]
+     },
+     "execution_count": 57,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "model.fit(X_data,Y_data)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Testing the Result\n",
+    "\n",
+    "Let's now see how our entity recognition model works on a sample sentence: "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 91,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "sent = 'John Smith went to Paris to attend a conference in cancer development institute'\n",
+    "words = sent.lower().split()\n",
+    "v = keras.preprocessing.sequence.pad_sequences([[word2id[x] for x in words]],padding='post',maxlen=maxlen)\n",
+    "res = model(v)[0]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 92,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "john -> B-per\n",
+      "smith -> I-per\n",
+      "went -> O\n",
+      "to -> O\n",
+      "paris -> B-geo\n",
+      "to -> O\n",
+      "attend -> O\n",
+      "a -> O\n",
+      "conference -> O\n",
+      "in -> O\n",
+      "cancer -> B-org\n",
+      "development -> I-org\n",
+      "institute -> I-org\n"
+     ]
+    }
+   ],
+   "source": [
+    "r = np.argmax(res.numpy(),axis=1)\n",
+    "for i,w in zip(r,words):\n",
+    "    print(f\"{w} -> {id2tag[i]}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Takeaway\n",
+    "\n",
+    "Even simple LSTM model shows reasonable results at NER. However, to get much better results, you may want to use large pre-trained language models such as BERT. Training BERT for NER using Huggingface Transformers library is described [here](https://huggingface.co/course/chapter7/2?fw=pt)."
+   ]
+  }
+ ],
+ "metadata": {
+  "interpreter": {
+   "hash": "16af2a8bbb083ea23e5e41c7f5787656b2ce26968575d8763f2c4b17f9cd711f"
+  },
+  "kernelspec": {
+   "display_name": "Python 3.8.12 ('py38')",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.12"
+  },
+  "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/lessons/5-NLP/19-NER/README.md b/lessons/5-NLP/19-NER/README.md
index f6e0339af6f3eaf2326abb6105bfc9471dddf1b0..380b458385b34b175e502d50e94b3ff97f180ad6 100644
--- a/lessons/5-NLP/19-NER/README.md
+++ b/lessons/5-NLP/19-NER/README.md
@@ -1 +1,68 @@
-Placeholder
\ No newline at end of file
+# Named Entity Recognition
+
+Up to now, we have mostly been concentrating on one NLP task - classification. However, there are also other NLP tasks that can be accomplished with neural networks. One of those tasks is **[Named Entity Recognition](https://en.wikipedia.org/wiki/Named-entity_recognition)** (NER), which deals with recognizing specific entities within text, such as places, person names, date-time intervals, chemical formulae and so on.
+
+## [Pre-lecture quiz](https://black-ground-0cc93280f.1.azurestaticapps.net/quiz/119) 
+
+## Example of Using NER
+
+Suppose you want to develop a natural language chat bot, similar to Amazon Alexa or Google Assistant. The way intelligent chat bots work is to *understand* what the user wants by doing text classification on the input sentence. The result of this classification is a so-called **intent**, which determines what the chat bot should do.
+
+![Bot NER](images/bot-ner.png)
+
+> Image by author
+
+However, a user may provide some parameters as part of the phrase. For example, when asking for the weather, she may specify a location or date. A bot should be able to understand those entities, and fill in the parameter slots accordingly before performing the action. This is exactly where NER comes in.
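+
+For instance, for the phrase *"What's the weather in Paris tomorrow?"*, a hypothetical bot might combine intent classification and NER-based slot filling into a structure like the following (names and fields are purely illustrative):
+
+```python
+# A hypothetical result of intent classification plus NER-based slot filling
+result = {
+    "intent": "get_weather",
+    "entities": {"location": "Paris", "date": "tomorrow"},
+}
+```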
+
+Another example would be [analyzing scientific medical papers](https://soshnikov.com/science/analyzing-medical-papers-with-azure-and-text-analytics-for-health/). Here, one of the main things we need to look for is specific medical terms, such as diseases and medical substances. While a small number of diseases can probably be extracted using substring search, more complex entities, such as chemical compounds and medication names, need a more complex approach.
+
+## NER as Token Classification
+
+NER models are essentially **token classification models**, because for each of the input tokens we need to decide whether it belongs to an entity or not, and if it does - to which entity class.
+
+Consider the following paper title:
+
+**Tricuspid valve regurgitation** and **lithium carbonate** **toxicity** in a newborn infant.
+
+Entities here are:
+* Tricuspid valve regurgitation is a disease (`DIS`)
+* Lithium carbonate is a chemical substance (`CHEM`)
+* Toxicity is also a disease (`DIS`)
+
+Notice that one entity can span several tokens and, as in this case, we need to be able to distinguish between two consecutive entities. Thus, it is common to use two classes for each entity - one marking the first token of the entity (the `B-` prefix, for beginning), and another marking its continuation (`I-`, for inner token). We also use `O` as a class to represent all other tokens. Such token tagging is called [BIO tagging](https://en.wikipedia.org/wiki/Inside%E2%80%93outside%E2%80%93beginning_(tagging)) (or IOB). When tagged, our title will look like this:
+
+Token | Tag
+------|-----
+Tricuspid | B-DIS
+valve | I-DIS
+regurgitation | I-DIS
+and | O
+lithium | B-CHEM
+carbonate | I-CHEM
+toxicity | B-DIS
+in | O
+a | O
+newborn | O
+infant | O
+. | O
+
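+For example, with two entity types such as `DIS` and `CHEM`, the complete tag set has 2*2+1 = 5 classes. A small illustrative snippet building such a mapping (names are ours, not from a particular library) could look like this:
+
+```python
+entity_types = ['DIS', 'CHEM']
+tags = ['O'] + [prefix + t for t in entity_types for prefix in ('B-', 'I-')]
+# ['O', 'B-DIS', 'I-DIS', 'B-CHEM', 'I-CHEM'] - 2*entities+1 classes in total
+tag2id = {tag: i for i, tag in enumerate(tags)}
+```
+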
+Since we need to build a one-to-one correspondence between tokens and these classes, we can train the rightmost **many-to-many** neural network model from the picture below:
+
+![Image showing common recurrent neural network patterns.](../17-GenerativeNetworks/images/unreasonable-effectiveness-of-rnn.jpg)
+
+> *Image from the blog post [The Unreasonable Effectiveness of Recurrent Neural Networks](http://karpathy.github.io/2015/05/21/rnn-effectiveness/) by [Andrej Karpathy](http://karpathy.github.io/). NER token classification models correspond to the right-most network architecture in this picture.*
+
+## Training NER models
+
+Since a NER model is essentially a token classification model, we can use the RNNs that we are already familiar with for this task. In this case, each block of the recurrent network returns the tag class for the corresponding token, as sketched below. The example notebook shows how to train an LSTM for token classification.
+
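+Below is a minimal sketch of such a model in Keras, mirroring the example notebook: an embedding layer, a bidirectional LSTM that returns an output at every step, and a `TimeDistributed` dense classifier over the tag classes (`vocab_size`, `maxlen` and `num_tags` are placeholders you would compute from your data):
+
+```python
+from tensorflow import keras
+
+vocab_size, maxlen, num_tags = 30000, 104, 17  # placeholders - derive these from your dataset
+
+model = keras.models.Sequential([
+    keras.layers.Embedding(vocab_size, 300, input_length=maxlen),
+    # return_sequences=True keeps one output per token - exactly what token classification needs
+    keras.layers.Bidirectional(keras.layers.LSTM(100, return_sequences=True)),
+    # the same dense classifier is applied to the LSTM output at every time step
+    keras.layers.TimeDistributed(keras.layers.Dense(num_tags, activation='softmax'))
+])
+model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['acc'])
+```
+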
+## ✍️ Example Notebooks: NER
+
+Continue your learning in the following notebook:
+
+* [NER with TensorFlow](NER-TF.ipynb)
+
+## [Assignment](lab/README.md)
+
+In the assignment for this lesson, you will have to train a medical entity recognition model. You can start by training an LSTM model as described in this lesson, and proceed to using a BERT transformer model. Read [the instructions](lab/README.md) to get all the details.
+
diff --git a/lessons/5-NLP/19-NER/images/bot-ner.png b/lessons/5-NLP/19-NER/images/bot-ner.png
new file mode 100644
index 0000000000000000000000000000000000000000..2b5c982566076065559362a5af407a758a5a4b35
Binary files /dev/null and b/lessons/5-NLP/19-NER/images/bot-ner.png differ
diff --git a/lessons/5-NLP/19-NER/lab/README.md b/lessons/5-NLP/19-NER/lab/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..44e54da0c008f141bac4bc6cab9c9818cf94fb1c
--- /dev/null
+++ b/lessons/5-NLP/19-NER/lab/README.md
@@ -0,0 +1,47 @@
+# NER
+
+Lab Assignment from [AI for Beginners Curriculum](https://github.com/microsoft/ai-for-beginners).
+
+## Task
+
+In this lab, you need to train a named entity recognition model for medical terms.
+
+## The Dataset
+
+To train a NER model, we need a properly labeled dataset with medical entities. The [BC5CDR dataset](https://biocreative.bioinformatics.udel.edu/tasks/biocreative-v/track-3-cdr/) contains labeled disease and chemical entities from more than 1500 papers. You may download the dataset after registering on their web site.
+
+The BC5CDR dataset looks like this:
+
+```
+6794356|t|Tricuspid valve regurgitation and lithium carbonate toxicity in a newborn infant.
+6794356|a|A newborn with massive tricuspid regurgitation, atrial flutter, congestive heart failure, and a high serum lithium level is described. This is the first patient to initially manifest tricuspid regurgitation and atrial flutter, and the 11th described patient with cardiac disease among infants exposed to lithium compounds in the first trimester of pregnancy. Sixty-three percent of these infants had tricuspid valve involvement. Lithium carbonate may be a factor in the increasing incidence of congenital heart disease when taken during early pregnancy. It also causes neurologic depression, cyanosis, and cardiac arrhythmia when consumed prior to delivery.
+6794356	0	29	Tricuspid valve regurgitation	Disease	D014262
+6794356	34	51	lithium carbonate	Chemical	D016651
+6794356	52	60	toxicity	Disease	D064420
+...
+```
+
+In this dataset, the paper title and abstract are given in the first two lines, followed by individual entities, each with its beginning and end character positions within the combined title+abstract block. In addition to the entity type, you get the ID of this entity within some medical ontology.
+
+You will need to write some Python code to convert this into BIO encoding.
+
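+As a starting point, here is a minimal sketch of one possible conversion (assuming whitespace tokenization and the parsed `(start, end, type)` spans shown above; function and variable names are illustrative):
+
+```python
+def spans_to_bio(text, entities):
+    """Convert (start, end, entity_type) character spans into word-level BIO tags.
+
+    Assumes simple whitespace tokenization; real code should also handle punctuation.
+    """
+    words, tags, pos = [], [], 0
+    for word in text.split():
+        start = text.index(word, pos)   # character offset of this word
+        end = start + len(word)
+        pos = end
+        tag = 'O'
+        for e_start, e_end, e_type in entities:
+            if start >= e_start and end <= e_end:
+                tag = ('B-' if start == e_start else 'I-') + e_type
+                break
+        words.append(word)
+        tags.append(tag)
+    return words, tags
+
+title = "Tricuspid valve regurgitation and lithium carbonate toxicity in a newborn infant."
+entities = [(0, 29, 'Disease'), (34, 51, 'Chemical'), (52, 60, 'Disease')]
+print(list(zip(*spans_to_bio(title, entities))))
+```
+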
+## The Network
+
+A first attempt at NER can be made using an LSTM network, as in the example you have seen during the lesson. However, in NLP tasks, the [transformer architecture](https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)), and specifically [BERT language models](https://en.wikipedia.org/wiki/BERT_(language_model)), show much better results. Pre-trained BERT models understand the general structure of a language, and can be fine-tuned for specific tasks with relatively small datasets and computational cost.
+
+Since we are planning to apply NER to a medical scenario, it makes sense to use a BERT model trained on medical texts. Microsoft Research has released a pre-trained model called [PubMedBERT](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract) ([publication](https://arxiv.org/abs/2007.15779)), which was trained using texts from the [PubMed](https://pubmed.ncbi.nlm.nih.gov/) repository.
+
+The *de facto* standard for training transformer models is the [Hugging Face Transformers](https://huggingface.co/) library. Hugging Face also hosts a repository of community-maintained pre-trained models, including PubMedBERT. To load and use this model, we just need a couple of lines of code:
+
+```python
+model_name = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract"
+classes = ... # number of classes: 2*entities+1
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = BertForTokenClassification.from_pretrained(model_name, classes)
+```
+
+This gives us the `model` itself, built for the token classification task with `classes` output classes, as well as a `tokenizer` object that can split input text into tokens. You will need to convert the dataset into BIO format, taking PubMedBERT tokenization into account, as sketched below. You can use [this bit of Python code](https://gist.github.com/shwars/580b55684be3328eb39ecf01b9cbbd88) as an inspiration.
+
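+For example, one common way to align word-level BIO tags with subword tokens is sketched below (assuming the `words`, `tags` and `tag2id` objects produced in the previous step; ignoring continuation subwords is one possible choice, not the only one):
+
+```python
+# Align word-level BIO tags with PubMedBERT subword tokens
+encoding = tokenizer(words, is_split_into_words=True, truncation=True)
+
+labels, previous_word = [], None
+for word_idx in encoding.word_ids():
+    if word_idx is None:                # special tokens such as [CLS] and [SEP]
+        labels.append(-100)             # -100 is ignored by the loss in Transformers
+    elif word_idx != previous_word:     # first subword of a word gets the word's tag
+        labels.append(tag2id[tags[word_idx]])
+    else:                               # further subwords of the same word: also ignored here
+        labels.append(-100)
+    previous_word = word_idx
+```
+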
+## Takeaway
+
+This task is very close to the actual task you are likely to have if you want to gain more insights into large volumes of natural language text. In our case, we can apply our trained model to the [dataset of COVID-related papers](https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge) and see which insights we will be able to get. [This blog post](https://soshnikov.com/science/analyzing-medical-papers-with-azure-and-text-analytics-for-health/) and [this paper](https://www.mdpi.com/2504-2289/6/1/4) describe the research that can be done on this corpus of papers using NER.
diff --git a/lessons/6-Other/22-DeepRL/README.md b/lessons/6-Other/22-DeepRL/README.md
index 15ecb6c84de9fdc498e91bcf714a9172a3b4e357..6f3266fe2c276d790f1fcdbf354a75254d4db5f6 100644
--- a/lessons/6-Other/22-DeepRL/README.md
+++ b/lessons/6-Other/22-DeepRL/README.md
@@ -94,7 +94,7 @@ Reinforcement Learning nowadays is a fast growing field of research. Some of the
 * Teaching computer to play board games, such as Chess and Go. Recent state-of-the-art programs like **Alpha Zero** was trained from scratch by two agents playing against each other, and improving at each step.
 * In industry, RL is used to create control systems from simulation. A service called [Bonsai](https://azure.microsoft.com/services/project-bonsai/) is specifically designed for that.
 
-## Assignment: [Train a Mountain Car](labs/README.md)
+## Assignment: [Train a Mountain Car](lab/README.md)
 
 Your goal during this assignment would be to train a different Gym environment - [Mountain Car](https://www.gymlibrary.ml/environments/classic_control/mountain_car/).