Commit d0573edb authored by Dmitri Soshnikov

Add Semantic Segmentation section

parent 228b87e2
# Segmentation
We have already learnt about Object Detection, which lets us locate objects in an image by predicting their *bounding boxes*. For some tasks, however, bounding boxes are not enough and we need more precise object localization. This task is called **segmentation**.
Segmentation can be viewed as **pixel classification**: for **each** pixel of the image we must predict its class (*background* being one of the classes). There are two main types of segmentation:
* **Semantic segmentation** only predicts the class of each pixel, and does not distinguish between different objects of the same class.
* **Instance segmentation** additionally separates each class into individual object instances.
For instance segmentation, 10 sheep in an image are 10 different objects; for semantic segmentation, all of them are represented by a single class.
<img src="images/instance_vs_semantic.jpeg" width="50%">
> Image from [this blog post](https://nirmalamurali.medium.com/image-classification-vs-semantic-segmentation-vs-instance-segmentation-625c33a08d50)
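To make the difference in output format concrete, here is a minimal sketch in plain Python. The toy mask and the `label_instances` helper are hypothetical and only illustrate what the two kinds of masks look like. Real instance segmentation models do not work by connected-component labeling, which would merge objects that touch each other.

```python
from collections import deque

# Toy 5x6 semantic mask: 0 = background, 1 = "sheep".
# Two separate blobs share the same class id.
semantic_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]


def label_instances(mask, target_class=1):
    """Give each 4-connected blob of target_class its own id (1, 2, ...)."""
    height, width = len(mask), len(mask[0])
    instance_ids = [[0] * width for _ in range(height)]
    next_id = 0
    for row in range(height):
        for col in range(width):
            if mask[row][col] != target_class or instance_ids[row][col]:
                continue
            # Found an unlabeled pixel: start a new instance and flood-fill it.
            next_id += 1
            instance_ids[row][col] = next_id
            queue = deque([(row, col)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr][nc] == target_class
                            and instance_ids[nr][nc] == 0):
                        instance_ids[nr][nc] = next_id
                        queue.append((nr, nc))
    return instance_ids, next_id


instance_mask, num_instances = label_instances(semantic_mask)
for row in instance_mask:
    print(row)
```

In the semantic mask both sheep carry the same value `1`, while in the instance mask each sheep gets its own id.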
There are different neural architectures for segmentation, but they all share the same structure. In a way, it is similar to an autoencoder, but instead of reconstructing the original image, our goal is to construct a **mask**. Thus, a segmentation network has the following parts:
* **Encoder** extracts features from the input image.
* **Decoder** transforms those features into the **mask image**, which has the same size as the input and a number of channels equal to the number of classes.
<img src="images/segm.png" width="80%">
> Image from [this publication](https://arxiv.org/pdf/2001.05566.pdf)
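The encoder-decoder structure can be sketched in PyTorch roughly as follows. `TinySegNet`, its layer sizes, and the 64x64 input are assumptions chosen for illustration; this is not one of the architectures from the notebook, only a minimal model that shows how the output mask keeps the input's spatial size with one channel per class.

```python
import torch
from torch import nn


class TinySegNet(nn.Module):
    """Hypothetical minimal encoder-decoder, for illustration only."""

    def __init__(self, in_channels=3, num_classes=2):
        super().__init__()
        # Encoder: convolutions extract features, pooling halves the size twice.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Decoder: upsampling doubles the size twice, back to the input size.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(32, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(16, num_classes, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


model = TinySegNet(in_channels=3, num_classes=2)
images = torch.randn(4, 3, 64, 64)       # batch of 4 RGB images
logits = model(images)                   # one score per class per pixel
predicted_mask = logits.argmax(dim=1)    # class index per pixel
print(logits.shape, predicted_mask.shape)
```

Taking `argmax` over the channel dimension turns the per-class scores into a single mask of class indices.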
The loss function used for segmentation deserves special mention. In classical autoencoders we need to measure the similarity between two images, and we can use mean squared error to do that. In segmentation, each pixel in the target mask image represents a class number (one-hot-encoded along the third dimension), so we need a loss function designed for classification: cross-entropy loss, averaged over all pixels. If the mask is binary, **binary cross-entropy loss** (BCE) is used.
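As a rough sketch of how these losses look in PyTorch, the example below uses random tensors in place of real model outputs and masks; the shapes and variable names are assumptions. Note that `nn.CrossEntropyLoss` expects the target as integer class indices of shape `(N, H, W)` rather than as a one-hot tensor, and with its default settings it averages over all pixels in the batch.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch, num_classes, height, width = 2, 3, 4, 4

# Multi-class case: raw scores per class and pixel, integer class per pixel.
logits = torch.randn(batch, num_classes, height, width)
target = torch.randint(0, num_classes, (batch, height, width))
ce_loss = nn.CrossEntropyLoss()(logits, target)

# The same value computed by hand: negative log-probability of the true
# class at every pixel, averaged over all pixels.
log_probs = torch.log_softmax(logits, dim=1)
manual_ce = -log_probs.gather(1, target.unsqueeze(1)).mean()

# Binary case: one output channel, target mask of 0s and 1s as floats.
binary_logits = torch.randn(batch, 1, height, width)
binary_target = torch.randint(0, 2, (batch, 1, height, width)).float()
bce_loss = nn.BCEWithLogitsLoss()(binary_logits, binary_target)

print(ce_loss.item(), manual_ce.item(), bce_loss.item())
```

`nn.BCEWithLogitsLoss` applies the sigmoid internally, so it is given raw scores; if the network already ends with a sigmoid, `nn.BCELoss` would be the matching choice.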
## Segmentation for Medical Imaging
In this lesson, we will see segmentation in action by training a network to recognize human nevi (moles) in medical images. We will be using the <a href="https://www.fc.up.pt/addi/ph2%20database.html">PH<sup>2</sup> Database</a> of dermoscopy images. This dataset contains 200 images of three classes: typical nevus, atypical nevus, and melanoma. Each image comes with a corresponding **mask** that outlines the nevus.
<img src="images/navi.png"/>
We will train a model to segment any nevus from the background.
## Notebooks
Open [the notebook](SemanticSegmentationPytorch.ipynb) to learn more about different semantic segmentation architectures and see them in action.
## [Lab](lab/README.md)
In this lab, we encourage you to try **human body segmentation** using the [Segmentation Full Body MADS Dataset](https://www.kaggle.com/datasets/tapakah68/segmentation-full-body-mads-dataset) from Kaggle.
## Assignment
Body segmentation is just one of the common tasks that we can do with images of people. Other important tasks include **skeleton detection** and **pose detection**. Try out the [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) library to see how pose detection can be used.
4-ComputerVision/12-Segmentation/images/navi.png (38.4 KiB)
4-ComputerVision/12-Segmentation/images/segm.png (two versions: 219 KiB and 64.7 KiB)
4-ComputerVision/12-Segmentation/images/segnet.png (two versions: 292 KiB and 72 KiB)
4-ComputerVision/12-Segmentation/images/unet.png (two versions: 75.4 KiB and 27.2 KiB)
# Human Body Segmentation
Lab Assignment from [AI for Beginners Curriculum](https://github.com/microsoft/ai-for-beginners).
## Task
In video production, for example in weather forecasts, we often need to cut a person out of the camera image and place them on top of other footage. This is typically done using **chroma key** techniques, where a person is filmed in front of a uniformly colored background, which is then removed. In this lab, we will train a neural network model to cut out the human silhouette.
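Once a model predicts a silhouette mask, placing the person on new footage is a per-pixel blend of the two images. The sketch below uses NumPy with small constant-colored arrays standing in for real frames; the `composite` function and all array names are hypothetical and not part of the lab notebook.

```python
import numpy as np


def composite(foreground, background, mask):
    """Blend foreground over background using a mask with values in [0, 1].

    foreground, background: float arrays of shape (H, W, 3)
    mask: float array of shape (H, W); 1 = person, 0 = background
    """
    alpha = mask[..., np.newaxis]  # shape (H, W, 1), broadcasts over RGB
    return alpha * foreground + (1.0 - alpha) * background


height, width = 4, 6
camera_frame = np.full((height, width, 3), 0.8)   # stand-in for the camera image
new_backdrop = np.full((height, width, 3), 0.1)   # stand-in for the other footage
person_mask = np.zeros((height, width))
person_mask[1:3, 2:4] = 1.0                       # stand-in for a predicted mask

result = composite(camera_frame, new_backdrop, person_mask)
print(result[..., 0])
```

Pixels where the mask is 1 come from the camera frame, and all other pixels come from the new backdrop. A soft mask with values between 0 and 1 would give smoother edges.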
## The Dataset
We will be using [Segmentation Full Body MADS Dataset](https://www.kaggle.com/datasets/tapakah68/segmentation-full-body-mads-dataset) from Kaggle. Download the dataset manually from Kaggle.
## Starting Notebook
Start the lab by opening [BodySegmentation.ipynb](BodySegmentation.ipynb).
## Takeaway
Body segmentation is just one of the common tasks that we can do with images of people. Other important tasks include **skeleton detection** and **pose detection**. Look into the [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) library to see how those tasks can be implemented.
@@ -19,8 +19,8 @@ We are currently actively looking for contributions on the following topics:
- [ ] Write section + notebook on OpenCV / preprocessing images
- [ ] Write section on Deep Reinforcement Learning
- [ ] Translate Semantic Segmentation notebook to TensorFlow
- [ ] Improve section + notebook on Object Detection
- [ ] Write section + notebook on Instance Segmentation
- [ ] PyTorch Lightning (for [this section](https://github.com/microsoft/AI-For-Beginners/blob/main/3-NeuralNetworks/05-Frameworks/README.md))
- [ ] Improve GAN section and translate samples to PyTorch ([here](https://github.com/microsoft/AI-For-Beginners/blob/main/4-ComputerVision/10-GANs/README.md))
- [ ] Translate Autoencoder sample to PyTorch ([here](https://github.com/microsoft/AI-For-Beginners/blob/main/4-ComputerVision/09-Autoencoders/README.md))
- [ ] Write section + samples on Named Entity Recognition
- [ ] Create samples for training our own embeddings for [this section](https://github.com/microsoft/AI-For-Beginners/tree/main/5-NLP/15-LanguageModeling)
@@ -65,7 +65,7 @@ For a gentle introduction to *AI in the Cloud* topic you may consider taking the
<tr><td>9</td><td>Autoencoders and VAEs</td><td><a href="4-ComputerVision/09-Autoencoders/README.md">Text</a></td><td><a href="4-ComputerVision/09-Autoencoders/AutoEncodersPytorch.ipynb">PyTorch</td><td><a href="4-ComputerVision/09-Autoencoders/AutoencodersTF.ipynb">TensorFlow</a></td><td></td></tr>
<tr><td>10</td><td>Generative Adversarial Networks</td><td><a href="4-ComputerVision/10-GANs/README.md">Text</a></td><td><a href="4-ComputerVision/10-GANs/GANPytorch.ipynb">PyTorch</td><td><a href="4-ComputerVision/10-GANs/GANs.ipynb">TensorFlow</a></td><td></td></tr>
<tr><td>11</td><td>Object Detection</td><td>Text</td><td>PyTorch</td><td>TensorFlow</td><td></td></tr>
-<tr><td>12</td><td>Instance Segmentation. U-Net</td><td>Text</td><td><a href="4-ComputerVision/12-Segmentation/SemanticSegmentationPytorch.ipynb">PyTorch</td><td>TensorFlow</td><td></td></tr>
+<tr><td>12</td><td>Semantic Segmentation. U-Net</td><td><a href="4-ComputerVision/12-Segmentation/README.md">Text</a></td><td><a href="4-ComputerVision/12-Segmentation/SemanticSegmentationPytorch.ipynb">PyTorch</td><td>TensorFlow</td><td><a href="4-ComputerVision/12-Segmentation/lab/README.md">Lab</a></td></tr>
<tr><td>V</td><td colspan="2"><b><a href="5-NLP/README.md">Natural Language Processing</a></b></td>
<td><a href="https://docs.microsoft.com/learn/modules/intro-natural-language-processing-pytorch/?WT.mc_id=academic-57639-dmitryso">MS Learn</a></td>
<td><a href="https://docs.microsoft.com/learn/modules/intro-natural-language-processing-TensorFlow/?WT.mc_id=academic-57639-dmitryso">MS Learn</a></td>
@@ -77,7 +77,7 @@ For a gentle introduction to *AI in the Cloud* topic you may consider taking the
<tr><td>17</td><td>Generative Recurrent Networks</td><td><a href="5-NLP/17-GenerativeNetworks/README.md">Text</a></td><td><a href="5-NLP/17-GenerativeNetworks/GenerativePyTorch.md">PyTorch</a></td><td><a href="5-NLP/17-GenerativeNetworks/GenerativeTF.md">TensorFlow</a></td><td></td></tr>
<tr><td>18</td><td>Transformers. BERT.</td><td><a href="5-NLP/18-Transformers/README.md">Text</a></td><td><a href="5-NLP/18-Transformers/TransformersPyTorch.md">PyTorch</a></td><td><a href="5-NLP/18-Transformers/TransformersTF.md">TensorFlow</a></td><td></td></tr>
<tr><td>19</td><td>Named Entity Recognition</td><td>Text</td><td>PyTorch</td><td>TensorFlow</td><td></td></tr>
-<tr><td>20</td><td>Text Generation using GPT</td><td>Text</td><td>PyTorch</td><td>TensorFlow</td><td></td></tr>
+<tr><td>20</td><td>Large Language Models, Prompt Programming and Few-Shot Tasks</td><td>Text</td><td>PyTorch</td><td>TensorFlow</td><td></td></tr>
<tr><td>VI</td><td colspan="4"><b>Other AI Techniques</b></td><td>PAT</td></tr>
<tr><td>21</td><td>Genetic Algorithms</td><td><a href="6-Other/21-GeneticAlgorithms/README.md">Text</a><td colspan="2"><a href="6-Other/21-GeneticAlgorithms/Genetic.ipynb">Notebook</a></td><td></td></tr>
<tr><td>22</td><td>Deep Reinforcement Learning</td><td><a href="6-Other/22-DeepRL/README.md">Text</a></td><td>PyTorch</td><td>TensorFlow</td><td></td></tr>