Sunday, August 23, 2020

How to create dataset annotations for instance segmentation and semantic segmentation

In this blog we will learn how to create dataset annotations for instance segmentation and semantic segmentation.

First, we need to understand what semantic segmentation and instance segmentation are.


This image makes the difference clear. In semantic segmentation, every pixel in the image belongs to one particular class.

In instance segmentation, different instances of the same class are segmented individually.

Now, coming to the point: Mask R-CNN is an instance segmentation model, and DeepLab is a semantic segmentation model.

After annotation you can easily train your own Mask R-CNN and DeepLab models. Dataset annotation is a very important part of machine learning.

In this blog we will use the Labelme tool to annotate our dataset. You need to install this tool first. Don't worry, this is not rocket science.

I recommend using a virtual environment to avoid installation errors.

Create a virtual environment using Python 3.6:

cmd :- virtualenv --python=python3.6 annotations

After this, activate the virtualenv:

cmd :- source annotations/bin/activate

After this, install the Labelme tool:

cmd :- pip install labelme

We need to install one more package:

cmd :- pip install pyqt5

The installation is done; now we are ready to prepare your dataset for model training. Launch Labelme:

cmd :- labelme 

After running this command, a window like this will open.


After this, click the Open Dir button to select your images folder for annotation. Now we are ready to prepare the dataset, one image at a time. Click the Create Polygons button to draw a polygon around each object in the image. After creating a polygon, enter the name of your label class, then click the Save button. You can choose where to store the annotation files; I recommend storing the JSON annotations in the images folder itself, so each JSON file sits next to the image it was generated from.
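Each save produces one Labelme JSON file per image, whose "shapes" list holds your polygons. A minimal sketch for inspecting such a file (the helper name is my own; the path you pass is up to you):

```python
import json

# Inspect a Labelme annotation file: return (label, number of polygon points)
# for every shape saved in it.
def summarize_annotation(json_path):
    with open(json_path) as f:
        data = json.load(f)
    # Each entry in "shapes" is one polygon with its class label and points.
    return [(shape["label"], len(shape["points"])) for shape in data["shapes"]]
```

Running this on one of your saved files is a quick sanity check that the labels and polygons were stored as you expect.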

 
This process is the same for both instance segmentation and semantic segmentation.

After annotation, we need to convert the annotated dataset into the instance segmentation and semantic segmentation formats.

Note :- Friends, dataset preparation is the most important part of an AI model, so please make sure your polygons are correct; otherwise, bad polygons will hurt your model.

Optional step :- We also need to remove negative images (images that do not contain any objects) from the images folder. I have already provided a script for this in my GitHub repository; you can use it to remove negative images from your dataset folder. Just replace the path with your own folder.

Script :- delete_images_json_not.py   (you may need to modify this script according to your requirements)
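The idea behind the script can be sketched as follows (a simplified version of my own; the actual delete_images_json_not.py in the repository may differ): an image with no matching JSON file next to it was never annotated, so it is removed.

```python
import os

# Delete any image in `folder` that has no matching Labelme .json file
# beside it; return the names of the removed files.
def remove_negative_images(folder, exts=(".jpg", ".jpeg", ".png")):
    removed = []
    for name in os.listdir(folder):
        stem, ext = os.path.splitext(name)
        has_json = os.path.exists(os.path.join(folder, stem + ".json"))
        if ext.lower() in exts and not has_json:
            os.remove(os.path.join(folder, name))
            removed.append(name)
    return removed
```

Note this assumes you saved each JSON in the images folder as recommended above.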

(1) Instance segmentation for Mask R-CNN custom training

First we will convert our dataset to the instance segmentation format for Mask R-CNN custom training.

We need a Python script for this. The Labelme GitHub repository provides this script; you can read more about it there.

To keep the steps simple, just clone my GitHub repository.

Before this we need to install pycocotools:

cmd :-  pip install pycocotools

cmd :- git clone https://github.com/Manishsinghrajput98/dataset_pre.git

cmd :- cd dataset_pre

cmd :- cd instance_segmentation_mask_rcnn

Note :- I have included a use.txt file in the folder; read it to learn how to use this script. Don't worry, this is not rocket science. You just replace the path with your annotated dataset folder (via command line arguments).

You have to change the label names in labels.txt. Please don't remove the labels __ignore__ and _background_; otherwise your model will not train correctly.
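For reference, a labels.txt matching the example dataset used in this tutorial can be generated like this (the "kangaroo" class is taken from the example paths below; replace it with your own class names, one per line):

```python
# The first two entries are required by the conversion scripts;
# "kangaroo" is just this tutorial's example class.
labels = ["__ignore__", "_background_", "kangaroo"]

with open("labels.txt", "w") as f:
    f.write("\n".join(labels) + "\n")
```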

One more point: you need to pass command line arguments. We need just three paths: (1) the annotated dataset folder, (2) the labels.txt file, and (3) the output folder.

You need to run this script inside the virtual environment created earlier:

cmd :- python labelme2coco.py /home/rajput/Desktop/dataset_pre/instance_segmentation_mask_rcnn/kangaroo_images/ /home/rajput/Desktop/dataset_pre/instance_segmentation_mask_rcnn/kangaroo_train --labels /home/rajput/Desktop/dataset_pre/instance_segmentation_mask_rcnn/labels.txt

For Mask R-CNN training we need to run the Python script twice: once for the train folder and once for the validation folder. You can use 10 percent of the images for validation and the rest for training.
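A minimal sketch of that 90/10 split, assuming the image/JSON pairs sit together in one source folder (function and folder names are my own; adjust the extensions to your data):

```python
import os
import random
import shutil

# Copy annotated image/JSON pairs into train and val folders,
# sending roughly `val_ratio` of them to validation.
def split_dataset(src, train_dir, val_dir, val_ratio=0.1, seed=0):
    stems = sorted(os.path.splitext(n)[0]
                   for n in os.listdir(src) if n.endswith(".json"))
    random.Random(seed).shuffle(stems)
    n_val = max(1, int(len(stems) * val_ratio))
    for dst, group in ((val_dir, stems[:n_val]), (train_dir, stems[n_val:])):
        os.makedirs(dst, exist_ok=True)
        for stem in group:
            for ext in (".json", ".jpg", ".jpeg", ".png"):
                path = os.path.join(src, stem + ext)
                if os.path.exists(path):
                    shutil.copy(path, dst)
    return n_val, len(stems) - n_val
```

Splitting by JSON stem keeps each image together with its annotation, so you can then run labelme2coco.py once on each output folder.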

After finishing this process, create an Annotations folder. Inside it, create train and val folders, and inside each of those create an images folder. Then copy/paste the output images and COCO JSON files into the corresponding images folder.

After this, our Annotations folder is ready for Mask R-CNN custom dataset training.

The folder structure looks like this:

Annotations :-
(1) train :-
    (I) images
    (II) annotations.json
(2) val :-
    (I) images
    (II) annotations.json

Now let's move on to semantic segmentation for DeepLab custom training.

In the next blog I have covered all the details of Mask R-CNN custom dataset training; please review that blog to learn how to train a Mask R-CNN model on a custom dataset. In that blog I have used Google Colab.
Click here for the URL.

(2) Semantic segmentation for DeepLab custom training

Labelme also provides a Python script to convert annotations for semantic segmentation. My repository contains this script; you just need to run it inside the virtual environment.

One more point: you need to pass command line arguments. We need just three paths: (1) the annotated dataset folder, (2) the labels.txt file, and (3) the output folder.

Also, you have to change the label names in labels.txt. Please don't remove the labels __ignore__ and _background_; otherwise your model will not train correctly.

cmd :- cd semantic_segmentation_deeplabs

Note :- I have included a use.txt file in this folder too; read it to learn how to use the script. Don't worry, this is not rocket science. You just replace the path with your annotated dataset folder (via command line arguments).

cmd :- python labelme2voc.py /home/rajput/Desktop/dataset_pre/semantic_segmentation_deeplabs/kangaroo_images/ /home/rajput/Desktop/dataset_pre/semantic_segmentation_deeplabs/kangaroo_train --labels /home/rajput/Desktop/dataset_pre/semantic_segmentation_deeplabs/labels.txt

The generated output folder contains multiple folders, but we need only three items for DeepLab model training: (1) the class_names.txt file, (2) the JPEGImages folder, and (3) the SegmentationClassPNG folder.

After this process, copy/paste these folders into a separate folder; we will need them in a future blog for DeepLab model training.
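That copy step can be sketched like this (the destination name "deeplab_dataset" and the helper name are my own choices; pass the output folder produced by labelme2voc.py):

```python
import os
import shutil

# Collect the three labelme2voc.py outputs needed for DeepLab training
# into one destination folder.
def collect_outputs(voc_out, dst="deeplab_dataset"):
    os.makedirs(dst, exist_ok=True)
    shutil.copy(os.path.join(voc_out, "class_names.txt"), dst)
    for folder in ("JPEGImages", "SegmentationClassPNG"):
        out = os.path.join(dst, folder)
        if not os.path.exists(out):
            shutil.copytree(os.path.join(voc_out, folder), out)
```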

For the DeepLab model we don't need a dataset split for now. Later, when we create TFRecords, we will need a split; we will cover that process in a new blog :- (DeepLab Custom Dataset Training).

This is the complete process of dataset preparation for instance segmentation and semantic segmentation.

Here is a short video of the process; you can watch it and follow along with the dataset preparation for Mask R-CNN and DeepLab model training. (YouTube)




Thanks for reading.

If you have any doubts, please comment.
