Mask R-CNN[1] is a framework for object instance segmentation, which adds a branch for
predicting an object maskin parallel with the existing branch for bounding box recognition ofFaster R-CNN. There are some details of reading and implementing it.
Contents
Paper & Code & note
Paper: Mask R-CNN(ICCV 2017 paper)
Code: Pytorch
Note: Mendeley
Paper
Abstract
- It is a framework for object instance segmentation.
- It extends
Faster R-CNNby adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.- It challenges instance
segmentation, bounding-box objectdetection, and personkeypointdetection.- It serves as a solid baseline in instance-level recognition.
Problem Description

Problem Solution

Conceptual Understanding

Core Conception
Loss

Mask


RoIAlign


Experiments



Code
The complete code can be found in detectron2[2].
Note
More details of Mask R-CNN and its extends like RoIAlign, bilinear interpolation and etc. can be found in [3].
References
[1] He K, Gkioxari G, Dollár P, et al. Mask r-cnn[C]//Proceedings of the IEEE international conference on computer vision. 2017: 2961-2969.
[2] detectron2. https://github.com/facebookresearch/detectron2
[3] stone. “The amazing Mask R-CNN.” https://zhuanlan.zhihu.com/p/37998710