Mask R-CNN[1] is a framework for object instance segmentation, which adds a branch for
predicting an object mask
in parallel with the existing branch for bounding box recognition ofFaster R-CNN
. There are some details of reading and implementing it.
Contents
Paper & Code & note
Paper: Mask R-CNN(ICCV 2017 paper)
Code: Pytorch
Note: Mendeley
Paper
Abstract
- It is a framework for object instance segmentation.
- It extends
Faster R-CNN
by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.- It challenges instance
segmentation
, bounding-box objectdetection
, and personkeypoint
detection.- It serves as a solid baseline in instance-level recognition.
Problem Description
Problem Solution
Conceptual Understanding
Core Conception
Loss
Mask
RoIAlign
Experiments
Code
The complete code can be found in detectron2[2].
Note
More details of Mask R-CNN and its extends like RoIAlign, bilinear interpolation and etc. can be found in [3].
References
[1] He K, Gkioxari G, Dollár P, et al. Mask r-cnn[C]//Proceedings of the IEEE international conference on computer vision. 2017: 2961-2969.
[2] detectron2. https://github.com/facebookresearch/detectron2
[3] stone. “The amazing Mask R-CNN.” https://zhuanlan.zhihu.com/p/37998710