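Source Code Implementation: Normalization

This post summarizes the roles of and differences between BatchNorm, LayerNorm, and RMSNorm, and gives simplified (dummy) implementations that follow the same approach as PyTorch, so that they can be read side by side with the official source code.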
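As a rough illustration of how the three normalizations differ, here is a minimal sketch in plain Python rather than PyTorch, so it runs without dependencies. The function names, the `eps` default, and the omission of learnable scale and shift parameters are assumptions made for this sketch; it is not the code from the post.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance."""
    mean = sum(x) / len(x)
    # Biased variance (divide by N), which is what normalization layers use.
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def rms_norm(x, eps=1e-5):
    """Rescale one feature vector by its root mean square; no mean subtraction."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def batch_norm(batch, eps=1e-5):
    """Normalize each feature across the batch (training-mode statistics only)."""
    columns = list(zip(*batch))  # one tuple per feature
    # The per-feature statistics are the same math as layer_norm,
    # only computed along the batch axis instead of the feature axis.
    normed_columns = [layer_norm(list(col), eps) for col in columns]
    return [list(row) for row in zip(*normed_columns)]
```

The only difference between `layer_norm` and `batch_norm` here is the axis over which the statistics are computed, while `rms_norm` skips the mean subtraction entirely. Running statistics for inference-time BatchNorm are left out of this sketch.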


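Source Code Implementation: MobileNet

This post summarizes how MobileNetV1 / V2 (depthwise separable convolution, inverted residual) differ from standard convolution, and gives minimal PyTorch modules that follow common implementations, so that they can be read side by side with source code such as timm.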
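To make the difference from standard convolution concrete, the weight counts can be compared directly. This is a small sketch in plain Python; the function names are hypothetical, and bias terms and normalization layers are ignored as a simplifying assumption.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    std = standard_conv_params(32, 64, 3)
    sep = depthwise_separable_params(32, 64, 3)
    print(std, sep, sep / std)
```

For 32 input channels, 64 output channels, and a 3 x 3 kernel, this gives 18432 weights against 2336, a ratio of 1/64 + 1/9, or roughly 0.127.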


PPNet

PPNet (Part-aware Prototype Network for Few-shot Semantic Segmentation) [1] decomposes the holistic class representation into a set of part-aware prototypes and leverages unlabeled data to better model intra-class variations. In addition, a graph neural network is used to generate and enhance the proposed part-aware prototypes. This post covers some details from reading and implementing it.


PANet

PANet (PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment) [1] learns class-specific prototype representations from the support images and matches each pixel of the query image to the learned prototypes. This post covers some details from reading and implementing it.


CANet

CANet (CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning) [1] consists of a two-branch dense comparison module, which performs multi-level feature comparison, and an iterative optimization module, which iteratively refines the predicted results. This post covers some details from reading and implementing it.


SG-One

SG-One (SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation) [1] adopts a masked average pooling strategy to produce the guidance features, then uses cosine similarity to relate the guidance features to the features of the pixels in the query image. This post covers some details from reading and implementing it.
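The two operations named above, masked average pooling and cosine similarity, can be sketched as follows. This is a dependency-free illustration in plain Python, not the SG-One code: the nested-list layout (H x W x C), the function names, and the `eps` guard are all assumptions for this sketch.

```python
import math


def masked_average_pooling(features, mask):
    """Average the feature vectors of the pixels selected by a 0/1 mask.

    features: H x W x C nested lists; mask: H x W nested lists of 0 or 1.
    Returns a length-C guidance vector.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for feat, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += feat[c]
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]


def cosine_similarity(a, b, eps=1e-8):
    """Cosine of the angle between two vectors, guarded against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)


def similarity_map(query_features, guidance):
    """Compare every query pixel's feature vector with the guidance vector."""
    return [[cosine_similarity(feat, guidance) for feat in row]
            for row in query_features]
```

In this sketch the guidance vector would be pooled from the support features under the support mask, and `similarity_map` would then be applied to the query features to obtain a per-pixel similarity score.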


co-FCN

co-FCN (Conditional Networks for Few-Shot Semantic Segmentation) [1] can handle sparse pixel-wise annotations and still achieve nearly the same accuracy as with dense annotations. This post covers some details from reading and implementing it.


OSLSM

OSLSM (One-Shot Learning for Semantic Segmentation) [1] was the first to propose a two-branch approach to one-shot semantic segmentation. The conditioning branch trains a network to produce the parameters $\theta$, and the segmentation branch outputs the final mask based on $\theta$. This post covers some details from reading and implementing it.


Mask R-CNN

Mask R-CNN [1] is a framework for object instance segmentation. It extends Faster R-CNN by adding a branch that predicts an object mask in parallel with the existing branch for bounding box recognition. This post covers some details from reading and implementing it.


LTM

LTM (Local Transformation Module) [1] focuses on the relationships among local features. It applies a linear transformation to the relationship matrix in a high-dimensional metric embedding space to accomplish the transformation. This post covers some details from reading and implementing it.

