FM — Flow Matching for Generative Modeling

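Imagine generating an image as letting a cloud of noise slowly "flow" into an island of data: the starting point is random chaos, the end point a sharp image. Diffusion models (DDPM) follow a winding riverbed: particles must meander along a preset VP (variance-preserving) noise schedule, so sampling takes many steps over a long path. Optimal transport (OT), by contrast, is like a straight shipping lane, the shortest route between two points.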
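The "straight shipping lane" can be made concrete with a minimal sketch. It assumes the simplest straight-line (OT-style) conditional path, x_t = (1 - t) x0 + t x1, with no residual noise at t = 1; the function names `ot_path` and `ot_velocity` are illustrative and not taken from the post. No network is trained here; the velocity shown is only the regression target such a model would be fitted to.

```python
import numpy as np

def ot_path(x0, x1, t):
    """Point on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return (1.0 - t) * x0 + t * x1

def ot_velocity(x0, x1):
    """Velocity along the straight path; it does not depend on t."""
    return x1 - x0

rng = np.random.default_rng(0)
x0 = rng.standard_normal(4)            # noise sample (start of the flow)
x1 = np.array([1.0, 2.0, 3.0, 4.0])    # stand-in for a data sample (assumption)

# Because the velocity is constant, one Euler step of size 1 reaches x1 exactly.
x_end = x0 + 1.0 * ot_velocity(x0, x1)
x_mid = ot_path(x0, x1, 0.5)           # halfway point on the straight line
```

The one-step result holds only for this single noise/data pair with a known velocity. A learned velocity field averages over many pairs and is generally not perfectly straight, so practical samplers still take several steps.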

SD - Stable Diffusion

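Earlier diffusion models (such as DDPM) repeatedly wipe away noise directly in "pixel space": a 512×512 RGB image has about 260,000 pixels (roughly 786,000 values across three color channels), every denoising step has to be computed over a canvas that large, and training can easily tie up hundreds of GPUs for weeks. The core trick of this paper (latent diffusion, LDM, which later became Stable Diffusion) is to first compress the image into a much smaller "thumbnail sketch world" and paint there.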
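A quick back-of-the-envelope calculation shows how much smaller the compressed canvas is. The downsampling factor of 8 and the 4 latent channels are assumptions matching the common Stable Diffusion v1 setup; the excerpt itself does not state them, and `tensor_size` is an illustrative helper.

```python
def tensor_size(height, width, channels):
    """Number of scalar values in a height x width x channels tensor."""
    return height * width * channels

# Pixel space: a 512x512 RGB image.
pixel_values = tensor_size(512, 512, 3)

# Latent space: assumed 8x spatial downsampling and 4 channels.
latent_values = tensor_size(512 // 8, 512 // 8, 4)

ratio = pixel_values / latent_values
print(pixel_values, latent_values, ratio)  # 786432 16384 48.0
```

Under these assumptions each denoising step works on 48 times fewer values than it would in pixel space.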

DDPM — Denoising Diffusion Probabilistic Models

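Imagine a "play it backwards" game: first splash snow-like noise onto a sharp photo frame by frame until it becomes a screen full of TV static. DDPM trains a model to run this process in reverse: starting from pure static, it wipes away noise step by step until it produces a brand-new photo that has never been seen before.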
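The "splash noise frame by frame" half of the game has a closed form, sketched below with NumPy. The linear schedule (beta from 1e-4 to 0.02 over T = 1000 steps) is the setting commonly used with DDPM and is an assumption here, as is the helper name `q_sample`. The learned reverse process is not shown.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # per-step noise variances (assumed schedule)
alpha_bars = np.cumprod(1.0 - betas)    # fraction of signal variance left after each step

def q_sample(x0, t, rng):
    """Jump straight to step t: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    a = alpha_bars[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps, eps

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # toy "photo" scaled to [-1, 1]

x_early, _ = q_sample(x0, 0, rng)          # almost the clean image
x_late, _ = q_sample(x0, T - 1, rng)       # almost pure Gaussian static
```

Training then amounts to showing the model a noisy `x_t` and asking it to predict the `eps` that was added, which is what lets it undo the noise one step at a time.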

PPNet

PPNet (Part-aware Prototype Network for Few-shot Semantic Segmentation) [1] decomposes the holistic class representation into a set of part-aware prototypes and leverages unlabeled data to better model intra-class variations. In addition, a graph neural network is used to generate and enhance the proposed part-aware prototypes. This post covers some details of reading and implementing it.


PANet

PANet (PANet: Few-Shot Image Semantic Segmentation with Prototype Alignment) [1] learns class-specific prototype representations from images and matches each pixel to the learned prototypes. This post covers some details of reading and implementing it.


CANet

CANet (CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning) [1] consists of a two-branch dense comparison module, which performs multi-level feature comparison, and an iterative optimization module, which iteratively refines the predicted results. This post covers some details of reading and implementing it.


SG-One

SG-One (SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation) [1] adopts a masked average pooling strategy to produce guidance features, then uses cosine similarity to relate the guidance features to the features of each pixel. This post covers some details of reading and implementing it.
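Masked average pooling followed by a cosine similarity map can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names, the (C, H, W) layout, and the toy features are assumptions. For brevity the same feature map stands in for both the support and the query image.

```python
import numpy as np

def masked_average_pooling(features, mask, eps=1e-8):
    """Average the (C, H, W) features over pixels where the (H, W) mask is 1."""
    weighted = features * mask[None, :, :]
    return weighted.sum(axis=(1, 2)) / (mask.sum() + eps)   # shape (C,)

def cosine_similarity_map(features, guidance, eps=1e-8):
    """Cosine similarity between a (C,) guidance vector and every pixel's feature."""
    dot = np.tensordot(guidance, features, axes=([0], [0]))  # shape (H, W)
    norms = np.linalg.norm(features, axis=0) * np.linalg.norm(guidance)
    return dot / (norms + eps)

# Toy features: foreground pixels carry [1, 0], background pixels carry [0, 1].
mask = np.array([[1.0, 0.0],
                 [1.0, 0.0]])
features = np.stack([mask, 1.0 - mask])   # shape (2, 2, 2)

guidance = masked_average_pooling(features, mask)       # close to [1, 0]
similarity = cosine_similarity_map(features, guidance)  # close to the mask itself
```

In this toy case the similarity map is high exactly on the foreground pixels, which is the signal a segmentation branch would then use to guide its prediction.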


co-FCN

co-FCN (Conditional Networks for Few-Shot Semantic Segmentation) [1] can handle sparse pixel-wise annotations while achieving nearly the same accuracy as with dense annotations. This post covers some details of reading and implementing it.


OSLSM

OSLSM (One-Shot Learning for Semantic Segmentation) [1] was the first to propose a two-branch approach to one-shot semantic segmentation. The conditioning branch trains a network to produce the parameters $\theta$, and the segmentation branch outputs the final mask based on $\theta$. This post covers some details of reading and implementing it.


Mask R-CNN

Mask R-CNN [1] is a framework for object instance segmentation that extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. This post covers some details of reading and implementing it.

