chester (@chesterzelaya)
2025-10-07 | ❤️ 270 | 🔁 29
< How to Architect a Model's Neck >
neck? yes, a neck is what we call the intermediate step between the backbone and the head in computer vision models
backbone - in charge of extracting multi-scale features
neck - in charge of fusing/reshaping features together, priming the data for the head
head - in charge of producing task-specific outputs (logits, boxes, masks)
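to make the three roles concrete, here is a rough NumPy sketch of how data could flow through them. everything in it is a stand-in i made up for illustration (`toy_backbone`, `toy_neck`, `toy_head`, the average-pooling "backbone", the random head weights); a real model would use learned conv layers at every stage.

```python
import numpy as np

def toy_backbone(image):
    """Hypothetical backbone: three feature maps at strides 2, 4, 8 (2x2 average pooling)."""
    feats = []
    x = image
    for _ in range(3):
        c, h, w = x.shape
        # Group pixels into 2x2 blocks and average each block.
        x = x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
        feats.append(x)
    return feats

def toy_neck(feats):
    """Hypothetical neck: upsample every map to the finest resolution, then concatenate."""
    target_h = feats[0].shape[1]
    resized = []
    for f in feats:
        scale = target_h // f.shape[1]
        # Nearest-neighbour upsampling by repeating rows and columns.
        resized.append(f.repeat(scale, axis=1).repeat(scale, axis=2))
    return np.concatenate(resized, axis=0)

def toy_head(fused, weight):
    """Hypothetical classification head: global average pool, then a linear layer."""
    pooled = fused.mean(axis=(1, 2))   # (C,)
    return weight @ pooled             # (num_classes,)

rng = np.random.default_rng(0)
image = rng.standard_normal((3, 32, 32))                 # (C, H, W)
feats = toy_backbone(image)                              # multi-scale features
fused = toy_neck(feats)                                  # one fused map for the head
logits = toy_head(fused, rng.standard_normal((5, 9)))    # 5 made-up classes
```

the point is only the shapes: the backbone hands over several maps at different resolutions, the neck turns them into something the head can consume directly.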
now, what are some of the ways you can fuse the low level features together?
concatenation - preserves info, increases channels/compute
addition - cheap, requires aligned channels; good default
element-wise multiplication - acts like a gate; can be fragile to scale
weighted summation - learnable mixing (e.g. BiFPN); best of both, slight overhead
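the four fusion options above can be sketched in a few lines of NumPy. function names are hypothetical, inputs are assumed to be `(C, H, W)` maps already resized to the same spatial size, and the sigmoid in the gating version is my own assumption (one way to keep the gate bounded). the weighted version follows the ReLU-then-normalize style of fusion used in BiFPN, with plain floats standing in for learnable weights.

```python
import numpy as np

def fuse_concat(a, b):
    # Stack along the channel axis: output has C_a + C_b channels.
    return np.concatenate([a, b], axis=0)

def fuse_add(a, b):
    # Element-wise sum: a and b must have identical shapes.
    return a + b

def fuse_mul(a, b):
    # Use b as a gate on a; the sigmoid (an assumption here) bounds the gate to (0, 1).
    gate = 1.0 / (1.0 + np.exp(-b))
    return a * gate

def fuse_weighted(features, weights, eps=1e-4):
    # Clamp weights to be non-negative, normalize them, then take the weighted sum.
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)
    w = w / (w.sum() + eps)
    return sum(wi * f for wi, f in zip(w, features))
```

concatenation is the only one that changes the channel count, which is why it usually needs a follow-up conv to bring channels back down.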
very common out-of-the-box necks include:
- FPN
- BiFPN
- NAS-FPN
- PANet
all with pros and cons… focused on the balance between speed and accuracy
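as a reference point, here is a minimal sketch of the top-down pathway that FPN is built around: project every level to a common channel count with a 1x1 conv, then walk from coarse to fine, upsampling and adding. this is a simplification under stated assumptions: the helper names are hypothetical, the weights are random instead of learned, each level is assumed to be exactly half the resolution of the previous one, and the 3x3 smoothing conv that FPN applies after each merge is left out.

```python
import numpy as np

def conv1x1(x, weight):
    # x: (C_in, H, W), weight: (C_out, C_in). A 1x1 conv is a per-pixel linear map.
    return np.einsum("oc,chw->ohw", weight, x)

def upsample2x(x):
    # Nearest-neighbour upsampling by a factor of 2 in height and width.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_top_down(feats, lateral_weights):
    """feats: list ordered fine -> coarse. Returns fused maps in the same order."""
    # Lateral connections: bring every level to the same channel count.
    laterals = [conv1x1(f, w) for f, w in zip(feats, lateral_weights)]
    # Start from the coarsest level and merge downwards by addition.
    outputs = [laterals[-1]]
    for lateral in reversed(laterals[:-1]):
        outputs.append(lateral + upsample2x(outputs[-1]))
    return outputs[::-1]
```

BiFPN and PANet add a bottom-up path on top of this idea, and BiFPN swaps the plain addition for a weighted sum.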