ssds.modeling.ssds
ssds.modeling.ssds.ssdsbase
ssds.modeling.ssds.ssd
class ssds.modeling.ssds.SSD(backbone, extras, head, num_classes)
Bases: ssds.modeling.ssds.ssdsbase.SSDSBase
SSD: Single Shot MultiBox Detector. See https://arxiv.org/pdf/1512.02325.pdf for more details.
- Parameters
backbone – backbone layers for input
extras – extra layers that feed to multibox loc and conf layers
head – “multibox head” consists of loc and conf conv layers
num_classes – number of classes
static add_extras(feature_layer, mbox, num_classes)
Define and declare the extras, loc and conf modules for the ssd model.
The feature_layer is defined in cfg.MODEL.FEATURE_LAYER. For the ssd model, its entries can be int, list of int, or str:
- int
An int in the feature_layer represents an output feature in the backbone.
- str
A str in the feature_layer represents extra layers appended at the end of the backbone.
- Parameters
feature_layer – the feature layers with detection head, defined by cfg.MODEL.FEATURE_LAYER
mbox – the number of boxes for each feature map
num_classes – the number of classes, defined by cfg.MODEL.NUM_CLASSES
forward(x)
Applies network layers and ops on input image(s) x.
- Parameters
x – input image or batch of images.
- Returns
When self.training==True, loc and conf for each anchor box;
When self.training==False, loc and conf.sigmoid() for each anchor box.
For each layer, conf has shape [batch, num_anchor*num_classes, height, width];
for each layer, loc has shape [batch, num_anchor*4, height, width].
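The documented output shapes can be checked with a little arithmetic. The sketch below is not part of the library: head_output_shapes is a hypothetical helper, and the batch size, anchor count, class count and feature-map sizes are illustrative values only. It also assumes the same num_anchor for every layer, which a real configuration need not do.

```python
def head_output_shapes(batch, num_anchor, num_classes, feature_sizes):
    """Return one (loc_shape, conf_shape) pair per feature layer.

    feature_sizes is a list of (height, width) tuples, one per layer.
    Hypothetical helper: it only reproduces the shapes stated in the docs.
    """
    shapes = []
    for height, width in feature_sizes:
        loc_shape = (batch, num_anchor * 4, height, width)
        conf_shape = (batch, num_anchor * num_classes, height, width)
        shapes.append((loc_shape, conf_shape))
    return shapes


# Illustrative values, not taken from any shipped configuration.
shapes = head_output_shapes(
    batch=2, num_anchor=6, num_classes=21, feature_sizes=[(38, 38), (19, 19)]
)
for loc_shape, conf_shape in shapes:
    print("loc", loc_shape, "conf", conf_shape)
```

Each layer contributes height * width * num_anchor anchor boxes, so the channel dimension grows with the anchor count while the spatial dimensions follow the feature map.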
ssds.modeling.ssds.yolo
class ssds.modeling.ssds.YOLOV3(backbone, extras, head, num_classes)
Bases: ssds.modeling.ssds.ssdsbase.SSDSBase
YOLOv3: An Incremental Improvement. See https://arxiv.org/abs/1804.02767v1 for more details.
- Parameters
backbone – backbone layers for input
extras – contains transforms and extra layers that feed to multibox loc and conf layers
head – “multibox head” consists of loc and conf conv layers
num_classes – number of classes
static add_extras(feature_layer, mbox, num_classes)
Define and declare the extras, loc and conf modules for the yolo v3 model.
The feature_layer is defined in cfg.MODEL.FEATURE_LAYER. For the yolo v3 model, its entries can be int, list of int, or str:
- int
An int in the feature_layer represents an output feature in the backbone.
- list of int
A list of int in the feature_layer represents an output feature in the backbone that is fused with an upsampled feature: the first int is the backbone output and the second int is the upsampling branch whose feature is fused.
- str
A str in the feature_layer represents extra layers appended at the end of the backbone.
- Parameters
feature_layer – the feature layers with detection head, defined by cfg.MODEL.FEATURE_LAYER
mbox – the number of boxes for each feature map
num_classes – the number of classes, defined by cfg.MODEL.NUM_CLASSES
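To make the three entry kinds concrete, here is a minimal sketch of how a feature_layer list could be sorted into them. The function describe_feature_layer, the tuple labels and the sample values are all hypothetical and are not the library's parsing code. It also assumes that a list-of-int entry has exactly two elements, as the description above suggests.

```python
def describe_feature_layer(feature_layer):
    """Label each entry as a backbone output, a fused pair, or an extra layer.

    Hypothetical helper for illustration; the real add_extras builds modules.
    """
    described = []
    for entry in feature_layer:
        if isinstance(entry, int):
            # int: an output feature in the backbone
            described.append(("backbone", entry))
        elif (
            isinstance(entry, (list, tuple))
            and len(entry) == 2
            and all(isinstance(item, int) for item in entry)
        ):
            # list of int: backbone output first, upsampling branch second
            backbone_output, upsample_branch = entry
            described.append(("fused", backbone_output, upsample_branch))
        elif isinstance(entry, str):
            # str: extra layers appended at the end of the backbone
            described.append(("extra", entry))
        else:
            raise TypeError(f"unsupported feature_layer entry: {entry!r}")
    return described


# Sample values are made up; real ones come from cfg.MODEL.FEATURE_LAYER.
print(describe_feature_layer([4, [8, 4], "S"]))
```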
forward(x)
Applies network layers and ops on input image(s) x.
- Parameters
x – input image or batch of images.
- Returns
When self.training==True, loc and conf for each anchor box;
When self.training==False, loc and conf.sigmoid() for each anchor box.
For each layer, conf has shape [batch, num_anchor*num_classes, height, width];
for each layer, loc has shape [batch, num_anchor*4, height, width].
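The difference between the two modes is only whether a sigmoid is applied to conf. The sketch below imitates that behaviour on flat Python lists rather than tensors. The names sigmoid and select_outputs are hypothetical, and the code illustrates the documented return values, not the library's forward method.

```python
import math


def sigmoid(value):
    """Numerically stable logistic function for a single float."""
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    exp_value = math.exp(value)
    return exp_value / (1.0 + exp_value)


def select_outputs(loc, conf, training):
    """Return raw conf when training, sigmoid(conf) otherwise.

    loc and conf are flat lists of floats standing in for tensors.
    """
    if training:
        return loc, conf
    return loc, [sigmoid(score) for score in conf]


loc = [0.1, -0.2, 0.3, 0.4]
conf = [0.0, 2.0, -2.0]
train_loc, train_conf = select_outputs(loc, conf, training=True)
eval_loc, eval_conf = select_outputs(loc, conf, training=False)
print(train_conf, eval_conf)
```

In evaluation mode every confidence value lands in the open interval (0, 1), while loc is returned unchanged in both modes.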
class ssds.modeling.ssds.YOLOV4(backbone, extras, head, num_classes)
Bases: ssds.modeling.ssds.ssdsbase.SSDSBase
YOLO V4 Architecture. See https://arxiv.org/abs/2004.10934v1 for more details.
- Parameters
backbone – backbone layers for input
extras – contains transforms, extra and fpn layers that feed to multibox loc and conf layers
head – “multibox head” consists of loc and conf conv layers
num_classes – number of classes
static add_extras(feature_layer, mbox, num_classes)
Define and declare the extras, loc and conf modules for the yolo v4 model.
The feature_layer is defined in cfg.MODEL.FEATURE_LAYER. For the yolo v4 model, its entries can be int, list of int, or str:
- int
An int in the feature_layer represents an output feature in the backbone.
- list of int
A list of int in the feature_layer represents an output feature in the backbone that is fused with an upsampled feature: the first int is the backbone output and the second int is the upsampling branch whose feature is fused.
- str
A str in the feature_layer represents extra layers appended at the end of the backbone.
- Parameters
feature_layer – the feature layers with detection head, defined by cfg.MODEL.FEATURE_LAYER
mbox – the number of boxes for each feature map
num_classes – the number of classes, defined by cfg.MODEL.NUM_CLASSES
forward(x)
Applies network layers and ops on input image(s) x.
- Parameters
x – input image or batch of images.
- Returns
When self.training==True, loc and conf for each anchor box;
When self.training==False, loc and conf.sigmoid() for each anchor box.
For each layer, conf has shape [batch, num_anchor*num_classes, height, width];
for each layer, loc has shape [batch, num_anchor*4, height, width].
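Downstream code usually wants one 4-vector per anchor box rather than a [num_anchor*4, height, width] map. The sketch below shows one way to unpack a single image's loc map using nested lists. It assumes the channel layout is anchor-major (channel = anchor * 4 + coordinate); the documentation above does not state the layout, so treat that as an assumption. The helper flatten_loc is hypothetical.

```python
def flatten_loc(loc, num_anchor):
    """Turn a [num_anchor*4][height][width] nested list into a list of boxes.

    Boxes are ordered by (row, col, anchor). Assumes anchor-major channels,
    i.e. channel index = anchor * 4 + coordinate.
    """
    if len(loc) != num_anchor * 4:
        raise ValueError("channel count must equal num_anchor * 4")
    height, width = len(loc[0]), len(loc[0][0])
    boxes = []
    for row in range(height):
        for col in range(width):
            for anchor in range(num_anchor):
                base = anchor * 4
                boxes.append([loc[base + k][row][col] for k in range(4)])
    return boxes


# Toy input: 2 anchors, 1 x 2 feature map; value encodes channel and column.
toy_loc = [[[channel * 10 + col for col in range(2)]] for channel in range(8)]
boxes = flatten_loc(toy_loc, num_anchor=2)
print(len(boxes), boxes[0], boxes[1])
```

The result has height * width * num_anchor boxes, which matches the per-layer anchor count implied by the documented shapes.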
ssds.modeling.ssds.fpn
class ssds.modeling.ssds.SSDFPN(backbone, extras, head, num_classes)
Bases: ssds.modeling.ssds.ssdsbase.SSDSBase
RetinaNet, from Focal Loss for Dense Object Detection. See https://arxiv.org/abs/1708.02002v2 for more details.
Compared with the original implementation, the conv2d layers in the extras and head are changed to ConvBNReLU to help the model converge more easily. BN and ReLU are not added to the transforms because they are followed by interpolation and an element-wise sum.
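The interpolation and element-wise sum mentioned above are the usual top-down fusion step of a feature pyramid. The sketch below illustrates it on single-channel 2D lists. It assumes nearest-neighbour interpolation with a scale factor of 2, which the text does not specify, and the helper names upsample_nearest and fuse_top_down are hypothetical.

```python
def upsample_nearest(feature, scale=2):
    """Nearest-neighbour upsampling of a 2D list by an integer scale."""
    upsampled = []
    for row in feature:
        wide_row = [value for value in row for _ in range(scale)]
        upsampled.extend(list(wide_row) for _ in range(scale))
    return upsampled


def fuse_top_down(lateral, coarser):
    """Upsample the coarser map and add it element-wise to the lateral map."""
    upsampled = upsample_nearest(coarser)
    if len(upsampled) != len(lateral) or len(upsampled[0]) != len(lateral[0]):
        raise ValueError("lateral map must be twice the size of the coarser map")
    return [
        [a + b for a, b in zip(lateral_row, up_row)]
        for lateral_row, up_row in zip(lateral, upsampled)
    ]


lateral = [[10] * 4 for _ in range(4)]   # transformed backbone feature, 4 x 4
coarser = [[1, 2], [3, 4]]               # feature from the level above, 2 x 2
fused = fuse_top_down(lateral, coarser)
for row in fused:
    print(row)
```

Because the sum mixes two feature maps directly, applying a normalisation and activation to each input before the addition would change what is being summed, which is consistent with the note that BN and ReLU are left out of the transforms.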
- Parameters
backbone – backbone layers for input
extras – contains transforms and extra layers that feed to multibox loc and conf layers
head – “multibox head” consists of loc and conf conv layers
num_classes – number of classes
static add_extras(feature_layer, mbox, num_classes)
Define and declare the extras, loc and conf modules for the ssdfpn model.
The feature_layer is defined in cfg.MODEL.FEATURE_LAYER. For the ssdfpn model, its entries can be int, list of int, or str:
- int
An int in the feature_layer represents an output feature in the backbone.
- list of int
A list of int in the feature_layer represents an output feature in the backbone that is fused with an upsampled feature: the first int is the backbone output and the second int is the upsampling branch whose feature is fused.
- str
A str in the feature_layer represents extra layers appended at the end of the backbone.
- Parameters
feature_layer – the feature layers with detection head, defined by cfg.MODEL.FEATURE_LAYER
mbox – the number of boxes for each feature map
num_classes – the number of classes, defined by cfg.MODEL.NUM_CLASSES
forward(x)
Applies network layers and ops on input image(s) x.
- Parameters
x – input image or batch of images.
- Returns
When self.training==True, loc and conf for each anchor box;
When self.training==False, loc and conf.sigmoid() for each anchor box.
For each layer, conf has shape [batch, num_anchor*num_classes, height, width];
for each layer, loc has shape [batch, num_anchor*4, height, width].
ssds.modeling.ssds.bifpn
class ssds.modeling.ssds.SSDBiFPN(backbone, extras, head, num_classes)
Bases: ssds.modeling.ssds.ssdsbase.SSDSBase
EfficientDet: Scalable and Efficient Object Detection. See https://arxiv.org/abs/1911.09070v6 for more details.
Compared with the original implementation, the conv2d layers in the extras and head are changed to ConvBNReLU to help the model converge more easily. BN and ReLU are not added to the transforms because they are followed by interpolation and an element-wise sum.
- Parameters
backbone – backbone layers for input
extras – contains transforms, extra and stack_bifpn layers that feed to multibox loc and conf layers
head – “multibox head” consists of loc and conf conv layers
num_classes – number of classes
static add_extras(feature_layer, mbox, num_classes)
Define and declare the extras, loc and conf modules for the SSDBiFPN model.
The feature_layer is defined in cfg.MODEL.FEATURE_LAYER. For the SSDBiFPN model, its entries can be int, list of int, or str:
- int
An int in the feature_layer represents an output feature in the backbone.
- list of int
A list of int in the feature_layer represents an output feature in the backbone that is fused with an upsampled feature: the first int is the backbone output and the second int is the upsampling branch whose feature is fused.
- str
A str in the feature_layer represents extra layers appended at the end of the backbone.
- Parameters
feature_layer – the feature layers with detection head, defined by cfg.MODEL.FEATURE_LAYER
mbox – the number of boxes for each feature map
num_classes – the number of classes, defined by cfg.MODEL.NUM_CLASSES
forward(x)
Applies network layers and ops on input image(s) x.
- Parameters
x – input image or batch of images.
- Returns
When self.training==True, loc and conf for each anchor box;
When self.training==False, loc and conf.sigmoid() for each anchor box.
For each layer, conf has shape [batch, num_anchor*num_classes, height, width];
for each layer, loc has shape [batch, num_anchor*4, height, width].