ssds.core

ssds.core.checkpoint

ssds.core.checkpoint.find_previous_checkpoint(output_dir)[source]

Return the most recent checkpoint listed in checkpoint_list.txt

checkpoint_list.txt is usually saved at cfg.EXP_DIR

Parameters

output_dir (str) – the folder containing the previous checkpoints and checkpoint_list.txt
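A minimal sketch of how this lookup could work, assuming checkpoint_list.txt simply lists one checkpoint file name per line with the most recent last (the file format and helper name are assumptions for illustration, not documented behavior):

```python
import os

def find_previous_checkpoint_sketch(output_dir):
    # Assumed format: one checkpoint file name per line, oldest first,
    # so the most recent checkpoint is on the last line.
    list_file = os.path.join(output_dir, "checkpoint_list.txt")
    if not os.path.exists(list_file):
        return None
    with open(list_file) as f:
        names = [line.strip() for line in f if line.strip()]
    return os.path.join(output_dir, names[-1]) if names else None
```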

ssds.core.checkpoint.resume_checkpoint(model, resume_checkpoint, resume_scope='')[source]

Resume the checkpoint parameters into the given ssds model based on the resume_scope.

The resume_scope is defined by cfg.TRAIN.RESUME_SCOPE.

When:

  • cfg.TRAIN.RESUME_SCOPE = “”

    All the parameters in the resume_checkpoint are resumed to the model

  • cfg.TRAIN.RESUME_SCOPE = “a,b,c”

    Only the parameters in a, b and c are resumed to the model

Parameters
  • model – the ssds model

  • resume_checkpoint (str) – the file address for the checkpoint which contains the resumed parameters

  • resume_scope – the scope of the resumed parameters, defined at cfg.TRAIN.RESUME_SCOPE
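A minimal sketch of scope-filtered resuming, assuming the .pth file holds a plain state_dict (the helper name and checkpoint layout are assumptions for illustration):

```python
import torch

def resume_by_scope_sketch(model, checkpoint_path, resume_scope=""):
    state = torch.load(checkpoint_path, map_location="cpu")
    if resume_scope:
        # keep only parameters whose name starts with one of the scopes
        scopes = [s.strip() for s in resume_scope.split(",")]
        state = {k: v for k, v in state.items()
                 if any(k.startswith(s) for s in scopes)}
    # strict=False leaves parameters outside the scope at their current values
    model.load_state_dict(state, strict=False)
    return model
```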

ssds.core.checkpoint.save_checkpoints(model, output_dir, checkpoint_prefix, epochs)[source]

Save the model parameters to a .pth file.

Parameters
  • model – the ssds model

  • output_dir (str) – the folder for model saving, usually defined by cfg.EXP_DIR

  • checkpoint_prefix (str) – the prefix for the checkpoint, usually a combination of the ssds model name and the dataset name

  • epochs (int) – the epoch for the current training
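A minimal sketch of the saving step; the "&lt;prefix&gt;_epoch_&lt;N&gt;.pth" naming below is an assumption for illustration, not the library's guaranteed file name:

```python
import os
import torch

def save_checkpoint_sketch(model, output_dir, checkpoint_prefix, epochs):
    os.makedirs(output_dir, exist_ok=True)
    # assumed naming convention: "<prefix>_epoch_<N>.pth"
    filename = os.path.join(output_dir, f"{checkpoint_prefix}_epoch_{epochs}.pth")
    torch.save(model.state_dict(), filename)
    return filename
```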

ssds.core.config

ssds.core.config.cfg_from_file(filename)[source]

Load a config file and merge it into the default options.
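A minimal usage sketch, assuming the function returns the merged config object (the config path below is hypothetical):

```python
from ssds.core.config import cfg_from_file

cfg = cfg_from_file("experiments/cfgs/my_model.yml")  # hypothetical path
print(cfg.EXP_DIR)
```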

ssds.core.criterion

class ssds.core.criterion.MultiBoxLoss(negpos_ratio=3, **kwargs)[source]

Bases: torch.nn.modules.module.Module

The MultiBox Loss is used to calculate the classification loss in object detection tasks.

MultiBox Loss is introduced in [SSD: Single Shot MultiBox Detector](https://arxiv.org/abs/1512.02325v5) and can be described as:

L(x,c,l,g) = (L_{conf}(x, c) + \alpha L_{loc}(x,l,g)) / N

where L_{conf} is the CrossEntropy Loss and L_{loc} is the SmoothL1 Loss, weighted by \alpha which is set to 1 by cross validation.

Compute Targets:

  • Produce confidence target indices

    by matching ground truth boxes with (default) ‘priorboxes’ that have a jaccard index > threshold parameter (default threshold: 0.5).

  • Produce localization target

    by ‘encoding’ variance into offsets of ground truth boxes and their matched ‘priorboxes’.

  • Hard negative mining

    to filter the excessive number of negative examples that come with using a large number of default bounding boxes (default negative:positive ratio 3:1).

To reduce the code and make it easier to embed into the pipeline, only the classification loss is included in this class.

Parameters

negpos_ratio – the ratio of negative to positive samples in the given feature map. Default: 3

forward(pred_logits, target, depth)[source]
Parameters
  • pred_logits – Predicted class logits for each box

  • target – Target class for each box

  • depth – the sign for the positive and negative samples from anchor matching. Basically it can be split into 3 types: positive (>0), background/negative (=0), ignore (<0)

Returns

The classification loss for the given feature map
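A minimal sketch of a classification loss with hard negative mining along the lines described above (illustrative only; tensor shapes and the library's exact masking may differ):

```python
import torch
import torch.nn.functional as F

def multibox_cls_loss_sketch(pred_logits, target, depth, negpos_ratio=3):
    # pred_logits: (N, A, C); target, depth: (N, A); depth encodes the match
    # status per anchor: >0 positive, =0 negative, <0 ignore
    N, A, C = pred_logits.shape
    ce = F.cross_entropy(pred_logits.reshape(-1, C),
                         target.reshape(-1).clamp(min=0),  # guard ignore labels
                         reduction="none").reshape(N, A)
    pos = depth > 0
    # rank negatives by loss; keep at most negpos_ratio * #positives per image
    num_pos = pos.sum(dim=1, keepdim=True)
    num_neg = (negpos_ratio * num_pos).clamp(max=A - 1)
    ce_neg = ce.clone()
    ce_neg[depth != 0] = 0.0  # exclude positives and ignored anchors
    _, idx = ce_neg.sort(dim=1, descending=True)
    _, rank = idx.sort(dim=1)
    neg = rank < num_neg
    return (ce[pos].sum() + ce[neg].sum()) / num_pos.sum().clamp(min=1)
```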

class ssds.core.criterion.FocalLoss(alpha=0.25, gamma=2, **kwargs)[source]

Bases: torch.nn.modules.module.Module

The Focal Loss is used to calculate the classification loss in object detection tasks.

Focal Loss is introduced in [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002) and can be described as:

FL(p_t) = -\alpha (1-p_t)^{\gamma} \ln(p_t)

where p_t is the predicted probability of the true class for each box (so -\ln(p_t) is its cross entropy). \alpha controls the ratio of positive samples and \gamma controls the attention paid to difficult samples.

Parameters
  • alpha (float) – the parameter that controls the weight of positive samples, in (0, 1). Default: 0.25

  • gamma (float) – the parameter that controls the attention paid to difficult samples, in [0, n); values in [0, 5] are shown in the original paper. Default: 2

forward(pred_logits, target, depth)[source]
Parameters
  • pred_logits – Predicted class logits for each box

  • target – Target class for each box

  • depth – Not used in this function

Returns

The classification loss for the given feature map
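A minimal sketch of a sigmoid-based focal loss in the form given above (illustrative; the library's exact variant, e.g. softmax vs. sigmoid and the reduction, may differ):

```python
import torch
import torch.nn.functional as F

def focal_loss_sketch(pred_logits, target_onehot, alpha=0.25, gamma=2.0):
    # target_onehot: float tensor with the same shape as pred_logits
    p = torch.sigmoid(pred_logits)
    ce = F.binary_cross_entropy_with_logits(pred_logits, target_onehot,
                                            reduction="none")  # -ln(p_t)
    p_t = p * target_onehot + (1 - p) * (1 - target_onehot)
    alpha_t = alpha * target_onehot + (1 - alpha) * (1 - target_onehot)
    return (alpha_t * (1 - p_t) ** gamma * ce).sum()
```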

class ssds.core.criterion.SmoothL1Loss(beta=0.11)[source]

Bases: torch.nn.modules.module.Module

The SmoothL1 Loss is used to calculate the localization loss in object detection tasks.

This criterion uses a squared term if the absolute element-wise error falls below \beta and an L1 term otherwise. It is less sensitive to outliers than the MSELoss and in some cases prevents exploding gradients (e.g. see the Fast R-CNN paper by Ross Girshick). Also known as the Huber loss:

\text{loss}(x_i, y_i) = \begin{cases} 0.5 (x_i - y_i)^2, & \text{if } |x_i - y_i| < \beta \\ |x_i - y_i| - 0.5, & \text{otherwise} \end{cases}

x and y are tensors of arbitrary shape with a total of n elements each; the sum operation still operates over all the elements and divides by n.

\beta is used as the threshold that smooths the loss.

Parameters

beta (float) – the parameter that controls the threshold and smooths the loss, in (0, 1). Default: 0.11

forward(pred, target)[source]
Parameters
  • pred – Predicted box for each box

  • target – Target box for each box

Returns

The localization loss for the given feature map
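A minimal sketch that mirrors the piecewise definition above (the mean reduction is an assumption):

```python
import torch

def smooth_l1_sketch(pred, target, beta=0.11):
    diff = (pred - target).abs()
    # squared term below beta, L1 term otherwise, averaged over n elements
    loss = torch.where(diff < beta, 0.5 * diff ** 2, diff - 0.5)
    return loss.mean()
```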

class ssds.core.criterion.IOULoss(loss_type='iou')[source]

Bases: torch.nn.modules.module.Module

The IOU Loss is used to calculate the localization loss in object detection tasks.

IoU Loss is introduced in [IoU Loss for 2D/3D Object Detection](https://arxiv.org/abs/1908.03851v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box.

This class actually implements multiple IoU-related losses and uses loss_type to choose the specific loss function.

Parameters

loss_type (str) – the parameter that chooses the specific loss type, e.g. "iou", "giou", "diou" or "ciou". Default: "iou"

forward(pred, target)[source]
Parameters
  • pred – Predicted box for each box, in (x, y, w, h) format

  • target – Target box for each box, in (x, y, w, h) format

Returns

The localization loss for the given feature map

delta2ltrb(deltas)[source]

Convert deltas in [x, y, w, h] format, with shape [batch, anchor, 4, h, w], to (left, top, right, bottom) corner format.
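A minimal sketch of the plain IoU loss on (x, y, w, h) center-format boxes with the box coordinate in the last dimension (illustrative; the library first converts deltas to corner format via delta2ltrb):

```python
import torch

def iou_loss_sketch(pred, target, eps=1e-7):
    # convert center/size boxes to corner coordinates
    p1, p2 = pred[..., :2] - pred[..., 2:] / 2, pred[..., :2] + pred[..., 2:] / 2
    t1, t2 = target[..., :2] - target[..., 2:] / 2, target[..., :2] + target[..., 2:] / 2
    wh = (torch.min(p2, t2) - torch.max(p1, t1)).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    union = pred[..., 2] * pred[..., 3] + target[..., 2] * target[..., 3] - inter
    iou = inter / (union + eps)
    return (1 - iou).sum()
```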

ssds.core.criterion.GIOULoss()[source]

The GIOU Loss is used to calculate the localization loss in object detection tasks.

Generalized IoU Loss is introduced in [IoU Loss for 2D/3D Object Detection](https://arxiv.org/abs/1908.03851v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}
GIoU(A, B) = IoU(A, B) - \frac{C - U}{C}

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box. C is defined as the smallest convex shape enclosing both A and B, and U represents the union area |A| + |B| - A \cap B.

In the implementation, it calls IOULoss with loss_type="giou".
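A minimal sketch of the GIoU variant, extending the IoU computation above with the enclosing-box term (illustrative):

```python
import torch

def giou_loss_sketch(pred, target, eps=1e-7):
    p1, p2 = pred[..., :2] - pred[..., 2:] / 2, pred[..., :2] + pred[..., 2:] / 2
    t1, t2 = target[..., :2] - target[..., 2:] / 2, target[..., :2] + target[..., 2:] / 2
    wh = (torch.min(p2, t2) - torch.max(p1, t1)).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    union = pred[..., 2] * pred[..., 3] + target[..., 2] * target[..., 3] - inter
    iou = inter / (union + eps)
    # C: area of the smallest axis-aligned box enclosing both boxes
    cwh = torch.max(p2, t2) - torch.min(p1, t1)
    c_area = cwh[..., 0] * cwh[..., 1]
    giou = iou - (c_area - union) / (c_area + eps)
    return (1 - giou).sum()
```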

ssds.core.criterion.DIOULoss()[source]

The DIOU Loss is used to calculate the localization loss in object detection tasks.

Distance IoU Loss is introduced in [Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression](https://arxiv.org/abs/1911.08287v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}
DIoU(A, B) = IoU(A, B) - \frac{diag_{inter}}{diag_{outer}}

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box. diag_{inter} is defined as the center distance between A and B, and diag_{outer} is the diagonal length of the smallest enclosing box covering the two boxes.

In the implementation, it calls IOULoss with loss_type="diou".
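A minimal sketch of the DIoU penalty term (illustrative; it uses the squared center distance over the squared enclosing-box diagonal, as in common implementations of the paper's penalty):

```python
import torch

def diou_penalty_sketch(pred, target, eps=1e-7):
    center_dist2 = ((pred[..., :2] - target[..., :2]) ** 2).sum(-1)
    p1, p2 = pred[..., :2] - pred[..., 2:] / 2, pred[..., :2] + pred[..., 2:] / 2
    t1, t2 = target[..., :2] - target[..., 2:] / 2, target[..., :2] + target[..., 2:] / 2
    c1, c2 = torch.min(p1, t1), torch.max(p2, t2)
    diag2 = ((c2 - c1) ** 2).sum(-1)  # squared diagonal of the enclosing box
    return center_dist2 / (diag2 + eps)  # subtract from IoU to get DIoU
```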

ssds.core.criterion.CIOULoss()[source]

The CIOU Loss is used to calculate the localization loss in object detection tasks.

Complete IoU Loss is introduced in [Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression](https://arxiv.org/abs/1911.08287v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}
DIoU(A, B) = IoU(A, B) - \frac{diag_{inter}}{diag_{outer}}
CIoU(A, B) = DIoU(A, B) - \alpha v

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box. \alpha = \frac{v}{(1-IoU(A,B))+v} and v = \frac{4}{\pi^2} \left( \arctan \frac{w^A}{h^A} - \arctan \frac{w^B}{h^B} \right)^2 is used to impose the consistency of aspect ratio.

In the CIoU loss, the \alpha part is not used for backpropagation.

In the implementation, it calls IOULoss with loss_type="ciou".
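A minimal sketch of the extra CIoU terms v and \alpha from the formulas above; per the note, \alpha is computed without gradient (illustrative, on (x, y, w, h) boxes):

```python
import math
import torch

def ciou_terms_sketch(pred, target, iou, eps=1e-7):
    # v measures aspect-ratio inconsistency between the two boxes
    v = (4 / math.pi ** 2) * (
        torch.atan(target[..., 2] / (target[..., 3] + eps))
        - torch.atan(pred[..., 2] / (pred[..., 3] + eps))
    ) ** 2
    with torch.no_grad():  # the alpha part is not used for backpropagation
        alpha = v / ((1 - iou) + v + eps)
    return alpha * v  # subtract from DIoU to get CIoU
```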

ssds.core.data_parallel

class ssds.core.data_parallel.BalancedDataParallel(gpu0_bsz, *args, **kwargs)[source]

This class replaces the original PyTorch DataParallel and balances the memory usage on the first GPU.

The original script is from: https://github.com/kimiyoung/transformer-xl/blob/master/pytorch/utils/data_parallel.py
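A usage sketch following the transformer-xl utility it is based on, where gpu0_bsz is the (smaller) share of each batch placed on the first GPU (the placeholder module below is arbitrary):

```python
import torch.nn as nn
from ssds.core.data_parallel import BalancedDataParallel

net = nn.Conv2d(3, 16, 3)  # placeholder module
# put 2 samples of each batch on GPU 0, split the rest over the other GPUs
net = BalancedDataParallel(2, net, dim=0).cuda()
```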

ssds.core.evaluation_metrics

class ssds.core.evaluation_metrics.MeanAveragePrecision(num_classes, conf_threshold, iou_threshold)[source]

Bases: object

__call__(detections, targets)[source]

Call self as a function.

get_results()[source]
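A usage sketch under assumed semantics; the formats of detections and targets are not documented above, and eval_batches is a hypothetical iterable:

```python
from ssds.core.evaluation_metrics import MeanAveragePrecision

metric = MeanAveragePrecision(num_classes=21, conf_threshold=0.01,
                              iou_threshold=0.5)
for detections, targets in eval_batches:  # hypothetical data source
    metric(detections, targets)           # accumulate per batch
results = metric.get_results()
```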

ssds.core.optimizer

ssds.core.optimizer.configure_lr_scheduler(optimizer, cfg)[source]

Return the learning rate scheduler for the trainable parameters

Basically, it returns the learning rate scheduler defined by cfg.TRAIN.LR_SCHEDULER.SCHEDULER. Some parameters for the learning rate scheduler are also defined in cfg.TRAIN.LR_SCHEDULER.

Currently, 4 popular learning rate schedulers are supported: step, multi_step, exponential and sgdr.

TODO: directly fetch the scheduler by getattr(lr_scheduler, cfg.SCHEDULER) and pass the relevant parameters as a dict.

Parameters
  • optimizer – the optimizer in the given ssds model, check configure_optimizer() for more details.

  • cfg – the config dict, which is defined in cfg.TRAIN.LR_SCHEDULER.
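A minimal dispatch sketch in the spirit of the description above (the cfg field names STEPS, GAMMA and MAX_EPOCHS are assumptions, not the library's documented keys):

```python
import torch.optim.lr_scheduler as lr_scheduler

def configure_lr_scheduler_sketch(optimizer, cfg):
    if cfg.SCHEDULER == "step":
        return lr_scheduler.StepLR(optimizer, step_size=cfg.STEPS[0], gamma=cfg.GAMMA)
    if cfg.SCHEDULER == "multi_step":
        return lr_scheduler.MultiStepLR(optimizer, milestones=cfg.STEPS, gamma=cfg.GAMMA)
    if cfg.SCHEDULER == "exponential":
        return lr_scheduler.ExponentialLR(optimizer, gamma=cfg.GAMMA)
    if cfg.SCHEDULER == "sgdr":  # warm restarts
        return lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=cfg.MAX_EPOCHS)
    raise ValueError(f"unknown scheduler: {cfg.SCHEDULER}")
```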

ssds.core.optimizer.configure_optimizer(trainable_param, cfg)[source]

Return the optimizer for the trainable parameters

Basically, it returns the optimizer defined by cfg.TRAIN.OPTIMIZER.OPTIMIZER. The learning rate for the optimizer is defined by cfg.TRAIN.OPTIMIZER.LEARNING_RATE and cfg.TRAIN.OPTIMIZER.DIFFERENTIAL_LEARNING_RATE. Some other parameters are also defined in cfg.TRAIN.OPTIMIZER.

Currently, 4 popular optimizers are supported: sgd, rmsprop, adam and amsgrad.

TODO: directly fetch the optimizer by getattr(optim, cfg.OPTIMIZER) and pass the relevant parameters as a dict.

Parameters
  • trainable_param – the trainable parameter in the given ssds model, check trainable_param() for more details.

  • cfg – the config dict, which is defined in cfg.TRAIN.OPTIMIZER.
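A minimal dispatch sketch for the four optimizers named above (the MOMENTUM and WEIGHT_DECAY field names are assumptions):

```python
import torch.optim as optim

def configure_optimizer_sketch(trainable_param, cfg):
    lr = cfg.LEARNING_RATE
    if cfg.OPTIMIZER == "sgd":
        return optim.SGD(trainable_param, lr=lr,
                         momentum=cfg.MOMENTUM, weight_decay=cfg.WEIGHT_DECAY)
    if cfg.OPTIMIZER == "rmsprop":
        return optim.RMSprop(trainable_param, lr=lr)
    if cfg.OPTIMIZER == "adam":
        return optim.Adam(trainable_param, lr=lr)
    if cfg.OPTIMIZER == "amsgrad":
        return optim.Adam(trainable_param, lr=lr, amsgrad=True)
    raise ValueError(f"unknown optimizer: {cfg.OPTIMIZER}")
```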

ssds.core.optimizer.trainable_param(model, trainable_scope)[source]

Return the trainable parameters for the optimizer, selected by cfg.TRAIN.TRAINABLE_SCOPE

If a module is in the trainable scope, its parameters are trained.

When:

  • cfg.TRAIN.TRAINABLE_SCOPE = “”

    All the parameters in the model are trained

  • cfg.TRAIN.TRAINABLE_SCOPE = “a,b,c.d”

    Only the parameters in a, b and c.d are trained

  • cfg.TRAIN.TRAINABLE_SCOPE = “a;b,c.d”

    Only the parameters in a, b and c.d are trained; module a and modules b & c.d can be assigned different learning rates (differential learning rate), as in the sketch below

Parameters
  • model – the ssds model for training

  • trainable_scope (str) – the scope of the trainable parameters in the given ssds model, defined in cfg.TRAIN.TRAINABLE_SCOPE
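A minimal sketch of the scope parsing described above (illustrative: ";" separates differential-learning-rate groups, "," separates modules, "." walks into submodules):

```python
def trainable_param_sketch(model, trainable_scope=""):
    if not trainable_scope:
        return [{"params": model.parameters()}]  # train everything
    groups = []
    for group in trainable_scope.split(";"):
        params = []
        for scope in group.split(","):
            module = model
            for name in scope.split("."):   # e.g. "c.d" -> model.c.d
                module = getattr(module, name)
            params.extend(module.parameters())
        groups.append({"params": params})   # one group per learning rate
    return groups
```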

ssds.core.visualize_funcs

ssds.core.visualize_funcs.add_anchorStrategy(writer, targets, num_thresholds=100)[source]
ssds.core.visualize_funcs.add_defaultAnchors(writer, image, anchors, epoch=0)[source]
ssds.core.visualize_funcs.add_imagesWithBoxes(writer, tag, images, boxes, class_names=[], epoch=0)[source]
ssds.core.visualize_funcs.add_imagesWithMatchedBoxes(writer, tag, images, boxes, targets, class_names=[], epoch=0)[source]
ssds.core.visualize_funcs.add_matchedAnchor(writer)[source]
ssds.core.visualize_funcs.add_matchedAnchorsWithBox(writer, image, anchor, stride, depth, epoch=0)[source]
ssds.core.visualize_funcs.add_prCurve(writer, precision, recall, class_names=[], epoch=0)[source]