ssds.core

ssds.core.checkpoint

ssds.core.checkpoint.find_previous_checkpoint(output_dir)[source]

Return the most recent checkpoint listed in checkpoint_list.txt

checkpoint_list.txt is usually saved at cfg.EXP_DIR

Parameters

output_dir (str) – the folder containing the previous checkpoints and checkpoint_list.txt
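A minimal sketch of how this lookup could work, assuming checkpoint_list.txt simply lists one checkpoint file name per line with the most recent last (the file format and helper name are assumptions for illustration, not documented behavior):

```python
import os

def find_previous_checkpoint_sketch(output_dir):
    # Assumed format: one checkpoint file name per line, oldest first,
    # so the most recent checkpoint is on the last line.
    list_file = os.path.join(output_dir, "checkpoint_list.txt")
    if not os.path.exists(list_file):
        return None
    with open(list_file) as f:
        names = [line.strip() for line in f if line.strip()]
    return os.path.join(output_dir, names[-1]) if names else None
```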

ssds.core.checkpoint.resume_checkpoint(model, resume_checkpoint, resume_scope='')[source]

Resume the checkpoint parameters into the given ssds model based on the resume_scope.

The resume_scope is defined by cfg.TRAIN.RESUME_SCOPE.

When:

  • cfg.TRAIN.RESUME_SCOPE = “”

    All the parameters in the resume_checkpoint are resumed to the model

  • cfg.TRAIN.RESUME_SCOPE = “a,b,c”

    Only the parameters in a, b and c are resumed to the model

Parameters
  • model – the ssds model

  • resume_checkpoint (str) – the file address for the checkpoint which contains the resumed parameters

  • resume_scope – the scope of the resumed parameters, defined at cfg.TRAIN.RESUME_SCOPE
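A minimal sketch of scope-filtered resuming, assuming the .pth file holds a plain state_dict (the helper name and checkpoint layout are assumptions for illustration):

```python
import torch

def resume_by_scope_sketch(model, checkpoint_path, resume_scope=""):
    state = torch.load(checkpoint_path, map_location="cpu")
    if resume_scope:
        # keep only parameters whose name starts with one of the scopes
        scopes = [s.strip() for s in resume_scope.split(",")]
        state = {k: v for k, v in state.items()
                 if any(k.startswith(s) for s in scopes)}
    # strict=False leaves parameters outside the scope at their current values
    model.load_state_dict(state, strict=False)
    return model
```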

ssds.core.checkpoint.save_checkpoints(model, output_dir, checkpoint_prefix, epochs)[source]

Save the model parameters to a .pth file.

Parameters
  • model – the ssds model

  • output_dir (str) – the folder for model saving, usually defined by cfg.EXP_DIR

  • checkpoint_prefix (str) – the prefix for the checkpoint, usually a combination of the ssds model name and the dataset name

  • epochs (int) – the epoch for the current training
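A minimal sketch of the saving step; the "&lt;prefix&gt;_epoch_&lt;N&gt;.pth" naming below is an assumption for illustration, not the library's guaranteed file name:

```python
import os
import torch

def save_checkpoint_sketch(model, output_dir, checkpoint_prefix, epochs):
    os.makedirs(output_dir, exist_ok=True)
    # assumed naming convention: "<prefix>_epoch_<N>.pth"
    filename = os.path.join(output_dir, f"{checkpoint_prefix}_epoch_{epochs}.pth")
    torch.save(model.state_dict(), filename)
    return filename
```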

ssds.core.config

ssds.core.config.cfg_from_file(filename)[source]

Load a config file and merge it into the default options.
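A minimal usage sketch, assuming the function returns the merged config object (the config path below is hypothetical):

```python
from ssds.core.config import cfg_from_file

cfg = cfg_from_file("experiments/cfgs/my_model.yml")  # hypothetical path
print(cfg.EXP_DIR)
```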

ssds.core.criterion

class ssds.core.criterion.MultiBoxLoss(negpos_ratio=3, **kwargs)[source]

Bases: torch.nn.modules.module.Module

The MultiBox Loss is used to calculate the classification loss in object detection tasks.

MultiBox Loss is introduced in [SSD: Single Shot MultiBox Detector](https://arxiv.org/abs/1512.02325v5) and can be described as:

L(x,c,l,g) = (L_{conf}(x, c) + \alpha L_{loc}(x,l,g)) / N

where L_{conf} is the CrossEntropy Loss and L_{loc} is the SmoothL1 Loss, weighted by \alpha which is set to 1 by cross validation.

Compute Targets:

  • Produce confidence target indices

    by matching ground truth boxes with (default) ‘priorboxes’ that have a jaccard index > threshold parameter (default threshold: 0.5).

  • Produce localization target

    by ‘encoding’ variance into offsets of ground truth boxes and their matched ‘priorboxes’.

  • Hard negative mining

    to filter the excessive number of negative examples that come with using a large number of default bounding boxes (default negative:positive ratio 3:1).

To reduce the code and make it easier to embed into the pipeline, only the classification loss is included in this class.

Parameters

negpos_ratio – the ratio of negative to positive samples in the given feature map. Default: 3

forward(pred_logits, target, depth)[source]
Parameters
  • pred_logits – Predicted class logits for each box

  • target – Target class for each box

  • depth – the sign for the positive and negative samples from anchor matching. Basically it can be split into 3 types: positive (>0), background/negative (=0), ignore (<0)

Returns

The classification loss for the given feature map
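A minimal sketch of a classification loss with hard negative mining along the lines described above (illustrative only; tensor shapes and the library's exact masking may differ):

```python
import torch
import torch.nn.functional as F

def multibox_cls_loss_sketch(pred_logits, target, depth, negpos_ratio=3):
    # pred_logits: (N, A, C); target, depth: (N, A); depth encodes the match
    # status per anchor: >0 positive, =0 negative, <0 ignore
    N, A, C = pred_logits.shape
    ce = F.cross_entropy(pred_logits.reshape(-1, C),
                         target.reshape(-1).clamp(min=0),  # guard ignore labels
                         reduction="none").reshape(N, A)
    pos = depth > 0
    # rank negatives by loss; keep at most negpos_ratio * #positives per image
    num_pos = pos.sum(dim=1, keepdim=True)
    num_neg = (negpos_ratio * num_pos).clamp(max=A - 1)
    ce_neg = ce.clone()
    ce_neg[depth != 0] = 0.0  # exclude positives and ignored anchors
    _, idx = ce_neg.sort(dim=1, descending=True)
    _, rank = idx.sort(dim=1)
    neg = rank < num_neg
    return (ce[pos].sum() + ce[neg].sum()) / num_pos.sum().clamp(min=1)
```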

class ssds.core.criterion.FocalLoss(alpha=0.25, gamma=2, **kwargs)[source]

Bases: torch.nn.modules.module.Module

The Focal Loss is used to calculate the classification loss in object detection tasks.

Focal Loss is introduced in [Focal Loss for Dense Object Detection](https://arxiv.org/abs/1708.02002) and can be described as:

FL(p_t) = -\alpha (1-p_t)^{\gamma} \ln(p_t)

where p_t is the predicted probability of the true class for each box (so -\ln(p_t) is its cross entropy). \alpha controls the ratio of positive samples and \gamma controls the attention paid to difficult samples.

Parameters
  • alpha (float) – the parameter that controls the weight of positive samples, in (0, 1). Default: 0.25

  • gamma (float) – the parameter that controls the attention paid to difficult samples, in [0, n); values in [0, 5] are shown in the original paper. Default: 2

forward(pred_logits, target, depth)[source]
Parameters
  • pred_logits – Predicted class logits for each box

  • target – Target class for each box

  • depth – Not used in this function

Returns

The classification loss for the given feature map
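A minimal sketch of a sigmoid-based focal loss in the form given above (illustrative; the library's exact variant, e.g. softmax vs. sigmoid and the reduction, may differ):

```python
import torch
import torch.nn.functional as F

def focal_loss_sketch(pred_logits, target_onehot, alpha=0.25, gamma=2.0):
    # target_onehot: float tensor with the same shape as pred_logits
    p = torch.sigmoid(pred_logits)
    ce = F.binary_cross_entropy_with_logits(pred_logits, target_onehot,
                                            reduction="none")  # -ln(p_t)
    p_t = p * target_onehot + (1 - p) * (1 - target_onehot)
    alpha_t = alpha * target_onehot + (1 - alpha) * (1 - target_onehot)
    return (alpha_t * (1 - p_t) ** gamma * ce).sum()
```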

class ssds.core.criterion.SmoothL1Loss(beta=0.11)[source]

Bases: torch.nn.modules.module.Module

The SmoothL1 Loss is used to calculate the localization loss in object detection tasks.

This criterion uses a squared term if the absolute element-wise error falls below \beta and an L1 term otherwise. It is less sensitive to outliers than the MSELoss and in some cases prevents exploding gradients (e.g. see the Fast R-CNN paper by Ross Girshick). Also known as the Huber loss:

\text{loss}(x_i, y_i) = \begin{cases} 0.5 (x_i - y_i)^2, & \text{if } |x_i - y_i| < \beta \\ |x_i - y_i| - 0.5, & \text{otherwise} \end{cases}

x and y are tensors of arbitrary shape with a total of n elements each; the sum operation still operates over all the elements and divides by n.

\beta is used as the threshold that smooths the loss.

Parameters

beta (float) – the parameter that controls the threshold and smooths the loss, in (0, 1). Default: 0.11

forward(pred, target)[source]
Parameters
  • pred – Predicted box for each box

  • target – Target box for each box

Returns

The localization loss for the given feature map
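A minimal sketch that mirrors the piecewise definition above (the mean reduction is an assumption):

```python
import torch

def smooth_l1_sketch(pred, target, beta=0.11):
    diff = (pred - target).abs()
    # squared term below beta, L1 term otherwise, averaged over n elements
    loss = torch.where(diff < beta, 0.5 * diff ** 2, diff - 0.5)
    return loss.mean()
```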

class ssds.core.criterion.IOULoss(loss_type='iou')[source]

Bases: torch.nn.modules.module.Module

The IOU Loss is used to calculate the localization loss in object detection tasks.

IoU Loss is introduced in [IoU Loss for 2D/3D Object Detection](https://arxiv.org/abs/1908.03851v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box.

This class actually implements multiple IoU-related losses and uses loss_type to choose the specific loss function.

Parameters

loss_type (str) – the parameter that chooses the specific loss type, e.g. "iou", "giou", "diou" or "ciou". Default: "iou"

forward(pred, target)[source]
Parameters
  • pred – Predicted box for each box, in (x, y, w, h) format

  • target – Target box for each box, in (x, y, w, h) format

Returns

The localization loss for the given feature map

delta2ltrb(deltas)[source]

Convert deltas in [x, y, w, h] format, with shape [batch, anchor, 4, h, w], to (left, top, right, bottom) corner format.
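A minimal sketch of the plain IoU loss on (x, y, w, h) center-format boxes with the box coordinate in the last dimension (illustrative; the library first converts deltas to corner format via delta2ltrb):

```python
import torch

def iou_loss_sketch(pred, target, eps=1e-7):
    # convert center/size boxes to corner coordinates
    p1, p2 = pred[..., :2] - pred[..., 2:] / 2, pred[..., :2] + pred[..., 2:] / 2
    t1, t2 = target[..., :2] - target[..., 2:] / 2, target[..., :2] + target[..., 2:] / 2
    wh = (torch.min(p2, t2) - torch.max(p1, t1)).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    union = pred[..., 2] * pred[..., 3] + target[..., 2] * target[..., 3] - inter
    iou = inter / (union + eps)
    return (1 - iou).sum()
```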

ssds.core.criterion.GIOULoss()[source]

The GIOU Loss is used to calculate the localization loss in object detection tasks.

Generalized IoU Loss is introduced in [IoU Loss for 2D/3D Object Detection](https://arxiv.org/abs/1908.03851v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}
GIoU(A, B) = IoU(A, B) - \frac{C - U}{C}

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box. C is defined as the smallest convex shape enclosing both A and B, and U represents the union area |A| + |B| - A \cap B.

In the implementation, it calls IOULoss with loss_type="giou".
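A minimal sketch of the GIoU variant, extending the IoU computation above with the enclosing-box term (illustrative):

```python
import torch

def giou_loss_sketch(pred, target, eps=1e-7):
    p1, p2 = pred[..., :2] - pred[..., 2:] / 2, pred[..., :2] + pred[..., 2:] / 2
    t1, t2 = target[..., :2] - target[..., 2:] / 2, target[..., :2] + target[..., 2:] / 2
    wh = (torch.min(p2, t2) - torch.max(p1, t1)).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    union = pred[..., 2] * pred[..., 3] + target[..., 2] * target[..., 3] - inter
    iou = inter / (union + eps)
    # C: area of the smallest axis-aligned box enclosing both boxes
    cwh = torch.max(p2, t2) - torch.min(p1, t1)
    c_area = cwh[..., 0] * cwh[..., 1]
    giou = iou - (c_area - union) / (c_area + eps)
    return (1 - giou).sum()
```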

ssds.core.criterion.DIOULoss()[source]

The DIOU Loss is used to calculate the localization loss in object detection tasks.

Distance IoU Loss is introduced in [Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression](https://arxiv.org/abs/1911.08287v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}
DIoU(A, B) = IoU(A, B) - \frac{diag_{inter}}{diag_{outer}}

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box. diag_{inter} is defined as the center distance between A and B, and diag_{outer} is the diagonal length of the smallest enclosing box covering the two boxes.

In the implementation, it calls IOULoss with loss_type="diou".
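A minimal sketch of the DIoU penalty term (illustrative; it uses the squared center distance over the squared enclosing-box diagonal, as in common implementations of the paper's penalty):

```python
import torch

def diou_penalty_sketch(pred, target, eps=1e-7):
    center_dist2 = ((pred[..., :2] - target[..., :2]) ** 2).sum(-1)
    p1, p2 = pred[..., :2] - pred[..., 2:] / 2, pred[..., :2] + pred[..., 2:] / 2
    t1, t2 = target[..., :2] - target[..., 2:] / 2, target[..., :2] + target[..., 2:] / 2
    c1, c2 = torch.min(p1, t1), torch.max(p2, t2)
    diag2 = ((c2 - c1) ** 2).sum(-1)  # squared diagonal of the enclosing box
    return center_dist2 / (diag2 + eps)  # subtract from IoU to get DIoU
```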

ssds.core.criterion.CIOULoss()[source]

The CIOU Loss is used to calculate the localization loss in object detection tasks.

Complete IoU Loss is introduced in [Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression](https://arxiv.org/abs/1911.08287v1) and can be described as:

IoU(A, B) = \frac{A \cap B}{A \cup B} = \frac{A \cap B}{|A| + |B| - A \cap B}
DIoU(A, B) = IoU(A, B) - \frac{diag_{inter}}{diag_{outer}}
CIoU(A, B) = DIoU(A, B) - \alpha v

where A and B represent the two convex shapes; here, they are the predicted box and the ground truth box. \alpha = \frac{v}{(1-IoU(A,B))+v} and v = \frac{4}{\pi^2} \left( \arctan \frac{w^A}{h^A} - \arctan \frac{w^B}{h^B} \right)^2 is used to impose the consistency of aspect ratio.

In the CIoU loss, the \alpha part is not used for backpropagation.

In the implementation, it calls IOULoss with loss_type="ciou".
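A minimal sketch of the extra CIoU terms v and \alpha from the formulas above; per the note, \alpha is computed without gradient (illustrative, on (x, y, w, h) boxes):

```python
import math
import torch

def ciou_terms_sketch(pred, target, iou, eps=1e-7):
    # v measures aspect-ratio inconsistency between the two boxes
    v = (4 / math.pi ** 2) * (
        torch.atan(target[..., 2] / (target[..., 3] + eps))
        - torch.atan(pred[..., 2] / (pred[..., 3] + eps))
    ) ** 2
    with torch.no_grad():  # the alpha part is not used for backpropagation
        alpha = v / ((1 - iou) + v + eps)
    return alpha * v  # subtract from DIoU to get CIoU
```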

ssds.core.data_parallel

class ssds.core.data_parallel.BalancedDataParallel(gpu0_bsz, *args, **kwargs)[source]

This class replaces the original PyTorch DataParallel and balances the memory usage on the first GPU.

The original script is from: https://github.com/kimiyoung/transformer-xl/blob/master/pytorch/utils/data_parallel.py
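A usage sketch following the transformer-xl utility it is based on, where gpu0_bsz is the (smaller) share of each batch placed on the first GPU (the placeholder module below is arbitrary):

```python
import torch.nn as nn
from ssds.core.data_parallel import BalancedDataParallel

net = nn.Conv2d(3, 16, 3)  # placeholder module
# put 2 samples of each batch on GPU 0, split the rest over the other GPUs
net = BalancedDataParallel(2, net, dim=0).cuda()
```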

ssds.core.evaluation_metrics

class ssds.core.evaluation_metrics.MeanAveragePrecision(num_classes, conf_threshold, iou_threshold)[source]

Bases: object

__call__(detections, targets)[source]

Call self as a function.

get_results()[source]
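A usage sketch under assumed semantics; the formats of detections and targets are not documented above, and eval_batches is a hypothetical iterable:

```python
from ssds.core.evaluation_metrics import MeanAveragePrecision

metric = MeanAveragePrecision(num_classes=21, conf_threshold=0.01,
                              iou_threshold=0.5)
for detections, targets in eval_batches:  # hypothetical data source
    metric(detections, targets)           # accumulate per batch
results = metric.get_results()
```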

ssds.core.optimizer

ssds.core.optimizer.configure_lr_scheduler(optimizer, cfg)[source]

Return the learning rate scheduler for the trainable parameters

Basically, it returns the learning rate scheduler defined by cfg.TRAIN.LR_SCHEDULER.SCHEDULER. Some parameters for the learning rate scheduler are also defined in cfg.TRAIN.LR_SCHEDULER.

Currently, 4 popular learning rate schedulers are supported: step, multi_step, exponential and sgdr.

TODO: directly fetch the scheduler by getattr(lr_scheduler, cfg.SCHEDULER) and pass the relevant parameters as a dict.

Parameters
  • optimizer – the optimizer in the given ssds model, check configure_optimizer() for more details.

  • cfg – the config dict, which is defined in cfg.TRAIN.LR_SCHEDULER.
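A minimal dispatch sketch in the spirit of the description above (the cfg field names STEPS, GAMMA and MAX_EPOCHS are assumptions, not the library's documented keys):

```python
import torch.optim.lr_scheduler as lr_scheduler

def configure_lr_scheduler_sketch(optimizer, cfg):
    if cfg.SCHEDULER == "step":
        return lr_scheduler.StepLR(optimizer, step_size=cfg.STEPS[0], gamma=cfg.GAMMA)
    if cfg.SCHEDULER == "multi_step":
        return lr_scheduler.MultiStepLR(optimizer, milestones=cfg.STEPS, gamma=cfg.GAMMA)
    if cfg.SCHEDULER == "exponential":
        return lr_scheduler.ExponentialLR(optimizer, gamma=cfg.GAMMA)
    if cfg.SCHEDULER == "sgdr":  # warm restarts
        return lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=cfg.MAX_EPOCHS)
    raise ValueError(f"unknown scheduler: {cfg.SCHEDULER}")
```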

ssds.core.optimizer.configure_optimizer(trainable_param, cfg)[source]

Return the optimizer for the trainable parameters

Basically, it returns the optimizer defined by cfg.TRAIN.OPTIMIZER.OPTIMIZER. The learning rate for the optimizer is defined by cfg.TRAIN.OPTIMIZER.LEARNING_RATE and cfg.TRAIN.OPTIMIZER.DIFFERENTIAL_LEARNING_RATE. Some other parameters are also defined in cfg.TRAIN.OPTIMIZER.

Currently, 4 popular optimizers are supported: sgd, rmsprop, adam and amsgrad.

TODO: directly fetch the optimizer by getattr(optim, cfg.OPTIMIZER) and pass the relevant parameters as a dict.

Parameters
  • trainable_param – the trainable parameter in the given ssds model, check trainable_param() for more details.

  • cfg – the config dict, which is defined in cfg.TRAIN.OPTIMIZER.
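A minimal dispatch sketch for the four optimizers named above (the MOMENTUM and WEIGHT_DECAY field names are assumptions):

```python
import torch.optim as optim

def configure_optimizer_sketch(trainable_param, cfg):
    lr = cfg.LEARNING_RATE
    if cfg.OPTIMIZER == "sgd":
        return optim.SGD(trainable_param, lr=lr,
                         momentum=cfg.MOMENTUM, weight_decay=cfg.WEIGHT_DECAY)
    if cfg.OPTIMIZER == "rmsprop":
        return optim.RMSprop(trainable_param, lr=lr)
    if cfg.OPTIMIZER == "adam":
        return optim.Adam(trainable_param, lr=lr)
    if cfg.OPTIMIZER == "amsgrad":
        return optim.Adam(trainable_param, lr=lr, amsgrad=True)
    raise ValueError(f"unknown optimizer: {cfg.OPTIMIZER}")
```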

ssds.core.optimizer.trainable_param(model, trainable_scope)[source]

Return the trainable parameters for the optimizer, selected by cfg.TRAIN.TRAINABLE_SCOPE

If a module is in the trainable scope, its parameters are trained.

When:

  • cfg.TRAIN.TRAINABLE_SCOPE = “”

    All the parameters in the model are trained

  • cfg.TRAIN.TRAINABLE_SCOPE = “a,b,c.d”

    Only the parameters in a, b and c.d are trained

  • cfg.TRAIN.TRAINABLE_SCOPE = “a;b,c.d”

    Only the parameters in a, b and c.d are trained; module a and modules b & c.d can be assigned different learning rates (differential learning rate), as in the sketch below

Parameters
  • model – the ssds model for training

  • trainable_scope (str) – the scope of the trainable parameters in the given ssds model, defined in cfg.TRAIN.TRAINABLE_SCOPE
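A minimal sketch of the scope parsing described above (illustrative: ";" separates differential-learning-rate groups, "," separates modules, "." walks into submodules):

```python
def trainable_param_sketch(model, trainable_scope=""):
    if not trainable_scope:
        return [{"params": model.parameters()}]  # train everything
    groups = []
    for group in trainable_scope.split(";"):
        params = []
        for scope in group.split(","):
            module = model
            for name in scope.split("."):   # e.g. "c.d" -> model.c.d
                module = getattr(module, name)
            params.extend(module.parameters())
        groups.append({"params": params})   # one group per learning rate
    return groups
```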

ssds.core.visualize_funcs

ssds.core.visualize_funcs.add_anchorStrategy(writer, targets, num_thresholds=100)[source]
ssds.core.visualize_funcs.add_defaultAnchors(writer, image, anchors, epoch=0)[source]
ssds.core.visualize_funcs.add_imagesWithBoxes(writer, tag, images, boxes, class_names=[], epoch=0)[source]
ssds.core.visualize_funcs.add_imagesWithMatchedBoxes(writer, tag, images, boxes, targets, class_names=[], epoch=0)[source]
ssds.core.visualize_funcs.add_matchedAnchor(writer)[source]
ssds.core.visualize_funcs.add_matchedAnchorsWithBox(writer, image, anchor, stride, depth, epoch=0)[source]
ssds.core.visualize_funcs.add_prCurve(writer, precision, recall, class_names=[], epoch=0)[source]