No preamble: notes first.
- StrongSORT (an enhanced version of DeepSORT): a light hands-on walkthrough with code notes
  Reference: https://blog.csdn.net/weixin_50862344/article/details/127070391
- https://github.com/phil-bergmann/tracking_wo_bnw
- https://www.zhihu.com/question/511584675
- https://blog.51cto.com/u_15316394/3215952
- https://github.com/KaiyangZhou/deep-person-reid
- https://www.jianshu.com/p/dc6a866600c2
- https://blog.csdn.net/sc1434404661/article/details/126973759
- https://drive.google.com/drive/folders/1xhG0kRH1EX5B9_Iz8gQJb7UNnn_riXi6
Issues encountered:
https://github.com/ultralytics/yolov5/issues/1855
DeepSORT: a few good articles:
https://blog.csdn.net/didiaopao/article/details/120272947?spm=1001.2014.3001.5502
https://blog.csdn.net/didiaopao/article/details/120274519?spm=1001.2014.3001.5502
https://blog.csdn.net/didiaopao/article/details/120276922?spm=1001.2014.3001.5502
1. How to use it
2. Code walkthrough
Blog guide: https://blog.csdn.net/weixin_40620310/article/details/124501917
2.1 Two models:
- the detector model, based on YOLOv5
- the tracker model, which maintains and updates track state
2.2 Stepping through demo.py
Since this is video-stream tracking, the stream is necessarily consumed in a `while True` loop, running inference frame by frame.
First, create a detector instance. Don't be fooled by the name: although it is called a detector, it is far more complex than YOLOv5's plain detector; otherwise it couldn't maintain and update the tracking state matrices.
```python
det = Detector()
```
Let's look at the base class of `Detector` first:
```python
class baseDet(object):

    def __init__(self):
        self.img_size = 640
        self.threshold = 0.3
        self.stride = 1

    def build_config(self):
        self.frameCounter = 0

    def feedCap(self, im, func_status):
        # feed in one frame of the image stream
        retDict = {
            'frame': None,
            'list_of_ids': None,
            'obj_bboxes': []
        }
        self.frameCounter += 1
        # This is where the tracker hooks in: objtracker.update() takes the
        # current frame `im`, runs the ReID model to extract appearance
        # features, then performs the matching.
        im, obj_bboxes = objtracker.update(self, im)
        retDict['frame'] = im
        retDict['obj_bboxes'] = obj_bboxes
        return retDict

    def init_model(self):
        # (NotImplementedError would be the idiomatic choice here; the
        # original code raises EOFError for these abstract methods)
        raise EOFError("Undefined model type.")

    def preprocess(self):
        raise EOFError("Undefined model type.")

    def detect(self):
        # detect() is the actual YOLOv5 detection method
        raise EOFError("Undefined model type.")
```
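To put `feedCap` in context: the demo's frame loop just keeps calling it. Below is a minimal sketch of that loop, with a hypothetical stub standing in for the real `Detector` (the real demo reads frames via `cv2.VideoCapture`; the stub and `run` helper are illustrative names, not from the repo):

```python
import numpy as np

class StubDetector:
    """Hypothetical stand-in with the same feedCap contract as baseDet."""
    def __init__(self):
        self.frameCounter = 0

    def feedCap(self, im, func_status=None):
        self.frameCounter += 1
        # the real baseDet calls objtracker.update(self, im) here
        return {'frame': im, 'list_of_ids': None, 'obj_bboxes': []}

def run(frames, det):
    # stands in for: while True: ret, frame = cap.read(); ...
    all_boxes = []
    for frame in frames:
        ret = det.feedCap(frame)
        all_boxes.append(ret['obj_bboxes'])
    return all_boxes

frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(3)]
det = StubDetector()
boxes = run(frames, det)
```

The per-frame return dict is the only interface the demo needs: the annotated frame for display plus the tracked boxes for downstream logic.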
Here is the concrete detection class:

```python
# A wrapper around the YOLOv5 detector that makes it more convenient to use
class Detector(baseDet):

    def __init__(self):
        super(Detector, self).__init__()
        self.init_model()
        self.build_config()

    # load the model
    def init_model(self):
        self.weights = DETECTOR_PATH
        self.device = '0' if torch.cuda.is_available() else 'cpu'
        self.device = select_device(self.device)
        model = attempt_load(self.weights, map_location=self.device)
        model.to(self.device).eval()
        model.half()  # FP16; note this assumes a CUDA device
        self.m = model
        self.names = model.module.names if hasattr(
            model, 'module') else model.names

    # preprocess an incoming video frame
    def preprocess(self, img):
        img0 = img.copy()
        img = letterbox(img, new_shape=self.img_size)[0]
        img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR -> RGB, HWC -> CHW
        img = np.ascontiguousarray(img)
        img = torch.from_numpy(img).to(self.device)
        img = img.half()   # half precision
        img /= 255.0       # normalize to [0, 1]
        if img.ndimension() == 3:
            img = img.unsqueeze(0)  # add batch dimension
        return img0, img  # img0 is the original image, img the processed one
```
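The preprocessing is mostly shape plumbing. Here is a numpy-only sketch of the same transforms, with `letterbox` and the torch/FP16 parts replaced by plain numpy so the shapes are easy to check (the 480x640 frame size is just an example):

```python
import numpy as np

# a dummy 480x640 BGR frame, as OpenCV would deliver it (HWC, uint8)
img = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

# (the real code first letterboxes to self.img_size; skipped here)
img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR -> RGB, HWC -> CHW
img = np.ascontiguousarray(img)           # the reversed view is not contiguous
img = img.astype(np.float32) / 255.0      # stands in for .half() then /= 255.0
img = img[None]                           # batch dim, like unsqueeze(0)
```

After this the array is (1, 3, H, W) in [0, 1], which is exactly what the model expects.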
And the `detect` method itself:

```python
    def detect(self, im):
        """
        detect is the conventional detection method. The more interesting part
        is the base class's feedCap, whose core call is
        objtracker.update(self, im): the `self` there is this YOLOv5 detector,
        and inside update() this detect method gets called. A gripe: this
        design breaks encapsulation and couples the two classes together.
        """
        im0, img = self.preprocess(im)
        pred = self.m(img, augment=False)[0]  # run the detector on the image
        pred = pred.float()
        pred = non_max_suppression(pred, self.threshold, 0.4)  # NMS
        pred_boxes = []
        for det in pred:
            if det is not None and len(det):
                det[:, :4] = scale_coords(
                    img.shape[2:], det[:, :4], im0.shape).round()
                for *x, conf, cls_id in det:
                    lbl = self.names[int(cls_id)]
                    if lbl not in OBJ_LIST:
                        # skip classes we don't care about. TODO: it might be
                        # cheaper to restrict classes at detection time rather
                        # than filtering afterwards; whether that actually
                        # saves resources depends on the source code.
                        continue
                    x1, y1 = int(x[0]), int(x[1])
                    x2, y2 = int(x[2]), int(x[3])
                    pred_boxes.append(
                        (x1, y1, x2, y2, lbl, conf))
        return im, pred_boxes  # return the original image and the detected boxes
```
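The `scale_coords` call above maps boxes predicted on the letterboxed input back onto the original frame. A hand-rolled sketch of that mapping, under the assumption that it approximates YOLOv5's helper (the real helper also clips coordinates to the image bounds, which is omitted here):

```python
import numpy as np

def scale_coords_sketch(img_shape, coords, im0_shape):
    # img_shape / im0_shape are (h, w) of the letterboxed and original images
    gain = min(img_shape[0] / im0_shape[0], img_shape[1] / im0_shape[1])
    pad_x = (img_shape[1] - im0_shape[1] * gain) / 2
    pad_y = (img_shape[0] - im0_shape[0] * gain) / 2
    coords = np.asarray(coords, dtype=np.float64)
    coords[:, [0, 2]] -= pad_x   # undo horizontal padding
    coords[:, [1, 3]] -= pad_y   # undo vertical padding
    coords /= gain               # undo resize
    return coords.round()

# a 480x640 frame letterboxed into 640x640 gets 80px of padding top and bottom
boxes = scale_coords_sketch((640, 640), [[0, 80, 640, 560]], (480, 640))
```

Subtracting the padding first and then dividing by the gain is the exact inverse of what `letterbox` did in `preprocess`.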
Over on the tracker side (the objtracker module), a DeepSort instance is built from the config file, and update() ties the detector and tracker together:

```python
cfg = get_config()
cfg.merge_from_file("deep_sort/configs/deep_sort.yaml")
deepsort = DeepSort(cfg.DEEPSORT.REID_CKPT,
                    max_dist=cfg.DEEPSORT.MAX_DIST, min_confidence=cfg.DEEPSORT.MIN_CONFIDENCE,
                    nms_max_overlap=cfg.DEEPSORT.NMS_MAX_OVERLAP, max_iou_distance=cfg.DEEPSORT.MAX_IOU_DISTANCE,
                    max_age=cfg.DEEPSORT.MAX_AGE, n_init=cfg.DEEPSORT.N_INIT, nn_budget=cfg.DEEPSORT.NN_BUDGET,
                    use_cuda=True)


def update(target_detector, image):
    # run the detector from before to get detection boxes
    _, bboxes = target_detector.detect(image)
    bbox_xywh = []
    confs = []
    bboxes2draw = []
    if len(bboxes):
        # adapt detections to the deep sort input format
        for x1, y1, x2, y2, _, conf in bboxes:
            obj = [
                int((x1 + x2) / 2), int((y1 + y2) / 2),
                x2 - x1, y2 - y1
            ]
            bbox_xywh.append(obj)
            confs.append(conf)
        xywhs = torch.Tensor(bbox_xywh)
        confss = torch.Tensor(confs)
        # pass detections to deepsort; this yields the final boxes and track
        # IDs for this frame
        outputs = deepsort.update(xywhs, confss, image)
        for value in list(outputs):
            x1, y1, x2, y2, track_id = value
            bboxes2draw.append(
                (x1, y1, x2, y2, '', track_id)
            )
    # draw the boxes and track IDs onto the image
    image = plot_bboxes(image, bboxes2draw)
    return image, bboxes2draw
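The box-format adaptation in update() is worth calling out: YOLOv5 emits corner-format boxes (x1, y1, x2, y2), while DeepSort's update() expects center-x, center-y, width, height. A minimal version of that conversion (the function name is mine, not from the repo):

```python
def xyxy_to_cxcywh(x1, y1, x2, y2):
    # corner format -> (center_x, center_y, width, height), as in update() above
    return int((x1 + x2) / 2), int((y1 + y2) / 2), x2 - x1, y2 - y1

box = xyxy_to_cxcywh(100, 200, 300, 400)
```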
This project uses a YOLOv5 version from before April 2022, which still includes the Focus module. For a breakdown of Focus (and fuseforward) see:
https://blog.csdn.net/qq_38253797/article/details/119684388
Design idea: periodically sample pixels from the high-resolution image and re-stack them into a lower-resolution one, i.e. stack each group of four adjacent positions, folding spatial (w, h) information into the channel dimension. This enlarges each point's receptive field while losing little of the original information. The module was never meant to improve accuracy; it was meant to cut computation and speed things up.
Later versions dropped it, which suggests it wasn't much use after all.
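The Focus slicing described above can be reproduced in a few lines. Here is a numpy sketch of the four-phase pixel stacking that YOLOv5's Focus module performs (the torch version concatenates the same four slices before a conv):

```python
import numpy as np

def focus_slice(x):
    # x: (N, C, H, W) with even H and W. Take every-other pixel in four
    # phase patterns and stack them along the channel axis: spatial
    # resolution halves while channels quadruple, with no information lost.
    return np.concatenate(
        [x[..., ::2, ::2], x[..., 1::2, ::2],
         x[..., ::2, 1::2], x[..., 1::2, 1::2]],
        axis=1)

x = np.arange(48, dtype=np.float32).reshape(1, 3, 4, 4)
y = focus_slice(x)
```

Every input value survives into the output, which is exactly the "reduce resolution without discarding information" property the paragraph above describes.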
3. Miscellaneous
If you need the code, DM me your email address, or I'll find some other way to share it.