No preamble: notes first.
- StrongSORT (an enhanced version of DeepSORT): a light hands-on walkthrough with code notes
  Reference: https://blog.csdn.net/weixin_50862344/article/details/127070391
- https://github.com/phil-bergmann/tracking_wo_bnw
- https://www.zhihu.com/question/511584675
- https://blog.51cto.com/u_15316394/3215952
- https://github.com/KaiyangZhou/deep-person-reid
- https://www.jianshu.com/p/dc6a866600c2
- https://blog.csdn.net/sc1434404661/article/details/126973759
- https://drive.google.com/drive/folders/1xhG0kRH1EX5B9_Iz8gQJb7UNnn_riXi6
Issues encountered:
https://github.com/ultralytics/yolov5/issues/1855
DeepSORT: a few good articles:
https://blog.csdn.net/didiaopao/article/details/120272947?spm=1001.2014.3001.5502
https://blog.csdn.net/didiaopao/article/details/120274519?spm=1001.2014.3001.5502
https://blog.csdn.net/didiaopao/article/details/120276922?spm=1001.2014.3001.5502
1. How to use it
2. Code walkthrough
Blog guide: https://blog.csdn.net/weixin_40620310/article/details/124501917
2.1 Two models:
- the detector model, based on YOLOv5
- the tracker model, which maintains and updates track state
2.2 Stepping through demo.py
Since this is video-stream tracking, the stream is necessarily consumed in a `while True` loop, running inference frame by frame.
First, create a detector instance. Don't be fooled by the name: although it is called a detector, it is far more complex than YOLOv5's plain detector; otherwise it couldn't maintain and update the tracking state matrices.
```python
det = Detector()
```
Let's look at the base class of `Detector` first:
```python
class baseDet(object):

    def __init__(self):
        self.img_size = 640
        self.threshold = 0.3
        self.stride = 1

    def build_config(self):
        self.frameCounter = 0

    def feedCap(self, im, func_status):
        # feed in one frame of the image stream
        retDict = {
            'frame': None,
            'list_of_ids': None,
            'obj_bboxes': []
        }
        self.frameCounter += 1
        # This is where the tracker hooks in: objtracker.update() takes the
        # current frame `im`, runs the ReID model to extract appearance
        # features, then performs the matching.
        im, obj_bboxes = objtracker.update(self, im)
        retDict['frame'] = im
        retDict['obj_bboxes'] = obj_bboxes
        return retDict

    def init_model(self):
        # (NotImplementedError would be the idiomatic choice here; the
        # original code raises EOFError for these abstract methods)
        raise EOFError("Undefined model type.")

    def preprocess(self):
        raise EOFError("Undefined model type.")

    def detect(self):
        # detect() is the actual YOLOv5 detection method
        raise EOFError("Undefined model type.")
```
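To put `feedCap` in context: the demo's frame loop just keeps calling it. Below is a minimal sketch of that loop, with a hypothetical stub standing in for the real `Detector` (the real demo reads frames via `cv2.VideoCapture`; the stub and `run` helper are illustrative names, not from the repo):

```python
import numpy as np

class StubDetector:
    """Hypothetical stand-in with the same feedCap contract as baseDet."""
    def __init__(self):
        self.frameCounter = 0

    def feedCap(self, im, func_status=None):
        self.frameCounter += 1
        # the real baseDet calls objtracker.update(self, im) here
        return {'frame': im, 'list_of_ids': None, 'obj_bboxes': []}

def run(frames, det):
    # stands in for: while True: ret, frame = cap.read(); ...
    all_boxes = []
    for frame in frames:
        ret = det.feedCap(frame)
        all_boxes.append(ret['obj_bboxes'])
    return all_boxes

frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(3)]
det = StubDetector()
boxes = run(frames, det)
```

The per-frame return dict is the only interface the demo needs: the annotated frame for display plus the tracked boxes for downstream logic.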
Here is the concrete detection class:

```python
# A wrapper around the YOLOv5 detector that makes it more convenient to use
class Detector(baseDet):

    def __init__(self):
        super(Detector, self).__init__()
        self.init_model()
        self.build_config()

    # load the model
    def init_model(self):
        self.weights = DETECTOR_PATH
        self.device = '0' if torch.cuda.is_available() else 'cpu'
        self.device = select_device(self.device)
        model = attempt_load(self.weights, map_location=self.device)
        model.to(self.device).eval()
        model.half()  # FP16; note this assumes a CUDA device
        self.m = model
        self.names = model.module.names if hasattr(
            model, 'module') else model.names

    # preprocess an incoming video frame
    def preprocess(self, img):
        img0 = img.copy()
        img = letterbox(img, new_shape=self.img_size)[0]
        img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR -> RGB, HWC -> CHW
        img = np.ascontiguousarray(img)
        img = torch.from_numpy(img).to(self.device)
        img = img.half()   # half precision
        img /= 255.0       # normalize to [0, 1]
        if img.ndimension() == 3:
            img = img.unsqueeze(0)  # add batch dimension
        return img0, img  # img0 is the original image, img the processed one
```
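The preprocessing is mostly shape plumbing. Here is a numpy-only sketch of the same transforms, with `letterbox` and the torch/FP16 parts replaced by plain numpy so the shapes are easy to check (the 480x640 frame size is just an example):

```python
import numpy as np

# a dummy 480x640 BGR frame, as OpenCV would deliver it (HWC, uint8)
img = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

# (the real code first letterboxes to self.img_size; skipped here)
img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR -> RGB, HWC -> CHW
img = np.ascontiguousarray(img)           # the reversed view is not contiguous
img = img.astype(np.float32) / 255.0      # stands in for .half() then /= 255.0
img = img[None]                           # batch dim, like unsqueeze(0)
```

After this the array is (1, 3, H, W) in [0, 1], which is exactly what the model expects.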
And the `detect` method itself:

```python
    def detect(self, im):
        """
        detect is the conventional detection method. The more interesting part
        is the base class's feedCap, whose core call is
        objtracker.update(self, im): the `self` there is this YOLOv5 detector,
        and inside update() this detect method gets called. A gripe: this
        design breaks encapsulation and couples the two classes together.
        """
        im0, img = self.preprocess(im)
        pred = self.m(img, augment=False)[0]  # run the detector on the image
        pred = pred.float()
        pred = non_max_suppression(pred, self.threshold, 0.4)  # NMS
        pred_boxes = []
        for det in pred:
            if det is not None and len(det):
                det[:, :4] = scale_coords(
                    img.shape[2:], det[:, :4], im0.shape).round()
                for *x, conf, cls_id in det:
                    lbl = self.names[int(cls_id)]
                    if lbl not in OBJ_LIST:
                        # skip classes we don't care about. TODO: it might be
                        # cheaper to restrict classes at detection time rather
                        # than filtering afterwards; whether that actually
                        # saves resources depends on the source code.
                        continue
                    x1, y1 = int(x[0]), int(x[1])
                    x2, y2 = int(x[2]), int(x[3])
                    pred_boxes.append(
                        (x1, y1, x2, y2, lbl, conf))
        return im, pred_boxes  # return the original image and the detected boxes
```
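The `scale_coords` call above maps boxes predicted on the letterboxed input back onto the original frame. A hand-rolled sketch of that mapping, under the assumption that it approximates YOLOv5's helper (the real helper also clips coordinates to the image bounds, which is omitted here):

```python
import numpy as np

def scale_coords_sketch(img_shape, coords, im0_shape):
    # img_shape / im0_shape are (h, w) of the letterboxed and original images
    gain = min(img_shape[0] / im0_shape[0], img_shape[1] / im0_shape[1])
    pad_x = (img_shape[1] - im0_shape[1] * gain) / 2
    pad_y = (img_shape[0] - im0_shape[0] * gain) / 2
    coords = np.asarray(coords, dtype=np.float64)
    coords[:, [0, 2]] -= pad_x   # undo horizontal padding
    coords[:, [1, 3]] -= pad_y   # undo vertical padding
    coords /= gain               # undo resize
    return coords.round()

# a 480x640 frame letterboxed into 640x640 gets 80px of padding top and bottom
boxes = scale_coords_sketch((640, 640), [[0, 80, 640, 560]], (480, 640))
```

Subtracting the padding first and then dividing by the gain is the exact inverse of what `letterbox` did in `preprocess`.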
Over on the tracker side (the objtracker module), a DeepSort instance is built from the config file, and update() ties the detector and tracker together:

```python
cfg = get_config()
cfg.merge_from_file("deep_sort/configs/deep_sort.yaml")
deepsort = DeepSort(cfg.DEEPSORT.REID_CKPT,
                    max_dist=cfg.DEEPSORT.MAX_DIST, min_confidence=cfg.DEEPSORT.MIN_CONFIDENCE,
                    nms_max_overlap=cfg.DEEPSORT.NMS_MAX_OVERLAP, max_iou_distance=cfg.DEEPSORT.MAX_IOU_DISTANCE,
                    max_age=cfg.DEEPSORT.MAX_AGE, n_init=cfg.DEEPSORT.N_INIT, nn_budget=cfg.DEEPSORT.NN_BUDGET,
                    use_cuda=True)


def update(target_detector, image):
    # run the detector from before to get detection boxes
    _, bboxes = target_detector.detect(image)
    bbox_xywh = []
    confs = []
    bboxes2draw = []
    if len(bboxes):
        # adapt detections to the deep sort input format
        for x1, y1, x2, y2, _, conf in bboxes:
            obj = [
                int((x1 + x2) / 2), int((y1 + y2) / 2),
                x2 - x1, y2 - y1
            ]
            bbox_xywh.append(obj)
            confs.append(conf)
        xywhs = torch.Tensor(bbox_xywh)
        confss = torch.Tensor(confs)
        # pass detections to deepsort; this yields the final boxes and track
        # IDs for this frame
        outputs = deepsort.update(xywhs, confss, image)
        for value in list(outputs):
            x1, y1, x2, y2, track_id = value
            bboxes2draw.append(
                (x1, y1, x2, y2, '', track_id)
            )
    # draw the boxes and track IDs onto the image
    image = plot_bboxes(image, bboxes2draw)
    return image, bboxes2draw
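The box-format adaptation in update() is worth calling out: YOLOv5 emits corner-format boxes (x1, y1, x2, y2), while DeepSort's update() expects center-x, center-y, width, height. A minimal version of that conversion (the function name is mine, not from the repo):

```python
def xyxy_to_cxcywh(x1, y1, x2, y2):
    # corner format -> (center_x, center_y, width, height), as in update() above
    return int((x1 + x2) / 2), int((y1 + y2) / 2), x2 - x1, y2 - y1

box = xyxy_to_cxcywh(100, 200, 300, 400)
```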
This project uses a YOLOv5 version from before April 2022, which still includes the Focus module. For a breakdown of Focus (and fuseforward) see:
https://blog.csdn.net/qq_38253797/article/details/119684388
Design idea: periodically sample pixels from the high-resolution image and re-stack them into a lower-resolution one, i.e. stack each group of four adjacent positions, folding spatial (w, h) information into the channel dimension. This enlarges each point's receptive field while losing little of the original information. The module was never meant to improve accuracy; it was meant to cut computation and speed things up.
Later versions dropped it, which suggests it wasn't much use after all.
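The Focus slicing described above can be reproduced in a few lines. Here is a numpy sketch of the four-phase pixel stacking that YOLOv5's Focus module performs (the torch version concatenates the same four slices before a conv):

```python
import numpy as np

def focus_slice(x):
    # x: (N, C, H, W) with even H and W. Take every-other pixel in four
    # phase patterns and stack them along the channel axis: spatial
    # resolution halves while channels quadruple, with no information lost.
    return np.concatenate(
        [x[..., ::2, ::2], x[..., 1::2, ::2],
         x[..., ::2, 1::2], x[..., 1::2, 1::2]],
        axis=1)

x = np.arange(48, dtype=np.float32).reshape(1, 3, 4, 4)
y = focus_slice(x)
```

Every input value survives into the output, which is exactly the "reduce resolution without discarding information" property the paragraph above describes.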
3. Miscellaneous
If you need the code, DM me your email address, or I'll find some other way to share it.