Object Detection with a Depth Camera

2023-05-16

1. Introduction

A blog from abroad; it is decent, and its source code can be downloaded.

CSDN

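Object detection uses YOLOv5, which is already set up. The camera is a RealSense D435i.

The official website provides documentation, sample code, tools, and more. It is the best place to look.

Official GitHub

The SDK was not used.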

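2. Setting up the Python library

After activating the conda environment, install the library with the command below. Building from source is another installation option, but I used the simplest one.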

pip install pyrealsense2

It can then be imported as follows:

import pyrealsense2 as rs

The official documentation does not describe the Python API in much detail, so the main resources are the examples on GitHub and the documentation of the pyrealsense2 library.

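3. Code

The first step is to initialize the camera and start the data streams, which the following code does.

config.enable_stream() is called twice: once to configure the depth stream and once to configure the color stream. Its arguments are the stream type, the resolution, the pixel format, and the frame rate. These values cannot be chosen arbitrarily; open the RealSense Viewer to see which options are available and pick from those. (The maximum resolutions of the depth stream and the color stream also differ.)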

    # Initialization
    pipeline = rs.pipeline()
    config = rs.config()
    # The D435i only supports the resolutions ()()()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 60)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 60)
    # Start streaming
    pipeline.start(config)
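The depth stream above is requested as rs.format.z16, meaning each depth pixel is a 16-bit unsigned integer in device depth units, with 0 meaning "no data". The following is a minimal sketch of converting such a reading to meters. The helper name raw_depth_to_meters is hypothetical, and the default scale of 0.001 (1 mm per unit) is an assumption; on a real device the scale can be read from the depth sensor with get_depth_scale().

```python
def raw_depth_to_meters(raw_value, depth_scale=0.001):
    """Convert one z16 depth reading to meters.

    depth_scale is an assumed value (1 mm per unit); query the device
    for the real one instead of relying on this default.
    """
    if raw_value == 0:  # 0 marks a pixel without a valid depth reading
        return None
    return raw_value * depth_scale


# A raw reading of 1500 units corresponds to 1.5 m at the assumed scale.
distance_m = raw_depth_to_meters(1500)
```

Treating 0 as missing rather than as "zero distance" keeps holes in the depth image from dragging averages toward the camera.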

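A trained model can be loaded directly with the statements below. (Prediction can also be run through detect.py, but that file is rather bloated, so loading the model directly is more convenient. The obvious drawback is that detect.py's settings for the confidence threshold, IoU, and so on are missing.)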

model = torch.hub.load('D:/company/py/yolo', 'custom', path='mymodels/yolov5l_terminal.pt', source='local')
model.eval()
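One way to make up for the missing threshold setting is to filter detections after inference. The sketch below is only an illustration: filter_by_confidence is a hypothetical helper, the sample rows are made up, and it assumes each row has the layout (xmin, ymin, xmax, ymax, confidence, class, name), which is what YOLOv5's results.pandas().xyxy[0].values yields.

```python
def filter_by_confidence(rows, min_conf=0.5):
    """Keep detection rows whose confidence (index 4) is at least min_conf."""
    return [row for row in rows if row[4] >= min_conf]


# Made-up detections: (xmin, ymin, xmax, ymax, confidence, class, name)
detections = [
    (12.0, 30.0, 200.0, 220.0, 0.91, 0, "terminal"),
    (300.0, 40.0, 380.0, 120.0, 0.32, 0, "terminal"),
]
kept = filter_by_confidence(detections, min_conf=0.5)
```

This only discards low-confidence boxes after the fact; it does not change how the model performs non-maximum suppression.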

The complete code is as follows:

import pyrealsense2 as rs
import numpy as np
import cv2
import random
import torch


model = torch.hub.load('D:/company/py/yolo', 'custom', path='mymodels/yolov5l_terminal.pt', source='local')
model.eval()


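def get_mid_pos(frame, box, depth_data, randnum):
    distance_list = []
    # Center pixel of the box, used as the base index into the depth image
    mid_pos = [int(box[0] + box[2]) // 2, int(box[1] + box[3]) // 2]
    # Shorter side of the box; it bounds the depth sampling range
    min_val = int(min(abs(box[2] - box[0]), abs(box[3] - box[1])))
    # Take randnum samples around the center so they can be averaged
    for i in range(randnum):
        # randint needs integer bounds (box coordinates arrive as floats)
        bias = random.randint(-(min_val // 4), min_val // 4)
        dist = depth_data[mid_pos[1] + bias, mid_pos[0] + bias]
        # Mark the sampled point
        cv2.circle(frame, (mid_pos[0] + bias, mid_pos[1] + bias), 4, (255, 0, 0), -1)
        if dist:  # 0 means there is no depth reading at this pixel
            distance_list.append(dist)
    if not distance_list:
        return float('nan')  # no valid depth inside the box
    # Sort, keep the middle half, and average it (a trimmed mean)
    distance_list = np.sort(np.array(distance_list))
    quarter = len(distance_list) // 4
    return np.mean(distance_list[quarter:len(distance_list) - quarter])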


# Draw the boxes on a copy of the original image and label each one with its depth
def dectshow(org_img, boxs, depth_data):
    img = org_img.copy()
    for box in boxs:
        cv2.rectangle(img, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 255, 0), 2)
        dist = get_mid_pos(org_img, box, depth_data, 24)
        # box[-1] is the class name; dividing by 1000 converts millimeters to meters
        cv2.putText(img, box[-1] + str(dist / 1000)[:4] + 'm',
                    (int(box[0]), int(box[1])), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow('dec_img', img)


if __name__ == "__main__":
    # Configure the camera streams; OpenCV expects BGR channel order
    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 60)
    config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 60)
    # Start streaming
    pipeline.start(config)
    # Align depth to color so that box coordinates index the same point in both images
    align = rs.align(rs.stream.color)
    try:
        while True:
            # Grab an aligned depth frame and color frame
            frames = align.process(pipeline.wait_for_frames())
            depth_frame = frames.get_depth_frame()
            color_frame = frames.get_color_frame()
            if not depth_frame or not color_frame:
                continue

            # Convert to numpy arrays
            depth_image = np.asanyarray(depth_frame.get_data())
            color_image = np.asanyarray(color_frame.get_data())

            # Run detection
            results = model(color_image)
            boxs = results.pandas().xyxy[0].values
            dectshow(color_image, boxs, depth_image)

            # Apply a colormap to the depth image (it must first be converted to 8 bits per pixel)
            depth_colormap = cv2.applyColorMap(cv2.convertScaleAbs(depth_image, alpha=0.03), cv2.COLORMAP_JET)

            # Stack the color image and the depth colormap horizontally
            images = np.hstack((color_image, depth_colormap))
            # Display; the refresh interval is 1 ms
            # cv2.namedWindow('RealSense', cv2.WINDOW_AUTOSIZE)
            # cv2.imshow('RealSense', images)
            key = cv2.waitKey(1)

            # Press esc or 'q' to close the image window
            if key & 0xFF == ord('q') or key == 27:
                cv2.destroyAllWindows()
                break
    finally:
        pipeline.stop()
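The depth estimate in get_mid_pos boils down to: sample pixels near the box center, drop readings of 0, sort, and average the middle half. Below is a standalone sketch of that idea that runs without a camera or OpenCV. It is not the listing's exact method: the names trimmed_mean_depth and sample_box_depth are hypothetical, x and y offsets are drawn independently (the listing uses one offset for both), and sample coordinates are clamped to the image bounds.

```python
import random
from statistics import mean


def trimmed_mean_depth(samples):
    """Average the middle half of the valid (non-zero) depth samples."""
    valid = sorted(s for s in samples if s > 0)  # 0 means "no depth data"
    if not valid:
        return None
    cut = len(valid) // 4  # number of samples dropped from each end
    return mean(valid[cut:len(valid) - cut])


def sample_box_depth(depth, box, n_samples=24, rng=None):
    """Estimate the depth of box = (x1, y1, x2, y2, ...) from a 2D depth grid."""
    rng = rng or random.Random()
    height, width = len(depth), len(depth[0])
    x1, y1, x2, y2 = (int(v) for v in box[:4])
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    radius = min(abs(x2 - x1), abs(y2 - y1)) // 4
    samples = []
    for _ in range(n_samples):
        # Independent x/y offsets, clamped so the index stays inside the image
        x = min(max(cx + rng.randint(-radius, radius), 0), width - 1)
        y = min(max(cy + rng.randint(-radius, radius), 0), height - 1)
        samples.append(depth[y][x])
    return trimmed_mean_depth(samples)
```

Dropping a quarter of the samples from each end makes the estimate robust to a few outliers, such as background pixels that fall inside the box or isolated noisy readings.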
