yolov5转tensorrt c++

F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\TensorRT-7.2.3.4\include
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\opencv3.4.0\include\opencv
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\opencv3.4.0\include

库文件：
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\opencv3.4.0\lib\vc15
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\TensorRT-7.2.3.4\lib

compute_30,sm_30

int main_yolov5()

改为：

int main()

yolo_detector.cpp中的int main()

改为：

int main_yolov5()

3.更改tensorrt的版本。

2.报错：

严重性代码说明项目文件行禁止显示状态
错误 LNK2001 无法解析的外部符号 createNvOnnxParser_INTERNAL tiny_tensorrt_onnx F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project-main\tiny_tensorrt_onnx\Trt.obj 1

解决：

tensorrt版本：7.2.3.4

添加lib：

nvonnxparser.lib
nvparsers.lib

https://github.com/ttanzhiqiang/onnx_tensorrt_project

cuda11.1 tensorrt 7.2.3版本，我换成6.0版本，int8放弃，编译成功，转onnx成功，转engine失败了

报错代码：

    if(!parser->parseFromFile(onnxModel.c_str(), static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        spdlog::error("error: could not parse onnx engine");
        return false;
    }

yolov5-4.0,预测出来的值与python的结果稍微不一样：

自己生成weights模型，以及加载报错解决

yolo5各个版本cfg下载：

yolo-tensorrt/yolov5_tutorial.md at master · enazoe/yolo-tensorrt · GitHub

yolov5 tensorrt_jacke121的专栏-CSDN博客_yolov5tensorrt

改之前的：get_weights.py

import torch
import struct
from utils.torch_utils import select_device

# Initialize
device = select_device('cpu')
# Load model
# model = torch.load('yolov5s.pt', map_location=device)['model'].float()  # load to FP32
model = torch.load('_0.9437_5997.pt', map_location=device)['model'].float()  # load to FP32
model.to(device).eval()

f = open('yolov5s.weights', 'w')
f.write('{}\n'.format(len(model.state_dict().keys())))
for k, v in model.state_dict().items():
    vr = v.reshape(-1).cpu().numpy()
    f.write('{} {} '.format(k, len(vr)))
    for vv in vr:
        f.write(' ')
        f.write(struct.pack('>f',float(vv)).hex())
    f.write('\n')

报错解决之后的：get_weights.py

修改代码：f.write('{} '.format(k))

import torch
import struct
from utils.torch_utils import select_device

# Initialize
device = select_device('cpu')
# Load model
# model = torch.load('yolov5s.pt', map_location=device)['model'].float()  # load to FP32
model = torch.load('_0.9437_5997.pt', map_location=device)['model'].float()  # load to FP32
model.to(device).eval()

f = open('yolov5s.weights', 'w')
f.write('{}\n'.format(len(model.state_dict().keys())))
for k, v in model.state_dict().items():
    vr = v.reshape(-1).cpu().numpy()
    #f.write('{} {} '.format(k, len(vr)))
    f.write('{} '.format(k))
    for vv in vr:
        f.write(' ')
        f.write(struct.pack('>f',float(vv)).hex())
    f.write('\n')

c++解析weights，修改之前的：

void Yolo::load_weights_v5(const std::string s_weights_path_,
	std::map<std::string,std::vector<float>> &vec_wts_)
{
	vec_wts_.clear();
	assert(fileExists(s_weights_path_));
	std::cout << "Loading pre-trained weights..." << std::endl;
	std::ifstream file(s_weights_path_, std::ios_base::binary);
	assert(file.good());
	std::string line;
	while (std::getline(file,line))
	{
		if(line.empty())continue;
		std::stringstream iss(line);
		std::string wts_name;
		iss >> wts_name ;
		std::vector<float> weights;
		uint32_t n_str;
		while(iss >> std::hex >> n_str)
		{
			weights.push_back(reinterpret_cast<float&>(n_str));
		}
		vec_wts_[wts_name] = weights;
	}
	std::cout << "Loading complete!" << std::endl;
}

自己导出的weights，导入报错，报错的地方：

std::cout << size << " size " << map_wts_[s_layer_name_ + ".weight"].size() << std::endl;
assert(size == (map_wts_[s_layer_name_ + ".weight"].size()));

结果； 3456 3457

发现python保存时，保存名字+空格+长度+空格+数据。

c++读取，读取名字后，直接读取数据，导出实际长度比size大1。

修改后代码：

void Yolo::load_weights_v5(const std::string s_weights_path_,
	std::map<std::string,std::vector<float>> &vec_wts_)
{
	vec_wts_.clear();
	assert(fileExists(s_weights_path_));
	std::cout << "Loading pre-trained weights..." << std::endl;
	std::ifstream file(s_weights_path_, std::ios_base::binary);
	assert(file.good());
	std::string line;
	while (std::getline(file,line))
	{
		if(line.empty())continue;
		std::stringstream iss(line);
		std::string wts_name;
		iss >> wts_name ;

		int size_a;
		iss >> size_a;

		std::vector<float> weights;
		uint32_t n_str;
		while(iss >> std::hex >> n_str)
		{
			weights.push_back(reinterpret_cast<float&>(n_str));
		}
		vec_wts_[wts_name] = weights;
	}
	std::cout << "Loading complete!" << std::endl;
}

参考：Pytorch通过保存为ONNX模型转TensorRT5_连正的博客-CSDN博客_onnx转tensorrt

生成引擎报错解决

ERROR: (Unnamed Layer* 193) [Concatenation]: all concat input tensors must have the same dimensions except on the concatenation axis (0), but dimensions mismatched at index 1. Input 0 shape: [384,18,18], Input 1 shape: [384,17,17]
Assertion failed: d.nbDims == 3, file f:\project\detect\yolov5\tensorrt\yolo-tensorrt_exe\modules\trt_utils.cpp, line 444

后来发现是分辨率给错了，分辨率需要是64的倍数

yolov5s6.cfg 修改内容：

width和height，需要是64的倍数

nc，实际训练的类别个数。

批量预测

yolo.cpp文件：

m_BatchSize = 2;// std::stoi(trim(block.at("batch")));

自动创建引擎

先检测engine文件不存在，然后自动创建，创建大概要上分钟，需要耐心等待。

void Yolo::create_engine_yolov5(const nvinfer1::DataType dataType,
Int8EntropyCalibrator* calibrator )
{

yolov5s6-kFLOAT-batch1

设备	模型	model	batch	gpu计算耗时/ms	平均耗时
1060	yolov5s6	fp32	1	23	23
1060	yolov5s7	fp32	2	40	20
1060	yolov5s8	fp32	4	72	18
1060	yolov5s9	fp32	8	130	16.25

输出特征维数：

w	h	anchors	特征数量	类别加box
80	64	3	15360	85	1305600
40	32	3	3840	85	326400
20	16	3	960	85	81600
10	8	3	240	85	20400	173400

打印特征：

std::vector<BBoxInfo> Yolo::decodeDetections(const int& imageIdx,
										     const int& imageH,
                                             const int& imageW)
{
//	Timer timer;
    std::vector<BBoxInfo> binfo;
    for (auto& tensor : m_OutputTensors)
    {

		for (int a = 0; a < 20; a++) {
			std::cout << tensor.hostBuffer[a] << " ";
		}
		std::cout << std::endl;

        std::vector<BBoxInfo> curBInfo = decodeTensor(imageIdx, imageH, imageW, tensor);
        binfo.insert(binfo.end(), curBInfo.begin(), curBInfo.end());
    }
//	timer.out("decodeDetections");
    return binfo;

解决检测框乱的问题

第一步：转化weights文件时，记录anchors，

就是下面代码中的：anchor_grid

代码文件：

gen_weight_new.py

anchor_grid = model.model[-1].anchors * model.model[-1].stride[...,None,None]

print(anchor_grid)
# model.model[-1].anchor_grid = anchor_grid
delattr(model.model[-1], 'anchor_grid') # model.model[-1] is detect layer
model.model[-1].register_buffer("anchor_grid",anchor_grid)

打印结果：

tensor([[[ 35.18750, 5.43750],

[ 69.50000, 6.88672],

[ 49.90625, 9.78906]],

[[ 75.00000, 10.44531],

[112.87500, 8.17969],

[141.87500, 10.35156]],

[[217.25000, 8.93750],

[243.00000, 12.73438],

[320.00000, 10.21875]]])

第二步：保存到cfg中：

D:\work\lbg\yolov5_track\yolo-tensorrt_dll\sln\x64\Release\configs\kuaizi.cfg

width_multiple=0.5

anchors=35.18750,5.4375,69.50,6.88672, 49.90625,9.78906,75.0,10.445,112.870,8.17969,141.875, 10.35156,217.25,8.9375,243.0,12.73438,320.0,10.21875

提速

cfg中修改：

[net]

width=640

height=128

然后删掉kuaizi-kFLOAT-batch5_640.engine文件，重新生成engine文件，

自己的项目速度由16ms提升到8ms。

cfg中修改：

[net]

width=640

height=128

然后删掉kuaizi-kFLOAT-batch5_640.engine文件，重新生成engine文件，

速度右16ms提升到8ms。

本文内容由网友自发贡献，版权归原作者所有，本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容，请联系:hwhale#tublm.com(使用前将#替换为@)