The keypoint format is described here: https://cocodataset.org/#format-data
In particular, this part:
annotation{
    "keypoints" : [x1,y1,v1,...],
    ...
}
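It says that the keypoints are stored as a flat array x1,y1,v1,..., that is, an x coordinate, a y coordinate, and a visibility flag v for each keypoint (v=0: not labeled, v=1: labeled but not visible, v=2: labeled and visible).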
The official yolov7-pose GitHub branch https://github.com/WongKinYiu/yolov7/tree/pose links to a prepared COCO dataset, [Keypoints Labels of MS COCO 2017]: https://github.com/WongKinYiu/yolov7/releases/download/v0.1/coco2017labels-keypoints.zip
Download it, unpack it, and go to the directory labels\train2017. Open any of the txt files and you will see lines like this:
0 0.671279 0.617945 0.645759 0.726859 0.519751 0.381250 2.000000 0.550936 0.348438 2.000000 0.488565 0.367188 2.000000 0.642412 0.354687 2.000000 0.488565 0.395313 2.000000 0.738046 0.526563 2.000000 0.446985 0.534375 2.000000 0.846154 0.771875 2.000000 0.442827 0.812500 2.000000 0.925156 0.964063 2.000000 0.507277 0.698438 2.000000 0.702703 0.942187 2.000000 0.555094 0.950000 2.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
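Each line has the following format:
class x_center y_center width height kpt1_x kpt1_y kpt1_v kpt2_x kpt2_y kpt2_v ...
The four box values are the normalized box center and size, not corner coordinates. In the sample line the third value is smaller than the first, so it cannot be a bottom-right x, and the loading code converts [x, y, w, h] into top-left and bottom-right corners. All coordinates are divided by the image width or height.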
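If your own annotations are in COCO style, a label line can be built roughly as follows. This is a minimal sketch, not code from the yolov7 repository: the function name coco_to_yolo_pose_line and the example annotation are hypothetical, and it assumes the box columns are the normalized center x, center y, width and height, that COCO's bbox is [top_left_x, top_left_y, width, height] in pixels, and that the single class gets id 0 as in the sample line.

```python
def coco_to_yolo_pose_line(ann, img_w, img_h, class_id=0):
    """Build one label line from a COCO-style annotation dict.

    Assumes ann["bbox"] is [top_left_x, top_left_y, width, height] in pixels
    and ann["keypoints"] is the flat [x1, y1, v1, ...] list in pixels.
    """
    x, y, w, h = ann["bbox"]
    values = [
        (x + w / 2) / img_w,  # normalized box center x
        (y + h / 2) / img_h,  # normalized box center y
        w / img_w,            # normalized box width
        h / img_h,            # normalized box height
    ]
    kpts = ann["keypoints"]
    for i in range(0, len(kpts), 3):
        kx, ky, v = kpts[i:i + 3]
        # Unlabeled keypoints are (0, 0, 0) in COCO and stay (0, 0, 0) here
        values += [kx / img_w, ky / img_h, float(v)]
    return f"{class_id} " + " ".join(f"{val:.6f}" for val in values)


# Hypothetical annotation with two keypoints, the second one unlabeled (v=0)
example = {"bbox": [100, 50, 200, 300], "keypoints": [150, 100, 2, 0, 0, 0]}
line = coco_to_yolo_pose_line(example, img_w=640, img_h=480)
print(line)
```

A real COCO person annotation has 17 keypoints, so the resulting line would have 1 + 4 + 17 * 3 = 56 values, like the sample above.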
This is the code (from general.py) that converts these labels when they are loaded:
def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0, kpt_label=False):
    # Convert nx4 boxes from [x, y, w, h] normalized to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    # it does the same operation as above for the key-points
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = w * (x[:, 0] - x[:, 2] / 2) + padw  # top left x
    y[:, 1] = h * (x[:, 1] - x[:, 3] / 2) + padh  # top left y
    y[:, 2] = w * (x[:, 0] + x[:, 2] / 2) + padw  # bottom right x
    y[:, 3] = h * (x[:, 1] + x[:, 3] / 2) + padh  # bottom right y
    if kpt_label:
        num_kpts = (x.shape[1]-4)//2
        for kpt in range(num_kpts):
            for kpt_instance in range(y.shape[0]):
                if y[kpt_instance, 2 * kpt + 4]!=0:
                    y[kpt_instance, 2*kpt+4] = w * y[kpt_instance, 2*kpt+4] + padw
                if y[kpt_instance, 2 * kpt + 1 + 4] !=0:
                    y[kpt_instance, 2*kpt+1+4] = h * y[kpt_instance, 2*kpt+1+4] + padh
    return y
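The keypoint part of that function can be read as "scale every non-zero coordinate, leave zeros alone". Below is a small NumPy-only sketch of the same arithmetic, written independently of the repository; scale_xy_pairs is a hypothetical name. The indexing in the function above (2*kpt+4 and 2*kpt+1+4) implies that at this point each keypoint occupies two columns (x, y), which suggests the visibility column has already been dropped before the call; that is an inference from the code shown, not something verified here.

```python
import numpy as np


def scale_xy_pairs(kpts, w=640, h=640, padw=0, padh=0):
    """Scale normalized (x, y) keypoint pairs to pixels.

    kpts has shape (n_instances, 2 * n_keypoints): x1, y1, x2, y2, ...
    Coordinates equal to 0 are treated as missing and left unchanged,
    mirroring the != 0 checks in the loop version.
    """
    out = np.array(kpts, dtype=float)  # np.array copies, so the input is not modified
    xs = out[:, 0::2]  # view onto the x columns of out
    ys = out[:, 1::2]  # view onto the y columns of out
    xs[xs != 0] = w * xs[xs != 0] + padw
    ys[ys != 0] = h * ys[ys != 0] + padh
    return out


# First keypoint of the sample line plus one missing keypoint,
# with an assumed letterbox padding of 10 px horizontally and 20 px vertically
scaled = scale_xy_pairs([[0.519751, 0.381250, 0.0, 0.0]], w=640, h=640, padw=10, padh=20)
print(scaled)
```

Because missing keypoints are recognized only by a coordinate of exactly 0, a labeled keypoint should not be written with an exact 0 coordinate, or it will not be scaled.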
This is where it is called from:
labels[:, 1:] = xywhn2xyxy(labels[:, 1:], ratio[0] * w, ratio[1] * h, padw=pad[0], padh=pad[1], kpt_label=self.kpt_label)
Note the offset of 1 in labels[:, 1:], which skips the class label.
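The label coordinates must be normalized, as these assertions require (columns 5::3 are the keypoint x values and columns 6::3 the keypoint y values):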
assert (l[:, 5::3] <= 1).all(), 'non-normalized or out of bounds coordinate labels'
assert (l[:, 6::3] <= 1).all(), 'non-normalized or out of bounds coordinate labels'
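It can save time to check your own label files before training. The following is a rough sketch of such a check, not part of yolov7: check_label_line is a hypothetical helper, the default of 17 keypoints is the COCO person layout, and the range check on the box columns is an extra assumption beyond the two assertions above.

```python
import numpy as np


def check_label_line(line, num_kpts=17):
    """Return None if the line looks valid, otherwise a short error message."""
    values = np.array(line.split(), dtype=float)
    expected = 5 + 3 * num_kpts  # class + 4 box values + (x, y, v) per keypoint
    if values.size != expected:
        return f"expected {expected} values, got {values.size}"
    # Extra check (an assumption, not one of the assertions shown above)
    if values[1:5].min() < 0 or values[1:5].max() > 1:
        return "box not normalized"
    # Same column slices as the assertions: 5::3 is keypoint x, 6::3 is keypoint y
    if values[5::3].max() > 1 or values[6::3].max() > 1:
        return "keypoint coordinate greater than 1"
    return None


# Hypothetical one-keypoint lines for illustration
ok = check_label_line("0 0.5 0.5 0.2 0.2 0.4 0.6 2", num_kpts=1)
bad_kpt = check_label_line("0 0.5 0.5 0.2 0.2 320 0.6 2", num_kpts=1)
too_short = check_label_line("0 0.5 0.5", num_kpts=1)
print(ok, bad_kpt, too_short)
```

The second line fails because a pixel value (320) was written where a normalized value was expected, which is the most likely mistake when converting from COCO annotations.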
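Getting the label format right is the only tricky part. The rest is a matter of storing the images in the right directories. The structure is: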
images/
    train/
        file_name1.jpg
        ...
    test/
    val/
labels/
    train/
        file_name1.txt
        ...
    test/
    val/
train.txt
test.txt
val.txt
where train.txt contains the paths to the training images. Its contents look like this:
./images/train/file_name1.jpg
...
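The list files can be generated rather than written by hand. Here is a minimal sketch, assuming the directory layout above and the ./images/<split>/<name> path style shown; write_image_list and the extension filter are hypothetical, and whether the training script resolves these paths relative to the dataset root or to the working directory is not confirmed here.

```python
import tempfile
from pathlib import Path


def write_image_list(root, split, exts=(".jpg", ".jpeg", ".png")):
    """Write <root>/<split>.txt with one ./images/<split>/<file> path per line."""
    root = Path(root)
    image_dir = root / "images" / split
    names = sorted(p.name for p in image_dir.iterdir() if p.suffix.lower() in exts)
    lines = [f"./images/{split}/{name}" for name in names]
    (root / f"{split}.txt").write_text("".join(entry + "\n" for entry in lines))
    return lines


# Demo on a throwaway directory so nothing in a real dataset is touched
with tempfile.TemporaryDirectory() as tmp:
    train_dir = Path(tmp) / "images" / "train"
    train_dir.mkdir(parents=True)
    (train_dir / "file_name1.jpg").touch()
    (train_dir / "notes.md").touch()  # not an image, so it is skipped
    demo_lines = write_image_list(tmp, "train")
    demo_file = (Path(tmp) / "train.txt").read_text()

print(demo_lines)
```

For a real dataset you would call write_image_list once per split ("train", "val", "test") on the dataset root.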