Training a Mask R-CNN model to recognize cartons

2023-11-15

Source code: matterport/Mask_RCNN (https://github.com/matterport/Mask_RCNN)

1. Development environment and tools

1. Development environment: Anaconda 3, Python 3.6, Jupyter, PyCharm

2. Annotation tool: labelme

3. Python packages: numpy 1.16.1, scipy 1.2.1, Pillow 5.4.1, Cython 0.29.6, scikit-image 0.14.2, keras 2.12, tensorflow 1.10, OpenCV, h5py 2.8.0, PyYAML 3.13
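
For reference, a minimal installation sketch for the packages above (version pins follow the list where they are unambiguous; choose a Keras 2.x release that is compatible with TensorFlow 1.10, since the exact Keras/TensorFlow pairing is not fixed here):

pip install numpy==1.16.1 scipy==1.2.1 Pillow==5.4.1 Cython==0.29.6 scikit-image==0.14.2 h5py==2.8.0 PyYAML==3.13 opencv-python
pip install tensorflow==1.10.0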

2. Directory layout

Project directory: the matterport source tree with only the train_data_400 sample folder added.

Training sample directory:

cv2_mask: the mask images produced after annotation

json: the mask coordinate information saved by labelme

labelme_json: the folders generated with labelme's labelme_json_to_dataset command

pic: the source images

(The annotation procedure is described in Section 3.)
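
Putting this together, the sample folder looks roughly like the sketch below (file names are only illustrative):

train_data_400/
├── pic/            source images (1.jpg, 2.jpg, ...)
├── json/           annotation files saved by labelme (1.json, 2.json, ...)
├── labelme_json/   folders produced by labelme_json_to_dataset (1_json/, 2_json/, ...)
│   └── 1_json/     img.png, label.png, info.yaml, ...
└── cv2_mask/       label.png copies renamed by index (1.png, 2.png, ...)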

3. Creating samples with labelme

Open a terminal:

Install PyQt and labelme from the terminal:

pip install pyqt5

pip install labelme

Launch labelme from the terminal:
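
Once installed, labelme provides a command-line entry point, so starting it is simply:

labelme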

Annotation procedure:

(Note: when several objects of the same type appear in one sample, they must not share a single label, or they will be detected as one merged instance. Two workarounds: 1. distinguish them with labels such as XX1, XX2, XX3; 2. create several copies of the image and annotate one object per copy.)

When the annotation is finished, click the Save button to store the .json file.

Run labelme_json_to_dataset <文件名>.json to generate the 8-bit mask image label.png, the class information file info.yaml, and related files.
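
To convert all annotations in one pass instead of running the command file by file, a small loop over the json folder works; this is only a sketch, and the json directory path is an assumption based on the layout in Section 2:

import glob
import os

json_dir = r"D:\photoclub\400x400_150\json"   # assumed location of the labelme .json files
for json_file in sorted(glob.glob(os.path.join(json_dir, "*.json"))):
    # each call creates a <name>_json folder containing img.png, label.png, info.yaml, ...
    os.system("labelme_json_to_dataset " + json_file)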

The label image is 8-bit color, which differs from what many blog posts describe: labelme keeps being updated on GitHub, and since 2019-03-07 it outputs an 8-bit color image directly, so there is no need to convert the image with extra tools as older tutorials suggest.

Copy the label.png from each labelme_json\xx_json folder into the cv2_mask folder and rename it by its sample index:

import os
import shutil

# Source: labelme_json\<i>_json\label.png  ->  Destination: cv2_mask\<i>.png
log_path = "D:\\photoclub\\400x400_150\\labelme_json\\"
outpath = "D:\\photoclub\\400x400_150\\cv2_mask\\"

# samples are numbered 1..157
for i in range(1, 158):
    inpath = log_path + str(i) + "_json\\label.png"
    maskpath = outpath + str(i) + ".png"
    print(inpath)
    print(maskpath)
    shutil.copyfile(inpath, maskpath)

After arranging the files into the sample directory layout above, proceed to training.

4. Training on custom samples

The training code is adapted from this CSDN blog post: https://blog.csdn.net/qq_29462849/article/details/81037343

# -*- coding: utf-8 -*-

import os
import sys
import random
import math
import re
import time
import numpy as np
import cv2
import matplotlib
import matplotlib.pyplot as plt
import tensorflow as tf
from mrcnn.config import Config
#import utils
from mrcnn import model as modellib,utils
from mrcnn import visualize
import yaml
from mrcnn.model import log
from PIL import Image

#os.environ["CUDA_VISIBLE_DEVICES"] = "0"
# Root directory of the project
ROOT_DIR = os.getcwd()

#ROOT_DIR = os.path.abspath("../")
# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

iter_num=0

# Local path to trained weights file
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
# Download COCO trained weights from Releases if needed
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)


class ShapesConfig(Config):
    """Configuration for training on the toy shapes dataset.
    Derives from the base Config class and overrides values specific
    to the toy shapes dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"

    # Train on 1 GPU with 1 image per GPU. Batch size is 1 (GPUs * images/GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    # Number of classes (including background)
    NUM_CLASSES = 1 + 3  # background + 3 shapes

    # Use small images for faster training. Set the limits of the small side
    # and the large side; together they determine the image shape.
    IMAGE_MIN_DIM = 320
    IMAGE_MAX_DIM = 448

    # Use smaller anchors because our image and objects are small
    RPN_ANCHOR_SCALES = (8 * 6, 16 * 6, 32 * 6, 64 * 6, 128 * 6)  # anchor side in pixels

    # Reduce training ROIs per image because the images are small and have
    # few objects. Aim to allow ROI sampling to pick 33% positive ROIs.
    TRAIN_ROIS_PER_IMAGE = 100

    # Use a small epoch since the data is simple
    STEPS_PER_EPOCH = 100

    # use small validation steps since the epoch is small
    VALIDATION_STEPS = 50


config = ShapesConfig()
config.display()

class DrugDataset(utils.Dataset):
    # Get the number of instances (objects) in the image
    def get_obj_index(self, image):
        n = np.max(image)
        return n

    # Parse the yaml file produced by labelme to get the instance label of each mask layer
    def from_yaml_get_class(self, image_id):
        info = self.image_info[image_id]
        with open(info['yaml_path']) as f:
            temp = yaml.load(f.read())
            labels = temp['label_names']
            del labels[0]
        return labels

    # Reimplement draw_mask
    def draw_mask(self, num_obj, mask, image,image_id):
        #print("draw_mask-->",image_id)
        #print("self.image_info",self.image_info)
        info = self.image_info[image_id]
        #print("info-->",info)
        #print("info[width]----->",info['width'],"-info[height]--->",info['height'])
        for index in range(num_obj):
            for i in range(info['width']):
                for j in range(info['height']):
                    #print("image_id-->",image_id,"-i--->",i,"-j--->",j)
                    #print("info[width]----->",info['width'],"-info[height]--->",info['height'])
                    at_pixel = image.getpixel((i, j))
                    if at_pixel == index + 1:
                        mask[j, i, index] = 1
        return mask

    # Reimplement load_shapes with our own classes (more can be added freely)
    # and record path, mask_path and yaml_path in self.image_info
    def load_shapes(self, count, img_floder, mask_floder, imglist, dataset_root_path):
        """Generate the requested number of synthetic images.
        count: number of images to generate.
        height, width: the size of the generated images.
        """
        # Add classes; further object classes can be registered the same way
        self.add_class("shapes", 1, "carton")
        self.add_class("shapes", 2, "black")
        self.add_class("shapes", 3, "wood")
        for i in range(count):
            # Derive per-image file paths (width/height are fixed to 400 below, matching the samples)

            filestr = imglist[i].split(".")[0]
            #print(imglist[i],"-->",cv_img.shape[1],"--->",cv_img.shape[0])
            #print("id-->", i, " imglist[", i, "]-->", imglist[i],"filestr-->",filestr)
            #filestr = filestr.split("_")[1]
            mask_path = mask_floder + "/" + filestr + ".png"#D:\pythonprojects\maskrcnn_mrcnn\train_data/cv2_mask/99.png
            yaml_path = dataset_root_path + "labelme_json/" + filestr + "_json/info.yaml"
            #print(dataset_root_path + "labelme_json/" + filestr + "_json/img.png")
            cv_img = cv2.imread(dataset_root_path + "labelme_json/" + filestr + "_json/img.png")
            self.add_image("shapes", image_id=i, path=img_floder + "/" + imglist[i],
                           width=400, height=400, mask_path=mask_path, yaml_path=yaml_path)

    # Override load_mask
    def load_mask(self, image_id):
        """Generate instance masks for shapes of the given image ID.
        """
        global iter_num
        print("image_id",image_id)
        info = self.image_info[image_id]
        img = Image.open(info['mask_path'])
        num_obj = self.get_obj_index(img)
        mask = np.zeros([info['height'], info['width'], num_obj], dtype=np.uint8)
        mask = self.draw_mask(num_obj, mask, img, image_id)
        # Resolve occlusions between instances: earlier mask layers are clipped by later ones
        occlusion = np.logical_not(mask[:, :, -1]).astype(np.uint8)
        for i in range(num_obj - 2, -1, -1):
            mask[:, :, i] = mask[:, :, i] * occlusion
            occlusion = np.logical_and(occlusion, np.logical_not(mask[:, :, i]))
        labels = []
        labels = self.from_yaml_get_class(image_id)
        labels_form = []
        # Map each raw labelme label (e.g. "carton1", "carton2") back to its class name
        for i in range(len(labels)):
            if labels[i].find("carton") != -1:
                labels_form.append("carton")
            elif labels[i].find("black") != -1:
                labels_form.append("black")
            elif labels[i].find("wood") != -1:
                labels_form.append("wood")
        class_ids = np.array([self.class_names.index(s) for s in labels_form])
        return mask, class_ids.astype(np.int32)

def get_ax(rows=1, cols=1, size=8):
    """Return a Matplotlib Axes array to be used in
    all visualizations in the notebook. Provide a
    central point to control graph sizes.

    Change the default size attribute to control the size
    of rendered images
    """
    _, ax = plt.subplots(rows, cols, figsize=(size * cols, size * rows))
    return ax

# Basic settings
dataset_root_path="D:\\pythonprojects\\blogs\\train_data_400/"
img_floder = dataset_root_path + "pic"
mask_floder = dataset_root_path + "cv2_mask"
#yaml_floder = dataset_root_path
imglist = os.listdir(img_floder)
count = len(imglist)
print(img_floder+"img_floder")#D:\pythonprojects\maskrcnn_mrcnn\train_data/pic
print(mask_floder+"mask_floder")#D:\pythonprojects\maskrcnn_mrcnn\train_data/cv2_mask
# Prepare the train and val datasets (this example reuses the same images for both)
dataset_train = DrugDataset()
dataset_train.load_shapes(count, img_floder, mask_floder, imglist, dataset_root_path)
dataset_train.prepare()

#print("dataset_train-->",dataset_train._image_ids)

dataset_val = DrugDataset()
dataset_val.load_shapes(count, img_floder, mask_floder, imglist, dataset_root_path)
dataset_val.prepare()

#print("dataset_val-->",dataset_val._image_ids)

# Load and display random samples
#image_ids = np.random.choice(dataset_train.image_ids, 4)
#for image_id in image_ids:
#    image = dataset_train.load_image(image_id)
#    mask, class_ids = dataset_train.load_mask(image_id)
#    visualize.display_top_masks(image, mask, class_ids, dataset_train.class_names)

# Create model in training mode
model = modellib.MaskRCNN(mode="training", config=config,
                          model_dir=MODEL_DIR)

# Which weights to start with?
init_with = "coco"  # imagenet, coco, or last

if init_with == "imagenet":
    model.load_weights(model.get_imagenet_weights(), by_name=True)
elif init_with == "coco":
    # Load weights trained on MS COCO, but skip layers that
    # are different due to the different number of classes
    # See README for instructions to download the COCO weights
    model.load_weights(COCO_MODEL_PATH, by_name=True,
                       exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                                "mrcnn_bbox", "mrcnn_mask"])
elif init_with == "last":
    # Load the last model you trained and continue training
    model.load_weights(model.find_last()[1], by_name=True)

# Train the head branches
# Passing layers="heads" freezes all layers except the head
# layers. You can also pass a regular expression to select
# which layers to train by name pattern.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=5,
            layers='heads')



# Fine tune all layers
# Passing layers="all" trains all layers. You can also
# pass a regular expression to select which layers to
# train by name pattern.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE / 10,
            epochs=10,
            layers="all")

Parameters to adjust for your own project: the file paths, the object classes, the training image size, the number of images processed per GPU, and which pretrained weights to load for initialization.

Training results:

The logs folder contains two log files plus one model file saved per epoch over the ten training epochs; we take the model file from the last epoch as the test model.
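
If you prefer not to hard-code the checkpoint name in the test script below, one option (a sketch assuming matterport's default logs/ layout) is to pick the newest weights file by modification time:

import glob
import os

MODEL_DIR = os.path.join(os.getcwd(), "logs")
# newest mask_rcnn_shapes_*.h5 checkpoint anywhere under logs/
checkpoints = glob.glob(os.path.join(MODEL_DIR, "**", "mask_rcnn_shapes_*.h5"), recursive=True)
LAST_MODEL_PATH = max(checkpoints, key=os.path.getmtime)
print(LAST_MODEL_PATH)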

5. Testing the trained model

import os
import sys
import random
import skimage.io
from mrcnn.config import Config
from datetime import datetime

# Root directory of the project
ROOT_DIR = os.getcwd()

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn import utils
import mrcnn.model as modellib
from mrcnn import visualize

# Directory to save logs and trained model
MODEL_DIR = os.path.join(ROOT_DIR, "logs")

# Local path to the weights file trained in section 4 (kept under the COCO_MODEL_PATH name
# that the rest of the script uses); note that training writes checkpoints into a run
# subfolder under logs/, so move the file or adjust the path accordingly
COCO_MODEL_PATH = os.path.join(MODEL_DIR, "mask_rcnn_shapes_0010.h5")
if not os.path.exists(COCO_MODEL_PATH):
    raise FileNotFoundError("Trained weights not found: " + COCO_MODEL_PATH)

# Directory of images to run detection on
IMAGE_DIR = os.path.join(ROOT_DIR, "images")


class ShapesConfig(Config):
    """Configuration for training on the toy shapes dataset.
    Derives from the base Config class and overrides values specific
    to the toy shapes dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"

    # Train on 1 GPU with 1 image per GPU. Batch size is 1 (GPUs * images/GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    # Number of classes (including background)
    NUM_CLASSES = 1 + 3  # background + 3 shapes

    # Use small images for faster training. Set the limits of the small side
    # and the large side; together they determine the image shape.
    IMAGE_MIN_DIM = 320
    IMAGE_MAX_DIM = 448

    # Use smaller anchors because our image and objects are small
    RPN_ANCHOR_SCALES = (8 * 6, 16 * 6, 32 * 6, 64 * 6, 128 * 6)  # anchor side in pixels

    # Reduce training ROIs per image because the images are small and have
    # few objects. Aim to allow ROI sampling to pick 33% positive ROIs.
    TRAIN_ROIS_PER_IMAGE = 100

    # Use a small epoch since the data is simple
    STEPS_PER_EPOCH = 100

    # use small validation steps since the epoch is small
    VALIDATION_STEPS = 50


# import train_tongue
# class InferenceConfig(coco.CocoConfig):
class InferenceConfig(ShapesConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1


config = InferenceConfig()

# Create model object in inference mode.
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

# Load weights trained on MS-COCO
model.load_weights(COCO_MODEL_PATH, by_name=True)

# Class names for this model. The index of a class in the list is its ID,
# and index 0 must be the background class, matching NUM_CLASSES = 1 + 3.
class_names = ['BG', 'carton', 'black', 'wood']
# Load a random image from the images folder
file_names = next(os.walk(IMAGE_DIR))[2]
image = skimage.io.imread(os.path.join(IMAGE_DIR, random.choice(file_names)))

a = datetime.now()
# Run detection
results = model.detect([image], verbose=1)
b = datetime.now()
print("detection time (s):", (b - a).seconds)
# Visualize results
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                            class_names, r['scores'])

The test results are as follows:
