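Background:

In autonomous driving, camera-based visual perception is as important as eyes are to a person. Most mainstream solutions today are built on deep learning (Tensorflow and similar frameworks) rather than on traditional image processing (OpenCV and the like).

Next, using YOLOV3 as the base network model and Tensorflow as the base framework, we build a system that automatically recognizes dynamic targets on the road, such as vehicles, pedestrians, and cyclists.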

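Main text:

The original YOLOV3 is developed on darknet (written in pure C). Here we implement the YOLOV3 architecture on the Tensorflow platform, which is cross-platform and supports several languages, including Python and C++.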

Key points:

1. Basic network structure diagram:

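[Figure: basic network structure diagram. Image caption: Object detection for vehicles and pedestrians (Tensorflow version of yolov3-tiny)]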

The model flow diagram is as follows:

[Figure: model flow diagram]

Backbone network, Darknet53:

[Figure: Darknet53 backbone network]
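
Since the backbone diagram may not be visible here, the following is a minimal sketch of the Darknet53 layout as described in the YOLOv3 paper: one initial convolution, then five stages that each start with a stride-2 convolution and continue with 1, 2, 8, 8 and 4 residual blocks of two convolutions each. The function name `darknet53_summary` and the 416-pixel input size are illustrative assumptions, not part of the original code.

```python
# Residual-block counts per stage in Darknet-53 (as listed in the YOLOv3 paper).
RESIDUAL_BLOCKS_PER_STAGE = [1, 2, 8, 8, 4]


def darknet53_summary(input_size=416):
    """Return (number of conv layers, feature-map side length after each stage).

    input_size=416 is an assumed, commonly used input resolution.
    """
    conv_layers = 1  # initial 3x3 convolution, stride 1
    size = input_size
    stage_sizes = []
    for num_blocks in RESIDUAL_BLOCKS_PER_STAGE:
        conv_layers += 1               # 3x3 stride-2 conv that halves the resolution
        size //= 2
        conv_layers += 2 * num_blocks  # each residual block: 1x1 conv + 3x3 conv
        stage_sizes.append(size)
    return conv_layers, stage_sizes


conv_layers, stage_sizes = darknet53_summary(416)
print(conv_layers)  # 52
print(stage_sizes)  # [208, 104, 52, 26, 13]
```

The 52 convolutional layers plus the final fully connected classification layer give the "53" in the name. In full YOLOv3, the last three feature-map sizes (52, 26 and 13 for a 416 input) are the scales at which detections are predicted.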

2. Code structure:

tf_yolov3

|-------extract_voc.py # generates the data format required for training from the original files

import os
import argparse
import xml.etree.ElementTree as ET

# sets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
sets=[('2012', 'train'), ('2012', 'val')]

classes = ["aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor"]

parser = argparse.ArgumentParser()
parser.add_argument("--voc_path", default="/home/yang/test/VOCdevkit/train/")
parser.add_argument("--dataset_info_path", default="./")
flags = parser.parse_args()

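def convert_annotation(year, image_id, list_file):
    # Append " xmin ymin xmax ymax class_id" to list_file for each usable object in the image's XML annotation.
    xml_path = os.path.join(flags.voc_path, 'VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))
    # ET.parse accepts a path and closes the file itself, so no handle is left open.
    tree = ET.parse(xml_path)
    root = tree.getroot()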

    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        # Skip classes outside the list above and objects flagged as difficult.
        if cls not in classes or int(difficult)==1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        # Box corners in pixels: (xmin, ymin, xmax, ymax).
        b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text), int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))
        list_file.write(" " +  " ".join([str(a) for a in b]) + " " + str(cls_id))


for year, image_set in sets:
    text_path = os.path.join(flags.voc_path, 'VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set))
    if not os.path.exists(text_path): continue
    with open(text_path) as id_file:
        image_ids = id_file.read().strip().split()
    list_file_path = os.path.join(flags.dataset_info_path, '%s_%s.txt'%(year, image_set))
    # One output line per image: the image path followed by "xmin ymin xmax ymax class_id" for each object.
    with open(list_file_path, 'w') as list_file:
        for image_id in image_ids:
            image_path = os.path.join(flags.voc_path, 'VOCdevkit/VOC%s/JPEGImages/%s.jpg'%(year, image_id))
            print("=>", image_path)
            list_file.write(image_path)
            convert_annotation(year, image_id, list_file)
            list_file.write('\n')  # end the record for this image
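
The script above writes one line per image: the image path followed by groups of five integers (xmin, ymin, xmax, ymax, class index into `classes`). As a rough sketch of how a training pipeline might read such a line back, the helper below splits it into a path and a list of boxes. The name `parse_annotation_line` and the sample path are hypothetical and do not come from the original repository. It also assumes the image path contains no spaces.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, boxes).

    Each box is a tuple (xmin, ymin, xmax, ymax, class_id).
    Assumes the image path contains no whitespace.
    """
    fields = line.strip().split()
    image_path = fields[0]
    values = [int(v) for v in fields[1:]]
    if len(values) % 5 != 0:
        raise ValueError("expected 5 integers per box, got %d values" % len(values))
    boxes = [tuple(values[i:i + 5]) for i in range(0, len(values), 5)]
    return image_path, boxes


# Hypothetical sample line with two objects (class 12 = horse, class 14 = person).
example = "/data/VOC2012/JPEGImages/2008_000008.jpg 53 87 471 420 12 158 44 289 167 14"
image_path, boxes = parse_annotation_line(example)
print(image_path)
print(boxes)
```

An image with no usable objects produces a line that contains only the path, in which case `boxes` is an empty list.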
