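This series walks through implementing an FCN (Fully Convolutional Networks) model from scratch, training it on a set of shape images, and using it for prediction.

This is the first chapter of the series. It covers data preprocessing and building the data generator.

The dataset used to train the FCN contains 200 images, laid out as follows: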

|-images
|-|-0.jpg
|-|-1.jpg
|-|-...
|-annotated
|-|-0.json
|-|-1.json
|-|-...

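Each image has a corresponding JSON file.

The format is roughly as follows:

  • shapes contains an array
  • Each element of the array is a {} that represents one object
  • Each {} contains:

    • label: the label of the object
    • points: the coordinates of the points that outline the object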
{
  "shapes": [
    { 
      "label": "star",
      "points": [
        [
          34, 115
        ],
        ...
      ],
    },
    ...
  ],
  ...
}
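
As a quick check of this format, the sketch below reads an annotation and prints each object's label and point count. The inline `sample` string is made up for illustration; only the `shapes`, `label`, and `points` keys come from the format above. With the real dataset you would load one of the files under `annotated` with `json.load` instead.

```python
import json

# Hypothetical annotation matching the format above (values are made up).
sample = '''
{
  "shapes": [
    {"label": "star", "points": [[34, 115], [60, 40], [90, 115]]},
    {"label": "square", "points": [[10, 10], [50, 10], [50, 50], [10, 50]]}
  ]
}
'''

data = json.loads(sample)

# Collect (label, number of points) for every annotated object.
summary = [(shape["label"], len(shape["points"])) for shape in data["shapes"]]

for label, n_points in summary:
    print(f"{label}: {n_points} points")
```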

Install the required libraries

# TensorFlow version: 2.3.1
! pip install pillow cython imgaug
! pip install git+https://github.com/lucasb-eyer/pydensecrf.git

Download the dataset

! git clone git@github.com:ElijahMingLiu/FCN-dataset.git

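Next, build the generator. For an introduction to generators, see the article "Keras model training: implementing a generator for fit_generator with keras.utils.Sequence".

The target for one batch has shape [batch_size, height, width, n_classes+1], where n_classes counts the object classes only. Each class has its own channel holding its mask image, and the background gets one more channel.

For example, with 5 classes in total, the target for one batch has shape [batch_size, height, width, 6].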

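The generator works roughly as follows:

  • __init__: the constructor takes the following arguments

    • image_paths: paths to the images
    • annot_paths: paths to the annotations
    • batch_size: batch size, default 32
    • shuffle: whether to reshuffle the dataset at the end of each epoch, default True
    • augment: whether to apply data augmentation, default False
  • __getitem__: calls __data_generation to preprocess the data
  • __data_generation: calls create_binary_masks (single class) or create_multi_masks (multiple classes) to generate the masks
  • create_multi_masks

    1. Build a list named channels to hold the mask image of each class
    2. cls holds the label of each object
    3. poly holds the points of each object (the coordinates that outline it)
    4. background is the background mask
    5. Loop over every label; if the label appears in cls, draw its mask with cv2.fillPoly, so masked pixels are 255 and all others are 0
    6. For the background mask: if some object is labeled background, draw it like any other object; otherwise take the inverse of the combined mask of all other objects, using cv2.threshold(background, 127, 255, cv2.THRESH_BINARY_INV)
    7. Finally combine the channels with np.stack(channels, axis=2), giving [height, width, n_classes+1]
    8. Normalize by dividing the final values by 255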
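
Steps 5 to 8 can be sketched with NumPy alone. This is an illustration under stated assumptions, not the author's code: rectangles filled by slicing stand in for polygons drawn with cv2.fillPoly, `np.where` stands in for the cv2.threshold call, and the two-label list and `regions` dict are hypothetical.

```python
import numpy as np

height, width = 8, 8
labels = ["circle", "square"]  # hypothetical two-class example

# Rectangular regions stand in for polygons drawn by cv2.fillPoly.
# Each entry is (row_start, row_end, col_start, col_end).
regions = {"square": (1, 4, 1, 4)}  # this image contains no circle

channels = []
background = np.zeros((height, width), dtype=np.float32)

for label in labels:
    blank = np.zeros((height, width), dtype=np.float32)
    if label in regions:
        r0, r1, c0, c1 = regions[label]
        blank[r0:r1, c0:c1] = 255       # mask of this class
        background[r0:r1, c0:c1] = 255  # union of all object masks
    channels.append(blank)

# Invert the union: 0 where an object was drawn, 255 everywhere else.
# Same rule as THRESH_BINARY_INV with threshold 127 and max value 255.
background = np.where(background > 127, 0, 255).astype(np.float32)
channels.append(background)

# Stack to [height, width, n_classes + 1] and scale to [0, 1].
Y = np.stack(channels, axis=2) / 255.0
print(Y.shape)
```

In this toy case every pixel belongs to exactly one channel, so the channels sum to 1 at each pixel. With overlapping objects that would no longer hold.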
import numpy as np
import pickle
import os
from PIL import Image
import imgaug as ia
from imgaug import augmenters as iaa
import cv2
import json
import tensorflow as tf

'''
ia.seed(1)

seq = iaa.Sequential([
    iaa.Fliplr(0.5),
    iaa.Multiply((1.2, 1.5)),
    iaa.Affine(
        #scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},
        rotate=(-90, 90)
    ),
    iaa.Sometimes(0.5,
        iaa.GaussianBlur(sigma=(0, 8))
    )
], random_order=True)
'''

class DataGenerator(tf.keras.utils.Sequence):
    
    def __init__(self, image_paths, annot_paths, batch_size=32,
                 shuffle=True, augment=False):
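        # Image paths
        self.image_paths = image_paths
        # Annotation paths
        self.annot_paths = annot_paths
        # Batch size
        self.batch_size = batch_size
        # Whether to shuffle the dataset after each epoch
        self.shuffle = shuffle
        # Whether to apply data augmentation
        self.augment = augment
        # Run the end-of-epoch hook once up front
        # so that the index array exists before the first batch
        self.on_epoch_end()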
    
    
    def __len__(self):
        # Number of batches per epoch (a trailing partial batch is dropped)
        return int(np.floor(len(self.image_paths) / self.batch_size))
    
    
    def __getitem__(self, index):
        # Return the batch at position index
        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]

        image_paths = [self.image_paths[k] for k in indexes]
        annot_paths = [self.annot_paths[k] for k in indexes]
        
        # Build numpy arrays from the paths; y has shape (batch_size, height, width, classes),
        # so the i-th class maps to the i-th channel
        X, y = self.__data_generation(image_paths, annot_paths)

        return X, y


    def on_epoch_end(self):
        # Rebuild the index array after each epoch and shuffle it if requested
        self.indexes = np.arange(len(self.image_paths))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    
    def get_poly(self, annot_path):
        # Read the annotation file
        with open(annot_path) as handle:
            data = json.load(handle)
        
        shape_dicts = data['shapes']

        return shape_dicts

    
    def create_binary_masks(self, im, shape_dicts):
        # image must be grayscale
        blank = np.zeros(shape=(im.shape[0], im.shape[1]), dtype=np.float32)

        for shape in shape_dicts:
            if shape['label'] != 'background':
                points = np.array(shape['points'], dtype=np.int32)
                cv2.fillPoly(blank, [points], 255)
        blank = blank / 255.0

        return np.expand_dims(blank, axis=2)


    def create_multi_masks(self, im, shape_dicts):
        # Each channel is the mask of one class
        channels = []
        # As the format above shows, each label is a class such as 'star' or 'square'
        cls = [x['label'] for x in shape_dicts]
        # points are the coordinates that outline each object
        poly = [np.array(x['points'], dtype=np.int32) for x in shape_dicts]
        # label2poly looks like
        # {'star': array([[ x1, y1],
        # [ x2,  y2],
        # ...)
        # 'square': array([[x1, y1],
        # [x2,  y2],
        # ...)
        # }
        # Note: dict(zip(...)) keeps only the last polygon when a label repeats
        label2poly = dict(zip(cls, poly))
        # Background
        background = np.zeros(shape=(im.shape[0], im.shape[1]), dtype=np.float32)
        
        # Loop over every class (labels is the module-level list from the settings)
        for i, label in enumerate(labels):
            # blank is the mask of this class
            blank = np.zeros(shape=(im.shape[0], im.shape[1]), dtype=np.float32)
            
            if label in cls:
                # cv2.fillPoly draws the polygon described by the points;
                # fill both blank and background
                cv2.fillPoly(blank, [label2poly[label]], 255)
                cv2.fillPoly(background, [label2poly[label]], 255)
            # Append blank to the channels
            channels.append(blank)

        # If there is an explicit background object, draw it like the others.
        # Otherwise invert the union: 0 where any object mask is set, 255 elsewhere
        if 'background' in cls:
            background = np.zeros(shape=(im.shape[0], im.shape[1]), dtype=np.float32)
            cv2.fillPoly(background, [label2poly['background']], 255)
        else:
            _, background = cv2.threshold(background, 127, 255, cv2.THRESH_BINARY_INV)
        # Append the background as the last channel
        channels.append(background)
        # Y is [height, width, n_classes+1]; dividing by 255 maps it to the [0, 1] range
        Y = np.stack(channels, axis=2) / 255.0

        return Y
    
    '''
    def augment_poly(self, im, shape_dicts):
        # augments an image and it's polygons

        points = []
        aug_shape_dicts = []
        i = 0

        for shape in shape_dicts:

            for pairs in shape['points']:
                points.append(ia.Keypoint(x=pairs[0], y=pairs[1]))

            _d = {}
            _d['label'] = shape['label']
            _d['index'] = (i, i+len(shape['points']))
            aug_shape_dicts.append(_d)

            i += len(shape['points'])

        keypoints = ia.KeypointsOnImage(points, shape=(256,256,3))

        seq_det = seq.to_deterministic()
        image_aug = seq_det.augment_images([im])[0]
        keypoints_aug = seq_det.augment_keypoints([keypoints])[0]

        for shape in aug_shape_dicts:
            start, end = shape['index']
            aug_points = [[keypoint.x, keypoint.y] for keypoint in keypoints_aug.keypoints[start:end]]
            shape['points'] = aug_points

        return image_aug, aug_shape_dicts
    '''
    
    def __data_generation(self, image_paths, annot_paths):
        
        # Allocate the numpy arrays for the batch
        # (imshape and n_classes are module-level settings defined further down)
        X = np.empty((self.batch_size, imshape[0], imshape[1], imshape[2]), dtype=np.float32)
        Y = np.empty((self.batch_size, imshape[0], imshape[1], n_classes),  dtype=np.float32)
        
        for i, (im_path, annot_path) in enumerate(zip(image_paths, annot_paths)):
            
            # Read the image as grayscale or RGB depending on the channel count
            if imshape[2] == 1:
                im = cv2.imread(im_path, 0)
                im = np.expand_dims(im, axis=2)
            elif imshape[2] == 3:
                im = cv2.imread(im_path, 1)
                im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
            shape_dicts = self.get_poly(annot_path)
            
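            # Optional data augmentation (augment_poly and seq are commented out above;
            # restore both before passing augment=True)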
            if self.augment:
                im, shape_dicts = self.augment_poly(im, shape_dicts)
            
            # Build the mask
            if n_classes == 1:
                mask = self.create_binary_masks(im, shape_dicts)
            elif n_classes > 1:
                mask = self.create_multi_masks(im, shape_dicts)
            
            X[i,] = im
            Y[i,] = mask
            
        return X, Y
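
One thing to be aware of in create_multi_masks: `dict(zip(cls, poly))` keeps one polygon per label, so if an image contained two objects with the same label, only the last one would be drawn. Whether this dataset ever has such images is not stated here. If it does, grouping the polygons per label is one possible fix. In the sketch below, `group_polygons` and the sample `shape_dicts` are hypothetical and not part of the original class.

```python
from collections import defaultdict

def group_polygons(shape_dicts):
    """Map each label to a list of all point lists carrying that label."""
    grouped = defaultdict(list)
    for shape in shape_dicts:
        grouped[shape["label"]].append(shape["points"])
    return dict(grouped)

# Made-up annotation with two stars and one circle.
shape_dicts = [
    {"label": "star", "points": [[0, 0], [4, 0], [2, 3]]},
    {"label": "star", "points": [[10, 10], [14, 10], [12, 13]]},
    {"label": "circle", "points": [[5, 5], [6, 5], [6, 6]]},
]

# The dict(zip(...)) approach silently drops the first star.
last_only = dict(zip([s["label"] for s in shape_dicts],
                     [s["points"] for s in shape_dicts]))
grouped = group_polygons(shape_dicts)

print(len(last_only), len(grouped["star"]))
```

Since cv2.fillPoly accepts a list of polygons, the grouped point lists for one label (converted to int32 arrays) could be drawn in a single call.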

Set the parameters

# Sort file names by their numeric stem
def sorted_fns(directory):
    return sorted(os.listdir(directory), key=lambda x: int(x.split('.')[0]))
# Input image size
imshape = (256, 256, 3)
# Hue of each shape; not used for now
hues = {'star': 30,
        'square': 0,
        'circle': 90,
        'triangle': 60}
# labels = ['circle', 'square', 'star', 'triangle']
labels = sorted(hues.keys())

# One extra class for the background
n_classes = len(labels) + 1

# File lists
image_paths = [os.path.join('./FCN-dataset/images', x) for x in sorted_fns('./FCN-dataset/images')]
annot_paths = [os.path.join('./FCN-dataset/annotated', x) for x in sorted_fns('./FCN-dataset/annotated')]
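
The numeric sort key matters because images and annotations are paired purely by position in the two lists. A plain string sort would put 10.jpg before 2.jpg. The small sketch below shows the difference on a made-up list of names; `sort_numeric` is a hypothetical stand-in that applies the same key to a list rather than a directory.

```python
def sort_numeric(names):
    # Same key as sorted_fns, applied to a list instead of os.listdir().
    return sorted(names, key=lambda x: int(x.split('.')[0]))

names = ['10.jpg', '2.jpg', '1.jpg', '0.jpg']

lexicographic = sorted(names)   # plain string order
numeric = sort_numeric(names)   # order by the integer before the dot

print(lexicographic)
print(numeric)
```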

Build the generator

dataGenerator = DataGenerator(image_paths, annot_paths)
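
With the default batch_size of 32 and the 200 images mentioned at the start, `__len__` rounds down, so a few images are left out of each epoch. The arithmetic below is just a worked check of that formula. Because the indexes are reshuffled every epoch, the skipped images would generally differ from one epoch to the next.

```python
import math

n_images = 200    # dataset size stated at the top of this article
batch_size = 32   # DataGenerator default

# Same formula as DataGenerator.__len__
batches_per_epoch = math.floor(n_images / batch_size)
unused_per_epoch = n_images - batches_per_epoch * batch_size

print(batches_per_epoch, unused_per_epoch)
```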

Fetch one batch

batch_X, batch_y = dataGenerator.__getitem__(1)

Plot one image from the shuffled batch

import matplotlib.pyplot as plt

plt.imshow(batch_X[0].astype(int))


Look at the square mask. Since labels = ['circle', 'square', 'star', 'triangle'], the square mask is the channel at index 1.

plt.imshow(batch_y[0, :, :, 1].astype(int))


Look at the star mask

plt.imshow(batch_y[0, :, :, 2].astype(int))


Finally, look at the background mask, which is the last channel (index 4)

plt.imshow(batch_y[0, :, :, 4].astype(int))


That is the complete code for the generator.

Last modified: June 1, 2021, 02:33 PM