If you train a Keras model with fit_generator, the generator you pass in must be either a plain Python generator or an instance of keras.utils.Sequence.
keras.utils.Sequence is safer with multiprocessing: it guarantees the network trains on each sample only once per epoch, which a plain Python generator cannot guarantee.
Things to note when using a Sequence:
- A Sequence must implement the __getitem__ and __len__ methods.
- If you want to modify the dataset between epochs, also implement the on_epoch_end method.
- __getitem__ must return a complete batch: it receives the index of the batch (idx) and returns the corresponding X and y.
Below is a demo that trains on MNIST with fit_generator.
Import the required packages:
import math
import numpy as np
from tensorflow import keras
Build a Sequence subclass that produces MNIST batches:
- __init__: stores the dataset and the batch_size
- __len__: returns the number of batches in an epoch
- __getitem__: returns the batch of data for index idx
class MNISTSequence(keras.utils.Sequence):
    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # number of batches per epoch; ceil keeps the final partial batch
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        # return the idx-th batch as a complete (X, y) pair
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        return batch_x, batch_y
Next, instantiate it:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data(path='mnist.npz')
mnist_sequence = MNISTSequence(x_train, y_train, 20)
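As a quick sanity check on the batch arithmetic, the expected number of steps per epoch can be computed with a dummy array of MNIST's training-set size (a standalone sketch that needs neither TensorFlow nor the dataset download):

```python
import math

import numpy as np

# Dummy stand-in with the MNIST training-set size (60000 samples);
# only its length matters for the batch count.
x_dummy = np.zeros((60000, 28, 28), dtype=np.uint8)

batch_size = 20
steps_per_epoch = math.ceil(len(x_dummy) / batch_size)
print(steps_per_epoch)  # 3000
```

60000 / 20 = 3000 batches, which is exactly the step count the progress bar reports during training.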
Build the model:
model = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(10)
])
model.compile(
    optimizer=keras.optimizers.Adam(0.001),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
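Here from_logits=True tells the loss that the last Dense layer emits raw scores rather than probabilities, so the loss applies softmax internally before taking the negative log-likelihood. The arithmetic can be sketched in plain numpy (a hypothetical 3-class example, not part of the MNIST demo):

```python
import numpy as np

# Hypothetical raw scores (logits) for one sample over 3 classes;
# the true class is index 2.
logits = np.array([2.0, 1.0, 3.0])
true_class = 2

# What from_logits=True does internally: softmax, then
# negative log-likelihood of the true class.
probs = np.exp(logits - logits.max())   # subtract max for numerical stability
probs /= probs.sum()
loss = -np.log(probs[true_class])
print(round(loss, 4))  # 0.4076
```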
Train the model. (Note: in TensorFlow 2, fit_generator is deprecated; model.fit accepts a Sequence directly.)
model.fit_generator(mnist_sequence, epochs=2)
Epoch 1/2
3000/3000 [==============================] - 8s 3ms/step - loss: 1.9532 - sparse_categorical_accuracy: 0.8545
Epoch 2/2
3000/3000 [==============================] - 8s 3ms/step - loss: 0.3445 - sparse_categorical_accuracy: 0.9130
If you want to shuffle the dataset at the end of every epoch, change the Sequence to:
class MNISTSequence(keras.utils.Sequence):
    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        return batch_x, batch_y

    def on_epoch_end(self):
        # shuffle x and y with the same permutation so pairs stay aligned
        idxs = np.arange(len(self.x))
        np.random.shuffle(idxs)
        self.x = self.x[idxs]
        self.y = self.y[idxs]
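The key point of this shuffle is that one permutation index array reorders x and y together, so each label stays attached to its sample. This can be verified in isolation with a standalone sketch (using a plain class in place of keras.utils.Sequence so it runs without TensorFlow):

```python
import math

import numpy as np

# Same Sequence logic as above, minus the Keras base class.
class ShuffledSequence:
    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        s = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        return self.x[s], self.y[s]

    def on_epoch_end(self):
        # one permutation applied to both arrays keeps (x, y) pairs aligned
        idxs = np.arange(len(self.x))
        np.random.shuffle(idxs)
        self.x = self.x[idxs]
        self.y = self.y[idxs]

x = np.arange(8)
y = x * 10                          # y is a deterministic function of x
seq = ShuffledSequence(x, y, batch_size=3)
seq.on_epoch_end()                  # reorder the data
bx, by = seq[0]
print(np.array_equal(by, bx * 10))  # True: alignment survives the shuffle
```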