In Keras, you build models by composing layers. A model is a graph of layers. The most common type of model is a stack of layers: tf.keras.Sequential.

import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential()
# Adds a densely-connected layer with 64 units to the model:
model.add(layers.Dense(64, activation='relu'))
# Add another:
model.add(layers.Dense(64, activation='relu'))
# Add a softmax layer with 10 output units:
model.add(layers.Dense(10, activation='softmax'))

Common constructor arguments shared by tf.keras.layers:

  • activation: the activation function for the layer, specified by the name of a built-in function or as a callable object.
  • kernel_initializer and bias_initializer: the initialization schemes for the layer's weights (kernel and bias), given as a name or a callable object.
  • kernel_regularizer and bias_regularizer: the regularization schemes applied to the layer's weights (kernel and bias), such as L1 or L2 regularization.

# Create a sigmoid layer:
layers.Dense(64, activation='sigmoid')
# Or:
layers.Dense(64, activation=tf.sigmoid)

# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l1(0.01))

# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
layers.Dense(64, bias_regularizer=tf.keras.regularizers.l2(0.01))

# A linear layer with a kernel initialized to a random orthogonal matrix:
layers.Dense(64, kernel_initializer='orthogonal')

# A linear layer with a bias vector initialized to 2.0s:
layers.Dense(64, bias_initializer=tf.keras.initializers.constant(2.0))

Train and evaluate

Set up training

After the model is constructed, configure its learning process by calling the compile method:

model = tf.keras.Sequential([
# Adds a densely-connected layer with 64 units to the model:
layers.Dense(64, activation='relu'),
# Add another:
layers.Dense(64, activation='relu'),
# Add a softmax layer with 10 output units:
layers.Dense(10, activation='softmax')])

model.compile(optimizer=tf.train.AdamOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

tf.keras.Model.compile takes three important arguments:

  • optimizer: an optimizer instance from the tf.train module, such as tf.train.AdamOptimizer, tf.train.RMSPropOptimizer, or tf.train.GradientDescentOptimizer.
  • loss: the loss function to minimize. Common choices include mean squared error (mse), categorical_crossentropy, and binary_crossentropy (a sketch with alternative settings follows this list).
  • metrics: the metrics used to monitor training.
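
As a minimal sketch (not part of the original walkthrough), the same compile call could be configured for mean-squared-error regression or for categorical classification with explicit loss and metric objects; the learning rates below are arbitrary assumptions:

# Configure a model for mean-squared-error regression.
model.compile(optimizer=tf.train.AdamOptimizer(0.01),
              loss='mse',       # mean squared error
              metrics=['mae'])  # mean absolute error

# Configure a model for categorical classification.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.01),
              loss=tf.keras.losses.categorical_crossentropy,
              metrics=[tf.keras.metrics.categorical_accuracy])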

For small datasets, you can train with in-memory NumPy arrays. Use the fit method to fit the model to the training data. tf.keras.Model.fit takes three important arguments:

  • epochs: training is structured into epochs; an epoch is one iteration over the entire input data.
  • batch_size: this integer specifies the size of each batch.
  • validation_data: a validation set used to monitor the model's performance on held-out data during training.
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

val_data = np.random.random((100, 32))
val_labels = np.random.random((100, 10))

model.fit(data, labels, epochs=10, batch_size=32,
          validation_data=(val_data, val_labels))



Train on 1000 samples, validate on 100 samples
Epoch 1/10
1000/1000 [==============================] - 0s 124us/step - loss: 11.5267 - categorical_accuracy: 0.1070 - val_loss: 11.0015 - val_categorical_accuracy: 0.0500
Epoch 2/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5243 - categorical_accuracy: 0.0840 - val_loss: 10.9809 - val_categorical_accuracy: 0.1200
Epoch 3/10
1000/1000 [==============================] - 0s 73us/step - loss: 11.5213 - categorical_accuracy: 0.1000 - val_loss: 10.9945 - val_categorical_accuracy: 0.0800
Epoch 4/10
1000/1000 [==============================] - 0s 73us/step - loss: 11.5213 - categorical_accuracy: 0.1080 - val_loss: 10.9967 - val_categorical_accuracy: 0.0700
Epoch 5/10
1000/1000 [==============================] - 0s 73us/step - loss: 11.5181 - categorical_accuracy: 0.1150 - val_loss: 11.0184 - val_categorical_accuracy: 0.0500
Epoch 6/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5177 - categorical_accuracy: 0.1150 - val_loss: 10.9892 - val_categorical_accuracy: 0.0200
Epoch 7/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5130 - categorical_accuracy: 0.1320 - val_loss: 11.0038 - val_categorical_accuracy: 0.0500
Epoch 8/10
1000/1000 [==============================] - 0s 74us/step - loss: 11.5123 - categorical_accuracy: 0.1130 - val_loss: 11.0065 - val_categorical_accuracy: 0.0100
Epoch 9/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5076 - categorical_accuracy: 0.1150 - val_loss: 11.0062 - val_categorical_accuracy: 0.0800
Epoch 10/10
1000/1000 [==============================] - 0s 67us/step - loss: 11.5035 - categorical_accuracy: 0.1390 - val_loss: 11.0241 - val_categorical_accuracy: 0.1100

Use the Datasets API to scale to large datasets or multi-device training. Pass a tf.data.Dataset instance to the fit method.
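
A minimal sketch, reusing the data and labels arrays defined above: wrap them in a Dataset, batch and repeat it, and pass it to fit. Because a repeated dataset has no defined length, steps_per_epoch (here 30, an arbitrary choice) tells fit how many batches make up one epoch:

# Instantiate a toy dataset from the in-memory arrays:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()

# `steps_per_epoch` is required when fitting on a repeated dataset.
model.fit(dataset, epochs=10, steps_per_epoch=30)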

The tf.keras.Model.evaluate and tf.keras.Model.predict methods can evaluate and predict using NumPy arrays or a tf.data.Dataset.
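
For example, a sketch reusing the NumPy arrays and the dataset from above (the batch size and step count are assumptions):

# Evaluate on NumPy data:
model.evaluate(data, labels, batch_size=32)

# Evaluate on a dataset; a step count is needed for a repeated dataset:
model.evaluate(dataset, steps=30)

# Predict returns a NumPy array of predictions, here of shape (1000, 10):
result = model.predict(data, batch_size=32)
print(result.shape)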

The tf.keras.Sequential model is a simple stack of layers and cannot represent arbitrary models. Use the Keras functional API to build complex models:

inputs = tf.keras.Input(shape=(32,))  # Returns a placeholder tensor

# A layer instance is callable on a tensor, and returns a tensor.
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dense(64, activation='relu')(x)
predictions = layers.Dense(10, activation='softmax')(x)

# Instantiate the model given inputs and outputs.
model = tf.keras.Model(inputs=inputs, outputs=predictions)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs
model.fit(data, labels, batch_size=32, epochs=5)

Model subclassing

Create layers in the __init__ method and set them as attributes of the class instance. Define the forward pass in the call method.

class MyModel(tf.keras.Model):

  def __init__(self, num_classes=10):
    super(MyModel, self).__init__(name='my_model')
    self.num_classes = num_classes
    # Define your layers here.
    self.dense_1 = layers.Dense(32, activation='relu')
    self.dense_2 = layers.Dense(num_classes, activation='sigmoid')

  def call(self, inputs):
    # Define your forward pass here,
    # using layers you previously defined (in `__init__`).
    x = self.dense_1(inputs)
    return self.dense_2(x)

  def compute_output_shape(self, input_shape):
    # You need to override this function if you want to use the subclassed model
    # as part of a functional-style model.
    # Otherwise, this method is optional.
    shape = tf.TensorShape(input_shape).as_list()
    shape[-1] = self.num_classes
    return tf.TensorShape(shape)



model = MyModel(num_classes=10)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs.
model.fit(data, labels, batch_size=32, epochs=5)

Create a custom layer by subclassing tf.keras.layers.Layer and implementing the following methods:

  • build: creates the weights of the layer. Add weights with the add_weight method.
  • call: defines the forward pass.
  • compute_output_shape: specifies how to compute the layer's output shape given the input shape.
  • Optionally, a layer can be serialized by implementing the get_config method and the from_config class method.
class MyLayer(layers.Layer):

  def __init__(self, output_dim, **kwargs):
    self.output_dim = output_dim
    super(MyLayer, self).__init__(**kwargs)

  def build(self, input_shape):
    shape = tf.TensorShape((input_shape[1], self.output_dim))
    # Create a trainable weight variable for this layer.
    self.kernel = self.add_weight(name='kernel',
                                  shape=shape,
                                  initializer='uniform',
                                  trainable=True)
    # Be sure to call this at the end
    super(MyLayer, self).build(input_shape)

  def call(self, inputs):
    return tf.matmul(inputs, self.kernel)

  def compute_output_shape(self, input_shape):
    shape = tf.TensorShape(input_shape).as_list()
    shape[-1] = self.output_dim
    return tf.TensorShape(shape)

  def get_config(self):
    base_config = super(MyLayer, self).get_config()
    base_config['output_dim'] = self.output_dim
    return base_config

  @classmethod
  def from_config(cls, config):
    return cls(**config)


model = tf.keras.Sequential([
    MyLayer(10),
    layers.Activation('softmax')])

# The compile step specifies the training configuration
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs.
model.fit(data, labels, batch_size=32, epochs=5)

A callback is an object passed to a model to customize and extend its behavior during training. You can write your own custom callback or use the built-in tf.keras.callbacks, which include:

  • tf.keras.callbacks.ModelCheckpoint: saves checkpoints of the model at regular intervals.
  • tf.keras.callbacks.LearningRateScheduler: dynamically changes the learning rate.
  • tf.keras.callbacks.EarlyStopping: interrupts training when validation performance has stopped improving.
  • tf.keras.callbacks.TensorBoard: monitors the model's behavior using TensorBoard.

To use a tf.keras.callbacks.Callback, pass it to the model's fit method:
callbacks = [
  # Interrupt training if `val_loss` stops improving for over 2 epochs
  tf.keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
  # Write TensorBoard logs to `./logs` directory
  tf.keras.callbacks.TensorBoard(log_dir='./logs')
]
model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks,
          validation_data=(val_data, val_labels))
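
Similarly, ModelCheckpoint and LearningRateScheduler could be added to the same list; the sketch below assumes a ./checkpoints directory already exists and uses an arbitrary step-decay schedule:

callbacks = [
  # Save the weights after every epoch (the path pattern is an assumption)
  tf.keras.callbacks.ModelCheckpoint('./checkpoints/weights.{epoch:02d}.h5',
                                     save_weights_only=True),
  # Drop the learning rate by 10x after epoch 3 (the schedule is an assumption)
  tf.keras.callbacks.LearningRateScheduler(
      lambda epoch: 0.001 if epoch < 3 else 0.0001)
]
model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks,
          validation_data=(val_data, val_labels))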

Save and restore

(1) Weights only. Use tf.keras.Model.save_weights to save and load the model's weights.

model = tf.keras.Sequential([
layers.Dense(64, activation='relu'),
layers.Dense(10, activation='softmax')])

model.compile(optimizer=tf.train.AdamOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])


# Save weights to a TensorFlow Checkpoint file
model.save_weights('./weights/my_model')

# Restore the model's state,
# this requires a model with the same architecture.
model.load_weights('./weights/my_model')

By default, the model's weights are saved in the TensorFlow checkpoint file format. Weights can also be saved in the Keras HDF5 format (the default for the multi-backend implementation of Keras).

# Save weights to an HDF5 file
model.save_weights('my_model.h5', save_format='h5')

# Restore the model's state
model.load_weights('my_model.h5')

(2) Configuration only. You can save just the model's configuration, which serializes the model's architecture without any weights. Even without the code that defined the original model, the saved configuration can recreate and initialize the same model. Keras supports the JSON and YAML serialization formats:

# Serialize a model to JSON format
json_string = model.to_json()
json_string


'{"backend": "tensorflow", "keras_version": "2.1.6-tf", "config": {"name": "sequential_3", "layers": [{"config": {"units": 64, "kernel_regularizer": null, "activation": "relu", "bias_constraint": null, "trainable": true, "use_bias": true, "bias_initializer": {"config": {"dtype": "float32"}, "class_name": "Zeros"}, "activity_regularizer": null, "dtype": null, "kernel_constraint": null, "kernel_initializer": {"config": {"mode": "fan_avg", "seed": null, "distribution": "uniform", "scale": 1.0, "dtype": "float32"}, "class_name": "VarianceScaling"}, "name": "dense_17", "bias_regularizer": null}, "class_name": "Dense"}, {"config": {"units": 10, "kernel_regularizer": null, "activation": "softmax", "bias_constraint": null, "trainable": true, "use_bias": true, "bias_initializer": {"config": {"dtype": "float32"}, "class_name": "Zeros"}, "activity_regularizer": null, "dtype": null, "kernel_constraint": null, "kernel_initializer": {"config": {"mode": "fan_avg", "seed": null, "distribution": "uniform", "scale": 1.0, "dtype": "float32"}, "class_name": "VarianceScaling"}, "name": "dense_18", "bias_regularizer": null}, "class_name": "Dense"}]}, "class_name": "Sequential"}'



{'backend': 'tensorflow',
 'class_name': 'Sequential',
 'config': {'layers': [{'class_name': 'Dense',
                        'config': {'activation': 'relu',
                                   'activity_regularizer': None,
                                   'bias_constraint': None,
                                   'bias_initializer': {'class_name': 'Zeros',
                                                        'config': {'dtype': 'float32'}},
                                   'bias_regularizer': None,
                                   'dtype': None,
                                   'kernel_constraint': None,
                                   'kernel_initializer': {'class_name': 'VarianceScaling',
                                                          'config': {'distribution': 'uniform',
                                                                     'dtype': 'float32',
                                                                     'mode': 'fan_avg',
                                                                     'scale': 1.0,
                                                                     'seed': None}},
                                   'kernel_regularizer': None,
                                   'name': 'dense_17',
                                   'trainable': True,
                                   'units': 64,
                                   'use_bias': True}},
                       {'class_name': 'Dense',
                        'config': {'activation': 'softmax',
                                   'activity_regularizer': None,
                                   'bias_constraint': None,
                                   'bias_initializer': {'class_name': 'Zeros',
                                                        'config': {'dtype': 'float32'}},
                                   'bias_regularizer': None,
                                   'dtype': None,
                                   'kernel_constraint': None,
                                   'kernel_initializer': {'class_name': 'VarianceScaling',
                                                          'config': {'distribution': 'uniform',
                                                                     'dtype': 'float32',
                                                                     'mode': 'fan_avg',
                                                                     'scale': 1.0,
                                                                     'seed': None}},
                                   'kernel_regularizer': None,
                                   'name': 'dense_18',
                                   'trainable': True,
                                   'units': 10,
                                   'use_bias': True}}],
            'name': 'sequential_3'},
 'keras_version': '2.1.6-tf'}


# Recreate the model from the JSON string (freshly initialized)
fresh_model = tf.keras.models.model_from_json(json_string)



# Serialize the model to YAML format
yaml_string = model.to_yaml()
print(yaml_string)


backend: tensorflow
class_name: Sequential
config:
  layers:
  - class_name: Dense
    config:
      activation: relu
      activity_regularizer: null
      bias_constraint: null
      bias_initializer:
        class_name: Zeros
        config: {dtype: float32}
      bias_regularizer: null
      dtype: null
      kernel_constraint: null
      kernel_initializer:
        class_name: VarianceScaling
        config: {distribution: uniform, dtype: float32, mode: fan_avg, scale: 1.0,
          seed: null}
      kernel_regularizer: null
      name: dense_17
      trainable: true
      units: 64
      use_bias: true
  - class_name: Dense
    config:
      activation: softmax
      activity_regularizer: null
      bias_constraint: null
      bias_initializer:
        class_name: Zeros
        config: {dtype: float32}
      bias_regularizer: null
      dtype: null
      kernel_constraint: null
      kernel_initializer:
        class_name: VarianceScaling
        config: {distribution: uniform, dtype: float32, mode: fan_avg, scale: 1.0,
          seed: null}
      kernel_regularizer: null
      name: dense_18
      trainable: true
      units: 10
      use_bias: true
  name: sequential_3
keras_version: 2.1.6-tf


# Recreate the model from the YAML string

fresh_model = tf.keras.models.model_from_yaml(yaml_string)

Note: subclassed models are not serializable in this way, because their architecture is defined by the Python code in the body of the call method.

(3) The entire model. The entire model can be saved to a single file that contains the weight values, the model's configuration, and even the optimizer's configuration. This lets you checkpoint a model and later resume training from exactly the same state, without access to the original code.

# Create a trivial model
model = tf.keras.Sequential([
  layers.Dense(10, activation='softmax', input_shape=(32,)),
  layers.Dense(10, activation='softmax')
])
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels, batch_size=32, epochs=5)

# Save entire model to a HDF5 file
model.save('my_model.h5')

# Recreate the exact same model, including weights and optimizer.
model = tf.keras.models.load_model('my_model.h5')


Epoch 1/5
1000/1000 [==============================] - 0s 297us/step - loss: 11.5009 - acc: 0.0980
Epoch 2/5
1000/1000 [==============================] - 0s 76us/step - loss: 11.4844 - acc: 0.0960
Epoch 3/5
1000/1000 [==============================] - 0s 77us/step - loss: 11.4791 - acc: 0.0850
Epoch 4/5
1000/1000 [==============================] - 0s 78us/step - loss: 11.4771 - acc: 0.1020
Epoch 5/5
1000/1000 [==============================] - 0s 79us/step - loss: 11.4763 - acc: 0.0900