Notes:

Based primarily on Francois Chollet's "Deep Learning with Python".

The code runs in Kaggle kernels.

The IMDB dataset needs to be added to the kernel manually.

For background on recurrent networks and LSTMs, see: [Deep Learning]: Recurrent Neural Networks (RNN) and [Deep Learning]: Long-Term Dependencies and LSTM.

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here are several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.
['imdb.npz']

I. Implementing a simple RNN forward pass with NumPy

timesteps = 100
input_features = 32
output_features = 64

# The input has 100 timesteps, each a 32-dimensional vector
inputs = np.random.random((timesteps,input_features))
state_t = np.zeros((output_features,))

W = np.random.random((output_features,input_features)) # weights for the input
U = np.random.random((output_features,output_features)) # weights for the state
b = np.random.random((output_features,)) # bias

successive_outputs = []
for input_t in inputs:
    # iterate over the timesteps
    # output_t is a 64-dimensional vector
    output_t = np.tanh(np.dot(W,input_t)+np.dot(U,state_t)+b)
    # save the output at the current timestep
    successive_outputs.append(output_t)
    # the current output becomes the state for the next timestep
    state_t = output_t

# stack into a (timesteps, output_features) array; np.concatenate along
# axis 0 would instead flatten the outputs into a single 6400-dim vector
final_output_sequence = np.stack(successive_outputs,axis=0)
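
As a quick sanity check (an addition, not part of the original listing), the shapes can be verified directly:

print(final_output_sequence.shape)  # (100, 64): one 64-dim output per timestep
print(state_t.shape)                # (64,): the state after the last timestep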

II. Recurrent layers in Keras

The RNN implemented above corresponds to Keras's SimpleRNN layer. The main difference is that SimpleRNN processes batches of sequences, so its input has shape (batch_size,timesteps,input_features) rather than (timesteps,input_features).
Like the loop above, SimpleRNN can return the output at every timestep, or only the output at the last timestep.

1. Return only the output at the last timestep

from keras.models import Sequential
from keras.layers import Embedding,SimpleRNN
model = Sequential()
model.add(Embedding(10000,32))
model.add(SimpleRNN(32))

2. Return the output at every timestep

model = Sequential()
model.add(Embedding(10000,32))
model.add(SimpleRNN(32,return_sequences=True))
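
The difference shows up directly in the output shapes (a quick check added here; Sequential.output_shape behaves this way in Keras 2.x, though the exact printout varies across versions):

last_only = Sequential()
last_only.add(Embedding(10000,32))
last_only.add(SimpleRNN(32))
print(last_only.output_shape)  # (None, 32): one 32-dim vector per sequence

full_seq = Sequential()
full_seq.add(Embedding(10000,32))
full_seq.add(SimpleRNN(32,return_sequences=True))
print(full_seq.output_shape)   # (None, None, 32): one 32-dim vector per timestep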

3. Stacking multiple SimpleRNN layers

When stacking recurrent layers, every layer except the last must set return_sequences=True so that the next recurrent layer receives a full 3D sequence as input.

model = Sequential()
model.add(Embedding(10000,32))
model.add(SimpleRNN(32,return_sequences=True))
model.add(SimpleRNN(32,return_sequences=True))
model.add(SimpleRNN(32,return_sequences=True))
model.add(SimpleRNN(32))

III. Modeling IMDB movie reviews with an RNN

1. Prepare the data

from keras.datasets import imdb
from keras.preprocessing import sequence

max_features=10000
maxlen = 500
batch_size=32

print('Loading data...')
(input_train,y_train),(input_test,y_test) = imdb.load_data(path='/kaggle/input/imdb.npz',
                                                           num_words=max_features)
print(len(input_train),'train sequences')
print(len(input_test),'test sequences')

print('Pad sequences (samples x time)')
input_train = sequence.pad_sequences(input_train,maxlen=maxlen)
input_test = sequence.pad_sequences(input_test,maxlen=maxlen)
print('input_train shape:',input_train.shape)
print('input_test shape:',input_test.shape)
Loading data...
25000 train sequences
25000 test sequences
Pad sequences (samples x time)
input_train shape: (25000, 500)
input_test shape: (25000, 500)
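
pad_sequences cuts every review to at most maxlen words and pads shorter ones with zeros. A tiny illustration (an added example, not part of the original post) of the default pre-padding and pre-truncation:

demo = sequence.pad_sequences([[1,2,3],[4,5,6,7,8,9]],maxlen=4)
print(demo)
# [[0 1 2 3]
#  [6 7 8 9]]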

2. Build and train the model

from keras.layers import Dense

model = Sequential()
model.add(Embedding(max_features,32))
model.add(SimpleRNN(32))
model.add(Dense(1,activation='sigmoid'))
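
Before training, model.summary() is a quick way to check the architecture (an added sanity check; the exact printout varies by Keras version). Nearly all of the roughly 322k parameters sit in the Embedding layer:

model.summary()
# Embedding: 10000*32 = 320,000 params
# SimpleRNN: 32*32 (input) + 32*32 (recurrent) + 32 (bias) = 2,080 params
# Dense:     32 + 1 = 33 params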

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(input_train,y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)
Train on 20000 samples, validate on 5000 samples
Epoch 1/10
20000/20000 [==============================] - 24s 1ms/step - loss: 0.6481 - acc: 0.6028 - val_loss: 0.4828 - val_acc: 0.7812
...
Epoch 10/10
20000/20000 [==============================] - 23s 1ms/step - loss: 0.0210 - acc: 0.9941 - val_loss: 0.6618 - val_acc: 0.8160
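
Training accuracy approaches 99% while validation accuracy plateaus around 82%, so the model is overfitting. For a final number on held-out data, the model can be evaluated on the test set (a sketch, not part of the original run):

test_loss,test_acc = model.evaluate(input_test,y_test)
print('test acc:',test_acc)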

3. Plot the curves

import matplotlib.pyplot as plt
%matplotlib inline
def plot_curve(history):
    acc = history.history['acc']
    val_acc = history.history['val_acc']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    
    epochs = range(1,len(acc)+1)
    
    plt.plot(epochs,acc,'bo',label='Training acc')
    plt.plot(epochs,val_acc,'b',label='Validation acc')
    plt.title('Training and validation accuracy')
    plt.legend()
    
    plt.figure()
    
    plt.plot(epochs,loss,'bo',label='Training loss')
    plt.plot(epochs,val_loss,'b',label='Validation loss')
    plt.title('Training and validation loss')
    plt.legend()
    
plot_curve(history)

[Figures: training/validation accuracy and loss curves for the SimpleRNN model]

IV. The LSTM layer

SimpleRNN handles long sequences poorly: as gradients are propagated back through many timesteps they shrink toward zero (the vanishing-gradient problem), so information from early timesteps is effectively lost. The LSTM adds a carry (cell) state that lets information flow across many timesteps, which makes it much better suited to longer sequences.
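
For intuition, here is a minimal NumPy sketch of a single LSTM step, in the same spirit as the loop in section I. The weight names (Wi, Ui, bi, ...) and helper functions are illustrative only; Keras's actual implementation differs in initialization, bias handling, and how the computations are fused:

def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))

def rand_params():
    W = np.random.random((output_features,input_features))
    U = np.random.random((output_features,output_features))
    b = np.random.random((output_features,))
    return W,U,b

Wi,Ui,bi = rand_params() # input gate
Wf,Uf,bf = rand_params() # forget gate
Wo,Uo,bo = rand_params() # output gate
Wc,Uc,bc = rand_params() # candidate state

x_t = np.random.random((input_features,)) # input at time t
h_t = np.zeros((output_features,))        # previous output (hidden state)
c_t = np.zeros((output_features,))        # previous carry (cell) state

i = sigmoid(np.dot(Wi,x_t)+np.dot(Ui,h_t)+bi)
f = sigmoid(np.dot(Wf,x_t)+np.dot(Uf,h_t)+bf)
o = sigmoid(np.dot(Wo,x_t)+np.dot(Uo,h_t)+bo)
c_tilde = np.tanh(np.dot(Wc,x_t)+np.dot(Uc,h_t)+bc)

c_t = f*c_t + i*c_tilde  # carry state: gated mix of old and new information
h_t = o*np.tanh(c_t)     # output at time t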

from keras.layers import LSTM

model = Sequential()
model.add(Embedding(max_features,32))
model.add(LSTM(32))
model.add(Dense(1,activation='sigmoid'))

model.compile(optimizer='rmsprop',
             loss='binary_crossentropy',
             metrics=['acc'])

history = model.fit(input_train,y_train,
                   epochs = 10,
                   batch_size=128,
                   validation_split=0.2)
Train on 20000 samples, validate on 5000 samples
Epoch 1/10
20000/20000 [==============================] - 65s 3ms/step - loss: 0.5227 - acc: 0.7557 - val_loss: 0.4223 - val_acc: 0.8082
...
Epoch 10/10
20000/20000 [==============================] - 65s 3ms/step - loss: 0.1075 - acc: 0.9630 - val_loss: 0.3759 - val_acc: 0.8838
plot_curve(history)

[Figures: training/validation accuracy and loss curves for the LSTM model]