1. Summary

One-sentence summary:

model.add(Embedding(max_features, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation='sigmoid'))

The full code:

from tensorflow.keras.layers import Dense, Embedding, SimpleRNN
from tensorflow.keras import Sequential

model = Sequential()
model.add(Embedding(max_features, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation='sigmoid'))
model.summary()
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

1. Why doesn't our small recurrent network perform very well compared to this baseline (it only reaches about 85% validation accuracy)?


【Inputs only consider the first 500 words】: Part of the problem is that our inputs only consider the first 500 words rather than the full sequences, so our RNN has access to less information than our earlier baseline model.
【SimpleRNN isn't very good at processing long sequences】: The rest of the problem is simply that SimpleRNN isn't very good at processing long sequences such as text. Other types of recurrent layers perform much better.

2. 6.2-2 Recurrent Neural Networks: IMDB Movie Review Classification Example

Video location of the corresponding course lesson:

from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

max_features = 10000  # number of words to consider as features
maxlen = 500          # cut each text after this many words (among the top max_features most common words)
batch_size = 32

print('Loading data...')
(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=max_features)
print(len(input_train), 'train sequences')
print(len(input_test), 'test sequences')
Loading data...
25000 train sequences
25000 test sequences
In [3]:
print(input_train[0])
print(y_train[0])
[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65, 458, 4468, 66, 3941, 4, 173, 36, 256, 5, 25, 100, 43, 838, 112, 50, 670, 2, 9, 35, 480, 284, 5, 150, 4, 172, 112, 167, 2, 336, 385, 39, 4, 172, 4536, 1111, 17, 546, 38, 13, 447, 4, 192, 50, 16, 6, 147, 2025, 19, 14, 22, 4, 1920, 4613, 469, 4, 22, 71, 87, 12, 16, 43, 530, 38, 76, 15, 13, 1247, 4, 22, 17, 515, 17, 12, 16, 626, 18, 2, 5, 62, 386, 12, 8, 316, 8, 106, 5, 4, 2223, 5244, 16, 480, 66, 3785, 33, 4, 130, 12, 16, 38, 619, 5, 25, 124, 51, 36, 135, 48, 25, 1415, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 407, 16, 82, 2, 8, 4, 107, 117, 5952, 15, 256, 4, 2, 7, 3766, 5, 723, 36, 71, 43, 530, 476, 26, 400, 317, 46, 7, 4, 2, 1029, 13, 104, 88, 4, 381, 15, 297, 98, 32, 2071, 56, 26, 141, 6, 194, 7486, 18, 4, 226, 22, 21, 134, 476, 26, 480, 5, 144, 30, 5535, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 1334, 88, 12, 16, 283, 5, 16, 4472, 113, 103, 32, 15, 16, 5345, 19, 178, 32]
1
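Each review has already been encoded as a list of word indices (more frequent words get lower indices), and the label is 0 or 1. As a quick sanity check, the indices can be mapped back to words; the following is a minimal sketch (not part of the original notebook), assuming the usual convention that imdb.load_data reserves indices 0-2 for padding/start/unknown, so real word indices are offset by 3:

# Minimal sketch: decode an encoded review back into (roughly) readable text.
# Indices 0, 1, 2 are assumed to be reserved for padding / start / unknown,
# so every data index is shifted by 3 before looking it up.
word_index = imdb.get_word_index()                          # word -> index
reverse_word_index = {i: w for w, i in word_index.items()}  # index -> word
decoded = ' '.join(reverse_word_index.get(i - 3, '?') for i in input_train[0])
print(decoded[:200])  # first 200 characters of the decoded review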
In [4]:
print('Pad sequences (samples x time)')
# pad/truncate every review so it is exactly maxlen (500) words long
input_train = sequence.pad_sequences(input_train, maxlen=maxlen)
input_test = sequence.pad_sequences(input_test, maxlen=maxlen)
print('input_train shape:', input_train.shape)
print('input_test shape:', input_test.shape)
Pad sequences (samples x time)
input_train shape: (25000, 500)
input_test shape: (25000, 500)
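Note that pad_sequences pads shorter reviews with zeros at the front by default (padding='pre') and also truncates longer reviews from the front (truncating='pre'), which is why leading zeros appear before the word indices in the printouts below. A tiny illustration of the default behaviour (not from the original notebook):

# Default pad_sequences behaviour: pad at the front, truncate at the front.
demo = sequence.pad_sequences([[1, 2, 3], [1, 2, 3, 4, 5, 6]], maxlen=5)
print(demo)
# [[0 0 1 2 3]
#  [2 3 4 5 6]]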
In [5]:
print(input_train[0])
[   0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    1   14   22   16   43  530  973 1622 1385   65  458 4468
   66 3941    4  173   36  256    5   25  100   43  838  112   50  670
    2    9   35  480  284    5  150    4  172  112  167    2  336  385
   39    4  172 4536 1111   17  546   38   13  447    4  192   50   16
    6  147 2025   19   14   22    4 1920 4613  469    4   22   71   87
   12   16   43  530   38   76   15   13 1247    4   22   17  515   17
   12   16  626   18    2    5   62  386   12    8  316    8  106    5
    4 2223 5244   16  480   66 3785   33    4  130   12   16   38  619
    5   25  124   51   36  135   48   25 1415   33    6   22   12  215
   28   77   52    5   14  407   16   82    2    8    4  107  117 5952
   15  256    4    2    7 3766    5  723   36   71   43  530  476   26
  400  317   46    7    4    2 1029   13  104   88    4  381   15  297
   98   32 2071   56   26  141    6  194 7486   18    4  226   22   21
  134  476   26  480    5  144   30 5535   18   51   36   28  224   92
   25  104    4  226   65   16   38 1334   88   12   16  283    5   16
 4472  113  103   32   15   16 5345   19  178   32]
In [6]:
print(input_train[1])
[   0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    1  194 1153  194 8255   78  228    5    6 1463 4369
 5012  134   26    4  715    8  118 1634   14  394   20   13  119  954
  189  102    5  207  110 3103   21   14   69  188    8   30   23    7
    4  249  126   93    4  114    9 2300 1523    5  647    4  116    9
   35 8163    4  229    9  340 1322    4  118    9    4  130 4901   19
    4 1002    5   89   29  952   46   37    4  455    9   45   43   38
 1543 1905  398    4 1649   26 6853    5  163   11 3215    2    4 1153
    9  194  775    7 8255    2  349 2637  148  605    2 8003   15  123
  125   68    2 6853   15  349  165 4362   98    5    4  228    9   43
    2 1157   15  299  120    5  120  174   11  220  175  136   50    9
 4373  228 8255    5    2  656  245 2350    5    4 9837  131  152  491
   18    2   32 7464 1212   14    9    6  371   78   22  625   64 1382
    9    8  168  145   23    4 1690   15   16    4 1355    5   28    6
   52  154  462   33   89   78  285   16  145   95]

Train a model using an Embedding layer and a SimpleRNN layer

In [11]:
from tensorflow.keras.layers import Dense, Embedding, SimpleRNN
from tensorflow.keras import Sequential

model = Sequential()
model.add(Embedding(max_features, 32))       # 32-d embedding for each of the top max_features word indices
model.add(SimpleRNN(32))                     # a single recurrent layer; returns only its final output
model.add(Dense(1, activation='sigmoid'))    # binary sentiment prediction
model.summary()
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_1 (Embedding)      (None, None, 32)          320000    
_________________________________________________________________
simple_rnn (SimpleRNN)       (None, 32)                2080      
_________________________________________________________________
dense (Dense)                (None, 1)                 33        
=================================================================
Total params: 322,113
Trainable params: 322,113
Non-trainable params: 0
_________________________________________________________________
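As a quick sanity check (not part of the original notebook), the parameter counts in the summary can be reproduced by hand from the layer shapes:

# Embedding: one 32-d vector for each of the 10,000 word indices
embedding_params = 10000 * 32                   # 320,000
# SimpleRNN: input-to-state weights, state-to-state weights, and a bias
simple_rnn_params = 32 * 32 + 32 * 32 + 32      # 2,080
# Dense: 32 weights plus 1 bias
dense_params = 32 * 1 + 1                       # 33
print(embedding_params + simple_rnn_params + dense_params)  # 322,113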
In [12]:
history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)
Epoch 1/10
157/157 [==============================] - 52s 334ms/step - loss: 0.6819 - acc: 0.5541 - val_loss: 0.6608 - val_acc: 0.6192
Epoch 2/10
157/157 [==============================] - 58s 367ms/step - loss: 0.4849 - acc: 0.7848 - val_loss: 0.4331 - val_acc: 0.8164
Epoch 3/10
157/157 [==============================] - 57s 362ms/step - loss: 0.3108 - acc: 0.8732 - val_loss: 0.4023 - val_acc: 0.8284
Epoch 4/10
157/157 [==============================] - 58s 368ms/step - loss: 0.2195 - acc: 0.9162 - val_loss: 0.4266 - val_acc: 0.8238
Epoch 5/10
157/157 [==============================] - 59s 376ms/step - loss: 0.1451 - acc: 0.9503 - val_loss: 0.4497 - val_acc: 0.8558
Epoch 6/10
157/157 [==============================] - 58s 371ms/step - loss: 0.0919 - acc: 0.9703 - val_loss: 0.4590 - val_acc: 0.8344
Epoch 7/10
157/157 [==============================] - 55s 353ms/step - loss: 0.0508 - acc: 0.9862 - val_loss: 0.6258 - val_acc: 0.7666
Epoch 8/10
157/157 [==============================] - 59s 379ms/step - loss: 0.0353 - acc: 0.9901 - val_loss: 0.6202 - val_acc: 0.7984
Epoch 9/10
157/157 [==============================] - 59s 379ms/step - loss: 0.0179 - acc: 0.9959 - val_loss: 0.5921 - val_acc: 0.8400
Epoch 10/10
157/157 [==============================] - 59s 374ms/step - loss: 0.0225 - acc: 0.9935 - val_loss: 0.6981 - val_acc: 0.7976

Let's display the training and validation loss and accuracy

In [13]:
import matplotlib.pyplot as plt

acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()
[Figure: Training and validation accuracy]
[Figure: Training and validation loss]

As a reminder, in chapter 3 our very first naive approach to this same dataset reached 88% test accuracy. Unfortunately, our small recurrent network doesn't perform well at all compared to this baseline (it only reaches about 85% validation accuracy). Part of the problem is that our inputs only consider the first 500 words rather than the full sequences, so our RNN has access to less information than the earlier baseline model. The rest of the problem is simply that SimpleRNN isn't very good at processing long sequences such as text; other types of recurrent layers perform much better. Let's take a look at some more advanced layers.
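As a rough sketch of what such a "more advanced layer" could look like here (illustrative only, the hyperparameters below are not tuned), the SimpleRNN layer can simply be swapped for an LSTM layer while keeping the rest of the pipeline unchanged:

from tensorflow.keras.layers import Dense, Embedding, LSTM
from tensorflow.keras import Sequential

# Same pipeline as above, with LSTM in place of SimpleRNN.
# LSTM's gating mechanism helps it carry information across long reviews,
# which is exactly where SimpleRNN struggles.
model = Sequential()
model.add(Embedding(max_features, 32))
model.add(LSTM(32))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)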
