1. Summary
One-sentence summary:
model.add(Embedding(max_features, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation='sigmoid'))
1、Why does our small recurrent network perform so poorly compared to this baseline (it reaches only 85% validation accuracy)?
from tensorflow.keras.layers import Dense, Embedding, SimpleRNN
from tensorflow.keras import Sequential

model = Sequential()
model.add(Embedding(max_features, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation='sigmoid'))
model.summary()
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
【inputs only consider the first 500 words】: Part of the problem is that our inputs only consider the first 500 words rather than the full sequences -- hence our RNN has access to less information than our earlier baseline model.
【SimpleRNN isn't very good at processing long sequences】: The remainder of the problem is simply that SimpleRNN isn't very good at processing long sequences, like text. Other types of recurrent layers perform much better.
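The first point can be probed directly by padding to a longer maxlen so the RNN sees more of each review. A minimal sketch; the value 1000 is an arbitrary illustration, not from the original notes:

from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

# Hypothetical experiment (length not from the original notes): reload the
# raw sequences and keep up to 1000 words per review instead of 500, so the
# RNN sees more of each sequence before retraining.
max_features = 10000
maxlen = 1000
(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=max_features)
input_train = sequence.pad_sequences(input_train, maxlen=maxlen)
input_test = sequence.pad_sequences(input_test, maxlen=maxlen)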
2. 6.2-2 Recurrent Neural Networks: IMDB Movie Review Classification Example
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing import sequence

max_features = 10000  # number of words to consider as features
maxlen = 500  # cut texts after this number of words (among the top max_features most common words)
batch_size = 32

print('Loading data...')
(input_train, y_train), (input_test, y_test) = imdb.load_data(num_words=max_features)
print(len(input_train), 'train sequences')
print(len(input_test), 'test sequences')
print(input_train[0])
print(y_train[0])

print('Pad sequences (samples x time)')
# Truncate or pad every review to exactly maxlen (500) words
input_train = sequence.pad_sequences(input_train, maxlen=maxlen)
input_test = sequence.pad_sequences(input_test, maxlen=maxlen)
print('input_train shape:', input_train.shape)
print('input_test shape:', input_test.shape)
print(input_train[0])
print(input_train[1])
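To sanity-check what those integer sequences encode, they can be mapped back to words. A minimal sketch using imdb.get_word_index(); indices 0-2 are reserved, hence the offset of 3, and padded positions print as '?':

# Map indices back to words to inspect a review (indices 0-2 are reserved
# for padding / start-of-sequence / unknown, hence the offset of 3).
word_index = imdb.get_word_index()
reverse_word_index = {index: word for word, index in word_index.items()}
print(' '.join(reverse_word_index.get(i - 3, '?') for i in input_train[0]))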
Train the model with an Embedding layer and a SimpleRNN layer:
from tensorflow.keras.layers import Dense, Embedding, SimpleRNN
from tensorflow.keras import Sequential
model = Sequential()
model.add(Embedding(max_features, 32))
model.add(SimpleRNN(32))
model.add(Dense(1, activation='sigmoid'))
model.summary()
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
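For reference, model.summary() should report 322,113 trainable parameters: the Embedding layer stores 10,000 × 32 = 320,000 weights, the SimpleRNN layer has 32 × 32 input weights plus 32 × 32 recurrent weights plus 32 biases = 2,080, and the Dense layer adds 32 weights + 1 bias = 33.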
history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)
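The notes track only validation accuracy during training; as a quick follow-up (not in the original), the held-out test split can be scored directly:

# Score the held-out test split (this step is not in the original notes).
test_loss, test_acc = model.evaluate(input_test, y_test)
print('test acc:', test_acc)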
Let's display the training and validation loss and accuracy:
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
As a reminder, in chapter 3, our very first naive approach to this very dataset got us to 88% test accuracy. Unfortunately, our small recurrent network doesn't perform very well at all compared to this baseline (only up to 85% validation accuracy). Part of the problem is that our inputs only consider the first 500 words rather than the full sequences -- hence our RNN has access to less information than our earlier baseline model. The remainder of the problem is simply that SimpleRNN isn't very good at processing long sequences, like text. Other types of recurrent layers perform much better. Let's take a look at some more advanced layers.
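As a preview of those more advanced layers, here is a minimal sketch that swaps SimpleRNN for an LSTM while keeping every other setting from the model above (the book's next section trains exactly this kind of model):

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM

# Identical architecture except the SimpleRNN is replaced by an LSTM,
# which copes far better with long-range dependencies in text.
model = Sequential()
model.add(Embedding(max_features, 32))
model.add(LSTM(32))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model.fit(input_train, y_train,
                    epochs=10,
                    batch_size=128,
                    validation_split=0.2)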