Deep Learning 7_UFLDL Deep Learning Tutorial: Self-Taught Learning_Exercise (Stanford University Deep Learning Tutorial)

April 9, 2023, 11:49 PM • Deep Learning

Exercise environment: Windows 7, MATLAB 2015b, 16 GB RAM, 2 TB hard disk

Exercise content and steps: Exercise: Self-Taught Learning. The steps are as follows:

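First, train a sparse autoencoder on the 29404 unlabeled examples in unlabeledData (the digits 5-9 from the MNIST handwritten digit database) to obtain its weight parameters opttheta. The purpose of this step is to learn features of the data. We do not know in advance exactly which features are learned (they can be inspected by visualizing them; call them Features), but we do know that the extracted features are simply the hidden-layer activations (the layer-2 activations) of the trained sparse autoencoder. Note: every sparse autoencoder in this exercise is trained with the L-BFGS algorithm.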

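Second, pass the 15298 labeled examples in trainData (the first half of the digits 0-4 from MNIST) as the training set through the trained sparse autoencoder (the one with weight parameters opttheta). This extracts the same kind of features as in the previous step; call the resulting representation trainFeatures. It, too, is just the hidden-layer activations. If this is still unclear, here is an analogy: suppose the previous step extracted the first-order cumulant of a communication signal A (corresponding to unlabeledData); this step then extracts the first-order cumulant of a communication signal B (corresponding to trainData). The feature is the same, only the object it is computed on differs. Likewise, unlabeledData and trainData are mapped to the same Features; only the data differ.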

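Note: if unlabeledData was preprocessed in the previous step, be sure to save all preprocessing parameters (for example, the principal-component matrix U from PCA), because the training set trainData in this step and the test set testData in the next step must go through exactly the same preprocessing. In this exercise the MNIST data has already been preprocessed, so no further preprocessing is needed.

For details see: http://ufldl.stanford.edu/wiki/index.php/%E8%87%AA%E6%88%91%E5%AD%A6%E4%B9%A0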

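Third, pass the 15298 labeled examples in testData (the second half of the digits 0-4 from MNIST) as the test set through the same trained sparse autoencoder (the one with weight parameters opttheta). This again extracts the same kind of features; call the resulting representation testFeatures. It is also just the hidden-layer activations.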

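Fourth, train a softmax classifier using the features trainFeatures from step two and the labels trainLabels of the labeled data trainData as input, which gives the regression model softmaxModel.

Fifth, feed the features testFeatures from step three into the trained softmax regression model softmaxModel to predict the classes pred of the labeled data testData, then compare pred with the true labels testLabels of testData to obtain the accuracy.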
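The comparison in step five boils down to counting matching entries. The following is a minimal sketch with made-up values for pred and testLabels (both are toy assumptions, not output of the exercise); it uses the same expression as the accuracy line in stlExercise.m.

```matlab
% Toy predictions and ground-truth labels (made-up values, labels in 1..5).
pred       = [1 2 3 4 5 1 2 3];
testLabels = [1 2 3 4 5 1 2 4];   % only the last entry disagrees

% Fraction of matching entries. (:) turns both inputs into column vectors,
% so the comparison works whether they are stored as rows or columns.
accuracy = mean(pred(:) == testLabels(:));
fprintf('Test Accuracy: %f%%\n', 100 * accuracy);
```

Using (:) on both sides avoids a shape mismatch when one vector is a row and the other a column.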

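In summary, self-taught learning uses unlabeled data and unsupervised learning to learn a feature extractor, then uses supervised learning on the extracted features to train a classifier.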
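The note in step two about reusing preprocessing parameters can be sketched as follows. This is only an illustration under stated assumptions: the toy matrices, the per-feature mean mu, and the file name in the commented-out save call are all hypothetical, and the MNIST exercise itself needs no such preprocessing.

```matlab
% Toy data, one example per column (3 features). Values are made up.
unlabeledData = [1 2 3 4 5; 2 1 4 3 6; 0 1 0 1 0];   % 3 x 5
trainData     = [2 3; 1 5; 1 0];                      % 3 x 2

% Fit the preprocessing on the unlabeled data only.
mu    = mean(unlabeledData, 2);               % per-feature mean (assumption)
Xc    = bsxfun(@minus, unlabeledData, mu);    % centered unlabeled data
sigma = Xc * Xc' / size(Xc, 2);               % covariance matrix
[U, S] = svd(sigma);                          % columns of U: principal directions

% Save the parameters so later steps can reuse them (hypothetical file name).
% save('preprocParams.mat', 'mu', 'U');

% Apply the SAME mu and U to the labeled data; do not refit them.
unlabeledRot = U' * Xc;
trainRot     = U' * bsxfun(@minus, trainData, mu);
```

Refitting mu and U on trainData or testData would put those sets in a different coordinate system from the one the autoencoder was trained in.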

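When this method applies:

This method is likely to be most effective in settings with a large amount of unlabeled data and only a small amount of labeled data. Even when only labeled data is available (in which case we usually ignore the class labels of the training data during feature learning), the same idea can still give very good results.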

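Some MATLAB functions

numel: number of elements.

n = numel(A) returns the total number of elements in the array A.

s = size(A): with a single output argument, size returns a row vector; for a 2-D array its first element is the number of rows and its second element is the number of columns.

[r,c] = size(A): with two output arguments, size returns the number of rows in the first output variable and the number of columns in the second.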
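A short sketch of these calls on a toy 3-by-4 array (the array itself is just an illustration):

```matlab
A = zeros(3, 4);        % a 3-by-4 array

n      = numel(A);      % total number of elements: 12
s      = size(A);       % row vector [3 4]
[r, c] = size(A);       % r = 3, c = 4
cols   = size(A, 2);    % size along dimension 2 only: 4
```

The script below uses the size(X, 2) form to count examples, since each example is stored as a column.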

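round(n) rounds n to the nearest integer, just like the ordinary rounding taught in school; ties (a fractional part of exactly 0.5) are rounded away from zero.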
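A few illustrative values, including the kind of "half of a count" computation used later for numTrain (the count 7 is a made-up example):

```matlab
a = round(2.4);     % 2
b = round(2.5);     % 3
c = round(-2.5);    % -3 (ties round away from zero)
d = round(7 / 2);   % 4: size of the "first half" of 7 items
```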

find

Find the indices and values of nonzero elements.

Syntax:

1. ind = find(X)

2. ind = find(X, k)

3. ind = find(X, k, 'first')

4. ind = find(X, k, 'last')

5. [row,col] = find(X, ...)

6. [row,col,v] = find(X, ...)

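Description:

1. ind = find(X)

Finds all nonzero elements of X and returns their linear indices (column-major order) in the vector ind.

If X is a row vector, ind is a row vector; otherwise ind is a column vector.

If X contains no nonzero elements or is empty, ind is empty.

2. ind = find(X, k) or 3. ind = find(X, k, 'first')

Returns at most the first k indices corresponding to the nonzero elements of X.

k must be a positive integer, but it may be of any numeric type.

4. ind = find(X, k, 'last')

Returns at most the last k indices corresponding to the nonzero elements of X.

5. [row,col] = find(X, ...)

Returns the row and column indices of the nonzero elements of X.

This syntax is especially useful when working with sparse matrices.

If X is an N-dimensional array with N > 2, col contains linear indices for the columns.

For example, for a 5-by-7-by-3 array X with a nonzero element at X(4,2,3), find returns row = 4 and col = 16. That is, (7 columns in page 1) + (7 columns in page 2) + (2 columns in page 3) = 16.

6. [row,col,v] = find(X, ...)

Returns a column or row vector v of the nonzero elements of X, together with the row and column indices.

If X is a logical expression, v is a logical array.

In that case the output v contains the nonzero elements of the logical array obtained by evaluating the expression X.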

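For example,

A= magic(4)
A =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1

[r,c,v]= find(A>10);

r', c', v'
ans =
1 2 4 4 1 3 (column-major order)
ans =
1 2 2 3 4 4 (column-major order)
ans =
1 1 1 1 1 1

Here the returned vector v is a logical array containing N nonzero elements, where N = nnz(A>10).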
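The k-limited forms (syntaxes 2 to 4) and the three-output form on a numeric matrix are not exercised by the example above. A minimal sketch with toy inputs:

```matlab
X = [1 0 4 -3 0 0 0 8 6];

firstTwo = find(X, 2);            % [1 3]
sameTwo  = find(X, 2, 'first');   % [1 3], identical to the call above
lastTwo  = find(X, 2, 'last');    % [8 9]

% With a numeric (non-logical) input, v holds the nonzero values themselves.
[r, c, v] = find([0 7; 5 0]);     % r = [2;1], c = [1;2], v = [5;7]
```

Note the column-major order in the last call: the element 5 at (2,1) is reported before the element 7 at (1,2).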

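Examples:

Example 1

X = [1 0 4 -3 0 0 0 8 6];
indices = find(X)

returns the linear indices of the nonzero elements of X.

indices =
1 3 4 8 9

Example 2

You can also define X with a logical expression. For example,

find(X > 2)

returns the linear indices of the elements of X that are greater than 2.

ans =
3 8 9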
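The N-dimensional case described under syntax 5 can be checked the same way. This sketch rebuilds the 5-by-7-by-3 example from the description and then recovers the page and within-page column from col; the variable names page and colInPage are illustrative only.

```matlab
X = zeros(5, 7, 3);
X(4, 2, 3) = 1;                     % the only nonzero element

[row, col] = find(X);               % row = 4, col = 16

% col counts columns across pages: 7 (page 1) + 7 (page 2) + 2 (page 3).
page      = ceil(col / 7);          % 3
colInPage = mod(col - 1, 7) + 1;    % 2

% ind2sub on the linear index gives the full subscripts directly.
[i1, i2, i3] = ind2sub(size(X), find(X));   % 4, 2, 3
```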

unique:

unique returns the distinct elements of a vector, with duplicates removed, in sorted order.
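For instance, the script below counts classes with numel(unique(trainLabels)). A sketch with made-up labels:

```matlab
trainLabels = [3 1 5 2 3 1 4 5 2];   % made-up labels in 1..5

classes    = unique(trainLabels);    % [1 2 3 4 5]: sorted, no repeats
numClasses = numel(classes);         % 5
```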

Results

The visualization of W1 from the weight parameters opttheta, i.e. of the learned features, is shown below:

[Figure: visualization of W1]

Test Accuracy: 98.333115%

Elapsed time is 594.435594 seconds.

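Summary of results:

1. Why did training take Andrew Ng's group 25 minutes, while my entire run finished in about 10 minutes (the 594 seconds reported above)? Most likely because computers from a few years ago were far less powerful than today's.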

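2. For comparison, Andrew Ng's group ran an experiment: if the softmax classifier is trained on the raw pixel values (the original data) instead of the features extracted by this section's sparse autoencoder, the accuracy reaches at most about 96%. In fact, the only difference between this exercise and the previous one, Deep Learning 6: Softmax Regression_Exercise (Stanford UFLDL Deep Learning Tutorial), is that here the softmax classifier is trained on features extracted by the sparse autoencoder, whereas there it was trained on the raw data. In the previous exercise the accuracy I obtained was actually only 92.640% (that exercise classified all ten digits rather than five, so the two figures are not directly comparable); Andrew Ng's group may of course have reached up to 96%.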

Code

stlExercise.m

%% CS294A/CS294W Self-taught Learning Exercise

%  Instructions
%  ------------
% 
%  This file contains code that helps you get started on the
%  self-taught learning. You will need to complete code in feedForwardAutoencoder.m
%  You will also need to have implemented sparseAutoencoderCost.m and 
%  softmaxCost.m from previous exercises.
%
%% ======================================================================
%  STEP 0: Here we provide the relevant parameters values that will
%  allow your sparse autoencoder to get good filters; you do not need to 
%  change the parameters below.
tic
inputSize  = 28 * 28;
numLabels  = 5;
hiddenSize = 200;
sparsityParam = 0.1; % desired average activation of the hidden units.
                     % (This was denoted by the Greek alphabet rho, which looks like a lower-case "p",
                     %  in the lecture notes). 
lambda = 3e-3;       % weight decay parameter       
beta = 3;            % weight of sparsity penalty term   
maxIter = 400;

%% ======================================================================
%  STEP 1: Load data from the MNIST database
%
%  This loads our training and test data from the MNIST database files.
%  We have sorted the data for you in this so that you will not have to
%  change it.

% Load MNIST database files
mnistData   = loadMNISTImages('train-images.idx3-ubyte');
mnistLabels = loadMNISTLabels('train-labels.idx1-ubyte');

% Set Unlabeled Set (All Images)

% Simulate a Labeled and Unlabeled set
labeledSet   = find(mnistLabels >= 0 & mnistLabels <= 4); % indices of the entries of mnistLabels with value from 0 to 4 inclusive
unlabeledSet = find(mnistLabels >= 5);

numTrain = round(numel(labeledSet)/2);
trainSet = labeledSet(1:numTrain);
testSet  = labeledSet(numTrain+1:end);

unlabeledData = mnistData(:, unlabeledSet); % unlabeled data set

trainData   = mnistData(:, trainSet); % first half of the digits 0-4: labeled training data
trainLabels = mnistLabels(trainSet)' + 1; % Shift Labels to the Range 1-5

testData   = mnistData(:, testSet); % second half of the digits 0-4: labeled test data
testLabels = mnistLabels(testSet)' + 1;   % Shift Labels to the Range 1-5

% Output Some Statistics
fprintf('# examples in unlabeled set: %d\n', size(unlabeledData, 2));
fprintf('# examples in supervised training set: %d\n\n', size(trainData, 2));
fprintf('# examples in supervised testing set: %d\n\n', size(testData, 2));

%% ======================================================================
%  STEP 2: Train the sparse autoencoder
%  This trains the sparse autoencoder on the unlabeled training
%  images. 

%  Randomly initialize the parameters theta (drawn from a uniform distribution)
theta = initializeParameters(hiddenSize, inputSize);

%% ----------------- YOUR CODE HERE ----------------------
%  Find opttheta by running the sparse autoencoder on
%  unlabeledTrainingImages
%  Train the sparse autoencoder on the unlabeled data set using L-BFGS

opttheta = theta; 

addpath minFunc/
options.Method = 'lbfgs';
options.maxIter = maxIter; % maxIter = 400, set in STEP 0
options.display = 'on';
[opttheta, cost] = minFunc( @(p) sparseAutoencoderCost(p, ...
      inputSize, hiddenSize, ...
      lambda, sparsityParam, ...
      beta, unlabeledData), ...
      theta, options);


%% -----------------------------------------------------
                          
% Visualize weights
W1 = reshape(opttheta(1:hiddenSize * inputSize), hiddenSize, inputSize);
display_network(W1');

%%======================================================================
%% STEP 3: Extract Features from the Supervised Dataset
%  
%  You need to complete the code in feedForwardAutoencoder.m so that the 
%  following command will extract features from the data.

trainFeatures = feedForwardAutoencoder(opttheta, hiddenSize, inputSize, ...
                                       trainData);

testFeatures = feedForwardAutoencoder(opttheta, hiddenSize, inputSize, ...
                                       testData);

%%======================================================================
%% STEP 4: Train the softmax classifier

softmaxModel = struct;  
%% ----------------- YOUR CODE HERE ----------------------
%  Use softmaxTrain.m from the previous exercise to train a multi-class
%  classifier. 
%  Train the softmax regression model with L-BFGS, using the features
%  extracted from the labeled training set together with their labels.

%  Use lambda = 1e-4 for the weight regularization for softmax
lambda = 1e-4;
inputSize = hiddenSize;
numClasses = numel(unique(trainLabels)); % unique returns the distinct elements, sorted
% You need to compute softmaxModel using softmaxTrain on trainFeatures and
% trainLabels

options.maxIter = 100; % maximum number of iterations
softmaxModel = softmaxTrain(inputSize, numClasses, lambda, ...
                            trainFeatures, trainLabels, options);





%% -----------------------------------------------------


%%======================================================================
%% STEP 5: Testing 

%% ----------------- YOUR CODE HERE ----------------------
% Compute Predictions on the test set (testFeatures) using softmaxPredict
% and softmaxModel

[pred] = softmaxPredict(softmaxModel, testFeatures);



%% -----------------------------------------------------

% Classification Score
fprintf('Test Accuracy: %f%%\n', 100*mean(pred(:) == testLabels(:)));
toc
% (note that we shift the labels by 1, so that digit 0 now corresponds to
%  label 1)
%
% Accuracy is the proportion of correctly classified images
% The result for our implementation was:
%
% Accuracy: 98.3%
%
%

feedForwardAutoencoder.m

function [activation] = feedForwardAutoencoder(theta, hiddenSize, visibleSize, data)

% theta: trained weights from the autoencoder
% visibleSize: the number of input units (28*28 = 784 in this exercise)
% hiddenSize: the number of hidden units (200 in this exercise)
% data: Our matrix containing the training data as columns.  So, data(:,i) is the i-th training example.

% We first convert theta to the (W1, W2, b1, b2) matrix/vector format, so that this
% follows the notation convention of the lecture notes. Only W1 and b1 are
% needed to compute the hidden-layer activations.

W1 = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
b1 = theta(2*hiddenSize*visibleSize+1:2*hiddenSize*visibleSize+hiddenSize);

%% ---------- YOUR CODE HERE --------------------------------------
%  Instructions: Compute the activation of the hidden layer for the Sparse Autoencoder.

activation  = sigmoid(W1*data+repmat(b1,[1,size(data,2)]));
%-------------------------------------------------------------------

end

%-------------------------------------------------------------------
% Here's an implementation of the sigmoid function, which you may find useful
% in your computation of the costs and the gradients.  This inputs a (row or
% column) vector (say (z1, z2, z3)) and returns (f(z1), f(z2), f(z3)).

function sigm = sigmoid(x)
    sigm = 1 ./ (1 + exp(-x));
end
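The index arithmetic used to pull W1 and b1 out of theta can be sanity-checked on a tiny example. The sketch below assumes the parameter layout [W1(:); W2(:); b1; b2] that the indexing above implies; the toy sizes and values are made up, and the sigmoid is inlined so the snippet runs on its own as a script.

```matlab
visibleSize = 4;  hiddenSize = 3;  m = 5;   % toy sizes (assumptions)

% Build a parameter vector with the assumed layout [W1(:); W2(:); b1; b2].
W1 = reshape(1:hiddenSize*visibleSize, hiddenSize, visibleSize) / 10;
W2 = zeros(visibleSize, hiddenSize);
b1 = [0.1; 0.2; 0.3];
b2 = zeros(visibleSize, 1);
theta = [W1(:); W2(:); b1(:); b2(:)];

% Unpack with the same index arithmetic as feedForwardAutoencoder.m.
W1u = reshape(theta(1:hiddenSize*visibleSize), hiddenSize, visibleSize);
b1u = theta(2*hiddenSize*visibleSize+1 : 2*hiddenSize*visibleSize+hiddenSize);

% Hidden-layer activations for m toy examples stored as columns.
data = 0.5 * ones(visibleSize, m);
activation = 1 ./ (1 + exp(-bsxfun(@plus, W1u * data, b1u)));
```

The result has one row per hidden unit and one column per example, which is the shape softmaxTrain receives as trainFeatures. bsxfun is used here in place of repmat only to avoid materializing the replicated bias matrix; both give the same values.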

References:

http://www.cnblogs.com/tornadomeet/archive/2013/03/24/2979408.html

UFLDL Tutorial

……

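Unless otherwise stated, articles on this site are original; if you repost, please credit the source: Deep Learning 7_UFLDL Deep Learning Tutorial: Self-Taught Learning_Exercise (Stanford University Deep Learning Tutorial) - Python技术站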
