[Deep Learning] A code walkthrough of a logistic regression network implemented in Python/Theano

April 12, 2023, 9:48 PM • Deep Learning

First, here is the main Python (2.7) code. It can be found on Deep Learning.

 1     # allocate symbolic variables for the data
 2     index = T.lscalar()  # index to a [mini]batch
 3     x = T.matrix('x')  # the data is presented as rasterized images
 4     y = T.ivector('y')  # the labels are presented as 1D vector of
 5                            # [int] labels
 6 
 7     # construct the logistic regression class
 8     # Each MNIST image has size 28*28
 9     classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)
10 
11     # the cost we minimize during training is the negative log likelihood of
12     # the model in symbolic format
13     cost = classifier.negative_log_likelihood(y)
14 
15     # compiling a Theano function that computes the mistakes that are made by
16     # the model on a minibatch
17     test_model = theano.function(inputs=[index],
18             outputs=classifier.errors(y),
19             givens={
20                 x: test_set_x[index * batch_size: (index + 1) * batch_size],
21                 y: test_set_y[index * batch_size: (index + 1) * batch_size]})
22 
23     validate_model = theano.function(inputs=[index],
24             outputs=classifier.errors(y),
25             givens={
26                 x: valid_set_x[index * batch_size:(index + 1) * batch_size],
27                 y: valid_set_y[index * batch_size:(index + 1) * batch_size]})
28 
29     # compute the gradient of cost with respect to theta = (W,b)
30     g_W = T.grad(cost=cost, wrt=classifier.W)
31     g_b = T.grad(cost=cost, wrt=classifier.b)
32 
33     # specify how to update the parameters of the model as a list of
34     # (variable, update expression) pairs.
35     updates = [(classifier.W, classifier.W - learning_rate * g_W),
36                (classifier.b, classifier.b - learning_rate * g_b)]
37 
38     # compiling a Theano function `train_model` that returns the cost, but in
39     # the same time updates the parameter of the model based on the rules
40     # defined in `updates`
41     train_model = theano.function(inputs=[index],
42             outputs=cost,
43             updates=updates,
44             givens={
45                 x: train_set_x[index * batch_size:(index + 1) * batch_size],
46                 y: train_set_y[index * batch_size:(index + 1) * batch_size]})

The code is not long, but its logic needs to be untangled. The analysis below goes through it block by block.

In the code, T is an alias for theano.tensor.

Lines 1 to 13:

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch
    x = T.matrix('x')  # the data is presented as rasterized images
    y = T.ivector('y')  # the labels are presented as 1D vector of
                           # [int] labels

    # construct the logistic regression class
    # Each MNIST image has size 28*28
    classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)

    # the cost we minimize during training is the negative log likelihood of
    # the model in symbolic format
    cost = classifier.negative_log_likelihood(y)

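This declares three symbolic variables, index, x and y (similar to symbolic variables in Matlab). They stand for the index of the training minibatch, the input image matrix, and the vector of expected outputs (labels), respectively.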

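classifier is a LogisticRegression (LR) object. Its constructor is called with the symbolic variable x as input, so every expression built inside classifier depends on x. theano.function can then bind concrete data to x, and whatever classifier computes changes when x changes.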

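cost stands for the negative log-likelihood defined in classifier, with the symbolic variable y as input. It is tied to y in the same way that classifier is tied to x, so the details are not repeated here.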
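For reference, the cost can be written out as follows. This assumes that LogisticRegression uses the usual softmax formulation and averages over the minibatch; the class body is not shown in this article, so both points are assumptions (some formulations use the sum instead of the mean). Here N is the number of samples in the minibatch, and x^{(i)}, y^{(i)} are the i-th image and its label.

```latex
P(Y = k \mid x, W, b) = \frac{\exp(W_k x + b_k)}{\sum_{j} \exp(W_j x + b_j)}

\mathrm{cost}(W, b) = -\frac{1}{N} \sum_{i=1}^{N} \log P\left(Y = y^{(i)} \mid x^{(i)}, W, b\right)
```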

Lines 14 to 28:

    # compiling a Theano function that computes the mistakes that are made by
    # the model on a minibatch
    test_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]})

    validate_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: valid_set_x[index * batch_size:(index + 1) * batch_size],
                y: valid_set_y[index * batch_size:(index + 1) * batch_size]})

These two models are where people tend to get confused. Understanding theano.function requires a little background:

For example, declare two symbolic variables a and b: a, b = T.iscalar(), T.iscalar() . Both are integer (i) scalars (scalar). Then declare a variable c: c = a + b , and inspect its type with type(c):

>>> type(c)
<class 'theano.tensor.var.TensorVariable'>
>>> type(a)
<class 'theano.tensor.var.TensorVariable'>

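c has the same type as a and b: all three are Tensor variables. With that preparation done, we build the relationship with theano.function: add = theano.function(inputs = [a, b], outputs = c) . This statement constructs a function add that takes a and b as inputs and returns c. It is used in Python like this: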

>>> add = theano.function(inputs = [a, b], outputs = c)
>>> test = add(100, 100)
>>> test
array(200)
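The idea that c = a + b records an operation instead of computing a number can be sketched without Theano. The toy below is only an illustration: the names Sym, evaluate and function are made up for this sketch and are not Theano's implementation or API.

```python
# Toy stand-in for symbolic variables; this is NOT how Theano is implemented.
class Sym(object):
    """A node in an expression graph; leaf nodes carry no value."""
    def __init__(self, name=None, op=None, args=()):
        self.name, self.op, self.args = name, op, args

    def __add__(self, other):
        # Adding two symbols records the operation instead of computing it.
        return Sym(op="add", args=(self, other))


def evaluate(node, env):
    """Recursively evaluate a node, given concrete values for the leaves."""
    if node.op is None:
        return env[node]
    if node.op == "add":
        return evaluate(node.args[0], env) + evaluate(node.args[1], env)
    raise ValueError("unknown op: %r" % node.op)


def function(inputs, outputs):
    """Return a plain callable that binds its arguments to the input symbols."""
    def compiled(*values):
        env = dict(zip(inputs, values))
        return evaluate(outputs, env)
    return compiled


a, b = Sym("a"), Sym("b")
c = a + b                                  # c is still a Sym, not a number
add = function(inputs=[a, b], outputs=c)
result = add(100, 100)
print("%s %d" % (type(c).__name__, result))
```

As in the Theano session above, nothing is computed until add is called with concrete values.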

With this background, the meaning of the two models becomes clear:

test_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]})

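The input is index, and the output is the return value of the errors method of the classifier object, with y passed as the argument to errors. The classifier itself takes x as its input.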

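The givens keyword replaces the variable before each colon with the expression after it. In this example, the index-th batch of the test data (each batch holds batch_size samples) is substituted for x and y.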

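In plain words, test_model is a function that takes a batch index, feeds the image data x and the expected outputs y of the index-th test batch to the classifier, and returns the error value.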

validate_model = theano.function(inputs=[index],
            outputs=classifier.errors(y),
            givens={
                x: valid_set_x[index * batch_size:(index + 1) * batch_size],
                y: valid_set_y[index * batch_size:(index + 1) * batch_size]})

This is the same as above, except that it uses the validation data.

Lines 29 to 32:

    # compute the gradient of cost with respect to theta = (W,b)
    g_W = T.grad(cost=cost, wrt=classifier.W)
    g_b = T.grad(cost=cost, wrt=classifier.b)

This computes the gradients used by the learning algorithm. T.grad(y, x) computes the gradient of y with respect to x.
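What "the gradient of y with respect to x" means can be checked numerically on a one-parameter cost. T.grad works symbolically on the expression graph; the sketch below instead uses a central finite difference, and the squared-error cost and its constants are made up for illustration.

```python
def cost(w, x=3.0, t=6.0):
    """Made-up squared-error cost of a single weight w."""
    return (w * x - t) ** 2


def numeric_grad(f, w, eps=1e-6):
    """Central finite-difference approximation of df/dw at w."""
    return (f(w + eps) - f(w - eps)) / (2.0 * eps)


w = 1.0
g_numeric = numeric_grad(cost, w)
g_analytic = 2.0 * (w * 3.0 - 6.0) * 3.0   # d/dw (w*x - t)^2 = 2*(w*x - t)*x
print("numeric=%.6f analytic=%.6f" % (g_numeric, g_analytic))
```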

Lines 33 to 37:

    # specify how to update the parameters of the model as a list of
    # (variable, update expression) pairs.
    updates = [(classifier.W, classifier.W - learning_rate * g_W),
               (classifier.b, classifier.b - learning_rate * g_b)]

updates is a list of length 2 in which each element is a tuple. Each time the corresponding theano.function is called, the second element of each tuple is used to update the first.
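The update rule in these pairs is plain gradient descent. The sketch below applies the same (variable, new value) pattern to two scalar parameters; the one-sample squared-error cost, the data point and the learning rate are assumptions made up for illustration, not the MNIST setup of the article.

```python
learning_rate = 0.05
params = {"W": 1.0, "b": 0.0}


def train_step(params, x, t):
    """Return the cost on one sample, then apply the parameter updates."""
    err = params["W"] * x + params["b"] - t
    cost = err ** 2
    g_W = 2.0 * err * x          # d(cost)/dW
    g_b = 2.0 * err              # d(cost)/db
    # Same shape as the `updates` list: (variable, update expression) pairs.
    updates = [("W", params["W"] - learning_rate * g_W),
               ("b", params["b"] - learning_rate * g_b)]
    for name, new_value in updates:
        params[name] = new_value
    return cost


costs = [train_step(params, 2.0, 4.0) for _ in range(5)]
print(costs)
```

The cost returned by each call is computed from the parameter values before that call's update, so the effect of an update only shows up in the next call.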

Lines 38 to 46:

    # compiling a Theano function `train_model` that returns the cost, but in
    # the same time updates the parameter of the model based on the rules
    # defined in `updates`
    train_model = theano.function(inputs=[index],
            outputs=cost,
            updates=updates,
            givens={
                x: train_set_x[index * batch_size:(index + 1) * batch_size],
                y: train_set_y[index * batch_size:(index + 1) * batch_size]})

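The rest needs no further explanation. Note the additional updates parameter, which specifies how certain parameters (W and b) are modified on every call to train_model. In addition, the output is now the cost function (the negative log-likelihood) rather than the errors function used in test_model and validate_model (the misclassification error).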
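The difference between the two outputs can be seen on a few made-up predictions. The sketch below assumes that errors reports the fraction of misclassified samples and that the cost is the mean negative log-likelihood; the probabilities and labels are invented for illustration.

```python
import math

# Made-up class probabilities for 3 samples and 3 classes, plus true labels.
probs = [[0.7, 0.2, 0.1],
         [0.4, 0.5, 0.1],
         [0.1, 0.3, 0.6]]
labels = [0, 0, 2]


def negative_log_likelihood(probs, labels):
    """Mean of -log(probability assigned to the correct class)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def error_rate(probs, labels):
    """Fraction of samples whose most probable class is not the label."""
    predictions = [max(range(len(p)), key=lambda k: p[k]) for p in probs]
    wrong = sum(1 for pred, y in zip(predictions, labels) if pred != y)
    return wrong / float(len(labels))


nll = negative_log_likelihood(probs, labels)
err = error_rate(probs, labels)
print("nll=%.4f err=%.4f" % (nll, err))
```

The error rate only counts wrong answers, so it is piecewise constant in W and b. The negative log-likelihood changes smoothly with the predicted probabilities, which is why it is the quantity differentiated for training.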

