Implementing GoogLeNet in PyTorch

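GoogLeNet is a classic convolutional neural network that won first place in the 2014 ImageNet competition (ILSVRC 2014). This article shows how to implement the GoogLeNet model in PyTorch and closes with two worked examples.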

1. Import the required libraries

Before implementing GoogLeNet, we need to import the required libraries. The following is example code:

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.datasets as datasets

The code above imports the commonly used PyTorch and torchvision modules: torch, torch.nn, torch.optim, torchvision.transforms, and torchvision.datasets.

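2. Define the GoogLeNet model

In PyTorch, neural network models are defined with the torch.nn module. The following is example code for a GoogLeNet model. It relies on the Inception class defined in section 3, which must be defined before the model is instantiated: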

class GoogLeNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(GoogLeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
        self.maxpool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(64, 192, kernel_size=3, stride=1, padding=1)
        self.maxpool2 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.inception3a = Inception(192, 64, 96, 128, 16, 32, 32)
        self.inception3b = Inception(256, 128, 128, 192, 32, 96, 64)
        self.maxpool3 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.inception4a = Inception(480, 192, 96, 208, 16, 48, 64)
        self.inception4b = Inception(512, 160, 112, 224, 24, 64, 64)
        self.inception4c = Inception(512, 128, 128, 256, 24, 64, 64)
        self.inception4d = Inception(512, 112, 144, 288, 32, 64, 64)
        self.inception4e = Inception(528, 256, 160, 320, 32, 128, 128)
        self.maxpool4 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.inception5a = Inception(832, 256, 160, 320, 32, 128, 128)
        self.inception5b = Inception(832, 384, 192, 384, 48, 128, 128)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.dropout = nn.Dropout(p=0.4)
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = nn.functional.relu(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = nn.functional.relu(x)
        x = self.maxpool2(x)
        x = self.inception3a(x)
        x = self.inception3b(x)
        x = self.maxpool3(x)
        x = self.inception4a(x)
        x = self.inception4b(x)
        x = self.inception4c(x)
        x = self.inception4d(x)
        x = self.inception4e(x)
        x = self.maxpool4(x)
        x = self.inception5a(x)
        x = self.inception5b(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.dropout(x)
        x = self.fc(x)
        return x

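The code above defines a class named GoogLeNet that inherits from nn.Module. The __init__ method defines the layers of the model: convolutional layers, pooling layers, Inception modules, a global average pooling layer, a Dropout layer, and a fully connected layer. The forward method defines the forward pass.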
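The first argument of each Inception module has to equal the number of channels produced by the previous module, because the four branch outputs are concatenated along the channel dimension (the pooling layers in between do not change the channel count). The following is a minimal sketch of that bookkeeping in plain Python. The INCEPTION_CONFIGS table and the output_channels and check_chain helpers are illustrative names that are not part of the model code; the numbers are copied from the constructor above.

```python
# Illustrative bookkeeping only. Each tuple is (in_channels, out1x1, reduce3x3,
# out3x3, reduce5x5, out5x5, out1x1pool), copied from the GoogLeNet constructor.
INCEPTION_CONFIGS = {
    "3a": (192, 64, 96, 128, 16, 32, 32),
    "3b": (256, 128, 128, 192, 32, 96, 64),
    "4a": (480, 192, 96, 208, 16, 48, 64),
    "4b": (512, 160, 112, 224, 24, 64, 64),
    "4c": (512, 128, 128, 256, 24, 64, 64),
    "4d": (512, 112, 144, 288, 32, 64, 64),
    "4e": (528, 256, 160, 320, 32, 128, 128),
    "5a": (832, 256, 160, 320, 32, 128, 128),
    "5b": (832, 384, 192, 384, 48, 128, 128),
}


def output_channels(cfg):
    """Channels after concatenation: 1x1 + 3x3 + 5x5 + pool-projection outputs."""
    _, out1x1, _, out3x3, _, out5x5, out1x1pool = cfg
    return out1x1 + out3x3 + out5x5 + out1x1pool


def check_chain(configs, first_in, last_out):
    """Verify that each module's in_channels equals the previous module's output."""
    expected_in = first_in
    for name, cfg in configs.items():
        assert cfg[0] == expected_in, (
            f"inception{name}: expected {expected_in} input channels, got {cfg[0]}"
        )
        expected_in = output_channels(cfg)
    assert expected_in == last_out
    return True


# conv2 produces 192 channels; the final linear layer expects 1024 features.
print(check_chain(INCEPTION_CONFIGS, first_in=192, last_out=1024))
```

If a channel number is changed in one module, the same check points to the first module whose input no longer matches.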

3. Define the Inception module

The GoogLeNet model uses several Inception modules. The following is example code for an Inception module:

class Inception(nn.Module):
    def __init__(self, in_channels, out1x1, reduce3x3, out3x3, reduce5x5, out5x5, out1x1pool):
        super(Inception, self).__init__()
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_channels, out1x1, kernel_size=1),
            nn.BatchNorm2d(out1x1),
            nn.ReLU(inplace=True)
        )

        self.branch2 = nn.Sequential(
            nn.Conv2d(in_channels, reduce3x3, kernel_size=1),
            nn.BatchNorm2d(reduce3x3),
            nn.ReLU(inplace=True),
            nn.Conv2d(reduce3x3, out3x3, kernel_size=3, padding=1),
            nn.BatchNorm2d(out3x3),
            nn.ReLU(inplace=True)
        )

        self.branch3 = nn.Sequential(
            nn.Conv2d(in_channels, reduce5x5, kernel_size=1),
            nn.BatchNorm2d(reduce5x5),
            nn.ReLU(inplace=True),
            nn.Conv2d(reduce5x5, out5x5, kernel_size=5, padding=2),
            nn.BatchNorm2d(out5x5),
            nn.ReLU(inplace=True)
        )

        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, out1x1pool, kernel_size=1),
            nn.BatchNorm2d(out1x1pool),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        branch1 = self.branch1(x)
        branch2 = self.branch2(x)
        branch3 = self.branch3(x)
        branch4 = self.branch4(x)
        outputs = [branch1, branch2, branch3, branch4]
        return torch.cat(outputs, 1)

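The code above defines a class named Inception that inherits from nn.Module. The __init__ method defines the four branches of the module: a 1x1 convolution branch, a 3x3 convolution branch, a 5x5 convolution branch, and a max-pooling branch followed by a 1x1 convolution. The forward method concatenates the outputs of the four branches along the channel dimension and returns the result.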
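Concatenating along the channel dimension only works if all four branches keep the same height and width, which is why the 3x3 convolution uses padding=1, the 5x5 convolution uses padding=2, and the pooling layer uses stride=1 with padding=1. The sketch below checks this with the standard output-size formula for convolution and pooling (dilation 1, floor rounding). The conv_out_size and branch_sizes helpers are hypothetical names introduced for illustration.

```python
def conv_out_size(n, kernel, stride=1, padding=0):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def branch_sizes(n):
    """Spatial size produced by each Inception branch for an n x n input."""
    return {
        "branch1 (1x1 conv)": conv_out_size(n, kernel=1),
        "branch2 (1x1 conv, then 3x3 conv with padding=1)": conv_out_size(
            conv_out_size(n, kernel=1), kernel=3, padding=1
        ),
        "branch3 (1x1 conv, then 5x5 conv with padding=2)": conv_out_size(
            conv_out_size(n, kernel=1), kernel=5, padding=2
        ),
        "branch4 (3x3 max pool with padding=1, then 1x1 conv)": conv_out_size(
            conv_out_size(n, kernel=3, stride=1, padding=1), kernel=1
        ),
    }


for n in (28, 14, 7):
    sizes = branch_sizes(n)
    # Every branch must preserve the input size, or torch.cat would fail.
    assert all(size == n for size in sizes.values())
    print(n, sizes)
```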

4. Load the dataset

Before training the model, we need to load a dataset. The following is example code that loads the CIFAR-10 dataset:

transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

trainset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform_train)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)

testset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transform_test)
testloader = torch.utils.data.DataLoader(testset, batch_size=128, shuffle=False, num_workers=2)

The code above defines the preprocessing pipelines for the training and test sets and loads CIFAR-10 with the datasets.CIFAR10 class. torch.utils.data.DataLoader then wraps each dataset in an iterable data loader.
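Two numbers in this step are worth making concrete. Normalize computes (x - mean) / std per channel, so a mean and std of 0.5 map the [0, 1] range produced by ToTensor to [-1, 1]. With a batch size of 128, the 50,000 training images of CIFAR-10 yield 391 batches per epoch, the last one being smaller. The following plain-Python sketch illustrates both; the helper names are hypothetical.

```python
import math


def normalize(value, mean=0.5, std=0.5):
    """Same arithmetic as transforms.Normalize for a single value."""
    return (value - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Inverse mapping, useful before displaying a normalized image."""
    return value * std + mean


def batches_per_epoch(num_samples, batch_size):
    """Number of batches a DataLoader yields when drop_last is False."""
    return math.ceil(num_samples / batch_size)


print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
print(batches_per_epoch(50000, 128))  # 391
print(batches_per_epoch(10000, 128))  # 79
```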

5. Train the model

Once the dataset is loaded, we can start training. The following is example code that trains the GoogLeNet model:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
net = GoogLeNet(num_classes=10).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)

for epoch in range(100):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 100 == 99:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

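The code above first moves the model to the GPU if one is available, then defines the loss function, the optimizer, and the training loop. In each iteration, the inputs and labels are moved to the same device, the gradients are zeroed, the model output and loss are computed, and backpropagation and the parameter update follow. The average loss is printed every 100 mini-batches.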
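The training loop reports only the training loss. A minimal sketch of measuring top-1 accuracy on a held-out set could look like the following. The evaluate function is a hypothetical helper, and the toy model and loader are stand-ins so that the block runs on its own; with the objects defined above, the call would be evaluate(net, testloader, device).

```python
import torch
import torch.nn as nn


def evaluate(model, loader, device):
    """Return the top-1 accuracy of `model` over all batches in `loader`."""
    model.eval()  # use running BatchNorm statistics and disable Dropout
    correct = 0
    total = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            predicted = model(inputs).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Stand-in model and data (assumptions for illustration only).
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
toy_loader = [(torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))) for _ in range(3)]
accuracy = evaluate(toy_model, toy_loader, torch.device("cpu"))
print(f"accuracy: {accuracy:.3f}")
```

Because evaluate leaves the model in evaluation mode, call model.train() again before continuing to train.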

Example 1: Image classification with GoogLeNet

The following is example code that uses GoogLeNet to classify images:

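import matplotlib.pyplot as plt
import torchvision

# CIFAR-10 class names, in label order
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

def imshow(img):
    # Undo Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) and display the image grid
    img = img / 2 + 0.5
    plt.imshow(img.numpy().transpose((1, 2, 0)))
    plt.show()

# Fetch one batch from the test set
dataiter = iter(testloader)
images, labels = next(dataiter)

# Predict in evaluation mode, without tracking gradients
net.eval()
with torch.no_grad():
    outputs = net(images.to(device))
_, predicted = torch.max(outputs, 1)

# Print the predictions for the first four images
print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))

# Display the first four images
imshow(torchvision.utils.make_grid(images[:4]))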

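The code above first fetches a batch of images and labels from the test set, then uses the trained model to predict their classes and prints the predictions. Finally, it displays the images with the imshow function.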
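torch.max returns only the single most likely class. To see how confident the model is, the logits can be passed through a softmax and the top few classes listed. The following is a minimal sketch under stated assumptions: top_k_predictions is a hypothetical helper, and the logits tensor is made-up data standing in for the outputs of the model.

```python
import torch

# CIFAR-10 class names, in label order
CLASSES = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


def top_k_predictions(logits, class_names, k=3):
    """Return, for each sample, the k most likely (class name, probability) pairs."""
    probs = torch.softmax(logits, dim=1)  # convert logits to probabilities
    top_probs, top_idx = probs.topk(k, dim=1)
    return [
        [(class_names[i], p) for i, p in zip(idx_row.tolist(), prob_row.tolist())]
        for idx_row, prob_row in zip(top_idx, top_probs)
    ]


# Made-up logits for one image; in practice this would be the model output.
logits = torch.tensor([[0.1, 0.2, 0.0, 3.0, 0.3, 2.0, 0.1, 0.0, 0.2, 0.1]])
for name, prob in top_k_predictions(logits, CLASSES)[0]:
    print(f"{name:>5s}: {prob:.2f}")
```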

Example 2: Transfer learning with GoogLeNet

The following is example code that uses GoogLeNet for transfer learning:

# Load the pretrained model and replace its output layer
pretrained_net = torch.hub.load('pytorch/vision:v0.9.0', 'googlenet', pretrained=True)
pretrained_net.fc = nn.Linear(1024, 10)

# Move the model to the GPU (if available)
pretrained_net = pretrained_net.to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(pretrained_net.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)

# Train the model
for epoch in range(100):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        optimizer.zero_grad()
        outputs = pretrained_net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if i % 100 == 99:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

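The code above first loads the pretrained GoogLeNet model with torch.hub.load and replaces its output layer with a new fully connected layer. The model is then moved to the GPU if one is available, and the loss function and optimizer are defined. Finally, the model is trained, and the average loss is printed every 100 mini-batches.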
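The loop above fine-tunes every parameter of the pretrained network. A common variant is to freeze the pretrained layers and train only the new output layer. The sketch below shows that idea on a small stand-in network, so it runs without downloading weights. TinyNet and freeze_all_but_head are hypothetical names introduced for illustration; the only assumption carried over from the example is that the head is an attribute named fc.

```python
import torch.nn as nn
import torch.optim as optim


def freeze_all_but_head(model, head_name="fc"):
    """Disable gradients for every parameter outside the `head_name` submodule.

    Returns the list of parameters that remain trainable.
    """
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(head_name + ".")
    return [p for p in model.parameters() if p.requires_grad]


class TinyNet(nn.Module):
    """Stand-in network (illustration only) with an `fc` head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(8, 10)

    def forward(self, x):
        return self.fc(self.pool(self.features(x)).flatten(1))


model = TinyNet()
trainable = freeze_all_but_head(model)
# Only the trainable parameters are handed to the optimizer.
optimizer = optim.SGD(trainable, lr=0.01, momentum=0.9)
print(len(trainable))  # 2: fc.weight and fc.bias
```

Training only the head is cheaper per step, while fine-tuning all layers, as in the example, can adapt the features to the new dataset; which works better depends on the data.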
