A Detailed Guide to Running Python Code on a GPU

Introduction

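Python is a popular programming language, valued for its flexibility and ease of use. However, pure Python code is slow for compute-intensive tasks, because the interpreter runs numerical loops far more slowly than compiled code. Fortunately, GPU-accelerated libraries let Python programs offload that work to a GPU.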

This article shows how to use a GPU in three common deep learning frameworks: TensorFlow, PyTorch, and MXNet. It also covers how to set up a GPU in Google Colab and Anaconda.

Setting up the GPU

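Before using a GPU, first install the required packages. For TensorFlow, use the following command (the leading ! is for notebook cells; omit it in a shell). Since TensorFlow 2.1 the standard tensorflow package includes GPU support, and the separate tensorflow-gpu package on PyPI is deprecated:

!pip install tensorflow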

GPU support relies on Nvidia's CUDA and cuDNN libraries. Make sure the CUDA and cuDNN versions you install match the version of the deep learning framework you are using.
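Before matching CUDA and cuDNN versions, it can help to confirm that an Nvidia driver is installed at all. The sketch below is one possible check, using only the standard library and the nvidia-smi tool that ships with the driver. The helper name query_nvidia_gpus is made up for this illustration, and the output lists the driver version, not the CUDA toolkit or cuDNN version.

```python
import shutil
import subprocess


def query_nvidia_gpus():
    """Return one 'name, driver_version' string per GPU, or [] if unavailable.

    query_nvidia_gpus is a hypothetical helper written for this example.
    """
    # nvidia-smi is installed with the Nvidia driver; if it is not on PATH,
    # assume no usable driver is present.
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True, timeout=30,
        )
    except (subprocess.SubprocessError, OSError):
        # The tool exists but failed (for example, no GPU attached).
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = query_nvidia_gpus()
print(gpus if gpus else "No Nvidia GPU or driver detected")
```

An empty list means the framework will fall back to the CPU no matter which CUDA or cuDNN version is installed.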

Google Colab

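If you use Google Colab for a deep learning project, first select a GPU runtime (Runtime > Change runtime type, then choose a GPU hardware accelerator). The following code then checks whether TensorFlow can see the GPU: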

import tensorflow as tf

# check if tensorflow can see the GPU
device_name = tf.test.gpu_device_name() or "CPU"
print(f"Running on {device_name}")

If you see the following output, your Colab runtime supports GPU computation:

Running on /device:GPU:0
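As an alternative to the check above, TensorFlow 2 also exposes tf.config.list_physical_devices, which returns every visible device of a given type. The sketch below wraps it in a helper, list_tf_gpus, a name invented for this example. It assumes TensorFlow 2.1 or later and returns an empty list if TensorFlow is not installed.

```python
def list_tf_gpus():
    """Return the names of GPUs TensorFlow can see ([] if none or TF is missing).

    list_tf_gpus is a hypothetical helper written for this example.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # Each entry is a PhysicalDevice with a name such as '/physical_device:GPU:0'.
    return [gpu.name for gpu in tf.config.list_physical_devices("GPU")]


gpu_names = list_tf_gpus()
print(f"TensorFlow sees {len(gpu_names)} GPU(s): {gpu_names}")
```

Unlike the single device name printed above, this lists every GPU, which is useful on machines with more than one.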

Anaconda

If you use Anaconda for a deep learning project, you can create a GPU-enabled environment with the following command:

conda create --name gpu_env tensorflow-gpu

Then activate the environment with the following command:

conda activate gpu_env
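To confirm that the activated environment is the one Python is actually running in, a short check like the following may help. It relies on the CONDA_DEFAULT_ENV environment variable, which conda sets to the name of the active environment; the fallback string is only a placeholder chosen for this sketch.

```python
import os
import sys

# conda sets CONDA_DEFAULT_ENV to the name of the active environment.
active_env = os.environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)")
print(f"Active conda environment: {active_env}")

# The interpreter path should point inside the environment's directory.
print(f"Python interpreter: {sys.executable}")
```

If the environment name printed here is not gpu_env, packages installed into gpu_env will not be importable from this interpreter.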

Examples

TensorFlow

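Using a GPU in TensorFlow can greatly speed up training and testing. The following example trains a small network on MNIST. TensorFlow places operations on the GPU automatically when one is visible, so the code needs no device-specific calls.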

import tensorflow as tf

# create a simple neural network model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10)
])

# load MNIST dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# pre-process the data
x_train = x_train.reshape((60000, 784))
x_test = x_test.reshape((10000, 784))
x_train, x_test = x_train / 255.0, x_test / 255.0

# compile and fit the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5, batch_size=32, validation_data=(x_test, y_test))
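The example above relies on TensorFlow's automatic device placement. When explicit control is needed, operations can be wrapped in a tf.device context. The following is a minimal sketch, not part of the original example: the helper name matmul_on_best_device and the matrix size are made up for illustration, and the function returns None if TensorFlow is not installed.

```python
def matmul_on_best_device(n=256):
    """Multiply two random n x n matrices on the GPU if present, else the CPU.

    matmul_on_best_device is a hypothetical helper written for this example.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Pick the first GPU when TensorFlow can see one; otherwise use the CPU.
    device = "/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"
    with tf.device(device):
        a = tf.random.uniform((n, n))
        b = tf.random.uniform((n, n))
        product = tf.matmul(a, b)
    # product.device reports where the result tensor actually lives.
    return device, product.device, tuple(product.shape)


print(matmul_on_best_device())
```

Tensors created inside the with block are placed on the requested device, and the second element of the returned tuple shows where the result ended up.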

PyTorch

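Using a GPU shortens training and testing time for PyTorch models. The following example shows how to use a GPU in PyTorch. Note that the model and every batch of data must be moved to the device explicitly: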

import torch
import torchvision  # provides the MNIST dataset and transforms used below

# specify device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

# set up a simple neural network model
model = torch.nn.Sequential(
        torch.nn.Linear(784, 512),
        torch.nn.ReLU(),
        torch.nn.Dropout(0.2),
        torch.nn.Linear(512, 10))

model.to(device)

# load MNIST dataset
train_set = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./data', train=True, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize((0.1307,), (0.3081,))
                               ])),
    batch_size=32, shuffle=True)

test_set = torch.utils.data.DataLoader(
    torchvision.datasets.MNIST('./data', train=False, download=True,
                               transform=torchvision.transforms.Compose([
                                   torchvision.transforms.ToTensor(),
                                   torchvision.transforms.Normalize((0.1307,), (0.3081,))
                               ])),
    batch_size=32, shuffle=True)

# specify loss function and optimizer
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

# train and test the model
def train(epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_set):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data.view(data.shape[0], -1))
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()

        if batch_idx % 100 == 0:
            print(f"Train Epoch: {epoch} [{batch_idx * len(data)} / {len(train_set.dataset)} "\
                  f"({100. * batch_idx / len(train_set)}%)], Loss: {loss.item()}")

def test():
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_set:
            data, target = data.to(device), target.to(device)
            output = model(data.view(data.shape[0], -1))
            test_loss += loss_fn(output, target).item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()

    # loss_fn averages within each batch, so divide by the number of batches
    test_loss /= len(test_set)
    print(f"\nTest set: Average loss: {test_loss:.4f}, "\
          f"Accuracy: {correct}/{len(test_set.dataset)} ({100. * correct / len(test_set.dataset)}%)\n")


for epoch in range(1, 6):
    train(epoch)
    test()
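To see how much time the GPU saves, operations can be timed directly. CUDA kernels are launched asynchronously, so a fair measurement has to call torch.cuda.synchronize() before reading the clock. The sketch below is an illustration under assumptions: the helper name time_matmul, the matrix size, and the repeat count are arbitrary choices for this example, and the function returns None if PyTorch is not installed.

```python
import time


def time_matmul(n=512, repeats=5):
    """Return (device type, average seconds per n x n matmul), or None without torch.

    time_matmul is a hypothetical helper written for this example.
    """
    try:
        import torch
    except ImportError:
        return None
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    a = torch.rand(n, n, device=device)
    b = torch.rand(n, n, device=device)

    a @ b  # warm-up run so one-time setup cost is not measured
    if device.type == "cuda":
        torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued GPU kernels to finish
    return device.type, (time.perf_counter() - start) / repeats


print(time_matmul())
```

Running the same function on a machine with and without a GPU gives a rough comparison; small matrices may show little difference because transfer and launch overhead dominate.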

MXNet

The following example runs logistic (softmax) regression on a GPU with MXNet:

import mxnet as mx
from mxnet import nd, autograd, gluon

# set up a simple logistic regression model
model = gluon.nn.Sequential()
with model.name_scope():
    model.add(gluon.nn.Dense(10))

ctx = mx.gpu() if mx.context.num_gpus() > 0 else mx.cpu()

model.initialize(mx.init.Xavier(), ctx=ctx)

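# load MNIST dataset; convert images to float tensors and normalize them
transformer = gluon.data.vision.transforms.Compose([
    gluon.data.vision.transforms.ToTensor(),
    gluon.data.vision.transforms.Normalize(0.13, 0.31)])
batch_size = 32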
train_data = gluon.data.DataLoader(gluon.data.vision.MNIST(train=True).transform_first(transformer),
                                      batch_size=batch_size, shuffle=True, num_workers=2)
test_data = gluon.data.DataLoader(gluon.data.vision.MNIST(train=False).transform_first(transformer),
                                     batch_size=batch_size, num_workers=2)

# specify loss function and optimizer
cross_entropy_loss = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(model.collect_params(), 'sgd', {'learning_rate': 0.1})

# train and test the model
def train(epochs):
    for epoch in range(epochs):
        running_loss = 0.0
        for i, (data, label) in enumerate(train_data):
            data = data.as_in_context(ctx)
            label = label.as_in_context(ctx)
            with autograd.record():
                output = model(data)
                loss = cross_entropy_loss(output, label)

            loss.backward()
            trainer.step(batch_size)

            running_loss += nd.mean(loss).asscalar()

        print(f"Epoch {epoch+1}, train_loss: {running_loss / len(train_data)}")
        test()

def test():
    test_acc = mx.metric.Accuracy()
    for data, label in test_data:
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        output = model(data)
        predictions = nd.argmax(output, axis=1)
        test_acc.update(preds=predictions, labels=label)

    _, test_acc_value = test_acc.get()
    print(f"test_accuracy: {test_acc_value:.4f}")


train(5)

Summary

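In this article, we discussed how to use a GPU when running Python code. We showed how to run code on a GPU in TensorFlow, PyTorch, and MXNet, and gave instructions for enabling a GPU in Google Colab and Anaconda. Using a GPU can significantly speed up compute-intensive tasks.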

Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: 一文详解如何用GPU来运行Python代码 - Python技术站
