Automating Garbage Cleanup in a Harbor Private Image Registry with Python: A Detailed Guide

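Image management matters a great deal when working with containers. Images accumulate in a registry over time, so the registry should ideally be cleaned up automatically to keep stale images from piling up. Harbor is software for managing private Docker image registries, and its API makes it possible to automate that cleanup. This article walks through how to use Python to automatically clean up stale images in a private Harbor registry.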

Installing dependencies

First, install Python 3 and the Python requests library. On a Debian or Ubuntu system, the following commands will do it:

sudo apt-get update
sudo apt-get install python3
sudo apt-get install python3-pip
pip3 install requests

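Generating an access token

To delete stale images from Harbor automatically, we use the Harbor API. The first step is to generate a token for API access, which can be done with the following command: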

curl -u 'username:password' "https://harbor.example.com/service/token?account=admin&client_id=harbor&offline_token=true&service=harbor-registry" -k | jq -r '.token'

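Replace username and password with your Harbor administrator username and password, and replace harbor.example.com with the address of your Harbor instance. The -k option tells curl to skip TLS certificate verification. If jq is not installed, drop the trailing | jq -r '.token' and copy the value of the token field from the JSON output by hand.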
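The same request can be prepared from Python instead of curl. The following is a minimal sketch using only the standard library: build_token_url and extract_token are hypothetical helper names, and the query parameters are copied from the curl command above rather than taken from Harbor's documentation.

```python
import json
from urllib.parse import urlencode


def build_token_url(harbor_host, account="admin"):
    """Build the token endpoint URL with the same query string as the curl command."""
    params = {
        "account": account,
        "client_id": "harbor",
        "offline_token": "true",
        "service": "harbor-registry",
    }
    return "%s/service/token?%s" % (harbor_host.rstrip("/"), urlencode(params))


def extract_token(response_text):
    """Do what jq -r '.token' does: pull the "token" field out of the JSON body."""
    return json.loads(response_text)["token"]
```

With requests installed, the response body could be obtained with something like requests.get(build_token_url(host), auth=(username, password)).text and then passed to extract_token.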

The script needs this token, so store it in an environment variable or write it to a configuration file.
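One possible way to support both storage options is to check the environment first and fall back to a file. This is only a sketch: the file name harbor_config.json and its "token" key are assumptions made for illustration, while HARBOR_TOKEN is the variable name the cleanup script reads.

```python
import json
import os


def load_harbor_token(config_path="harbor_config.json", env_var="HARBOR_TOKEN"):
    """Return the token from the environment, else from a JSON config file.

    The config file layout ({"token": "..."}) is a hypothetical choice
    for this sketch. Returns None when no token can be found.
    """
    token = os.getenv(env_var)
    if token:
        return token
    try:
        with open(config_path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, json.JSONDecodeError):
        # Missing, unreadable, or malformed file: no token available.
        return None
    return data.get("token") if isinstance(data, dict) else None
```

Reading the environment first means a token exported in the shell takes precedence over the file, which is convenient for one-off manual runs.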

Writing the cleanup script

Next, we write a Python script that automatically removes stale images from Harbor. The full code is shown below:

# -*- coding: utf-8 -*-
import requests
import os
import json
import datetime

harbor_host = "https://harbor.example.com"

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer %s" % os.getenv("HARBOR_TOKEN")
}

def list_all_repo():
    url = harbor_host + "/api/repositories"
    res = requests.get(url=url, headers=headers, timeout=60, verify=False)
    if res.status_code != 200:
        print("Get repo list failed with HTTP Status: %s" % res.status_code)
        return None
    return json.loads(res.text)

def list_all_tags(repo):
    url = "%s/api/repositories/%s/tags" % (harbor_host, repo)
    res = requests.get(url=url, headers=headers, timeout=60, verify=False)
    if res.status_code != 200:
        print("Get tag list failed with HTTP Status: %s" % res.status_code)
        return None
    return json.loads(res.text)

def delete_tag(repo, tag):
    url = "%s/api/repositories/%s/tags/%s" % (harbor_host, repo, tag)
    res = requests.delete(url=url, headers=headers, timeout=60, verify=False)
    if res.status_code != 200:
        print("Delete tag %s failed with HTTP Status: %s" % (tag, res.status_code))
        return False
    return True

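if __name__ == '__main__':
    # list_all_repo() returns None on failure, so fall back to an empty list.
    repo_list = list_all_repo() or []
    for repo in repo_list:
        repo_name = repo.get("name")
        print("Cleaning repo: %s" % repo_name)
        # list_all_tags() may also return None; treat that as "no tags".
        for tag in list_all_tags(repo_name) or []:
            # This script assumes "created" is a Unix timestamp; tags whose
            # value cannot be parsed that way are skipped rather than deleted.
            try:
                timestamp = int(tag.get("created", "0"))
            except (TypeError, ValueError):
                continue
            if not timestamp:
                continue
            created_time = datetime.datetime.fromtimestamp(timestamp)
            days = (datetime.datetime.now() - created_time).days
            if days >= 30:
                print("Deleting tag: %s, created: %s, age: %s days" % (tag.get("name"), created_time.strftime("%Y/%m/%d"), days))
                delete_tag(repo_name, tag.get("name"))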

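The code above defines three functions:

list_all_repo: fetches the list of all repositories in Harbor.
list_all_tags: fetches the list of all image tags in a given repository.
delete_tag: deletes a given image tag from a given repository.

In the if __name__ == '__main__' branch, we first fetch the list of all repositories and then iterate over every image tag in each one. For each tag, we compute its age from its creation time and delete it if it is 30 days old or older.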
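The age check described above can also be pulled out into small functions that are easy to try without touching a real registry. The sketch below makes the same assumption as the script, namely that each tag is a dict whose "created" field is a Unix timestamp; tag_age_days and select_expired_tags are hypothetical names, not part of the Harbor API.

```python
import datetime


def tag_age_days(created_ts, now=None):
    """Return the whole number of days between a Unix timestamp and now."""
    now = now or datetime.datetime.now()
    created = datetime.datetime.fromtimestamp(created_ts)
    return (now - created).days


def select_expired_tags(tags, max_age_days=30, now=None):
    """Return the names of tags that are at least max_age_days old.

    Tags with a missing or unparsable "created" value are skipped,
    so they are never selected for deletion.
    """
    expired = []
    for tag in tags:
        try:
            ts = int(tag.get("created", 0))
        except (TypeError, ValueError):
            continue
        if ts and tag_age_days(ts, now) >= max_age_days:
            expired.append(tag.get("name"))
    return expired
```

Passing now explicitly keeps the result reproducible, and separating selection from deletion lets you inspect the list of candidates before calling delete_tag on each one.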

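Example usage

Here are two ways to use the code above:

Example 1: Scheduled cleanup with a cron job

cron is a tool for scheduling recurring tasks on Linux. The following sets up a job that cleans up stale images once a day: open the crontab for editing with the first command, then add the second line to it: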

crontab -e
0 0 * * * /usr/bin/python3 /path/to/cleanup.py

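This entry runs the /path/to/cleanup.py script every day at midnight (00:00). Note that cron jobs do not inherit the environment of your interactive shell, so HARBOR_TOKEN must also be made available to the job, for example by defining it in the crontab.

Example 2: Manual cleanup of all repositories

To clean up the image tags in all repositories by hand, run the script directly: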

python3 cleanup.py

When the script finishes, every image tag that is 30 days old or older will have been deleted.
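Because a manual run deletes tags immediately, it can be useful to preview the result first. The following is a hedged sketch of how a dry-run switch might be added with argparse; the --dry-run and --days options and the handle_tag helper are hypothetical additions, and delete_func stands in for the script's delete_tag function.

```python
import argparse


def parse_args(argv=None):
    """Parse the hypothetical command-line options for the cleanup script."""
    parser = argparse.ArgumentParser(description="Clean up old Harbor image tags")
    parser.add_argument("--days", type=int, default=30,
                        help="delete tags at least this many days old")
    parser.add_argument("--dry-run", action="store_true",
                        help="print what would be deleted without deleting")
    return parser.parse_args(argv)


def handle_tag(repo, tag, dry_run, delete_func):
    """Call delete_func(repo, tag) unless dry_run is set.

    Returns True if a delete was attempted, False if it was only reported.
    """
    if dry_run:
        print("Would delete %s:%s" % (repo, tag))
        return False
    delete_func(repo, tag)
    return True
```

With these pieces wired into the main loop, a command such as python3 cleanup.py --dry-run would list the candidates while leaving the registry untouched.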

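Summary

Automating garbage cleanup for a private Harbor registry with Python is not difficult. A short script is enough to remove expired image tags and keep stale images from piling up in the registry.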

Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: Python实现Harbor私有镜像仓库垃圾自动化清理详情 - Python技术站
