Scraping Web Page Images with Python: A Detailed Walkthrough

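Python offers many libraries and frameworks for scraping images from web pages. A widely used combination is requests for fetching pages and BeautifulSoup for parsing them; together they make it straightforward to download the images a page references. The full walkthrough follows.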

1. Import the required libraries

First, import the libraries we need: requests, BeautifulSoup, and os.

import requests
from bs4 import BeautifulSoup
import os

2. Fetch the page content

Next, use requests to fetch the page's HTML.

url = 'http://example.com'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses
html = response.text
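Some sites respond slowly or reject requests that lack a browser-like User-Agent, so it can help to wrap the fetch in a small helper. The following is a minimal sketch of one possible shape, not a required part of the workflow. The name fetch_html, the header value, and the 10-second timeout are all illustrative assumptions.

```python
import requests

# Hypothetical header value; any descriptive User-Agent string can be used.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; image-scraper-example)"}


def fetch_html(url, timeout=10.0, session=None):
    """Return the HTML text of `url`, raising on HTTP error status codes."""
    # Accept an existing requests.Session so connections can be reused.
    http = session if session is not None else requests
    response = http.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError for 4xx/5xx
    return response.text
```

With this helper, the fetch step becomes html = fetch_html(url). Passing a requests.Session as session is optional and mainly useful when many pages are fetched from the same host.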

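3. Parse the page content

With BeautifulSoup we can parse the HTML and collect the image links it contains. Because src values are often relative, each one is resolved against the page URL.

from urllib.parse import urljoin

soup = BeautifulSoup(html, 'html.parser')
img_tags = soup.find_all('img')
# Skip <img> tags without a src and resolve relative links against the page URL
urls = [urljoin(url, img['src']) for img in img_tags if img.get('src')]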
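The src values collected in this step are not always ready to download: they may be relative paths, protocol-relative links, inline data: URIs, or repeated several times on one page. The sketch below shows one way to clean them up using only the standard library. The helper name normalize_image_urls and the sample values are assumptions made for illustration.

```python
from urllib.parse import urljoin, urlparse


def normalize_image_urls(page_url, srcs):
    """Resolve raw src values against page_url, dropping unusable ones."""
    seen = set()
    result = []
    for src in srcs:
        if not src:
            continue  # <img> tag without a usable src
        absolute = urljoin(page_url, src.strip())
        # Keep only http(s) links; this skips inline data: URIs.
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        if absolute not in seen:  # preserve order while removing repeats
            seen.add(absolute)
            result.append(absolute)
    return result


# Sample src values of the kinds commonly found in real pages.
raw = ["/static/logo.png", "//cdn.example.com/a.jpg", "photo.gif",
       "data:image/png;base64,AAAA", None, "/static/logo.png"]
print(normalize_image_urls("https://example.com/gallery/", raw))
```

In the walkthrough this could be called as normalize_image_urls(url, [img.get('src') for img in img_tags]), since img.get('src') returns None for tags that have no src attribute.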

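4. Download the images

Now that we have the image links, use requests to download each image and save it under the file name taken from its URL.

for img_url in urls:
    if not img_url.startswith(('http://', 'https://')):
        continue  # skip data: URIs and links that were never made absolute
    # Name the file after the last path segment, ignoring any query string
    filename = os.path.basename(img_url.split('?')[0])
    if not filename:
        continue  # the URL does not end in a file name
    response = requests.get(img_url, timeout=10)
    with open(filename, 'wb') as f:
        f.write(response.content)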
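Taking the file name straight from the URL has a few pitfalls: query strings end up in the name, percent-escapes stay encoded, some URLs have no file name at all, and two images with the same name overwrite each other. The following is a minimal sketch of one way to handle these cases. Both helper names, the fallback name, and the _1, _2 suffix scheme are assumptions chosen for illustration.

```python
import os
from urllib.parse import unquote, urlparse


def filename_from_url(img_url, fallback="image"):
    """Derive a local file name from the path part of img_url."""
    path = urlparse(img_url).path            # drops ?query and #fragment
    name = os.path.basename(unquote(path))   # decode %20 and similar escapes
    return name or fallback


def unique_path(directory, name):
    """Return a path inside directory that does not clash with an existing file."""
    stem, ext = os.path.splitext(name)
    candidate = os.path.join(directory, name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, f"{stem}_{counter}{ext}")
        counter += 1
    return candidate
```

Inside the download loop, the target path could then be computed as unique_path('images', filename_from_url(img_url)), after creating the directory once with os.makedirs('images', exist_ok=True).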

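Example 1

Suppose we want to scrape the images on the Bing search home page. Set url to https://www.bing.com/ and follow the steps above. The rms_img class used to filter the tags comes from the page's markup, so it may need adjusting if the page changes.

import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

url = 'https://www.bing.com/'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses
html = response.text

soup = BeautifulSoup(html, 'html.parser')
# Keep only <img> tags that carry the rms_img class
img_tags = soup.find_all('img', class_='rms_img')
# Skip tags without a src and resolve relative links against the page URL
urls = [urljoin(url, img['src']) for img in img_tags if img.get('src')]

for img_url in urls:
    if not img_url.startswith(('http://', 'https://')):
        continue  # skip inline data: URIs
    # Name the file after the last path segment, ignoring any query string
    filename = os.path.basename(img_url.split('?')[0])
    if not filename:
        continue  # the URL does not end in a file name
    response = requests.get(img_url, timeout=10)
    with open(filename, 'wb') as f:
        f.write(response.content)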

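Example 2

Suppose we want to scrape food photos from Unsplash. Set url to https://unsplash.com/search/photos/food and follow the steps above.

import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

url = 'https://unsplash.com/search/photos/food'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses
html = response.text

soup = BeautifulSoup(html, 'html.parser')
img_tags = soup.find_all('img')
# Skip tags without a src and resolve relative links against the page URL
urls = [urljoin(url, img['src']) for img in img_tags if img.get('src')]

for img_url in urls:
    if not img_url.startswith(('http://', 'https://')):
        continue  # skip inline data: URIs
    # Name the file after the last path segment, ignoring any query string
    filename = os.path.basename(img_url.split('?')[0])
    if not filename:
        continue  # the URL does not end in a file name
    response = requests.get(img_url, timeout=10)
    with open(filename, 'wb') as f:
        f.write(response.content)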

That completes the walkthrough of scraping web page images with Python.

Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: Python 爬取网页图片详解流程 - Python技术站
