Fixing errors when using selenium's performance log to get network request logs (tested and working)

This is a complete walkthrough of how to fix the error raised when the selenium module reads network request logs through the performance log.

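Background

When using Python's selenium module, we can read the browser's performance log to obtain page performance data, which also includes the network request events. In some cases, however, reading that log raises an error.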

Common problem

Calling driver.get_log('performance') to fetch the performance data may fail with the following error:

selenium.common.exceptions.WebDriverException: Message: unknown error: performance
log type 'performance' not found

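This error generally means that performance logging was not enabled when the browser session was created. ChromeDriver only collects the performance log when it is requested through the logging-preferences capability. Since ChromeDriver 75 runs in W3C mode by default, that capability has to be passed under the vendor-prefixed name goog:loggingPrefs; the older loggingPrefs name is no longer recognized. As a result, code written for earlier versions can no longer read the performance log, and the network events inside it, through selenium's get_log() function.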

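Solution

We can enable the performance log through ChromeOptions and then read the Chrome DevTools Protocol events it records. The steps are:

Install the Chrome driver (chromedriver) and add its location to the system PATH environment variable.
Install the selenium module so that it can be called from Python code.
Download and install the latest version of the Chrome browser.
Retrieve the network request log through the Chrome DevTools Protocol.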

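import json

from selenium import webdriver


url = 'http://www.baidu.com'

# Create a Chrome browser instance
options = webdriver.ChromeOptions()
options.add_argument('--disable-infobars')
options.add_argument('--headless')
options.add_argument("--start-maximized")
options.add_argument("--disable-extensions")
options.add_argument("--disable-gpu")
prefs = {"download.default_directory": "d:\\temp", "profile.default_content_settings.popups": 0,
         "download.prompt_for_download": False, "download.directory_upgrade": True}
options.add_experimental_option("prefs", prefs)
# Enable the performance log; without this capability get_log('performance') fails
options.set_capability('goog:loggingPrefs', {'performance': 'ALL'})
options.add_argument("--disable-web-security")
options.add_argument("--allow-running-insecure-content")
options.add_argument("--window-size=1920,1080")
options.add_argument('--log-level=3')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
# chromedriver is found through PATH (step 1); recent Selenium 4 releases
# no longer accept the executable_path argument
browser = webdriver.Chrome(options=options)
try:
    browser.get(url)

    # Read the performance log; each entry's 'message' field is a JSON string
    # that wraps one DevTools event
    for entry in browser.get_log('performance'):
        event = json.loads(entry['message'])['message']
        if event['method'].startswith('Network.'):
            print(event['method'], event['params'].get('requestId'))
finally:
    browser.quit()

In this code we set several Chrome options, including hiding the browser's infobars and enabling headless mode. We also set the browser's prefs to specify the default download directory. Before the network request log can be read, the performance log must be enabled through the goog:loggingPrefs capability. browser.get_log('performance') then returns a list of dicts, and the 'message' field of each one is a JSON string containing a DevTools event. The Performance.getManifestly command that sometimes appears in examples of this technique is not part of the DevTools protocol, so it is not used here.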
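To make the entry format more concrete, here is a minimal sketch of unpacking performance-log entries into network events. It assumes the entries have the shape that ChromeDriver returns from get_log('performance'), that is, a dict whose 'message' field is a JSON string. The helper name extract_network_events and the sample entries are hypothetical and only serve as an illustration.

```python
import json


def extract_network_events(entries):
    """Return the DevTools events whose method starts with 'Network.'."""
    events = []
    for entry in entries:
        # The outer 'message' field is a JSON string; the event itself
        # sits under that string's own 'message' key
        event = json.loads(entry['message'])['message']
        if event.get('method', '').startswith('Network.'):
            events.append(event)
    return events


# Hypothetical sample entries, shaped like ChromeDriver's output
sample_entries = [
    {'level': 'INFO', 'timestamp': 0, 'message': json.dumps({
        'message': {'method': 'Network.requestWillBeSent',
                    'params': {'requestId': '1',
                               'request': {'url': 'https://www.baidu.com/'}}},
        'webview': 'hypothetical-id'})},
    {'level': 'INFO', 'timestamp': 1, 'message': json.dumps({
        'message': {'method': 'Page.loadEventFired', 'params': {}},
        'webview': 'hypothetical-id'})},
]

network_events = extract_network_events(sample_entries)
```

In a real session the list passed in would be the return value of browser.get_log('performance') rather than the sample data.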

Examples

The following two examples show how to get the network request log and how to send a POST request with requests.

Getting and saving the request log

import json

from selenium import webdriver


# Create a Chrome browser instance
options = webdriver.ChromeOptions()
options.add_argument('--disable-infobars')
options.add_argument('--headless')
prefs = {"download.default_directory": "d:\\temp", "profile.default_content_settings.popups": 0,
         "download.prompt_for_download": False, "download.directory_upgrade": True}
options.add_experimental_option("prefs", prefs)
# Enable the performance log so that get_log('performance') is available
options.set_capability('goog:loggingPrefs', {'performance': 'ALL'})
options.add_argument("--disable-web-security")
options.add_argument("--allow-running-insecure-content")
options.add_argument("--window-size=1920,1080")
options.add_argument('--log-level=3')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
browser = webdriver.Chrome(options=options)

try:
    # Visit the URL, then read the performance log
    url = 'https://www.baidu.com'
    browser.get(url)
    # Unpack each entry and keep only the Network.* events
    events = [json.loads(entry['message'])['message']
              for entry in browser.get_log('performance')]
    netlog = [event for event in events if event['method'].startswith('Network.')]

    # Save the network request log
    with open('network_log.json', 'w', encoding='utf-8') as f:
        json.dump(netlog, f, ensure_ascii=False, indent=2)

finally:
    browser.quit()

In this example we visit the Baidu site, use the selenium module to drive Chrome and collect the page's network request events from the performance log, and save them as a JSON list in the file network_log.json.
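Once the log is saved, a common next step is to match each request with its response. The following is a rough sketch under the assumption that the saved data is a list of DevTools events with 'method' and 'params' keys, as Network.requestWillBeSent and Network.responseReceived provide. The function name summarize_requests and the sample events are hypothetical.

```python
def summarize_requests(events):
    """Map each requestId to its URL and HTTP status (None if not logged)."""
    summary = {}
    for event in events:
        params = event.get('params', {})
        request_id = params.get('requestId')
        method = event.get('method')
        if method == 'Network.requestWillBeSent':
            record = summary.setdefault(request_id, {'url': None, 'status': None})
            record['url'] = params['request']['url']
        elif method == 'Network.responseReceived':
            record = summary.setdefault(request_id, {'url': None, 'status': None})
            record['status'] = params['response']['status']
    return summary


# Hypothetical events in the shape the DevTools Network domain uses
sample_events = [
    {'method': 'Network.requestWillBeSent',
     'params': {'requestId': '1', 'request': {'url': 'https://www.baidu.com/'}}},
    {'method': 'Network.responseReceived',
     'params': {'requestId': '1', 'response': {'status': 200}}},
    {'method': 'Network.requestWillBeSent',
     'params': {'requestId': '2', 'request': {'url': 'https://www.baidu.com/favicon.ico'}}},
]

request_summary = summarize_requests(sample_events)
```

If network_log.json holds such a list, loading it with json.load and passing the result to summarize_requests would give one record per request; a request with no logged response keeps a status of None.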

Sending a POST request with the requests module

import json

import requests
from selenium import webdriver


# Create a Chrome browser instance
options = webdriver.ChromeOptions()
options.add_argument('--disable-infobars')
options.add_argument('--headless')
prefs = {"download.default_directory": "d:\\temp", "profile.default_content_settings.popups": 0,
         "download.prompt_for_download": False, "download.directory_upgrade": True}
options.add_experimental_option("prefs", prefs)
# Enable the performance log so that get_log('performance') is available
options.set_capability('goog:loggingPrefs', {'performance': 'ALL'})
options.add_argument("--disable-web-security")
options.add_argument("--allow-running-insecure-content")
options.add_argument("--window-size=1920,1080")
options.add_argument('--log-level=3')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
browser = webdriver.Chrome(options=options)

try:
    # Visit the URL, then read the performance log
    url = 'https://www.baidu.com'
    browser.get(url)
    netlog = browser.get_log('performance')

    # Send the POST request and keep the response
    headers = {'Content-Type': 'application/json'}
    data = {"param1": "value1", "param2": "value2"}
    post_url = 'https://www.example.com/'
    r = requests.post(post_url, headers=headers, data=json.dumps(data), timeout=10)
    print(r.status_code)

finally:
    browser.quit()

In this example we use the selenium module to drive Chrome and read the page's performance log, and we use the requests module to send a POST request. The request body is encoded as JSON, and the Content-Type header is set to application/json. The response to the POST request is stored in the variable r, which can be processed further afterwards.
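As one possible form of that further processing, the sketch below summarizes a response object. It assumes the object exposes status_code, headers, text and json() the way a requests response does. The function describe_response and the FakeResponse stand-in are hypothetical; the stand-in exists only so that the sketch runs without a network connection.

```python
import json


def describe_response(resp):
    """Summarize a requests-style response: status, success flag, and body."""
    content_type = resp.headers.get('Content-Type', '')
    # Decode the body as JSON only when the server says it is JSON
    if 'application/json' in content_type:
        body = resp.json()
    else:
        body = resp.text
    return {'status': resp.status_code,
            'ok': 200 <= resp.status_code < 300,
            'body': body}


class FakeResponse:
    """Hypothetical stand-in exposing the attributes used above."""
    status_code = 200
    headers = {'Content-Type': 'application/json'}
    text = '{"result": "ok"}'

    def json(self):
        return json.loads(self.text)


response_summary = describe_response(FakeResponse())
```

With the example above, describe_response(r) would be called after requests.post returns.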
