An Example of Scraping Data with Python and Writing It to a MySQL Database

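This guide shows how to scrape data with Python and write it to a MySQL database. It relies on three third-party libraries: requests to fetch pages, BeautifulSoup to parse the HTML, and pymysql to talk to MySQL. All three can be installed with pip install requests beautifulsoup4 pymysql.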

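Step 1: Create the database and table

Before writing any scraping code, we need a database and a table to hold the results. The following SQL statements create a database named example and a table named example_table: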

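-- utf8mb4 matches the charset that the Python connection in Step 2 requests
CREATE DATABASE example DEFAULT CHARACTER SET utf8mb4;
USE example;
CREATE TABLE example_table (
    id INT NOT NULL AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,
    content TEXT NOT NULL,
    PRIMARY KEY (id)
);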
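
If you would rather create the schema from Python than from a MySQL client, the same statements can be sent through a database connection. The sketch below is only an illustration: create_schema and SCHEMA_STATEMENTS are hypothetical names, IF NOT EXISTS is added so the helper can be run more than once, and conn is assumed to be a connection opened with pymysql.connect() without the database argument. It sends the statements one at a time, which works whether or not multi-statement queries are enabled.

```python
# DDL statements from Step 1, made repeatable with IF NOT EXISTS
# (an assumption of this sketch, not part of the original SQL).
SCHEMA_STATEMENTS = [
    "CREATE DATABASE IF NOT EXISTS example",
    "USE example",
    """
    CREATE TABLE IF NOT EXISTS example_table (
        id INT NOT NULL AUTO_INCREMENT,
        title VARCHAR(255) NOT NULL,
        content TEXT NOT NULL,
        PRIMARY KEY (id)
    )
    """,
]


def create_schema(conn):
    """Run each DDL statement on a DB-API connection, one execute() call per statement."""
    cursor = conn.cursor()
    try:
        for statement in SCHEMA_STATEMENTS:
            cursor.execute(statement)
        conn.commit()
    finally:
        # Always release the cursor, even if a statement fails
        cursor.close()
```

Against a real server it might be called as create_schema(pymysql.connect(host='localhost', user='root', password='password')), with the credentials replaced by your own.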

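Step 2: Use the requests, BeautifulSoup, and pymysql libraries

With the database and table in place, we can write the code that scrapes data and stores it in MySQL. The following example scrapes a single page and inserts the result: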

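import requests
from bs4 import BeautifulSoup
import pymysql

# Connect to the database (replace the credentials with your own)
conn = pymysql.connect(host='localhost', user='root', password='password', database='example', charset='utf8mb4')
cursor = conn.cursor()

# Fetch and parse the page
url = 'https://www.example.com'
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop on HTTP 4xx/5xx responses
soup = BeautifulSoup(response.text, 'html.parser')
# find() returns None when no element matches, so check before reading .text
title_tag = soup.find('title')
content_tag = soup.find('div', {'class': 'content'})
title = title_tag.text if title_tag is not None else ''
content = content_tag.text if content_tag is not None else ''

# Insert the row; pymysql fills the %s placeholders and escapes the values
sql = "INSERT INTO example_table (title, content) VALUES (%s, %s)"
cursor.execute(sql, (title, content))
conn.commit()

# Close the cursor and the connection
cursor.close()
conn.close()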

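The code above first imports requests, BeautifulSoup, and pymysql. It then connects to MySQL with pymysql and sends a GET request with requests to fetch the page. BeautifulSoup parses the returned HTML, and the find method locates the title and the content element. Note that find returns None when nothing matches, so its result should be checked before .text is read. The sql variable holds the INSERT statement; execute fills its %s placeholders from the (title, content) tuple, which lets pymysql escape the values instead of building the SQL string by hand. commit then makes the change permanent, and close releases the cursor and the connection.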
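
To see how find behaves without touching the network or the database, the parsing step can be tried on an inline HTML string. This is a minimal sketch: extract_fields and the sample markup are made up for illustration, and returning None for an incomplete page is just one possible policy.

```python
from bs4 import BeautifulSoup


def extract_fields(html):
    """Return (title, content) from an HTML string, or None if either element is missing."""
    soup = BeautifulSoup(html, 'html.parser')
    title_tag = soup.find('title')
    content_tag = soup.find('div', {'class': 'content'})
    # find() returns None when nothing matches, so check before reading the text
    if title_tag is None or content_tag is None:
        return None
    return title_tag.get_text(strip=True), content_tag.get_text(strip=True)


# Hypothetical sample page used only for this demonstration
sample = '<html><head><title> Demo </title></head><body><div class="content">Hello</div></body></html>'
print(extract_fields(sample))  # ('Demo', 'Hello')
print(extract_fields('<html><body>no matching elements</body></html>'))  # None
```

A caller can then decide whether a None result means skipping the page, logging it, or stopping the run.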

The next example scrapes several pages and writes each one to the MySQL database:

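import requests
from bs4 import BeautifulSoup
import pymysql

# Connect to the database (replace the credentials with your own)
conn = pymysql.connect(host='localhost', user='root', password='password', database='example', charset='utf8mb4')
cursor = conn.cursor()

# Fetch and parse each page in turn
urls = ['https://www.example.com/page1', 'https://www.example.com/page2', 'https://www.example.com/page3']
for url in urls:
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop on HTTP 4xx/5xx responses
    soup = BeautifulSoup(response.text, 'html.parser')
    # find() returns None when no element matches, so skip incomplete pages
    title_tag = soup.find('title')
    content_tag = soup.find('div', {'class': 'content'})
    if title_tag is None or content_tag is None:
        continue
    title = title_tag.text
    content = content_tag.text

    # Insert the row for this page and commit it
    sql = "INSERT INTO example_table (title, content) VALUES (%s, %s)"
    cursor.execute(sql, (title, content))
    conn.commit()

# Close the cursor and the connection
cursor.close()
conn.close()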

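This version follows the same pattern as the first example, except that a for loop iterates over a list of URLs. For each URL, requests sends a GET request, BeautifulSoup parses the returned HTML, and find locates the title and the content element. The same parameterized INSERT statement is then run with execute and committed with commit. Because commit is called inside the loop, each page is stored as soon as it has been processed, so rows inserted before a later failure stay in the table. Once the loop finishes, close releases the cursor and the connection.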
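
An alternative to committing once per page is to collect all rows first and insert them in a single transaction with executemany. The sketch below shows that variant under stated assumptions: crawl_and_save and fetch_page are hypothetical names, fetch_page is expected to return a (title, content) tuple or raise an exception, and conn is any DB-API connection such as the one returned by pymysql.connect().

```python
def crawl_and_save(urls, fetch_page, conn):
    """Fetch every URL, skip the ones that fail, and insert the rest in one transaction.

    Returns the list of URLs that could not be fetched.
    """
    sql = "INSERT INTO example_table (title, content) VALUES (%s, %s)"
    rows, failed = [], []
    for url in urls:
        try:
            rows.append(fetch_page(url))
        except Exception:
            # Broad catch for the sketch: remember the failure and move on
            failed.append(url)
    if rows:
        cursor = conn.cursor()
        try:
            cursor.executemany(sql, rows)
            conn.commit()
        except Exception:
            # Leave the table unchanged if the batch insert fails
            conn.rollback()
            raise
        finally:
            cursor.close()
    return failed
```

Here fetch_page would wrap the requests.get and find calls from the example above. The trade-off is that nothing is stored until every page has been fetched, in exchange for a single commit and an all-or-nothing insert.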

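Summary

This guide showed how to scrape data with Python and write it to a MySQL database using requests, BeautifulSoup, and pymysql. It gave two examples, one that scrapes a single page and one that scrapes several pages. Both follow the same sequence: fetch the page, parse the HTML, insert the extracted fields with a parameterized statement, and commit.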

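Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: An Example of Scraping Data with Python and Writing It to a MySQL Database - Python技术站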
