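How to Scrape 51CTO Data with Python and Store It in MySQL

This guide shows how to scrape article data from 51CTO with Python and store it in MySQL. It uses three libraries: requests to fetch pages, BeautifulSoup to parse the HTML, and pymysql to write to the database.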

Installing requests, BeautifulSoup, and pymysql

All three libraries must be installed before use. The following commands install them:

pip install requests
pip install beautifulsoup4
pip install pymysql
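
To confirm the installation succeeded, the installed versions can be looked up through the standard library. This is a minimal sketch: the helper name installed_versions is hypothetical, and the distribution names are simply the ones used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as used in the pip commands above
REQUIRED = ("requests", "beautifulsoup4", "pymysql")


def installed_versions(names=REQUIRED):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any library reported as NOT INSTALLED needs its pip command to be run again in the same Python environment.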

Scraping 51CTO Data and Storing It in MySQL

The following example shows how to use requests, BeautifulSoup, and pymysql together to scrape 51CTO data and store it in MySQL:

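import requests
from bs4 import BeautifulSoup
import pymysql


def text_of(tag):
    # Return the tag's stripped text, or None if find() did not match anything
    return tag.get_text(strip=True) if tag is not None else None


# Scrape 51CTO data and store it in MySQL
def crawl_51cto_data():
    # Connect to the MySQL database (replace the placeholder credentials)
    conn = pymysql.connect(host='localhost', port=3306, user='root', password='your_password',
                           database='your_database', charset='utf8mb4')
    cursor = conn.cursor()
    sql = "INSERT INTO article(title, author, publish_time, content) VALUES (%s, %s, %s, %s)"

    try:
        # Scrape list pages 1 through 10
        for i in range(1, 11):
            url = 'https://blog.51cto.com/original/page/' + str(i)
            r = requests.get(url, timeout=10)
            r.raise_for_status()  # stop on HTTP errors instead of parsing an error page
            soup = BeautifulSoup(r.text, 'html.parser')
            # The class names below follow this example; adjust them if the site's markup differs
            article_list = soup.find_all('div', class_='artical_item')
            for article in article_list:
                title = text_of(article.find('h2'))
                author = text_of(article.find('span', class_='name'))
                publish_time = text_of(article.find('span', class_='time'))
                content = text_of(article.find('div', class_='artical_content'))

                # Store the record in MySQL using a parameterized query
                cursor.execute(sql, (title, author, publish_time, content))
                conn.commit()
    finally:
        # Close the cursor and the connection even if a request or query fails
        cursor.close()
        conn.close()


# Run the scraper
crawl_51cto_data()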
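
The script inserts into a table named article, so that table has to exist before the first run. The sketch below creates it if it is missing. The column names come from the INSERT statement, but the column types and sizes, the id column, and the helper name ensure_article_table are assumptions made for illustration, not part of the original example.

```python
# Hypothetical schema: column names come from the INSERT statement,
# but the types, sizes, and the id column are assumptions.
CREATE_ARTICLE_TABLE = """
CREATE TABLE IF NOT EXISTS article (
    id INT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255),
    author VARCHAR(100),
    publish_time VARCHAR(50),
    content TEXT
) DEFAULT CHARSET = utf8mb4
"""


def ensure_article_table(conn):
    """Create the article table if it does not exist yet.

    conn is an open connection, such as the one returned by pymysql.connect().
    """
    cursor = conn.cursor()
    try:
        cursor.execute(CREATE_ARTICLE_TABLE)
        conn.commit()
    finally:
        cursor.close()
```

Calling ensure_article_table(conn) once, right after connecting, would be enough. publish_time is kept as text here because the scraped value is an unparsed string; a DATETIME column would require converting it first.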

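The code above defines a crawl_51cto_data() function. It connects to MySQL with pymysql and fetches the 51CTO blog list pages with requests. BeautifulSoup parses each HTML document, and find_all() collects every div whose class is 'artical_item'. For each of those, find() locates the article's title, author, publish time, and content, which are then written to the database: cursor.execute() runs the SQL statement and conn.commit() commits the transaction. When the work is done, cursor.close() closes the cursor and conn.close() closes the database connection. The final line calls crawl_51cto_data() to run the scrape and store the results.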
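
As a variation on the storage step, the rows collected from one page can be written in a single batch and rolled back if anything fails. This is only a sketch: save_articles is a hypothetical helper, and it assumes conn is an open pymysql connection and that each row is a (title, author, publish_time, content) tuple.

```python
INSERT_ARTICLE = (
    "INSERT INTO article(title, author, publish_time, content) "
    "VALUES (%s, %s, %s, %s)"
)


def save_articles(conn, rows):
    """Insert (title, author, publish_time, content) tuples in one batch.

    Returns the number of rows handed to the database. On any error the
    batch is rolled back and the exception is re-raised.
    """
    rows = list(rows)
    if not rows:
        return 0
    cursor = conn.cursor()
    try:
        cursor.executemany(INSERT_ARTICLE, rows)
        conn.commit()
    except Exception:
        # Undo the partial batch so the table is not left half-written
        conn.rollback()
        raise
    finally:
        cursor.close()
    return len(rows)
```

With this helper, the inner loop would append each tuple to a list and call save_articles(conn, rows) once per page, so a failure affects only that page's batch.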

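Conclusion

This guide covered how to scrape 51CTO data with Python and store it in MySQL: installing requests, BeautifulSoup, and pymysql, then combining them in a single example script that fetches the list pages, extracts each article's fields, and inserts them into a database table.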

Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: Python如何爬取51cto数据并存入MySQL - Python技术站
