Speed up HTTP requests in Python and 500 errors

April 6, 2023, 11:16 AM • Python Development

[Question title]: Speed up an HTTP request in Python and a 500 error
[Posted]: 2023-04-04 14:36:01
[Question description]:

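I have a script that retrieves news results from this newspaper, given a query and a time frame (which could be up to a year).

The results are paginated at up to 10 articles per page. Since I couldn't find a way to increase that, I issue one request per page and then retrieve the title, URL, and date of each article. Each cycle (the HTTP request plus the parsing) takes from 30 seconds to a minute, which is extremely slow, and eventually the run stops with a response code of 500. Is there a way to speed this up, or to make multiple requests at once? I simply want to retrieve the article details from all of the pages.
Here is the code: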

    import csv
    import re

    import requests
    from bs4 import BeautifulSoup

    URL = 'http://www.gulf-times.com/AdvanceSearchNews.aspx?Pageindex={index}&keywordtitle={query}&keywordbrief={query}&keywordbody={query}&category=&timeframe=&datefrom={datefrom}&dateTo={dateto}&isTimeFrame=0'


    def run(**params):
        # newline="" lets the csv module control line endings itself.
        with open("EgyptDaybyDay.csv", "a", newline="", encoding="utf-8") as country_file:
            # Create the writer once instead of once per article.
            writer = csv.writer(country_file, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
            session = requests.Session()  # reuse one connection across pages
            index = 1
            while True:
                params["index"] = str(index)
                # Generous timeout, since a page can take up to a minute.
                response = session.get(URL.format(**params), timeout=120)
                print(response.status_code)
                if response.status_code != 200:
                    break  # e.g. the 500 that ends the run
                soup = BeautifulSoup(response.content, "html.parser")
                articles = soup.find_all("div", {"class": "newslist"})
                if not articles:
                    break  # an empty page means there are no more results

                for article in articles:
                    url = article.a['href']
                    title = article.img['alt']
                    dateline = article.find("div", {"class": "floatright"})
                    # Pull the NN-NN-NNNN date out of the dateline text.
                    m = re.search(r"([0-9]{2}-[0-9]{2}-[0-9]{4})", dateline.get_text())
                    date = m.group(1) if m else ""
                    writer.writerow((date, title, url))
                index += 1


    run(query="Egypt", datefrom="12-01-2010", dateto="12-01-2011")
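
For the 500 responses, one option is to retry the failing page after a pause instead of giving up. The sketch below is only an illustration: `get_with_retry`, its parameters, and the `fetch` callable are hypothetical names that are not part of the script above, and it assumes a 5xx from this site is transient, which is not confirmed.

```python
import time


def get_with_retry(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a status below 500 or attempts run out.

    fetch is a caller-supplied (hypothetical) zero-argument function that
    returns an object with a .status_code attribute, such as a requests
    response. attempts is assumed to be at least 1.
    """
    response = None
    for attempt in range(attempts):
        response = fetch()
        if response.status_code < 500:
            return response
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
    return response  # still a 5xx after the last attempt
```

Inside `run`, the request could be wrapped as `get_with_retry(lambda: requests.get(URL.format(**params)))`. requests can also retry at the transport level by mounting an `HTTPAdapter` whose `max_retries` is a urllib3 `Retry` object with a `status_forcelist`, which may be simpler if it fits the setup.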

[Discussion]:
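
On making multiple requests at once: if most of the 30 to 60 seconds per cycle is spent waiting on the server, a thread pool can overlap several page requests. This is a minimal sketch, assuming the server tolerates a few parallel requests (untested against this site, and too many at once may make the 500s more frequent). `fetch_all_pages` and the `fetch_page` callable are hypothetical names; `fetch_page(index)` is assumed to request and parse one results page and return its rows, or an empty list when the page has no articles.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_pages(fetch_page, batch_size=5, max_pages=1000):
    """Call fetch_page(index) for index = 1, 2, ... in concurrent batches.

    Pages are requested batch_size at a time. Collection stops at the
    first empty page or after max_pages, whichever comes first.
    """
    rows = []
    first = 1
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        while first <= max_pages:
            last = min(first + batch_size - 1, max_pages)
            # pool.map returns results in input order, so pages stay in sequence.
            pages = list(pool.map(fetch_page, range(first, last + 1)))
            for page_rows in pages:
                if not page_rows:
                    return rows  # first empty page: end of the results
                rows.extend(page_rows)
            first = last + 1
    return rows
```

Because the last page is not known in advance, up to `batch_size - 1` requests past the end may be issued and discarded. In the script from the question, `fetch_page` would do the `requests.get` and BeautifulSoup parsing for one page and return `(date, title, url)` tuples, with the CSV writing done once on the returned list.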

Tags:
python
http
web-scraping
web-crawler
