A Summary of Common Methods in Python's re Library for Regular Expressions

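Regular expressions are a powerful tool for matching, finding, and replacing patterns in text. In Python, the re module provides a set of functions for working with them. This guide walks through the most commonly used ones: search(), match(), findall(), and sub().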

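The search() method

search() scans a string for the first location where the regular expression matches. If a match is found, it returns a Match object; otherwise it returns None. Here is an example: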

import re

text = 'The quick brown fox jumps over the lazy dog.'
pattern = r'fox'
result = re.search(pattern, text)
if result:
    print('Match found:', result.group())
else:
    print('Match not found')

In the code above, the regular expression fox is matched against the substring fox in the text. Running the code prints Match found: fox.
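To make the "first match" behaviour concrete, here is a minimal sketch that reuses the same sentence with a different, purely illustrative pattern (o\w is an assumption chosen because it occurs more than once). It also shows span(), which reports where the match sits in the string.

```python
import re

text = 'The quick brown fox jumps over the lazy dog.'

# search() scans left to right and stops at the first match,
# even though 'o' followed by a word character occurs several times.
m = re.search(r'o\w', text)
first_match = m.group()   # the matched text: 'ow' (from 'brown')
first_span = m.span()     # (start, end) indices into text: (12, 14)

# A pattern that occurs nowhere in the string yields None, not an exception.
missing = re.search(r'cat', text)

print(first_match, first_span, missing)
```

Because search() can return None, calling .group() directly on its result is only safe when a match is guaranteed; otherwise check the result first, as the example above does.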

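The match() method

match() tries to match the regular expression only at the beginning of the string. If the match succeeds, it returns a Match object; otherwise it returns None. Here is an example: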

import re

text = 'The quick brown fox jumps over the lazy dog.'
pattern = r'The'
result = re.match(pattern, text)
if result:
    print('Match found:', result.group())
else:
    print('Match not found')

In the code above, the regular expression The is matched at the beginning of the string. Running the code prints Match found: The.
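The difference between match() and search() is easiest to see when the pattern does not sit at the start of the string. The sketch below compares the two on the same sentence, and adds re.fullmatch(), which is not covered in this article but is part of the same module and requires the pattern to cover the entire string.

```python
import re

text = 'The quick brown fox jumps over the lazy dog.'

# match() only tries position 0, so 'fox' in the middle is not found.
at_start = re.match(r'fox', text)

# search() looks at every position and finds it.
anywhere = re.search(r'fox', text)

# fullmatch() succeeds only if the pattern consumes the whole string.
whole = re.fullmatch(r'The.*dog\.', text)
partial = re.fullmatch(r'The', text)

print(at_start)            # None
print(anywhere.group())    # fox
print(whole is not None)   # True
print(partial)             # None
```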

The findall() method

findall() finds all non-overlapping substrings that match the regular expression and returns them as a list. Here is an example:

import re

text = 'The price is $1099.'
pattern = r'\d+'
result = re.findall(pattern, text)
if result:
    print('Matches found:', result)
else:
    print('Matches not found')

In the code above, the regular expression \d+ matches runs of digits in the string, and findall() returns every match. Running the code prints Matches found: ['1099'].
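What findall() returns depends on how many capture groups the pattern has, which is a common source of surprise. The following sketch uses a made-up sentence (an assumption for illustration) to show the three cases: no group, one group, and several groups.

```python
import re

# Hypothetical sample text, not from the article.
text = 'Apples cost $3, pears cost $12.'

# No capture group: each list item is the full match.
no_group = re.findall(r'\$\d+', text)

# One capture group: each list item is just the captured part.
one_group = re.findall(r'\$(\d+)', text)

# Two or more groups: each list item is a tuple of the captured parts.
two_groups = re.findall(r'(\w+) cost \$(\d+)', text)

print(no_group)     # ['$3', '$12']
print(one_group)    # ['3', '12']
print(two_groups)   # [('Apples', '3'), ('pears', '12')]
```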

The sub() method

sub() searches the string for every match of the regular expression and replaces each one with the given string. Here is an example:

import re

text = 'The price is $1099.'
pattern = r'\d+'
replacement = 'XXXX'
result = re.sub(pattern, replacement, text)
print('Result:', result)

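In the code above, the regular expression \d+ matches the digits in the string, and each match is replaced with XXXX. sub() returns the resulting string and leaves the original unchanged. Running the code prints Result: The price is $XXXX.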
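Beyond a fixed replacement string, sub() also accepts a count limit, backreferences to capture groups, and a function as the replacement. The sketch below illustrates all three on a made-up sentence; the helper name double is hypothetical.

```python
import re

# Hypothetical sample text, not from the article.
text = 'Items cost $1099 and $25.'

# count=1 replaces only the first match.
masked_first = re.sub(r'\d+', 'XXXX', text, count=1)

# \1 in the replacement refers to the first capture group.
swapped = re.sub(r'\$(\d+)', r'\1 USD', text)

# A function receives each Match object and returns the replacement text.
def double(match):
    return str(int(match.group()) * 2)

doubled = re.sub(r'\d+', double, text)

print(masked_first)  # Items cost $XXXX and $25.
print(swapped)       # Items cost 1099 USD and 25 USD.
print(doubled)       # Items cost $2198 and $50.
```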

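Those are the commonly used functions in Python's re module: search(), match(), findall(), and sub(). They come up constantly in regular-expression work in Python, and the examples above should help make their use clearer.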
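When the same pattern is used repeatedly, it can be compiled once with re.compile(), which this article does not cover but which belongs to the same module. The compiled object exposes the same four operations as methods. A minimal sketch, with a made-up sentence as input:

```python
import re

# Compile once, reuse many times.
number = re.compile(r'\d+')

# Hypothetical sample text, not from the article.
text = 'Order 66 shipped 3 boxes on day 14.'

first = number.search(text).group()   # first match only
at_start = number.match(text)         # None: the text starts with 'Order'
all_numbers = number.findall(text)    # every match as a list
masked = number.sub('#', text)        # every match replaced

print(first)        # 66
print(at_start)     # None
print(all_numbers)  # ['66', '3', '14']
print(masked)       # Order # shipped # boxes on day #.
```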

Example 1: Matching the text content of an HTML tag

The following example shows how to use a regular expression to match the text inside an HTML tag:

import re

text = '<h1>Welcome to my website</h1>'
pattern = r'<.*?>(.*?)</.*?>'
result = re.search(pattern, text)
if result:
    print('Match found:', result.group(1))
else:
    print('Match not found')

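In the code above, the regular expression <.*?>(.*?)</.*?> matches the text content of an HTML tag. The leading <.*?> matches the opening tag, and .*? means "zero or more of any character, non-greedily", so it stops at the first > it can. The .*? inside the parentheses is the capture group that holds the text, and </.*?> matches the closing tag. search() finds the first match in the string. Running the code prints Match found: Welcome to my website.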
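The pattern above accepts any closing tag, even one that does not pair with the opening tag. One possible refinement, shown here as a sketch rather than as the article's method, captures the tag name and uses the backreference \1 to require the same name in the closing tag. The sample HTML is made up for illustration.

```python
import re

# Hypothetical sample markup, not from the article.
html = '<h1>Welcome</h1><p>First paragraph</p>'

# (\w+)  captures the tag name
# [^>]*  skips any attributes
# (.*?)  captures the text, non-greedily
# </\1>  requires a closing tag with the same name as group 1
pattern = r'<(\w+)[^>]*>(.*?)</\1>'

pairs = re.findall(pattern, html)
print(pairs)  # [('h1', 'Welcome'), ('p', 'First paragraph')]

# A mismatched closing tag no longer matches.
mismatched = re.search(pattern, '<h1>Welcome</h2>')
print(mismatched)  # None
```

This still only handles simple, non-nested markup. For real pages, a proper HTML parser is usually the more robust choice.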

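Example 2: Extracting image links from a web page

Here is another example, showing how to use a regular expression to extract the image links from a web page: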

import re
import requests

url = 'https://www.example'
response = requests.get(url)
html = response.text
pattern = r'<img.*?src="(.*?)".*?>'
result = re.findall(pattern, html)
if result:
    print('Matches found:', result)
else:
    print('Matches not found')

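In the code above, the regular expression <img.*?src="(.*?)".*?> matches image links in the page. <img.*?> matches an img tag, src="(.*?)" matches the link in the src attribute, and the parentheses capture that link as a group. Because the pattern contains exactly one capture group, findall() returns a list of the captured links rather than the full tags. Running the code prints the list of matched image links.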
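To try the extraction without a network request, the sketch below runs a similar pattern against an inline HTML string. The markup and the links are made up for illustration. The pattern is a variation, not the article's own: it also accepts single-quoted attributes, ignores case, and uses [^>]* so that a match cannot run past the end of the tag.

```python
import re

# Hypothetical sample page, not from the article.
html = """
<div>
  <img class="logo" src="/static/logo.png" alt="Logo">
  <img src='https://example.com/photo.jpg'>
  <IMG SRC="banner.gif">
</div>
"""

# (["\'])  captures the opening quote, single or double
# (.*?)    captures the link
# \1       requires the same kind of quote to close it
pattern = r'<img\b[^>]*?\bsrc=(["\'])(.*?)\1'

# finditer() yields Match objects, so group 2 (the link) can be picked out.
links = [m.group(2) for m in re.finditer(pattern, html, flags=re.IGNORECASE)]
print(links)
# ['/static/logo.png', 'https://example.com/photo.jpg', 'banner.gif']
```

finditer() is used here instead of findall() because the pattern has two groups, and findall() would return (quote, link) tuples.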

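These two examples showed how to use regular expressions to match the text content of an HTML tag and to extract image links from a web page. Regular expressions are a common tool in Python web scraping, and these examples should help make their use clearer.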

