A Detailed Guide to Using the pandas replace() Function (Replacing Values)

March 22, 2023, 9:17 PM • Pandas Function Reference

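The replace() method, available as DataFrame.replace() and Series.replace(), replaces specified values in one or more columns of a DataFrame, or in a Series, with other values or with missing values. It is most often used during data cleaning and transformation.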
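As a quick illustration of replacing values with missing values, here is a minimal sketch. The column names and the sentinel values -999 and "N/A" are made up for this example and are not part of the original article.

```python
import numpy as np
import pandas as pd

# Hypothetical data: -999 and "N/A" are placeholder values that mean "missing"
df = pd.DataFrame({
    "age": [25, -999, 31],
    "city": ["Paris", "N/A", "Rome"],
})

# Replace both sentinels with NaN; a new DataFrame is returned
# because inplace defaults to False
cleaned = df.replace([-999, "N/A"], np.nan)

print(cleaned)
print(cleaned.isna().sum())  # number of missing values per column
```

The original df is left untouched here, which makes it easy to compare the data before and after cleaning.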

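Commonly used parameters of replace():

to_replace: the value to be replaced; it can be a single value, a list of values, a dict, or a regular expression
value: the value that replaces whatever to_replace matches
inplace: whether to modify the original DataFrame in place; defaults to False, in which case a new object is returned
limit: defaults to None; the pandas documentation describes it as the maximum size gap to forward or backward fill rather than a cap on the number of replacements, and it has been deprecated since pandas 2.1
regex: whether to interpret to_replace as a regular expression; defaults to False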
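The different forms that to_replace accepts can be sketched on a small Series. The data below is invented purely for illustration.

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series(["a", "b", "c", "a"])

# Single value -> single value
single = s.replace("a", "x")
print(single.tolist())       # ['x', 'b', 'c', 'x']

# List of values -> one shared replacement
many_to_one = s.replace(["a", "b"], "x")
print(many_to_one.tolist())  # ['x', 'x', 'c', 'x']

# List of values -> list of replacements, paired by position
pairwise = s.replace(["a", "b"], ["x", "y"])
print(pairwise.tolist())     # ['x', 'y', 'c', 'x']

# Dict mapping old values to new values (value is not passed)
mapped = s.replace({"a": "x", "c": "z"})
print(mapped.tolist())       # ['x', 'b', 'z', 'x']

# inplace defaults to False, so the original Series is unchanged
print(s.tolist())            # ['a', 'b', 'c', 'a']
```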

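Example 1: Replacing values in a DataFrame column

Let's start with a concrete example. Suppose we have a DataFrame with a column named "gender" that holds two values, "M" (Male) and "F" (Female), and we want to replace "M" with 1 and "F" with 0: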

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame(
    {'name': ['Alice', 'Bob', 'Charlie', 'David'],
     'gender': ['M', 'M', 'F', 'F']})

# Use replace() to turn 'M' into 1 and 'F' into 0, in the 'gender' column only
df.replace({'gender': {'M': 1, 'F': 0}}, inplace=True)

print(df)

Output:

      name  gender
0    Alice       1
1      Bob       1
2  Charlie       0
3    David       0
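The nested dict used above limits the replacement to the named column. The sketch below makes that visible by adding a hypothetical "grade" column (not in the original example) that also contains "F", and by returning a new DataFrame instead of using inplace=True.

```python
import pandas as pd

# Sample data; the 'grade' column is a made-up addition for illustration
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "David"],
    "gender": ["M", "M", "F", "F"],
    "grade": ["A", "F", "B", "F"],
})

# Outer key = column name, inner dict = old value -> new value.
# Only 'gender' is touched; the 'F' values in 'grade' stay as they are.
result = df.replace({"gender": {"M": "Male", "F": "Female"}})

print(result)
```

Because inplace is left at its default of False, df still holds the original "M" and "F" codes after this call.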

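Example 2: Replacing strings in a column with a regular expression

This second example uses a regular expression to replace part of the strings in a column. Suppose we have a DataFrame whose "url" column holds one URL per row, such as "http://example.com", and we want every URL that starts with "http://" to start with "https://" instead: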

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame(
    {'name': ['Alice', 'Bob', 'Charlie', 'David'],
     'url': ['http://example.com', 'https://google.com', 'http://facebook.com', 'https://twitter.com']})

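# Use a regular expression to replace a leading 'http://' with 'https://'
df.replace(to_replace=r'^http://', value='https://', regex=True, inplace=True)

print(df)

Output:

      name                   url
0    Alice   https://example.com
1      Bob    https://google.com
2  Charlie  https://facebook.com
3    David   https://twitter.com

In this example, the regular expression r'^http://' matches the "http://" prefix at the start of a string, and that prefix is replaced with "https://". The pattern deliberately includes "://": a looser pattern such as r'^http' would also match URLs that already start with "https" and turn them into "httpss://...". Because replace() is called on the whole DataFrame, the pattern is tried against every string column, including "name", where nothing matches. Note that without regex=True, replace() does not treat r'^http://' as a regular expression; it compares it against whole cell values as a literal string, so nothing would be replaced.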
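The effect of the regex flag can be checked directly. The following is a small sketch on a made-up Series of two URLs, comparing the same call with and without regex=True.

```python
import pandas as pd

# Hypothetical sample data: one plain-HTTP URL and one HTTPS URL
urls = pd.Series(["http://example.com", "https://google.com"])

# regex left at its default (False): the pattern is compared with whole
# cell values as a literal string, so no cell matches and nothing changes
literal = urls.replace(r"^http://", "https://")
print(literal.tolist())  # ['http://example.com', 'https://google.com']

# regex=True: the pattern is applied as a regular expression, so the
# matching prefix inside each string is substituted
as_regex = urls.replace(r"^http://", "https://", regex=True)
print(as_regex.tolist())  # ['https://example.com', 'https://google.com']
```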

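In summary, replace() is a very useful data-cleaning function that is widely used during data preprocessing. In practice, you can choose which values to replace, how to match them (exact values or regular expressions), and how narrowly to apply the replacement (the whole DataFrame, specific columns, or a single Series).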
