A Guide to Python Date and Time Functions and Time Series Processing with Pandas

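Introduction

Dates and times are a core concept in programming, especially when working with real-time data or analyzing data as a time series.

Python's standard library provides a rich set of date and time functions. This guide walks through the most common ones and then shows how to process time series with Pandas.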

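Representing Dates and Times

In Python, both dates and times can be represented with the datetime module.

import datetime

# Current date and time
now = datetime.datetime.now()
print(now)

# Create a datetime object for a specific date
other_date = datetime.datetime(2021, 1, 1)
print(other_date)

# Extract the individual date and time components
print(now.year, now.month, now.day)
print(now.hour, now.minute, now.second)

Result (the values for now depend on when the code is run):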

2021-12-23 11:24:39.622225
2021-01-01 00:00:00
2021 12 23
11 24 39
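
A datetime can also be split into separate date and time objects and rebuilt from them. The following is a minimal sketch; the fixed timestamp and the variable names are assumptions chosen for illustration, so that the results do not depend on when the code runs.

```python
import datetime

# A fixed timestamp chosen for illustration (an assumption, not live data)
moment = datetime.datetime(2021, 12, 23, 11, 24, 39)

# Split the datetime into its date part and its time part
d = moment.date()
t = moment.time()

# weekday() counts from Monday = 0, so a Thursday gives 3
weekday_index = moment.weekday()

# isoformat() returns an ISO 8601 string
iso_text = moment.isoformat()

# combine() rebuilds a datetime from a date and a time
rebuilt = datetime.datetime.combine(d, t)

print(d, t, weekday_index, iso_text)
```

Keeping the date and time parts separate can be handy when only one of them matters, for example when comparing calendar days regardless of the time of day.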

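Formatting Dates and Times

Dates and times can be written as strings in many formats. In Python, strftime formats a datetime object as a string, and strptime parses a string into a datetime object.

# Format a date and time
print(now.strftime("%Y-%m-%d %H:%M:%S"))

# Parse a date and time
str_time = "2021-01-01 12:00:00"
dt = datetime.datetime.strptime(str_time, "%Y-%m-%d %H:%M:%S")
print(dt)

Result:

2021-12-23 11:24:39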
2021-01-01 12:00:00
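
strptime raises ValueError when the string does not match the format, so parsing input from outside the program usually needs a guard. The sketch below shows one way to do that; parse_timestamp is a hypothetical helper introduced here for illustration, not a standard library function.

```python
import datetime

def parse_timestamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Hypothetical helper: return a datetime, or None if text does not match fmt."""
    try:
        return datetime.datetime.strptime(text, fmt)
    except ValueError:
        # strptime raises ValueError on any mismatch between text and fmt
        return None

ok = parse_timestamp("2021-01-01 12:00:00")
bad = parse_timestamp("2021/01/01 12:00")               # wrong separators, no seconds
custom = parse_timestamp("01/02/2021", fmt="%d/%m/%Y")  # day first: 1 February 2021

print(ok)
print(bad)
print(custom.strftime("%Y-%m-%d"))
```

Returning None is only one possible design; depending on the application, letting the exception propagate may be the better choice.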

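Date and Time Arithmetic

In Python, adding a timedelta to a datetime, or subtracting one from it, yields a new datetime.

# Addition and subtraction
print(now + datetime.timedelta(days=1))
print(now - datetime.timedelta(days=1))

Result:

2021-12-24 11:24:39.622225
2021-12-22 11:24:39.622225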
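
Arithmetic also works in the other direction: subtracting one datetime from another gives a timedelta. A minimal sketch, using two fixed timestamps that are assumptions chosen for illustration:

```python
import datetime

# Two fixed timestamps chosen for illustration
start = datetime.datetime(2021, 1, 1, 12, 0, 0)
end = datetime.datetime(2021, 12, 23, 11, 24, 39)

# Subtracting two datetimes yields a timedelta
elapsed = end - start

# A timedelta stores whole days plus the leftover seconds
whole_days = elapsed.days
leftover_seconds = elapsed.seconds

# total_seconds() expresses the whole interval as a single number
total = elapsed.total_seconds()

# timedelta accepts several units at once (days, hours, minutes, ...)
later = start + datetime.timedelta(hours=36)

print(elapsed, total, later)
```

Note that elapsed.seconds holds only the part of the interval that is left over after whole days, which is why total_seconds() is usually the safer choice for durations.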

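Time Series Processing with Pandas

Pandas is one of the most widely used data analysis libraries in Python, and it has built-in support for time series.

In Pandas, a time series is typically indexed by a DatetimeIndex, which can be created with date_range.

import pandas as pd

# Create a date range and use it as the index
dates = pd.date_range('20200101', periods=6)
df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6], 'B': pd.Categorical(["male", "female"]*3)}, index=dates)
print(df)

Result: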

            A       B
2020-01-01  1    male
2020-01-02  2  female
2020-01-03  3    male
2020-01-04  4  female
2020-01-05  5    male
2020-01-06  6  female
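
Once a DataFrame has a DatetimeIndex, rows can be selected by date. The sketch below rebuilds a smaller version of the frame above (numeric column only) and shows a few common selections; the variable names are assumptions chosen for illustration.

```python
import pandas as pd

# Six consecutive days, mirroring the index built above
dates = pd.date_range("2020-01-01", periods=6, freq="D")
df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6]}, index=dates)

# Label-based slicing on a DatetimeIndex includes both endpoints
middle = df.loc["2020-01-02":"2020-01-04"]

# Boolean mask built from a component of the index
even_days = df[df.index.day % 2 == 0]

# The business-day frequency "B" skips Saturdays and Sundays
workdays = pd.date_range("2020-01-01", periods=5, freq="B")

print(middle)
print(even_days)
print(workdays)
```

Unlike ordinary Python slices, the date slice includes the end label, so middle contains three rows.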

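Pandas provides many functions for working with time series, for example:

# Group by calendar month and sum
# (only the numeric column A is selected, because the categorical column B does not support sum)
# Note: pandas 2.2 and later spell the month-end alias 'ME' instead of 'M'
df[['A']].groupby(pd.Grouper(freq='M')).sum()

# Resample by day and sum
df[['A']].resample('D').sum()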
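
To make the effect of grouping and resampling easier to see, here is a minimal sketch on deterministic data. The series of ones and the 10-day bin width are assumptions chosen for illustration, and a MonthEnd offset object is passed instead of a string alias so that the code does not depend on which alias spelling the installed pandas version expects.

```python
import pandas as pd

# 60 daily values of 1: all of January 2020 (31 days) plus 29 days of February
dates = pd.date_range("2020-01-01", periods=60, freq="D")
s = pd.Series(1, index=dates)

# Group by calendar month; each group is labeled with its month-end date
monthly = s.groupby(pd.Grouper(freq=pd.offsets.MonthEnd())).sum()

# Downsample into consecutive 10-day bins, starting at the first day
ten_day = s.resample("10D").sum()

print(monthly)
print(ten_day)
```

Because every value is 1, each sum is simply the number of days that fell into the bin, which makes it easy to check where the bin edges are.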

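Example 1

Suppose we have a time series of sales figures recorded three times a day, and we need to resample it by day and compute the average sales.

import pandas as pd
import numpy as np

# Create random sales data: three readings per day for 30 days
dates = pd.date_range('20200101', periods=90, freq='8h')
sales = np.random.randint(100, 1000, size=len(dates))
df = pd.DataFrame({'sales': sales}, index=dates)

# Resample by day and compute the average sales
print(df.resample('D').mean())

Result (the values differ on every run because the data is random):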

                 sales
2020-01-01  592.666667
2020-01-02  618.333333
2020-01-03  476.000000
2020-01-04  672.000000
2020-01-05  560.333333
2020-01-06  782.000000
2020-01-07  481.333333
2020-01-08  417.666667
2020-01-09  459.666667
2020-01-10  601.666667
2020-01-11  465.666667
2020-01-12  620.666667
2020-01-13  568.333333
2020-01-14  293.000000
2020-01-15  772.666667
2020-01-16  563.000000
2020-01-17  304.666667
2020-01-18  470.333333
2020-01-19  800.333333
2020-01-20  791.666667
2020-01-21  608.666667
2020-01-22  558.666667
2020-01-23  722.000000
2020-01-24  641.333333
2020-01-25  342.000000
2020-01-26  537.333333
2020-01-27  614.000000
2020-01-28  521.666667
2020-01-29  594.666667
2020-01-30  567.666667
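
Since random data cannot be checked by hand, the same daily resampling can be sketched on a few fixed values. The six sales figures below are hypothetical, chosen only so that the averages are easy to verify.

```python
import pandas as pd

# Hypothetical readings: three per day at 8-hour spacing, over two days
stamps = pd.date_range("2020-01-01", periods=6, freq="8h")
sales = pd.Series([100, 200, 300, 400, 500, 900], index=stamps, name="sales")

# Day 1: (100 + 200 + 300) / 3 = 200; day 2: (400 + 500 + 900) / 3 = 600
daily_mean = sales.resample("D").mean()

# The same bins can be aggregated with a different function
daily_total = sales.resample("D").sum()

print(daily_mean)
print(daily_total)
```

Each daily bin collects the three readings whose timestamps fall on that calendar day, and the aggregation function is then applied per bin.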

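Example 2

Suppose we have hourly time series data for temperature and humidity, and we need to compute the daily average of each.

import pandas as pd
import numpy as np

# Create random temperature and humidity data: one reading per hour for 90 days
dates = pd.date_range('20200101', periods=90 * 24, freq='h')
temp = np.random.randint(-10, 40, size=len(dates))
humi = np.random.randint(20, 100, size=len(dates))
df = pd.DataFrame({'temperature': temp, 'humidity': humi}, index=dates)

# Resample by day and compute the average temperature and humidity
print(df.resample('D').mean())

Result (the values differ on every run because the data is random):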

            temperature   humidity
2020-01-01    6.000000  59.666667
2020-01-02    6.583333  68.833333
2020-01-03   -3.500000  55.500000
2020-01-04    4.000000  52.125000
2020-01-05   -3.500000  62.250000
...                ...        ...
2020-03-26    7.000000  58.250000
2020-03-27    6.125000  50.625000
2020-03-28   10.125000  63.125000
2020-03-29   10.375000  74.250000
2020-03-30    4.250000  47.000000

[90 rows x 2 columns]
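
When different columns need different statistics, agg can take a mapping from column name to function. The following is a minimal sketch on hypothetical hourly readings; the values are chosen so that the daily results can be checked by hand.

```python
import pandas as pd

# Hypothetical hourly readings for two days (48 rows)
hours = pd.date_range("2020-01-01", periods=48, freq="h")
df = pd.DataFrame(
    {
        "temperature": list(range(48)),       # 0..23 on day 1, 24..47 on day 2
        "humidity": [50] * 24 + [70] * 24,    # constant within each day
    },
    index=hours,
)

# One aggregation per column: daily maximum temperature, daily mean humidity
daily = df.resample("D").agg({"temperature": "max", "humidity": "mean"})

print(daily)
```

The result has one row per day, with the maximum temperature (23 and 47) and the mean humidity (50.0 and 70.0).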

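Conclusion

Python's datetime module covers creating, formatting, parsing, and doing arithmetic on dates and times, and Pandas adds date-based indexing, grouping, and resampling for time series. Together, these tools make date and time data much easier to work with.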

Unless otherwise noted, articles on this site are original. If you repost this article, please credit the source: python时间日期函数与利用pandas进行时间序列处理详解 - Python技术站
