Python Data Analysis: Sorting a Pandas DataFrame

This is a complete guide to sorting a pandas DataFrame for data analysis in Python.

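I. Sorting a Pandas DataFrame

Pandas is a data analysis library built on NumPy. Its two most important data structures are Series and DataFrame, and almost everything else in the library builds on them.

Sorting is one of the most common DataFrame operations in data analysis. The usual approaches are sorting rows and sorting by columns, and the examples below cover each in detail.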

1. Sorting rows

Rows can be sorted in two ways with pandas: by the values in specified columns, or by the row index.

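(1) Sorting by a specified column

The sort_values() method sorts by the values in the specified columns. It has two commonly used parameters:

by: the column or list of columns to sort by;
ascending: True (the default) for ascending order or False for descending order; a list gives one setting per column in by.

The following example sorts by the "Age" column in ascending order: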

import pandas as pd

df = pd.DataFrame({
        'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Vin', 'Van', 'Alex'],
        'Age': [28, 34, 29, 42, 25, 23, 27],
        'Country': ['US', 'UK', 'US', 'UK', 'US', 'China', 'China']
})

df.sort_values(by=['Age'], ascending=True)

The output is:

    Name  Age Country
5    Van   23   China
4    Vin   25      US
6   Alex   27   China
0    Tom   28      US
2  Steve   29      US
1   Jack   34      UK
3  Ricky   42      UK

The rows are now in ascending order of "Age", and each row keeps its original index label.
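As a small extension of the example above, the sketch below sorts the same data by "Age" in descending order. It relies on sort_values() returning a new DataFrame, so df keeps its original order unless the result is assigned back. The variable name oldest_first and the use of ignore_index=True (which assumes pandas 1.0 or later) are illustrative choices, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Vin', 'Van', 'Alex'],
    'Age': [28, 34, 29, 42, 25, 23, 27],
    'Country': ['US', 'UK', 'US', 'UK', 'US', 'China', 'China']
})

# sort_values() returns a sorted copy; df itself keeps its original order.
# ignore_index=True renumbers the result 0..n-1 (assumes pandas >= 1.0).
oldest_first = df.sort_values(by='Age', ascending=False, ignore_index=True)

print(oldest_first)
print(df['Name'].tolist())  # the original row order is unchanged
```

Dropping ignore_index=True would keep the original index labels on each row, as in the ascending example.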

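(2) Sorting by the row index

The sort_index() method sorts by index labels. Its axis keyword argument defaults to 0, which sorts the rows by the row index; axis=1 sorts the columns by their labels instead.

For example, the following code sorts the rows of a DataFrame by index using the default settings: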

import pandas as pd

df = pd.DataFrame({
        'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Vin', 'Van', 'Alex'],
        'Age': [28, 34, 29, 42, 25, 23, 27],
        'Country': ['US', 'UK', 'US', 'UK', 'US', 'China', 'China']
})

df = df.sort_index()

The output is:

    Name  Age Country
0    Tom   28      US
1   Jack   34      UK
2  Steve   29      US
3  Ricky   42      UK
4    Vin   25      US
5    Van   23   China
6   Alex   27   China

Similarly, the rows can be sorted by index in descending order:

import pandas as pd

df = pd.DataFrame({
        'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Vin', 'Van', 'Alex'],
        'Age': [28, 34, 29, 42, 25, 23, 27],
        'Country': ['US', 'UK', 'US', 'UK', 'US', 'China', 'China']
})

df = df.sort_index(ascending=False)

The output is:

    Name  Age Country
6   Alex   27   China
5    Van   23   China
4    Vin   25      US
3  Ricky   42      UK
2  Steve   29      US
1   Jack   34      UK
0    Tom   28      US
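Because the DataFrame in these examples already carries the default index 0 to 6, an ascending sort_index() returns it unchanged. The sketch below makes the effect visible by first reordering the rows with sort_values() and then restoring them with sort_index(); it also shows axis=1, which orders the columns by label. The variable names by_age, restored, and by_column_label are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Vin', 'Van', 'Alex'],
    'Age': [28, 34, 29, 42, 25, 23, 27],
    'Country': ['US', 'UK', 'US', 'UK', 'US', 'China', 'China']
})

# Sorting by Age scrambles the index labels: 5, 4, 6, 0, 2, 1, 3.
by_age = df.sort_values(by='Age')

# axis=0 (the default) sorts rows by index label, restoring 0..6.
restored = by_age.sort_index()

# axis=1 sorts the columns by their labels instead of the rows.
by_column_label = df.sort_index(axis=1)

print(by_age.index.tolist())
print(restored.index.tolist())
print(by_column_label.columns.tolist())  # ['Age', 'Country', 'Name']
```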

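2. Sorting by columns

Sorting by columns means ordering the rows of a DataFrame by the values in one or more of its columns. (To reorder the columns themselves by label, use sort_index(axis=1) instead.)

This again uses the sort_values() method and its two commonly used parameters:

by: the column or list of columns to sort by; earlier columns take priority, and later columns only break ties;
ascending: True or False, or a list with one value per column in by.

For example, to sort by "Age" in ascending order and, among rows with equal ages, by "Country" in descending order: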

import pandas as pd

df = pd.DataFrame({
        'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Vin', 'Van', 'Alex'],
        'Age': [28, 34, 29, 42, 25, 23, 27],
        'Country': ['US', 'UK', 'US', 'UK', 'US', 'China', 'China']
})

df.sort_values(by=['Age', 'Country'], ascending=[True, False])

The output is:

    Name  Age Country
5    Van   23   China
4    Vin   25      US
6   Alex   27   China
0    Tom   28      US
2  Steve   29      US
1   Jack   34      UK
3  Ricky   42      UK
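Every age in this data is distinct, so the secondary "Country" key never has to break a tie in the result above. As an illustrative variation (not from the original example), the sketch below swaps the key order so that the second key matters: rows are grouped by "Country" in ascending order, and within each country the oldest person comes first. The variable name by_country_then_age is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Vin', 'Van', 'Alex'],
    'Age': [28, 34, 29, 42, 25, 23, 27],
    'Country': ['US', 'UK', 'US', 'UK', 'US', 'China', 'China']
})

# Primary key: Country, ascending (China, UK, US).
# Secondary key: Age, descending, applied only among rows
# that share the same Country.
by_country_then_age = df.sort_values(
    by=['Country', 'Age'],
    ascending=[True, False]
)

print(by_country_then_age)
```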

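II. Examples

The following two examples sort data with a pandas DataFrame.

Example 1: Analyzing a music chart

Sorting a music chart with a pandas DataFrame makes it easy to order songs by rank or by title, which can in turn inform music promotion and marketing decisions.

Suppose we have the following simple chart data: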

import pandas as pd

df = pd.DataFrame({
        'Song': ['WAP', 'Blinding Lights', 'Dynamite', 'Watermelon Sugar', 'Savage Remix'],
        'Singer': ['Cardi B', 'The Weeknd', 'BTS', 'Harry Styles', 'Megan Thee Stallion'],
        'Ranking': [1, 2, 3, 4, 5]
})

df

The output is:

               Song               Singer  Ranking
0               WAP              Cardi B        1
1   Blinding Lights           The Weeknd        2
2          Dynamite                  BTS        3
3  Watermelon Sugar         Harry Styles        4
4      Savage Remix  Megan Thee Stallion        5

Sort the songs by ranking, from lowest number to highest:

df.sort_values(by='Ranking')

The output is:

               Song               Singer  Ranking
0               WAP              Cardi B        1
1   Blinding Lights           The Weeknd        2
2          Dynamite                  BTS        3
3  Watermelon Sugar         Harry Styles        4
4      Savage Remix  Megan Thee Stallion        5

Sort the songs by title (the "Song" column):

df.sort_values(by='Song')

The output is:

               Song               Singer  Ranking
1   Blinding Lights           The Weeknd        2
2          Dynamite                  BTS        3
4      Savage Remix  Megan Thee Stallion        5
0               WAP              Cardi B        1
3  Watermelon Sugar         Harry Styles        4
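A common follow-up on a chart like this is to keep only the top few entries. The sketch below is one possible way to do that, using DataFrame.nsmallest() to pick the three songs with the lowest "Ranking" values; the variable name top3 and the cutoff of three are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'Song': ['WAP', 'Blinding Lights', 'Dynamite', 'Watermelon Sugar', 'Savage Remix'],
    'Singer': ['Cardi B', 'The Weeknd', 'BTS', 'Harry Styles', 'Megan Thee Stallion'],
    'Ranking': [1, 2, 3, 4, 5]
})

# nsmallest(n, column) returns the n rows with the smallest values
# in that column, already ordered from smallest to largest.
top3 = df.nsmallest(3, 'Ranking')

print(top3)
```

For this data the result matches df.sort_values(by='Ranking').head(3); nsmallest() simply states the intent more directly.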

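Example 2: Ranking student grades

Sorting student grades with a pandas DataFrame makes it quick to rank students by subject or by total score. The rankings can feed into per-student reports and help teachers and homeroom teachers track and evaluate their students.

Suppose we have the following simple grade data: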

import pandas as pd

df = pd.DataFrame({
        'Name': ['Jack', 'Tom', 'Alex', 'Steve', 'Eva'],
        'Chinese': [89, 91, 95, 88, 90],
        'Math': [94, 85, 90, 92, 87],
        'English': [92, 90, 87, 93, 91]
})

df

The output is:

    Name  Chinese  Math  English
0   Jack       89    94       92
1    Tom       91    85       90
2   Alex       95    90       87
3  Steve       88    92       93
4    Eva       90    87       91

Sort by total score, from highest to lowest:

df['Total'] = df['Chinese'] + df['Math'] + df['English']
df.sort_values(by='Total', ascending=False)

The output is:

    Name  Chinese  Math  English  Total
0   Jack       89    94       92    275
3  Steve       88    92       93    273
2   Alex       95    90       87    272
4    Eva       90    87       91    268
1    Tom       91    85       90    266

The output shows that Jack has the highest total score (275) and ranks first.
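Sorting puts the rows in order, but it does not record each student's position. The sketch below is one way to add an explicit rank, assuming a new "Rank" column (a name introduced here for illustration) computed with Series.rank(). With method='min', tied totals would share the same rank.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jack', 'Tom', 'Alex', 'Steve', 'Eva'],
    'Chinese': [89, 91, 95, 88, 90],
    'Math': [94, 85, 90, 92, 87],
    'English': [92, 90, 87, 93, 91]
})

# Row-wise sum of the three subject columns.
df['Total'] = df[['Chinese', 'Math', 'English']].sum(axis=1)

# ascending=False gives rank 1 to the highest total;
# method='min' assigns tied totals the same (lowest) rank.
df['Rank'] = df['Total'].rank(ascending=False, method='min').astype(int)

ranked = df.sort_values(by='Rank')
print(ranked[['Name', 'Total', 'Rank']])
```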

These examples show how to sort data with a pandas DataFrame and how sorting applies in practice.
