Python Pandas DataFrame sort_values() - Sort Data by Values

Updated on December 27, 2024

Introduction

In data analysis, sorting is a fundamental task that makes data easier to read, present, and analyze. Python's Pandas library provides the sort_values() method for sorting the rows of a DataFrame by the values in one or more columns. The method works with numeric, string, and datetime columns, and it offers parameters for sort direction, missing-value placement, and custom sort keys.

In this article, you will learn how to use the sort_values() method to sort data in Pandas DataFrames: sorting by a single column, sorting by multiple columns, handling missing values, and customizing the sort order.

Sorting by Single Column

Basic Ascending Sort

  1. Import the Pandas library and create a DataFrame.

  2. Apply the sort_values() method to sort the DataFrame based on one column in ascending order.

    python
    import pandas as pd
    
    data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
            'Age': [28, 22, 34, 29]}
    df = pd.DataFrame(data)
    
    sorted_df = df.sort_values(by='Age')
    print(sorted_df)
    

    This code sorts the DataFrame df by the 'Age' column in ascending order, which is the default for sort_values(). The method returns a new DataFrame and leaves df unchanged, and each row keeps its original index label.
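One detail worth checking is what happens to the original DataFrame and to the index after sorting. The following is a minimal sketch that rebuilds the same sample data and prints both; the variable name renumbered is illustrative, and ignore_index=True assumes pandas 1.0 or later.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 22, 34, 29]})

# sort_values() returns a new DataFrame; df itself is not modified.
sorted_df = df.sort_values(by='Age')
print(df['Age'].tolist())        # ages in their original order
print(sorted_df.index.tolist())  # index labels travel with their rows

# ignore_index=True renumbers the sorted rows from 0 upward.
renumbered = df.sort_values(by='Age', ignore_index=True)
print(renumbered)
```

Keeping the original labels is useful when you need to trace sorted rows back to the source data; renumbering is usually more convenient when the result is displayed or exported.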

Descending Sort

  1. Pass ascending=False to sort a DataFrame in descending order.

    python
    sorted_df_desc = df.sort_values(by='Age', ascending=False)
    print(sorted_df_desc)
    

    Setting ascending=False sorts from the largest value to the smallest, so the row with the highest 'Age' appears first.
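A common use of a descending sort is picking the top few rows. As a small sketch under the same sample data, the snippet below chains head() onto the sorted result; the variable name oldest_two is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 22, 34, 29]})

# Sort from oldest to youngest, then keep the first two rows.
oldest_two = df.sort_values(by='Age', ascending=False).head(2)
print(oldest_two)
```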

Sorting by Multiple Columns

Specify Sort Order for Each Column

  1. Prepare a DataFrame with multiple columns to sort.

  2. Use the sort_values() method, specifying a list of columns and corresponding sort directions.

    python
    data_multi = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
                  'Department': ['HR', 'HR', 'IT', 'IT'],
                  'Age': [28, 22, 34, 29]}
    df_multi = pd.DataFrame(data_multi)
    
    sorted_df_multi = df_multi.sort_values(by=['Department', 'Age'], ascending=[True, False])
    print(sorted_df_multi)
    

    In this snippet, the DataFrame is sorted first by the 'Department' column in ascending order and then by 'Age' in descending order within each department. When ascending is a list, it must contain one value for each column named in by.
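With this particular sample data, the rows already happen to be in Department-ascending, Age-descending order, so the sorted output looks the same as the input. The sketch below flips both directions to make the effect of the per-column list visible; the variable name sorted_variant is illustrative.

```python
import pandas as pd

df_multi = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                         'Department': ['HR', 'HR', 'IT', 'IT'],
                         'Age': [28, 22, 34, 29]})

# Department descending (IT before HR), then Age ascending within each department.
sorted_variant = df_multi.sort_values(by=['Department', 'Age'],
                                      ascending=[False, True])
print(sorted_variant)
```

The order of the columns in by sets the priority: later columns only break ties among rows that are equal in the earlier ones.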

Handling Missing Values

Control the Placement of NaN Values

  1. Introduce NaN values into a DataFrame.

  2. Use the na_position argument to specify the placement of NaN values in the sorted DataFrame.

    python
    data_nan = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
                'Age': [28, None, 34, 29]}
    df_nan = pd.DataFrame(data_nan)
    
    sorted_df_nan = df_nan.sort_values(by='Age', na_position='last')
    print(sorted_df_nan)
    

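    The na_position='last' argument places rows with NaN in the 'Age' column at the end of the sorted DataFrame. This is the default behavior; pass na_position='first' to place those rows at the top instead. Note that the None in the 'Age' list is stored as NaN, which makes the column a float column.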
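To see how na_position interacts with the sort direction, here is a minimal sketch on the same data. It shows the 'first' option and also that missing values stay at the end in a descending sort unless told otherwise; the variable names nan_first and nan_desc are illustrative.

```python
import pandas as pd

df_nan = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                       'Age': [28, None, 34, 29]})

# Put rows with a missing Age at the top of the result.
nan_first = df_nan.sort_values(by='Age', na_position='first')
print(nan_first)

# Descending sort: NaN rows still go last, because na_position
# is independent of the ascending parameter.
nan_desc = df_nan.sort_values(by='Age', ascending=False)
print(nan_desc)
```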

Custom Sorting with the Key Parameter

Use a Custom Key Function for Sorting

  1. Define a custom sorting logic using a key function.

  2. Pass the key function to sort_values() through the key parameter.

    python
    data_key = {'Name': ['banana', 'apple', 'Orange', 'Grape']}
    df_key = pd.DataFrame(data_key)
    
    sorted_df_key = df_key.sort_values(by='Name', key=lambda x: x.str.lower())
    print(sorted_df_key)
    

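    The key parameter (available in pandas 1.1.0 and later) transforms the column values before they are compared; the values in the result keep their original form. The function receives a Series and must return a Series of the same length. In this example, the names are compared in lowercase, so the result is in case-insensitive alphabetical order.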
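To make the effect of the key function concrete, the sketch below compares the default sort with the case-insensitive one on the same data. By default, strings are compared by character code, so uppercase letters sort before lowercase ones. The variable names default_sorted and ci_sorted are illustrative.

```python
import pandas as pd

df_key = pd.DataFrame({'Name': ['banana', 'apple', 'Orange', 'Grape']})

# Default comparison: capitalized names come before lowercase ones.
default_sorted = df_key.sort_values(by='Name')
print(default_sorted['Name'].tolist())

# The key lowercases values only for comparison; output keeps original casing.
ci_sorted = df_key.sort_values(by='Name', key=lambda s: s.str.lower())
print(ci_sorted['Name'].tolist())
```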

Conclusion

The sort_values() method in the Pandas library is a flexible tool for sorting data in DataFrames. It covers sorting by one column or several, in ascending or descending order, with control over where missing values are placed and an optional custom sort key. Use these techniques to put your data sets in a clear, predictable order before analysis or presentation.