Python Pandas DataFrame pivot() - Create Pivot Table

Introduction

Pandas is a powerhouse tool for data manipulation and analysis in Python, particularly favored in data science and finance for its ease of use and flexibility. Among its versatile set of features, the pivot() function stands out as a fundamental method for reshaping data and creating pivot tables useful for summarizing large data sets. This function helps in transforming data to a more readable and useful format, allowing for more in-depth analysis and quicker insight generation.

In this article, you learn how to use the pivot() function to create pivot tables from a DataFrame. You also explore the situations where it is useful, such as rotating long-format data into a wider view, and where its companion pivot_table() is needed instead, such as summarizing and aggregating results.

Understanding Data Transformation with pivot()

Basics of the pivot() Function

The pandas pivot() function creates a new derived table from an existing one. It reshapes the data by turning the unique values of the chosen index and columns arguments into row and column labels; it rearranges existing values and does not aggregate them. Here’s how to start using it:

Import the pandas library.

Create or load an example DataFrame.

```python
import pandas as pd

data = {
    'Date': ['2023-01-01', '2023-01-02', '2023-01-02', '2023-01-03'],
    'Category': ['Tech', 'Finance', 'Tech', 'Sports'],
    'Value': [200, 220, 240, 100]
}
df = pd.DataFrame(data)
print(df)
```

The DataFrame df consists of dates, categories, and values, simulating a typical data log scenario.

Creating a Simple Pivot Table

Once you have a DataFrame, creating a pivot table to see values distributed across categories per date becomes straightforward:

Specify index, columns, and values for the pivot function.
```python
pivot_df = df.pivot(index='Date', columns='Category', values='Value')
print(pivot_df)
```
This pivots the table so that each date becomes a row and each category becomes a column, with the matching Value in each cell. Date and category combinations that do not appear in the source data are filled with NaN.
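To make the shape of the result concrete, the sketch below rebuilds the same DataFrame, pivots it, and inspects the outcome. The variable names `missing` and `filled`, and the choice to replace gaps with 0, are illustrative assumptions rather than part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02', '2023-01-02', '2023-01-03'],
    'Category': ['Tech', 'Finance', 'Tech', 'Sports'],
    'Value': [200, 220, 240, 100],
})

pivot_df = df.pivot(index='Date', columns='Category', values='Value')

# The unique categories become the column labels
print(list(pivot_df.columns))

# Count the (Date, Category) cells that had no source row
missing = int(pivot_df.isna().sum().sum())
print(missing)

# Assumption for illustration: treat a missing combination as 0
filled = pivot_df.fillna(0)
print(filled)
```

Whether 0 is a sensible replacement depends on the data; for measurements where "no record" differs from "zero", leaving NaN in place may be the safer choice.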

Aggregating Data with Pivot Tables

Recognize the need for aggregation if the data has duplicates in an index/columns pair. In that case pivot() raises a ValueError, because it cannot decide which value to keep.
Use pivot_table() with an aggregation function to handle multiple values for a single index/columns pair correctly.
```python
pivot_table_df = df.pivot_table(index='Date', columns='Category', values='Value', aggfunc='sum')
print(pivot_table_df)
```
Here, the pivot_table() function is used because it supports aggregation, unlike pivot(). aggfunc='sum' will add up values in case there are duplicate entries for certain dates and categories.
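The difference between the two functions shows up only when a pair is repeated, which the sample data above never does. The following sketch uses a small hypothetical DataFrame, `dup_df`, in which 'Tech' appears twice on the same date; the data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: ('2023-01-02', 'Tech') occurs twice
dup_df = pd.DataFrame({
    'Date': ['2023-01-02', '2023-01-02', '2023-01-03'],
    'Category': ['Tech', 'Tech', 'Sports'],
    'Value': [240, 60, 100],
})

# Check for repeated (Date, Category) pairs before choosing a method
has_duplicates = bool(dup_df.duplicated(subset=['Date', 'Category']).any())
print(has_duplicates)

try:
    dup_df.pivot(index='Date', columns='Category', values='Value')
    pivot_failed = False
except ValueError as err:
    # pivot() refuses to reshape when a pair maps to more than one value
    pivot_failed = True
    print(f"pivot() failed: {err}")

# pivot_table() combines the duplicates using the aggregation function
summed = dup_df.pivot_table(index='Date', columns='Category',
                            values='Value', aggfunc='sum')
print(summed)
```

Running the duplicated() check first is one way to decide up front whether pivot() is enough or whether an aggregation step is required.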

Advanced Usage of pivot()

Multi-level Indexing in a Pivot Table

Multi-level indexing can enhance data presentation and allow more complex data summaries:

Include an additional category or identifier as part of the index.

```python
df['Subcategory'] = ['Software', 'Banking', 'Hardware', 'Team Sports']
pivot_multi_index_df = df.pivot_table(index=['Date', 'Subcategory'], columns='Category', values='Value')
print(pivot_multi_index_df)
```

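This configuration breaks the data down further, producing one row per date and subcategory. Because no aggfunc is given, pivot_table() falls back to its default, the mean; each group here contains a single row, so the values are unchanged.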
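Once the rows carry two index levels, reading from the result works a little differently. The sketch below rebuilds the data in one step and shows two common ways of handling the multi-level index; the names `multi`, `day`, and `flat` are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02', '2023-01-02', '2023-01-03'],
    'Category': ['Tech', 'Finance', 'Tech', 'Sports'],
    'Subcategory': ['Software', 'Banking', 'Hardware', 'Team Sports'],
    'Value': [200, 220, 240, 100],
})

multi = df.pivot_table(index=['Date', 'Subcategory'],
                       columns='Category', values='Value')

# The row index now has two named levels
print(list(multi.index.names))

# Selecting on the outer level returns the rows for that date,
# indexed by the remaining level (Subcategory)
day = multi.loc['2023-01-02']
print(day)

# reset_index() turns both index levels back into ordinary columns
flat = multi.reset_index()
print(flat)
```

Flattening with reset_index() is often convenient before exporting the table or merging it with other data.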

Conclusion

Mastering the pivot() function in pandas, together with pivot_table(), significantly elevates data manipulation and analysis capabilities. It simplifies turning raw data into reshaped and summarized tables, making it easier to observe patterns, trends, and anomalies. Use pivot() for simple restructurings where each index/columns pair is unique, and pivot_table() when duplicates need to be aggregated or a more complex summary is required. Harness these techniques to transform data effectively and unlock deeper insights for your projects or data explorations.
