Python Pandas DataFrame info() - Display Information

Introduction

The info() method of a pandas DataFrame prints a concise summary of the DataFrame: the index dtype and range, the column names and dtypes, the non-null count for each column, and the memory usage. It serves as a quick diagnostic for understanding the structure and completeness of a DataFrame without viewing the entire dataset.

In this article, you will learn how to use the info() method effectively. Discover how to retrieve essential details about your DataFrame, modify its output to suit your needs, and interpret the information it provides.

Understanding the info() Method

Basic Usage of info()

Import the pandas library and create a DataFrame.
Call the info() method to view the summary of the DataFrame.
```python
import pandas as pd
data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Age': [28, 22, 34, 42],
        'City': ['New York', 'Paris', 'Berlin', 'London']}
df = pd.DataFrame(data)
df.info()
```
Running this code prints the index range (4 entries, 0 to 3), the total number of columns (3), and, for each column, its name, non-null count, and dtype, followed by a tally of columns per dtype and the memory usage. Note that info() prints this summary and returns None.
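Because info() prints its summary rather than returning it, assigning the result to a variable gives you None. The following is a minimal sketch of one way to capture the text instead, using the method's buf parameter with an io.StringIO buffer. The variable names are illustrative.

```python
import io

import pandas as pd

data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Age': [28, 22, 34, 42],
        'City': ['New York', 'Paris', 'Berlin', 'London']}
df = pd.DataFrame(data)

# info() writes to stdout by default; buf= redirects the text to any writable buffer.
buffer = io.StringIO()
result = df.info(buf=buffer)   # the return value is None either way
summary = buffer.getvalue()    # the summary as an ordinary string

print(summary)
```

Capturing the summary this way is handy when you want to write it to a log file or search it for a particular column.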

Exploring Parameters of info()

Verbose Parameter

Use the verbose parameter to control the display of information.
Set verbose=False to show a simpler output, especially useful when dealing with a large number of columns.
```python
df.info(verbose=False)
```
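This adjustment limits the output to a short summary: the index range, the number of columns along with the first and last column names, a tally of columns per dtype, and the memory usage. The per-column table is omitted.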
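To see the difference concretely, the sketch below captures both outputs as strings and checks what each contains. The info_text helper is a hypothetical convenience function written for this example, not part of pandas.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 22, 34, 42],
                   'City': ['New York', 'Paris', 'Berlin', 'London']})


def info_text(frame, **kwargs):
    """Hypothetical helper: return the text that frame.info(**kwargs) would print."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


full = info_text(df, verbose=True)    # per-column table included
short = info_text(df, verbose=False)  # per-column table omitted

print(short)
print('Per-column table in verbose output:    ', 'Non-Null Count' in full)
print('Per-column table in non-verbose output:', 'Non-Null Count' in short)
```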

Max_cols Parameter

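Use the max_cols parameter to set the column-count threshold at which info() switches from the detailed output to the truncated one.
If the DataFrame has more than max_cols columns, the per-column table is dropped. When max_cols is not given, the threshold comes from pandas.options.display.max_info_columns.
```python
df.info(max_cols=2)
```
Because df has three columns and the threshold is two, this call prints the truncated summary. It does not show details for the first two columns only; max_cols is a cutoff, not a column selector.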
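The threshold behavior of max_cols is easiest to verify by comparing two calls on the same three-column DataFrame. This is a small sketch; info_text is a hypothetical helper defined here for capturing the printed text.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 22, 34, 42],
                   'City': ['New York', 'Paris', 'Berlin', 'London']})


def info_text(frame, **kwargs):
    """Hypothetical helper: return the text that frame.info(**kwargs) would print."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


# 3 columns is not more than max_cols=3, so the detailed table is printed.
detailed = info_text(df, max_cols=3)

# 3 columns is more than max_cols=2, so info() falls back to the truncated summary.
truncated = info_text(df, max_cols=2)

print(truncated)
```

If you want details for only some columns, select them first, for example df[['Name', 'Age']].info().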

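Show_counts Parameter

Specify the show_counts parameter to control the display of non-null counts. Older pandas releases called this parameter null_counts; it was deprecated in pandas 1.2 and removed in pandas 2.0, where passing it raises a TypeError.
Setting show_counts=True ensures that you see the count of non-null values for each column.
```python
df.info(show_counts=True)
```
With this setting, the output always displays non-null counts. By default, they are shown only when the DataFrame is smaller than pandas.options.display.max_info_rows and pandas.options.display.max_info_columns.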
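Non-null counts are most informative when a DataFrame actually has missing values. The sketch below assumes pandas 1.2 or later, where the parameter is named show_counts, and uses a small DataFrame with one missing value per column. The info_text helper is hypothetical and exists only to capture the printed text.

```python
import io

import numpy as np
import pandas as pd

# One missing value in each column.
df = pd.DataFrame({'Name': ['John', 'Anna', None, 'Linda'],
                   'Age': [28, np.nan, 34, 42]})


def info_text(frame, **kwargs):
    """Hypothetical helper: return the text that frame.info(**kwargs) would print."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


with_counts = info_text(df, show_counts=True)      # "3 non-null" for each column
without_counts = info_text(df, show_counts=False)  # column names and dtypes only

print(with_counts)
print(without_counts)
```

Comparing a column's non-null count against the number of entries in the index tells you how many values are missing in that column.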

Using info() with Large DataFrames

Practical Example with a Large DataFrame

Simulate a larger DataFrame using NumPy to understand the extended use of info().
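Note that 50 columns is still under the default limit of 100 (pandas.options.display.max_info_columns), so info() prints the full per-column table here. It switches to the truncated summary only for wider DataFrames.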
```python
import numpy as np
large_data = pd.DataFrame(np.random.rand(1000, 50), columns=[f'col{i}' for i in range(50)])
large_data.info()
```
For this DataFrame of 1000 rows and 50 columns, the summary stays compact: one line per column plus the index range, the dtype tally, and the memory usage, regardless of how many rows the DataFrame holds.
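To see where info() does change its behavior, you need a DataFrame with more columns than the display limit. The sketch below assumes the default display options (display.max_info_columns of 100) and uses an illustrative 150-column DataFrame named wide. The info_text helper is hypothetical.

```python
import io

import numpy as np
import pandas as pd

# 150 columns exceeds the default display.max_info_columns limit of 100.
wide = pd.DataFrame(np.zeros((10, 150)),
                    columns=[f'col{i}' for i in range(150)])


def info_text(frame, **kwargs):
    """Hypothetical helper: return the text that frame.info(**kwargs) would print."""
    buffer = io.StringIO()
    frame.info(buf=buffer, **kwargs)
    return buffer.getvalue()


# Default call: too many columns, so only the truncated summary is printed.
default_text = info_text(wide)

# verbose=True forces the per-column table for this one call.
forced_text = info_text(wide, verbose=True)

# Raising the display option changes the default within the with-block only.
with pd.option_context('display.max_info_columns', 200):
    raised_limit_text = info_text(wide)

print(default_text)
```

Use verbose=True for a one-off inspection, and the display option when you routinely work with wide DataFrames.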

Conclusion

The info() method in pandas is a quick way to assess the structure and properties of a DataFrame. By using its parameters, such as verbose, max_cols, and show_counts (formerly null_counts), you can tailor the output to the size and shape of your data. Checking dtypes, non-null counts, and memory usage up front makes data preprocessing and analysis more efficient, so make info() a regular part of your data inspection toolkit.
