Calculating the hash of a file in Python is an effective way to verify file integrity and detect any modifications. Hashing functions like SHA256
and MD5
are commonly employed for this task across various applications, such as software distribution, where ensuring that a file has not been altered is crucial.
In this article, you will learn how to compute the hash of a file using Python. You will explore how to use both the SHA256
and MD5
hashing algorithms with clear examples designed to guide you through the process step by step.
Ensure you've imported the necessary modules from Python’s standard library.
import hashlib
Define a function that opens the file in binary mode and reads chunks of it to calculate the hash.
def get_sha256_hash(file_path):
hash_sha256 = hashlib.sha256()
with open(file_path, "rb") as f:
for chunk in iter(lambda: f.read(4096), b""):
hash_sha256.update(chunk)
return hash_sha256.hexdigest()
This function creates a new SHA256 hash object, then opens the specified file in binary mode. It reads the file in small chunks to handle large files efficiently. hash_sha256.update(chunk)
updates the hash object with the content of each chunk until the entire file has been read.
Call the function with the path to your file.
file_path = 'example.txt'
print("SHA-256 Hash:", get_sha256_hash(file_path))
Replace 'example.txt'
with the path to the file whose hash you want to compute. This invocation prints the SHA-256 hash of the file.
Define a similar function for MD5 hash computation.
def get_md5_hash(file_path):
hash_md5 = hashlib.md5()
with open(file_path, "rb") as f:
for chunk in iter(lambda: f.read(4096), b""):
hash_md5.update(chunk)
return hash_md5.hexdigest()
This function operates like the SHA-256 function but uses MD5 hashing. It is slightly faster but less secure compared to SHA-256, making it suitable for non-security-critical applications such as quick file integrity checks in less sensitive contexts.
Call the MD5 hash function and print the result.
file_path = 'example.txt'
print("MD5 Hash:", get_md5_hash(file_path))
Again, replace 'example.txt'
with the actual file path you wish to evaluate. This line of code outputs the MD5 hash of the file.
Calculating the hash of a file in Python using SHA256
or MD5
enables you to verify file integrity and detect any unauthorized changes. By following the examples provided, you can effectively implement file hashing in your Python applications for various practical purposes, from security checks to quick integrity verifications. Remember to choose the hashing algorithm that best fits your use case, considering the balance between speed and security.