Python 3 Script to Remove Duplicate Files and Images in a Given Directory or Folder Using Hashlib Library Full Project For Beginners

 

Welcome folks! In this blog post we will write a Python script that automatically removes duplicate files and images from a given directory. Two files count as duplicates when their contents produce the same MD5 hash. The full source code of the application is given below.

 

 

 

Get Started

 

 

To get started, create an app.py file and paste in the following code:

 

app.py

 

 

import hashlib
import os

# Returns the hash string of the given file name


def hashFile(filename):
    # Reading a large file in one go can exhaust memory, so we read it one block at a time
    BLOCKSIZE = 65536
    hasher = hashlib.md5()
    with open(filename, 'rb') as file:
        # Reads the particular blocksize from file
        buf = file.read(BLOCKSIZE)
        while(len(buf) > 0):
            hasher.update(buf)
            buf = file.read(BLOCKSIZE)
    return hasher.hexdigest()


if __name__ == "__main__":
    # Dictionary to store the hash and filename
    hashMap = {}

    # List to store deleted files
    deletedFiles = []
    filelist = [f for f in os.listdir() if os.path.isfile(f)]
    for f in filelist:
        key = hashFile(f)
        # If key already exists, it deletes the file
        if key in hashMap:
            deletedFiles.append(f)
            os.remove(f)
        else:
            hashMap[key] = f
    if len(deletedFiles) != 0:
        print('Deleted Files')
        for i in deletedFiles:
            print(i)
    else:
        print('No duplicate files found')

 


 

Now run the script app.py from the directory you want to clean by typing the command below:

 

python app.py

 

 

 

 

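When you execute the script, it removes all duplicate files and images in the directory you run it from, keeping the first copy of each file it encounters. It is meant to list the files it deleted, or to print "No duplicate files found" when there are none. Note that the deletion is permanent, so try it on a copy of your folder first.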
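Because the script deletes files right away, it can help to preview what it would remove. The following is a minimal sketch of a dry-run variant, not part of the original app.py: the helper names file_digest and find_duplicates are hypothetical, and it swaps MD5 for SHA-256 as an assumption that a stronger hash is preferable. It only groups files by content hash and deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, block_size=65536):
    """Return the SHA-256 hex digest of a file, read one block at a time.

    Hypothetical helper; app.py uses MD5 in hashFile instead.
    """
    hasher = hashlib.sha256()
    with open(path, 'rb') as fh:
        # iter() with a sentinel keeps calling read() until it returns b''
        for block in iter(lambda: fh.read(block_size), b''):
            hasher.update(block)
    return hasher.hexdigest()


def find_duplicates(directory='.'):
    """Group regular files in `directory` by content hash; nothing is deleted."""
    groups = defaultdict(list)
    # Sorting makes the "first copy" of each group predictable
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            groups[file_digest(path)].append(name)
    # Keep only the hashes shared by more than one file
    return [names for names in groups.values() if len(names) > 1]
```

Calling find_duplicates() with no argument checks the current directory and returns a list of groups of file names with identical contents. Once the groups look right, you could keep the first name in each group and pass the rest to os.remove.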
