Python 3 Web Scraping Selenium Script to Bulk Download all Images From URL in Command Line Using BeautifulSoup4 Library Full Project For Beginners


 

Welcome folks! In this post we will build a web scraper that automatically downloads all the images present on a given website URL, using the selenium and beautifulsoup4 libraries in Python. The full source code of the application is shown below.

 

 

 

Get Started

 

 

To get started, you need the driver that matches the browser you are using. Chrome users need to download ChromeDriver for their browser from here

 

After downloading the driver, store it in the root directory of your Python project. Now install the following libraries using the pip commands shown below

 

pip install requests

 

pip install beautifulsoup4

 

pip install selenium
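
To confirm that the three packages installed correctly before running the scraper, a quick check along these lines may help. This is only a sketch: missing_modules is a hypothetical helper that is not part of the project, and bs4 is the import name of the beautifulsoup4 package.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Import names of the three packages installed above
    missing = missing_modules(["requests", "bs4", "selenium"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All libraries are installed")
```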

 

After installing these libraries, create an app.py file and copy the following code into it

 

app.py

 

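from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import requests as rq
import os
from bs4 import BeautifulSoup
import time

# Example path: E:\web scraping\chromedriver_win32\chromedriver.exe
path = input("Enter Path : ")

url = input("Enter URL : ")

output = "output"


def get_url(path, url):
    # Selenium 4 takes the driver path through a Service object
    # (executable_path is no longer accepted in recent releases)
    driver = webdriver.Chrome(service=Service(path))
    driver.get(url)
    print("loading.....")
    # Wait before reading the HTML so that images added by scripts are included
    time.sleep(60)
    res = driver.execute_script("return document.documentElement.outerHTML")
    # Close the browser once the HTML has been captured
    driver.quit()

    return res


def get_img_links(res):
    # html.parser ships with Python; "lxml" also works if the lxml package is installed
    soup = BeautifulSoup(res, "html.parser")
    imglinks = soup.find_all("img", src=True)
    return imglinks


def download_img(img_link, index):
    try:
        # Guess the file extension from the link, defaulting to .jpg
        extensions = [".jpeg", ".jpg", ".png", ".gif"]
        extension = ".jpg"
        for ext in extensions:
            if ext in img_link:
                extension = ext
                break

        img_data = rq.get(img_link, timeout=30).content
        # os.path.join keeps the path valid on Windows, Linux and macOS
        with open(os.path.join(output, str(index + 1) + extension), "wb") as f:
            f.write(img_data)
    except Exception as e:
        # Report the failure and carry on with the next image
        print("Skipped", img_link, ":", e)


result = get_url(path, url)
img_links = get_img_links(result)
if not os.path.isdir(output):
    os.mkdir(output)

for index, img_link in enumerate(img_links):
    img_link = img_link["src"]
    print("Downloading...")
    if img_link:
        download_img(img_link, index)
print("Download Complete!!")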
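
The script passes each src value straight to requests, so a relative src such as /static/logo.png would not be downloaded. Below is a minimal sketch of how such links might be resolved against the page URL, and how the extension could be read from the URL path rather than by searching the whole link. The helper names resolve_img_url and pick_extension are hypothetical and not part of app.py, and the example.com addresses are placeholders.

```python
import os
from urllib.parse import urljoin, urlparse

KNOWN_EXTENSIONS = {".jpeg", ".jpg", ".png", ".gif"}


def resolve_img_url(page_url, src):
    """Turn a possibly relative src attribute into an absolute URL."""
    return urljoin(page_url, src)


def pick_extension(img_url, default=".jpg"):
    """Return the known extension found in the URL path, or the default."""
    path = urlparse(img_url).path  # drops the query string and fragment
    ext = os.path.splitext(path)[1].lower()
    return ext if ext in KNOWN_EXTENSIONS else default


if __name__ == "__main__":
    page = "https://example.com/blog/post.html"  # placeholder page URL
    for src in ["/static/logo.png", "img/a.gif", "//cdn.example.com/x.jpg"]:
        full = resolve_img_url(page, src)
        print(full, pick_extension(full))
```

In app.py, one option would be to call resolve_img_url(url, img_link["src"]) in the final loop before handing the link to download_img.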

 


 

To run the script, type the following command

 

python app.py

 

After you run the command, the script first asks for the ChromeDriver path and then for the website URL

 

 

 

Provide the ChromeDriver path and then the website URL. The URL must include the https prefix, which is mandatory for the application to work. The script then opens the URL in the Chrome browser, downloads all the images present on the page, and stores them in a folder called output.
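
Because the URL has to include its scheme, a small check before launching the browser can catch a missing https:// early. This is a sketch only; is_full_url is a hypothetical helper that is not part of the original script.

```python
from urllib.parse import urlparse


def is_full_url(url):
    """Return True if the URL has an http(s) scheme and a host name."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    for candidate in ["https://example.com", "example.com"]:
        print(candidate, "->", is_full_url(candidate))
```

One way to use it would be to run the check on the value entered at the "Enter URL : " prompt and ask again when it returns False.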

 

 

 

 

 

 
