Welcome, folks. In this tutorial we will download an image or PDF file from a URL using the requests and validators libraries in Python. The full source code of the application is shown below.
Get Started
To get started, install the following libraries using the pip command as shown below. The script also imports colorama for colored terminal output, so install it as well.
pip install requests
pip install validators
pip install colorama
After installing these libraries, create an app.py file and copy and paste the following code into it.
app.py
import re
import sys

import requests
import validators
from colorama import init, Fore, Style

# PURPOSE of this script
# Get a url with various file formats to download
# For example, to get the IP ranges of AWS in JSON format, use the url below
# AWS IP range url is: https://ip-ranges.amazonaws.com/ip-ranges.json

init(autoreset=True)  # reset colors automatically after each print

# Check if user has passed the file url to download or not
if len(sys.argv) == 2:
    file_url = sys.argv[1]  # Getting the url from command line

    # Proceed only if url is valid
    if not validators.url(file_url):
        sys.exit(Fore.RED + Style.BRIGHT + "Not a valid url")

    # Extract the file name from the url using the re module
    # this regex would also work, with positive lookbehind: (?:[^/][\d\w\.-]+)$(?<=\.\w{3,4})
    # file name with extension
    matched_file = re.search(r"(?=[\w\d-]+\.\w{3,4}$).+", file_url)  # regex with positive lookahead
    if matched_file is not None:
        file_name_with_extension = matched_file.group(0)
    else:
        sys.exit(Fore.CYAN + Style.BRIGHT + "Nothing to download")

    print("Downloading %s" % file_name_with_extension)
    response = requests.get(file_url, stream=True)
    with open(file_name_with_extension, "wb") as file_download:
        # The content-length header may be missing; skip the progress bar in that case
        total_length = response.headers.get('content-length')
        total_length = int(total_length) if total_length is not None else 0
        if total_length:
            print("Total length of the file is: {0:.3f} KB".format(total_length / 1024))
        dl_bar = 0  # number of bytes downloaded so far
        for chunk in response.iter_content(chunk_size=1024):
            # writing one chunk at a time to file_download
            if chunk:
                dl_bar += len(chunk)
                file_download.write(chunk)
                if total_length:
                    done = int(50 * dl_bar / total_length)
                    sys.stdout.write(Fore.GREEN + Style.BRIGHT + "\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
                    sys.stdout.flush()
    print("\nDownload Complete! \n")
else:
    print(Fore.RED + Style.BRIGHT + 'Please pass the url to download. Make sure it is a downloadable url')
    print(Fore.GREEN + Style.BRIGHT + 'Ex: python {} <downloadable_url>'.format(sys.argv[0]))
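The progress bar in the script comes down to a little arithmetic: the fraction of bytes downloaded so far is scaled to a 50-character bar. Here is a minimal sketch of that step on its own, without any network access. The helper name render_bar and its width parameter are made up for illustration and are not part of app.py.

```python
def render_bar(downloaded: int, total: int, width: int = 50) -> str:
    """Return a text bar such as '[=====     ]' for the given progress.

    Hypothetical helper: it mirrors the int(50 * dl_bar / total_length)
    calculation used in app.py, with the bar width made configurable.
    """
    if total <= 0:
        # Unknown or empty size: show an empty bar instead of dividing by zero
        return "[" + " " * width + "]"
    done = min(width, int(width * downloaded / total))
    return "[" + "=" * done + " " * (width - done) + "]"


if __name__ == "__main__":
    # Simulate a 1024-byte download arriving in four steps
    for downloaded in (0, 256, 512, 1024):
        print(render_bar(downloaded, 1024, width=20))
```

Writing the bar with a leading "\r", as the script does, moves the cursor back to the start of the line so each update overwrites the previous one.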
To run this Python script, pass one additional command-line argument: the URL of the file to download.
For example, to download an image from a URL, run the script with the command below.
python app.py https://freemediatools.com/img/profile.jpg
As you can see, the image is downloaded successfully by this script into my root directory, which is the directory the script is run from. In the same way, you can download files with other extensions, such as PDFs and other media files like GIFs and videos.
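To see which URLs the script will accept, it helps to try the file name regex by itself. The sketch below reuses the pattern from app.py. The function name file_name_from_url and the example.com URLs are made up for illustration. Note that the pattern only matches when the URL ends in a file name with a 3 or 4 character extension, so URLs ending in .gz or .py, or URLs with no file name at the end, give "Nothing to download".

```python
import re

# Same pattern as in app.py: a positive lookahead that finds the trailing
# "name.ext" part, where the extension is 3 or 4 word characters long.
FILE_NAME_PATTERN = re.compile(r"(?=[\w\d-]+\.\w{3,4}$).+")


def file_name_from_url(url: str):
    """Return the trailing file name of the url, or None if there is none.

    Hypothetical helper used only to demonstrate the regex.
    """
    match = FILE_NAME_PATTERN.search(url)
    return match.group(0) if match else None


if __name__ == "__main__":
    # The example.com URLs are placeholders, not real downloads
    for url in (
        "https://freemediatools.com/img/profile.jpg",
        "https://example.com/docs/report.pdf",
        "https://example.com/media/clip.mp4",
        "https://example.com/archive.tar.gz",
        "https://example.com/download",
    ):
        print(url, "->", file_name_from_url(url))
```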