Welcome folks! In this post we will download geeksforgeeks.org articles as PDF documents in Python using Selenium. The full source code of the application is given below.
Get Started
To get started, install the libraries the script imports using the pip command shown below:
pip install selenium webdriver-manager requests
After that, create an app.py file and paste in the following code:
app.py
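#!/usr/bin/env python
import json
import os
import time

import requests
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# example article url
# URL = "https://www.geeksforgeeks.org/what-can-i-do-with-python/"


def get_driver():
    # chrome options settings
    chrome_options = webdriver.ChromeOptions()
    # print preview state that selects "Save as PDF" as the destination
    settings = {
        "recentDestinations": [
            {"id": "Save as PDF", "origin": "local", "account": ""}
        ],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
    }
    download_dir = os.path.abspath("downloads")
    prefs = {
        # hand the print settings to Chrome as a JSON string
        "printing.print_preview_sticky_settings.appState": json.dumps(settings),
        # folder used by "Save as PDF"
        "savefile.default_directory": download_dir,
        "download.default_directory": download_dir,
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
        "safebrowsing.enabled": True,
    }
    chrome_options.add_experimental_option("prefs", prefs)
    # print silently, without showing the print dialog
    chrome_options.add_argument("--kiosk-printing")

    # launch browser with predefined settings (Selenium 4 style)
    browser = webdriver.Chrome(
        service=Service(ChromeDriverManager().install()),
        options=chrome_options,
    )
    return browser


def download_article(URL):
    browser = get_driver()
    try:
        browser.get(URL)
        # launch print and save as pdf
        browser.execute_script("window.print();")
        # short pause so Chrome can finish writing the file
        # (the duration is a guess, adjust it for large pages)
        time.sleep(5)
    finally:
        browser.quit()


if __name__ == "__main__":
    URL = input("provide article URL: ")
    # check if the url is valid/reachable
    try:
        reachable = requests.get(URL, timeout=10).status_code == 200
    except requests.RequestException:
        reachable = False

    if reachable:
        try:
            download_article(URL)
            print("Your article is successfully downloaded")
        except Exception as e:
            print(e)
    else:
        print("Enter a valid working URL")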
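The script only checks that the URL responds with status 200, so it will happily print any reachable page. If you want to reject anything that is not a geeksforgeeks.org link before making a network request, a small check like the one below could be added. This is only a sketch: is_article_url is a hypothetical helper name, and the host rule is an assumption about which URLs you want to accept.

```python
from urllib.parse import urlparse


def is_article_url(url: str) -> bool:
    """Return True if url looks like an http(s) geeksforgeeks.org page.

    Hypothetical helper: the host rule is an assumption, not something
    the original script enforces.
    """
    try:
        parsed = urlparse(url.strip())
        host = (parsed.hostname or "").lower()
    except ValueError:
        # urlparse raises ValueError on some malformed inputs
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    # accept the bare domain and any subdomain such as www
    return host == "geeksforgeeks.org" or host.endswith(".geeksforgeeks.org")


if __name__ == "__main__":
    print(is_article_url("https://www.geeksforgeeks.org/what-can-i-do-with-python/"))
    print(is_article_url("not a url"))
```

You could call it right after input() and print "Enter a valid working URL" when it returns False.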
Now create a downloads folder inside the root directory and run the Python script using the command below:
python app.py
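If you would rather not create the downloads folder by hand, the script could create it on startup instead. Here is a minimal sketch, assuming a hypothetical helper named ensure_download_dir that you would call before launching the browser:

```python
from pathlib import Path


def ensure_download_dir(name: str = "downloads") -> Path:
    """Create the folder if it is missing and return its absolute path.

    Hypothetical helper, not part of the original script.
    """
    folder = Path(name).resolve()
    # exist_ok=True makes this safe to call on every run
    folder.mkdir(parents=True, exist_ok=True)
    return folder


if __name__ == "__main__":
    print(ensure_download_dir())
```

The returned absolute path can also be used for the download directory in the Chrome prefs, since a relative path may not be resolved the way you expect.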