Python 3 Bulk Domain Age Checker Web Scraping Script Using BeautifulSoup4 Library Full Project For Beginners

Python 3 Bulk Domain Age Checker Web Scraping Script Using BeautifulSoup4 Library Full Project For Beginners

 

Welcome folks today in this blog post we will be building a bulk domain age checker web scraping script using beautifulsoup4 library. All the full source code of the application is given below.

 

 

Get Started

 

 

In order to get started you need to install the following library using the pip command as shown below

 

pip install bs4

 

After installing this library you need to create a text file called domains in which you will store all your domains to check for domain age

 

domains

 

http://www.obrnadzor.gov.ru
http://www.rost.False
http://www.fipi.ru
http://www.lexed.ru
http://www.edu.ru
http://www.ecsocman.edu.ru
http://minobr.khb.ru
http://www.minomos.ru
http://www.edunso.ru
http://www.educom.ru
http://www.avorontcov.ru
http://www.vlgregedu.ru
http://www.baikalnarobraz.ru

 

Now make an app.py script file inside your root directory and copy paste the following code which is shown below

 

app.py

 

import requests
from bs4 import BeautifulSoup
import fake_useragent
import time

'''
Check only http url, if you want to change url to https do the follows:
 ==>   lines = [line.strip()[12:] for line in file]
'''
user = fake_useragent.UserAgent().random
headers = {
    'user-agent': user
}

# https[12:]  http[11:]
with open('domains') as file:
    lines = [line.strip()[11:] for line in file]

    for domains in lines:
        time.sleep(1)
        url = f'https://www.nic.ru/whois/?searchWord={domains}'
        response = requests.get(url, headers=headers)
        soup = BeautifulSoup(response.content, 'html.parser')
        try:
            information = soup.find(class_='_3U-mA _23Irb')
            inf = information.text.splitlines()
            for x in inf:
                if 'created:' in x:
                    creation_date = x.split('created:')[-1].strip()
                    creation_date = creation_date[:4]
                    print(f'{domains}: {creation_date}')
                elif 'Creation Date' in x:
                    creation_date = x.split('Creation Date')[-1].strip()
                    creation_date = creation_date[:4]
                    print(f'{domains}: {creation_date}')
        except:
            print(f'{domains}: не удалось проверить')

 

See also  Python 3 Script to Build StringCase Converter (Camel case, Pascal case, Snake case) Using stringcase Library Full Project For Beginners

 

Now if you execute this above python script by typing the below command as shown below

 

python app.py

 

 

As you can see it has shown all the domain age for all the domains that we listed inside the file on the command line

Leave a Reply