Welcome, folks! In this blog post we will build a bulk domain age checker: a web scraping script
written in Python with the beautifulsoup4 library. The full source code of the application is given below.
Get Started
To get started, install the required libraries with the pip command shown below.
Besides bs4, the script also imports requests and fake_useragent, so install all three.
pip install requests bs4 fake-useragent
After installing the libraries, create a text file called domains (no file extension)
in which you store the domains whose age you want to check, one URL per line
domains
http://www.obrnadzor.gov.ru
http://www.rost.False
http://www.fipi.ru
http://www.lexed.ru
http://www.edu.ru
http://www.ecsocman.edu.ru
http://minobr.khb.ru
http://www.minomos.ru
http://www.edunso.ru
http://www.educom.ru
http://www.avorontcov.ru
http://www.vlgregedu.ru
http://www.baikalnarobraz.ru
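The entries in the file are full URLs, while a WHOIS lookup needs the bare domain name. Not every entry starts with http://www. (http://minobr.khb.ru does not), so cutting off a fixed number of characters is fragile. Here is a minimal sketch of a more forgiving conversion using the standard library's urllib.parse. The helper name to_domain is hypothetical, chosen only for this illustration, and the https entry is a variant added to show that both schemes work.

```python
from urllib.parse import urlparse


def to_domain(url):
    """Return the bare domain of a URL: no scheme and no leading 'www.'."""
    # hostname is None when the line has no scheme, so fall back to ''.
    host = urlparse(url.strip()).hostname or ''
    return host[4:] if host.startswith('www.') else host


for url in ('http://www.fipi.ru', 'http://minobr.khb.ru', 'https://www.edu.ru'):
    print(url, '->', to_domain(url))
```

With this approach the same file can mix http and https entries, with or without the www. prefix.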
Now create an app.py
script file in your project's root directory and paste in the following code
app.py
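import time
from urllib.parse import urlparse

import fake_useragent
import requests
from bs4 import BeautifulSoup

# Send a random browser User-Agent with every request.
user = fake_useragent.UserAgent().random
headers = {
    'user-agent': user
}


def to_domain(url):
    """Return the bare domain of an http or https URL, without 'www.'."""
    host = urlparse(url).hostname or ''
    return host[4:] if host.startswith('www.') else host


# One URL per line; blank lines are skipped.
with open('domains') as file:
    lines = [to_domain(line.strip()) for line in file if line.strip()]

for domains in lines:
    time.sleep(1)  # pause between requests
    url = f'https://www.nic.ru/whois/?searchWord={domains}'
    try:
        response = requests.get(url, headers=headers, timeout=10)
        soup = BeautifulSoup(response.content, 'html.parser')
        # Element holding the raw WHOIS text. The class name is generated
        # by the site and may change.
        information = soup.find(class_='_3U-mA _23Irb')
        inf = information.text.splitlines()
        for x in inf:
            # The date is labelled either 'created:' or 'Creation Date:'.
            if 'created:' in x:
                creation_date = x.split('created:')[-1].strip()
                creation_date = creation_date[:4]  # keep only the year
                print(f'{domains}: {creation_date}')
            elif 'Creation Date:' in x:
                creation_date = x.split('Creation Date:')[-1].strip()
                creation_date = creation_date[:4]  # keep only the year
                print(f'{domains}: {creation_date}')
    except (requests.RequestException, AttributeError):
        # Network failure, or the WHOIS element was not found on the page.
        print(f'{domains}: could not be checked')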
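The date-parsing step can be tried without any network access. The sketch below pulls the creation year out of raw WHOIS text. The function name extract_creation_year is hypothetical, and both sample records are made up for illustration; they only imitate the two label styles the script looks for.

```python
def extract_creation_year(whois_text):
    """Return the 4-digit creation year found in raw WHOIS text, or None."""
    for line in whois_text.splitlines():
        for label in ('created:', 'Creation Date:'):
            if label in line:
                # Text after the label starts with the year, e.g. '2001-05-14...'.
                return line.split(label)[-1].strip()[:4]
    return None


# Made-up sample records, not real WHOIS output.
sample_created = "domain:        EXAMPLE.RU\ncreated:       2001-05-14T20:00:00Z\n"
sample_creation_date = "Domain Name: EXAMPLE.COM\nCreation Date: 1995-08-14T04:00:00Z\n"

print(extract_creation_year(sample_created))
print(extract_creation_year(sample_creation_date))
print(extract_creation_year("no date here"))
```

Returning None when no label matches makes it easy to tell a missing date apart from a failed request.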
Now run the script by typing the command below
python app.py
The script prints, on the command line, the creation year of each domain
listed in the file, which tells you how old the domain is
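Since the output is a creation year rather than an age, a small extra step can turn it into a number of years. This is a rough sketch under the assumption that only the year is known, so the result can be off by up to one year. The function name domain_age_years and its today parameter are hypothetical additions, not part of the script above.

```python
from datetime import date


def domain_age_years(creation_year, today=None):
    """Approximate a domain's age in years from its creation year."""
    today = today or date.today()
    return today.year - int(creation_year)


# A fixed date keeps the example reproducible.
print(domain_age_years('2001', today=date(2024, 1, 1)))  # 23
```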