Python 3 Script to Convert HTML Table to CSV File Using Pandas and BeautifulSoup4 Library Full Tutorial For Beginners

 

Welcome folks! In this post we will convert an HTML table to a CSV file with a Python script, using the pandas and beautifulsoup4 libraries. The full source code of the application is given below.

 

 

 

Get Started

 

 

To get started, install the following libraries with the pip command:

 

pip install beautifulsoup4

 

pip install pandas
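If you want to confirm that both installs worked before going further, a small check like the one below can help. This is only a sketch: the helper name missing_packages is made up for illustration, and it relies on the fact that beautifulsoup4 is imported under the name bs4.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the names from import_names that cannot be imported."""
    return [name for name in import_names if find_spec(name) is None]


# beautifulsoup4 is imported as "bs4", pandas as "pandas"
missing = missing_packages(["bs4", "pandas"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```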

 

After installing the libraries, create a table.html file inside the root directory and paste in the following HTML code, which generates a simple table:

 

table.html

 

<!DOCTYPE html>
<html lang="en">
<head>
  <title>Bootstrap Example</title>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/css/bootstrap.min.css">
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/js/bootstrap.min.js"></script>
</head>
<body>

<div class="container">
  <h2>Basic Table</h2>
  <p>The .table class adds basic styling (light padding and only horizontal dividers) to a table:</p>            
  <table class="table">
    <thead>
      <tr>
        <th>Firstname</th>
        <th>Lastname</th>
        <th>Email</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>John</td>
        <td>Doe</td>
        <td>john@example.com</td>
      </tr>
      <tr>
        <td>Mary</td>
        <td>Moe</td>
        <td>mary@example.com</td>
      </tr>
      <tr>
        <td>July</td>
        <td>Dooley</td>
        <td>july@example.com</td>
      </tr>
    </tbody>
  </table>
</div>

</body>
</html>

 

 

If you open this file in the browser, it shows the table that the Python script will convert to a CSV file.


Next, create an app.py file and paste in the following code:

 

app.py

 

# Import the required modules
import pandas as pd
from bs4 import BeautifulSoup

path = 'table.html'

# Parse the HTML file; the with block closes the file afterwards
with open(path, encoding='utf-8') as html_file:
    soup = BeautifulSoup(html_file, 'html.parser')

# Work with the first table in the document
table = soup.find_all('table')[0]
rows = table.find_all('tr')

# The first row holds the column headers.
# Selecting only th/td tags skips the whitespace between cells.
list_header = [cell.get_text(strip=True) for cell in rows[0].find_all(['th', 'td'])]

# Every remaining row holds one record
data = []
for row in rows[1:]:
    data.append([cell.get_text(strip=True) for cell in row.find_all(['th', 'td'])])

# Store the data in a Pandas DataFrame
dataFrame = pd.DataFrame(data=data, columns=list_header)

# Write the DataFrame to a CSV file
# (pass index=False to leave out the row-number column)
dataFrame.to_csv('Geeks.csv')
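The same steps can also be wrapped in a function so they are easy to reuse and try out without a file on disk. The sketch below is one possible arrangement, not the only one: the function name table_to_dataframe, the table_index parameter and the SAMPLE_HTML string are assumptions added for illustration. It selects only th and td tags so that whitespace between cells is ignored, and it passes index=False so the CSV has no row-number column.

```python
import pandas as pd
from bs4 import BeautifulSoup


def table_to_dataframe(html, table_index=0):
    """Build a DataFrame from one <table> in an HTML string.

    The first row of the table is treated as the header row.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find_all("table")[table_index]
    rows = table.find_all("tr")
    # Only th/td tags are cells; this skips whitespace between them
    header = [cell.get_text(strip=True) for cell in rows[0].find_all(["th", "td"])]
    records = [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in rows[1:]
    ]
    return pd.DataFrame(records, columns=header)


# A shortened copy of the sample table, used here only for illustration
SAMPLE_HTML = """
<table>
  <thead><tr><th>Firstname</th><th>Lastname</th><th>Email</th></tr></thead>
  <tbody>
    <tr><td>John</td><td>Doe</td><td>john@example.com</td></tr>
    <tr><td>Mary</td><td>Moe</td><td>mary@example.com</td></tr>
  </tbody>
</table>
"""

df = table_to_dataframe(SAMPLE_HTML)
# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = df.to_csv(index=False)
print(csv_text)
```

To write a file instead of printing, call df.to_csv with a file name as the first argument.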

 

 

Now run the Python script with the command below:

 

python app.py

 

The script creates a Geeks.csv file in the same directory, containing the header row and the three data rows from the table.
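To check the result without opening a spreadsheet program, you can read the file back with Python's built-in csv module. This is a minimal sketch; the helper name preview_csv and its limit parameter are made up for illustration.

```python
import csv
from itertools import islice


def preview_csv(path, limit=5):
    """Return up to `limit` rows of the CSV file at `path` as lists of strings."""
    with open(path, newline="", encoding="utf-8") as csv_file:
        reader = csv.reader(csv_file)
        return list(islice(reader, limit))
```

After running app.py, you could call preview_csv('Geeks.csv') and print each row. Because to_csv writes the DataFrame index by default, expect an extra first column whose header cell is empty, unless index=False was passed.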

 

 

 

 
