Python 3 Script to Strip or Remove HTML Tags From Raw String Using Regular Expression Full Project For Beginners

 

Welcome, folks. In this blog post we will remove HTML tags from a raw string using Python. The full source code of the application is given below.

 

 

Get Started

 

 

To get started, create an app.py file and paste in the following code:

 

app.py

 

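import re

# Matches anything that looks like a tag: "<", as few characters as possible, then ">"
TAG_PATTERN = re.compile(r'<.*?>')

def cleanhtml(raw_html):
    """Return raw_html with every tag-like substring removed."""
    return TAG_PATTERN.sub('', raw_html)

print(cleanhtml("<p>helloworld</p>"))  # prints: helloworld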

 

 

Here we use Python's re (regular expression) module. The pattern <.*?> matches a <, then as few characters as possible, then a >. re.sub replaces every such match with an empty string, which removes the HTML tags from the raw string passed to the cleanhtml function as an argument.
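The ? in the pattern matters more than it might look. As a small sketch (the sample string and the variable names lazy and greedy are made up for illustration), this compares the non-greedy pattern used above with a greedy one:

```python
import re

# Hypothetical sample input with more than one pair of tags
sample = "<p>hello</p> <b>world</b>"

# Non-greedy: each match stops at the first ">" it finds, so tags are removed one by one
lazy = re.sub(r"<.*?>", "", sample)

# Greedy: a single match runs from the first "<" to the last ">" and swallows the text too
greedy = re.sub(r"<.*>", "", sample)

print(repr(lazy))    # 'hello world'
print(repr(greedy))  # ''
```

With the greedy version the whole string counts as one big "tag", so nothing is left. That is why the script uses .*? rather than .* here.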

 

In this example the <p> and </p> tags are removed from around the text helloworld, and the remaining text is returned. To run the script, type the following command:

 

python app.py

 

 

 

The script prints helloworld, so you can see that the HTML tags have been stripped from the raw string.
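One thing to keep in mind is that the regular expression only removes tags. Character references such as &amp; are not tags, so they stay in the output as they are. If that matters for your input, one possible follow-up step (an assumption on my part, not part of the original script) is to pass the result through html.unescape from the standard library. The sample string below is made up:

```python
import html
import re

def cleanhtml(raw_html):
    # Same approach as app.py: remove anything that looks like a tag
    return re.sub(r"<.*?>", "", raw_html)

# Hypothetical input containing a character reference
stripped = cleanhtml("<p>fish &amp; chips</p>")
print(stripped)  # fish &amp; chips

# html.unescape converts character references back to plain characters
text = html.unescape(stripped)
print(text)  # fish & chips
```

This is only a sketch for simple strings. For real-world pages with scripts, comments or broken markup, a proper HTML parser is usually a safer choice than a regular expression.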
