Python 3 Text Mining or Analysis Script to Count the Number of Words, Lines and Special Characters in a Text File: Full Project for Beginners


 

Welcome folks, today in this post we will be doing text analysis on a text file in Python, such as counting the number of words, lines and special characters. The full source code of the application is shown below.

 

 

Get Started

 

 

In order to get started, you need to make an app.py file and copy and paste the following code.

 

app.py

 

# -*- coding: utf-8 -*-
import sys
import collections
import string

script_name = sys.argv[0]

# Results of the analysis, filled in once the file has been read
res = {
    "total_lines":"",
    "total_characters":"",
    "total_words":"",
    "unique_words":"",
    "special_characters":""
}

try:
    textfile = sys.argv[1]
    with open(textfile, "r", encoding = "utf_8") as f:

        data = f.read()
        # Text mode translates every line ending to "\n", so os.linesep
        # ("\r\n" on Windows) would never match; splitlines() is portable
        # and also counts a last line that has no trailing newline.
        lines = data.splitlines()
        res["total_lines"] = len(lines)
        # Characters excluding spaces and line breaks
        res["total_characters"] = sum(len(line.replace(" ", "")) for line in lines)
        # Word frequencies, splitting on any whitespace
        counter = collections.Counter(data.split())
        res["total_words"] = sum(counter.values())
        res["unique_words"] = len(counter)
        # ASCII punctuation counts as special characters
        special_chars = string.punctuation
        res["special_characters"] = sum(v for k, v in collections.Counter(data).items() if k in special_chars)

except IndexError:
    print('Usage: %s TEXTFILE' % script_name)
    sys.exit(1)
except IOError:
    print('"%s" cannot be opened.' % textfile)
    sys.exit(1)

print(res)

 


 

Now, when you execute the Python script, you also need to provide an additional command line argument, which is the path of the text file to analyze.

 

python app.py input.txt
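
As an aside, the same single argument could also be read with the standard argparse module, which prints a usage message on its own when the path is missing. This is only a minimal sketch of an alternative, not what app.py does; the helper name parse_args and the description text are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical helper: read the text file path from the command line."""
    parser = argparse.ArgumentParser(
        description="Count lines, words and special characters in a text file."
    )
    # One required positional argument, like sys.argv[1] in app.py
    parser.add_argument("textfile", help="path of the text file to analyze")
    return parser.parse_args(argv)


# Passing a list simulates running: python app.py input.txt
args = parse_args(["input.txt"])
print(args.textfile)
```

Calling parse_args() with no arguments reads sys.argv, so the command shown above would work unchanged.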

 

The input.txt file that I am analyzing is shown below.

 

input.txt

 

this is a text file
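
To make the counts concrete, here is a minimal sketch that applies the same kind of counting to the sample line held in memory instead of reading it from disk. The function name analyze_text is a hypothetical helper and not part of app.py. The sketch assumes the file holds exactly this one line with no trailing newline, and that lines are counted with splitlines().

```python
import collections
import string


def analyze_text(data):
    """Hypothetical helper: count lines, characters, words and punctuation in a string."""
    lines = data.splitlines()
    words = collections.Counter(data.split())
    return {
        "total_lines": len(lines),
        # Characters excluding spaces and line breaks
        "total_characters": sum(len(line.replace(" ", "")) for line in lines),
        "total_words": sum(words.values()),
        "unique_words": len(words),
        # ASCII punctuation only, as defined by string.punctuation
        "special_characters": sum(1 for ch in data if ch in string.punctuation),
    }


sample = "this is a text file"  # assumed file content, no trailing newline
result = analyze_text(sample)
print(result)
# {'total_lines': 1, 'total_characters': 15, 'total_words': 5, 'unique_words': 5, 'special_characters': 0}
```

The sample has five distinct words and no punctuation, so the total and unique word counts match and the special character count is zero.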

 

 

 

 

The script prints all the information about the text file, i.e. the number of lines, characters, words, unique words and special characters. All this information is stored in a Python dictionary, which print() displays in a form that resembles a JSON object.
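
Note that print() shows a dictionary in Python's own notation with single quotes, which is not strictly valid JSON. If real JSON is wanted, one possible approach is the standard json module, sketched below. The values in res are illustrative placeholders chosen for this example, not the output of an actual run.

```python
import json

# Illustrative values only, not the output of a real run
res = {
    "total_lines": 1,
    "total_characters": 15,
    "total_words": 5,
    "unique_words": 5,
    "special_characters": 0,
}

# json.dumps produces double-quoted keys; indent=2 makes it easier to read
as_json = json.dumps(res, indent=2)
print(as_json)

# Parsing the text again gives back an equal dictionary
restored = json.loads(as_json)
```

In app.py this would mean replacing the final print(res) with print(json.dumps(res, indent=2)).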


 
