Welcome, folks! In this post we will look at how to extract text
from a PDF document and save it as a text file, using a tkinter GUI
and the PyPDF2 library. The full source code of the application is given below.
Get Started
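To get started, install the PyPDF2 library with the pip command shown below.
tkinter ships with the standard Python installers, so it is not installed with pip
(on some Linux distributions it comes as a separate system package such as python3-tk).
pip install PyPDF2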
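Before moving on, it can help to confirm that both libraries are actually visible to your Python interpreter. The snippet below is a minimal sketch of such a check; the helper name is_installed is just an illustrative choice and is not part of either library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report on the two modules this tutorial relies on
for name in ("tkinter", "PyPDF2"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports "missing", install that library before running the application.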
After installation, create an app.py file
and paste in the following code:
app.py
import pathlib
import tkinter as tk
import tkinter.ttk as ttk
from tkinter.filedialog import askopenfilename
from tkinter.messagebox import showinfo

import PyPDF2

window = tk.Tk()
window.title("PDF to text converter")


def openfile():
    # Ask the user for a PDF; an empty result means the dialog was cancelled
    path = askopenfilename(filetypes=[('PDF Files', '*.pdf')])
    if not path:
        return
    # PdfReader, .pages and extract_text() are the PyPDF2 2.x/3.x names;
    # the older PdfFileReader/getPage/extractText were removed in PyPDF2 3.0
    with open(path, 'rb') as pdf_file:
        read_pdf = PyPDF2.PdfReader(pdf_file)
        page = read_pdf.pages[0]  # only the first page is converted
        page_content = page.extract_text()
    # Write the extracted text to context.txt in the current working directory
    pathlib.Path('context.txt').write_text(page_content, encoding='utf-8')
    showinfo("Done", "Successfully Converted")


label = tk.Label(window, text="choose file: ")
label.grid(row=0, column=0, padx=5, pady=5)
button = ttk.Button(window, text="Select", width=30, command=openfile)
button.grid(row=0, column=1, padx=5, pady=5)
window.mainloop()
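The script above converts only the first page of the PDF. If you want the text of every page, one possible approach is sketched below. It assumes PyPDF2 2.0 or newer (for PdfReader, .pages and extract_text()); the function names join_page_texts and pdf_to_text are illustrative and not part of the original application.

```python
from pathlib import Path


def join_page_texts(pages):
    """Join the text of each page, one page per block, separated by newlines.

    `pages` is any iterable of objects with an extract_text() method,
    such as the .pages sequence of a PyPDF2 PdfReader.
    """
    # extract_text() may yield nothing for image-only pages, so fall back to ""
    return "\n".join((page.extract_text() or "") for page in pages)


def pdf_to_text(pdf_path, out_path="context.txt"):
    """Extract the text of all pages in pdf_path and write it to out_path."""
    # Imported here so join_page_texts stays usable without PyPDF2 installed
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    text = join_page_texts(reader.pages)
    Path(out_path).write_text(text, encoding="utf-8")
    return len(reader.pages)  # number of pages processed
```

To use it in the GUI, you could call pdf_to_text with the path chosen in the file dialog inside openfile() instead of reading only the first page. Note that scanned PDFs that contain only images will produce little or no text this way.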
Now run app.py by typing the command below:
python app.py