Welcome, folks! In this tutorial we will extract
text from an image using the pytesseract library, a Python
wrapper for the Tesseract optical character recognition (OCR) engine. All the source code of the application is given below.
To build this application, we need to
install the following dependencies:
pip install pytesseract
pip install pillow
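pytesseract only wraps the Tesseract program, so the Tesseract executable itself has to be present on the machine as well. As a quick check, here is a minimal sketch that looks for it. The helper name find_tesseract and the fallback constant are hypothetical names chosen for illustration, and the fallback path is just the usual Windows install location, which may differ on your system.

```python
import shutil
from pathlib import Path
from typing import Optional

# Assumed default install location on Windows; adjust it for your machine.
DEFAULT_WINDOWS_PATH = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback: str = DEFAULT_WINDOWS_PATH) -> Optional[str]:
    """Return a path to the Tesseract executable, or None if it is not found."""
    # First look for "tesseract" on the system PATH.
    found = shutil.which("tesseract")
    if found:
        return found
    # Otherwise try the fallback location (hypothetical default).
    if Path(fallback).is_file():
        return fallback
    return None


if __name__ == "__main__":
    location = find_tesseract()
    print(location or "Tesseract executable not found")
```

If this prints a path, that same path can be used for the tesseract_cmd setting in the code that follows.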
After installing these dependencies (pytesseract also needs the Tesseract OCR engine itself installed on your machine), create an
app.py file and copy-paste the following code:
from PIL import Image
from pytesseract import pytesseract

# Defining paths to tesseract.exe
# and the image we would be using
path_to_tesseract = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image_path = r"csv\sample_text.png"

# Opening the image & storing it in an image object
img = Image.open(image_path)

# Providing the tesseract executable
# location to pytesseract library
pytesseract.tesseract_cmd = path_to_tesseract

# Passing the image object to image_to_string() function
# This function will extract the text from the image
text = pytesseract.image_to_string(img)

# Displaying the extracted text
print(text[:-1])
In this piece of code we open the
input image using the
PIL (Pillow) module, and then we use the second module,
pytesseract, to extract the text from the image. Finally, print(text[:-1]) displays the extracted text without its last character.
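The text returned by Tesseract often ends with a newline followed by a form-feed page separator, so slicing off one character can still leave a stray newline behind. As an alternative, here is a minimal sketch of a small cleanup helper. The name clean_ocr_text is hypothetical, and the sample string is a made-up stand-in for real OCR output, not an actual result.

```python
def clean_ocr_text(raw: str) -> str:
    """Strip trailing whitespace (newlines, spaces, form feeds) from OCR text."""
    # str.rstrip() with no arguments removes all trailing whitespace,
    # and Python treats the form feed "\x0c" as whitespace.
    return raw.rstrip()


if __name__ == "__main__":
    # Made-up sample imitating OCR output that ends in "\n\x0c".
    sample = "Hello World\n\x0c"
    print(repr(sample[:-1]))             # 'Hello World\n'
    print(repr(clean_ocr_text(sample)))  # 'Hello World'
```

In app.py this would be used as print(clean_ocr_text(text)) in place of print(text[:-1]).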
To run the
app.py file, execute the following command: