How to perform scene text detection using the Google Cloud Vision API in Python
Pre-requisites:
- Create a Google Cloud account: https://console.cloud.google.com/getting-started
- Enable billing on your Google Cloud account to use the Vision API: https://cloud.google.com/billing/docs/how-to/manage-billing-account
- Enable the Vision API for your account: https://console.cloud.google.com/flows/enableapi?apiid=vision.googleapis.com
- Get a credentials JSON file to access the Vision API service: https://console.cloud.google.com/apis/credentials
- Install the google-cloud-vision Python library: pip install google-cloud-vision
Steps:
from google.cloud import vision
- This imports the vision module. To see the module's complete documentation, run help(vision) in your Python console.
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "client_secrets.json"
- Set this environment variable to the path of the credentials JSON file obtained in the fourth prerequisite. This ensures that you are authorized to use the Vision API.
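A wrong path in this variable typically surfaces only later, when the client is created or the first request is sent. The sketch below checks that the file exists before setting the variable. The helper name set_credentials is hypothetical and not part of the google-cloud-vision library.

```python
import os
from pathlib import Path


def set_credentials(json_path):
    """Point GOOGLE_APPLICATION_CREDENTIALS at json_path.

    Hypothetical helper: fails early if the file does not exist.
    """
    path = Path(json_path)
    if not path.is_file():
        raise FileNotFoundError("Credentials file not found: {}".format(path))
    # Store the path as a string; the client library reads it later.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(path)
    return str(path)


# Usage (assumes the file sits in the current working directory):
# set_credentials("client_secrets.json")
```

Failing early keeps a path typo from looking like an authentication problem further down the script.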
client = vision.ImageAnnotatorClient()
- The ImageAnnotatorClient object provides the methods used for text detection.
import io
path = 'Image.jpeg'
with io.open(path, 'rb') as image_file:
    content = image_file.read()
- Open the image file in binary mode and read its contents into the variable content.
image = vision.types.Image(content=content)
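- Instantiate an object of type vision.types.Image and supply content=content as its argument. In google-cloud-vision 2.0 and later the types submodule was removed, so use vision.Image(content=content) instead.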
response = client.text_detection(image=image)
- Call client.text_detection() with image=image as its argument and store the result in a variable named response.
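The response can also carry an error message from the service instead of annotations. A minimal sketch of checking for that follows, assuming the response exposes error.message as the Vision API responses do. The helper name raise_on_error is hypothetical, and the objects in the usage comment are stand-ins, not real API responses.

```python
from types import SimpleNamespace


def raise_on_error(response):
    """Raise RuntimeError if the response carries a non-empty error message."""
    message = response.error.message
    if message:
        raise RuntimeError("Vision API request failed: {}".format(message))
    return response


# Stand-in object shaped like a successful response (illustration only).
ok_response = SimpleNamespace(error=SimpleNamespace(message=""))
raise_on_error(ok_response)  # returns the response unchanged
```

In the real script this check would run right after the API call and before any annotations are read.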
texts = response.text_annotations
- Extract the detected text by reading the text_annotations attribute of response.
for text in texts:
    print('\n"{}"'.format(text.description))
    vertices = (['({},{})'.format(vertex.x, vertex.y)
                 for vertex in text.bounding_poly.vertices])
    print('bounds: {}'.format(','.join(vertices)))
- Print each detected text string along with the vertices of its bounding polygon.
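For text detection, the first entry of text_annotations holds the full detected text and the remaining entries hold the individual words. The sketch below separates the two. The helper name split_annotations and the stand-in data are hypothetical; a real list would come from response.text_annotations.

```python
from types import SimpleNamespace


def split_annotations(texts):
    """Return (full_text, words) from a text_annotations-like sequence."""
    if not texts:
        return "", []
    full_text = texts[0].description
    words = [t.description for t in texts[1:]]
    return full_text, words


# Hypothetical stand-in data shaped like text_annotations entries.
fake_texts = [
    SimpleNamespace(description="ALTO\nSTOP"),
    SimpleNamespace(description="ALTO"),
    SimpleNamespace(description="STOP"),
]
full_text, words = split_annotations(fake_texts)
print(words)
```

Handling the first entry separately avoids printing the whole text block as if it were one more word.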
Sample Code
from google.cloud import vision
import os
import io
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="C:\\Users\\Andy\\Downloads\\My First Project-09289b97c420.json"
client = vision.ImageAnnotatorClient()
path = 'C:\\Users\\Andy\\Desktop\\45_rightCluster_RZED_right_cam_1592014847950008800.png'
with io.open(path, 'rb') as image_file:
    content = image_file.read()
image = vision.types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations
for text in texts:
    print('\n"{}"'.format(text.description))
    vertices = (['({},{})'.format(vertex.x, vertex.y)
                 for vertex in text.bounding_poly.vertices])
    print('bounds: {}'.format(','.join(vertices)))
Sample Input
Sample Output
"ALTO"
bounds: (118,83),(305,83),(305,176),(118,176)
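The four vertices are often more convenient as a left, top, width, height rectangle, for example when cropping the detected region. A minimal sketch follows, using stand-in objects with the coordinates from the sample output above. The helper name bounding_box is hypothetical, and real vertices would come from text.bounding_poly.vertices.

```python
from types import SimpleNamespace


def bounding_box(vertices):
    """Return (x_min, y_min, x_max, y_max) for objects with .x and .y attributes."""
    xs = [v.x for v in vertices]
    ys = [v.y for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


# Stand-in vertices using the coordinates from the sample output.
sample_vertices = [
    SimpleNamespace(x=118, y=83),
    SimpleNamespace(x=305, y=83),
    SimpleNamespace(x=305, y=176),
    SimpleNamespace(x=118, y=176),
]
x_min, y_min, x_max, y_max = bounding_box(sample_vertices)
# Prints left, top, width, height: 118 83 187 93
print(x_min, y_min, x_max - x_min, y_max - y_min)
```

Taking the minimum and maximum of each coordinate also covers rotated text, where the polygon is not axis-aligned.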