by @kodeazy

How to perform scene text detection using the Google Vision API in Python

Pre-requisites:

  • Create a Google Cloud account: https://console.cloud.google.com/getting-started
  • Enable billing on your Google Cloud account so that you can use the Vision API: https://cloud.google.com/billing/docs/how-to/manage-billing-account
  • Enable the Vision API for your account: https://console.cloud.google.com/flows/enableapi?apiid=vision.googleapis.com
  • Get a credentials JSON file to access the Vision API service: https://console.cloud.google.com/apis/credentials
  • Install the google-cloud-vision Python library: pip install google-cloud-vision
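
Before moving on, it can help to confirm that the downloaded credentials file is readable JSON. The sketch below is only an illustration: the helper name looks_like_service_account_key is hypothetical, and it assumes the file is a service account key, which normally carries a "type" field set to "service_account". Other credential types will not pass this check, and passing it does not prove the key is valid.

```python
import json
import os


def looks_like_service_account_key(path):
    """Return True if `path` is a JSON file whose "type" is "service_account".

    Hypothetical helper for illustration; it does not contact Google Cloud
    and does not prove that the key is valid or still active.
    """
    if not os.path.isfile(path):
        return False
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):
        # Unreadable file or malformed JSON.
        return False
    return isinstance(data, dict) and data.get("type") == "service_account"
```

Calling looks_like_service_account_key("client_secrets.json") before setting the environment variable gives an early hint that the path is wrong or the download is incomplete.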

The Steps:

    from google.cloud import vision
  • This imports the vision module. To see the complete documentation of this module, type help(vision) in your Python console.

    import os
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "client_secrets.json"
  • Set this environment variable to the path of the credentials JSON file obtained in the fourth prerequisite. This ensures that you are authorized to use the Vision API.

    client = vision.ImageAnnotatorClient()
  • ImageAnnotatorClient provides the methods used for text detection.

    import io
    path = 'Image.jpeg'
    with io.open(path, 'rb') as image_file:
        content = image_file.read()
  • Open the image file in binary mode and read its contents into the variable content.

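    image = vision.types.Image(content=content)
  • Instantiate an object of type vision.types.Image, passing content=content as its argument. In google-cloud-vision 2.0 and later the types module was removed, so use vision.Image(content=content) there instead.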

    response = client.text_detection(image=image)
  • Call client.text_detection() with image=image as its argument and store the result in the variable response.

    texts = response.text_annotations
  • Extract the detected text by reading the text_annotations attribute of response.

    for text in texts:
        print('\n"{}"'.format(text.description))
        vertices = ['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices]
        print('bounds: {}'.format(','.join(vertices)))
  • Print each detected text together with the vertices of its bounding polygon.
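
The formatting done in the last step can be tried without calling the API. The following is a minimal sketch under stated assumptions: format_bounds is a hypothetical helper name, and the vertices are plain stand-in objects with x and y attributes rather than real API response objects.

```python
from types import SimpleNamespace


def format_bounds(vertices):
    """Join vertex coordinates as "(x,y),(x,y),..." in the order given."""
    return ",".join("({},{})".format(v.x, v.y) for v in vertices)


# Stand-in vertices (not real API objects) for a rectangular text region.
corners = [
    SimpleNamespace(x=118, y=83),
    SimpleNamespace(x=305, y=83),
    SimpleNamespace(x=305, y=176),
    SimpleNamespace(x=118, y=176),
]
print("bounds: {}".format(format_bounds(corners)))
```

With a real response, the same helper could be given text.bounding_poly.vertices, since it only reads the x and y attributes of each vertex.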

Sample Code

from google.cloud import vision
import os
import io

# Point the client library at the credentials JSON file.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="C:\\Users\\Andy\\Downloads\\My First Project-09289b97c420.json"

client = vision.ImageAnnotatorClient()

path = 'C:\\Users\\Andy\\Desktop\\45_rightCluster_RZED_right_cam_1592014847950008800.png'

# Read the image as raw bytes.
with io.open(path, 'rb') as image_file:
    content = image_file.read()

# On google-cloud-vision 2.0 or later, use vision.Image(content=content) instead.
image = vision.types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations

# Print each detected text and the corners of its bounding polygon.
for text in texts:
    print('\n"{}"'.format(text.description))
    vertices = ['({},{})'.format(vertex.x, vertex.y)
                for vertex in text.bounding_poly.vertices]
    print('bounds: {}'.format(','.join(vertices)))
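
The sample code does not check whether the request itself failed. The response carries an error field whose message is empty on success, so a small guard can be added before reading text_annotations. This is a sketch only: raise_if_failed is a hypothetical helper name, and the demonstration uses stand-in objects instead of a live API response.

```python
from types import SimpleNamespace


def raise_if_failed(response):
    """Raise RuntimeError if the response reports an error, else return it."""
    message = response.error.message
    if message:
        raise RuntimeError("Vision API request failed: {}".format(message))
    return response


# Stand-in for a successful response (an empty error message means no error).
ok_response = SimpleNamespace(error=SimpleNamespace(message=""))
checked = raise_if_failed(ok_response)
```

In the sample code, the guard would be called as raise_if_failed(response) right after client.text_detection(image=image).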

Sample Input

Text Detection Image

Sample Output

"ALTO"
bounds: (118,83),(305,83),(305,176),(118,176)
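
The bounds are the four corners of the detected text region, in pixels. As a rough illustration of how they can be used, the sketch below computes the width and height of the enclosing axis-aligned box from the sample output. The function name box_size is hypothetical, and the corners are entered by hand as (x, y) tuples.

```python
def box_size(points):
    """Return (width, height) of the axis-aligned box enclosing the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return max(xs) - min(xs), max(ys) - min(ys)


# Corner coordinates copied from the sample output.
alto = [(118, 83), (305, 83), (305, 176), (118, 176)]
width, height = box_size(alto)
print(width, height)  # 187 93
```

Taking the minimum and maximum of each axis also works when the text is slightly rotated and the polygon is not a perfect rectangle.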