Python and Google Cloud Vision for Tamil text

Update: If you are looking for a simpler way to do Tamil text OCR, check this post, which shows how to do it with batch files.

Learning anything new is NOT easy. And doing it yourself is tough when, for years, you had a team who could do it faster and better than you. I am talking about myself, learning and writing code in the Python programming language.

A few weeks ago, I wrote (mostly copy ‘n’ paste) a couple of Python snippets to do Speech to Text and Text to Voice for the Tamil language; the blog post is here. I followed that with a small program that runs OCR on the Tamil text in a given image and then machine-translates it to English. The Python code uses the Google Cloud Vision and Google Translate APIs. You will need an account (a free one is fine) with Google Cloud Platform (GCP).

The input and output for the program are given below:

Sample Text in Tamil, given as input to the Python Program

Output from the program. At the top is the output from Google Cloud Vision OCR, and at the bottom is the output from Google Cloud Translate

For the OCR part, I got the base code from here; the code for Google Translate came from the GCP samples on GitHub. You need to install the google-cloud-vision and google-cloud-translate packages. Both are available from PyPI through the pip command.
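Before running the program, it can help to confirm that both packages are visible to the interpreter you are actually using. The snippet below is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name installed_version is made up for this illustration and is not part of either Google package.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names as published on PyPI
    for name in ("google-cloud-vision", "google-cloud-translate"):
        version = installed_version(name)
        print(f"{name}: {version or 'not installed'}")
```

If either line prints "not installed", the pip command most likely installed the package into a different Python environment from the one running the script.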

# pip install google-cloud-vision
# pip install google-cloud-translate
# You need to create your own client_secrets.json from Google Cloud Platform. The file in this project has only empty strings :-)
# The input image file should be in the same folder as this script; in this case it is named book-tamil.png
# This program runs as-is inside Thonny. Outside Thonny, run it from a Python virtual environment, otherwise you will face an ImportError from grpc._cython

def detect_text(path):
    from google.cloud import vision
    import os
    import io

    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "client_secrets.json"
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    # google-cloud-vision 2.0+ exposes Image directly; releases before 2.0 used vision.types.Image
    image = vision.Image(content=content)
    response = client.text_detection(
        image=image,
        image_context={"language_hints": ["ta"]},  # Tamil
    )
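    # Surface API-side failures instead of silently returning an empty result
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.text_annotations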
# [END vision_text_detection]

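def Quick_Translate(SourceText):
    # translate_v2 is the Basic API that provides Client(); in google-cloud-translate 2.0+
    # the plain "translate" module points to v3, which has no Client class
    from google.cloud import translate_v2 as translate

    # Instantiates a client
    translate_client = translate.Client()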

    sourcelanguage = 'ta'
    targetlanguage = 'en'

    translation = translate_client.translate(
        SourceText,
        source_language=sourcelanguage,
        target_language=targetlanguage)

    return translation['translatedText']
# [END translate_quickstart]

path = 'book-tamil.png'

mytexts = detect_text(path)
# The first annotation holds the full detected text; the rest are individual words
mytext = mytexts[0].description
print(mytext)

mytranslated = Quick_Translate(mytext)
print(mytranslated)
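Since the client_secrets.json shipped with the project contains only empty strings, a quick sanity check before calling the API can save a confusing authentication error. This is a rough sketch, not part of the original program: the helper name and the list of required fields are assumptions, based on the fields that normally appear in a service account key file downloaded from the GCP console.

```python
import json
from pathlib import Path

# Assumed field names, as found in a typical GCP service account key file;
# adjust this tuple if your credentials file is laid out differently.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_credential_fields(path="client_secrets.json"):
    """Return the required fields that are absent or empty in the key file."""
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"Credentials file not found: {key_file}")
    data = json.loads(key_file.read_text(encoding="utf-8"))
    # Treat a missing key, None, and whitespace-only strings as "empty"
    return [name for name in REQUIRED_FIELDS if not str(data.get(name) or "").strip()]


if __name__ == "__main__":
    try:
        missing = missing_credential_fields()
    except FileNotFoundError as error:
        print(error)
    else:
        print("Missing or empty fields:", missing if missing else "none")
```

An empty list means the file at least looks filled in; it does not prove that the key is valid or that the Vision and Translate APIs are enabled for the project.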

When I first wrote the code, it didn’t run in Visual Studio Code (my favourite IDE); it failed with ImportError: cannot import name ‘cygrpc’ from ‘grpc._cython’. It ran fine in Thonny (an IDE loved by the Python beginner community), so I didn’t bother further. But it kept bugging me. Both IDEs were using the same source code and the same Python interpreter (or so I thought), so why the different behaviour?

Python code using Google Cloud Vision API working fine in Thonny IDE

Python code using Google Cloud Vision API throwing an error in Visual Studio Code

Yesterday I spent half a day solving this mystery. At the outset, I was able to infer that it had to do with the environment in Visual Studio Code being different in some way from Thonny's. Thonny comes with its own Python interpreter out of the box and a GUI-based plug-ins setup that is separate from the pip command.

Google searches about the problem were not helpful. Doing the grunt work (phew!) of trial and error, I was led in the direction of Python Virtual Environments: “a self-contained directory tree that contains a Python installation for a particular version of Python, plus a number of additional packages”. This made sense: for the code to work in Thonny, Thonny had to be using a different (virtual) environment from the base one on my Windows PC.

Then things fell into place as I learned about venv (as Python programmers call Virtual Environments). I set up a new venv, then gave its path to VS Code: Ctrl+Shift+P, then the “Python: Select Interpreter” option. Now it worked fine in Code as well. Reading the Google Cloud Vision GitHub page afterwards confirmed that it calls for a Python virtual environment. True to the etiquette of any self-respecting engineer, I read the documentation only after figuring out the solution.
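To see for yourself which environment an IDE is really using, you can run a small diagnostic in each one and compare the output. The following is a sketch under a few assumptions: the function names are invented for this example, and the last check simply tries to import grpc._cython.cygrpc, the module named in the error message above.

```python
import importlib
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while sys.base_prefix
    # points at the base installation. Some older virtualenv releases set
    # sys.real_prefix instead, so both signals are checked.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


def can_import(module_name):
    """Return True if the module imports cleanly in this interpreter."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


def describe_environment():
    """Collect the facts that can differ between two IDEs on the same PC."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtual_environment": in_virtual_environment(),
        "grpc_cython_imports": can_import("grpc._cython.cygrpc"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the "executable" line differs between the two IDEs, they are not running the same interpreter, even though the source file is identical.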

Google Cloud Vision – Python Samples

The commands on Windows for a new Python Virtual Environment go like this:

pip install virtualenv
virtualenv myenv
myenv\scripts\python.exe -m pip install google-cloud-vision google-cloud-translate
myenv\scripts\python.exe vision_and_translate.py
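Python 3 also ships a venv module in its standard library, so the same kind of environment can be created without installing virtualenv first (the command-line form is python -m venv myenv). Here is a minimal sketch of doing it from Python; the function name create_environment is made up for this example, and it is an alternative to, not a description of, the virtualenv commands I used.

```python
import sys
import venv
from pathlib import Path


def create_environment(directory="myenv", with_pip=True):
    """Create a virtual environment and return the path to its interpreter."""
    env_dir = Path(directory)
    venv.create(env_dir, with_pip=with_pip)
    # Windows keeps the interpreter under Scripts; Linux and macOS use bin
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

Calling create_environment("myenv") returns the interpreter path, which is also the path to give VS Code under "Python: Select Interpreter". The new environment starts out empty, so the Google packages still have to be installed into it with that interpreter's pip.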

Sample instructions and code are often incomplete (or not detailed) for Windows. Most of those who write open-source packages and docs prefer *NIX, a choice I respect. For me, Windows is better, as it offers the best of both worlds, but that’s a different topic.

I am falling in love with Python. Sorry, Visual Basic, you have to step aside from my life! I am flirting with Chocolatey and Jupyter but that’s for a different day.

Categorized in:

Coding

Tagged in:

Google, Localization, Python

About the Author

Venkatarangan Thirumalai
