How to use Google Cloud Vision with Unity to recognise text using a mobile camera?

Question

I am testing a project that reads text from objects and pictures using Google Cloud Vision. I would like to capture the text with a mobile camera (preferably iPhone or iPad, or Android phones). The Samsung Bixby application is an example of what I mean. After some reading I found OpenCV for Unity and Google Cloud Vision. OpenCV for Unity costs around $95, so I cannot use it for testing, and I took the other option.

I downloaded this project (GitHub project), created a Google Cloud Vision API key, and added it in the inspector. I set the feature type option to text detection. When I made an iOS build, the camera turned on but the image looked inverted, and nothing else happened. I also see a missing script in the inspector. How can I detect text using the device camera?

This usually happens if A) the script file is actually missing, e.g. because a prefab was copied to a new project without its script files, B) the script file name does not match the component class name in the code, or C) you have compiler errors. Since the only script from the linked project is `WebCamTextureToCloudVision`, which is actually there, it seems to be one of your other scripts. — derHugo, Jul 24 '19 at 20:09
@derHugo Any idea how to detect text in real time using a mobile camera? — zyonneo, Jul 25 '19 at 05:07

score 1 · Answer 1 · answered Aug 18 '21 at 13:42

The Unity Cloud Vision git repo contains code for face detection. It is not suitable for OCR or text detection.

So I wrote a script that performs text detection on images using the Vision OCR API in Unity3D.

You can try the following script to detect text from an image in Unity3D.

using UnityEngine;
using System.Collections;
using System.Collections.Generic;
using UnityEngine.UI;
using SimpleJSON;

public class WebCamTextureToCloudVision : MonoBehaviour {

    public string url = "https://vision.googleapis.com/v1/images:annotate?key=";
    public string apiKey = ""; //Put your google cloud vision api key here
    public float captureIntervalSeconds = 5.0f;
    public int requestedWidth = 640;
    public int requestedHeight = 480;
    public FeatureType featureType = FeatureType.TEXT_DETECTION;
    public int maxResults = 10;
    public GameObject resPanel;
    public Text responseText, responseArray; 

    WebCamTexture webcamTexture;
    Texture2D texture2D;
    Dictionary<string, string> headers;

    [System.Serializable]
    public class AnnotateImageRequests {
        public List<AnnotateImageRequest> requests;
    }

    [System.Serializable]
    public class AnnotateImageRequest {
        public Image image;
        public List<Feature> features;
    }

    [System.Serializable]
    public class Image {
        public string content;
    }

    [System.Serializable]
    public class Feature {
        public string type;
        public int maxResults;
    }

    public enum FeatureType {
        TYPE_UNSPECIFIED,
        FACE_DETECTION,
        LANDMARK_DETECTION,
        LOGO_DETECTION,
        LABEL_DETECTION,
        TEXT_DETECTION,
        SAFE_SEARCH_DETECTION,
        IMAGE_PROPERTIES
    }

    // Use this for initialization
    void Start () {
        headers = new Dictionary<string, string>();
        headers.Add("Content-Type", "application/json; charset=UTF-8");

        if (apiKey == null || apiKey == "")
            Debug.LogError("No API key. Please set your API key into the \"Web Cam Texture To Cloud Vision(Script)\" component.");
        
        WebCamDevice[] devices = WebCamTexture.devices;
        for (var i = 0; i < devices.Length; i++) {
            Debug.Log (devices [i].name);
        }
        if (devices.Length > 0) {
            webcamTexture = new WebCamTexture(devices[0].name, requestedWidth, requestedHeight);
            Renderer r = GetComponent<Renderer> ();
            if (r != null) {
                Material m = r.material;
                if (m != null) {
                    m.mainTexture = webcamTexture;
                }
            }
            webcamTexture.Play();
            StartCoroutine(Capture());
        }   
    }
    
    // Update is called once per frame
    void Update () {

    }

    private IEnumerator Capture() {
        while (true) {
            // Skip this cycle until an API key has been set.
            if (string.IsNullOrEmpty(this.apiKey)) {
                yield return null;
                continue;
            }

            yield return new WaitForSeconds(captureIntervalSeconds);

            Color[] pixels = webcamTexture.GetPixels();
            // The camera may not have delivered a frame yet; try again on the next cycle.
            if (pixels.Length == 0) {
                yield return null;
                continue;
            }
            if (texture2D == null || webcamTexture.width != texture2D.width || webcamTexture.height != texture2D.height) {
                texture2D = new Texture2D(webcamTexture.width, webcamTexture.height, TextureFormat.RGBA32, false);
            }

            texture2D.SetPixels(pixels);
            // texture2D.Apply(false); // Not required, because the texture does not need to be uploaded to the GPU
            byte[] jpg = texture2D.EncodeToJPG();
            string base64 = System.Convert.ToBase64String(jpg);
// #if UNITY_WEBGL  
//          Application.ExternalCall("post", this.gameObject.name, "OnSuccessFromBrowser", "OnErrorFromBrowser", this.url + this.apiKey, base64, this.featureType.ToString(), this.maxResults);
// #else
            
            AnnotateImageRequests requests = new AnnotateImageRequests();
            requests.requests = new List<AnnotateImageRequest>();

            AnnotateImageRequest request = new AnnotateImageRequest();
            request.image = new Image();
            request.image.content = base64;
            request.features = new List<Feature>();
            Feature feature = new Feature();
            feature.type = this.featureType.ToString();
            feature.maxResults = this.maxResults;
            request.features.Add(feature); 
            requests.requests.Add(request);

            string jsonData = JsonUtility.ToJson(requests, false);
            if (jsonData != string.Empty) {
                string url = this.url + this.apiKey;
                // Encode as UTF-8 to match the charset declared in the Content-Type header.
                byte[] postData = System.Text.Encoding.UTF8.GetBytes(jsonData);
                using(WWW www = new WWW(url, postData, headers)) {
                    yield return www;
                    if (string.IsNullOrEmpty(www.error)) {
                        // Parse the raw response. Stripping spaces first would also delete the spaces inside the recognised text.
                        // Debug.Log(www.text);
                        JSONNode res = JSON.Parse(www.text);
                        // The first textAnnotations entry holds the full detected text; Value is empty when nothing was found.
                        string fullText = res["responses"][0]["textAnnotations"][0]["description"].Value.Trim();
                        if (fullText != ""){
                            Debug.Log("OCR Response: " + fullText);
                            resPanel.SetActive(true);
                            responseText.text = fullText.Replace("\n", " ");
                            // Each detected line is separated by a newline in the description.
                            string[] texts = fullText.Split('\n');
                            responseArray.text = "";
                            for(int i=0;i<texts.Length;i++){
                                responseArray.text += texts[i];
                                if(i != texts.Length - 1)
                                    responseArray.text += ", ";
                            }
                        }
                    } else {
                        Debug.Log("Error: " + www.error);
                    }
                }
            }
// #endif
        }
    }

#if UNITY_WEBGL
    void OnSuccessFromBrowser(string jsonString) {
        Debug.Log(jsonString);  
    }

    void OnErrorFromBrowser(string jsonString) {
        Debug.Log(jsonString);  
    }
#endif

}

The demo project is available on GitHub: codemaker2015/google-cloud-vision-api-ocr-unity3d-demo
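
Note that the `WWW` class used in the script is marked obsolete in recent Unity versions, where `UnityWebRequest` is the replacement. The following is only a minimal sketch of the same POST, assuming Unity 2020.2 or newer (for `UnityWebRequest.Result`). The class name `VisionRequestSketch` and the method `PostJson` are hypothetical and not part of the script above or of the demo project.

```csharp
using System;
using System.Collections;
using System.Text;
using UnityEngine;
using UnityEngine.Networking;

// Hypothetical helper, not part of the original script.
public static class VisionRequestSketch
{
    // Posts a JSON body to the given URL and passes the response text
    // (or null on failure) to onDone.
    public static IEnumerator PostJson(string url, string jsonData, Action<string> onDone)
    {
        byte[] body = Encoding.UTF8.GetBytes(jsonData);

        using (UnityWebRequest req = new UnityWebRequest(url, UnityWebRequest.kHttpVerbPOST))
        {
            req.uploadHandler = new UploadHandlerRaw(body);
            req.downloadHandler = new DownloadHandlerBuffer();
            req.SetRequestHeader("Content-Type", "application/json; charset=UTF-8");

            // Suspends the calling coroutine until the request has finished.
            yield return req.SendWebRequest();

            // Assumption: Unity 2020.2+, where the result property is available.
            if (req.result == UnityWebRequest.Result.Success)
            {
                onDone(req.downloadHandler.text);
            }
            else
            {
                Debug.LogError("Vision request failed: " + req.error);
                onDone(null);
            }
        }
    }
}
```

Inside `Capture`, the `using (WWW ...)` block could then be replaced with something like `yield return VisionRequestSketch.PostJson(this.url + this.apiKey, jsonData, text => { /* parse text with SimpleJSON as before */ });`, since Unity coroutines can yield on a nested `IEnumerator`.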

score 0 · Answer 2 · answered Jun 17 '23 at 13:11

Amazing functionality. Has anyone tried to draw a bounding box around the detected object or item? I tried to build one from the vertices returned in the JSON from the Google Vision AI API, but I am not able to translate the coordinates to the ones used in the canvas. How do you achieve this?

thanks for the help nfc

th0r

This seems like a new question rather than an answer to the question. You should submit a new question – Schwarz Software Jun 20 '23 at 02:15
If you have a new question, please ask it by clicking the [Ask Question](https://stackoverflow.com/questions/ask) button. Include a link to this question if it helps provide context. - [From Review](/review/late-answers/34560777) – Andrew Jun 23 '23 at 05:48
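
On the coordinate question raised in this answer: the Vision API reports `boundingPoly` vertices in pixels of the submitted image, with the origin at the top-left and y pointing down, whereas a Unity `RectTransform` measures y upwards. A minimal sketch of the conversion follows, assuming the image fills the displaying rect exactly (no letterboxing) and is neither rotated nor mirrored. The names `VisionBoxSketch` and `ToLocalPoint` are hypothetical.

```csharp
using UnityEngine;

// Hypothetical helper, not part of any script in this thread.
public static class VisionBoxSketch
{
    // Maps a Vision API vertex (pixels, origin top-left, y down) into the local
    // space of the RectTransform that displays the image (y up).
    // imageSize is the size in pixels of the image that was sent to the API;
    // displayRect is the local rect of the displaying element, e.g. rectTransform.rect.
    public static Vector2 ToLocalPoint(Vector2 vertex, Vector2 imageSize, Rect displayRect)
    {
        float u = vertex.x / imageSize.x;        // 0..1 measured from the left edge
        float v = 1f - vertex.y / imageSize.y;   // flipped: 0..1 measured from the bottom edge
        return new Vector2(
            displayRect.xMin + u * displayRect.width,
            displayRect.yMin + v * displayRect.height);
    }
}
```

Converting all four vertices this way gives the corners of the box in the local space of the displaying element, so a child marker could be placed by assigning the result to its `localPosition`. If the camera image is rotated or mirrored on the device, that transform would have to be applied as well.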
