
I am trying to test the Google Cloud Vision API by following Google's tutorial on using the Cloud Vision API.

Step 1: Generate the JSON request by typing the following command in the terminal

$ python path/to/generate_json.py -i path/to/cloudVisionInputFile -o path/to/request.json

The above command generates the request.json file.
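Before moving on to step 2, it can help to confirm that the generated file actually exists and contains valid JSON; a missing or empty file at this point would explain an empty POST later. A minimal sketch (the helper name is my own, not part of the tutorial):

```python
import json
import os


def check_request_file(path):
    """Return (size_bytes, num_requests) for a generated request.json.

    Raises OSError if the file is missing and ValueError if it is not
    valid JSON -- either condition would explain curl later sending an
    empty POST (Content-Length: 0).
    """
    size = os.path.getsize(path)
    with open(path) as f:
        data = json.load(f)
    return size, len(data["requests"])
```
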

Step 2: Using curl to Send the Generated Request

$ curl -v -k -s -H "Content-Type: application/json" https://vision.googleapis.com/v1/images:annotate?key=AIzaSyD7Pm-ebpjas62ihvp9v1gAhTk --data-binary @/path/to/request.json > result.json

Output in the terminal (following step 2): notice that the output below shows Content-Length: 0 and [data not shown].

Can someone please advise why the content length is zero, and why I am unable to obtain a JSON response from the Google Cloud Vision API?

Below is the output in the terminal:

* Hostname was NOT found in DNS cache
*   Trying 216.58.347.74...
* Connected to vision.googleapis.com (216.58.347.74) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: /opt/local/share/curl/curl-ca-bundle.crt
  CApath: none
* SSLv3, TLS handshake, Client hello (1):
} [data not shown]
* SSLv3, TLS handshake, Server hello (2):
{ [data not shown]
* SSLv3, TLS handshake, CERT (11):
{ [data not shown]
* SSLv3, TLS handshake, Server key exchange (12):
{ [data not shown]
* SSLv3, TLS handshake, Server finished (14):
{ [data not shown]
* SSLv3, TLS handshake, Client key exchange (16):
} [data not shown]
* SSLv3, TLS change cipher, Client hello (1):
} [data not shown]
* SSLv3, TLS handshake, Finished (20):
} [data not shown]
* SSLv3, TLS change cipher, Client hello (1):
{ [data not shown]
* SSLv3, TLS handshake, Finished (20):
{ [data not shown]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* Server certificate:
*    subject: C=US; ST=California; L=Mountain View; O=Google Inc; CN=*.googleapis.com
*    start date: 2016-10-06 12:44:36 GMT
*    expire date: 2016-12-29 12:28:00 GMT
*    issuer: C=US; O=Google Inc; CN=Google Internet Authority G2
*    SSL certificate verify ok.
> POST /v1/images:annotate?key=AIzaSyD7Pm-ebpjas62ihvp9v1gAhTk HTTP/1.1
> User-Agent: curl/7.37.1
> Host: vision.googleapis.com
> Accept: */*
> Content-Type: application/json
> Content-Length: 0
> 
< HTTP/1.1 200 OK
< Content-Type: application/json; charset=UTF-8
< Vary: X-Origin
< Vary: Referer
< Date: Mon, 17 Oct 2016 13:02:56 GMT
* Server ESF is not blacklisted
< Server: ESF
< Cache-Control: private
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: SAMEORIGIN
< X-Content-Type-Options: nosniff
< Alt-Svc: quic=":443"; ma=2592000; v="36,35,34,33,32"
< Accept-Ranges: none
< Vary: Origin,Accept-Encoding
< Transfer-Encoding: chunked
< 
{ [data not shown]
* Connection #0 to host vision.googleapis.com left intact

Below is the JSON request generated in the request.json file:

{
    "requests": [{
        "image": {
            "content": "/9j/4AAQSkZJRgABAQAA..."
        },
        "features": [{
            "type": "TYPE_UNSPECIFIED",
            "maxResults": 10
        }, {
            "type": "FACE_DETECTION",
            "maxResults": 10
        }, {
            "type": "LANDMARK_DETECTION",
            "maxResults": 10
        }, {
            "type": "LOGO_DETECTION",
            "maxResults": 10
        }, {
            "type": "LABEL_DETECTION",
            "maxResults": 10
        }, {
            "type": "TEXT_DETECTION",
            "maxResults": 10
        }, {
            "type": "SAFE_SEARCH_DETECTION",
            "maxResults": 10
        }]
    }, {
        "image": {
            "content": "/9j/4AAQSkZJRgABAQAAAQABAAD..."
        },
        "features": [{
            "type": "TYPE_UNSPECIFIED",
            "maxResults": 10
        }, {
            "type": "FACE_DETECTION",
            "maxResults": 10
        }, {
            "type": "LANDMARK_DETECTION",
            "maxResults": 10
        }, {
            "type": "LOGO_DETECTION",
            "maxResults": 10
        }, {
            "type": "LABEL_DETECTION",
            "maxResults": 10
        }, {
            "type": "TEXT_DETECTION",
            "maxResults": 10
        }, {
            "type": "SAFE_SEARCH_DETECTION",
            "maxResults": 10
        }]
    }]
}

Below is the code in generate_json.py:

import argparse
import base64
import json
import sys



def main(input_file, output_filename):
    """Translates the input file into a json output file.

    Args:
        input_file: a file object, containing lines of input to convert.
        output_filename: the name of the file to output the json to.
    """
    # Collect all requests into an array - one per line in the input file
    request_list = []
    for line in input_file:
        # The first value of a line is the image. The rest are features.
        image_filename, features = line.lstrip().split(' ', 1)

        # First, get the image data
        with open(image_filename, 'rb') as image_file:
            content_json_obj = {
                'content': base64.b64encode(image_file.read()).decode('UTF-8')
            }

        # Then parse out all the features we want to compute on this image
        feature_json_obj = []
        for word in features.split(' '):
            feature, max_results = word.split(':', 1)
            feature_json_obj.append({
                'type': get_detection_type(feature),
                'maxResults': int(max_results),
            })

        # Now add it to the request
        request_list.append({
            'features': feature_json_obj,
            'image': content_json_obj,
        })

    # Write the object to a file, as json
    with open(output_filename, 'w') as output_file:
        json.dump({'requests': request_list}, output_file)


DETECTION_TYPES = [
    'TYPE_UNSPECIFIED',
    'FACE_DETECTION',
    'LANDMARK_DETECTION',
    'LOGO_DETECTION',
    'LABEL_DETECTION',
    'TEXT_DETECTION',
    'SAFE_SEARCH_DETECTION',
]


def get_detection_type(detect_num):
    """Return the Vision API symbol corresponding to the given number."""
    detect_num = int(detect_num)
    if 0 < detect_num < len(DETECTION_TYPES):
        return DETECTION_TYPES[detect_num]
    else:
        return DETECTION_TYPES[0]
# [END generate_json]

FILE_FORMAT_DESCRIPTION = '''
Each line in the input file must be of the form:

    file_path feature:max_results feature:max_results ....

where 'file_path' is the path to the image file you'd like
to annotate, 'feature' is a number from 1 to %s,
corresponding to the feature to detect, and max_results is a
number specifying the maximum number of those features to
detect.

The valid values - and their corresponding meanings - for
'feature' are:

    %s
'''.strip() % (
    len(DETECTION_TYPES) - 1,
    # The numbered list of detection types
    '\n    '.join(
        # Don't present the 0th detection type ('UNSPECIFIED') as an option.
        '%s: %s' % (i + 1, detection_type)
        for i, detection_type in enumerate(DETECTION_TYPES[1:])))


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        formatter_class=argparse.RawTextHelpFormatter
    )
    parser.add_argument(
        '-i', dest='input_file', required=True,
        help='The input file to convert to json.\n' + FILE_FORMAT_DESCRIPTION)
    parser.add_argument(
        '-o', dest='output_file', required=True,
        help='The name of the json file to output to.')
    args = parser.parse_args()
    try:
        with open(args.input_file, 'r') as input_file:
            main(input_file, args.output_file)
    except ValueError as e:
        sys.exit('Invalid input file format.\n' + FILE_FORMAT_DESCRIPTION)

Below is the text inside cloudVisionInputFile:

/Users/pravishanthmadepally/documents/machineLearning/googleCloudVisionAPI/images/img1.jpeg 0:10 1:10 2:10 3:10 4:10 5:10 6:10
/Users/pravishanthmadepally/documents/machineLearning/googleCloudVisionAPI/images/img2.jpeg 0:10 1:10 2:10 3:10 4:10 5:10 6:10
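Each of these lines is split the same way main() above splits it: the first token is the image path, and the remaining tokens are feature:max_results pairs. A small sketch with an illustrative line (the path and feature numbers here are examples, not the actual input):

```python
# Illustrative line in the cloudVisionInputFile format:
#   file_path feature:max_results feature:max_results ...
line = "/tmp/img1.jpeg 0:10 4:10"

# The first token is the image path; the rest are feature:max_results pairs.
image_filename, features = line.lstrip().split(' ', 1)
pairs = [word.split(':', 1) for word in features.split(' ')]

assert image_filename == "/tmp/img1.jpeg"
assert pairs == [['0', '10'], ['4', '10']]
```
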
SpaceX

2 Answers


I just tried to replicate your problem using the following steps:

  1. Copied your request.json file and replaced the image content with a base64 encoding of my own image; the rest of the file was left unchanged
  2. Provided my own API key in your curl request

This was my output:

* STATE: INIT => CONNECT handle 0x600080a40; line 1397 (connection #-5000)
* Added connection 0. The cache now contains 1 members
*   Trying 66.102.1.95...
* TCP_NODELAY set
* STATE: CONNECT => WAITCONNECT handle 0x600080a40; line 1450 (connection #0)
* Connected to vision.googleapis.com (66.102.1.95) port 443 (#0)
* STATE: WAITCONNECT => SENDPROTOCONNECT handle 0x600080a40; line 1557 (connection #0)
* Marked for [keep alive]: HTTP default
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* STATE: SENDPROTOCONNECT => PROTOCONNECT handle 0x600080a40; line 1571 (connection #0)
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=Mountain View; O=Google Inc; CN=*.googleapis.com
*  start date: Oct  6 12:44:36 2016 GMT
*  expire date: Dec 29 12:28:00 2016 GMT
*  issuer: C=US; O=Google Inc; CN=Google Internet Authority G2
*  SSL certificate verify ok.
* STATE: PROTOCONNECT => DO handle 0x600080a40; line 1592 (connection #0)
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* nghttp2_session_mem_recv() returns 0
* http2_send len=203
* Using Stream ID: 1 (easy handle 0x600080a40)
* before_frame_send() was called
* on_frame_send() was called, length = 106
> POST /v1/images:annotate?key=AIzaSyBj4XkAmOPByBmNbrcBJT0KBMM6xAw7eAM HTTP/1.1
> Host: vision.googleapis.com
> User-Agent: curl/7.50.3
> Accept: */*
> Content-Type: application/json
> Content-Length: 27170
>
* STATE: DO => DO_DONE handle 0x600080a40; line 1654 (connection #0)
* multi changed, check CONNECT_PEND queue!
* STATE: DO_DONE => WAITPERFORM handle 0x600080a40; line 1781 (connection #0)
* STATE: WAITPERFORM => PERFORM handle 0x600080a40; line 1791 (connection #0)
* http2_recv: easy 0x600080a40 (stream 1)
* nread=27
* Got SETTINGS
* MAX_CONCURRENT_STREAMS == 100
* ENABLE_PUSH == TRUE
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* nghttp2_session_mem_recv() returns 27
* All data in connection buffer processed
* http2_recv returns AGAIN for stream 1
* http2_send len=16384
* data_source_read_callback: returns 16384 bytes stream 1
* on_frame_send() was called, length = 16384
* http2_send returns 16384 for stream 1
* multi changed, check CONNECT_PEND queue!
* http2_recv: easy 0x600080a40 (stream 1)
* nread=13
* nghttp2_session_mem_recv() returns 13
* All data in connection buffer processed
* http2_recv returns AGAIN for stream 1
* http2_send len=10786
* data_source_read_callback: returns 10786 bytes stream 1
* on_frame_send() was called, length = 10786
* http2_send returns 10786 for stream 1
* We are completely uploaded and fine
* http2_recv: easy 0x600080a40 (stream 1)
* nread=9
* Got SETTINGS
* MAX_CONCURRENT_STREAMS == 100
* ENABLE_PUSH == TRUE
* nghttp2_session_mem_recv() returns 9
* All data in connection buffer processed
* http2_recv returns AGAIN for stream 1
* http2_recv: easy 0x600080a40 (stream 1)
* nread=225
* on_begin_headers() was called
* h2 status: HTTP/2 200 (easy 0x600080a40)
* h2 header: content-type: application/json; charset=UTF-8
* h2 header: vary: X-Origin
* h2 header: vary: Referer
* h2 header: vary: Origin,Accept-Encoding
* h2 header: date: Mon, 17 Oct 2016 14:27:59 GMT
* h2 header: server: ESF
* h2 header: cache-control: private
* h2 header: x-xss-protection: 1; mode=block
* h2 header: x-frame-options: SAMEORIGIN
* h2 header: x-content-type-options: nosniff
* h2 header: alt-svc: quic=":443"; ma=2592000; v="36,35,34,33,32"
* h2 header: accept-ranges: none
* on_frame_recv() header 1 stream 1
* Store 367 bytes headers from stream 1 at 0x600081410
* nghttp2_session_mem_recv() returns 225
* All data in connection buffer processed
* http2_recv: returns 367 for stream 1
* HTTP/2 found, allow multiplexing
< HTTP/2 200
< content-type: application/json; charset=UTF-8
< vary: X-Origin
< vary: Referer
< vary: Origin,Accept-Encoding
< date: Mon, 17 Oct 2016 14:27:59 GMT
< server: ESF
< cache-control: private
< x-xss-protection: 1; mode=block
< x-frame-options: SAMEORIGIN
< x-content-type-options: nosniff
< alt-svc: quic=":443"; ma=2592000; v="36,35,34,33,32"
< accept-ranges: none
<
* http2_recv: easy 0x600080a40 (stream 1)
* nread=1401
* 1392 data received for stream 1 (14992 left in buffer 0x600081410, total 1392)
* nghttp2_session_mem_recv() returns 1401
* All data in connection buffer processed
* http2_recv: returns 1392 for stream 1
{
  "responses": [
    {
      "labelAnnotations": [
        {
          "mid": "/m/07s6nbt",
          "description": "text",
          "score": 0.941945
        },
        {
          "mid": "/m/03gq5hm",
          "description": "font",
          "score": 0.87127215
        },
        {
          "mid": "/m/03scnj",
          "description": "line",
          "score": 0.72790623
        },
        {
          "mid": "/m/01cd9",

As you can see, it worked fine! What does your --data-binary @/path/to/request.json parameter look like exactly when you execute the command?
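One thing worth knowing: if curl cannot read the file named after the @, it sends an empty POST (which is exactly what Content-Length: 0 indicates), and the -s flag suppresses the warning curl would otherwise print about it. A quick sketch to rule that out before invoking curl (the helper name is my own):

```python
import os


def data_binary_file_ok(path):
    """Return True if the file that curl's --data-binary @path would read
    exists and is non-empty.

    If curl cannot read the @-file, it sends an empty POST, which shows
    up in verbose output as Content-Length: 0; with -s the warning about
    the unreadable file is hidden.
    """
    return os.path.isfile(path) and os.path.getsize(path) > 0
```
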

Serge Hendrickx

If you're trying to use the Cloud Vision API from Python, you may want to try using the google.cloud client library.

To authenticate with the right scope, you'll need to generate a service account in the Cloud Console, and point to it from your code (or environment variables). See the Vision auth section for more info:

Get a service account from the credentials manager, and then point to your project and JSON credentials file in your environment:

$ export GOOGLE_CLOUD_PROJECT="your-project-id-here"
$ export GOOGLE_APPLICATION_CREDENTIALS="/path/to/keyfile.json"
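A common pitfall is exporting these variables in a different shell session from the one that runs the client. A quick sketch to confirm they are visible to the Python process (variable names are taken from the exports above):

```python
import os

# Print whether the two variables from the exports above are visible
# to this Python process; they must be set in the same shell session
# that launches the client.
for var in ("GOOGLE_CLOUD_PROJECT", "GOOGLE_APPLICATION_CREDENTIALS"):
    print("%s = %s" % (var, os.environ.get(var, "(not set)")))
```
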

You can do manual detection, as shown in your question (where you specify the different things you want detected), like this:

>>> from google.cloud import vision
>>> from google.cloud.vision.feature import Feature
>>> from google.cloud.vision.feature import FeatureTypes
>>> client = vision.Client()
>>> image = client.image(source_uri='gs://my-test-bucket/image.jpg')
>>> features = [Feature(FeatureTypes.FACE_DETECTION, 5),
...             Feature(FeatureTypes.LOGO_DETECTION, 3)]
>>> annotations = image.detect(features)
>>> len(annotations)
2
>>> for face in annotations[0].faces:
...     print(face.joy)
Likelihood.VERY_LIKELY
Likelihood.VERY_LIKELY
Likelihood.VERY_LIKELY
>>> for logo in annotations[0].logos:
...     print(logo.description)
'google'
'github'

(See https://googlecloudplatform.github.io/google-cloud-python/stable/vision-usage.html#manual-detection for more detail).

However if you're only looking for one thing (e.g., labels), you can use feature-specific detection:

>>> from google.cloud import vision
>>> client = vision.Client()
>>> image = client.image(source_uri='gs://my-storage-bucket/image.jpg')
>>> labels = image.detect_labels(limit=3)
>>> labels[0].description
'automobile'
>>> labels[0].score
0.9863683
JJ Geewax