4

I am trying to call Google's Vertex AI API via REST, at an endpoint like:

https://us-central1-aiplatform.googleapis.com/v1/projects/...

I am having trouble figuring out where to get the "access token":

-X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \

I was able to generate a short-term OAuth token from the Google Cloud CLI, but I want to generate a long-term one. I have tried the following, both of which return a 401 error:

  • API Key

  • Service Account

I just need this for testing purposes. Is there a way for me to do this easily?

I used the Google Cloud CLI, but it was only a short-term solution: the token expired after 30 minutes.

  • All OAuth credentials are short-lived (3600 seconds) except if you have an ORG where you can create credentials for up to 24 hours. – John Hanley May 11 '23 at 17:31
  • Since you are adding a bounty, specify whether you have an ORGANIZATION and which programming language you use. If you do not have a Google Cloud organization, then the answer is `not possible`, as you cannot generate tokens for more than 3600 seconds. – John Hanley Jun 13 '23 at 21:28
  • It's not a problem; the problem was how to authenticate from a non-GCE server in production. I have no organization. The language is Node.js. Honestly, I just wanted to offer my customers a replacement for OpenAI, and expected things to be much simpler (OpenAI just gives you an API key and a code snippet that works). In the Google world I spent 3 days resolving this simple issue. – Extender Jun 15 '23 at 04:37
  • @SandraIsCool does this solve your issue? – Zig Mandel Jul 21 '23 at 20:26

3 Answers

1

You can only do this via the REST API (see documentation), subject to the following requirement:

By default, the maximum token lifetime is 1 hour (3,600 seconds). To extend the maximum lifetime for these tokens to 12 hours (43,200 seconds), add the service account to an organization policy that includes the constraints/iam.allowServiceAccountCredentialLifetimeExtension list constraint.

To use the REST API, send a POST request to

https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/PRIV_SA:generateAccessToken

with a body

{
  "scope": [
    "https://www.googleapis.com/auth/cloud-platform"
  ],
  "lifetime": "LIFETIME"
}

where

LIFETIME: The amount of time until the access token expires, in seconds. For example, 300s

PRIV_SA: The email address of the privilege-bearing service account for which the short-lived token is created.
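
As a rough sketch of the request above in Node.js: the helper below only assembles the URL and JSON body, it does not send anything. `buildTokenRequest` and its parameter names are made up for illustration and are not part of any Google library. The 43,200-second ceiling is taken from the quoted documentation and only applies when the organization policy constraint is in place.

```javascript
// Sketch only: builds the generateAccessToken request described above.
// buildTokenRequest is a hypothetical helper, not a Google API.
function buildTokenRequest(serviceAccountEmail, lifetimeSeconds) {
  // 3600 s is the default maximum; up to 43200 s needs the org-policy constraint.
  if (
    !Number.isInteger(lifetimeSeconds) ||
    lifetimeSeconds <= 0 ||
    lifetimeSeconds > 43200
  ) {
    throw new RangeError("lifetimeSeconds must be an integer between 1 and 43200");
  }

  // PRIV_SA in the URL is the service account email.
  const url =
    "https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/" +
    serviceAccountEmail +
    ":generateAccessToken";

  // LIFETIME is a number of seconds followed by "s", for example "300s".
  const body = {
    scope: ["https://www.googleapis.com/auth/cloud-platform"],
    lifetime: `${lifetimeSeconds}s`,
  };

  return { url, body };
}
```

The result would be sent as a POST with `Content-Type: application/json` and an `Authorization: Bearer ...` header. That header needs a token for a caller that is allowed to create tokens for the service account, so this call does not remove the need for an initial credential.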

Your current method is via gcloud CLI. According to documentation,

The Google Cloud CLI does not support setting a lifetime for the token

This means you're limited to the default time limit, which is designed to be short (the access token is referred to as a short-lived credential).

NoCommandLine
  • this is not the case. while the token expires, you can generate a new token on-demand (see my answer) – Zig Mandel Jul 09 '23 at 21:05
  • @ZigMandel - OP's issue is that the token expires after 30 minutes. According to you, the token expires and has to be regenerated which hasn't resolved OP's problem. Google docs says the only way to specify a longer time line is via the documentation I've linked to. Let me know if I'm missing something – NoCommandLine Jul 10 '23 at 01:34
  • no, the issue is that OP does not want to keep editing the code to manually reconfigure new access tokens. one idea he tried was extending the expiration time, but that's not the correct route. what OP need is to be able to programmatically generate new tokens on-demand, solving the issue of having to manually change the tokens. my answer covers that route which seems to be the only way to resolve his question. – Zig Mandel Jul 11 '23 at 02:11
1

Update: you can now use Google MakerSuite to generate a simple API key (see the step-by-step guide to generating an API key for Vertex AI in MakerSuite), but it is currently in closed beta. Then just call the Vertex AI API with &key=thatKey.

Without MakerSuite, because you are on a non-GCE server, you need to impersonate a service account.

You will need these configuration steps so the instructions in that link work:

  1. Install the Google Cloud SDK
  2. Create a service account, if you haven't already done so, and give it the necessary permissions.
  3. Obtain a service account key file.
  4. Set up authentication using "gcloud auth activate-service-account --key-file=[PATH_TO_KEY_FILE]".
  5. Impersonate the service account by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to the path of the service account key file. This step is essential for API calls to be associated with the service account.
  6. Generate (short-lived) access tokens and regenerate them as needed, since you now have the key file installed.
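
Step 6 could look roughly like the sketch below: keep the current token in memory and request a new one only when it is about to expire. `createTokenCache`, `fetchToken`, `now` and `marginMs` are hypothetical names for illustration. `fetchToken` stands for whatever function you use to obtain a token, and it is assumed to resolve to `{ token, expiresAt }`, with `expiresAt` in epoch milliseconds.

```javascript
// Sketch only: caches an access token and refreshes it shortly before expiry.
// createTokenCache and fetchToken are hypothetical names, not a Google API.
function createTokenCache(fetchToken, now = () => Date.now(), marginMs = 60000) {
  let cached = null; // { token, expiresAt } with expiresAt in epoch milliseconds

  return async function getToken() {
    // Refresh when there is no token yet, or when it expires within marginMs.
    if (cached === null || now() >= cached.expiresAt - marginMs) {
      cached = await fetchToken();
    }
    return cached.token;
  };
}
```

Every API call then awaits `getToken()` instead of using a token pasted into the code, so an expired token is replaced without manual intervention. Concurrent callers may each trigger a refresh in this simple version.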
Zig Mandel
  • The tokens are still short-lived. Impersonating a service account does not directly affect an OAuth Access Token's lifetime. The answer by @nocommandline is correct. Google Authorization will only generate tokens with a lifetime longer than 3600 seconds for a Google Cloud Organization. – John Hanley Jul 28 '23 at 21:50
  • yes I know they are short-lived, that's just step 6. this answer includes a way to generate a token on the fly (steps 1 to 5 give that ability even if outside GCP), so if it expired the code can regenerate it without human intervention. the current only way to get a token that does not expire is by using a key made in makersuite, which is on a closed beta and would make it similar to the openAI api. I updated the answer to make it explicit that its short lived. – Zig Mandel Jul 28 '23 at 22:08
  • Recommending a closed beta is not a good answer for Stack Overflow. The average developer cannot access those features. However, if you are trying to recommend an API Key, yes, that is a solution. In this case, I think my bias toward never using API Keys is a factor. Therefore +1 for a valid recommendation that I overlooked. – John Hanley Jul 28 '23 at 22:21
  • yea I know ;) note I included two ways, one being the currently possible way, and the other part of a closed beta but the closest to the way openAI does it. – Zig Mandel Jul 29 '23 at 17:49
0

I was finally able to call PaLM (bison) from Node.js with a normal service account. See the code:

import { JWT } from "google-auth-library";

const API_ENDPOINT = "us-central1-aiplatform.googleapis.com";
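// GOOGLE_KEY sits in the projects/ segment of the path, so it is expected to hold the Google Cloud project ID.
const URL = `https://${API_ENDPOINT}/v1/projects/${process.env.GOOGLE_KEY}/locations/us-central1/publishers/google/models/chat-bison@001:predict`;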

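// Despite its name, this returns an OAuth 2.0 access token (not an ID token),
// obtained with the service account key file.
const getIdToken = async () => {
    const client = new JWT({
        keyFile: "./google.json", // path to the service account key file
        scopes: ["https://www.googleapis.com/auth/cloud-platform"],
    });
    // authorize() exchanges a signed JWT for short-lived credentials.
    const idToken = await client.authorize();
    return idToken.access_token;
};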

export const getTextPalm = async (prompt, temperature) => {
    const headers = {
        Authorization: `Bearer ` + (await getIdToken()),
        "Content-Type": "application/json",
    };

    const data = {
        instances: [
            {
                context: "",
                examples: [],
                messages: [
                    {
                        author: "user",
                        content: prompt,
                    },
                ],
            },
        ],
        parameters: {
            temperature: temperature || 0.5,
            maxOutputTokens: 1024,
            topP: 0.8,
            topK: 40,
        },
    };

    const response = await fetch(URL, {
        method: "POST",
        headers,
        body: JSON.stringify(data),
    });

    if (!response.ok) {
        console.error(response.statusText);
        throw new Error("Request failed " + response.statusText);
    }

    const result = await response.json();
    return result.predictions[0].candidates[0].content;
};

I also had to add some permissions to the service account (shown in a screenshot in the original post).

Extender
  • Yes, but the thing is I don't have to generate it manually in the console every hour. – Extender Jun 15 '23 at 04:31
  • Yes, but even in my (working) code it is called return idToken.access_token;, which is VERY misleading. Also, what are scopes? What is audience? How is audience related to endpoints? I had to resolve all these questions because the Google docs are very outdated, not concise, and hard to follow. And the only thing I wanted was to call Vertex AI from another cloud. And then it kept returning 401 until I realized how on earth I should add the needed roles on the IAM screen. – Extender Jun 15 '23 at 18:53