I'm trying to decrypt a KMS-encrypted file and am running into the following error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 3: invalid start byte

I'm using the sample decrypt code.

I'm able to decrypt the file using the command line.

The exception is being thrown from here:

cipher_text.decode('utf-8')

Code: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/kms/api-client/snippets.py

Please let me know if I'm missing something here.

Fayaz Ahmed

Figured out that a file encrypted using the command line utility cannot be decrypted using the Python decrypt API (not sure if it's the same with the other language APIs as well). So to decrypt a file via the Python API, the encryption has to be done via the Python encrypt API as well. Not sure if my understanding is right, but I got it to work using the above method. – Fayaz Ahmed Aug 15 '17 at 19:35

2 Answers

When you use the Python library, all inputs must be base64-encoded, and the outputs will be base64-encoded as well. In the encrypt function in snippets.py, you can see that the code is base64-encoding the plaintext before passing it to the KMS encrypt API.

encoded_text = base64.b64encode(plaintext)

When you use the gcloud kms encrypt command, you do not have to base64-encode the plaintext yourself, and the ciphertext it writes is not base64-encoded.

So, when you pass the ciphertext from gcloud kms encrypt to the Python library to decrypt, you must base64-encode it first. Change the decrypt function in snippets.py to base64-encode the file data before sending it on.

# Read cipher text from the input file.
with io.open(encrypted_file_name, 'rb') as encrypted_file:
    ciphertext = encrypted_file.read()
encoded_text = base64.b64encode(ciphertext)

# Use the KMS API to decrypt the text.
cryptokeys = kms_client.projects().locations().keyRings().cryptoKeys()
request = cryptokeys.decrypt(
    name=name, body={'ciphertext': encoded_text.decode('utf-8')})
response = request.execute()

You can think of the base64-encoding as being a transport-layer implementation detail: it's only necessary so that arbitrary binary data can be sent in JSON, which only accepts Unicode strings. So, the Cloud KMS API requires this data to be base64-encoded, and must base64-encode the output as well. But the gcloud command does this work for you, so you don't have to do it.
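A small standard-library sketch of that point. The sample bytes are made up for illustration and are not real KMS output. Raw binary cannot be decoded as UTF-8, which is the error in the question, while the base64 form passes through JSON intact.

```python
import base64
import json

# Made-up bytes standing in for binary ciphertext (not real KMS output).
ciphertext = b"\x0a\x24\x00\x80\xff\x10"

# Decoding raw binary as UTF-8 fails, as in the question.
try:
    ciphertext.decode("utf-8")
    raw_decode_ok = True
except UnicodeDecodeError:
    raw_decode_ok = False

# Base64 output is plain ASCII, so it fits in a JSON string.
encoded = base64.b64encode(ciphertext).decode("ascii")
body = json.dumps({"ciphertext": encoded})

# The receiver reverses the steps and recovers the exact bytes.
recovered = base64.b64decode(json.loads(body)["ciphertext"])
```

Here raw_decode_ok ends up False and recovered equals the original bytes.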

I think the Python sample code is misleading. It should always base64-encode inputs to the API and base64-decode outputs, instead of only doing it sometimes. I'll look at updating the Python sample code shortly, and double check the sample code for the other languages.
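One possible shape for "always encode inputs, always decode outputs", as a rough sketch rather than the actual sample code. The helper name decrypt_bytes is hypothetical, cryptokeys is assumed to be the resource built in the snippet above, and the response is assumed to carry a base64-encoded 'plaintext' field.

```python
import base64


def decrypt_bytes(cryptokeys, name, ciphertext):
    """Decrypt raw ciphertext bytes and return raw plaintext bytes.

    Hypothetical helper: `cryptokeys` is assumed to be the resource from
    kms_client.projects().locations().keyRings().cryptoKeys().
    """
    # Always base64-encode the binary input for the JSON request body.
    body = {"ciphertext": base64.b64encode(ciphertext).decode("ascii")}
    response = cryptokeys.decrypt(name=name, body=body).execute()
    # Always base64-decode the output field (assumed to be 'plaintext').
    return base64.b64decode(response["plaintext"])
```

With a wrapper like this, callers only ever handle raw bytes, whether the file was produced by gcloud or by the Python API.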

Russ Amos

Given the date of the question, the accepted answer should be @Russ's (also, thank you for updating the repository). Since the documentation has changed a little since then, here is a function that handles an already encrypted JSON file.

The file was encrypted using the gcloud command line:

gcloud kms encrypt \
  --plaintext-file=[SECRETS.json] \
  --ciphertext-file=[ENCRYPTED-SECRETS.json.enc] \
  --location=[REGION] \
  --keyring=[KEYRING-NAME] \
  --key=[KEY-NAME]

Here is the function for decrypting said file (cipher_file being the path to [ENCRYPTED-SECRETS.json.enc]):

import io
import json

from google.cloud import kms_v1


def decrypt(cipher_file):
    project_id = "project"
    location_id = "region"
    key_ring_id = "key-ring"
    crypto_key_id = "key"

    # Creates an API client for the KMS API.
    client = kms_v1.KeyManagementServiceClient()

    # The resource name of the CryptoKey.
    name = client.crypto_key_path_path(project_id, location_id, key_ring_id,
                                       crypto_key_id)

    # Use the KMS API to decrypt the data.
    with io.open(cipher_file, "rb") as file:
        c_text = file.read()

    response = client.decrypt(name, c_text)

    secret_dict = json.loads(response.plaintext.decode("utf-8"))

    return secret_dict
GregK