I badly need your help :D I wrote some Python code that uses PGP. I have a trusted public key, and the code encrypts my message perfectly on my own machine. But when I run it on Databricks I get this error: "gnupghome should be a directory and it isn't". How can I access a directory in Databricks?

import gnupg
from pprint import pprint
import os

gpg = gnupg.GPG(gnupghome='/root/.pnugp')
with open("/dbfs/mnt/xxxx/SCO/oracle/xxx/Files/publickey.asc") as key_file:
    key_data = key_file.read()

import_result = gpg.import_keys(key_data)
pprint(import_result.results)
with open("/dbfs/mnt/xxxxx-storage/SCO/oracle/xxx/Files/FileToEncrypt.txt", 'rb') as f:
    status = gpg.encrypt_file(
        f, recipients=['securxxxxfertuca@xx.ca'],
        output='my-encrypted.txt.gpg')
    print('ok: ', status.ok)
    print('status: ', status.status)
    print('stderr: ', status.stderr)
Alex Ott
Marian

1 Answer

I suspect this ran successfully on your local machine. It fails on Databricks because it looks for the .pnugp directory under /root, which Databricks does not let you access.
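If you would rather keep the gnupg-based code, one option is to point gnupghome at a directory that exists and is writable on the driver. Below is a minimal sketch, assuming the python-gnupg package (the one imported as gnupg in the question) and a gpg binary on the cluster. make_gnupg_home and open_gpg are hypothetical helper names, not part of any library, and I have not verified this on a Databricks cluster.

```python
import os
import tempfile


def make_gnupg_home(path=None):
    """Create (if needed) a private directory for GnuPG and return its path."""
    if path is None:
        # mkdtemp creates a fresh directory readable only by the current user.
        path = tempfile.mkdtemp(prefix="gnupg-home-")
    else:
        os.makedirs(path, mode=0o700, exist_ok=True)
    return path


def open_gpg(home):
    """Return a gnupg.GPG object bound to `home` (assumes python-gnupg is installed)."""
    import gnupg  # imported lazily so the directory helper works without it
    return gnupg.GPG(gnupghome=home)


gnupg_home = make_gnupg_home()
print("gnupghome:", gnupg_home)
```

With the directory in place, gpg = open_gpg(gnupg_home) would replace the gnupg.GPG(gnupghome='/root/.pnugp') line from the question. A temporary directory disappears with the cluster, so the public key has to be imported again on each run.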

I use the snippet below, which does not need access to any directory other than the ones holding the files you plan to encrypt and the keys. In the code, my public key is stored in the key vault as a base64-encoded secret named 'publicb64'. If you want to read the .asc version from somewhere instead, you can read it straight into KEY_PUB. Don't forget to install pgpy with pip install pgpy.

# Encrypting a file using a public key
import pgpy
from timeit import default_timer as timer
import base64
import io

# publicb64 is assumed to already hold the base64-encoded public key
# read from the key vault secret named 'publicb64'.
KEY_PUB = base64.b64decode(publicb64).decode("ascii").lstrip()
# print(KEY_PUB)

pub_key = pgpy.PGPKey()
pub_key.parse(KEY_PUB)

# Read the file from the mount point.
# io.open with newline='' retains the original CRLF line endings.
with io.open('/dbfs/mnt/sample_data/california_housing_test.csv', "r", newline='') as csv_file:
    input_data = csv_file.read()

t0 = timer()
# PGP encryption start
msg = pgpy.PGPMessage.new(input_data)
# encrypt() returns a new PGPMessage that contains an encrypted form of the original message
encrypted_message = pub_key.encrypt(msg)
pgpstr = str(encrypted_message)  # ASCII-armored text
with open('/dbfs/mnt/sample_data/california_housing_test.csv.pgp', "w") as text_file:
    text_file.write(pgpstr)
print("Encryption Complete :" + str(timer() - t0))
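If the key lives in an .asc file rather than in a secret, KEY_PUB can be filled from the file directly. The sketch below shows both routes side by side using only the standard library. read_armored_key and decode_b64_key are hypothetical helper names, and the key text is a placeholder, not a real key.

```python
import base64
import os
import tempfile


def read_armored_key(path):
    """Return the ASCII-armored key text stored in the file at `path`."""
    with open(path, "r", encoding="ascii") as key_file:
        return key_file.read().lstrip()


def decode_b64_key(secret_value):
    """Decode a base64-wrapped armored key, mirroring the KEY_PUB line above."""
    return base64.b64decode(secret_value).decode("ascii").lstrip()


# Placeholder text standing in for a real armored public key.
placeholder = (
    "-----BEGIN PGP PUBLIC KEY BLOCK-----\n"
    "placeholder-not-a-real-key\n"
    "-----END PGP PUBLIC KEY BLOCK-----\n"
)

# Route 1: the key is stored as a file (a temp file stands in for the .asc).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "publickey.asc")
    with open(demo_path, "w", encoding="ascii") as demo_file:
        demo_file.write(placeholder)
    key_from_file = read_armored_key(demo_path)

# Route 2: the key is stored base64-encoded, as in the 'publicb64' secret.
key_from_secret = decode_b64_key(base64.b64encode(placeholder.encode("ascii")))

print(key_from_file == key_from_secret)
```

Either string can then be passed to pub_key.parse(...) in place of KEY_PUB.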
Anupam Chand