Safely storing encrypted credentials in Django

Question

I'm working on a Python/Django app which, among other things, syncs data to a variety of other services, including Samba shares, SSH (scp) servers, Google Apps, and others. As such, it needs to store the credentials to access these services. Storing them as unencrypted fields would be, I presume, a Bad Idea, as an SQL injection attack could retrieve the credentials. So I would need to encrypt the creds before storage. Are there any reliable libraries to achieve this?

Once the creds are encrypted, they would need to be decrypted before being usable. There are two use cases for my app:

One is interactive - in this case the user would provide the password to unlock the credentials.
The other is an automated sync - this is started by a cron job or similar. Where would I keep the password in order to minimise risk of exploits here?

Or is there a different approach to this problem I should be taking?

This is exactly my question. Very hard to find this instead of the hundreds of "How do I salt/hash passwords properly" questions....urgh. — mlissner, May 29 '15 at 01:01

score 13 · Answer 1 · edited Jun 20 '20 at 09:12

I have the same problem and have been researching this for the past few days. The solution presented by @Rostislav is pretty good, but it's incomplete and a bit outdated.

On the Algorithm Layer

First, there's a new library for cryptography called, appropriately enough, Cryptography. There are a good number of reasons to use this library instead of PyCrypto, but the main ones that attracted me are:

A core goal is for you to be unable to shoot yourself in the foot. For example, it doesn't have severely outdated hash algos like MD2.
It has strong institutional support
500,000 tests with continuous integration on various platforms!
Their documentation website has a better SSL configuration (near-perfect A+ score instead of a mediocre B rating)
They have a disclosure policy for vulnerabilities.

You can read more about the reasons for creating the new library on LWN.

Second, the other answer recommends using a SHA1 hash as the encryption key. SHA1 is dangerously weak and getting weaker. The replacement for SHA1 is SHA2, and on top of that, you should really be salting your hash and stretching it using either bcrypt or PBKDF2. Salting is important as a protection against rainbow tables, and stretching is an important protection against brute forcing.

(Bcrypt and PBKDF2 are both designed to be slow. PBKDF2 is the one recommended by NIST, and it is what I use in my implementation. Note that the algorithm designed to use lots of memory is scrypt rather than bcrypt. If you want more on the differences, read this.)
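
To make salting and stretching concrete, here is a minimal sketch that uses only the standard library's hashlib.pbkdf2_hmac instead of the Cryptography library used later in this answer. The helper name derive_key and the sample password are made up for illustration, and the 16-byte salt and 100,000 iterations are assumptions rather than recommendations.

```python
import hashlib
import os


def derive_key(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Stretch a password into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


password = b"hunter2"    # illustrative password only
salt_a = os.urandom(16)  # a fresh random salt per stored secret
salt_b = os.urandom(16)

key_a = derive_key(password, salt_a)
key_b = derive_key(password, salt_b)

# Same password, different salts: the derived keys differ, so a table
# precomputed for one salt is useless against another.
print(key_a != key_b)
# Same password and same salt: the key is reproducible, which is what
# lets you re-derive it later from the stored salt.
print(derive_key(password, salt_a) == key_a)
```

The salt does not need to be secret. It only needs to be random and stored next to the data it protects.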

For encryption, AES in CBC mode with a 128-bit key should be used, as mentioned above. That hasn't changed, although it's now rolled up into a spec called Fernet, which also authenticates the ciphertext with an HMAC. The initialization vector will be generated for you automatically in this library, so you can safely forget about that.

On the Key Generation and Storage Layer

The other answers are quite right to suggest that you need to carefully consider key handling and opt for something like OAuth, if you can. But assuming that's not possible (it isn't in my implementation), you have two use cases: Cron jobs and Interactive.

The cron job use case boils down to the fact that you need to keep a key somewhere safe and use it to run cron jobs. I haven't studied this, so I won't opine here. I think there are a lot of good ways to do this, but I don't know the easiest way.

For the Interactive use case, what you need to do is collect a user's password, use that to generate a key, and then use that key to decrypt the stored credentials.

Bringing it home

Here's how I would do all of the above, using the Cryptography library:

import base64
import os

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.hashes import SHA256
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.backends import default_backend

secret = b"Some secret"  # Fernet encrypts bytes, not str
user_pwd = b"the user's password"  # Placeholder: collect this from the user, as bytes

# Generate a salt for use in the PBKDF2 hash. Base64 makes it easy to store as text.
salt = base64.b64encode(os.urandom(16))
# Set up the key derivation function
kdf = PBKDF2HMAC(
    algorithm=SHA256(),
    length=32,
    salt=salt,  # Must be bytes
    iterations=100000,  # This stretches the hash against brute forcing
    backend=default_backend(),  # Typically this is OpenSSL
)
# Derive a 32-byte key and encode it as URL-safe base64, which Fernet requires
key = base64.urlsafe_b64encode(kdf.derive(user_pwd))

# Set up Fernet (AES in CBC mode plus an HMAC) using the derived key
f = Fernet(key)
encrypted_secret = f.encrypt(secret)

# Store the safe inputs in the DB, but do NOT include a hash of the
# user's password, as that is the key to the encryption! Only store
# the salt, the algo and the number of iterations.
# (db is a placeholder for your own storage layer.)
db.store(
    user='some-user',
    secret=encrypted_secret,
    algo='pbkdf2_sha256',
    iterations='100000',
    salt=salt
)

Decryption then looks like:

import base64  # Needed for the key encoding below

# Get the data back from your database (db is a placeholder for your storage layer)
encrypted_secret, algo, iterations, salt = db.get('some-user')

# Set up the key derivation function (PBKDF2)
kdf = PBKDF2HMAC(
    algorithm=SHA256(),
    length=32,
    salt=salt,  # The same bytes that were used at encryption time
    iterations=int(iterations),
    backend=default_backend(),
)
# Generate the key from the user's password (as bytes)
key = base64.urlsafe_b64encode(kdf.derive(user_pwd))

# Set up Fernet again, using the key
f = Fernet(key)

# Decrypt the secret! A wrong password raises cryptography.fernet.InvalidToken.
secret = f.decrypt(encrypted_secret)
print("  Your secret is: %s" % secret.decode())

Attacks?

Let's assume your DB is leaked to the Internet. What can an attacker do? Well, the key we used for encryption is the output of PBKDF2 running 100,000 iterations of HMAC-SHA256 over your user's salted password. We stored the salt, the name of the key derivation algorithm and the iteration count in your database, but not the password or the key. An attacker must therefore either:

Attempt brute force of the hash: Combine the salt with every possible password and hash it 100,000 times. Take that hash and try it as the decryption key. The attacker will have to do 100,000 hashes just to try one password. This is impractical as long as the password itself is strong; stretching slows guessing down but does not rescue a weak password.
Try every possible hash directly as the decryption key. This is basically impossible, because the key space is far too large to enumerate.
Try a rainbow table with pre-computed hashes? Nope, not when random salts are involved.
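
As a rough back-of-the-envelope check on the first attack, the sketch below estimates how long exhausting a password space would take. The guess rate is an assumed figure chosen for illustration, not a measurement, and the function name is made up.

```python
def years_to_exhaust(alphabet_size: int, length: int,
                     iterations: int, hashes_per_second: float) -> float:
    """Worst-case years to try every password of a given length."""
    candidates = alphabet_size ** length
    # Each candidate costs `iterations` hash operations under PBKDF2.
    total_hashes = candidates * iterations
    seconds = total_hashes / hashes_per_second
    return seconds / (365 * 24 * 3600)


# ASSUMPTION: an attacker who can do 10 billion hash operations per second.
RATE = 1e10

# 8 lowercase letters versus 12 mixed-case letters and digits
weak = years_to_exhaust(26, 8, 100_000, RATE)
strong = years_to_exhaust(62, 12, 100_000, RATE)
print(f"8 lowercase letters: {weak:.2f} years")
print(f"12 alphanumerics:    {strong:.3g} years")
```

With these assumed numbers, the eight-letter lowercase space is exhausted in under a month, while the twelve-character alphanumeric space takes on the order of a billion years. The iteration count multiplies the attacker's cost, but the strength of the password dominates.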

I think this is pretty much solid.

There is, however, one other thing to think about. PBKDF2 is designed to be slow. It requires a lot of CPU time. This means that you are opening yourself up to denial-of-service attacks if there's a way for users to trigger PBKDF2 hashes at will. Be prepared for this.
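
To get a feel for that CPU cost on your own hardware, you could time a single derivation. This is a minimal sketch using the standard library's hashlib.pbkdf2_hmac; the function name and the sample password are made up, and the numbers it prints will vary from machine to machine.

```python
import hashlib
import os
import time


def time_pbkdf2(iterations: int) -> float:
    """Return the seconds taken by one PBKDF2-HMAC-SHA256 derivation."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"example password", salt, iterations, dklen=32)
    return time.perf_counter() - start


timings = {n: time_pbkdf2(n) for n in (1_000, 100_000)}
for n, seconds in timings.items():
    print(f"{n:>7} iterations: {seconds * 1000:.1f} ms")
```

Multiplying the per-derivation time by the number of requests an anonymous client can send gives a rough idea of how much CPU an attacker could tie up, which is a reason to rate-limit any endpoint that triggers a derivation.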

Postscript

All of this said, I think there are libraries that will do some of this for you. Google around for things like django encrypted field. I can't make any promises about those implementations, but perhaps you'll learn something about how others have done this.

score 1 · Accepted Answer · answered Oct 16 '12 at 07:12

First, storing credentials on a server that are enough to log in to a multitude of systems looks like a nightmare. Compromised code on your server will leak them all, whatever the encryption.

You should store only the credentials that are necessary to perform your task (i.e. file sync). For servers you should consider using a synchronization server like rsync; for Google, protocols like OAuth, etc. This way, if your server is compromised, it will only leak the data, not the access to the systems.

The next thing is encrypting these credentials. For cryptography I advise you to use PyCrypto.

Generate all the random numbers you use in your cryptography with Crypto.Random (or some other strong method) to be sure they are strong enough.

You should not encrypt different credentials with the same key. The method I would recommend is this:

Your server should have its master secret M (derived from /dev/random). Store it in a file owned by root and readable by root only.
When your server starts with root privileges, it reads the file into memory and drops its privileges before serving clients. That's normal practice for web servers and other daemons.
When you write a new credential (or update an existing one), generate a random block S. Take the first half, S₁, and calculate the hash K=H(S₁,M). That will be your encryption key.
Use CBC mode to encrypt your data. Take the initialization vector (IV) from the second half, S₂.
Store S alongside the encrypted data.

When you need to decrypt, just take out S, recreate K and decrypt with the same IV.
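
The key and IV derivation in this scheme can be sketched with the standard library alone. This is an illustration under assumptions: H is instantiated here as HMAC-SHA256 keyed with M (the answer does not say how S₁ and M are combined), S is 32 bytes split into two 16-byte halves, and the function names are made up. The AES-CBC step itself is left out because it needs a third-party library.

```python
import hashlib
import hmac
import os
from typing import Tuple


def new_random_block() -> bytes:
    """Generate the per-credential random block S (two 16-byte halves)."""
    return os.urandom(32)


def derive_key_and_iv(s: bytes, master_secret: bytes) -> Tuple[bytes, bytes]:
    """Return (K, IV), where K = H(S1, M) and IV = S2."""
    s1, s2 = s[:16], s[16:]
    # ASSUMPTION: H is HMAC-SHA256 keyed with the master secret M.
    key = hmac.new(master_secret, s1, hashlib.sha256).digest()
    return key, s2


master_secret = os.urandom(32)  # stands in for M read from the root-only file
s = new_random_block()          # stored next to the ciphertext
key, iv = derive_key_and_iv(s, master_secret)

# Decryption later repeats the same derivation from the stored S.
assert derive_key_and_iv(s, master_secret) == (key, iv)
```

Because every credential gets its own S, every credential also gets its own key and IV, even though there is only one master secret.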

For the hash I would advise SHA1; for encryption, AES. Hashes and symmetric ciphers are fast enough, so going for larger key sizes wouldn't hurt.

This scheme is a bit overengineered in some places, but again, this wouldn't hurt.

But remember again, the best way to store credentials is not to store credentials, and when you have to, use the least privileged ones that will allow you to accomplish the task.

score -2 · Answer 3 · answered Oct 15 '12 at 22:10

Maybe you can rely on a multi-user scheme, by creating:

A user running Django (e.g. django) who does not have the permission to access the credentials
A user having those permissions (e.g. sync).

Both of them can be in the django group, to allow them to access the app. After that, make a script (a Django command, such as manage.py sync-external, for instance) that syncs what you want.

That way, the django user will have access to the app and the sync script, but not the credentials, because only the sync user does. If anyone tries to run that script without the credentials, it will of course result in an error.
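
One way to make the sync script enforce this separation is to refuse to read a credentials file whose permissions are too open, similar in spirit to how ssh treats private keys. This is a minimal sketch, assuming a POSIX system; the function name is made up, and the file path you pass in is up to you.

```python
import os
import stat


def read_private_file(path: str) -> bytes:
    """Read a file only if group and others have no access to it."""
    with open(path, "rb") as handle:
        # Check the mode on the open file descriptor, not on the path.
        mode = stat.S_IMODE(os.fstat(handle.fileno()).st_mode)
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            raise PermissionError(
                f"{path} has mode {mode:o}; expected no group/other access"
            )
        return handle.read()
```

With the credentials file owned by the sync user and set to mode 600, the django user cannot open it at all, and the check above catches the case where someone later loosens the permissions by mistake.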

Relying on the Linux permission model is in my opinion a "Good Idea", but I'm not a security expert, so bear that in mind. If anyone has anything to say about what's above, don't hesitate!