5

Let's say I have an entire column in a table that is encrypted, the table also has unencrypted columns like IDs, and I have the encryption key for the entire column and I used the DBMS' encrypt() function with AES to store it.

I'm wondering if there is anyway to execute something like

SELECT * FROM table1 WHERE decrypt(col1, 'fooz', 'aes') = 'aValue'

I've already tried that in PostgreSQL and the above syntax is not supported. If there is no way to do this, what are the workarounds?

I've looked into decrypting into a temporary table and then execute the query and drop it but that seems extremely inefficient and also unsafe because there's a chance the decrypted table can remain on disk

Johan
  • 74,508
  • 24
  • 191
  • 319
  • Answering my own question: SELECT * FROM table1 WHERE decrypt(col1::bytea,'fooz','aes') = 'aValue'; Decrypt requires explicit type casting. –  Oct 13 '11 at 11:48
  • Please note I would like to be able to perform general query within encrypted data columns. Thus I've suggested using decrypt(). Using col1=encrypt() is too limited. I maybe querying WHERE col1 > 10 for example. –  Oct 14 '11 at 15:27

1 Answers1

2

Pseudo code

SELECT * FROM table1 WHERE col1 = encrypt('avalue','fooz','aes');

Or more specifically:

Real code

SELECT * FROM table1
WHERE col1 = pgp_sym_encrypt('avalue', 'apasswordwithsomeentropy'
                            ,'compress-algo=1, cipher-algo=aes256');

http://www.postgresql.org/docs/8.3/static/pgcrypto.html

Johan
  • 74,508
  • 24
  • 191
  • 319
  • Could you make use of a indexed function so your searches aren't slow? Or would that open the data your trying to hide in the index? – Kuberchaun Oct 13 '11 at 16:23
  • @StarShip3000, you can, but it does not really make sense, because that will make inserts really slow, because postgreSQL will have to do the encryption twice on the insert (once for the insert and once for the index) and you will save 0 time on the select. – Johan Oct 13 '11 at 18:33
  • I'm confused at why you wouldn't save time on the select if you are able to use an index on decrypt(col1, 'fooz', 'aes') = 'aValue'. Of course the index will make insert and updates slower, but often that is a price your willing to pay for the benefit of getting your data out faster. – Kuberchaun Oct 13 '11 at 19:56
  • @StarShip3000, For 1 row saving decrypt(col1) in an index will save time. However for 10,000 rows the above code will only have to do the encrypt once, so it's a O(1) cost that you're saving. Therefor as the number of rows under consideration increases your savings go down. Furthermore and more importantly. If you do `select * from user where username = ? and password = encrypt....` the unique index on username will make an index on password unneeded. So the time that you think you are saving is not being spend. – Johan Oct 14 '11 at 06:49
  • However **if** you are doing `select decrypt(col1) from t1` and you are above to use a covering index,so that the decrypt function is never actually called, but the covering index is used instead, **then** you have a point. If this is possible you have a security hole though because this assumes that you use the same password everywhere. You should never do this and always salt the password with a different salt per row. and use salt+password as the encryption key. Failing to do so opens you up to attack with a rainbow table. Maybe if you include the salt in the index. – Johan Oct 14 '11 at 06:54
  • However the idea of storing unencrypted data in the index, is just scary. And I would never allow it in production code. – Johan Oct 14 '11 at 06:55
  • Right that is what I figured which makes all the details of the idea of indexing the secured data pointless if in the end the index is holding it unencrypted, though couldn't you make sure the directory the index data is stored is locked down to mitigate any risk? Also Postgre can't currently do covered indexes or are you talking about 9.2dev? – Kuberchaun Oct 14 '11 at 07:01
  • 1
    @StarShip3000, eagerly awaiting 9.2. IMO if you encrypt you encrypt, you don't have some unencrypted data somewhere in a backoffice. I think that would get me in all sort of legal and ethical trouble. Imagine explaining to a customer how their encrypted data leaked, because you decided to store unencrypted data in an index. – Johan Oct 14 '11 at 07:13
  • Just so I can get this straight... the way you've done it is to encrypt the value that we're searching for and search the encrypted value? However, what if I was using LIKE '%avalue%'? Then is my method the only way or is there a more efficient way? –  Oct 14 '11 at 15:25
  • @RonaldChan, You cannot do `like` on a encrypted value. You'll have to use `select * from table1 where pgp_sym_decrypt(col1, 'password','compress-algo=1, cipher-algo=aes256') like '%test%'` Be ready for uber slowness on this query though. If you have more than a handful of rows this query could run for hours, days or months. – Johan Oct 14 '11 at 15:39
  • @Johan I understand thank you. I guess this is where MapReduce will come in handy. –  Oct 15 '11 at 15:29
  • Is it possible to encrypt the email and put it in one column, and also hash the email and put the hash in another indexed column. @StarShip3000 They you can retrieve the row very quickly by searching for the hashed email, like you would for a hashed password. Once you have the row, you can decrypt the email address. I think this might also be a quick way to determine if an email address is in use already. – Bill Rosmus Oct 24 '14 at 07:29