6

I am going to encrypted several fields in existing table. Basically, the following encryption technique is going to be used:

CREATE MASTER KEY ENCRYPTION
BY PASSWORD = 'sm_long_password@'
GO

CREATE CERTIFICATE CERT_01
WITH SUBJECT = 'CERT_01'
GO

CREATE SYMMETRIC KEY SK_01
WITH ALGORITHM = AES_256 ENCRYPTION
BY CERTIFICATE CERT_01
GO


OPEN SYMMETRIC KEY SK_01 DECRYPTION
BY CERTIFICATE CERT_01

SELECT ENCRYPTBYKEY(KEY_GUID('SK_01'), 'test')

CLOSE SYMMETRIC KEY SK_01

DROP SYMMETRIC KEY SK_01
DROP CERTIFICATE CERT_01
DROP MASTER KEY

The ENCRYPTBYKEY returns varbinary with a maximum size of 8,000 bytes. Knowing the table fields going to be encrypted (for example: nvarchar(128), varchar(31), bigint) how can I define the new varbinary types length?

gotqn
  • 42,737
  • 46
  • 157
  • 243

2 Answers2

6

You can see the full specification here

So lets calculate:

  • 16 byte key UID
  • _4 bytes header
  • 16 byte IV (for AES, a 16 byte block cipher)

Plus then the size of the encrypted message:

  • _4 byte magic number
  • _2 bytes integrity bytes length
  • _0 bytes integrity bytes (warning: may be wrongly placed in the table)
  • _2 bytes (plaintext) message length
  • _m bytes (plaintext) message
  • CBC padding bytes

The CBC padding bytes should be calculated the following way:

16 - ((m + 4 + 2 + 2) % 16)

as padding is always applied. This will result in a number of padding bytes in the range 1..16. A sneaky shortcut is to just add 16 bytes to the total, but this may mean that you're specifying up to 15 bytes that are never used.


We can shorten this to 36 + 8 + m + 16 - ((m + 8) % 16) or 60 + m - ((m + 8) % 16. Or if you use the little trick specified above and you don't care about the wasted bytes: 76 + m where m is the message input.


Notes:

  • beware that the first byte in the header contains the version number of the scheme; this answer does not and cannot specify how many bytes will be added or removed if a different internal message format or encryption scheme is used;
  • using integrity bytes is highly recommended in case you want to protect your DB fields against change (keeping the amount of money in an account confidential is less important than making sure the amount cannot be changed).
  • The example on the page assumes single byte encoding for text characters.
Maarten Bodewes
  • 90,524
  • 13
  • 150
  • 263
  • Please do verify that this is correct before continuing, I'm a crypto expert, not a DB expert. – Maarten Bodewes Mar 19 '18 at 13:29
  • I am just wondering in this part: `using integrity bytes is highly recommended` - could you give me a link where I can find more details about this. – gotqn Mar 20 '18 at 12:46
  • 1
    It's kind of generic: if you protect your data against confidentiality with a symmetric key then you might as well protect it against change (but be warned that swapping it might still be possible). See for instance the huge inroads of *authenticated* cipher modes such as GCM, for instance in TLS 1.3 (the latest draft SSL). If it is **required** depends on the use case and threat model. You don't want somebody to change the value of a bank account, but you cannot *steal* passwords even if you can change them. – Maarten Bodewes Mar 20 '18 at 13:49
  • So did the calculations check out? I presume they did with the accept? – Maarten Bodewes Mar 20 '18 at 13:50
  • 1
    Yes, I have performed test with ASCII and Unicode data. Everything is correct. I was not able to encrypt numbers or dates, but this is limitation of the T-SQL functions. – gotqn Mar 20 '18 at 14:04
3

Based upon some tests in SQL Server 2008, the following formula seems to work. Note that @ClearText is VARCHAR():

52 + (16 * ( ((LEN(@ClearText) + 8)/ 16) ) )

This is roughly compatible with the answer by Maarten Bodewes, except that my tests showed the DATALENGTH(myBinary) to always be of the form 52 + (z * 16), where z is an integer.

LEN(myVarCharString)   DATALENGTH(encryptedString)
--------------------   -----------------------------------------
 0 through  7          usually 52, but occasionally 68 or 84 
 8 through 23          usually 68, but occasionally 84
24 through 39          usually 84
40 through 50          100

The "myVarCharString" was a table column defined as VARCHAR(50). The table contained 150,000 records. The mention of "occasionally" is an instance of about 1 out of 10,000 records that would get bumped into a higher bucket; very strange. For LEN() of 24 and higher, there were not enough records to get the weird anomaly.

Here is some Perl code that takes a proposed length for "myVarCharString" as input from the terminal and produces an expected size for the EncryptByKey() result. The function "int()" is equivalent to "Math.floor()".

while($len = <>) { 
  print 52 + ( 16 * int( ($len+8) / 16 ) ),"\n";
}

You might want to use this formula to calculate a size, then add 16 to allow for the anomaly.

IAM_AL_X
  • 1,221
  • 11
  • 12