I'm building an app that requires HIPAA compliance, which, to cut to the chase, means that I can't allow for certain connections to be freely viewable in the database (patients and recommendations for them).

These tables are connected through the patients_recommendations table in my app, which worked well until I added the encryption via attr_encrypted. In an effort to cut down on the amount of encrypting and decrypting (and associated overhead), I'd like to be able to simply encrypt the patient_id in the patients_recommendations table. However, upon changing the data type to string and the column name to encrypted_patient_id, the app breaks with the following error when I try to reseed my database:

can't write unknown attribute `patient_id'

I assume this is because the join is looking for the column directly and not by going through the model (makes sense, using the model is probably slower). Is there any way that I can make Rails go through the model (where attr_encrypted has added the necessary helper methods)?

Update:

In an effort to find a work-around, I've tried adding a before_save to the model like so:

before_save :encrypt_patient_id

...

private

def encrypt_patient_id
  self.encrypted_patient_id = PatientRecommendation.encrypt(:patient_id, self.patient_id)
  self.patient_id = nil
end

This doesn't work either, however, resulting in the same unknown attribute error. Either solution would work for me (though the first would address the primary problem). Any ideas why the before_save isn't being called when the record is created through an association?

Jonathan Bender
  • Would one-way hash work for you instead of the 2-way encryption? You could use `SHA512(secret_key + patient_id)` as foreign key and things are relatively simple that way. Big chars make expensive indexes though – bbozo Jan 23 '14 at 07:11
  • 2-way encryptions usually don't stand up to cryptographic scrutiny if they have a fixed IV, and it seems at first glance that only fixed IV encryption is suitable to use as a foreign key, hashes work though.. Can you link parts of the HIPAA specification that mandate this for you? – bbozo Jan 23 '14 at 08:28
  • Under HIPAA, you need to secure Personal Health Information (PHI) and Personally Identifiable Information (PII) in transit and at rest. Since my app will allow for the reuse and editing of Recommendations, encrypting all the data in those fields (so that you can't tell all the recommendations being made for a person) seems like more overhead than just encrypting/decrypting the joining table IDs. The idea is that Rails will decrypt the key then do the joins, so indexing on that key won't be necessary. – Jonathan Bender Jan 23 '14 at 15:32
  • While a hash might be 'more secure' it isn't really applicable in this case because I won't have the patient_id available to reconstruct the hash for comparison, leaving 2-way encryptions my only option unless I'm missing something. – Jonathan Bender Jan 23 '14 at 15:33

2 Answers

You should probably store the PII data and the PHI data in separate DBs. Encrypt the PII data (including any associations to a provider or provider location) and then hash out all of the PHI data to make it easier. As long as there are not direct associations between the two, it would be acceptable to not have the PHI data encrypted as it's anonymized.

Tim Barnes
  • This is probably the best solution for now (while not achieving the initial goal of keeping the app in one database). – Jonathan Bender Jan 29 '14 at 17:44
  • can you please explain more? why cant you do the same with different tables in the same db? what do you mean hash out? thanks a lot – dowi Nov 08 '16 at 15:52

Plan A

Don't set patient_id to nil in encrypt_patient_id: that column no longer exists, so removing the assignment might make the problem go away.

Also, in Rails 4 and earlier a callback that returns false halts the callback chain (a nil return does not), so put an explicit true at the end of the method to be safe.

Plan B, rethink your options

There are more options - from database-level transparent encryption (which formally encrypts the data on disk), to encrypted filesystems for storing certain tablespaces, to flat out encryption of data in the columns.

Encrypting the join columns sounds like a road to unhappiness for a variety of reasons, ranging from reporting issues to performance issues when joining is necessary, which might be pretty severe.

The trouble you're currently experiencing with the seed looks like the first bump on what promises to be a bad road (in this case ActiveRecord seems to be confused about how to handle your association: it tries to set patient_id on initialize and breaks).

The overhead of encrypting restricted data might not be as high as you think. I'm not sure how things go for HIPAA, but under PCI you're not exactly encouraged to render the protected data on screen, so encryption incurs only a small overhead because it happens relatively rarely (business need-to-know, etc.).

Also, memory is probably considered to be 'not at rest and not in transit', so you could in theory cache some of the clear values for limited periods of time and thus save on the decryption overhead.

Basically, encrypting data might not be that bad, and encrypting keys in the database might be worse than you think.
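As a rough illustration of the caching idea, here is a minimal sketch of a time-limited in-memory cache in plain Ruby. The class name TtlCache, its fetch method, and the injectable clock are all hypothetical names invented for this example, and whether holding clear values in memory like this is acceptable is a compliance question the sketch does not answer.

```ruby
# Minimal time-limited cache (hypothetical helper, not part of Rails or attr_encrypted).
# The clock is injectable so that expiry can be exercised without sleeping.
class TtlCache
  def initialize(ttl_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {} # key => [value, expires_at]
  end

  # Returns the cached value if it has not expired; otherwise runs the block
  # (standing in for the decryption call) and stores its result.
  def fetch(key)
    value, expires_at = @entries[key]
    return value if expires_at && @clock.call < expires_at

    value = yield
    @entries[key] = [value, @clock.call + @ttl]
    value
  end
end

# Usage: the block is a placeholder for a real decryption call.
cache = TtlCache.new(60)
title = cache.fetch([:recommendation_title, 42]) { "decrypted title" }
```

A second fetch with the same key inside the 60 second window returns the stored value without running the block again, which is where the saving on decryption would come from.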

I suggest we talk directly, I'm doing PCI DSS compliance stuff and this topic interests me.

Option: 1-way hashes for primary/foreign keys

PatientRecommendation would store a hash of patient_id, call it patient_hash, and Patient would be capable of generating the same patient_hash from its id. I'd suggest storing patient_hash in both tables, though: for Patient it would be the primary key for the join, and for PatientRecommendation the foreign key.

That way you define the Rails relation in these terms, and Rails will no longer be confused about your relation scheme:

has_many :patient_recommendations, primary_key: :patient_hash, foreign_key: :patient_hash

The result is cryptographically more robust and easy for the database to handle.

If you're adamant about not storing the patient_hash in Patient, you could use a plain SQL statement to do the relation. It's less convenient but workable, something along the lines of this pseudo-SQL:
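To make the shared patient_hash idea concrete, here is a small plain-Ruby sketch that uses in-memory structs in place of the real tables. It swaps in HMAC-SHA512 from Ruby's OpenSSL library as the keyed hash rather than a plain SHA512 over a concatenated string; SECRET_KEY, the struct definitions, and the helper methods are all hypothetical and only serve the illustration.

```ruby
require "openssl"

# Hypothetical secret; a real app would load this from configuration, not source code.
SECRET_KEY = "example-secret-key"

# Keyed one-way hash of a patient id (HMAC-SHA512, hex-encoded, 128 characters).
def patient_hash(patient_id, key = SECRET_KEY)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA512"), key, patient_id.to_s)
end

# In-memory stand-ins for the two tables; both kinds of row carry patient_hash.
Patient = Struct.new(:id, :name, :patient_hash)
PatientRecommendation = Struct.new(:patient_hash, :recommendation_id)

patients = [1, 2].map { |id| Patient.new(id, "patient-#{id}", patient_hash(id)) }
joins = [
  PatientRecommendation.new(patient_hash(1), 10),
  PatientRecommendation.new(patient_hash(2), 10),
  PatientRecommendation.new(patient_hash(1), 11)
]

# Patient -> recommendation ids: match on the shared hash column.
def recommendations_for(patient, joins)
  joins.select { |j| j.patient_hash == patient.patient_hash }.map(&:recommendation_id)
end

# Recommendation id -> patients: possible because Patient also stores patient_hash.
def patients_for(recommendation_id, joins, patients)
  hashes = joins.select { |j| j.recommendation_id == recommendation_id }.map(&:patient_hash)
  patients.select { |p| hashes.include?(p.patient_hash) }
end
```

Because the hash is stored on both sides, the lookup works in either direction in this sketch without ever reversing the hash. The join table itself never holds a raw patient id.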

JOIN ON generate_hash(patient.id) = patient_recommendations.patient_hash

Oracle, for example, has an option to make functional indexes (think create index generate_hash(patient.id)) so this approach could be pretty efficient depending on your choice of database.

However, playing with join keys will complicate your life a lot, even with these measures.


I'll expand on this post later on with additional options

bbozo
  • Removing the `self.patient_id = nil` line doesn't change the output, though neither does adding a print or logger statement, leading me to believe that for whatever reason the `before_save` isn't being called at all. – Jonathan Bender Jan 23 '14 at 15:20
  • @JonathanBender, one more note, `self.patient_id = nil` is probably blocking further `before_save` hooks so put an explicit `true` at the end of `encrypt_patient_id`. Can you show how the update/save/create is being invoked? – bbozo Jan 23 '14 at 15:30
  • Still a no-go, the line in question (in the seeds) is in a block randomly assigning recommendations to patients: `p.patient_recommendations.create!(recommendation_id: rand.rand(1..rec_count))` – Jonathan Bender Jan 23 '14 at 15:37
  • Something else is blocking your save, my bet is on `p.patient_recommendations.create!` is forcing the setting of `patient_id`, try to explicitly do `PatientRecommendation.create!` – bbozo Jan 23 '14 at 16:39
  • The issue is this: while it could constitute PII, especially when used in aggregate, the recommendation itself needs to be freely editable, and will be available in an open source-style manner for community contribution and discussion. It is only the assignment that needs to be protected (and at least the title of the recommendation will be required whenever the join is made anyway, so there will be at least one encryption/decryption every time). – Jonathan Bender Jan 23 '14 at 19:58
  • With regards to hashing it, the problem becomes that the association is one-way, e.g. you'd be able to get the recommendations for a patient, but not all the patients that were given a recommendation. This can become an issue if a particular recommendation is 'bad' and all patients given it need to be notified. – Jonathan Bender Jan 23 '14 at 19:58