
I am trying to process a CSV file that contains information for more than 20000 patients. There are 50 columns in total, and each patient has multiple rows because the data is hourly. Most of the columns map to the Observation resource type, e.g. heart rate, temperature, blood pressure.

I have successfully transformed the data into FHIR format. However, when I try to push the data to the FHIR server, it throws an error saying that a maximum of 500 entries is allowed per bundle.
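(For reference, each bundle entry looks roughly like this; the values, code, and patient reference are illustrative.)

entry = {
    "resource": {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": "http://loinc.org",
                             "code": "8867-4", "display": "Heart rate"}]},
        "subject": {"reference": "Patient/example-id"},   # illustrative reference
        "effectiveDateTime": "2020-04-01T00:00:00Z",      # one hourly sample
        "valueQuantity": {"value": 72, "unit": "beats/minute"},
    },
    # Tells the server what to do with this entry in a batch/transaction bundle.
    "request": {"method": "POST", "url": "Observation"},
}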

Even if I batch up to 500 entries per bundle before pushing the JSON, it takes quite a long time to cover 20000 * 50 values. Is there an efficient way of bulk-inserting the data into the Azure FHIR server?

Currently, I am using the following code, but it looks like it will take a lot of time and resources, as there are around 0.7 million rows in my CSV file.

def export_template(self, template):
    # Buffer entries until the server's 500-entries-per-bundle cap is near.
    if self.export_max_500 is None:
        self.export_max_500 = template
    else:
        # Append the new template's entries to the buffered bundle.
        self.export_max_500["entry"] = self.export_max_500["entry"] + template["entry"]
        if len(self.export_max_500["entry"]) > 500:
            # Reuse the incoming dict as the outgoing bundle: send the first
            # 495 entries and keep the remainder buffered for the next call.
            template["entry"] = self.export_max_500["entry"][:495]
            self.export_max_500["entry"] = self.export_max_500["entry"][495:]
            self.send_to_server(template)
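(Note that entries still in the buffer after the last call are never sent, so a final flush is needed; a minimal sketch, where flush_remaining is a hypothetical helper name and send_to_server is assumed to accept the same bundle dict:)

def flush_remaining(self):
    # Send whatever is still buffered (fewer than 500 entries).
    if self.export_max_500 and self.export_max_500.get("entry"):
        self.send_to_server(self.export_max_500)
        self.export_max_500 = None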

srinath

1 Answer


The most efficient way is not to send multiple (batch) bundles; it is actually to do many individual requests running in parallel. Your problem is that you are sending the bundles sequentially and taking a huge hit on round-trip time. You can take a look at something like this loader, which parallelizes the requests: https://github.com/hansenms/FhirLoader. You will also want to increase the RUs on your service to make sure you have enough throughput to get the data in.
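A minimal sketch of the parallel approach in Python, assuming the rows have already been converted to individual resource dicts and you have an OAuth bearer token; the base URL and worker count are placeholders to adapt:

import concurrent.futures
import requests

FHIR_URL = "https://your-service.azurehealthcareapis.com"  # placeholder base URL

def post_resource(resource, token):
    # POST one resource to its type-level endpoint, e.g. /Observation.
    resp = requests.post(
        f"{FHIR_URL}/{resource['resourceType']}",
        json=resource,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/fhir+json",
        },
    )
    return resp.status_code

def load_parallel(resources, token, workers=32):
    # Overlap the round trips with a thread pool instead of paying
    # the full latency of each request in sequence.
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: post_resource(r, token), resources))

Since the requests are I/O-bound, throughput scales with the worker count until the service starts throttling, so tune workers against any 429 responses you get back.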

MichaelHansen
  • Thank you very much. Say we need to insert millions of rows; approximately how many RUs would make the job smoother? – srinath Apr 29 '20 at 03:38
  • I have 0.7 million records, so it would be helpful to know how many RUs would be better to have. – srinath Apr 29 '20 at 16:31
  • If you have something like 10,000 RUs, you should be able to insert about 500 resources per second, depending on the size and type of the resources. So you should be able to import 700k resources in under 30 minutes (700,000 / 500 ≈ 1,400 seconds). – MichaelHansen Apr 29 '20 at 16:42