I'm processing over 200,000 NetCDF files, each about 17 MB. They are all in a Google Cloud Storage bucket, and I am trying to find a way to increase throughput when reading them through gcsfuse.
I am using a Google Compute Engine virtual machine and gcsfuse to access the files. I looked into gsutil, but the Google Cloud documentation says that "individual I/O streams run approximately as fast as gsutil." With gcsfuse, the NCL script will take over 8 days, which is too long. Any suggestions on how to improve the throughput? Thank you.
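For scale, here is a rough back-of-the-envelope estimate of what those numbers imply. It is only a sketch: it assumes exactly 200,000 files of 17 MB each and a runtime of exactly 8 days (the real figures are "over" both), uses decimal megabytes, and the variable names are just for illustration. The runtime also includes whatever time the NCL script spends computing, so this is not a pure I/O measurement.

```python
# Rounded figures from the question; decimal units assumed (1 TB = 1,000,000 MB).
NUM_FILES = 200_000
FILE_SIZE_MB = 17
RUNTIME_DAYS = 8  # the question says "over 8 days", so real speeds are somewhat lower

total_mb = NUM_FILES * FILE_SIZE_MB          # total data volume in MB
runtime_s = RUNTIME_DAYS * 24 * 60 * 60      # total runtime in seconds

total_tb = total_mb / 1_000_000
throughput_mb_s = total_mb / runtime_s       # average end-to-end rate
seconds_per_file = runtime_s / NUM_FILES     # average time spent per file

print(f"total data: {total_tb:.1f} TB")
print(f"implied throughput: {throughput_mb_s:.2f} MB/s")
print(f"time per file: {seconds_per_file:.2f} s")
```

Under those assumptions that is roughly 3.4 TB in total, an average of about 4.9 MB/s, or about 3.5 seconds per 17 MB file.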