
I ran into an issue while using the R package "sleuth" to analyze RNA-seq data, but the problem appears to come from rhdf5 itself. It boils down to the following error, which I get only when reading a .h5 file from my external SSD, not from my local disk:

> library("rhdf5")

> H5Fopen("/pathway/to/external/SSD/abundance.h5")
#gives the following error:
Error in H5Fopen("/pathway/to/external/SSD/abundance.h5") : 
  HDF5. File accessibility. Unable to open file.

> H5Fopen("/Users/myname/Desktop/abundance.h5")
#this runs perfectly

> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 13.0

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rhdf5_2.38.1

loaded via a namespace (and not attached):
 [1] KEGGREST_1.34.0        progress_1.2.2         tidyselect_1.2.0      
 [4] xfun_0.36              vctrs_0.5.1            generics_0.1.3        
 [7] htmltools_0.5.4        stats4_4.1.3           BiocFileCache_2.2.1   
[10] yaml_2.3.6             utf8_1.2.2             blob_1.2.3            
[13] XML_3.99-0.13          rlang_1.0.6            pillar_1.8.1          
[16] glue_1.6.2             DBI_1.1.3              rappdirs_0.3.3        
[19] BiocGenerics_0.40.0    bit64_4.0.5            dbplyr_2.2.1          
[22] GenomeInfoDbData_1.2.7 lifecycle_1.0.3        stringr_1.5.0         
[25] zlibbioc_1.40.0        Biostrings_2.62.0      memoise_2.0.1         
[28] evaluate_0.19          Biobase_2.54.0         knitr_1.41            
[31] IRanges_2.28.0         fastmap_1.1.0          biomaRt_2.50.3        
[34] GenomeInfoDb_1.30.1    curl_4.3.3             fansi_1.0.3           
[37] AnnotationDbi_1.56.2   Rcpp_1.0.9             filelock_1.0.2        
[40] cachem_1.0.6           S4Vectors_0.32.4       XVector_0.34.0        
[43] bit_4.0.5              hms_1.1.2              png_0.1-8             
[46] digest_0.6.31          stringi_1.7.8          dplyr_1.0.10          
[49] rhdf5filters_1.6.0     cli_3.5.0              tools_4.1.3           
[52] bitops_1.0-7           magrittr_2.0.3         tibble_3.1.8          
[55] RCurl_1.98-1.9         RSQLite_2.2.20         crayon_1.5.2          
[58] pkgconfig_2.0.3        ellipsis_0.3.2         xml2_1.3.3            
[61] prettyunits_1.1.1      assertthat_0.2.1       rmarkdown_2.19        
[64] httr_1.4.4             rstudioapi_0.14        Rhdf5lib_1.16.0       
[67] R6_2.5.1               compiler_4.1.3        

Note that the .h5 file was originally stored only on the external SSD. I copied it from the SSD to my Desktop today for demonstration purposes, so the file is identical in the two locations.

Can anyone help me understand this behaviour? It works a lot better for my workflow if I can keep the .h5 file stored on the external SSD.

Based on some GitHub issues, I have tried setting Sys.setenv(HDF5_USE_FILE_LOCKING = "FALSE") and Sys.setenv(RHDF5_USE_FILE_LOCKING = "FALSE"), all to no avail.

Others have been experiencing this issue using the sleuth package here: https://github.com/pachterlab/sleuth/issues/274

And a similar issue has been asked about here: Reading .h5 file in R

But neither of these answers my question or solves my issue!

I appreciate any insight!

  • What file system is your SSD using? And what mount options are active? Do you have write-permission to the directory of the file? Can you open the file for appending? – Homer512 Dec 22 '22 at 23:40
  • @Homer512 File system is MacOS Extended (Journaled) and the Scheme is GUID Partition Map. mount returns: /dev/disk4s2 on /Volumes/SanDisk_Extreme_1TB (hfs, local, nodev, nosuid, journaled, noowners). Based on "Get Info" I do have read & write permission to this drive. I can only open the file for appending if I move it to my Desktop, unless there's some other package to open/edit it with in R? – bkleiboeker Dec 23 '22 at 00:51
  • @Homer512 Notably, I can use this exact same R script with the same versions of R and rhdf5 and with the exact same SSD/.h5 file on my 2016 Intel MBP. Today's problems emerged as I am setting up my new 2020 M1 MBP, in case that could matter? In my uninformed opinion it seems like there's a chance it's an issue with the new laptop, or else with how it interfaces with the SSD, as this SSD and script worked just fine on my 2016 MBP. – bkleiboeker Dec 23 '22 at 00:55

1 Answer

A few suggestions:

You can try calling h5errorHandling(type = "verbose") before trying to open the file. It won't fix the problem, but it will print the full HDF5 error trace, which might give more clues.

Perhaps try opening the file in read-only mode. You can do that with H5Fopen("/pathway/to/external/SSD/abundance.h5", flags = "H5F_ACC_RDONLY").

Another thing to try is to make another copy of the file on the external drive. That might help determine whether the problem is specific to this file or a more general issue with that file system.
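Put together, the three suggestions might look something like the sketch below. This is only a diagnostic outline, not a fix: the path is the placeholder from the question, and the name abundance_copy.h5 is a hypothetical choice for the second copy. The tryCatch wrappers are there only so the script keeps running when an open fails.

```r
library(rhdf5)

# Placeholder path from the question; replace with the real location on the SSD.
h5_path <- "/pathway/to/external/SSD/abundance.h5"

# 1. Ask rhdf5 to print the full HDF5 error stack rather than the short summary.
h5errorHandling(type = "verbose")

# 2. Try opening the file read-only. On failure, report the message and return NULL.
fid <- tryCatch(
  H5Fopen(h5_path, flags = "H5F_ACC_RDONLY"),
  error = function(e) {
    message("Read-only open failed: ", conditionMessage(e))
    NULL
  }
)
if (!is.null(fid)) H5Fclose(fid)

# 3. Make a second copy on the same drive and try opening that copy.
#    "abundance_copy.h5" is a hypothetical file name chosen for this sketch.
copy_path <- file.path(dirname(h5_path), "abundance_copy.h5")
if (file.copy(h5_path, copy_path, overwrite = TRUE)) {
  fid_copy <- tryCatch(
    H5Fopen(copy_path, flags = "H5F_ACC_RDONLY"),
    error = function(e) {
      message("Opening the copy failed: ", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(fid_copy)) H5Fclose(fid_copy)
} else {
  message("Could not copy the file within the external drive.")
}
```

If the copy opens but the original does not, that would point at the file itself; if both fail, the drive or file system seems the more likely culprit.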

Grimbough