
Hi, I am not sure whether this is a trivial question, but I am having some trouble with it. I am trying to do the following:

I downloaded a folder of about 8000 pdb files on my computer. I converted the folder into an array using:

    protein_array = np.array(os.listdir('/directory/to/pdf/files/folder'))

in order to edit it. I identified the elements of the array that I cannot use later and deleted them from protein_array, so essentially I am cleaning up the array to work with it later. My problem is that I now need to save the edited protein_array back to my computer so that I again get a folder with (now fewer) PDB files. This seems rather simple, but I couldn't find out how to save a NumPy array to PDB-formatted files.
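To make the goal concrete: since protein_array holds file names rather than file contents, one possible approach is to copy the files that are still listed into a new folder. The following is only a minimal sketch under that assumption; the function name `copy_selected_files` and its parameters are hypothetical and not part of my actual code.

```python
import shutil
from pathlib import Path


def copy_selected_files(src_dir, dst_dir, kept_names):
    """Copy each file named in kept_names from src_dir into dst_dir.

    kept_names can be any iterable of file names, for example a list
    or a NumPy array of strings such as the cleaned-up protein_array.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)  # create the output folder if needed
    copied = []
    for name in kept_names:
        source = src_dir / str(name)  # str() also accepts NumPy string scalars
        if source.is_file():  # skip names that do not exist on disk
            shutil.copy2(source, dst_dir / source.name)
            copied.append(source.name)
    return copied
```

It would be called with the original folder, a (hypothetical) output folder, and the filtered array, e.g. `copy_selected_files(source_folder, output_folder, protein_array)`. The original files stay untouched, so nothing is lost if the filter turns out to be wrong.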

Jennan
  • The way you've created it, your `protein_array` is an array of file names (strings). Is that what you intended? – Seb Jan 01 '20 at 17:01
  • Hm, yes, that is true. I get an array of strings, but I later use the PDB parser of BioPython, which is still able to get the structure data from the files. So I am actually working with the contents of the files, yes. All of that works fine except the last step of generating a new folder without the excluded files. – Jennan Jan 01 '20 at 17:09
  • This code is bizarre; why not save the filenames in a `list`? Anyway, you can save a NumPy array using the function `numpy.save`. This creates a `.npy` file, which can be read back using `numpy.load`. – Jan Christoph Terasa Jan 01 '20 at 17:11
  • I actually tried that before. The problem is that I need to parse all the files in a later step, and iterating through all 8000 files and parsing every single one of them takes too long if they are in a list. As an array it goes much faster. – Jennan Jan 01 '20 at 17:14
  • It would actually be important that I can save the individual elements of the array as separate .pdb files. Do you know if that is possible? – Jennan Jan 01 '20 at 17:22
  • If `BioPython` can parse a file and *make* a numpy array, shouldn't it be able to do the reverse? – wwii Jan 01 '20 at 17:39
  • https://biopython.org/wiki/SeqIO – wwii Jan 01 '20 at 17:46
  • Unfortunately, Bio.SeqIO can only read PDB files – Jennan Jan 02 '20 at 16:27
  • Is your question still active? If so, it might help if you could show more of your code. – Lydia van Dyke Apr 05 '20 at 13:39
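
For reference, the `numpy.save` / `numpy.load` round trip mentioned in the comments could look roughly like the sketch below. The helper name `roundtrip_names` and the example file names are hypothetical. Note that this stores only the array of file names in a single `.npy` file; it does not produce separate .pdb files.

```python
import numpy as np


def roundtrip_names(names, npy_path):
    """Save an array of file names to a .npy file and load it back."""
    np.save(npy_path, np.asarray(names))  # writes a single binary .npy file
    return np.load(npy_path)  # returns the stored array of strings


# Example with made-up names (npy_path should end in ".npy"):
# loaded = roundtrip_names(["1abc.pdb", "2xyz.pdb"], "protein_names.npy")
```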

0 Answers