I had a similar issue with a GEO project. I downloaded all of the .idat files and put them in their own folder, then used the code below to parse the .idat filenames and create a sample sheet.
It will parse a filename like GSM1855609_9020331147_R02C02_Grn.idat, keep the first three underscore-separated pieces (here GSM1855609, 9020331147, and R02C02), and print them as comma-separated lines that you can redirect to a .csv file. Then you can read the .csv file into R, add the standardized column names (c("Sample_Name", "Sentrix_ID", "Sentrix_Position")) that a function like logger wants to see, and you're on your way.
Hope this helps!
#!/usr/bin/env python
# Import the OS library
import os
# Get your Current Working Directory
cwd = os.getcwd()
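# Get a list of all of the .idat files in your directory, skipping anything else
# (other files, subdirectories, or a previously written .csv).
# This will be a sorted list of strings.
filenames = sorted(f for f in os.listdir(cwd) if f.endswith(".idat"))
# Split each one into the chunks that were separated by underscores ("_") and then keep the first three for each name.
# Each sample has both a _Grn and a _Red file, so only keep chunks we have not seen yet.
# This will be a list of lists.
chunked_names = []
for filename in filenames:
    chunks = filename.split("_")[0:3]
    if chunks not in chunked_names:
        chunked_names.append(chunks)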
# For each name, rejoin the three chunks with commas
# We're back to having a list of strings.
csv_lines = [",".join(chunks) for chunks in chunked_names]
# Join all of those strings with the newline character to get one long string.
contents = "\n".join(csv_lines)
# Print this string to standard output so that it can be redirected to a file.
print(contents)
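
If you would rather not add the column names in R afterwards, here is a minimal sketch of a variant that writes the header row itself. The helper name build_sample_sheet is made up for illustration, and the _Red filename and notes.txt in the example are invented to show the filtering; the column names are the ones mentioned above. It assumes every .idat filename follows the GSM_SentrixID_Position_Channel.idat pattern.

```python
import csv
import io

# Column names taken from the post above.
HEADER = ["Sample_Name", "Sentrix_ID", "Sentrix_Position"]


def build_sample_sheet(filenames):
    """Return CSV text with a header row and one row per sample (hypothetical helper)."""
    rows = []
    for filename in sorted(filenames):
        if not filename.endswith(".idat"):
            continue  # skip anything that is not an .idat file
        chunks = filename.split("_")[0:3]
        # Keep only well-formed names, and collapse the _Grn/_Red pair into one row.
        if len(chunks) == 3 and chunks not in rows:
            rows.append(chunks)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative input: the first name is from the post, the other two are invented.
example = [
    "GSM1855609_9020331147_R02C02_Grn.idat",
    "GSM1855609_9020331147_R02C02_Red.idat",
    "notes.txt",
]
sheet = build_sample_sheet(example)
print(sheet)
```

To run it on a real folder, you could pass os.listdir(os.getcwd()) instead of the example list and redirect the printed output to a file, the same way as with the script above.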