I am wondering if there is a way to define wildcards when the input files are named slightly differently. In this case, the FASTQ files have different suffixes: some end with '_L001_R1_001.fastq.gz' and some with 'R1_001.fastq.gz'. I'm hoping to use glob_wildcards to read in the run name and sample name. Is there a good way to express an "or" in glob_wildcards? Any suggestions would be fantastic. Thank you in advance!
# Define samples:
RUNS, SAMPLES = glob_wildcards(config['fastq_dir'] + "{run}/{samp}" + config['fastq1_suffix'])
My config file contains the following:
fastq_dir: '~/tb/data/'
fastq1_suffix: '_L001_R1_001.fastq.gz'
fastq2_suffix: '_L001_R2_001.fastq.gz'
First rule:
rule trim_reads:
    input:
        p1=config['fastq_dir'] + '{run}/{samp}' + config['fastq1_suffix'],
        p2=config['fastq_dir'] + '{run}/{samp}' + config['fastq2_suffix']
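To make the "or" concrete, here is a rough plain-Python sketch of the matching I have in mind. It is not a Snakemake API: the regex, the helper name parse_fastq1, and the example paths are all made up for illustration, and it assumes the lane token "_L001" is the only difference between the two naming schemes (i.e. the short form is "_R1_001.fastq.gz" with a leading underscore).

```python
import re

# Hypothetical pattern: "<run>/<samp>" followed by an optional "_L001" lane token
# and then the fixed read-1 suffix. Paths are relative to fastq_dir.
FASTQ1_RE = re.compile(
    r"^(?P<run>[^/]+)/(?P<samp>[^/]+?)(?:_L001)?_R1_001\.fastq\.gz$"
)


def parse_fastq1(rel_paths):
    """Return parallel lists of run and sample names for read-1 files
    that match either suffix variant; other paths are ignored."""
    runs, samples = [], []
    for path in rel_paths:
        match = FASTQ1_RE.match(path)
        if match:
            runs.append(match.group("run"))
            samples.append(match.group("samp"))
    return runs, samples


# Made-up example paths covering both naming schemes plus a read-2 file.
example_paths = [
    "runA/s1_L001_R1_001.fastq.gz",
    "runB/s2_R1_001.fastq.gz",
    "runB/s2_R2_001.fastq.gz",
]
RUNS, SAMPLES = parse_fastq1(example_paths)
```

With these example paths, RUNS and SAMPLES come out as parallel lists in the same shape that glob_wildcards returns, with the read-2 file skipped. The rule inputs would still need some way to pick the right suffix for each sample, which this sketch does not address.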