I am doing RNA sequencing. I am using the terminal to merge some of my FASTQ files with a Python script.
The Python script reads a setting from a YAML file that determines how to recognise the files:
merge:
  pattern_input: '(\S+)_L00\d.fastq.gz'
My file names are as follows:
5mm_S33_L001_R1_001.fastq.gz
There are 4 files which need to be merged like so:
5mm_S33_**L001**_R1_001.fastq.gz
5mm_S33_**L002**_R1_001.fastq.gz
5mm_S33_**L003**_R1_001.fastq.gz
5mm_S33_**L004**_R1_001.fastq.gz
I need the script to recognise the bold text (L00 followed by the lane number, 1 to 4) in order to merge the files.
I have tried various combinations of the pattern input and cannot get it to work. Examples I have tried:
(\S+)_L00[1-4]_(\S+)
(\S+)_L00\d
(\S+)_L00[1-4]_V1_001.fastq.gz
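For reference, here is a minimal sketch of how these patterns behave against the file names. It assumes the script applies the pattern with Python's `re` module; I do not know whether it searches within the name or requires the whole name to match, so both are shown. The helper `describe` and the final `candidate` pattern are hypothetical illustrations, not taken from the script.

```python
import re

# File names from the post: four lanes of one sample.
files = [f"5mm_S33_L00{lane}_R1_001.fastq.gz" for lane in range(1, 5)]

# The pattern currently in the YAML file, then the three attempts listed above.
patterns = [
    r"(\S+)_L00\d.fastq.gz",
    r"(\S+)_L00[1-4]_(\S+)",
    r"(\S+)_L00\d",
    r"(\S+)_L00[1-4]_V1_001.fastq.gz",
]

# Hypothetical candidate (an assumption, untested with the actual script):
# it allows for the read suffix between the lane number and the extension.
candidate = r"(\S+)_L00\d_(R\d_001)\.fastq\.gz"


def describe(pattern, name):
    """Return (captured groups from re.search or None, whether the whole name matches)."""
    found = re.search(pattern, name)
    whole = re.fullmatch(pattern, name) is not None
    return (found.groups() if found else None, whole)


# Show the behaviour of every pattern on the first file name.
for p in patterns + [candidate]:
    print(p, describe(p, files[0]))
```

Under those assumptions, the pattern from the YAML file finds nothing because `_R1_001` sits between the lane number and `.fastq.gz`, and the third attempt only matches part of the name.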
Any suggestions or solutions for what the pattern input should be to recognise the files would be gratefully received. The YAML file containing this pattern is in the same directory as the files.
WBW Rob