Questions tagged [fastq]

FASTQ files are used in bioinformatics to store sequence information and sequencing quality scores.

FASTQ format is a text-based format for storing both a biological sequence (usually nucleotide sequence) and its corresponding quality scores. Both the sequence letter and quality score are each encoded with a single ASCII character for brevity.

[Wikipedia]
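As an illustration of the description above, here is a minimal sketch of reading FASTQ records and decoding the per-base quality characters. It assumes the common four-line record layout and the Phred+33 (Sanger / Illumina 1.8+) offset, neither of which is stated in the excerpt; older Illumina files use an offset of 64. The function name `read_fastq` is hypothetical.

```python
import io
from typing import Iterator, List, TextIO, Tuple

PHRED_OFFSET = 33  # assumption: Phred+33 encoding; older Illumina data uses 64


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (name, sequence, qualities) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return  # end of input (or a blank line) stops the iteration
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # Each quality character encodes one score: ASCII code minus the offset.
        yield header[1:], seq, [ord(c) - PHRED_OFFSET for c in qual]


# Usage with an in-memory record; a real file opened in text mode works the same way.
example = io.StringIO("@read1\nGATTACA\n+\nIIIIII#\n")
for name, seq, quals in read_fastq(example):
    print(name, seq, quals)
```

For real data, an established parser such as Biopython's `SeqIO.parse(handle, "fastq")` handles more edge cases than this sketch, for example multi-line records.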

257 questions
-1
votes
1 answer

Why does wget fail when running multiple wget commands targeting the ENA in a single shell script?

I wanted to download FASTQ files associated with a particular BioProject (PRJEB21446) from the European Nucleotide Archive. There is a button to generate and download a shell script containing wget commands for all FASTQ files associated with the…
Sj1993
  • 34
  • 4
-1
votes
1 answer

How to fix "AttributeError: 'generator' object has no attribute 'nextRead'" issue?

I'm trying to use this code from https://github.com/dayedepps/q30 and I encountered some issues. I tried fixing some of the issues except for one. def stat(filename): reader = fastq.read(filename) total_count = 0 q20_count = 0 …
apcreyes
  • 1
  • 1
-1
votes
1 answer

How can I check if a file is a real FASTQ (python)?

I have to check if a file is FASTA, FASTQ or none of those. For the FASTA checking I used the module SeqIO from Bio: def is_fasta(filename): with open(filename, "r") as handle: fasta = SeqIO.parse(handle, "fasta") return…
C insi
  • 13
  • 3
-1
votes
1 answer

Split fasta file into specific new fasta files

So I'm writing this code that will read a fasta file. In the fasta file, there will be 10 sequences. The start of each sequence will be ">". I want to split those sequences 50:50 and create two new fasta files with them. 5 sequences in one new file;…
Juleszio
  • 3
  • 2
-1
votes
1 answer

A question on spotting the error. What are the errors in the following python3 script?

#!/usr/bin/env python3 # trimAll.py #Initialize variable to contain the directory of un-trimmed fastq files fastqPath="/scratch/AiptasiaMiSeq/fastq/" #Initialize variable to contain the suffix for the left…
-1
votes
1 answer

Trying to create a script that counts the length of all the reads in a fastq file but getting no return

I am trying to count the length of each read in a fastq file from Illumina sequencing and output this to a tsv or any sort of file so I can later look at it and count the number of reads per file. So I need to cycle down the file and…
Rob
  • 17
  • 5
-1
votes
2 answers

Merging files in folder with same file name except one character

I have filenames like the…
Jack Arnestad
  • 1,845
  • 13
  • 26
-1
votes
1 answer

Pass lines from 2 files to same subroutine

I'm in the process of learning how to use perl for genomics applications. I am trying to clean up paired end reads (1 forward, 1 reverse). These are stored in 2 files, but the lines match. What I'm having trouble doing is getting the relevant…
aupadhyaya
  • 31
  • 3
-1
votes
1 answer

ValueError: invalid literal for int() with base 10: '' error occurs when part of larger code, not when alone

defaultdict(, {'match': 1}) [(0, 0)] [] [] [] Traceback (most recent call last): File "Joinomattic_1.py", line 409, in verboseprint (magic(matching)) File "Joinomattic_1.py", line 408, in magic = lambda…
Tom
  • 469
  • 4
  • 7
  • 16
-1
votes
1 answer

Checking my Tuple code

So I'm trying to parse a FASTQ sequence, but I'm a beginner to Python, and I'm a little confused as to how to complete my code. This is what the program is supposed to carry out: if I enter the FASTQ seqname…
user3504701
  • 45
  • 1
  • 6
-1
votes
3 answers

Read same extension multiple files in one directory in Perl

I currently have an issue with reading files in one directory. I need to take all the fastq files in a file and run the script for each file then put new files in an ‘Edited_sequences’ folder. The one script I had is perl -ne '$i++;…
Shunzhe Yao
  • 33
  • 1
  • 3
-2
votes
1 answer

how to extract only mapped reads?

I have mapped a pacbio read against a reference [with minimap2] and now I have my output in Bam file. I would like to extract only the mapped reads from it. I tried bamToFastq [samtools bamtofq input.bam | seqtk seq -A > output.fa], since finally…
-2
votes
1 answer

Python Def Syntax Error with . in file name

In my Visual Studio Code running python3.6 - my code is saved as "Langemead12Test.py" w/ lines as: !C:\Users\Bones\Anaconda3\python.exe [1]def readFastq(SRR835775_1.first1000.fastq) Red Error underline def [pylint] E0001:invalid syntax (, line…
Triage
  • 21
  • 1
  • 3
-3
votes
1 answer

Explanation of a code about lineIndex , to collect reads from a file

Here the aim is to build a graph from a collection of strings (reads) in a FASTQ file. But first, we implement the following function that gets the reads. We remove the new line character from the end of each line (with str.strip()), and for…
-3
votes
1 answer

Bash Conditional IF statement after string within FASTQ header

I would like to extract only reads that have a coverage above 2 and length above 504. This is all stored in each header of the FASTQ file. However, I can't work out a one-liner that would filter based on these qualities. See an example of what two of the…