Questions tagged [fastq]

FASTQ files are used in bioinformatics to store sequence information and sequencing quality scores.

FASTQ format is a text-based format for storing both a biological sequence (usually a nucleotide sequence) and its corresponding quality scores. Each sequence letter and each quality score is encoded with a single ASCII character for brevity.

[Wikipedia]
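
As a rough illustration of the single-character quality encoding described above, here is a minimal Python sketch. It assumes the Phred+33 offset (the description itself does not name an offset, and older data may use other offsets), and the function names and the sample record are made up for the example.

```python
# Minimal sketch, assuming Phred+33 encoding (offset 33).
# Function names and the sample record are hypothetical.

PHRED_OFFSET = 33  # assumption: Phred+33


def phred33_decode(quality):
    """Turn a FASTQ quality string into a list of integer scores."""
    return [ord(char) - PHRED_OFFSET for char in quality]


def phred33_encode(scores):
    """Turn a list of integer scores into a FASTQ quality string."""
    return "".join(chr(score + PHRED_OFFSET) for score in scores)


# A FASTQ record has four lines: header, sequence, separator, qualities.
record = "@read1\nGATTACA\n+\nIIIIHG#\n"
header, sequence, separator, quality = record.splitlines()

scores = phred33_decode(quality)
print(scores)  # [40, 40, 40, 40, 39, 38, 2]
```

The sequence and quality lines have the same length, so each base lines up with exactly one score.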

257 questions
1
vote
1 answer

How to loop over multiple folders to concatenate FastQ files?

I have received multiple fastq.gz files from Illumina Sequencing for 100 samples. But all the fastq.gz files for the respective samples are in separate folders according to the sample ID. Moreover, I have multiple (8-16) R1.fastq.gz and R2.fastq.gz…
Anik Dutta
  • 29
  • 6
1
vote
1 answer

Exception in thread "main" java.awt.HeadlessException

I am running FastQC from the command line (Ubuntu terminal on a Windows 10 PC) but got the following error. I am not sure how to solve this, and I would appreciate it if someone already knows the solution. Exception in thread "main"…
adR
  • 305
  • 4
  • 14
1
vote
0 answers

Converting BAM files to FASTQ using Google Colab

I am trying to convert BAM files from Ion Amplicon sequencing to FASTQ files. I am using Google Colaboratory. I will be using the FASTQ files in R with the DADA2 pipeline. Any help would be greatly appreciated.
Nicole_B
  • 11
  • 2
1
vote
1 answer

How to search and replace using Regex s/ when having to convert genome quality to ASCII?

I am struggling to convert genome read qualities from a fasta.qual file (40, 39, 38, etc.) to ASCII using Phred+33 in Perl, but can't get it to work. I am trying to do it with the s///g operator. I have my qualities stored in a hash and I am trying…
Alan
  • 35
  • 6
1
vote
1 answer

Concatenate multiple sets of 2 fastq files in BASH

I'm trying to merge multiple sets of 2 fastq files from the same sequencing library. I have a txt file with all the sample names in it. The samples were sequenced paired-end, so there are both _1.fastq.gz and _2.fastq.gz files associated with each…
1
vote
0 answers

How to extract a part of a sequence from FASTA

I have FASTA files, from an in vitro SELEX experiment. All reads should in theory start with the same 6 bases (core seq: GCTGCT) and be of equal length - 27nt. In reality, some reads start even 10 bases later with the core sequence and continue with…
Fluorine
  • 55
  • 9
1
vote
1 answer

How can I create a FASTQ sequence file?

I have a genomic database, which contains a simple character sequence (like >chr1 AGTGTCA.....). Now, I want to convert it to the standard FASTQ format like this: @HWUSI-EAS594-R:1:3:1453:1350#0/1…
Arpssss
  • 3,850
  • 6
  • 36
  • 80
1
vote
2 answers

Bash loops vs. parallel processes

I wrote a simple script using cat+pipe+parallel in bash, but due to the large amount of input data (>200) my computer crashes. However, it works well with only a few files (2). I was recommended to use "for" or "foreach" loops instead to avoid the…
Valentina
  • 47
  • 4
1
vote
1 answer

Why is my R function not running? Trying to send R script to cluster

This is a very beginner question, so thanks in advance. I was given an R script to align reads from fastq files to a genome. All I need to do is send this R script to my uni's cluster, but I want to make sure the script is running fine on my own computer…
kefir
  • 21
  • 5
1
vote
1 answer

How to trim every nth line by a different value?

I would like to trim the last XY characters of every 4th line. The cutoff should be the difference between the character counts of lines 4 and 2, and of lines 8 and 6. For example: line 4 (29 characters) - line 2 (20 characters) = 9. So the last 9…
gnikixam
  • 69
  • 6
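
One possible reading of the trimming rule in the excerpt above, sketched in Python: in each 4-line record, the 4th line (qualities) is cut back by however many characters it exceeds the 2nd line (sequence). This assumes well-formed 4-line records; the function name and sample data are hypothetical, and the full question may ask for something slightly different.

```python
# Minimal sketch, assuming every record is exactly four lines long.
# The function name and the sample lines are hypothetical.


def trim_quality_to_sequence(lines):
    """Trim each 4th line by the length difference to its 2nd line."""
    trimmed = []
    for start in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[start:start + 4]
        excess = len(quality) - len(sequence)
        if excess > 0:
            # Drop the last `excess` characters of the quality line.
            quality = quality[:-excess]
        trimmed.extend([header, sequence, separator, quality])
    return trimmed


lines = ["@r1", "ACGT", "+", "IIIIII", "@r2", "AC", "+", "II"]
print(trim_quality_to_sequence(lines))
```

Records whose 4th line is already no longer than the 2nd are left unchanged.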
1
vote
3 answers

awk; getting multiple lines from two files when they share a common header

I have a question that is quite similar to many other questions regarding this topic, yet I am unable to extend these solutions to the exact output I am looking for. I have two files that are formatted in fastq style, which looks like…
1
vote
2 answers

How to remove a SeqRecord object from a fastq file

I have a parsed fastq file, and I am doing some operations with the reads. Concretely, I am trying to determine whether my fastq files contain reads that come from microbial contamination rather than from my human sample. So, if a read is a contaminant,…
1
vote
2 answers

How to sort dirty fastq files to interleaved fastq

I have a fastq file (file.fastq) of about 80GB in size which has a header line and three subsequent information lines. I need to match /1 and /2 in header lines matching the header information and put them in one file sorted by /1 and /2…
MAPK
  • 5,635
  • 4
  • 37
  • 88
1
vote
1 answer

In a fastq file, how do I change the sequence headers to the file name and a unique identifier?

I'm working with barcoded data and I want to combine the fastq files and still be able to tell easily which barcode each read originally had. So I am trying to change the names of the reads to the name of the file (e.g. barcode01.fastq) and…
DewBudd
  • 23
  • 3
1
vote
1 answer

Combine multiple wildcards in Snakemake

├── DIR1
│   ├── smp1.fastq.gz
│   ├── smp1_fastqc/
│   ├── smp2.fastq.gz
│   └── smp2_fastqc/
└── DIR2
    ├── smp3.fastq.gz
    ├── smp3_fastqc/
    ├── smp4.fastq.gz
    └── smp4_fastqc/

I would like to count the number of reads by sample and…
Elysire
  • 693
  • 10
  • 23