Questions tagged [fastq]

FASTQ files are used in bioinformatics to store sequence information and sequencing quality scores.

FASTQ format is a text-based format for storing both a biological sequence (usually a nucleotide sequence) and its corresponding quality scores. Each sequence letter and each quality score is encoded with a single ASCII character for brevity.

[Wikipedia]
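As a rough illustration of the format described above, here is a minimal Python sketch that reads four-line FASTQ records and turns each quality character into a numeric score. The function name `read_fastq`, the constant `PHRED_OFFSET`, and the sample record are made up for this example. It assumes the common Phred+33 encoding and that sequences are not wrapped across multiple lines; older files may use a different offset.

```python
from io import StringIO

# Assumption: Phred+33 encoding (quality score = ASCII code - 33).
PHRED_OFFSET = 33


def read_fastq(handle):
    """Yield (read_id, sequence, scores) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (or an empty line) stops iteration
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record near: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch in: " + header)
        # One quality character per base: decode each to an integer score.
        scores = [ord(ch) - PHRED_OFFSET for ch in quality]
        yield header[1:], sequence, scores


# Hypothetical one-record input, held in memory instead of a file.
example = "@read1\nACGT\n+\nI5?!\n"
records = list(read_fastq(StringIO(example)))
print(records)  # [('read1', 'ACGT', [40, 20, 30, 0])]
```

For real data, a maintained parser (for example Biopython's `SeqIO` with the `"fastq"` format) is likely a safer choice than a hand-rolled reader like this one.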

257 questions

1 answer

Nanopore tools designed to analyze fastq file format?

I just received my first nanopore data set and was sent a fastq file. I was expecting a fast5 file, and now I'm not sure how to begin filtering the data. Most of the tools I've come across (NanoOK, poretools) deal with the fast5 format, although…

stack-overflow fastq stack-smash

asked Jul 21 '17 at 15:02

7tbear7

1 answer

Counting and removing characters in different lines

I have DNA sequence data in FASTQ format, which takes a 4-line format per record: @sequence-header-information sequence + quality-scores Each character in the sequence line has a corresponding character in the quality score line. All the sequences…

bash awk sed fastq

asked Jul 17 '17 at 08:53

ecologysarah

2 answers

Writing a script for large text file manipulation (iterative substitution of duplicated lines), weird bugs and very slow.

I am trying to write a script which takes a directory containing text files (384 of them) and modifies duplicate lines that have a specific format in order to make them not duplicates. In particular, I have files in which some lines begin with the…

bash fastq

asked Apr 20 '17 at 20:47

Alon Gelber

3 answers

Bash script to concatenate text files with specific substrings in filenames

Within a certain directory I have many directories containing a bunch of text files. I’m trying to write a script that concatenates only those files in each directory that have the string ‘R1’ in their filename into one file within that specific…

bash fastq

asked Jan 12 '17 at 21:57

Alon Gelber

5 answers

Python - Checking concordance between two huge text files

So, this one has been giving me a hard time! I am working with HUGE text files, and by huge I mean 100Gb+. Specifically, they are in the fastq format. This format is used for DNA sequencing data, and consists of records of four lines, something like…

python python-2.7 parsing bigdata fastq

asked Nov 18 '15 at 07:50

soungalo

2 answers

Concatenate Files In Order Linux Command

I just started learning to use the command line. Hopefully this is not a dumb question. I have the following files in my directory: L001_R1_001.fastq L002_R2_001.fastq L004_R1_001.fastq L005_R2_001.fastq L001_R2_001.fastq L003_R1_001.fastq…

linux sorting cat fastq

asked Oct 15 '13 at 18:37

user2883746

1 answer

Parallel sed with group capture

I have to process a big file, and have been reading about the parallel command to try to use more than one processor core when using sed, sort, and so on. So I first wanted to change the first line of every four (because of naming conventions of this kind of…

linux sed parallel-processing gnu-parallel fastq

asked May 08 '13 at 09:32

cantalapiedra

2 answers

Peek into stream of Popen pipeline in Python

Background: Python 2.6.6 on Linux. First part of a DNA sequence analysis pipeline. I want to read a possibly gzipped file from a mounted remote storage (LAN) and if it is gzipped; gunzip it to a stream (i.e. using gunzip FILENAME -c) and if the…

python popen fastq

asked Oct 07 '12 at 18:25

user1727089

1 answer

R list path command only returns some of the files, but not all

I'm working on analyzing some fastq files in R for 16s work. I have a previous script from someone that has successfully done this before, but when I did: path_1 <- "set to my WD" then went to get a list of the files in the path…

r fastq

asked Jun 28 '23 at 20:09

Laeanna

1 answer

Nextflow Units file specified is not found. Please provide a valid file

I have the following Nextflow script, which runs the tool perf on all the split fastq files located in the directory mentioned below. When I run the script I get the following error: *Error executing process > 'perf (29)' Caused by: Process `perf (29)`…

bioinformatics fastq nextflow

asked Mar 06 '23 at 21:16

AishwaryaKulkarni

1 answer

How to align reads to two SHORT reference sequences and see percentage that mapped to one or the other reference?

I have PCR-Amplified fastq files of a specific target region from several samples. For each sample, I want to know the percentage of reads that align better to reference sequence #1 or #2 posted below. How should I begin to tackle this question and…

bioinformatics fastq sequence-alignment

asked Dec 20 '22 at 23:18

Sara Nicholson

0 answers

Fastp cannot open a file

I used fastp like this > cat test | while read id > do > name=`echo $id |awk '{print $1}'` > read1=`echo $id |awk '{print $2}'` > read2=`echo $id |awk '{print $3}'` > echo $name > echo $read1 > echo $read2 > fastp \ > …

fastq

asked Nov 25 '22 at 08:29

Limbo

1 answer

how to produce multiple readlength.tsv at once from multiple fastq files?

I have 16 fastq files in different directories and need to produce a readlength.tsv for each one separately. This is the script that I should use to produce readlength.tsv: zcat ~/proje/project/name/fıle_fastq | paste…

bash loops fastq sbatch

asked Aug 23 '22 at 21:51

pierogi

1 answer

How to extract unique read IDs from a fastq file?

I want to extract all the unique read IDs in a fastq file and output the unique read IDs to a text file. (I have done the same task for bam files using the samtools but I don't know any tools that would handle fastq files.) for BAM files: samtools…

uniq fastq ids

asked Jul 07 '22 at 19:53

HimalayanGuy

1 answer

using printf to include both variable output and command

I am trying to get the number of reads for my fastq files, and I wanted the output to also include the name of my files. I've found a solution online that almost works, but still not getting the right output. Example: My file…

bash unix printf fastq

asked Apr 05 '22 at 01:05

Rachel
