Questions tagged [fastq]

FASTQ files are used in bioinformatics to store sequence information and sequencing quality scores.

FASTQ format is a text-based format for storing both a biological sequence (usually nucleotide sequence) and its corresponding quality scores. Both the sequence letter and quality score are each encoded with a single ASCII character for brevity.

[Wikipedia]
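
As a rough illustration of the description above, here is a minimal Python sketch that reads FASTQ records and decodes each quality character into a number. It assumes exactly four lines per record and a quality offset of 33 (the common "Phred+33" convention); neither assumption is stated in the excerpt above. The names parse_fastq and FastqRecord are hypothetical, and the sample reads are the ones quoted in the "Read FASTQ file into a Spark dataframe" question on this page.

```python
from itertools import islice
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    """One read: its name, its bases, and one integer quality per base."""
    name: str
    sequence: str
    qualities: List[int]


def parse_fastq(lines: Iterable[str], offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from an iterable of lines.

    Assumes four lines per record (header, sequence, '+', quality) and
    that each quality character encodes (score + offset) as its ASCII code.
    """
    stripped = (line.rstrip("\r\n") for line in lines)
    while True:
        chunk = list(islice(stripped, 4))
        if not chunk:
            return  # clean end of input
        if len(chunk) != 4:
            raise ValueError("truncated FASTQ record")
        header, sequence, separator, quality = chunk
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        # One ASCII character per base: subtract the offset to get the score.
        scores = [ord(char) - offset for char in quality]
        yield FastqRecord(header[1:], sequence, scores)


# Sample data taken from a question excerpt on this page.
EXAMPLE = """@seq1
AGTCAGTCGAC
+
?@@FFBFFDDH
@seq2
CCAGCGTCTCG
+
?88ADA?BDF8
"""

if __name__ == "__main__":
    for record in parse_fastq(EXAMPLE.splitlines()):
        print(record.name, record.sequence, record.qualities)
```

Because the parser consumes lines lazily, it can also be handed an open file object instead of a list, which matters for the multi-gigabyte files mentioned in several questions here.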

257 questions
3 votes, 3 answers

Parse file and use some of the fields as variables using the header as name in bash

I have a file whose first line contains a series of tab-separated (\t) fields. I'm trying to walk through the lines and use some of the fields as variables for a programme. The code I have so far is the following: { A=$(head -1…
biojl
3 votes, 3 answers

Bash: replace part of filename

I have a command I want to run on all of the files of a folder, and the command's syntax looks like this: tophat -o What I would like to do is a script that loops over all the files in an arbitrary folder and also uses…
erikfas
3 votes, 1 answer

Creating a new queue using Torque/PBS: "access from host not allowed"

I have carried out the following commands. qmgr -c "create queue fastq queue_type=execution" qmgr -c "set queue fastq started=true" qmgr -c "set queue fastq enabled=true" qmgr -c "set queue fastq acl_hosts=compute-0-30" qmgr -c "set queue fastq…
Griff
3 votes, 2 answers

Get content from a variable whose name is taken from another variable

I am doing some shell scripting. I use this construction for creating new variables: eval ${ARG}_ext=fastq which works pretty nicely because then I can use those newly created variables directly like this: $file_ext Now I want to assign a value to the…
Perlnika
3 votes, 2 answers

What is the meaning of "^F" in a samtools mpileup result?

This is part of a samtools mpileup result: chr7 55241514 G 2786 ..................... chr7 55241515 C 2786 ..................... chr7 55241516 C 2786 ..................... chr7 55241517 …
user1744416
3 votes, 2 answers

Do multiple runs make it parallel?

I have written a short Python script to process my big fastq files, which range in size from 5Gb to 35Gb. I am running the script on a Linux server that has many cores. The script is not parallelized at all and takes about 10 minutes to finish for a…
svural
3 votes, 4 answers

Trim Illumina reads in a bam/sam file

I have found plenty of tools for trimming reads in fastq format, but are there any available for trimming already aligned reads?
JoshuaA
2 votes, 1 answer

Syntax conflict for "{" using Nextflow

I am new to Nextflow. I attempted to run a loop in a Nextflow chunk to remove the extension from sequence file names and am running into a syntax error. params.rename = "sequences/*.fastq.gz" workflow { rename_ch =…
2 votes, 3 answers

How to merge zcat and bzcat in a single function

I would like to build a little helper function that can deal with fastq.gz and fastq.bz2 files. I want to merge zcat and bzcat into one transparent function which can be used on both sorts of files: zbzcat example.fastq.gz zbzcat…
Peter Pisher
2 votes, 2 answers

How to trim every nth line?

I would like to cut off the first 9 characters of every 4th line. I could use cut -c 9, but I don't know how to select only every 4th line without losing the remaining…
gnikixam
2 votes, 2 answers

How do I append variable name while splitting fastq file?

I have a fastq file below and I want to split the file by lane=$2. My code does the job of splitting it, but I also want the output files to have $SM variable appended to them. Can someone please let me know what I am missing in my…
MAPK
2 votes, 2 answers

Write a txt file with fastq pair names with Python

I'm new to Python and want to improve. Now I want to write a Python script to organize my fastq file names into a txt file, like this: My files are like…
stevex
2 votes, 1 answer

grep every fourth line in .fastq

I am working on a Linux machine using bash. My question is: how can I skip lines in the query file using grep? I am working with a large ~16Gb .fastq file named example.fastq, which has the following format. example.fastq @SRR6750041.1…
Paul
2 votes, 1 answer

Read FASTQ file into a Spark dataframe

I'm trying to read FASTQ files into Spark dataframes. I have some difficulties because FASTQ is a multi-line format. Example: @seq1 AGTCAGTCGAC + ?@@FFBFFDDH @seq2 CCAGCGTCTCG + ?88ADA?BDF8 Is there a way to get these data into a Spark dataframe…
John Doe
2 votes, 1 answer

Delete an item from a dictionary generated by SeqIO.index

I am using Python 2.6.6 and I am trying to remove fastq reads in file2 that overlap (i.e., are identical to) reads in file1. Here is the code I am trying to implement: ref_reads = SeqIO.index("file1.fastq", "fastq") spk_reads =…
wa3j