Questions tagged [samtools]

Samtools is a suite of programs for interacting with high-throughput sequencing data. It consists of three separate repositories:

  1. Samtools: reading/writing/editing/indexing/viewing SAM/BAM/CRAM format
  2. BCFtools: reading/writing BCF2/VCF/gVCF files and calling/filtering/summarising SNP and short indel sequence variants
  3. HTSlib: a C library for reading/writing high-throughput sequencing data

Samtools and BCFtools both use HTSlib internally, but these source packages contain their own copies of htslib so they can be built independently.

115 questions
0
votes
0 answers

Using OpenACC with compiled C programs

I am trying to use OpenACC to accelerate the Samtools package by inserting pragmas at the applicable for loops. linux86-64/19.4/bin/pgcc -acc autopar -ta=tesla config.h stats.c I get the following error message: stats.c: PGC-F-0206-Can't find…
0
votes
2 answers

Extract reads from a BAM/SAM file of a designated length

I am a bit new to Perl and wish to use it to extract reads of a specific length from my BAM (alignment) file. The BAM file contains reads whose length ranges from 19 to 29 nt. Here is an example of first 2…
pkom
  • 1
  • 1
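
One way to approach the length filter described above is to stream the alignments as SAM text and test the length of the SEQ column (the 10th tab-separated field). The excerpt asks about Perl; this is only a minimal sketch in Python, and the function name `keep_by_length` is hypothetical. It assumes the input comes from something like `samtools view -h in.bam`, so header lines (starting with `@`) are passed through unchanged.

```python
def keep_by_length(lines, min_len, max_len):
    """Yield SAM header lines unchanged, plus alignment lines whose
    SEQ length lies within [min_len, max_len] (inclusive)."""
    for line in lines:
        if line.startswith("@"):
            # Header line: keep it so the output is still valid SAM.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # Not a complete alignment record (SAM has 11 mandatory fields).
            continue
        seq = fields[9]
        # "*" means the sequence is not stored, so its length is unknown.
        if seq != "*" and min_len <= len(seq) <= max_len:
            yield line
```

Wrapped in a small script that reads `sys.stdin` and writes `sys.stdout`, this could sit in a pipeline between `samtools view -h` and `samtools view -b` to produce a filtered BAM.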
0
votes
2 answers

Extract user-specified sequence from reverse strand of FASTA file using samtools

I have a list of regions with start and end points. I used the samtools faidx ref.fa command. This command gave me the forward strand sequence for that region. In the samtools manual there is an option to extract reverse strand but I could…
Hsn SA
  • 1
  • 1
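
If the forward-strand sequence has already been extracted with `samtools faidx`, the reverse strand can also be derived afterwards by reverse-complementing it. The sketch below assumes plain DNA letters (A, C, G, T, N in either case) and leaves any other character untouched; `reverse_complement` is a hypothetical helper name. The manual option the excerpt mentions is documented in recent samtools releases as `-i` / `--reverse-complement` for `faidx`, but check the version you have installed.

```python
# Translation table mapping each base to its complement, preserving case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]
```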
0
votes
2 answers

Filter a file to find rows that match in one column but differ in another column

I would like to filter a file so that I can obtain rows that match in column 1 and do not match in column 2. In the following example: 00b27c71-a833-4605-9fb3-a2714ac98092 ENST00000352983.6 157 60 16 00d77e65-466e-4fe6-ad0f-bc6b3f44af75 …
csijcs
  • 47
  • 5
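
The excerpt is truncated, so the exact expected output is not known. Under the assumption that "match in column 1 and do not match in column 2" means "keep every row whose column-1 value occurs with more than one distinct column-2 value", a minimal Python sketch could look like this (the function name and the sample rows in the usage note are hypothetical):

```python
from collections import defaultdict

def rows_with_conflicting_col2(rows):
    """Return the rows whose first column is associated with more than
    one distinct value in the second column. `rows` must be a list of
    sequences (it is traversed twice)."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[0]].add(row[1])
    # Keep original order; drop keys that only ever map to one column-2 value.
    return [row for row in rows if len(seen[row[0]]) > 1]
```

For a whitespace-delimited file, each line can be turned into a row with `line.split()` before calling the function.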
0
votes
2 answers

BASH Making pileup files recursively using values piped from one column in another file

I'm trying to make pileup files using samtools from two files, File1 and File2. I have split up File1 and File2 by chromosome, resulting in having 44 files named following the…
Emm Gee
  • 165
  • 7
0
votes
0 answers

Does Picard MarkDuplicates toggle the PCR duplicate SAM flag?

I have an RNA-seq BAM file and there are a few reads that are puzzling me. According to the BAM header, this file is sorted by coordinate, was created using TopHat, and the MarkDuplicates step was not run. But some reads are marked as duplicates in the…
svural
  • 961
  • 1
  • 9
  • 17
0
votes
1 answer

How to output sed/samtools result into new directory

I have the following sed command that changes the chromosome name: for file in /myoldpath/*.bam; do filename=`echo $file | cut -d "." -f 1`; samtools view -H $file | sed -e 's/SN:\([0-9XY]\)/SN:chr\1/' -e 's/SN:MT/SN:chrM/' | samtools reheader -…
Alexis_543
  • 33
  • 8
0
votes
1 answer

How to use the Samtools HTSlib library to read optional info fields

I have a BAM file: ERR174327.487900 99 chr9 80320323 60 101M = 80320752 530 AGGGACATTGGTCCAAAAGGTTTTAATTAACCATACACCCTGCTCTACAAATCTAAAAAACTGTAGGACAGTATTTTGAGTCTCCAAGTATCCAGTGATAA …
ABCD
  • 7,914
  • 9
  • 54
  • 90
0
votes
1 answer

Filter RDD in Spark using class attribute provided by pysam

I am using pysam, a Python library, for reading BAM files within Spark. I created an RDD containing the "BAM" data. When I try to filter the data using the attribute query_sequence of the class AlignedSegment (pysam library), Spark crashes.…
alexa
  • 63
  • 1
  • 4
0
votes
2 answers

Mpileup regex command to remove indels

I am trying to filter out insertions and deletions from an mpileup txt file. An example of an insertion or deletion would be +3ATG or -9AATCGTCTC. In another post I found a solution using perl: regular expression that reference a match from earlier…
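
The difficulty with a plain regular expression here is that the number after `+` or `-` says how many bases belong to the indel, so a pattern such as `[+-][0-9]+[ACGT]+` can swallow real bases that follow it. A small scanner avoids that by reading the count and skipping exactly that many characters. This is a sketch under the assumption that the input is the read-bases column of a standard mpileup line, where `^` is followed by one mapping-quality character (which may itself be `+`, `-`, or a digit); `strip_indels` is a hypothetical name.

```python
def strip_indels(bases: str) -> str:
    """Remove +<n><seq> and -<n><seq> indel annotations from a pileup
    read-bases string, leaving all other symbols in place."""
    out = []
    i, n = 0, len(bases)
    while i < n:
        ch = bases[i]
        if ch == "^":
            # Read-start marker plus its mapping-quality character:
            # copy both so the quality char is never parsed as an indel.
            out.append(bases[i:i + 2])
            i += 2
        elif ch in "+-":
            j = i + 1
            while j < n and bases[j].isdigit():
                j += 1
            if j == i + 1:
                # No length follows; treat the sign as an ordinary symbol.
                out.append(ch)
                i += 1
            else:
                # Skip the sign, the digits, and exactly <n> inserted/deleted bases.
                i = j + int(bases[i + 1:j])
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For instance, with the two indels quoted in the excerpt, `strip_indels(".,+3ATG.,-9AATCGTCTC,.")` returns `".,.,,."`, and in `".+2AGG"` the trailing `G` is kept because only two bases belong to the insertion.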
0
votes
1 answer

dealing with paths in unix loops

Relative unix newbie. I have a number of directories (Sample*/), within which I want to merge all raw.sort.bam files using samtools. I have working code to do this within each directory, but I want to deal with all directories at once by running…
0
votes
1 answer

Samtools pysam mate

I am using pysam to do some data mining on .bam files. I want to check if a read has a mapped mate. The command mate = samfile.mate(read1) throws an error if the mate is not mapped, so if I do if samfile.mate(read1): ... that throws an error, too.…
user2725109
  • 2,286
  • 4
  • 27
  • 46
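
Whether a read has a mapped mate can be decided from the read's own FLAG field, without looking the mate up at all. In the SAM specification, bit `0x1` means the read is paired and bit `0x8` means the mate is unmapped. The sketch below works on the integer flag; with pysam this would typically be `read.flag`, and pysam also exposes the same bits as `read.is_paired` and `read.mate_is_unmapped`. The helper name `has_mapped_mate` is hypothetical.

```python
PAIRED = 0x1          # SAM FLAG bit: template has multiple segments
MATE_UNMAPPED = 0x8   # SAM FLAG bit: next segment (mate) is unmapped

def has_mapped_mate(flag: int) -> bool:
    """True if the read is paired and its mate is flagged as mapped."""
    return bool(flag & PAIRED) and not (flag & MATE_UNMAPPED)
```

Checking this before calling `samfile.mate(read1)` avoids triggering the error in the first place; catching the exception is the other common option.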
0
votes
1 answer

htsjdk intermittent indexed fasta read error

I have an issue with intermittently getting an exception from htsjdk.samtools.reference.IndexedFastaSequenceFile: htsjdk.samtools.SAMException: Sequence dictionary and index contain different numbers of contigs or htsjdk.samtools.SAMException:…
Kyle
  • 401
  • 4
  • 10
0
votes
1 answer

Awk is using command line arguments instead of column names

I am trying to create a script that removes read groups from the header of a sam file. The code, run from the command line is below. samtools view -H e2_20.indel.recal.dedup.bam | awk ' BEGIN {FS = "\t"} {split($2,a,":")} {if ($1 != "@RG" || ($1…
szimmerman
  • 21
  • 1
  • 2
0
votes
1 answer

Running samtools from a qsub

I'm trying to run some samtools commands from a qsub call (to run on a cluster). For some reason, the commands do not seem to be recognized. However, if I copy-paste the command and run it directly from the terminal cluster, it works fine. Has…