I used the Burrows-Wheeler Aligner (BWA) to map high-coverage short reads to a reference genome; the output is in .sam format. I also used a separate program to identify the loci in the reference genome at which microsatellites occur. I would like to find every locus at which the microsatellite length or position in the mapped reads differs from the reference genome. Does anyone know of tools or packages that can read a .sam/.bam file of short reads mapped to a reference genome and report the specific loci at which the reads differ from the reference? I am using RStudio and have access to my university's supercomputer clusters.

For info on microsatellites, see here: https://en.wikipedia.org/wiki/Microsatellite#:~:text=A%20microsatellite%20is%20a%20tract,locations%20within%20an%20organism's%20genome.
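
To make the goal concrete, the kind of check I have in mind can be sketched roughly as follows. This is only a minimal illustration in plain Python, not a tested pipeline: it reads uncompressed SAM text, walks each read's CIGAR string, and counts insertions and deletions that overlap a known microsatellite locus. The function names (`indel_events`, `indel_counts_at_loci`, `make_record`), the locus format (`(chrom, start, end)`, 1-based inclusive), and the example reads are all hypothetical. A .bam file would first have to be converted to SAM text, for example with `samtools view`.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_events(sam_line):
    """Yield (chrom, ref_start, ref_end, signed_length) for each indel in one SAM record.

    Coordinates are 1-based reference positions. A deletion spans the deleted
    bases; an insertion is reported at the reference base just before it.
    signed_length is positive for insertions and negative for deletions.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":  # unmapped read or no alignment available
        return
    ref_pos = pos
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "I":
            yield chrom, ref_pos - 1, ref_pos - 1, length
        elif op == "D":
            yield chrom, ref_pos, ref_pos + length - 1, -length
            ref_pos += length
        elif op in ("M", "N", "=", "X"):
            ref_pos += length
        # S, H and P do not consume reference bases.


def indel_counts_at_loci(sam_lines, loci):
    """Count indel events per locus, keyed by signed indel length.

    The overlap test pads the locus by one base on the left so that an
    insertion placed just before the first repeat base is still counted.
    """
    counts = {locus: Counter() for locus in loci}
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        for chrom, ev_start, ev_end, signed_len in indel_events(line):
            for locus_chrom, start, end in loci:
                if chrom == locus_chrom and ev_start <= end and ev_end >= start - 1:
                    counts[(locus_chrom, start, end)][signed_len] += 1
    return counts


def make_record(name, flag, chrom, pos, cigar, read_len):
    """Build a made-up SAM record with the 11 mandatory fields (illustration only)."""
    return "\t".join([name, str(flag), chrom, str(pos), "60", cigar,
                      "*", "0", "0", "A" * read_len, "I" * read_len])


# Invented example data: one header line, four reads, one microsatellite locus.
example_sam = [
    "@SQ\tSN:chr1\tLN:1000",
    make_record("r1", 0, "chr1", 100, "10M2D10M", 20),  # 2-base deletion
    make_record("r2", 0, "chr1", 95, "12M4I8M", 24),    # 4-base insertion
    make_record("r3", 0, "chr1", 100, "20M", 20),       # no indel
    make_record("r4", 4, "*", 0, "*", 20),              # unmapped read
]
example_loci = [("chr1", 105, 120)]
example_counts = indel_counts_at_loci(example_sam, example_loci)
print(example_counts)
```

The sketch only counts indel events, so it ignores reads that span a locus without any length change, base quality, mapping quality, and soft-clipped reads that may hide longer repeat expansions. A real analysis would need to handle those.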

Timur Shtatland
  • suggest you try https://bioinformatics.stackexchange.com/ seems extremely specific... – StupidWolf Jul 01 '20 at 20:17
  • there are packages for reading bam files, for example https://bioconductor.org/packages/release/bioc/html/GenomicAlignments.html or other packages in Bioconductor. but your question of how to find differences... that is a total mystery to me because i don't know what they are – StupidWolf Jul 01 '20 at 20:18
  • Thanks @StupidWolf ! I have re-posted the question to bioinformatics.stackexchange and will check out bioconductor – annabelperry Jul 01 '20 at 20:36

0 Answers