Aligning sequences and comparing them against a primer

Question

I want to show how well a primer is conserved across some genomic data. I have a primer of about 23 bp and want to compare it against about 5000 genomic sequences of roughly 10 kb each. Since aligning all of that is too much for my computer, I wanted to do the following:

>     1. Cut out the region where my primer is located, plus about 20 bp on each side.
>     2. Show only the bases that differ from my primer in my analysis.
>      ex: Primer: -----------ATGTGGAAGCAAATATCAAATGA---------
>          Gene:   ATGACCATACG----C--------------T---ATCGTAGGG
>                  ATGAGCATACC-----A----T--------T---TTCGAACGC
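
The two steps above can be sketched without a full multiple alignment. What follows is a minimal, hedged illustration in plain Python rather than Biostrings or msa. It assumes the primer site contains substitutions only (no insertions or deletions) and lies on the forward strand. The names `best_hit` and `diff_line`, the `flank` and `max_mismatch` parameters, and the toy sequences are all hypothetical, chosen only for this example. For each sequence it slides the primer along, keeps the window with the fewest mismatches, cuts out that window plus the flanks, and prints a dash wherever the base matches the primer.

```python
def best_hit(primer, seq):
    """Return (start, mismatches) for the window of seq closest to primer.

    Plain Hamming distance over every window: substitutions only, no indels.
    """
    best_start, best_mm = -1, len(primer) + 1
    for start in range(len(seq) - len(primer) + 1):
        window = seq[start:start + len(primer)]
        mm = sum(a != b for a, b in zip(primer, window))
        if mm < best_mm:
            best_start, best_mm = start, mm
    return best_start, best_mm


def diff_line(primer, seq, flank=20, max_mismatch=5):
    """Cut out the primer site plus flanks; show '-' where seq matches primer.

    Returns None when no window is within max_mismatch of the primer.
    """
    start, mm = best_hit(primer, seq)
    if start < 0 or mm > max_mismatch:
        return None
    end = start + len(primer)
    # Pad the left flank with spaces so all lines stay column-aligned
    # even when the hit is close to the start of the sequence.
    left = seq[max(0, start - flank):start].rjust(flank, " ")
    right = seq[end:end + flank]
    core = "".join("-" if a == b else b for a, b in zip(primer, seq[start:end]))
    return left + core + right


primer = "ATGTGGAAGCAAATATCAAATGA"
# Toy sequences made up for illustration, not real dengue data.
toy = {
    "seq1": "ATGACCATACG" + "ATGTCGAAGCAAATATCAAATGA" + "ATCGTAGGG",
    "seq2": "ATGAGCATACC" + "ATGTGAAAGCTAATATCAAATGA" + "TTCGAACGC",
}
flank = 11
print("Primer: " + "-" * flank + primer + "-" * flank)
for name, seq in toy.items():
    print(f"{name}:   {diff_line(primer, seq, flank=flank)}")
```

In pure Python this scan would be slow for 5000 sequences of 10 kb; it is meant to show the logic, not to be the tool for the full data set. If indels inside the primer site are possible, the Hamming-distance search would need to be replaced by a proper pairwise alignment.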

The data are dengue virus sequences (all serotypes), and the primer sequence is ATGTGGAAGCAAATATCAAATGA.

I tried to use the msa() function and look only at the part of the gene of interest. However, this was difficult, because to locate that part accurately the sequences would already need to be aligned.

I was also thinking of cutting out an arbitrary number of bases around that part of the gene and aligning only that, but I could not figure out how to present the result properly. I also thought others might have suggestions for a better way to do it.

I am using the Biostrings, msa, and seqinr packages. I download the sequences from NCBI as FASTA files.

Thanks!

More suitable to https://bioinformatics.stackexchange.com/ – zx8754, Sep 10 '18 at 07:28
Hi Gregor, thanks! I threw it on there. – Colin, Sep 10 '18 at 16:53
