I have a human gene with a virus gene integrated into it, stored as a data frame or text file, like this:
"C""G""C""T""G""T""T""G""T""T"...
It is 50,000 nucleotides long. I also have the virus gene as a data frame, and I have already calculated its mean frequency and standard deviation. I'm trying to find the approximate location of the virus gene by dividing the human gene into fragments of 1,000 nucleotides each and comparing each fragment's frequency and standard deviation with the values I have for the virus gene.
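
To make the idea concrete, here is a rough Python sketch of what I have in mind. It rests on several assumptions that are mine and may not match the right statistics: "frequency" is taken to mean the fraction of each base (A, C, G, T) in a sequence, the standard deviation is taken over those four fractions, and fragments are ranked by Euclidean distance to the virus base fractions. The function names (`parse_sequence`, `base_frequencies`, `fragment_scores`, `best_fragment`) are hypothetical helpers, not part of any library.

```python
from collections import Counter
from statistics import pstdev

BASES = "ACGT"

def parse_sequence(text):
    """Keep only A/C/G/T characters, dropping quotes and whitespace."""
    return "".join(ch for ch in text.upper() if ch in BASES)

def base_frequencies(seq):
    """Return the fraction of each base in seq, in A, C, G, T order."""
    counts = Counter(seq)
    total = len(seq) or 1  # avoid division by zero on an empty sequence
    return [counts[b] / total for b in BASES]

def fragment_scores(host, virus_freqs, size=1000):
    """Score each non-overlapping fragment of the host sequence.

    Returns (start, distance, stdev) tuples, where distance is the
    Euclidean distance between the fragment's base fractions and the
    virus base fractions (an assumed similarity measure). The last
    fragment may be shorter than `size`.
    """
    scores = []
    for start in range(0, len(host), size):
        freqs = base_frequencies(host[start:start + size])
        dist = sum((f - v) ** 2 for f, v in zip(freqs, virus_freqs)) ** 0.5
        scores.append((start, dist, pstdev(freqs)))
    return scores

def best_fragment(host, virus, size=1000):
    """Return the start index of the fragment closest to the virus profile."""
    scores = fragment_scores(host, base_frequencies(virus), size)
    return min(scores, key=lambda s: s[1])[0]
```

With the text-file form above, `parse_sequence` would strip the quotes first, and `best_fragment(host, virus)` would then return the 0-based start position of the most virus-like 1,000-nucleotide fragment. Non-overlapping fragments can split the virus gene across a boundary, so this only gives an approximate location.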