A genome is the entirety of an organism's DNA sequence. It includes both the genes and the non-coding sequences, such as repeats, introns, and regulatory sequences, whose functions may be known or unknown.
Questions tagged [genome]
230 questions
2 votes, 1 answer
Python Regex to Extract Genome Sequence
I’m trying to use a Python Regular Expression to extract a genome sequence from a genome database; I’ve pasted a snippet of the database below.
>GSVIVT01031739001 pacid=17837850 polypeptide=GSVIVT01031739001 locus=GSVIVG01031739001…

MojaveAzure
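For the regex question above, here is a minimal sketch of one possible approach, assuming the database is FASTA-formatted text (a `>` header line followed by sequence lines). The header reuses only the visible part of the excerpt; the residues, the second record, and the names `SAMPLE`, `RECORD_RE`, and `parse_records` are invented for illustration and are not the asker's data or code.

```python
import re

# Hypothetical FASTA-style text. The first header mirrors the visible part of
# the excerpt; the sequence residues and the second record are made up.
SAMPLE = """>GSVIVT01031739001 pacid=17837850 polypeptide=GSVIVT01031739001 locus=GSVIVG01031739001
MKTAYIAKQR
QISFVKSHFS
>EXAMPLE0002 pacid=0 locus=EXAMPLE
ACGTACGT
"""

# One record: a '>' header line, then everything up to the next '>' or the end.
RECORD_RE = re.compile(r"^>(?P<header>[^\n]*)\n(?P<body>[^>]*)", re.MULTILINE)


def parse_records(text):
    """Return {record_id: (attributes, sequence)} for each record in text."""
    records = {}
    for match in RECORD_RE.finditer(text):
        fields = match.group("header").split()
        record_id = fields[0]
        # Remaining header fields are assumed to be key=value pairs.
        attrs = dict(f.split("=", 1) for f in fields[1:] if "=" in f)
        # Join the wrapped sequence lines into one string.
        sequence = "".join(match.group("body").split())
        records[record_id] = (attrs, sequence)
    return records


records = parse_records(SAMPLE)
```

For very large files, reading line by line (or using an established FASTA parser) is likely a better fit than running one regex over the whole text.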
2 votes, 1 answer
What is the syntax to instantiate a structured dtype in numpy?
If I have a dtype like
foo = dtype([('chrom1', '

traeki
1 vote, 2 answers
Can I parse hg19.2bit with php?
I know this is possibly an obscure use for php, but I'm working on an idea to navigate the human genome in a rather interesting way.
The problem is I need to know if I can write a php script to parse the freely available data, and if so how would I…

T9b
1 vote, 1 answer
Nextflow No such variable: id
I'm trying to write my first code with Nextflow. I'm providing 2 paired reads and I want to run the bbduk function, but I don't know why my code doesn't work.
I tried the following code:
#!/usr/bin/env nextflow
/*
* Pipeline Metagenomics,…

Adría Cruells
1 vote, 0 answers
Generate Random Permutations of Genomic Ranges using Nullranges (matchedranges or bootranges)
I want to generate 200 random genomicranges that are 200kbp long each that can occur anywhere in the genome. I was recommended to try using nullranges, but I haven't figured out how to specify only generating 200 ranges / iteration. I think it takes…

erman
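The sampling task in the question above can be sketched in plain Python, independent of nullranges. This is only an illustration under stated assumptions: the chromosome names and lengths in `CHROM_SIZES` are hypothetical toy values, `random_ranges` is an invented helper, coordinates are 0-based half-open, and the ranges are allowed to overlap one another.

```python
import random

# Hypothetical toy chromosome lengths (not a real genome assembly).
CHROM_SIZES = {"chrA": 5_000_000, "chrB": 3_000_000, "chrC": 1_000_000}


def random_ranges(chrom_sizes, n=200, width=200_000, seed=None):
    """Draw n ranges of fixed width, uniformly over all valid placements."""
    rng = random.Random(seed)
    # Keep only chromosomes long enough to hold one range.
    eligible = {c: s for c, s in chrom_sizes.items() if s >= width}
    names = list(eligible)
    # Weight each chromosome by its number of valid start positions,
    # so every possible placement in the genome is equally likely.
    weights = [eligible[c] - width + 1 for c in names]
    ranges = []
    for _ in range(n):
        chrom = rng.choices(names, weights=weights, k=1)[0]
        start = rng.randrange(eligible[chrom] - width + 1)
        ranges.append((chrom, start, start + width))  # half-open interval
    return ranges


ranges = random_ranges(CHROM_SIZES, seed=0)
```

Calling `random_ranges` once per iteration with a different seed gives an independent set of 200 ranges each time.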
1 vote, 1 answer
Why is my SPAdes not working on Nextflow?
SPAdes is not working in Nextflow for some reason, even though I already have it installed.
I used the following code, but it doesn't seem to work. Can anyone please help point out where the problem is?
#!/usr/bin/env…

Terra
1 vote, 1 answer
ggplot: Any way to only draw x axis border starting from 0?
I'm trying to add a border to the axis of this ggplot but it extends past the 0Mb mark and I would like it to start there. Is there a way to start it at 0 or have it covered up by a white line in the negative direction so that it doesn't show? I…

Mark Pampuch
1 vote, 1 answer
Retrieve mRNA sequence based on DNA coordinates
I have a list of genomic DNA coordinates (hg38), and I want to retrieve the corresponding mRNA sequence 200 bp up/downstream of these positions. Any ideas?
Thank you.
I have tried the Table Browser; it is easy to get all codon sequences based on coordinates,…

user20649250
1 vote, 1 answer
How do I rewrite this expected depth of (genome) coverage function in R?
I need to draw the probability density for a random position, given fragment length = 600, genome size = 3 × 10^9, and number of reads = 10 million.
depth_of_coverage <- function(genome = 3E9, fragment_length = 600, reads = 10E6) {
depth <- 0
…

ibnadam
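As a rough illustration of the coverage question above (in Python rather than the asker's R, and not a reconstruction of the truncated function), the sketch below assumes reads land uniformly at random across the genome, so that the depth at a random position is approximately Poisson distributed with mean reads × fragment length / genome size. The helper names `expected_depth` and `depth_pmf` are invented here.

```python
import math


def expected_depth(genome=3e9, fragment_length=600, reads=10e6):
    """Mean depth of coverage: total sequenced bases divided by genome size."""
    return reads * fragment_length / genome


def depth_pmf(k, mean_depth):
    """Poisson probability that a random position is covered exactly k times.

    Assumption: uniformly random read placement (Poisson approximation).
    """
    return math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)


mean_depth = expected_depth()
# Probabilities for depths 0..10, ready to be plotted as a bar chart.
pmf = [depth_pmf(k, mean_depth) for k in range(11)]
```

With the values from the question, the mean depth works out to 10 million × 600 / (3 × 10^9) = 2.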
1 vote, 1 answer
How to determine characteristics for a genome?
In AI, are there any simple and/or very visual examples of how one could implement a genome into a simulation?
Basically, I'm after a simple walkthrough (not a tutorial, but rather something of a summarizing nature) which details how to implement a…

Marcus Hansson
1 vote, 0 answers
MuscleCommandLine non-zero return code 1/is not recognized as an internal or external command,
I am trying to align 4 different sequences using MuscleCommandLine. This code works perfectly on Anaconda and Mac, but I am trying to make it work on Windows and I am having several issues.
muscle_exe = r'../muscle3.8.31_i86darwin64.exe'
in_file =…

Harr1ls
1 vote, 1 answer
Tab file mixes up columns when loading into R
I am trying to load data into R, but some rows do not load correctly. I have run into this issue many times, but when I load the files in Excel, they work fine. Please help me if you know the reason.
Thank you very much!
library(RCurl)
URL <-…

Trinh Phan-Canh
1 vote, 0 answers
How to set up a Seurat object from a gz file?
I am trying to follow the Seurat tutorial found here: https://satijalab.org/seurat/articles/pbmc3k_tutorial.html
The PBMC raw data from the tutorial downloads to my computer as: pbmc3k_filtered_gene_bc_matrices.tar.gz
I am having trouble uploading…

Lucia Wagner
1 vote, 3 answers
Can anyone tell me how to replace strings with floats in an np.array (of several genotypes) by frequency per column?
I have an np.array matrix (1826 × 5000) where the rows are my samples and the columns are the features.
That means each row holds one genotype, with the individual nucleotides as strings,
like this:
[['G' 'G' 'G' ... 'T' 'T' 'A']
['G' 'G' 'G' ...…

Python NoHand
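One reading of the question above is that each nucleotide should be replaced by its relative frequency within its own column. Under that assumption (the excerpt is truncated, so the asker's exact intent is not confirmed), a small sketch with a made-up 4 × 3 matrix could look like this; `genotypes` and `column_frequencies` are illustrative names.

```python
import numpy as np

# Made-up toy data: 4 samples (rows) by 3 features (columns).
genotypes = np.array([
    ["G", "G", "T"],
    ["G", "A", "T"],
    ["A", "G", "T"],
    ["G", "G", "A"],
])


def column_frequencies(matrix):
    """Replace each entry by the fraction of its column that shares its value."""
    freqs = np.empty(matrix.shape, dtype=float)
    for j in range(matrix.shape[1]):
        column = matrix[:, j]
        # inverse maps each entry to the index of its unique value,
        # so counts[inverse] gives the count of each entry's own value.
        _, inverse, counts = np.unique(
            column, return_inverse=True, return_counts=True
        )
        freqs[:, j] = counts[inverse] / column.size
    return freqs


freqs = column_frequencies(genotypes)
```

The loop runs once per column, so for a 1826 × 5000 matrix it makes 5000 calls to `np.unique`, which should be manageable.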
1 vote, 1 answer
How to optimize my Python FASTA parser script to make it run faster on Slurm?
I hope I am posting in the right place.
My script runs fine on small genomes, but it takes hours or even days when working with mammalian genomes. I have tried many different things, but I'm out of ideas. Can you tell me what causes this script to be so…

CitronWorld