Questions tagged [ncbi]

NCBI is the National Center for Biotechnology Information, which hosts some of the most important websites used by bioinformaticians. It runs a wide variety of bioinformatics web services and provides important databases for download.

NCBI covers a wide range of bioinformatics resources, from journal listings and gene alignments to chemical library databases and protein folding prediction.

NCBI's data is publicly available from the main website and from FTP repositories.

  • PubMed, a database of citations and abstracts for biomedical literature from MEDLINE and additional life science journals.

  • The NCBI C++ Toolkit, a set of modules to access, modify, generate and deposit biological data. The full description can be read in its online book.

  • PubChem, a chemical library database with its own API to search and retrieve chemical compounds.
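
As a rough illustration of the PubChem API mentioned above, the sketch below builds a PUG REST URL that looks up compound IDs (CIDs) by name. The helper names `pubchem_cids_url` and `fetch_cids` are hypothetical, chosen only for this example, and the JSON layout read in `fetch_cids` is an assumption about the typical response shape rather than a guarantee.

```python
import json
import urllib.request
from urllib.parse import quote

# Base address of PubChem's PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_cids_url(compound_name: str) -> str:
    """Build a PUG REST URL that asks for the CIDs matching a compound name.

    Hypothetical helper for illustration; the name is percent-encoded so that
    spaces and slashes do not break the URL path.
    """
    encoded = quote(compound_name, safe="")
    return f"{PUG_REST_BASE}/compound/name/{encoded}/cids/JSON"


def fetch_cids(compound_name: str, timeout: float = 30.0) -> list:
    """Fetch CIDs for a compound name (needs network access).

    Assumes the response looks like {"IdentifierList": {"CID": [...]}};
    returns an empty list if that structure is missing.
    """
    url = pubchem_cids_url(compound_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("IdentifierList", {}).get("CID", [])


if __name__ == "__main__":
    # Only the URL is printed here; call fetch_cids("aspirin") to query PubChem.
    print(pubchem_cids_url("aspirin"))
```

Building the URL separately from the network call keeps the request easy to inspect or paste into a browser before any script depends on it.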

205 questions
1
vote
0 answers

Problems extracting metadata from NCBI in R

I am trying to extract some information (metadata) from GenBank using the R package "rentrez" and the example I found here https://ajrominger.github.io/2018/05/21/gettingDNA.html. Specifically, for a particular group of organisms, I search for all…
1
vote
1 answer

Unable to download data using Aspera

I am trying to download data from the European Nucleotide Archive (ENA) using the Aspera CLI; however, my downloads keep stalling. I have downloaded several files before using the same tool, but this has been happening for the past month. I usually use…
1
vote
1 answer

How do I find the nucleotide sequence of a protein using Biopython?

I have proteins for which I would like to find their corresponding nucleotide sequences. I also have the genome in which the protein is found. In the genome, I have found the corresponding Gene ID for the protein. However, I am having trouble…
Cindy Fang
  • 41
  • 5
1
vote
1 answer

Extract matching pattern from input file and print to output file in Perl

I have a huge input file from NCBI blastn in this form:
    Job Title: otu0 Database: rRNA_typestrains/prokaryotic_16S_ribosomal_RNA 16S ribosomal RNA (Bacteria and Archaea) Query #1: otu0 Query ID:…
RebiKirl
  • 83
  • 1
  • 7
1
vote
1 answer

Entrez eFetch Accession Number

We are currently working on a project where we need to access the 'NP_' accession number from ClinVar. However, when we use the Entrez.eFetch( ) function, this information appears to be missing in the result. Here is a link to the website page where…
1
vote
0 answers

Does Biopython have an argument to "Exclude Uncultured samples" and "Sequences from type material"?

I am using Biopython to do BLAST searches on bacterial strains. However, a normal BLAST search returns ALL strains, and I only want the well-annotated strains (not the whole data dump that comes with a normal query). Normal Solution: The…
Barry
  • 11
  • 3
1
vote
0 answers

Error from getSRAfile function in SRAdb package

I am trying to download RNA-Seq data from the NCBI SRA repository using the SRAdb package. I consistently get the following error: getSRAfile( in_acc = c("SRR2033366", "SRR2033446"), sra_con = sra_con, destDir = getwd(), fileType = 'sra', srcType…
NBN
  • 11
  • 1
1
vote
1 answer

How to download _full_ RefSeq record using Efetch?

I have a problem downloading a full record from Nucleotide db. I use: from Bio import Entrez from Bio import SeqIO with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as handle: seq_record = SeqIO.read(handle, "gb")…
Some student
  • 131
  • 2
  • 13
1
vote
2 answers

R error in connecting to NCBI to access protein sequences using "read.GenBank"

I'm trying to access protein sequence data from NCBI in R using the function read.GenBank: e.g. ref.proteins <- c("XP_005327622", "XP_026241994", "NP_001107354", " XP_007536378", "NP_001268234 XP_004712197", "XP_017531808",…
1
vote
0 answers

Which homebrew package for "LWP::Protocol::https not installed" on macOSX?

I'm trying to download a fasta file from NCBI with the following perl-based pipeline: esearch -db nuccore -query "\"\(internal transcribed spacer 1\"[All Fields] AND \(300[SLEN] : 600[SLEN]\)\) NOT \"uncultured Neocallimastigales\"[porgn] NOT…
Camouf0079
  • 11
  • 3
1
vote
2 answers

Error: Too many positional arguments (1) when using BLAST w/ a bash for loop

I'm trying to write a script that will go through all of the directories within a directory where it will query a specific sequence against a local blast database. I've run the BLAST search without the bash for loop and used a for loop to create the…
1
vote
1 answer

How to download gene expression data from NCBI gene database

In the NCBI Gene database, I can add the expression tracks (circled in the picture below) through the 'Tracks' button, but how can I download the expression data directly, rather than just looking at the picture?
YudongCai
  • 19
  • 2
1
vote
1 answer

NCBI C++ exception (in the function GetSeqEntry())

Using NCBI tools on Windows 10, I entered the following command on the command line to get a PSSM: psiblast -in_msa 1.sequence.txt -db nr -comp_based_stats 0 -out_ascii_pssm seqpssm.txt but I got a C++ exception: Error: NCBI C++ Exception: T0…
charlie freak
  • 21
  • 1
  • 6
1
vote
1 answer

Bio.Blast (NCBIWWW) on multiple sequences fails and stalls

I am trying to parse a few dozen sequences through BLAST, using Bio.Blast with NCBIWWW, in Python 2.7. Not a problem there with one or a couple sequences, but the NCBIWWW.qblast() always stops after about 5-7 iterative BLAST searches. Importantly,…
DrOrpheum
  • 23
  • 6
1
vote
1 answer

Get NCBI taxIDs under a given taxID

Somewhat similar to this thread: How can I get taxonomic rank names from taxid? I have a taxID for a genus and I want to pull all taxIDs or accession numbers below that genus. Can anyone advise?