Questions tagged [pubchem]

Free database of chemical structures of small organic molecules and information on their biological activities

PubChem is a database of chemical molecules and their activities against biological assays. The system is maintained by the National Center for Biotechnology Information (NCBI), a component of the National Library of Medicine, which is part of the United States National Institutes of Health (NIH). PubChem can be accessed for free through a web user interface. Millions of compound structures and descriptive datasets can be freely downloaded via FTP. PubChem contains 162 million substance descriptions and small molecules. More than 80 database vendors contribute to the growing PubChem database.

35 questions
1
vote
0 answers

rpubchem error - collecting data from pubchem

This is my first try at using R to collect data from PubChem. However, I am getting the following error every time, for every CID that I have used. library("rpubchem") get.cid(46926545) Error in do.call(cbind, unlist(Filter(function(x)…
Sidd
  • 11
  • 1
1
vote
1 answer

extracting sdf files from pubchem matching a monoisotopic mass

I'm trying to extract the chemical structures, in SDF format, of compounds in the PubChem database that match a certain exact mass within a range of 10 ppm of that exact mass, i.e. ((exactmass - cmpndmass) / exactmass) * 10^6. Is there a way to achieve this…
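The ppm criterion in this excerpt can be illustrated with a short Python sketch. The function names and the example masses are hypothetical, chosen only for illustration, and the code covers just the arithmetic: it does not query PubChem or read SDF files.

```python
def ppm_error(exact_mass, compound_mass):
    """Signed deviation of compound_mass from exact_mass, in parts per million."""
    return (exact_mass - compound_mass) / exact_mass * 1e6


def mass_window(exact_mass, tolerance_ppm=10.0):
    """Return the (low, high) mass bounds that lie within tolerance_ppm of exact_mass."""
    delta = exact_mass * tolerance_ppm / 1e6
    return exact_mass - delta, exact_mass + delta


def within_tolerance(exact_mass, compound_mass, tolerance_ppm=10.0):
    """True if compound_mass deviates from exact_mass by at most tolerance_ppm."""
    return abs(ppm_error(exact_mass, compound_mass)) <= tolerance_ppm


if __name__ == "__main__":
    # Illustrative values only: a target mass of 200.0 and two candidate masses.
    low, high = mass_window(200.0, tolerance_ppm=10.0)
    print(f"10 ppm window around 200.0: {low:.4f} to {high:.4f}")
    print(within_tolerance(200.0, 200.001))  # 5 ppm away from the target
    print(within_tolerance(200.0, 200.01))   # 50 ppm away from the target
```

Converting the ppm tolerance to an absolute mass window first means each candidate needs only a simple range comparison.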
1
vote
1 answer

R: parse JSON/XML exported compound properties from Pubchem

I would like to parse all chemical properties of a given compound as given in Pubchem in R, using the JSON (or XML) export facility. Example: ALPHA-IONONE, pubchem compound ID 5282108 https://pubchem.ncbi.nlm.nih.gov/compound/5282108…
Tom Wenseleers
  • 7,535
  • 7
  • 63
  • 103
0
votes
3 answers

How to extract all the IUPAC names mentioned in the data available from Pubchem(NCBI) into a text file?

I want to build lists of prefixes and suffixes of some length from all the IUPAC names mentioned in the PubChem database, so that I can use them further in my project as a feature. So I want all the IUPAC chemical names in a text file or in some format…
0
votes
1 answer

How to extract 'Odor' information from PubChem using BeautifulSoup

I wrote the following Python code to extract 'odor' information from PubChem for a particular molecule; in this case the molecule nonanal (CID=31289). The webpage for this molecule is: https://pubchem.ncbi.nlm.nih.gov/compound/31289#section=Odor import…
John Mommers
  • 140
  • 7
0
votes
0 answers

programmatic access to pubchem bioassay data in R

I'm trying to loop over a list of 11,500 PubChem CIDs to retrieve the BioAssay results table (when available). For example, for the CID 2965821, this is the table I want to obtain. I only need the rows where activity is "Active". Following this…
xsrt
  • 1
  • 1
0
votes
1 answer

Error when trying to use df.merge: "You are trying to merge on object and int64 columns"

I'm currently trying to write a program that takes a chemical compound's identifier (something called the CID number) and then gives back the compound's properties by using the pubchempy documentation. However, I keep getting an error when I try to…
0
votes
2 answers

How to use pandas' df.get function for a dataframe column so that each row in the column maintains its own value?

To summarize as concisely as I can, I have a data file containing a list of chemical compounds along with their ID numbers ("CID" numbers). My goal is to use pubchempy's pubchempy.get_properties function along with pandas' df.map function to…
0
votes
1 answer

Is there a way to apply a function to all the values in a column and then replace the column values with the new values?

Essentially I'm reading a csv file containing a bunch of chemical compounds and I'm trying to apply the pubchempy.get_properties function to the CID column that contains the CID (identifier) number of each of the chemical compounds. However, I can't…
0
votes
0 answers

IndexError: list index out of range. For Stitch API, when I do API using python, for a few inputs I get output, only the last rows have index error

I am new to programming. I am trying to find targets for chemicals using the STITCH API. When I run the code, I get the output for some of the inputs in the list. But at the end, a few lines show the index error quoted above (like if I have 10 input IDs,…
Boo
  • 13
  • 4
0
votes
0 answers

How to speed up the request to the server? (PUGREST.Timeout)

I ran into a problem related to the server request time. In some cases (for example C2H4), it gives the result after 5-10 seconds (too slow, though); in other cases (for example C9H8O4) it fails with a timeout error. Obviously, the dates of both…
0
votes
1 answer

Download json-file from pubchem

My task is to download a JSON file from a website (PubChem) using only the query string (h2o, for example) and JS. I know it's possible to do with parsing, but this is too much code because of the number of pages I need to parse to get to the destination. Is…
0
votes
1 answer

seeming simple rpubchem interactive function produces error

The following lines of R code produce the errors in italics. It would seem to be an rpubchem error, unless I'm doing something stupid: require(rpubchem); get.aid.by.cid(614467, type="raw") Output: ***Warning messages: 1: In readLines(icon,…
B Koch
  • 1
0
votes
0 answers

Pubchem compound database into oracle

Does anyone know an easy way to import all the data from the PubChem compound database into an Oracle database (11g)? I haven't found any solution that solves my problem.
bladepit
  • 853
  • 5
  • 14
  • 29
-1
votes
2 answers

Get molecules from PubChem which have an Exact Mass e.g. 1176.784 +/- 0.01 Dalton by using Python

I wrote the following code to find all molecules in PubChem which have an ExactMass of, in this case, 1176.784 +/- 0.01 Da. I get a "request failed" error [code 400]. The URL should be OK; I checked the PubChem documentation, however I can't find the…
John Mommers
  • 140
  • 7