I am trying to extract data from NCBI using different functions in the rentrez
package. However, the function extract_from_esummary() returns a matrix, and when I save
it to a .csv file, the text of a single field is split across adjacent columns (as shown in the image) because the "," inside the text is recognized as a delimiter.
library(rentrez)
PM.ID <- c("25979833", "25667274", "23792568", "22435913")
p.data <- entrez_summary(db = "pubmed", id = PM.ID)
pubrecord.table <- extract_from_esummary(esummaries = p.data,
                                         elements = c("uid", "title", "fulljournalname",
                                                      "pubtype"))
In the image example above, in the column for PMID 25979833, the journal name spills into the next column: "European journal of cancer (Oxford" is in one column and "England : 1990)" is in the next. When I ran dput(pubrecord.table), I saw that the split happens where the words are separated by a comma ",". How can I make R understand that "European journal of cancer (Oxford, England : 1990)" belongs in a single column? The Title and Pubtype fields have a similar issue: the long text contains a comma, and it gets broken up in the CSV output. How can I clean the data so that each value ends up in the appropriate column?