I have built a list and want to pull out only certain fields from each of its elements and save the result as a data frame. However, when I use sapply, it returns the correct data for some elements but not for others.

Here is a reproducible example:

#Creating an example list

library(KEGGREST)
chem_info <- keggGet(dbentries = c("C01546", "C05984"))

#Extracting what I need from the chem_info list. I need the ENTRY (compound ID), FORMULA, and PATHWAY fields. For parts of this step I followed the directions in "How to extract elements from a list with mixed elements".

a <- as.list(t(sapply(chem_info, '[', c(1,3))))
b <- sapply(chem_info, '[[', 7)
cbind(a, Pathway = b)

#My result and the problem: the Pathway column shows 'character,3' for the first entry instead of the contents of its PATHWAY element.

     ENTRY    FORMULA  Pathway
[1,] "C01546" "C5H4O3" character,3
[2,] "C05984" "C4H8O3" "Propanoate metabolism"

What I want is for everything under each entry's PATHWAY element to appear in the Pathway column. For example, the first entry, C01546, has three elements under PATHWAY: Furfural degradation, Metabolic pathways, and Microbial metabolism in diverse environments. I need all three listed in the Pathway column. Here is an example of the output I'm hoping to achieve:

     ENTRY  FORMULA Pathway
[1,] C01546 C5H4O3  Furfural degradation, Metabolic pathways, and Microbial metabolism in diverse environments
[2,] C05984 C4H8O3  "Propanoate metabolism"

Any help would be immensely appreciated!

Purrsia

1 Answer

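The final line below should do the trick. Because the PATHWAY fields have different lengths, sapply() cannot simplify them and returns b as a list, so cbind() displays the length-3 element as character,3. Collapse each element into a single string before combining with cbind().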

cbind(a, Pathway = lapply(b, function(x) paste(x, collapse = ", ")))
     ENTRY    FORMULA  Pathway                                                                                 
[1,] "C01546" "C5H4O3" "Furfural degradation, Metabolic pathways, Microbial metabolism in diverse environments"
[2,] "C05984" "C4H8O3" "Propanoate metabolism"          
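
Since the question asks for a data frame and cbind() here returns a list-matrix, the following is a minimal sketch of one way to build a data.frame directly. It assumes each entry is a list with ENTRY, FORMULA, and PATHWAY elements; the chem_info below is a hand-written mock standing in for the keggGet() result so the example runs without KEGGREST or network access, and get_field is a hypothetical helper name, not part of any package.

```r
# Mock of the relevant parts of the keggGet() output (an assumption made for
# illustration; the real entries contain more fields).
chem_info <- list(
  list(ENTRY   = c(Compound = "C01546"),
       FORMULA = "C5H4O3",
       PATHWAY = c("Furfural degradation",
                   "Metabolic pathways",
                   "Microbial metabolism in diverse environments")),
  list(ENTRY   = c(Compound = "C05984"),
       FORMULA = "C4H8O3",
       PATHWAY = "Propanoate metabolism")
)

# Hypothetical helper: pull one field by name from every entry and collapse
# multi-valued fields into a single comma-separated string.
# Entries that lack the field yield NA instead of an error.
get_field <- function(entries, field) {
  vapply(entries, function(e) {
    value <- e[[field]]
    if (is.null(value)) NA_character_ else paste(value, collapse = ", ")
  }, character(1))
}

# Every column is now a plain character vector of equal length,
# so data.frame() can combine them without list cells.
chem_df <- data.frame(
  ENTRY   = get_field(chem_info, "ENTRY"),
  FORMULA = get_field(chem_info, "FORMULA"),
  Pathway = get_field(chem_info, "PATHWAY"),
  stringsAsFactors = FALSE
)

print(chem_df)
```

Selecting fields by name rather than by position (1, 3, 7) is a design choice in this sketch: it avoids picking the wrong field if an entry has a different set of elements, and vapply() with character(1) fails loudly if a field does not collapse to a single string.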
Josh White