I have a complicated XML file whose first-level child nodes are items. The items can have different structures, and some attributes are missing from some of them. I need to store each item (a nodeset) in one tibble row, so that I can keep track of the missing attributes and write a function that handles all variants.
I found a solution for the first step by Felix Ebert: https://stackoverflow.com/questions/49253021/how-to-extract-xml-attr-and-xml-text-on-different-levels-with-xml2-and-purrr
I copy part of the code here:
xml <- xml2::read_xml("input/example.xml")
rows <- xml %>% xml_find_all("//xmlsubsubnode")
rows_df <- data_frame(node = rows)
The function data_frame() has been deprecated, and I get error messages if I replace it with any of the following:
tibble()
as_tibble()
data.frame()
With tibble() I get the following error:
df_articles <- tibble(item = xml_articles)
Error:
! All columns in a tibble must be vectors.
✖ Column `item` is a `xml_nodeset` object.
Backtrace:
1. tibble::tibble(item = xml_articles)
2. tibble:::tibble_quos(xs, .rows, .name_repair)
3. tibble:::check_valid_col(res, col_names[[j]], j)
4. tibble:::check_valid_cols(set_names(list(x), name))
I would be grateful if anybody could update the solution from the original post so that it works with current versions of tibble.
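For reference, here is a minimal sketch of one possible workaround, not a confirmed update of the original answer. The error suggests that tibble() no longer treats an xml_nodeset as a vector, so the sketch wraps each node in a plain list and stores that as a list-column. The inline XML, the element name item, and the attribute names id and type are hypothetical stand-ins for input/example.xml; xml_attr() and xml_find_first() are used because they return NA (or a missing node) where an attribute or child is absent.

```r
library(xml2)
library(tibble)

# Hypothetical stand-in for input/example.xml (names are assumptions)
doc <- read_xml('<root>
  <item id="1" type="a"><title>First</title></item>
  <item id="2"><title>Second</title></item>
  <item type="c"/>
</root>')

items <- xml_find_all(doc, "//item")

# Convert the nodeset to a plain list of xml_node objects,
# which tibble() accepts as a list-column (one node per row)
node_list <- lapply(seq_along(items), function(i) items[[i]])
items_df <- tibble(item = node_list)

# Extract attributes and child text; missing values come back as NA
attrs_df <- tibble(
  id    = xml_attr(items, "id"),
  type  = xml_attr(items, "type"),
  title = xml_text(xml_find_first(items, "./title"))
)
```

Keeping the nodes in a list-column preserves the row-per-item layout of the original code, so per-row functions (for example via purrr::map()) can still be applied to items_df$item; the second tibble shows the vectorised alternative when the set of attributes is known in advance.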