I want to combine multiple XML strings (more than 1000) into one string in R. This can be done, for example, with the xml2 package (xml_add_sibling). However, I would like to get rid of the intermediate root nodes ("positions" in my example).

Input:

library(xml2)
position1 <- read_xml("<positions>
  <moneyMarket>
    <positionName>1</positionName>
    <notional>10000</notional>
    <currency>EUR</currency>
  </moneyMarket>
</positions>")

position2 <- read_xml("<positions>
  <moneyMarket>
    <positionName>2</positionName>
    <notional>40000</notional>
    <currency>EUR</currency>
  </moneyMarket>
</positions>")

position3 <- read_xml("<positions>
  <moneyMarket>
    <positionName>3</positionName>
    <notional>50000</notional>
    <currency>EUR</currency>
  </moneyMarket>
</positions>")

Code:

combined_XML <- xml_add_sibling(position1,position2)
combined_XML <- xml_add_sibling(combined_XML,position3)

Actual results:

<positions>
  <moneyMarket>
    <positionName>1</positionName>
    <notional>10000</notional>
    <currency>EUR</currency>
  </moneyMarket>
</positions>
<positions>
  <moneyMarket>
    <positionName>2</positionName>
    <notional>40000</notional>
    <currency>EUR</currency>
  </moneyMarket>
</positions>
<positions>
  <moneyMarket>
    <positionName>3</positionName>
    <notional>50000</notional>
    <currency>EUR</currency>
  </moneyMarket>
</positions>

Expected results:

<positions>
  <moneyMarket>
    <positionName>1</positionName>
    <notional>10000</notional>
    <currency>EUR</currency>
  </moneyMarket>
  <moneyMarket>
    <positionName>2</positionName>
    <notional>40000</notional>
    <currency>EUR</currency>
  </moneyMarket>
  <moneyMarket>
    <positionName>3</positionName>
    <notional>50000</notional>
    <currency>EUR</currency>
  </moneyMarket>
</positions>
Thomas
  • You may want to have a look at [this question.](https://stackoverflow.com/questions/43461907/in-r-how-do-i-combine-two-xml-documents-into-one-document) – maydin Jul 23 '19 at 10:24
  • I have looked at that question before but I don't see how this will help. – Thomas Jul 25 '19 at 10:29

1 Answer

I took the example data, which consists of three XML documents named position1, position2 and position3. Since each name is "position" followed by a number, I used get() to look each document up by name. I set i <- 3 because there are three XML documents.

If you have 1000 XML documents, set i <- 1000 instead. This assumes they are named position1, position2, position3, position4, ..., position1000.

The code below adds the children of each document to the one before it, so position1 ends up holding all of them. At the end, running xmlParse(position1) prints the result.

  library(xml2)  
  library(XML)

  position1 <- "<positions>
                  <moneyMarket>
                    <positionName>1</positionName>
                    <notional>10000</notional>
                    <currency>EUR</currency>
                  </moneyMarket>
                </positions>"

  position2 <- "<positions>
                  <moneyMarket>
                    <positionName>2</positionName>
                    <notional>40000</notional>
                    <currency>EUR</currency>
                  </moneyMarket>
                </positions>"

  position3 <- "<positions>
                  <moneyMarket>
                    <positionName>3</positionName>
                    <notional>50000</notional>
                    <currency>EUR</currency>
                  </moneyMarket>
                </positions>"


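  # Parse each string into an xml_document
  position1 <- read_xml(position1)
  position2 <- read_xml(position2)
  position3 <- read_xml(position3)

  # Number of documents, named position1 ... position<i>
  i <- 3

  # Work backwards: copy the children of position<i> into position<i-1>,
  # so that position1 ends up holding every <moneyMarket> node
  while (i > 1) {
    mychildren <- xml_children(get(paste0("position", i)))
    for (child in mychildren) {
      xml_add_child(get(paste0("position", i - 1)), child)
    }
    i <- i - 1
  }

  xmlParse(position1)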

Output:

  <?xml version="1.0" encoding="UTF-8"?>
  <positions>
     <moneyMarket>
       <positionName>1</positionName>
       <notional>10000</notional>
       <currency>EUR</currency>
     </moneyMarket>
     <moneyMarket>
       <positionName>2</positionName>
       <notional>40000</notional>
       <currency>EUR</currency>
     </moneyMarket>
     <moneyMarket>
       <positionName>3</positionName>
       <notional>50000</notional>
       <currency>EUR</currency>
     </moneyMarket>
 </positions>
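
With 1000+ inputs it may be easier to keep the documents in a list than to look up numbered variables with get(). The following is a minimal sketch of the same approach, assuming the inputs are character strings that all share the same root element. `combine_positions` is a hypothetical helper name, not part of xml2, and the shortened input strings are only for illustration.

```r
library(xml2)

# Hypothetical helper: copy the children of every document's root
# into the root of the first document and return that document.
combine_positions <- function(xml_strings) {
  docs <- lapply(xml_strings, read_xml)
  combined <- docs[[1]]
  for (doc in docs[-1]) {
    kids <- xml_children(doc)
    for (k in seq_along(kids)) {
      # xml_add_child() copies an xml_node by default (.copy = TRUE)
      xml_add_child(combined, kids[[k]])
    }
  }
  combined
}

# Shortened example inputs (assumption: same structure as in the question)
xml_strings <- c(
  "<positions><moneyMarket><positionName>1</positionName></moneyMarket></positions>",
  "<positions><moneyMarket><positionName>2</positionName></moneyMarket></positions>",
  "<positions><moneyMarket><positionName>3</positionName></moneyMarket></positions>"
)

combined <- combine_positions(xml_strings)
cat(as.character(combined))
```

Calling as.character() on the result gives back a single string, which matches what the question asks for.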
maydin
  • Thanks for your reply! However, when I execute `read_xml(position1)` I get the following error: _Error in UseMethod("read_xml") : no applicable method for 'read_xml' applied to an object of class "c('xml_document', 'xml_node')"_. This might be because position1 is already an XML document. But when I don't use read_xml, I get the following error in the loop: _Error in UseMethod("nodeset_apply") : no applicable method for 'nodeset_apply' applied to an object of class "NULL"_. – Thomas Jul 30 '19 at 07:54
  • What do you get when you run `class(position1)`? – maydin Jul 30 '19 at 08:22
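
The first error quoted above is what `read_xml()` reports when it is given an object whose class is already `xml_document` instead of a string. One way to make the parsing step tolerate both cases is sketched below. `as_xml_doc` is a hypothetical helper name and the inputs are made up for illustration. The sketch does not address the second error.

```r
library(xml2)

# Hypothetical helper: return an xml_document whether `x` is XML text
# or has already been parsed.
as_xml_doc <- function(x) {
  if (inherits(x, "xml_document")) x else read_xml(x)
}

already_parsed <- read_xml("<positions><moneyMarket/></positions>")
still_text <- "<positions><moneyMarket/></positions>"

doc_a <- as_xml_doc(already_parsed)  # returned unchanged
doc_b <- as_xml_doc(still_text)      # parsed from the string
```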