I would like to extract the HTML content of a tag in R. For instance, in the following HTML,
<html><body>Hi <b>name</b></body></html>
suppose I'd like to extract the content of the <body> tag, which would be:
Hi <b>name</b>
In this question, the answer (using as.character()) will include the enclosing tag, which is not what I want. For example,
library(rvest)
html = '<html><body>Hi <b>name</b></body></html>'
read_html(html) |>
html_element('body') |>
as.character()
returns the outerHTML:
[1] "<body>Hi <b>name</b>\n</body>"
...but I want the innerHTML. How can I get the content of an HTML tag in R, without the enclosing tag?