How to select all nodes which have no attributes using rvest?

Question

Using rvest, how can I select nodes that have no attributes?

For example:

<nodes>
    <node attribute1="aaaa"></node>
    <node attribute1="bbbb"></node>
    <node></node> <- FIND THIS
</nodes>

Here is a related thread using XPath, but when I try in rvest with something similar to

wp %>% read_html() %>% html_nodes(xpath = "//node[not(@*)")

where wp is the desired URL, I get the following warning:

Warning message:
In xpath_search(x$node, x$doc, xpath = xpath, nsMap = ns, num_results = Inf) :
  Invalid predicate [1206]

even though I can see in the page source that the node I want to scrape has no attributes.

To be frank, I just don't know enough about web development and HTML to understand how to generalize this example to rvest's documentation. Any help or resources would be much appreciated!

EDIT:

The correct code to achieve this in rvest is

wp %>% read_html() %>% html_nodes(xpath = "//node[not(@*)]")

score 1 · Accepted Answer · answered Jun 26 '19 at 18:22

It looks like you are just missing a closing square bracket:

library(rvest)

"<nodes>
    <node attribute1=\"aaaa\" attribute2=\"cccc\"></node>
    <node attribute1=\"bbbb\"></node>
    <node></node>
</nodes>" %>% 
  read_html() %>% 
  html_nodes(xpath = "//node[not(@*)]")

gives

{xml_nodeset (1)}
[1] <node></node>
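
If the XPath predicate feels opaque, the same selection can probably be cross-checked on the R side by pulling every node and keeping those whose attribute vector is empty. This is only a sketch under the assumption that the toy document above is representative; the variable names (doc, by_xpath, all_nodes, by_attrs) are made up for illustration.

```r
library(rvest)

# Same toy document as above
doc <- read_html(
  "<nodes>
     <node attribute1=\"aaaa\" attribute2=\"cccc\"></node>
     <node attribute1=\"bbbb\"></node>
     <node></node>
   </nodes>"
)

# XPath route: keep only <node> elements that carry no attribute at all
by_xpath <- html_nodes(doc, xpath = "//node[not(@*)]")

# R-side route: select every <node>, then keep those with zero attributes
all_nodes <- html_nodes(doc, "node")
by_attrs <- all_nodes[lengths(html_attrs(all_nodes)) == 0]

# Both routes should pick out the same number of nodes
length(by_xpath) == length(by_attrs)
```

The XPath version is shorter and does the filtering in one step; the html_attrs() version is mostly useful for inspecting which attributes each node actually has when a predicate does not match what you expect.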

the-mad-statter


I am so sorry to have wasted your time! The thrown error made me think it was a logical error. Thank you so much. I appreciate you kindly letting me know it was just a syntactic error. – cgibbs_10 Jun 26 '19 at 18:44
