I scrape information with rvest and store it in a data frame. All the information on the various institutions and their context characteristics is stored in one string. It looks similar to JSON, but it isn't. I followed another Stack Overflow post, but without success. I think string manipulation should do the job. In the end, "title", "street", "number", etc. should be variables (columns), and each institution should be a row. Thank you very much.
library('tidyverse')
library('rvest')
library('stringr')
library('stringi')
library('jsonlite')
rubyhash <- "https://www.blutspenden.de/blutspendedienste/#" %>%
read_html() %>%
html_nodes("body") %>%
html_nodes("script:first-of-type") %>%
html_text() %>%
as_tibble() %>%
slice(1)
substr(rubyhash$value,1,150)
"\n var instituionsmap_data = '[{\"title\":\"Plasmazentrum Heidelberg\",\"street\":\"Hans-B\\u00f6ckler-Stra\\u00dfe\",\"number\":\"2A\",\"zip\":\"69115\",\"city\":\""
rubyhash$json <- str_replace(rubyhash$value, "var instituionsmap_data =", "")
rubyhash$json <- trimws(rubyhash$json)
substr(rubyhash$json,1,150)
"'[{\"title\":\"Plasmazentrum Heidelberg\",\"street\":\"Hans-B\\u00f6ckler-Stra\\u00dfe\",\"number\":\"2A\",\"zip\":\"69115\",\"city\":\"Heidelberg\",\"phone\":\"06221 89466960"
fromJSON(rubyhash$json)
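
For reference, here is a minimal sketch of the result I am aiming for, run on a shortened copy of the string rather than on the live page. It assumes the value is ordinary JSON wrapped in a JavaScript assignment with single quotes. The closing `}]';` in `raw` is my guess at how the real value ends, and the names `raw`, `json_txt`, and `institutions` are only for illustration.

```r
library(jsonlite)

# Shortened copy of the scraped string. The closing `}]';` is an assumption
# about how the real value ends; only its beginning is printed above.
raw <- "\n var instituionsmap_data = '[{\"title\":\"Plasmazentrum Heidelberg\",\"street\":\"Hans-B\\u00f6ckler-Stra\\u00dfe\",\"number\":\"2A\",\"zip\":\"69115\",\"city\":\"Heidelberg\"}]';\n"

# Strip the JavaScript assignment and the opening single quote ...
json_txt <- sub("^\\s*var\\s+instituionsmap_data\\s*=\\s*'", "", raw, perl = TRUE)
# ... then the closing single quote, an optional semicolon, and trailing whitespace.
json_txt <- sub("'\\s*;?\\s*$", "", json_txt, perl = TRUE)

# What is left should be a plain JSON array of objects, which fromJSON()
# simplifies to a data frame: one row per institution, one column per key.
institutions <- fromJSON(json_txt)
str(institutions)
```

If the real script contains further statements after the assignment, the second `sub()` would probably not be enough and the pattern would have to be adapted.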