I am trying to scrape some data from the following website: https://www.evaluation.it/aziende/bilanci-aziende. I have not been able to write a loop that does this automatically for each firm.
I would like to select all firms in the tab called "Italia", download all the balance sheet data (from 2017 to 2021), and add a column with the name of the firm.
The following code works well for a single firm:
library(rvest)
library(dplyr)

link <- "https://www.evaluation.it/aziende/bilanci-aziende/a2a/"
page <- read_html(link)

# Firm name
azienda <- page %>%
  html_nodes(".big_text1:nth-child(1) i") %>%
  html_text()

# Balance sheet item labels (row names of the table)
voce <- page %>%
  html_nodes(".text-left") %>%
  html_text()

# One vector per year: the k-th child holds the year's values,
# and the k-th matched element is dropped from each vector.
bil_2017 <- page %>%
  html_nodes(".text-right :nth-child(2)") %>%
  html_text()
bil_2017 <- bil_2017[-2]

bil_2018 <- page %>%
  html_nodes(".text-right :nth-child(3)") %>%
  html_text()
bil_2018 <- bil_2018[-3]

bil_2019 <- page %>%
  html_nodes(".text-right :nth-child(4)") %>%
  html_text()
bil_2019 <- bil_2019[-4]

bil_2020 <- page %>%
  html_nodes(".text-right :nth-child(5)") %>%
  html_text()
bil_2020 <- bil_2020[-5]

bil_2021 <- page %>%
  html_nodes(".text-right :nth-child(6)") %>%
  html_text()
bil_2021 <- bil_2021[-6]

# Combine everything and add the firm name as a column
bilancio <- data.frame(voce, bil_2017, bil_2018, bil_2019, bil_2020, bil_2021,
                       stringsAsFactors = FALSE)
bilancio$azienda <- azienda
However, as you can see, it only works for the first firm.
Can you help me write a loop or a function to get the data for each firm?
At the end I want a dataset for each firm and a dataset with all firms appended.
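To make the goal concrete, this is roughly the shape I have in mind. It is an untested sketch, not a working solution. It assumes that every firm page has the same layout as the a2a page, and that firm links on the index page can be recognised by the pattern /aziende/bilanci-aziende/<firm>/ in their href, which is a guess and does not restrict the result to the "Italia" tab. The function names scrape_bilancio, get_firm_links and scrape_all are made up for illustration.

```r
library(rvest)
library(dplyr)

# Scrape one firm page, reusing the selectors from the single-firm code.
scrape_bilancio <- function(link) {
  page <- read_html(link)
  azienda <- page %>% html_nodes(".big_text1:nth-child(1) i") %>% html_text()
  voce <- page %>% html_nodes(".text-left") %>% html_text()
  anni <- 2017:2021
  # Child k = 2..6 corresponds to 2017..2021; the k-th match is dropped,
  # exactly as in the single-firm code.
  cols <- lapply(seq_along(anni), function(i) {
    k <- i + 1
    x <- page %>%
      html_nodes(paste0(".text-right :nth-child(", k, ")")) %>%
      html_text()
    x[-k]
  })
  names(cols) <- paste0("bil_", anni)
  bilancio <- data.frame(voce = voce, cols, stringsAsFactors = FALSE)
  bilancio$azienda <- azienda
  bilancio
}

# ASSUMPTION: firm pages are linked from the index page with hrefs that end
# in /aziende/bilanci-aziende/<firm>/ . This does not filter by tab.
get_firm_links <- function(index_url) {
  hrefs <- read_html(index_url) %>% html_nodes("a") %>% html_attr("href")
  hrefs <- hrefs[grepl("/aziende/bilanci-aziende/[^/]+/?$", hrefs)]
  unique(xml2::url_absolute(hrefs, index_url))
}

# Apply a scraper to every link; pages that fail are skipped with a message.
scrape_all <- function(links, scrape_one = scrape_bilancio, pause = 1) {
  per_firm <- lapply(links, function(link) {
    Sys.sleep(pause)  # short pause between requests
    tryCatch(scrape_one(link), error = function(e) {
      message("Skipping ", link, ": ", conditionMessage(e))
      NULL
    })
  })
  names(per_firm) <- basename(links)  # e.g. ".../a2a/" becomes "a2a"
  per_firm <- Filter(Negate(is.null), per_firm)
  list(per_firm = per_firm, all_firms = bind_rows(per_firm))
}

# Usage (needs network access):
# links  <- get_firm_links("https://www.evaluation.it/aziende/bilanci-aziende/")
# result <- scrape_all(links)
# result$per_firm[["a2a"]]  # dataset for one firm
# result$all_firms          # all firms appended
```

Here per_firm would be a named list with one data frame per firm, and all_firms the same data frames stacked with bind_rows, so the azienda column tells the firms apart.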
Thanks for your help!