While learning Rust, I am trying to build a simple web scraper. My aim is to scrape https://news.ycombinator.com/ and get each story's title, hyperlink, votes, and username. I am using the external crates reqwest and scraper for this, and so far I have a program that extracts only the link (the href attribute) of each story on that page.
Cargo.toml
[package]
name = "stackoverflow_scraper"
version = "0.1.0"
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
scraper = "0.12.0"
reqwest = "0.11.2"
tokio = { version = "1", features = ["full"] }
futures = "0.3.13"
src/main.rs
use scraper::{Html, Selector};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let url = "https://news.ycombinator.com/";
    // Download the front page as a string.
    let html = reqwest::get(url).await?.text().await?;
    // The response is a complete page, so parse it as a document rather than a fragment.
    let document = Html::parse_document(&html);
    // Each story link is an <a class="storylink"> element.
    let selector = Selector::parse("a.storylink").unwrap();
    for element in document.select(&selector) {
        println!("{:?}", element.value().attr("href").unwrap());
        // todo println!("Title");
        // todo println!("Votes");
        // todo println!("User");
    }
    Ok(())
}
How do I get each link's corresponding title, votes, and username?