I'm learning C# by writing a small program, and I couldn't find a similar post (apologies if this question has already been answered somewhere else).
How might I go about screen-scraping a website for links to PDFs (which I can then download to a specified location)? Sometimes a page links to another HTML page that contains the actual PDF link, so if a PDF can't be found on the first page, I'd like the program to automatically look for a link that has "PDF" in its text and then search the resulting HTML page for the real PDF link.
I know I could probably achieve something similar with a filetype: search on Google, but that seems like "cheating" to me :) I'd rather learn how to do it in code, but I'm not sure where to start. I'm a little familiar with XML parsing using XElement and such, but I'm not sure how to apply that to getting links out of an HTML page (or some other format?).
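For what it's worth, here's a rough sketch of the kind of thing I'm imagining, using the third-party HtmlAgilityPack library (which I've only read about, so treat the example URL, the save folder, and the overall approach as assumptions on my part rather than tested code):

    using System;
    using System.IO;
    using System.Net;
    using HtmlAgilityPack; // third-party HTML parser, installed via NuGet

    class PdfLinkScraper
    {
        static void Main()
        {
            // Placeholder values -- these would come from user input in the real program.
            string startUrl = "http://example.com/documents.html";
            string saveFolder = @"C:\Temp\Pdfs";

            ScrapePage(startUrl, saveFolder, followLinks: true);
        }

        static void ScrapePage(string pageUrl, string saveFolder, bool followLinks)
        {
            var doc = new HtmlWeb().Load(pageUrl);
            var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
            if (anchors == null) return; // page has no links at all

            foreach (HtmlNode anchor in anchors)
            {
                string href = anchor.GetAttributeValue("href", "");
                var absolute = new Uri(new Uri(pageUrl), href); // resolve relative URLs

                if (absolute.AbsolutePath.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase))
                {
                    // Direct PDF link: download it to the target folder.
                    string fileName = Path.GetFileName(absolute.AbsolutePath);
                    using (var client = new WebClient())
                        client.DownloadFile(absolute, Path.Combine(saveFolder, fileName));
                }
                else if (followLinks &&
                         anchor.InnerText.IndexOf("PDF", StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    // Link whose text mentions "PDF": search that page too (one level deep).
                    ScrapePage(absolute.ToString(), saveFolder, followLinks: false);
                }
            }
        }
    }

The followLinks flag is just my attempt at keeping the second-page search one level deep so it doesn't end up crawling the whole site, but I don't know if that's the right way to structure it.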
Could anyone point me in the right direction? Thanks!