I am taking a course on Udemy that had me build a web scraper for Craigslist. I typed the code exactly as the teacher showed in his video, and when he runs it on screen it works. However, when I run it, it comes back with no matches. Here is the main body of the project:

using (WebClient client = new WebClient())
{
    string content = client.DownloadString($"https://{craigslistCity.Replace(" ", string.Empty)}.craigslist.org/{Method}/{craigslistCategoryName}");

    ScrapeCriteria scrapeCriteria = new ScrapeCriteriaBuilder()
        .WithData(content)
        //.WithRegex(@"href=\""(.*?)\""")
        //.WithRegex(@"<a class=\""(.*?)\"" href=\""(.*?)\"" title=\""(.*?)\""> </a>")
        .WithRegex(@"<a href=\""(.*?)\"" data-pid=\""(.*?)\"" class=\""result-title hdrlnk\"">(.*?)</a>")
        .WithRegexOption(RegexOptions.ExplicitCapture)
        .WithPart(new ScrapeCriteriaPartBuilder()
            .WithRegex(@">(.*?)</a>")
            .WithRegexOption(RegexOptions.Singleline)
            .Build())
        .WithPart(new ScrapeCriteriaPartBuilder()
            .WithRegex(@"href=\""(.*?)\""")
            .WithRegexOption(RegexOptions.Singleline)
            .Build())
        .Build();

    Scraper scraper = new Scraper();

    var scrapedElements = scraper.Scrape(scrapeCriteria);

    if (scrapedElements.Any())
    {
        foreach (var scrapedElement in scrapedElements) Console.WriteLine(scrapedElement);
        Console.ReadLine();
    }
    else
    {
        Console.WriteLine("There were no matches for the specified scrape criteria.");
        Console.ReadLine();
    }
}

When I pause in the debugger right after content is assigned and inspect it, it is filled with the HTML from that page. However, the scraped result always comes back empty. In the middle you will see three WithRegex lines stacked on top of one another; the one that is not commented out is the line from the course. The other two are my attempts, after looking at the HTML of that page, to find a pattern that works. If you need to see a specific class, let me know.
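To separate the regex from the rest of the pipeline, the course's pattern can be run directly against a small piece of markup. The following is only a minimal sketch: the class name RegexCheck, the helper CountMatches, and both sample anchor tags are hypothetical (they are not copied from the live page), and the second sample merely stands in for "markup whose attribute order differs from what the pattern expects".

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCheck
{
    // The pattern from the course, unchanged.
    public const string CoursePattern =
        @"<a href=\""(.*?)\"" data-pid=\""(.*?)\"" class=\""result-title hdrlnk\"">(.*?)</a>";

    // Counts whole-pattern matches. ExplicitCapture only disables unnamed
    // capture groups; the overall match is still reported.
    public static int CountMatches(string html, string pattern)
    {
        return Regex.Matches(html, pattern, RegexOptions.ExplicitCapture).Count;
    }

    public static void Main()
    {
        // Hypothetical markup shaped the way the course's pattern expects.
        string expectedShape =
            "<a href=\"https://example.org/1.html\" data-pid=\"1\" class=\"result-title hdrlnk\">Studio</a>";

        // Hypothetical markup with a different attribute order.
        string differentShape =
            "<a class=\"result-title hdrlnk\" href=\"https://example.org/1.html\">Studio</a>";

        Console.WriteLine(CountMatches(expectedShape, CoursePattern));  // prints 1
        Console.WriteLine(CountMatches(differentShape, CoursePattern)); // prints 0
    }
}
```

If the same helper returns 0 when given the downloaded content, the page markup presumably no longer has the shape the pattern assumes, and the problem would lie in the pattern rather than in the Scraper classes.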

I have been using New York for the city and aap for the category.
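For reference, this is roughly how those inputs turn into the request URL. It is a sketch only: the helper BuildUrl is hypothetical, and "search" is an assumed value for Method, which is not shown in the snippet above.

```csharp
using System;

public static class UrlCheck
{
    // Mirrors the interpolated string passed to DownloadString:
    // spaces are stripped from the city, the other parts are used as-is.
    public static string BuildUrl(string city, string method, string category)
    {
        return $"https://{city.Replace(" ", string.Empty)}.craigslist.org/{method}/{category}";
    }

    public static void Main()
    {
        // "search" is an assumption; the real value of Method is not shown.
        Console.WriteLine(BuildUrl("New York", "search", "aap"));
        // prints https://NewYork.craigslist.org/search/aap
    }
}
```

Note that Replace removes the space but keeps the capital letters; host names are case-insensitive, so this should not affect the request.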

djblois