I am new to web development, and part of my Web API 2 training is to create an API controller that uses an HttpClient and an HttpContent to read an entire website and return only a specific portion of it as text.

When I call that website, the response comes back to me as an object. As far as I understand, this is the “public API” I am supposed to use.

From that response I am asked to return only a specific portion as text to the user.

I have spent hours googling and trying to implement it using regular expressions, Newtonsoft.Json objects, etc., but I cannot seem to break the response down.

Is there a way to accomplish that? What am I missing? Please advise!

    // Requires: using System.Net.Http; using System.Threading.Tasks;
    public async Task<string> GetAsync()
    {
        using (HttpClient client = new HttpClient())
        using (HttpResponseMessage response = await client.GetAsync("https://www.lipsum.com/"))
        {
            // Throw if the site did not return a 2xx status code.
            response.EnsureSuccessStatusCode();

            // The body is the page's raw HTML, not JSON.
            return await response.Content.ReadAsStringAsync();
        }
    }

Extract and store only this piece of text

Sotiris
  • Parsing the returned HTML page has nothing to do with HttpClient or Web API. Regex can be used to extract specific, well defined snippets. In other cases you may need to use an HTML parser like [AngleSharp](https://github.com/AngleSharp/AngleSharp). Without any idea what the HTML looks like or what you want to extract one can't offer any help. – Panagiotis Kanavos Sep 20 '19 at 11:07

1 Answer

What I needed was an HTML parser, and for that I used AngleSharp.
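
As a rough sketch of what that can look like (not necessarily the exact code I ended up with): parse the HTML string with AngleSharp, then pick the element with a CSS selector and read its text. The `LipsumExtractor` class, the `ExtractText` helper, the sample HTML and the `#lipsum` selector are all made up for illustration, so inspect the real page to find the selector you actually need. This assumes the AngleSharp NuGet package, version 0.10 or later, where `HtmlParser.ParseDocument` is available.

```csharp
using System;
using AngleSharp.Dom;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;

public static class LipsumExtractor
{
    // Hypothetical helper: returns the trimmed text of the first element
    // matching cssSelector, or null when nothing matches.
    public static string ExtractText(string html, string cssSelector)
    {
        var parser = new HtmlParser();
        IHtmlDocument document = parser.ParseDocument(html);
        IElement element = document.QuerySelector(cssSelector);
        return element?.TextContent.Trim();
    }

    public static void Main()
    {
        // Stand-in for the string returned by ReadAsStringAsync();
        // the markup and the id are assumptions, not the real page.
        string html = "<html><body><h1>Title</h1>"
                    + "<div id=\"lipsum\"><p>Lorem ipsum dolor sit amet.</p></div>"
                    + "</body></html>";

        Console.WriteLine(ExtractText(html, "#lipsum"));
    }
}
```

In the controller, the string returned by `ReadAsStringAsync()` would be passed in place of the sample HTML, and the action would return the extracted text instead of the whole page.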

I want to thank @Panagiotis Kanavos for his clarification and feedback.

Sotiris