Hi, I'm sending a request to a web page through HttpClient using this code:

Imports System.Net
Imports System.Net.Http

Public Class Form1
   Dim client As HttpClient = New HttpClient()
   Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click


       ' Set User-Agent header
       client.DefaultRequestHeaders.UserAgent.ParseAdd("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36")

       ' Send request and receive response
       Dim response As HttpResponseMessage = client.GetAsync("https://ikalogs.ru/tools/map/?page=1&server=22&world=10&state=active&search=city&allies%5B1%5D=&allies%5B2%5D=&allies%5B3%5D=&allies%5B4%5D=&nick=Bauer&ally=&island=&city=&x=&y=").Result
       If response.IsSuccessStatusCode Then
           ' Get response content as string
           Dim vcmode As String = response.Content.ReadAsStringAsync().Result

           ' Print response content
           RichTextBox1.Text = vcmode
       End If

   End Sub
End Class

I'm using the latest user agent, taken from https://www.useragentstring.com/, but the response keeps saying the browser is out of date:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
   <title>Browser is out of date!</title>
   <meta charset="utf-8">
   <meta name="description" content="Ikalogs - Saving battle reports" />
   <meta name="author" content="ZigFreeD" />
   <meta name="keywords" content="ikalogs, логовица, анализатор, ikariam" />
   <meta name="copyright" content="(c) 2012-2015 by ZigFreeD"/>
   <meta http-equiv="pragma" content="no-cache"/>
   <meta name="language" content=""/>
   <base target=”_blank” />
   <noscript>
       <meta http-equiv="refresh" content="0; url=/default/jsdisabled/">
   </noscript>
   <style>
       *{
           margin: 0;
           padding: 0;
           color: #542c0f;
           font: 700 20px/27px Arial,sans-serif;
           text-align: center;
           text-decoration: none;
       }

       body{
           background: url(/themes/default/img/themes/error/background.jpg) #eedbb2;
       }

       .errorBrowser{
           width: 500px;
           min-height: 190px;
           position: fixed;
           left: 50%;
           top: 50%;
           margin: -95px 0 0 -250px;
       }

       .errorIcoDesk{
           font: italic 700 14px/18px Arial,sans-serif;
           background: url(/themes/default/img/themes/browsers.png) no-repeat top left;
           width: 50px;
           float: left;
           padding: 100px 25px 0;
           cursor: pointer;
           filter:progid:DXImageTransform.Microsoft.Alpha(opacity=80);
           opacity: .8;
       }

       .errorIcoDesk:hover{
           filter: progid:DXImageTransform.Microsoft.Alpha(opacity=100);
           opacity: 1;
       }

       .errorFi{background-position: -100px 0;}
       .errorCh{background-position: -200px 0;}
       .errorOp{background-position: -300px 0;}
       .errorEx{background-position: -400px 0;}

   </style>
</head>
<body>

<div class="errorBG">
   <div class="errorBrowser">
       <h1>Your browser is outdated or does not support the necessary technology for the operation of the site.</h1>
       <a href="//www.apple.com/safari/">
           <div class="errorIcoDesk errorSa">
               Apple Safari
           </div>
       </a>
       <a href="//www.mozilla.com/firefox/">
           <div class="errorIcoDesk errorFi">
               Mozilla Firefox
           </div>
       </a>
       <a href="//www.google.com/chrome/">
           <div class="errorIcoDesk errorCh">
               Google Chrome
           </div>
       </a>
       <a href="//www.opera.com/">
           <div class="errorIcoDesk errorOp">
               Opera
           </div>
       </a>
       <a href="//ie.microsoft.com/">
           <div class="errorIcoDesk errorEx">
               Internet Explorer
           </div>
       </a>
   </div>
</div>

</body>
</html>

I've been trying different user agents, but this is the latest one I can find online, and it seems quite weird that the page doesn't accept it. I think I'm doing something wrong when adding the user agent to the header.

Jimi
Mattia
  • Using `Result` this way is likely to result in a deadlock at some point. You already have the `Sub` declared as `Async`, so you should `Await` the async httpclient routines. Also see https://blog.stephencleary.com/2012/07/dont-block-on-async-code.html – Craig Feb 09 '23 at 15:39

1 Answer

Quite important: never use .Result or .Wait() on this platform. Blocking on async calls from the UI thread can deadlock; Await them instead.

You often need to configure your HttpClient a bit more.
It's always better to add a CookieContainer and specify that cookies should be used (even though that's the default behavior when a CookieContainer exists).
It's also better to add headers stating that decompression is supported and which decompression methods are handled; otherwise you may get back garbage (not in this specific case, but, well, since you're there...).

Now the User-Agent can be almost anything that's recognized.

Here, I'm using lazy initialization, as Lazy<T>(Func<T>).
I find it useful both because it allows you to configure the HttpClient object in-line, adding a configured HttpClientHandler with a lambda, and because you may create the class that contains the HttpClient but never actually use it (the object is initialized only when you request Lazy<T>.Value).

Private Shared ReadOnly client As Lazy(Of HttpClient) = New Lazy(Of HttpClient)(
    Function()
        Dim client = New HttpClient(CreateHandler(True), True) With {.Timeout = TimeSpan.FromSeconds(60)}
        client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36")
        client.DefaultRequestHeaders.Add("Cache-Control", "no-cache")
        client.DefaultRequestHeaders.Add("Accept-Encoding", "gzip, deflate")
        Return client
    End Function
 )

Private Shared Function CreateHandler(autoRedirect As Boolean) As HttpClientHandler
    Return New HttpClientHandler() With {
        .AllowAutoRedirect = autoRedirect,
        .AutomaticDecompression = DecompressionMethods.GZip Or DecompressionMethods.Deflate,
        .CookieContainer = New CookieContainer(),
        .UseCookies = True  ' Default here. To be clear...
    }
End Function

After that, your HttpClient's requests can act almost like a web browser's (unless the website challenges your request and waits for a response, using HSTS and various other tricks; then there's nothing you can do).

Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Dim url = "https://..."
    Dim response As HttpResponseMessage = Await client.Value.GetAsync(url)
    If response.IsSuccessStatusCode Then
        RichTextBox1.Text = Await response.Content.ReadAsStringAsync()
    Else
        Debug.WriteLine(response.StatusCode)
    End If
End Sub

Protected Overrides Sub OnFormClosed(e As FormClosedEventArgs)
    client.Value.Dispose()
    MyBase.OnFormClosed(e)
End Sub

As a note, the Web Site you have in your post is not secure

Jimi
  • Hi Jimi, thanks a lot for your answer! The code works and no longer shows the browser error, but it doesn't seem to return the complete HTML. I'm looking for the text at the bottom of the page: "Search results Nick Alliance City main.Level [X:Y] Island Resourse Wonder Not found". The response doesn't seem to contain any of that markup; I also checked for an IFrame and the page doesn't seem to contain one either... How do you suggest I proceed? – Mattia Feb 09 '23 at 13:43
  • In what way do you think that website is not secure? It uses HTTPS and presents a valid certificate. Anyway, I only need to check whether a player is in vacation mode or not :)... Thanks, Jimi – Mattia Feb 09 '23 at 13:44
  • I answered the question you asked, i.e., how to prevent the Site from *banning* your request. If the Site generates the content of this page dynamically, using scripts or other server-side procedures, HttpClient (or any other non-interactive tool) cannot help you; you need a WebBrowser that executes the scripts and/or updates IFrames, allowing push requests from the server – Jimi Feb 09 '23 at 13:51
  • Oh damn, this is the one time I need to go back to a WebBrowser... I'll try with a WebBrowser that extracts the HTML once the web page is loaded (it shouldn't be such a big deal, actually). You have been amazing once again with your answer! Keep rocking! – Mattia Feb 09 '23 at 13:53
  • But anyway, I need to go back to .NET Framework, as the WebBrowser control is not accessible in .NET 6+ – Mattia Feb 09 '23 at 13:57
  • Thanks :) -- No, you don't. To handle scripted pages, you can use a headless [WebView2](https://learn.microsoft.com/en-us/microsoft-edge/webview2/) object, which is the new WebBrowser Control. The IE-based WebBrowser was deprecated a while ago (since IE itself is out of support) – Jimi Feb 09 '23 at 13:58
  • I've just figured out that getting the HTML this way doesn't fulfill my needs. That's my fault, not yours... but I found another way to do it, using the same website, and I'm wondering whether your code can get the response of the HttpClient request. As I recorded in a gif [link](https://imgur.com/J0jW3RQ), I get only the HTML of the web page. I would like to get the response body instead and then parse it. I've tried with this code [link](https://pastebin.com/HtYvTJv3) but I'm getting a JSON error. I'm wondering what the best way to achieve this is. Thanks!!! – Mattia Feb 10 '23 at 01:36
  • I'm not sure what you're asking here. The URI you have posted (PasteBin) doesn't return a JSON, it returns a page with a map. It also appears you need to register to interact with the map's settings. So, I suppose you can't. As mentioned, you may need a WebBrowser to interact with the selector to get a JSON response. A login is also probably necessary – Jimi Feb 10 '23 at 02:19
  • Hi Jimi, what I'm looking for is to extract the JSON from the GET request I'm trying to make with the second code, and I was wondering whether I can use part of your code too. You don't have to be registered on the website in order to interact with it; I've tried many times :). Can you point me in the right direction? Thanks, Jimi. – Mattia Feb 10 '23 at 05:47
  • As mentioned, the URI you have shared doesn't return a JSON. You probably get a JSON back while interacting with the interface. You cannot do that with HttpClient – Jimi Feb 10 '23 at 15:19
  • You are right, Jimi, I'm indeed still trying to get that JSON. As you can see from the gif, the steps I'm following are: I fill in the fields on that website and click _Find_; in the Chrome console, the _Network_ tab then shows a few entries under the _Name_ column. In that column, I'm interested in _Index/_. After I click it, I go to the _Response_ tab, where the JSON is present. I then need to parse it (but that is not even a problem). The question is: how can I get to _Index/_? I think I might need to open a new question. – Mattia Feb 10 '23 at 15:26
  • Yep, you really need to post a new question, explaining very clearly what you're working with and what you want to get -- I've tried to get that page / result, but it doesn't let me fill in the form if I'm not registered – Jimi Feb 10 '23 at 15:29
  • Absolutely. I need to think about how to word the question. – Mattia Feb 10 '23 at 15:38
  • Are the CookieContainer, the headers that specify decompression, and the User-Agent things I can add to WebView2 too? – Mattia Feb 14 '23 at 00:46
  • You can change the User-Agent string to a custom one (in the rare case this may be needed) -- WebView is the Edge Chromium browser, it of course already handles cookies on its own :) – Jimi Feb 14 '23 at 01:17
  • I've opened a new question for it and explained it better https://stackoverflow.com/questions/75442645/how-to-use-a-useragent-headers-and-cookiecontainer-on-webview2 – Mattia Feb 14 '23 at 01:21
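
For reference, the WebView2 approach discussed in the comments above can be sketched roughly as follows. This is only a minimal sketch under several assumptions: a WinForms project that references the Microsoft.Web.WebView2 NuGet package, a WebView2 control named WebView21 and a RichTextBox named RichTextBox1 placed on the form (both names are hypothetical), and the elided URL standing in for the real page. It overrides the User-Agent through CoreWebView2.Settings.UserAgent (optional, as noted in the comments) and reads the rendered HTML once navigation completes.

```vbnet
Imports System.Text.Json
Imports Microsoft.Web.WebView2.Core

Public Class Form1
    ' Assumption: WebView21 (WebView2) and RichTextBox1 exist on the form.
    Private Async Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        ' CoreWebView2 is Nothing until the control has been initialized.
        Await WebView21.EnsureCoreWebView2Async()

        ' Optional: replace the default Edge User-Agent with a custom one.
        WebView21.CoreWebView2.Settings.UserAgent =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36"

        AddHandler WebView21.CoreWebView2.NavigationCompleted, AddressOf OnNavigationCompleted
        WebView21.CoreWebView2.Navigate("https://...")
    End Sub

    Private Async Sub OnNavigationCompleted(sender As Object, e As CoreWebView2NavigationCompletedEventArgs)
        If Not e.IsSuccess Then
            Debug.WriteLine(e.WebErrorStatus)
            Return
        End If

        ' ExecuteScriptAsync returns the script result encoded as JSON,
        ' so the HTML arrives as a quoted, escaped JSON string.
        Dim json = Await WebView21.CoreWebView2.ExecuteScriptAsync("document.documentElement.outerHTML")
        RichTextBox1.Text = JsonSerializer.Deserialize(Of String)(json)
    End Sub
End Class
```

Cookies need no extra setup here, since the browser engine handles them itself. Note that NavigationCompleted fires when the document has loaded; content that scripts inject later may not be present yet, so a page like this one may require waiting or polling before the HTML is read.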