In Chrome, I load this page:

https://www.google.com/webhp

When I type something into the search text box, I inspect that same element (right click --> Inspect) and check its properties.

I can verify that:

  • innerHTML: ""
  • innerText: ""
  • textContent: ""
  • value: "what I typed"

My question is: in general, when parsing a web page in C#/VB.NET, what is the fastest way to get the value of an `input` element? N.B. Not the innerHTML, innerText, or textContent; I need the `value`.

I tried Selenium and I can get the value, but it's slow (even with the ChromeOptions 'headless' argument).

I tried HtmlAgilityPack, which is fast, but I'm unable to get the input value.
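For reference, a minimal sketch of what HtmlAgilityPack can read from static markup, assuming the HtmlAgilityPack NuGet package is installed. The class and method names (`StaticInputValues`, `ReadValueAttributes`) are hypothetical. It returns the `value` attribute as written in the HTML source, which is not the same thing as text typed into a live browser.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class StaticInputValues
{
    // Hypothetical helper: maps each named <input> to the `value` attribute
    // found in the markup. Text typed by a user never appears here, because
    // HtmlAgilityPack only parses the HTML string it is given.
    public static Dictionary<string, string> ReadValueAttributes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var result = new Dictionary<string, string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection inputs = doc.DocumentNode.SelectNodes("//input[@name]");
        if (inputs == null)
        {
            return result;
        }

        foreach (HtmlNode input in inputs)
        {
            // The second argument is the default used when the attribute is missing.
            string name = input.GetAttributeValue("name", "");
            result[name] = input.GetAttributeValue("value", "");
        }

        return result;
    }
}
```

Calling `StaticInputValues.ReadValueAttributes("<input name='q' value='preset'><input name='empty'>")` gives `preset` for `q` and an empty string for `empty`, since the second input has no `value` attribute in the source.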

Any help?

Marcello
  • In `Windows.Forms`, `HtmlElementCollection InputElements = [HtmlDocument].GetElementsByTagName("input"); foreach (HtmlElement elm in InputElements) elm.GetAttribute("value");`. More or less the same logic applies elsewhere. – Jimi Oct 01 '18 at 08:13
  • @Jimi Thanks Jimi, so you suggest using a Windows Forms WebBrowser? I used it in the past, but it takes a lot of unfriendly tricks to use it correctly, because it's based on IE and has compatibility issues. Do you know a way to get the value through HAP? – Marcello Oct 01 '18 at 08:27
  • You would probably use something like: `HtmlNodeCollection InputElements = [HtmlDocument].DocumentNode.SelectNodes("//input")` and in the loop get the node value with `string NodeValue = [DocumentNode].Attributes["value"].Value;`. For basic parsing of Html elements, the `HtmlDocument` class initialized by a `WebBrowser` **class** (not Control) can be quite useful. But if you think you'll need a more flexible tool (and a better way to create a Document object), use [`HtmlAgilityPack`](https://html-agility-pack.net/parser) without looking back. – Jimi Oct 01 '18 at 08:42
  • @Jimi Yes, Jimi, I have used both in the past and HAP is more flexible. The problem is that, for some strange reason, with HAP I can't get the HTML input tag's value the way I can get innerText. It could be a HAP setting, as described here: https://stackoverflow.com/questions/2385840/how-to-get-all-input-elements-in-a-form-with-htmlagilitypack-without-getting-a-n – Marcello Oct 01 '18 at 09:00
  • Do you have more than one Form in that page? If so, follow the advice in the answer you linked. I can't say what the issue is in your parsing procedure. You haven't posted one. – Jimi Oct 01 '18 at 09:19
  • @Jimi Thanks, this is just an example of my issue. In my program, I load a web page, log in through Selenium interaction, and load some pages. With Selenium I can get everything on the page, but it's slow. I tried combining Selenium and HAP, and it works, and HAP is faster, but with HAP I can't get the input value. – Marcello Oct 01 '18 at 18:53
  • @Jimi I didn't post code because Stack Overflow is full of examples with different approaches, such as Selenium, WebBrowser, and HAP. My issue is just to understand why, in an HTML input element, the innerText is different from the value, and whether I can get the value with HAP through a particular HAP setting. – Marcello Oct 01 '18 at 19:16
  • I'm not quite sure what the goal is here. HAP does straight HTTP requests and gives back a static page that can be parsed. Selenium uses a WebDriver to drive an actual browser and can be used with headless and regular UI browsers. Selenium has the value because you entered it in the browser instance that Selenium is attached to. HAP doesn't have anything there because it just loaded the page, where that element was still empty, and it isn't connected to any browser on your computer. – CodingKuma Oct 10 '18 at 00:49
  • @CodingKuma Thanks, you made things a bit clearer for me. Reading an 'input' value with Selenium takes about 50 ms, and I have to read hundreds of items on every page (a mix of input, label, 'li'...), so reading each page with Selenium takes more than 5 sec. With HAP, getting a node takes about 1 ms, and I can read a page in 1 sec. I was wondering if there is a particular setting in HAP (e.g. HtmlNode.ElementsFlags.Remove("form")...) that makes it possible to get the value, for instance by having HAP also execute a 'hidden' JavaScript call to get the value while it loads all the input nodes. (continued...) – Marcello Oct 10 '18 at 06:59
  • @CodingKuma ...And in general, I would like to find the fastest way to get the value of an input web element. There are many ways to do that: Selenium (PhantomJS, Chrome...), HAP (which doesn't give all the data, e.g. it gives label text but not the input value), the Windows Forms WebBrowser, the XmlDocument class, Awesomium, Winium, and lots of others. I'm trying some of these solutions, but as you know, I would have to study every approach in depth, and that takes a lot of time. I was wondering if someone has had the same issue in the past and found a solution, or can rule out some approaches so I don't waste time studying them. – Marcello Oct 10 '18 at 07:03
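
One way to reduce the per-element cost described in the comments above, sketched under assumptions: if the roughly 50 ms per value is mostly driver round-trip time (this was not measured here), all live values can be requested in a single script call instead of one call per element. The sketch assumes the Selenium.WebDriver NuGet package and an already-initialized `IWebDriver`. The class and method names (`InputValueReader`, `ReadAllInputValues`) are hypothetical.

```csharp
using System.Collections;
using System.Collections.Generic;
using OpenQA.Selenium;

public static class InputValueReader
{
    // Hypothetical helper: returns the live `value` property of every <input>
    // on the current page, in document order, using one script execution.
    public static List<string> ReadAllInputValues(IWebDriver driver)
    {
        // Browser drivers such as ChromeDriver also implement IJavaScriptExecutor.
        var js = (IJavaScriptExecutor)driver;

        const string script =
            "return Array.from(document.querySelectorAll('input'))" +
            ".map(function (e) { return e.value; });";

        var values = new List<string>();

        // A JavaScript array comes back as a .NET collection; enumerate it
        // through the non-generic interface to avoid depending on its exact type.
        var raw = js.ExecuteScript(script) as IEnumerable;
        if (raw == null)
        {
            return values;
        }

        foreach (object item in raw)
        {
            values.Add(item as string ?? string.Empty);
        }

        return values;
    }
}
```

A call such as `List<string> values = InputValueReader.ReadAllInputValues(driver);` would replace a loop of per-element reads. Whether this is actually faster on a given page would need to be timed.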

0 Answers