2

I need to crawl the price values from the price comparison website (product link: https://www.toppreise.ch/prod_488002.html). I am not able to scrape. see the highlighted price in the image that I want to capture:

click to see image

Please help me how to crawl this page.

PS: toppreise.ch will not be accessible in many countries so use VPN

I am using the below code:

Private Sub SiteInfo_Click()
Dim strhtml
On Error Resume Next
ThisWorkbook.Sheets("Data Mining").Activate
Sheets("Data Mining").Range("B1").Select
Set xmlHttp = Nothing
Set xmlHttp = CreateObject("MSXML2.ServerXMLHTTP.6.0")

    StrUrl = ""
    StrUrl = Sheets("Data Mining").Range("B1").Value
    xmlHttp.Open "GET", StrUrl, False
    xmlHttp.Send
    strhtml =xmlHttp.responseText
    END Sub

When I run above code I am only getting below response text . It doesn't gives the whole page. (You can check the source code by using the product link or view here https://www.dropbox.com/s/ah80jt7a25xcicp/source%20code.txt?dl=0 )

<html><head>
        <script type="text/javascript" src="//en.toppreise.ch/js/tpjs.js"></script>
        <script type="text/javascript" src="//en.toppreise.ch/js/afxp.js"></script>
        <script type="text/javascript" src="//en.toppreise.ch/js/jquery.min.js"></script>
        <script type="text/javascript" src="//en.toppreise.ch/js/jquery-ui-autocomplete.min.js"></script>
    </head><body>...   
Community
  • 1
  • 1
Prasath
  • 21
  • 1
  • 2
  • Looks like the page is dynamic, so you'll have to use a different approach, such as browser automation. – Tim Williams Feb 15 '18 at 06:05
  • Thank you so much @TimWilliams. Will you be able to help on that? BTW How to find whether the page is dynamic or not? – Prasath Feb 15 '18 at 06:39
  • Remember to close On Error Resume Next with On Error GoTo 0 (as soon as possible ). You will be hiding errors all the time otherwise. – QHarr Feb 15 '18 at 06:39
  • Try googling "VBA automate IE" and you'll get plenty of examples. – Tim Williams Feb 15 '18 at 06:42

1 Answers1

0

This code works, Thanks SIM

Sub Get_Price()
Dim HTTP As New XMLHTTP60, HTML As New HTMLDocument
Dim post As HTMLDivElement

With HTTP
.Open "GET", "https://www.toppreise.ch/index.php?a=488002", False
.send
HTML.body.innerHTML = .responseText
End With

For Each post In HTML.getElementsByClassName("altLinesOdd")
With post.getElementsByTagName("a")
     If .Length Then R = R + 1: Cells(R, 1) = .Item(0).innerText
End With
With post.getElementsByClassName("spaceVert nobreak")
     If .Length Then Cells(R, 2) = .Item(0).innerText
End With
Next post
End Sub
Ragu
  • 3
  • 4
  • Welcome to Stack Overflow! Thank you for this code snippet, which might provide some limited short-term help. A proper explanation [would greatly improve](//meta.stackexchange.com/q/114762) its long-term value by showing *why* this is a good solution to the problem, and would make it more useful to future readers with other, similar questions. Please [edit] your answer to add some explanation, including the assumptions you've made. – Toby Speight Feb 16 '18 at 09:43