
I want to scrape data from this URL.

I am able to fetch simple data from HTML tags using cURL, but I cannot fetch the data that is loaded via JSON or AJAX. I am not sure whether it is AJAX or JSON data.

In the screenshot below, I want to fetch the Appliance Models data.

I think this data is coming from JSON or AJAX.

Here is my script to get data from the page:

$loginURL = "https://www.apwagner.com/appliance-part/wpl/wp661600";
//$file='source.html'; //create a html file to save source code
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $loginURL);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
curl_close($ch);

Please provide some guidance on how to fetch this information.

Nikita
  • Have you looked at `DOMDocument`? http://php.net/manual/en/domdocument.loadhtml.php – Nishant Solanki Dec 30 '16 at 05:40
  • Yes, I have also checked that, but it did not work. – Nikita Dec 30 '16 at 05:53
  • Obviously that won't work directly; it was just to help you get started. You'll have to write the code according to your requirements. – Nishant Solanki Dec 30 '16 at 05:59
  • @NishantSolanki I have tried many ways. Can you please guide me on how to get the data using DOMDocument or any other way? – Nikita Dec 30 '16 at 06:01
  • I checked the URL, and the listing you are trying to access via cURL is not loaded directly; it comes through AJAX. Get the AJAX URL, check which data is passed in the request, make a call to the same URL with the same data, and you will get your results :D. This answer can help you out: http://stackoverflow.com/questions/14625915/is-there-a-way-to-let-curl-wait-until-the-pages-dynamic-updates-are-done – Nishant Solanki Dec 30 '16 at 06:07

1 Answer


Part of the page's data is loaded through an AJAX request.

see this screenshot

You need to make that request yourself with cURL, as a second call after your first cURL response is received:

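// Repeat the AJAX request the page makes to load the model list.
$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, "https://www.apwagner.com/Product/GetPartModel");
curl_setopt($ch, CURLOPT_POST, true);
// Form fields matching the part page URL (make "wpl", part number "wp661600").
curl_setopt($ch, CURLOPT_POSTFIELDS, "partNumber=wp661600&make=wpl");
// Return the response body as a string instead of printing it.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$server_output = curl_exec($ch);

curl_close($ch);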

Or try to scrape the data using a Python script:

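from selenium import webdriver

# Selenium drives a real browser, so the page's JavaScript (including the AJAX call) runs.
# Note: newer Selenium releases take the driver path via a Service object instead.
driver = webdriver.Chrome('<path to your chrome driver>')
driver.get('https://www.apwagner.com/appliance-part/wpl/wp661600')
html = driver.page_source  # may need a wait before the AJAX content appears
driver.quit()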
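
If a full browser is more than you need, the same POST as in the cURL snippet can be sketched in Python with only the standard library. This is a minimal sketch, not a tested scraper: the endpoint and field names are copied from the cURL example, while the helper names (`build_request`, `decode_body`, `fetch_part_models`) are made up for illustration. It also assumes the server needs no cookies or extra headers, and it does not assume the response format: it tries JSON first and falls back to plain text.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the cURL example; everything else here is illustrative.
AJAX_URL = "https://www.apwagner.com/Product/GetPartModel"


def build_request(part_number, make):
    """Build the same form-encoded POST that the cURL snippet sends."""
    body = urllib.parse.urlencode({"partNumber": part_number, "make": make})
    return urllib.request.Request(AJAX_URL, data=body.encode("ascii"), method="POST")


def decode_body(raw):
    """Return parsed JSON if the body is valid JSON, otherwise the decoded text."""
    text = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return text


def fetch_part_models(part_number, make, timeout=10):
    """Send the request and decode whatever comes back (hypothetical helper)."""
    with urllib.request.urlopen(build_request(part_number, make), timeout=timeout) as resp:
        return decode_body(resp.read())
```

Calling `fetch_part_models("wp661600", "wpl")` would send the request. If the result is a string, it is probably an HTML fragment that you can then parse with an HTML parser.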
Davit Tovmasyan