Sorry for my bad English. I want to scrape some content from a website, but the nested div classes are confusing me.

Basically, the structure is:

<div id="gsc_vcd_table">
  <div class="gs_scl">
    <div class="gsc_vcd_field">
      Pengarang
    </div>
    <div class="gsc_vcd_value">
      I Anggara Wijaya, Djoko Budiyanto Setyohadi
    </div>
  </div>
  <div class="gs_scl">
    <div class="gsc_vcd_field">
      Tanggal Terbit
    </div>
    <div class="gsc_vcd_value">
      2017/3/1
    </div>
  </div>
</div>

I want to get the text "I Anggara Wijaya, Djoko Budiyanto Setyohadi" from the Pengarang field, and also "2017/3/1" from the Tanggal Terbit field. This is what I have tried:

$crawlerdetail = $client->request('GET', $detail);
$detailscholar = $crawlerdetail->filter('div.gsc_vcd_table');
foreach ($detailscholar as $key)
{
    $keyCrawler = new Crawler($key);
    $pengarang = ($scCrawler->filter('div.gsc_vcd_value')->count()) ? $scCrawler->filter('div.gsc_vcd_value')->text() : '';
    echo $pengarang;
}

Help me please.

1 Answer

If the markup is well-formed XML, you can use the SimpleXMLElement class. See this code:

<?php
$string = <<<XML
<div id="gsc_vcd_table">
  <div class="gs_scl">
    <div class="gsc_vcd_field">
      Pengarang
    </div>
    <div class="gsc_vcd_value">
      I Anggara Wijaya, Djoko Budiyanto Setyohadi
    </div>
  </div>
  <div class="gs_scl">
    <div class="gsc_vcd_field">
      Tanggal Terbit
    </div>
    <div class="gsc_vcd_value">
      2017/3/1
    </div>
  </div>
</div>
XML;

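// Parse the markup; SimpleXMLElement throws an exception if it is not well-formed XML.
$xml = new SimpleXMLElement($string);

// Collect the field and value nodes in document order.
$result1 = $xml->xpath("//div[contains(@class, 'gsc_vcd_field')]");
$result2 = $xml->xpath("//div[contains(@class, 'gsc_vcd_value')]");

// Pair each field with the value at the same index; trim() drops the surrounding whitespace.
foreach ($result1 as $key => $node) {
    echo "FIELD: " . trim((string) $node) . " , VALUE: " . trim((string) $result2[$key]) . "<br>\n";
}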

Also, to get the XPath pattern of any element, you can use Inspect in Chrome and choose Copy XPath.
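
Two details in the question's own snippet may explain why it prints nothing: gsc_vcd_table is an id in the markup, so a class selector such as div.gsc_vcd_table will not match it, and $scCrawler is never defined (the loop creates $keyCrawler). As a rough sketch of the same XPath idea for real pages, which are often not well-formed XML, the built-in DOMDocument and DOMXPath classes can be used instead. The $html sample and the $pairs array below are illustrative names, and the markup is copied from the question rather than fetched from a live page:

```php
<?php
// Sketch: pair each field with its value, row by row, using PHP's DOM extension.
// $html is sample markup taken from the question (an assumption, not live data).
$html = <<<HTML
<div id="gsc_vcd_table">
  <div class="gs_scl">
    <div class="gsc_vcd_field">Pengarang</div>
    <div class="gsc_vcd_value">I Anggara Wijaya, Djoko Budiyanto Setyohadi</div>
  </div>
  <div class="gs_scl">
    <div class="gsc_vcd_field">Tanggal Terbit</div>
    <div class="gsc_vcd_value">2017/3/1</div>
  </div>
</div>
HTML;

$dom = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$pairs = [];

// Select each row inside the element whose id is gsc_vcd_table.
$rows = $xpath->query("//div[@id='gsc_vcd_table']/div[contains(@class, 'gs_scl')]");
foreach ($rows as $row) {
    // The leading "." keeps each query inside the current row.
    $field = $xpath->query("./div[contains(@class, 'gsc_vcd_field')]", $row)->item(0);
    $value = $xpath->query("./div[contains(@class, 'gsc_vcd_value')]", $row)->item(0);
    if ($field !== null && $value !== null) {
        $pairs[trim($field->textContent)] = trim($value->textContent);
    }
}

echo $pairs['Pengarang'], "\n";
echo $pairs['Tanggal Terbit'], "\n";
```

Querying per row keeps a field and its value together even if some row is missing one of the two divs, which matching two separate result lists by index would not guarantee.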

Another solution is to use preg_match_all:

// \s* accepts any whitespace or line-ending style; the s flag lets . match newlines.
preg_match_all('/<div class="gsc_vcd_field">\s*(.*?)\s*<\/div>\s*<div class="gsc_vcd_value">\s*(.*?)\s*<\/div>/s', $string, $matches);

foreach ($matches[1] as $key => $match) {
    echo "FIELD: " . $matches[1][$key] . " , VALUE: " . $matches[2][$key] . "<br>\n";
}
Nabi K.A.Z.