I want to check whether a website contains schema.org markup. I am doing the following:

$domain = 'http://agents.allstate.com/william-leahy-mount-prospect-il.html';
$client = new Zend_Http_Client();
$client->setUri($domain);
$response = $client->request();
$html = $response->getBody();
$dom = new Zend_Dom_Query($html);
$resultSchema = $dom->query('body');

foreach ($resultSchema as $r) {
    $data = $r->hasAttribute('itemprop');
    if ($data) {
        echo 'Yes';
    } else {
        echo 'No';
    }
}

I don't understand how to detect this. Is this the correct way of doing it? The schema.org markup on a website may be attached to any HTML element. How can I query all the elements and find the ones that contain schema.org markup?

VishwaKumar

1 Answer

After a long search and a lot of reading, I finally found the answer. This is how it's done, in case someone is still looking:

$separator = '|';
$dbData = '';
$domain = 'http://agents.allstate.com/william-leahy-mount-prospect-il.html';

// Fetch the page and load its HTML into a DOM query object.
$client = new Zend_Http_Client();
$client->setUri($domain);
$response = $client->request();
$html = $response->getBody();
$dom = new Zend_Dom_Query($html);

// Find the LocalBusiness item and read its "name" property.
$result = $dom->queryXpath('//*[@itemtype="http://schema.org/LocalBusiness"]');
if ($result->count()) {
    foreach ($result as $r) {
        if ($r->hasChildNodes()) {
            // Serialize the item so that it can be queried on its own.
            $lbHtml = $r->C14N();

            $dom2 = new Zend_Dom_Query($lbHtml);
            $lbname = $dom2->queryXpath('//*[@itemprop="name"]');
            if ($lbname->count()) {
                // If there are several matches, the last one wins.
                foreach ($lbname as $nameNode) {
                    $name = $nameNode->nodeValue;
                }
            }
        }
    }
}

if (isset($name)) {
    $dbData .= 'name:' . $name . $separator;
} else {
    $dbData .= 'name:' . $separator;
}

// The text content of the PostalAddress item is used as the address.
$result = $dom->queryXpath('//*[@itemtype="http://schema.org/PostalAddress"]');
if ($result->count()) {
    foreach ($result as $r) {
        $address = $r->nodeValue;
    }
}

if (isset($address)) {
    $dbData .= 'address:' . $address . $separator;
} else {
    $dbData .= 'address:' . $separator;
}

// The telephone property is looked up anywhere in the document.
$result = $dom->queryXpath('//*[@itemprop="telephone"]');
if ($result->count()) {
    foreach ($result as $r) {
        $telephone = $r->nodeValue;
    }
}

if (isset($telephone)) {
    $dbData .= 'telephone:' . $telephone . $separator;
} else {
    $dbData .= 'telephone:' . $separator;
}

// Drop the trailing separator.
$dbData = trim($dbData, $separator);

$dbData will then contain a '|'-separated string of the schema.org properties found on the page (name, address and telephone). Hope it helps!
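
For the more general question of whether a page has schema.org markup on any element, here is a minimal sketch. It uses PHP's built-in DOMDocument and DOMXPath instead of Zend_Dom_Query, only so that it runs without the framework. The sample HTML and the helper names findSchemaTypes and readItemProp are made up for illustration, and the item type and property names are assumed not to contain double quotes.

```php
<?php
// Hypothetical sample markup; in practice this would be $response->getBody().
$html = <<<HTML
<html><body>
<div itemscope itemtype="http://schema.org/LocalBusiness">
  <span itemprop="name">Example Agency</span>
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">1 Main St</span>
  </div>
  <span itemprop="telephone">555-0100</span>
</div>
</body></html>
HTML;

// Real-world HTML is rarely valid, so collect parser warnings silently.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

/**
 * Return the distinct schema.org itemtype values found on any element,
 * in document order. An empty array means no schema.org microdata was found.
 */
function findSchemaTypes(DOMXPath $xpath): array
{
    $types = [];
    foreach ($xpath->query('//*[contains(@itemtype, "schema.org")]') as $node) {
        $types[$node->getAttribute('itemtype')] = true;
    }
    return array_keys($types);
}

/**
 * Read a property from the first item of the given type. The second query
 * is relative to the item node, so the item does not need to be serialized
 * and parsed again. Properties of nested items are matched as well.
 */
function readItemProp(DOMXPath $xpath, string $itemType, string $prop): ?string
{
    $items = $xpath->query('//*[@itemtype="' . $itemType . '"]');
    if ($items->length === 0) {
        return null;
    }
    $props = $xpath->query('.//*[@itemprop="' . $prop . '"]', $items->item(0));
    return $props->length > 0 ? trim($props->item(0)->nodeValue) : null;
}

echo implode(', ', findSchemaTypes($xpath)), PHP_EOL;
echo readItemProp($xpath, 'http://schema.org/LocalBusiness', 'name'), PHP_EOL;
```

This only covers microdata attributes (itemtype, itemprop). Pages that publish schema.org data as JSON-LD or RDFa would need a different check.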

VishwaKumar