
This follows up on my earlier question, "php proDOM parsing error"; the answer @abhy provided there worked.

I am now extending that code and want to save the extracted data into a MySQL database.

The code I am using is:

<?php
$ch = curl_init(); // create a new cURL resource
$word = 'books';
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://images.google.com/images?q=".$word."s&tbm=isch/");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);

$data = curl_exec($ch); // fetch the URL; with CURLOPT_RETURNTRANSFER set, the response is returned as a string
curl_close($ch); 

$dom = new DOMDocument();
@$dom->loadHTML($data); // avoid warnings

$listA = $dom->getElementsByTagName('a'); // read all <a> elements
foreach ($listA as $itemA) { // loop through each <a> element
    if ($itemA->hasAttribute('href')) { // check if it has an 'href' attribute
        $href = $itemA->getAttribute('href'); // read the value of 'href'
        if (preg_match('/^\/imgres\?/', $href)) { // check that 'href' should begin with "/imgres?"
            $qryString = substr($href, strpos($href, '?') + 1);
            parse_str($qryString, $arrHref); // read the query parameters from 'href' URI
        echo '<br>' . $arrHref['imgurl'] . '<br>';
    $sql = "INSERT INTO `p_url_imgsrch`(`id`, `word_id`, `url_imgsrch`) VALUES ('','8','$arrHref['imgurl']')";

    mysql_query($sql) or die("error in updating urls") ;

        }
    }
}


?>

The idea is to insert each link extracted by the parser into the database. I am sure my connection is fine: the same query run directly in MySQL inserts the row successfully, although there I used a dummy URL in place of '$arrHref['imgurl']'.

The error I get is:

Parse error: syntax error, unexpected T_ENCAPSED_AND_WHITESPACE, expecting T_STRING or T_VARIABLE or T_NUM_STRING in D:\wamp\www\demo\login\abhay.php on line 24

abhay.php is the file containing this code. While I was trying to resolve the issue, the error sometimes mentioned T_STRING instead.

Where am I making the mistake? Kindly give me some guidance.

Thanks.

Zaffar Saffee
  • Your code could be much simpler: use `DOMDocument->loadHtmlFile()` over cURL, use `libxml_use_internal_errors` over suppressing errors with `@`, use `DOMXPath->query` with `//a[@href(starts-with(., "/imgres"))]` instead of regex. – Gordon Jan 17 '12 at 08:59
  • Thanks @Gordon, I have seen many useful answers from you, thanks for all of them. However, I am a complete newbie trying to learn things. I am trying to follow what you are saying, but at the moment I do not understand all of it. Please give me some more guidance. – Zaffar Saffee Jan 17 '12 at 09:03
  • sorry, the xpath should read `//a/@href[starts-with(., "/imgres")]` – Gordon Jan 17 '12 at 09:07
  • working example: http://codepad.viper-7.com/Cdk7SF – Gordon Jan 17 '12 at 09:24
  • Thanks @Gordon, trying to learn it. Thanks for the help. – Zaffar Saffee Jan 17 '12 at 12:49
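
For readers who cannot open the codepad link, the approach from the comments above can be sketched roughly as follows. This is a minimal sketch, not the linked example: the `$html` string is hypothetical sample markup standing in for the fetched page, and the example.com URLs are placeholders.

```php
<?php
// Hypothetical sample markup standing in for the fetched results page.
$html = '<html><body>'
      . '<a href="/imgres?imgurl=http://example.com/a.jpg&amp;w=100">A</a>'
      . '<a href="/search?q=books">skip</a>'
      . '<a href="/imgres?imgurl=http://example.com/b.png&amp;w=200">B</a>'
      . '</body></html>';

libxml_use_internal_errors(true);   // collect parser warnings instead of suppressing them with @
$dom = new DOMDocument();
$dom->loadHTML($html);              // for a live page, $dom->loadHTMLFile($url) could replace cURL
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$imgUrls = [];

// Select only the href attributes whose value starts with "/imgres".
foreach ($xpath->query('//a/@href[starts-with(., "/imgres")]') as $attr) {
    $href = $attr->value;
    $qryString = substr($href, strpos($href, '?') + 1); // everything after the "?"
    parse_str($qryString, $params);                     // split the query string into an array
    if (isset($params['imgurl'])) {
        $imgUrls[] = $params['imgurl'];
    }
}

print_r($imgUrls);
```

The XPath expression does the filtering that the regular expression did in the question, so the loop body only has to split the query string.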

2 Answers


Use "{$arrHref['imgurl']}" so that PHP can parse the quoted array key inside the double-quoted string. Also, leave id out of the INSERT if you want it to be auto-incremented:

$sql = "INSERT INTO `p_url_imgsrch`(`word_id`, `url_imgsrch`) VALUES ('8','{$arrHref['imgurl']}')"; 

However, this can produce a broken SQL statement as soon as $arrHref['imgurl'] contains a single quote.

A better solution is to escape the value with mysql_real_escape_string:

$imgurl = mysql_real_escape_string($arrHref['imgurl']);

$sql = "INSERT INTO `p_url_imgsrch`(`word_id`, `url_imgsrch`) VALUES ('8','{$imgurl}')"; 
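
The interpolation rule and the single-quote problem can both be seen without a database. The following is a small sketch; the URL containing an apostrophe is a hypothetical value chosen to trigger the problem.

```php
<?php
// Hypothetical extracted value that happens to contain a single quote.
$arrHref = ['imgurl' => "http://example.com/o'reilly.jpg"];

// Inside double quotes, a quoted array key needs the {...} form.
$a = "VALUES ('8','{$arrHref['imgurl']}')";

// The simple form works too, but only with the key left unquoted.
$b = "VALUES ('8','$arrHref[imgurl]')";

var_dump($a === $b); // both forms build the same string

// Without escaping, the apostrophe in the URL ends the SQL string literal early,
// so the statement has an odd number of single quotes.
echo $a, "\n";
var_dump(substr_count($a, "'"));
```

Writing '$arrHref['imgurl']' directly, as in the question, matches neither form, which is why PHP reports a parse error.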

jantimon

$sql = "INSERT INTO `p_url_imgsrch`(`id`, `word_id`, `url_imgsrch`) VALUES ('','8','$arrHref[imgurl]')";

Be aware that inserting the value unescaped like this leaves the query open to SQL injection.

OptimusCrime
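
Both answers warn about quoting and injection. One way to avoid the issue altogether is a prepared statement, which is also relevant because the mysql_* functions were removed in PHP 7. The sketch below uses PDO, which neither answer mentions, and it makes several assumptions: an in-memory SQLite database stands in for the MySQL table so the example runs on its own (it needs the pdo_sqlite extension), the column types are guesses, and the URLs are placeholders. With MySQL, only the DSN and credentials passed to the PDO constructor would change.

```php
<?php
// Assumption: in-memory SQLite stands in for the real MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Assumed schema; the real column types are not given in the question.
$pdo->exec('CREATE TABLE p_url_imgsrch (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    word_id INTEGER,
    url_imgsrch TEXT
)');

// Placeholders keep the data separate from the SQL text, so no manual escaping is needed.
$stmt = $pdo->prepare('INSERT INTO p_url_imgsrch (word_id, url_imgsrch) VALUES (?, ?)');

// Hypothetical extracted URLs, one of them containing a single quote.
$urls = ["http://example.com/a.jpg", "http://example.com/o'reilly.jpg"];
foreach ($urls as $url) {
    $stmt->execute([8, $url]);
}

// Read the rows back to confirm they were stored unchanged.
$stored = $pdo->query('SELECT url_imgsrch FROM p_url_imgsrch ORDER BY id')
              ->fetchAll(PDO::FETCH_COLUMN);
print_r($stored);
```

Preparing the statement once outside the loop and executing it per URL fits the loop in the question, where one row is inserted for each extracted link.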