I have an IMDb scraper from another site. It worked very well, but IMDb has now changed its HTML output and the regular expression no longer finds the poster. I'm a noob at regex, so maybe someone can help me.

This is the line:

$arr['poster'] = $this->match('/img_primary">.*?<img src="(.*?)".*?<\/td>/ms', $html, 1);

And the function (probably not relevant):

function match_all($regex, $str, $i = 0) {
    // Return capture group $i from every match, or false on a regex error.
    if (preg_match_all($regex, $str, $matches) === false)
        return false;
    else
        return $matches[$i];
}

And here is the relevant HTML output from IMDb:

<td rowspan="2" id="img_primary">
<div class="image">
<a href="/media/rm3465715968/tt1905041?ref_=tt_ov_i" > 
<img height="317"
     width="214"
     alt="Fast and the Furious 6 (2013) Poster"
     title="Fast and the Furious 6 (2013) Poster"
     src="http://ia.media-imdb.com/images/M/MV5BMTM3NTg2NDQzOF5BMl5BanBnXkFtZTcwNjc2NzQzOQ@@._V1_SX214_.jpg"
    itemprop="image" />
        </a>
</div></td>

Can someone change the regex so that I get the JPG URL back?

Bubbleboy

1 Answer

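In the new HTML, src is no longer the first attribute of the <img> tag (height, width, alt and title come before it), so the literal <img src=" in your pattern no longer matches. What if you change it to: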

'/img_primary">.*?<img.*?src="(.*?)".*?<\/td>/ms'

This works for me:

<?php
error_reporting(E_ALL);
ini_set('display_errors',1);

$regexp = '/img_primary">.*?<img.*?src="(.*?)".*?<\/td>/ms';

$string = file_get_contents('test.html');

$matches = array();
preg_match_all($regexp,$string,$matches);
var_dump($matches);
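
If only the single poster URL is needed, the same pattern can be tried directly against the sample from the question without a test.html file. This is just a minimal sketch: the function name poster_url and the variable names are made up for illustration, and it assumes IMDb's markup still looks like the sample above.

```php
<?php
// Sample markup copied from the question.
$html = <<<'HTML'
<td rowspan="2" id="img_primary">
<div class="image">
<a href="/media/rm3465715968/tt1905041?ref_=tt_ov_i" >
<img height="317"
     width="214"
     alt="Fast and the Furious 6 (2013) Poster"
     title="Fast and the Furious 6 (2013) Poster"
     src="http://ia.media-imdb.com/images/M/MV5BMTM3NTg2NDQzOF5BMl5BanBnXkFtZTcwNjc2NzQzOQ@@._V1_SX214_.jpg"
    itemprop="image" />
        </a>
</div></td>
HTML;

// Original pattern: expects src to follow "<img " immediately.
$oldRegexp = '/img_primary">.*?<img src="(.*?)".*?<\/td>/ms';
// Relaxed pattern: allows other attributes between "<img" and src.
$newRegexp = '/img_primary">.*?<img.*?src="(.*?)".*?<\/td>/ms';

// Hypothetical helper: returns the first captured URL, or null if nothing matched.
function poster_url($regexp, $html) {
    return preg_match($regexp, $html, $m) === 1 ? $m[1] : null;
}

var_dump(poster_url($oldRegexp, $html)); // no match with the old pattern
var_dump(poster_url($newRegexp, $html)); // the poster URL
```

preg_match stops at the first match, which is enough here because the page has only one img_primary cell. The /s modifier is what lets .*? run across the line breaks inside the <img> tag.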
Alexey
  • And what happens if you put your provided html sample into test.html file and launch this script? – Alexey Apr 06 '13 at 15:13
  • Thanks for helping, but I found a good IMDb API (http://www.omdbapi.com/). Putting the sample into a file doesn't help when the rest of the scraped information still works. – Bubbleboy Apr 06 '13 at 15:52