-3

How do I match the following div with a regular expression? The URL and image location change with every post, so I need to use a wildcard.

I must use a regular expression because the software I am using only supports search-and-replace patterns: http://community.autoblogged.com/entries/344640-common-search-and-replace-patterns

<div class="tweetmeme_button" style="float: right; margin-left: 10px;"> <a href="http://api.tweetmeme.com/share?url=http%3A%2F%2Fjumpinblack.com%2F2011%2F11%2F25%2Fdrake-and-rick-ross-you-only-live-once-ep-mixtape-2011-download%2F"><br /> <img src="http://api.tweetmeme.com/imagebutton.gif?url=http%3A%2F%2Fjumpinblack.com%2F2011%2F11%2F25%2Fdrake-and-rick-ross-you-only-live-once-ep-mixtape-2011-download%2F&amp;source=jumpinblack1&amp;style=compact&amp;b=2" height="61" width="50" /><br /> </a> </div>

I tried using

<div class="tweetmeme_button" style="float: right; margin-left: 10px;">.*<\/div>
daxim
user1068544
  • If that WordPress plugin is what you are using, then this isn't a Perl question (which is why the answers that use a Perl HTML parser don't apply to you). – preaction Nov 28 '11 at 01:55
  • -1 for claiming that your regular expression question was a Perl question. – tadmc Nov 28 '11 at 02:36

2 Answers

1

Use an HTML parser to parse HTML.

For example, HTML::TokeParser::Simple or HTML::TreeBuilder::XPath, among many others.

E.g.:

#!/usr/bin/env perl

use strict;
use warnings;

use HTML::TokeParser::Simple;

my $parser = HTML::TokeParser::Simple->new( ... );

while (my $div = $parser->get_tag) {
    next unless $div->is_start_tag('div');
    {
        no warnings 'uninitialized';
        next unless $div->get_attr('class') eq 'tweetmeme_button';
        next unless $div->get_attr('style') eq 'float: right; margin-left: 10px;';
        # now do what you want until the next </div>
    }

}
Sinan Ünür
1

Using regular expressions to process HTML is a bad idea. I use HTML::TreeBuilder::XPath for this:

use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
$mech->get("http://www.someURL.com");

my $tree = HTML::TreeBuilder::XPath->new_from_content( $mech->content() );    
my $div = $tree->findnodes( '//div[@class="tweetmeme_button"]')->[0];
gangabass
  • Thanks for the quick response. However, I must use a regular expression because of the software I am using: http://community.autoblogged.com/entries/344640-common-search-and-replace-patterns – user1068544 Nov 28 '11 at 01:46
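
If a regular expression really is the only option, the pattern from the question has two likely problems: `.*` is greedy, so it runs on to the last `</div>` it can reach, and `.` does not match newlines by default. A minimal sketch of a more tolerant pattern follows, demonstrated in Perl only because that is the language used in the answers above. It assumes the button div never contains a nested div, and that the target software accepts lazy quantifiers and the inline `(?s)` modifier; neither assumption is confirmed by the thread. The sample HTML and its example.com URLs are hypothetical stand-ins for the real post.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample: a shortened version of the button div from the
# question, followed by an unrelated div that must survive.
my $html = <<'HTML';
<p>Post text</p>
<div class="tweetmeme_button" style="float: right; margin-left: 10px;"> <a href="http://api.tweetmeme.com/share?url=http%3A%2F%2Fexample.com%2Fpost%2F"><br />
<img src="http://api.tweetmeme.com/imagebutton.gif?url=http%3A%2F%2Fexample.com%2Fpost%2F&amp;style=compact" height="61" width="50" /><br /> </a> </div>
<div class="content">Keep me</div>
HTML

# (?s)   lets "." match newlines as well.
# [^>]*  tolerates whatever attributes follow the class.
# .*?    is lazy, so the match stops at the FIRST </div>.
my $pattern = qr{(?s)<div class="tweetmeme_button"[^>]*>.*?</div>};

# Copy the input, then delete every match from the copy.
(my $cleaned = $html) =~ s/$pattern//g;

print $cleaned;
```

With this input, the button div is removed while the paragraph and the `content` div are left in place. The pattern would stop too early if the button div ever contained another div, which is the kind of case where the parser-based answers are more robust.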