0

I am trying to figure out how to parse out a URL (A VAST video tag) from a field in MySQL. This field contains multiple rows that vary, but should contain the same XML code.

This is the URL I am trying to parse out:

http://ad.amgdgt.com/ads/?t=dr&f=vast2&p=10582&pl=a75d1869&rnd=<timestamp>

This is the field in its entirety:

<?xml version="1.0" encoding="utf-8"?> <VAST xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0"> <Ad id="000000"> <Wrapper> <AdSystem version="2.0">DINVICO</AdSystem> <VASTAdTagURI><![CDATA[http://ad.amgdgt.com/ads/?t=dr&f=vast2&p=10582&pl=a75d1869&rnd=<timestamp>]]></VASTAdTagURI> <Impression><![CDATA[<ditu>]]></Impression> <Impression><![CDATA[<comscoreVOX>]]></Impression> <Impression><![CDATA[<comscoreVideo>]]></Impression> <Creatives> <Creative AdID="000000" id="000000" sequence="1"> <Linear> <TrackingEvents> <Tracking event="start"><![CDATA[<event_12>]]></Tracking> <Tracking event="firstQuartile"><![CDATA[<event_13>]]></Tracking> <Tracking event="midpoint"><![CDATA[<event_14>]]></Tracking> <Tracking event="thirdQuartile"><![CDATA[<event_15>]]></Tracking> <Tracking event="complete"><![CDATA[<event_16>]]></Tracking> <Tracking event="mute"><![CDATA[<event_17>]]></Tracking> <Tracking event="unmute"><![CDATA[<event_18>]]></Tracking> <Tracking event="rewind"><![CDATA[<event_20>]]></Tracking> <Tracking event="resume"><![CDATA[<event_21>]]></Tracking> <Tracking event="expand"><![CDATA[<event_23>]]></Tracking> <Tracking event="collapse"><![CDATA[<event_24>]]></Tracking> <Tracking event="acceptInvitation"><![CDATA[<event_26>]]></Tracking> <Tracking event="pause"><![CDATA[<event_19>]]></Tracking> <Tracking event="fullscreen"><![CDATA[<event_22>]]></Tracking> <Tracking event="close"><![CDATA[<event_27>]]></Tracking> <Tracking event="creativeView"><![CDATA[<event_11>]]></Tracking> </TrackingEvents> <VideoClicks> <ClickTracking><![CDATA[<click_tag>]]></ClickTracking> </VideoClicks> </Linear> </Creative> </Creatives> <Extensions> </Extensions> </Wrapper> </Ad> </VAST>

As you can see, the HTML link is delimited by <VASTAdTagURI><![CDATA[ ... ]]></VASTAdTagURI>

How can I tell MySQL to select and return only the HTML link within the two delimiters?

Here is the SQL I am currently working with that returns the entire XML Code:

select
a.name as Advertiser_Name,
a.aid as Advertiser_ID,
b.name as Creative_Name,
b.id_creative as Creative_ID,
bt.tag as XML_Code,
bt.tag_secure as XML_Code_Secure

from advertisers a
inner join campaigns c on c.aid=a.aid
inner join banners b on b.cid=c.cid
inner join banner_tags bt on bt.bid=b.bid
inner join placement p on p.cid=c.cid
inner join locations l on l.lid=p.lid
inner join sites s on s.sid=l.sid

where s.pid='<Partner_ID>' and b.status=1 and c.status in (-1,0,1,2)

group by b.id_creative

order by b.status

2 Answers2

1

Use XPath with ExtractValue():

SELECT ExtractValue(col, '/VAST/Ad/Wrapper/VASTAdTagURI/text()');
univerio
  • 19,548
  • 3
  • 66
  • 68
  • Thank you for your answer. I am somewhat new to advanced queries such as the ExtractValue(). If I provide my query, could you suggest how I should implement? – user3735875 Jun 13 '14 at 16:37
  • 1
    @user3735875 Well, that's just a column expression involving a function call. Just put `ExtractValue(XML_Code, '/VAST/Ad/Wrapper/VASTAdTagURI/text()') AS Url` as one of the columns you're selecting. – univerio Jun 13 '14 at 16:49
  • When I try this, MySQL returns an error "FUNCTION database.ExtractValue does not exist" Does this mean that I need to add this functionality? – user3735875 Jun 13 '14 at 18:56
  • @user3735875 `ExtractValue()` has been available since 5.1.5, so if you have a super old version of MySQL you're out of luck and you have to resort to @Fabricator's method below. – univerio Jun 13 '14 at 19:43
  • Using Query Browser 1.2.17, is this something I can just update? – user3735875 Jun 13 '14 at 20:08
  • @user3735875 No. If you want to upgrade MySQL, you'll have to upgrade it on the server. If a third party operates the server, you'll need to contact them and see if you can get them to upgrade. – univerio Jun 13 '14 at 20:25
0

Didn't know about the ExtractValue method. Here's a stupid way of getting it (using LOCATE and SUBSTR)

SET @s:='<?xml version="1.0" encoding="utf-8"?> <VAST xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0"> <Ad id="000000"> <Wrapper> <AdSystem version="2.0">DINVICO</AdSystem> <VASTAdTagURI><![CDATA[http://ad.amgdgt.com/ads/?t=dr&f=vast2&p=10582&pl=a75d1869&rnd=<timestamp>]]></VASTAdTagURI> <Impression><![CDATA[<ditu>]]></Impression> <Impression><![CDATA[<comscoreVOX>]]></Impression> <Impression><![CDATA[<comscoreVideo>]]></Impression> <Creatives> <Creative AdID="000000" id="000000" sequence="1"> <Linear> <TrackingEvents> <Tracking event="start"><![CDATA[<event_12>]]></Tracking> <Tracking event="firstQuartile"><![CDATA[<event_13>]]></Tracking> <Tracking event="midpoint"><![CDATA[<event_14>]]></Tracking> <Tracking event="thirdQuartile"><![CDATA[<event_15>]]></Tracking> <Tracking event="complete"><![CDATA[<event_16>]]></Tracking> <Tracking event="mute"><![CDATA[<event_17>]]></Tracking> <Tracking event="unmute"><![CDATA[<event_18>]]></Tracking> <Tracking event="rewind"><![CDATA[<event_20>]]></Tracking> <Tracking event="resume"><![CDATA[<event_21>]]></Tracking> <Tracking event="expand"><![CDATA[<event_23>]]></Tracking> <Tracking event="collapse"><![CDATA[<event_24>]]></Tracking> <Tracking event="acceptInvitation"><![CDATA[<event_26>]]></Tracking> <Tracking event="pause"><![CDATA[<event_19>]]></Tracking> <Tracking event="fullscreen"><![CDATA[<event_22>]]></Tracking> <Tracking event="close"><![CDATA[<event_27>]]></Tracking> <Tracking event="creativeView"><![CDATA[<event_11>]]></Tracking> </TrackingEvents> <VideoClicks> <ClickTracking><![CDATA[<click_tag>]]></ClickTracking> </VideoClicks> </Linear> </Creative> </Creatives> <Extensions> </Extensions> </Wrapper> </Ad> </VAST>';
SELECT LOCATE('<VASTAdTagURI><![CDATA[', @s)+LENGTH('<VASTAdTagURI><![CDATA[') start_pos,  
       LOCATE(']]></VASTAdTagURI>', @s, LOCATE('<VASTAdTagURI><![CDATA[', @s)) end_pos,
       LOCATE(']]></VASTAdTagURI>', @s, LOCATE('<VASTAdTagURI><![CDATA[', @s)) - (LOCATE('<VASTAdTagURI><![CDATA[', @s)+LENGTH('<VASTAdTagURI><![CDATA[')) len,
       SUBSTR(@s, 
              LOCATE('<VASTAdTagURI><![CDATA[', @s)+LENGTH('<VASTAdTagURI><![CDATA['), 
              LOCATE(']]></VASTAdTagURI>', @s, LOCATE('<VASTAdTagURI><![CDATA[', @s)) - (LOCATE('<VASTAdTagURI><![CDATA[', @s)+LENGTH('<VASTAdTagURI><![CDATA['))) url;
Fabricator
  • 12,722
  • 2
  • 27
  • 40
  • Thank you for this answer, it looks like it uses the delimiters, but also looks like it will only work for one line of XML code? Is there a way I can apply this to a varying list of XML code? And could you suggest how I could implement this with the query I just added to my original question? – user3735875 Jun 13 '14 at 16:41