I'm trying to parse an XML document provided by the BBC. However, when I do a simple check of what Ruby actually receives, a lot of detail appears to be missing.
    require 'open-uri'
    require 'nokogiri'

    class MainController < ApplicationController
      def index
        @xml = Nokogiri::XML(open("http://www.bbc.co.uk/bbcone/programmes/schedules/scotland/2013/12/13.xml"))
        render :text => @xml
      end
    end
The output, truncated here for size, is just a heap of incoherent text:
p01ml65v 2013-12-13T00:20:00Z 2013-12-13T00:25:00Z 300 b03ktclr Detailed weather forecast. audio_video 300 p01lc1h3 Skiing Weatherview 2013-12-13T00:20:00Z b007yy70 2007-09-02T01:50:00+01:00 0 0 p01ml65w 2013-12-13T00:25:00Z 2013-12-13T06:00:00Z 20100 b03ktclt BBC One joins the BBC's rolling news channel for a night of news. audio_video 20100 p01m1rbq 13/12/2013 2013-12-13T00:25:00Z b00h9fxh 2006-04-05T00:20:00+01:00 0 0 p01ml966 2013-12-13T06:00:00Z 2013-12-13T09:15:00Z 11700 b03ktcn1
The output also appears to be missing quite a lot of child elements. Can anyone shed some light on how I might approach this?
For now, the end goal is just to display the title of each show, found at the node /schedule/day/broadcasts/broadcast/programme/display_titles/title; the rest will follow once that's done.
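To make the goal concrete, here is a minimal sketch of the title extraction, not a definitive implementation. It uses REXML from Ruby's standard distribution so it runs without extra gems. The sample XML is a hypothetical, trimmed fragment shaped after the path above (it is not the real BBC feed and assumes no XML namespaces), and `programme_titles` is a made-up helper name.

```ruby
require 'rexml/document'

# Hypothetical, trimmed sample shaped after the path in the question.
# This is NOT the real BBC response; the titles are illustrative.
SAMPLE_XML = <<~XML
  <schedule>
    <day date="2013-12-13">
      <broadcasts>
        <broadcast>
          <programme type="episode">
            <display_titles>
              <title>Weatherview</title>
            </display_titles>
          </programme>
        </broadcast>
        <broadcast>
          <programme type="episode">
            <display_titles>
              <title>BBC News</title>
            </display_titles>
          </programme>
        </broadcast>
      </broadcasts>
    </day>
  </schedule>
XML

TITLE_PATH = '/schedule/day/broadcasts/broadcast/programme/display_titles/title'

# Hypothetical helper: returns the text of every node matching TITLE_PATH.
def programme_titles(xml_string)
  doc = REXML::Document.new(xml_string)
  REXML::XPath.match(doc, TITLE_PATH).map(&:text)
end

puts programme_titles(SAMPLE_XML)
```

With a Nokogiri document such as `@xml`, the same path can be passed to `xpath` and the resulting nodes mapped to their `text`, assuming the feed has no namespaces to account for.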