5

I'm using YQL to get a list of feeds, like this:

SELECT title, link, pubDate FROM rss
    WHERE url IN ('.implode("','", array_values($urls)).')

$urls contains the feed urls:

$urls = array(
  'delicious' => 'http://feeds.delicious.com/v2/rss/foo',
  'delicious' => 'http://feeds.delicious.com/v2/rss/anotherfoo',
  'lastfm'    => 'http://ws.audioscrobbler.com/1.0/user/foo/recenttracks.rss',
  ...
);

And I get an array of items with title / link / pubdate fields. Besides the "link" field, I don't get any info that might tell me from where the item comes from.

How can I include an additional field from array_keys($urls) that corresponds to the URL (value) ? Basically I need a field that tells me from what site does the item come from, like delicious, lastfm etc.

For lastfm I could check with strpos if the "last.fm" string is present in $item['url'], but with delicious links this is not possible :(


Laurel
  • 5,965
  • 14
  • 31
  • 57
Alex
  • 66,732
  • 177
  • 439
  • 641
  • I think you need to change the YQL query for that, so the result row has a field that shows by which URL it was requested. So have you tried to add url to the SELECT fields? Not very known in YQL, but probably there's a way. – hakre Dec 26 '11 at 01:18

3 Answers3

1

Can't you simply add URL to your field list? Or are you looking for a function to parse the URL?

SELECT title,link,pub date,URL FROM rss

Am I missing why the above won't work?

Sparky
  • 14,967
  • 2
  • 31
  • 45
1

Since YQL is not SQL, and only use the syntax of SQL for its queries, there is no chance to SELECT the keyword (in your case url) used in the WHERE clause. SELECT can only access fields of the xml file, given with the url keyword. YQL will parse it for you and return the wanted fields. And cause there are many different rss standards, and not every standard force the feed to contain its own url, its not possible to select a field in the feed that will return the url in every case.

I think, your only chance will be to iterate through the $urls array and do one query per feed. Only this way you can keep the relation between feed-url and the items of this feed.

Corubba
  • 2,229
  • 24
  • 30
  • but how do I add the field in my select query? that's what I don't know how to do... – Alex Dec 26 '11 at 00:56
  • Are you sure about the usage of `link` and `url`? Aren't they the same? – Corubba Dec 26 '11 at 01:05
  • How is the array defined ? I suspect the query is returning 4 columns, but the array only holding 3 columns – Sparky Dec 26 '11 at 01:05
  • ok `source` gets added for delicious items, but not for lastfm – Alex Dec 26 '11 at 01:21
  • I think it would be easier to iterate through the `$urls` array and use one query for each feed. The source field depends on what rss standart and what version. – Corubba Dec 26 '11 at 01:24
  • `guid` stores an URL which I could strpos, but I'm wondering if this is a standard rss attribute that gets included in all feeds, like link/title/pubdate – Alex Dec 26 '11 at 01:24
  • When im looking at the source code of the last.fm feed, the url of itself is never mentioned. – Corubba Dec 26 '11 at 01:27
  • ok, there's no point in trying to get this, I'll just use the query.multi table and do multiple queries... – Alex Dec 26 '11 at 01:34
1

And I get an array of items with title / link / pubdate fields. Besides the "link" field, I don't get any info that might tell me from where the item comes from.

To get the info from where each item comes from, add the source field:

SELECT title, link, pubDate, source FROM rss
    ...

The result will then contain a source element in each entry, which has a url attribute which contains the url the item has been fetched from:

<results>
    <item>
        <title>Git Immersion - Brought to you by EdgeCase</title>
        <pubDate>Wed, 21 Dec 2011 22:15:44 +0000</pubDate>
        <link>http://gitimmersion.com/</link>
        <source url="http://feeds.delicious.com/v2/rss/geek">geek's links</source>
    </item>

The YQL Demo.

You can look for all fields by using *, this is how I found out about the source field:

SELECT * FROM rss
    WHERE url IN ('http://feeds.delicious.com/v2/rss/geek', 'http://feeds.delicious.com/v2/rss/foo')

YQL Query Demo

I think this should answer your question. To reduce the amount of data transferred, you can also only select the url attribute of source like so:

SELECT title, link, pubDate, source.url FROM rss
    ...

See as well Using YQL to load and convert RSS feeds really, really fast and How to use YQL to merge 2 RSS feeds sorted by pubDate?.


The following is an older comment on YQL syntax:

Please take note that implode("','", array_values($urls)) will leave your YQL in a invalid syntax:

... IN( urlA','urlB','urlC )

You see the missing single quotes at the beginning and end? Add them:

"'".implode("','", $urls)."'"

and you should be fine. At least you should have one error less.

Community
  • 1
  • 1
hakre
  • 193,403
  • 52
  • 435
  • 836
  • ok that was just a typo in my question :) I still need to find out how to add my field... – Alex Dec 26 '11 at 01:15
  • @Alex: I've added how you can make your field return, but I'm not really sure if that is what you asked for, let me know. – hakre Dec 26 '11 at 11:29