
I run the same code in two very similar circumstances. In the first, I have the imdb_id of a film I want details for:

url = "http://mymovieapi.com/?id=#{self.imdb_id}&type=json&plot=none&episode=0&lang=en-US&aka=simple&release=simple&business=0&tech=0"
doc = Hpricot(open(url)).to_s
json = JSON.parse(doc)

puts json
puts json["imdb_id"]

And this gives the following result:

{"rating_count"=>493949,
"genres"=>["Drama", "Romance"],
"rated"=>"PG-13",
"language"=>["English", "French", "German", "Swedish", "Italian", "Russian"],
"rating"=>7.6,
"country"=>["USA"],
"release_date"=>19980403,
"title"=>"Titanic",
"year"=>1997,
"filming_locations"=>"Santa Clarita, California, USA",
"imdb_id"=>"tt0120338",
"directors"=>["James Cameron"],
"writers"=>["James Cameron"],
"actors"=>["Leonardo DiCaprio", "Kate Winslet", "Billy Zane", "Kathy Bates", "Frances Fisher", "Gloria Stuart", "Bill Paxton", "Bernard Hill", "David Warner", "Victor Garber", "Jonathan Hyde", "Suzy Amis", "Lewis Abernathy", "Nicholas Cascone", "Anatoly M. Sagalevitch"],
"also_known_as"=>["Tai tan ni ke hao"],
"poster"=>{"imdb"=>"http://ia.media-imdb.com/images/M/MV5BMjExNzM0NDM0N15BMl5BanBnXkFtZTcwMzkxOTUwNw@@._V1_SY317_CR0,0,214,317_.jpg", "cover"=>"http://imdb-poster.b0.upaiyun.com/000/120/338.jpg!cover?_upt=66ac07591382594194"},
"runtime"=>["194 min"],
"type"=>"M",
"imdb_url"=>"http://www.imdb.com/title/tt0120338/"}

tt0120338

This is as expected. In the second circumstance, I have the title and year of the same film:

url = "http://mymovieapi.com/?title=#{self.title}&type=json&plot=simple&episode=0&limit=1&year=#{self.year}&yg=1&mt=none&lang=en-US&offset=&aka=simple&release=simple&business=0&tech=0"
doc = Hpricot(open(url)).to_s
json = JSON.parse(doc)

puts json
puts json["imdb_id"]

From this, I get the exact same JSON output:

{"rating_count"=>493949,
"genres"=>["Drama", "Romance"],
"rated"=>"PG-13", "language"=>["English", "French", "German", "Swedish", "Italian", "Russian"],
"rating"=>7.6,
"country"=>["USA"],
"release_date"=>19980403,
"title"=>"Titanic",
"year"=>1997,
"filming_locations"=>"Santa Clarita, California, USA",
"imdb_id"=>"tt0120338",
"directors"=>["James Cameron"],
"writers"=>["James Cameron"],
"actors"=>["Leonardo DiCaprio", "Kate Winslet", "Billy Zane", "Kathy Bates", "Frances Fisher", "Gloria Stuart", "Bill Paxton", "Bernard Hill", "David Warner", "Victor Garber", "Jonathan Hyde", "Suzy Amis", "Lewis Abernathy", "Nicholas Cascone", "Anatoly M. Sagalevitch"],
"also_known_as"=>["Tai tan ni ke hao"],
"poster"=>{"imdb"=>"http://ia.media-imdb.com/images/M/MV5BMjExNzM0NDM0N15BMl5BanBnXkFtZTcwMzkxOTUwNw@@._V1_SY317_CR0,0,214,317_.jpg",
"cover"=>"http://imdb-poster.b0.upaiyun.com/000/120/338.jpg!cover?_upt=ec8bdec31382594417"},
"runtime"=>["194 min"],
"type"=>"M",
"imdb_url"=>"http://www.imdb.com/title/tt0120338/"}

But when I call puts json["imdb_id"], I get this error:

no implicit conversion of String into Integer (TypeError)

This always happens when I fetch by title and year, and I can't explain it, since the printed JSON output is exactly the same.

the Tin Man
Andrew
  • Is it "put json["imdb_id"]" or "puts"? And can you post the json returned for the second type? – Amith Koujalgi Oct 23 '13 at 18:10
  • @AmithKoujalgi It's puts, I typed that small part out here and made a mistake. I've copied in the 2nd json output, but it is exactly the same. – Andrew Oct 23 '13 at 18:19
  • Why are you using Hpricot? And why are you using Hpricot to "parse" the JSON output received? First, it's not needed, it's also been replaced by Nokogiri, so, *IF* you needed to parse HTML, which you don't, you'd do better with a newer HTML parser. – the Tin Man Oct 23 '13 at 18:41
  • @theTinMan I'm fairly new to JSON (this is my first real app with it), so thanks for the heads up on that. I wasn't sure how to get the JSON from the webpage, and I was using Hpricot elsewhere so used it to do that. How should I be doing it? Using Nokogiri, or some other way? – Andrew Oct 23 '13 at 18:44
  • Don't use Nokogiri or Hpricot. You're requesting a JSON response and that's what you're getting. – the Tin Man Oct 23 '13 at 18:52
  • @theTinMan So what would I pass as an argument into JSON.parse()? It has to be a string of JSON, so how would I get that from the webpage? – Andrew Oct 23 '13 at 19:30
  • See my answer. You're not looking at the data you're getting back. And, it's obvious that the answer you selected *didn't* solve your question. – the Tin Man Oct 23 '13 at 21:23

3 Answers


Judging from the exception, json in the second case is an array containing a single hash. The output of puts json looks the same because puts prints each element of an array on its own line, without brackets, but json["imdb_id"] fails because Array#[] expects an Integer index.

Check with p json or json.is_a?(Array); if it is indeed an array, try json.first['imdb_id'].
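To make the point concrete, here is a minimal sketch that needs no network access. The two response bodies are trimmed stand-ins I made up for illustration (the real API returns many more keys), and puts_output is a hypothetical helper used only to capture what puts would print:

```ruby
require 'json'
require 'stringio'

# Stand-in response bodies (assumed shapes, trimmed to a single key).
object_body = '{"imdb_id": "tt0120338"}'
array_body  = '[{"imdb_id": "tt0120338"}]'

single = JSON.parse(object_body) # a Hash
list   = JSON.parse(array_body)  # an Array holding one Hash

# Hypothetical helper: capture what puts would print so both can be compared.
def puts_output(obj)
  io = StringIO.new
  io.puts(obj)
  io.string
end

# puts flattens arrays, so both values print identically.
puts puts_output(single) == puts_output(list)

# p (or .class) reveals the difference that puts hides.
p single.class
p list.class

# Indexing the Array with a String raises the TypeError from the question.
begin
  list["imdb_id"]
rescue TypeError => e
  puts "#{e.class}: #{e.message}"
end

# Taking the first element gives back the hash.
puts list.first["imdb_id"]
```

This is why comparing the puts output of the two requests was misleading: only p, inspect, or class shows the enclosing array.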

John Topley
akhanubis
  • I don't see this as an array. – joncodo Oct 23 '13 at 18:30
  • `puts [1]` will output the same as `puts 1` (`1`) because it will output each element separated by a new line. From the output given in the question it is impossible to know whether json is the hash shown or actually an array containing the hash shown. Andrew should check `json.class` or `json.inspect` – akhanubis Oct 23 '13 at 18:33
  • The second JSON response IS returning an array. Here's the first one: `doc[0..5] # => "{\"rati"` and the second one: `doc[0..5] # => "[{\"rat"`. – the Tin Man Oct 23 '13 at 18:51

Here's what's wrong and how to fix it:

  • You're NOT getting HTML back, so you do NOT need to parse HTML. Look at the contents of doc in the examples below. Notice, there is no HTML parsing going on, nor is there any need for it because you're specifically requesting a JSON response: type=json.
  • The first request returns a single response because you're asking for a specific imdb_id. You can only get one response back, so you get just a single object/hash.
  • The second request returns an array of responses, because there could be multiple items with the same title and year, though it seems unlikely. As a result, you have to grab a particular item from the returned array after parsing the JSON. I used first, but your mileage may vary: if you get more than one item, you'll have to figure out which is the appropriate one.

Here's some code:

require 'json'
require 'open-uri'

imdb_id = 'tt0120338'
url = "http://mymovieapi.com/?id=#{ imdb_id }&type=json&plot=none&episode=0&lang=en-US&aka=simple&release=simple&business=0&tech=0"
doc = URI.open(url).read # Kernel#open stopped accepting URLs in Ruby 3.0
doc[0..5] # => "{\"rati"
json = JSON.parse(doc) # => {"rating_count"=>493949, "genres"=>["Drama", "Romance"], "rated"=>"PG-13", "language"=>["English", "French", "German", "Swedish", "Italian", "Russian"], "rating"=>7.6, "country"=>["USA"], "release_date"=>19980403, "title"=>"Titanic", "year"=>1997, "filming_locations"=>"Santa Clarita, California, USA", "imdb_id"=>"tt0120338", "directors"=>["James Cameron"], "writers"=>["James Cameron"], "actors"=>["Leonardo DiCaprio", "Kate Winslet", "Billy Zane", "Kathy Bates", "Frances Fisher", "Gloria Stuart", "Bill Paxton", "Bernard Hill", "David Warner", "Victor Garber", "Jonathan Hyde", "Suzy Amis", "Lewis Abernathy", "Nicholas Cascone", "Anatoly M. Sagalevitch"], "also_known_as"=>["Tai tan ni ke hao"], "poster"=>{"imdb"=>"http://ia.media-imdb.com/images/M/MV5BMjExNzM0NDM0N15BMl5BanBnXkFtZTcwMzkxOTUwNw@@._V1_SY317_CR0,0,214,317_.jpg", "cover"=>"http://imdb-poster.b0.upaiyun.com/000/120/338.jpg!cover?_upt=7dedce781382606097"}, "runtime"=>["194 min"], "type"=>"M", "imdb_url"=>"http://www.imdb.com/title/tt0120338/"}
json["imdb_id"] # => "tt0120338"

This returned a single item as a JSON response. Looking at the doc variable shows that it is JSON, NOT HTML. Parsing it with the JSON parser returned a hash.

url = "http://mymovieapi.com/?title=#{ json['title'] }&type=json&plot=simple&episode=0&limit=1&year=#{ json['year'] }&yg=1&mt=none&lang=en-US&offset=&aka=simple&release=simple&business=0&tech=0"
doc = URI.open(url).read # Kernel#open stopped accepting URLs in Ruby 3.0
doc[0..5] # => "[{\"rat"
json = JSON.parse(doc).first # => {"rating_count"=>493949, "genres"=>["Drama", "Romance"], "rated"=>"PG-13", "language"=>["English", "French", "German", "Swedish", "Italian", "Russian"], "rating"=>7.6, "country"=>["USA"], "release_date"=>19980403, "title"=>"Titanic", "year"=>1997, "filming_locations"=>"Santa Clarita, California, USA", "imdb_id"=>"tt0120338", "directors"=>["James Cameron"], "writers"=>["James Cameron"], "actors"=>["Leonardo DiCaprio", "Kate Winslet", "Billy Zane", "Kathy Bates", "Frances Fisher", "Gloria Stuart", "Bill Paxton", "Bernard Hill", "David Warner", "Victor Garber", "Jonathan Hyde", "Suzy Amis", "Lewis Abernathy", "Nicholas Cascone", "Anatoly M. Sagalevitch"], "plot_simple"=>"A seventeen-year-old aristocrat, expecting to be married to a rich claimant by her mother, falls in love with a kind but poor artist aboard the luxurious, ill-fated R.M.S. Titanic.", "poster"=>{"imdb"=>"http://ia.media-imdb.com/images/M/MV5BMjExNzM0NDM0N15BMl5BanBnXkFtZTcwMzkxOTUwNw@@._V1_SY317_CR0,0,214,317_.jpg", "cover"=>"http://imdb-poster.b0.upaiyun.com/000/120/338.jpg!cover?_upt=7dedce781382606097"}, "runtime"=>["194 min"], "type"=>"M", "imdb_url"=>"http://www.imdb.com/title/tt0120338/", "also_known_as"=>["Tai tan ni ke hao"]}
json["imdb_id"] # => "tt0120338"

This request returned an array of hashes, which is reasonable. The JSON string is an array, which is visible in the doc variable. Parsing it, then grabbing just the first element makes it possible to read the value of imdb_id.

Again, notice that there is no HTML parser involved, nor is one needed. Look at the data you're getting back; don't just assume.
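If the same code path has to handle both kinds of request, one option is to normalize the parsed result before reading from it. The following is only a sketch: first_movie is a hypothetical helper name, and the two bodies are trimmed stand-ins for the real responses rather than actual API output.

```ruby
require 'json'

# Hypothetical helper (an illustration, not part of the API): returns the
# first movie hash whether the body is a JSON object or an array of objects.
# Returns nil when the array is empty.
def first_movie(body)
  parsed = JSON.parse(body)
  # Kernel#Array would turn a Hash into key/value pairs, so wrap explicitly.
  movies = parsed.is_a?(Array) ? parsed : [parsed]
  movies.first
end

# Assumed, trimmed response shapes for the two kinds of request.
by_id    = '{"imdb_id": "tt0120338", "title": "Titanic"}'
by_title = '[{"imdb_id": "tt0120338", "title": "Titanic"}]'

puts first_movie(by_id)["imdb_id"]
puts first_movie(by_title)["imdb_id"]
```

Taking the first element is a simplification. If a title and year search can return several films, the caller still has to decide which one is wanted, and has to handle the nil returned for an empty result.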

the Tin Man

The error must be somewhere else in your code from what I can tell.

In IRB, I assigned the output from your second example to the variable a:

1.9.3p194 :043 > a
 => {"rating_count"=>493949, "genres"=>["Drama", "Romance"], "rated"=>"PG-13", "language"=>["English", "French", "German", "Swedish", "Italian", "Russian"], "rating"=>7.6, "country"=>["USA"], "release_date"=>19980403, "title"=>"Titanic", "year"=>1997, "filming_locations"=>"Santa Clarita, California, USA", "imdb_id"=>"tt0120338", "directors"=>["James Cameron"], "writers"=>["James Cameron"], "actors"=>["Leonardo DiCaprio", "Kate Winslet", "Billy Zane", "Kathy Bates", "Frances Fisher", "Gloria Stuart", "Bill Paxton", "Bernard Hill", "David Warner", "Victor Garber", "Jonathan Hyde", "Suzy Amis", "Lewis Abernathy", "Nicholas Cascone", "Anatoly M. Sagalevitch"], "also_known_as"=>["Tai tan ni ke hao"], "poster"=>{"imdb"=>"http://ia.media-imdb.com/images/M/MV5BMjExNzM0NDM0N15BMl5BanBnXkFtZTcwMzkxOTUwNw@@._V1_SY317_CR0,0,214,317_.jpg", "cover"=>"http://imdb-poster.b0.upaiyun.com/000/120/338.jpg!cover?_upt=ec8bdec31382594417"}, "runtime"=>["194 min"], "type"=>"M", "imdb_url"=>"http://www.imdb.com/title/tt0120338/"} 

Then I called:

1.9.3p194 :046 > puts a["imdb_id"]
tt0120338
 => nil 

This gave me the right output.

the Tin Man
joncodo