I want to download a movie dump containing basic information, such as movie names and lists of actors, in a single file. I looked at a couple of options, such as http://api.themoviedb.org/2.1/. TheMovieDB does not offer a bulk download. IMDB has the data, but it seems to be scattered across several files. Moreover, I cannot figure out how to stitch together the separate files for actors, movie names, etc., as they don't seem to share any common keys. Let me know if I am missing something here.

Could someone please let me know how to go about downloading the movie data set?

Brian Tompsett - 汤莱恩
invinc4u
  • Sorry, this site is about *programming* questions, not "where do I find xyz". – Has QUIT--Anony-Mousse Sep 06 '12 at 06:31
  • @Anony-Mousse : I need this large movie data set because I have some app in mind where I want to apply data mining principles. I had even tagged my post properly with respect to the same. So, I guess you would want to reconsider the validity of my question with respect to programming. – invinc4u Sep 07 '12 at 03:34
  • Well, once you have an actual data-mining question, that question is a better fit. – Has QUIT--Anony-Mousse Sep 07 '12 at 07:59
  • @Anony-Mousse : I apologize if the above comment sounded a bit rude to you. But the point is I can't get ahead with my idea till I have a data set to play with. Just to give a bit of programming touch to this discussion, I did try web scraping Wikipedia pages to extract movie information. But there is a limit on the number of requests you can fire to such sites. Also, that data is not completely exhaustive and would take a lot of time to generate. And thus this question !! I hope you get my motivation of asking this question and its relevance to my project. – invinc4u Sep 07 '12 at 23:09
  • Instead of web scraping, try downloading Wikipedia. There is even an RDF version called DBpedia, which is a *parsed* version of Wikipedia. Try working with these. – Has QUIT--Anony-Mousse Sep 08 '12 at 08:55

1 Answer

You could use Freebase to download movies and actors in JSON format. See the API wiki for more information.

For example, the query:

GET https://www.googleapis.com/freebase/v1/mqlread?query=[{%22type%22:%22/film/actor%22,%22id%22:null,%22name%22:null}]

will return:

{
  "result": [{
    "type": "/film/actor",
    "id": "/en/milla_jovovich",
    "name": "Milla Jovovich"
  }, {
    "type": "/film/actor",
    "id": "/en/angus_macfadyen",
    "name": "Angus Macfadyen"
  }, {
    "type": "/film/actor",
    "id": "/en/aisha_tyler",
    "name": "Aisha Tyler"
  }, {
    "type": "/film/actor",
    "id": "/en/stephen_dorff",
    "name": "Stephen Dorff"
  }, {
    "type": "/film/actor",
    "id": "/en/vincent_laresca",
    "name": "Vincent Laresca"
  }, {
    "type": "/film/actor",
    "id": "/en/dawn_greenhalgh",
    "name": "Dawn Greenhalgh"
  }, {
    "type": "/film/actor",
    "id": "/en/nola_augustson",
    "name": "Nola Augustson"
  }, {
    "type": "/film/actor",
    "id": "/en/dudley_moore",
    "name": "Dudley Moore"
  }, {
    "type": "/film/actor",
    "id": "/en/julie_andrews",
    "name": "Julie Andrews"
  }, {
    "type": "/film/actor",
    "id": "/en/bo_derek",
    "name": "Bo Derek"
  }, {
    "type": "/film/actor",
    "id": "/en/robert_webber",
    "name": "Robert Webber"
  }, {
    "type": "/film/actor",
    "id": "/en/dee_wallace-stone",
    "name": "Dee Wallace-Stone"
  }, {
    "type": "/film/actor",
    "id": "/en/ryan_phillippe",
    "name": "Ryan Phillippe"
  }, {
    "type": "/film/actor",
    "id": "/en/salma_hayek",
    "name": "Salma Hayek"
  }, {
    "type": "/film/actor",
    "id": "/en/neve_campbell",
    "name": "Neve Campbell"
  }, {
    "type": "/film/actor",
    "id": "/en/mike_myers",
    "name": "Mike Myers"
  }, {
    "type": "/film/actor",
    "id": "/en/satoshi_tsumabuki",
    "name": "Satoshi Tsumabuki"
  }, {
    "type": "/film/actor",
    "id": "/en/masanobu_ando",
    "name": "Masanobu Ando"
  }, {
    "type": "/film/actor",
    "id": "/en/david_gahan",
    "name": "Dave Gahan"
  }, {
    "type": "/film/actor",
    "id": "/en/martin_gore",
    "name": "Martin Gore"
  }, {
    "type": "/film/actor",
    "id": "/en/andrew_fletcher_1961",
    "name": "Andrew Fletcher"
  }, {
    "type": "/film/actor",
    "id": "/en/alan_wilder",
    "name": "Alan Wilder"
  }, {
    "type": "/film/actor",
    "id": "/en/gerard_butler",
    "name": "Gerard Butler"
  }, {
    "type": "/film/actor",
    "id": "/en/lena_headey",
    "name": "Lena Headey"
  }, {
    "type": "/film/actor",
    "id": "/en/david_wenham",
    "name": "David Wenham"
  }, {
    "type": "/film/actor",
    "id": "/en/robert_de_niro",
    "name": "Robert De Niro"
  }, {
    "type": "/film/actor",
    "id": "/en/gerard_depardieu",
    "name": "G\u00e9rard Depardieu"
  }, {
    "type": "/film/actor",
    "id": "/en/dominique_sanda",
    "name": "Dominique Sanda"
  }, {
    "type": "/film/actor",
    "id": "/en/john_belushi",
    "name": "John Belushi"
  }, {
    "type": "/film/actor",
    "id": "/en/ned_beatty",
    "name": "Ned Beatty"
  }, {
    "type": "/film/actor",
    "id": "/en/dan_aykroyd",
    "name": "Dan Aykroyd"
  }, {
    "type": "/film/actor",
    "id": "/en/lorraine_gary",
    "name": "Lorraine Gary"
  }, {
    "type": "/film/actor",
    "id": "/en/murray_hamilton",
    "name": "Murray Hamilton"
  }, {
    "type": "/film/actor",
    "id": "/en/robert_downey_jr",
    "name": "Robert Downey Jr."
  }, {
    "type": "/film/actor",
    "id": "/en/kiefer_sutherland",
    "name": "Kiefer Sutherland"
  }, {
    "type": "/film/actor",
    "id": "/en/winona_ryder",
    "name": "Winona Ryder"
  }, {
    "type": "/film/actor",
    "id": "/en/john_hurt",
    "name": "John Hurt"
  }, {
    "type": "/film/actor",
    "id": "/en/richard_burton",
    "name": "Richard Burton"
  }, {
    "type": "/film/actor",
    "id": "/en/suzanna_hamilton",
    "name": "Suzanna Hamilton"
  }, {
    "type": "/film/actor",
    "id": "/en/cyril_cusack",
    "name": "Cyril Cusack"
  }, {
    "type": "/film/actor",
    "id": "/en/gregor_fisher",
    "name": "Gregor Fisher"
  }, {
    "type": "/film/actor",
    "id": "/en/tony_leung_chiu_wai",
    "name": "Tony Leung Chiu Wai"
  }, {
    "type": "/film/actor",
    "id": "/en/gong_li",
    "name": "Gong Li"
  }, {
    "type": "/film/actor",
    "id": "/en/faye_wong",
    "name": "Faye Wong"
  }, {
    "type": "/film/actor",
    "id": "/en/takuya_kimura",
    "name": "Takuya Kimura"
  }, {
    "type": "/film/actor",
    "id": "/en/zhang_ziyi",
    "name": "Zhang Ziyi"
  }, {
    "type": "/film/actor",
    "id": "/en/carina_lau",
    "name": "Carina Lau"
  }, {
    "type": "/film/actor",
    "id": "/en/chang_chen",
    "name": "Chang Chen"
  }, {
    "type": "/film/actor",
    "id": "/en/bird_mcintyre",
    "name": "Bird McIntyre"
  }, {
    "type": "/film/actor",
    "id": "/en/maggie_cheung",
    "name": "Maggie Cheung"
  }, {
    "type": "/film/actor",
    "id": "/en/chevy_chase",
    "name": "Chevy Chase"
  }, {
    "type": "/film/actor",
    "id": "/en/steve_martin",
    "name": "Steve Martin"
  }, {
    "type": "/film/actor",
    "id": "/en/martin_short",
    "name": "Martin Short"
  }, {
    "type": "/film/actor",
    "id": "/en/joe_mantegna",
    "name": "Joe Mantegna"
  }, {
    "type": "/film/actor",
    "id": "/en/jon_lovitz",
    "name": "Jon Lovitz"
  }, {
    "type": "/film/actor",
    "id": "/en/alfonso_arau",
    "name": "Alfonso Arau"
  }, {
    "type": "/film/actor",
    "id": "/en/tony_plana",
    "name": "Tony Plana"
  }, {
    "type": "/film/actor",
    "id": "/en/al_pacino",
    "name": "Al Pacino"
  }, {
    "type": "/film/actor",
    "id": "/en/carmen_maura",
    "name": "Carmen Maura"
  }, {
    "type": "/film/actor",
    "id": "/en/luis_hostalot",
    "name": "Luis Hostalot"
  }, {
    "type": "/film/actor",
    "id": "/en/veronica_forque",
    "name": "Veronica Forqu\u00e9"
  }, {
    "type": "/film/actor",
    "id": "/en/hume_cronyn",
    "name": "Hume Cronyn"
  }, {
    "type": "/film/actor",
    "id": "/en/jessica_tandy",
    "name": "Jessica Tandy"
  }, {
    "type": "/film/actor",
    "id": "/en/frank_mcrae",
    "name": "Frank McRae"
  }, {
    "type": "/film/actor",
    "id": "/en/elizabeth_pena",
    "name": "Elizabeth Pe\u00f1a"
  }, {
    "type": "/film/actor",
    "id": "/en/dennis_boutsikaris",
    "name": "Dennis Boutsikaris"
  }, {
    "type": "/film/actor",
    "id": "/en/hal_warren",
    "name": "Hal Warren"
  }, {
    "type": "/film/actor",
    "id": "/en/tom_neyman",
    "name": "Tom Neyman"
  }, {
    "type": "/film/actor",
    "id": "/en/john_reynolds_1941",
    "name": "John Reynolds"
  }, {
    "type": "/film/actor",
    "id": "/en/rajnikanth",
    "name": "Rajnikanth"
  }, {
    "type": "/film/actor",
    "id": "/en/sridevi_kapoor",
    "name": "Sridevi Kapoor"
  }, {
    "type": "/film/actor",
    "id": "/en/kantimathi",
    "name": "Kantimathi"
  }, {
    "type": "/film/actor",
    "id": "/en/konkona_sen_sharma",
    "name": "Konkona Sen Sharma"
  }, {
    "type": "/film/actor",
    "id": "/en/shabana_azmi",
    "name": "Shabana Azmi"
  }, {
    "type": "/film/actor",
    "id": "/en/soumitra_chatterjee",
    "name": "Soumitra Chatterjee"
  }, {
    "type": "/film/actor",
    "id": "/en/waheeda_rehman",
    "name": "Waheeda Rehman"
  }, {
    "type": "/film/actor",
    "id": "/en/rahul_bose",
    "name": "Rahul Bose"
  }, {
    "type": "/film/actor",
    "id": "/en/william_hopper",
    "name": "William Hopper"
  }, {
    "type": "/film/actor",
    "id": "/en/joan_taylor",
    "name": "Joan Taylor"
  }, {
    "type": "/film/actor",
    "id": "/en/frank_puglia",
    "name": "Frank Puglia"
  }, {
    "type": "/film/actor",
    "id": "/en/james_garner",
    "name": "James Garner"
  }, {
    "type": "/film/actor",
    "id": "/en/rod_taylor_1930",
    "name": "Rod Taylor"
  }, {
    "type": "/film/actor",
    "id": "/en/eva_marie_saint",
    "name": "Eva Marie Saint"
  }, {
    "type": "/film/actor",
    "id": "/en/paul_walker",
    "name": "Paul Walker"
  }, {
    "type": "/film/actor",
    "id": "/en/eva_mendes",
    "name": "Eva Mendes"
  }, {
    "type": "/film/actor",
    "id": "/en/devon_aoki",
    "name": "Devon Aoki"
  }, {
    "type": "/film/actor",
    "id": "/en/john_payne_1912",
    "name": "John Payne"
  }, {
    "type": "/film/actor",
    "id": "/en/evelyn_keyes",
    "name": "Evelyn Keyes"
  }, {
    "type": "/film/actor",
    "id": "/en/brad_dexter",
    "name": "Brad Dexter"
  }, {
    "type": "/film/actor",
    "id": "/en/frank_faylen",
    "name": "Frank Faylen"
  }, {
    "type": "/film/actor",
    "id": "/en/peggie_castle",
    "name": "Peggie Castle"
  }, {
    "type": "/film/actor",
    "id": "/en/jean-hugues_anglade",
    "name": "Jean-Hugues Anglade"
  }, {
    "type": "/film/actor",
    "id": "/en/beatrice_dalle",
    "name": "B\u00e9atrice Dalle"
  }, {
    "type": "/film/actor",
    "id": "/en/vincent_lindon",
    "name": "Vincent Lindon"
  }, {
    "type": "/film/actor",
    "id": "/en/dominique_pinon",
    "name": "Dominique Pinon"
  }, {
    "type": "/film/actor",
    "id": "/en/joaquin_phoenix",
    "name": "Joaquin Phoenix"
  }, {
    "type": "/film/actor",
    "id": "/en/james_gandolfini",
    "name": "James Gandolfini"
  }, {
    "type": "/film/actor",
    "id": "/en/catherine_keener",
    "name": "Catherine Keener"
  }, {
    "type": "/film/actor",
    "id": "/en/norman_reedus",
    "name": "Norman Reedus"
  }, {
    "type": "/film/actor",
    "id": "/en/dean_martin",
    "name": "Dean Martin"
  }]
}

Similarly, you would request:

https://www.googleapis.com/freebase/v1/mqlread?query=[{%22type%22:%22/film/film%22,%22id%22:null,%22name%22:null}]

to fetch movie titles.
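
As a rough sketch, the two queries above could be built and their responses flattened in Python along the following lines. The helper names `build_mqlread_url` and `extract_names` are made up for this illustration and are not part of any Freebase client library. The endpoint and the query shape are copied from the examples above, and the code only constructs the URL and parses a response body; it does not send the request.

```python
import json
from urllib.parse import quote

# Endpoint copied from the example queries above.
MQLREAD_URL = "https://www.googleapis.com/freebase/v1/mqlread"


def build_mqlread_url(type_id):
    """Return an mqlread URL asking for the id and name of topics of type_id.

    Hypothetical helper for illustration only.
    """
    # A null value (None in Python) asks the service to fill that field in.
    query = [{"type": type_id, "id": None, "name": None}]
    # Compact separators keep the URL short; quote() percent-encodes the JSON.
    encoded = quote(json.dumps(query, separators=(",", ":")), safe="")
    return MQLREAD_URL + "?query=" + encoded


def extract_names(response_text):
    """Map each id in a response body shaped like the one above to its name.

    Hypothetical helper for illustration only.
    """
    payload = json.loads(response_text)
    return {item["id"]: item["name"] for item in payload["result"]}


# One entry taken from the sample response above.
sample = (
    '{"result": [{"type": "/film/actor", '
    '"id": "/en/milla_jovovich", "name": "Milla Jovovich"}]}'
)

print(build_mqlread_url("/film/film"))
print(extract_names(sample))
```

Unlike the hand-written URLs above, `quote` also percent-encodes characters such as `/`, `[` and `{`. The query is the same once the server decodes it.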

BioGeek
  • Thanks for the information, BioGeek :) But I still don't see how I can use this to connect movies and actors. Any thoughts on that? I can query actors and I can query movies, but the information I am looking for is movies and their corresponding actors. – invinc4u Sep 14 '12 at 03:45