
Recently, I have been using the IMDbPY API to scrape the IMDb dataset. The package includes a script, imdbpy2sql.py, which converts the IMDb movie dataset into a SQL database. However, I cannot find any description of this dataset, so I do not understand the schema of the resulting SQL database, which contains a large number of tables. Is there any way to find out what the schema is?

I strictly followed this guide to build my database: http://blog.secaserver.com/2013/08/importing-imdb-sample-data-set-mysql/.

Thanks so much!!

Zizhao

1 Answer


I doubt that there are too many tables. There are simply a lot of properties/relationships available.

I generated this image once while creating pyIRDG. You can have a look at that code too for documentation on the available data. Here is the output of the comments: http://pastebin.com/zGnZ02w4

I've also used MySQL Workbench to generate a schema from the db.

There is also this German blog article with an ERM image.

"not that I'm aware of, and for sure our db is not in any NF. :-) Anyway, you can easily look at the scheme in the imdb/parser/sql/dbschema.py module or using some tool directly on the database." Source.
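The suggestion to inspect the schema "using some tool directly on the database" can be sketched in Python. This is only a minimal sketch, assuming the data was imported into a SQLite file; for MySQL you would query `information_schema` instead. The `describe_schema` helper and the two toy tables are hypothetical stand-ins so that the snippet runs on its own; replace `":memory:"` with the path to your own database file.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite connection."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in cols]
    return schema


# Toy stand-in for a real database (assumption: not the full IMDbPY schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE name (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT, production_year INTEGER);
    """
)

schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```

Running this against the real database lists every table with its columns, which can then be compared with the definitions in imdb/parser/sql/dbschema.py.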


Glorfindel
ofthelit
  • Thanks so much for your carefulness!! It is exactly what I need. Did you ever use IMDbPY for movie information scraping? – Zizhao Sep 21 '14 at 03:19
  • No, I only used the provided IMDb datasets. – ofthelit Sep 21 '14 at 11:35
  • @ofthelit Very old subject, but I still have a question: the table aka_name looks a bit useless. What do you reckon? – Dirty_Fox Nov 20 '15 at 11:32
  • Depends on your use case. It could be useful for having better matches when searching for people. – ofthelit Nov 22 '15 at 13:29
  • I know it's an old post, but I've come across a question about this particular diagram and want to know how to find all movies of a particular person/actor. I just want the relevant tables for that task. Thank you in advance. – Harshil Doshi Oct 28 '17 at 15:38
  • Use these tables: name, cast_info and title – ofthelit Oct 30 '17 at 12:30
  • Please edit the externally hosted code into the post; doing so will make sure it remains useful even if the link breaks. My script [is not allowed to do this](https://meta.stackoverflow.com/a/344512/4751173) because of potential licensing problems. – Glorfindel Sep 21 '19 at 15:34
  • Glorfindel, what code are you referring to? The 'Source' link at the bottom refers to 'de bron' of the quote. – ofthelit Sep 23 '19 at 12:42
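
For the question in the comments about finding all movies of a particular person, the three tables named in the reply (name, cast_info and title) could be joined roughly as follows. This is a minimal sketch on an in-memory SQLite database, not a definitive query: the column names (`name.id`, `cast_info.person_id`, `cast_info.movie_id`, `title.id`) are assumptions about the generated schema and should be checked against dbschema.py, the `movies_of` helper is hypothetical, and the rows are placeholder data.

```python
import sqlite3

# Toy tables with assumed column names; the real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE name (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE cast_info (
        id INTEGER PRIMARY KEY,
        person_id INTEGER,
        movie_id INTEGER
    );
    INSERT INTO name VALUES (1, 'Person A'), (2, 'Person B');
    INSERT INTO title VALUES (10, 'Movie X'), (11, 'Movie Y'), (12, 'Movie Z');
    INSERT INTO cast_info (person_id, movie_id) VALUES (1, 10), (1, 12), (2, 11);
    """
)


def movies_of(conn, person_name):
    """Return the sorted, de-duplicated titles linked to person_name via cast_info."""
    rows = conn.execute(
        """
        SELECT DISTINCT t.title
        FROM name AS n
        JOIN cast_info AS ci ON ci.person_id = n.id
        JOIN title AS t ON t.id = ci.movie_id
        WHERE n.name = ?
        ORDER BY t.title
        """,
        (person_name,),
    ).fetchall()
    return [title for (title,) in rows]


print(movies_of(conn, "Person A"))
```

The `DISTINCT` is there because a person may appear in cast_info more than once for the same title, for example under different roles.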