arxiv:2005.14147

IMDb data from Two Generations, from 1979 to 2019; Part one, Dataset Introduction and Preliminary Analysis

Published on May 28, 2020

Abstract

IMDb, a user-regulated portal and one of the most visited, has provided an opportunity to create an enormous database. Analysis of the information on the Internet Movie Database (IMDb), whether related to the movies themselves or provided by users, would help reveal the factors that determine each movie's route to success. Because a comprehensive dataset was lacking, we set out to create a compendious dataset for later analysis using statistical methods and machine learning models. It comprises various information provided on IMDb, such as rating data, genre, cast and crew, MPAA rating certificate, parental guide details, related movie information, posters, etc., for over 79k titles, which is the largest dataset to date. The present paper is the first in a series aiming at these goals. It describes the created dataset and presents a preliminary analysis, including some trends in the data and a demographic analysis of IMDb scores and their relation to genre and MPAA rating certificate.
