Analyzing Opportunities at the Box Office

What types of films do the best at the box office? And what types of films would be the best for a new film studio to focus on creating? An analysis.

Jessica Forrest-Baldini
Analytics Vidhya


Avatar from Twentieth Century Fox

What defines success at the box office? Is it the most ticket sales (highest gross revenue), the highest ROI (return on investment), or the highest profit?

I decided to take a look at all three.

First, I wanted to see whether there was a relationship between production budget and gross revenue.

There appears to be a positive correlation between production budget and gross revenue across a variety of genres and budgets. This makes sense: a higher budget pays for better actors, writers, producers, film sets and everything else that goes into making a movie. While some high-budget films do tank (think Marvel’s Dark Phoenix) and some low-budget films do extraordinarily well (think The Blair Witch Project), the overall relationship between budget and gross revenue is roughly linear.
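To make the idea of a linear relationship concrete, here is a minimal sketch of an ordinary least-squares fit of gross revenue on budget. The data points are invented for illustration, and `fit_line` is a hypothetical helper, not code from the original analysis.

```python
# Toy (production_budget, worldwide_gross) pairs in millions of USD.
# These numbers are invented for illustration, not taken from the dataset.
movies = [(10, 35), (50, 120), (100, 310), (150, 390), (200, 640)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(movies)
print(f"gross ~ {slope:.2f} * budget + {intercept:.2f}")
```

A positive slope means that, on average, each extra dollar of budget goes with more gross revenue. It says nothing about how far individual films scatter around the line.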

ROI

Let’s correct for budget and see which genres of films offer the greatest ROI, or return on investment. Here I averaged ROI across all movies in a genre. Note that most movies have multiple genres, so I counted each movie once in every genre it belongs to.
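The averaging described above can be sketched roughly as follows. This assumes ROI is defined as (gross - budget) / budget, which the article does not state explicitly. The records, field names and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records; titles, genres and dollar figures are made up.
movies = [
    {"title": "A", "genres": ["Horror", "Thriller"],
     "budget": 1_000_000, "gross": 6_000_000},
    {"title": "B", "genres": ["Action", "Adventure"],
     "budget": 100_000_000, "gross": 300_000_000},
    {"title": "C", "genres": ["Horror"],
     "budget": 2_000_000, "gross": 4_000_000},
]


def roi(budget, gross):
    # Assumed definition: net return divided by cost.
    return (gross - budget) / budget


def average_roi_by_genre(records):
    """Average ROI per genre, counting a movie once in each of its genres."""
    per_genre = defaultdict(list)
    for movie in records:
        value = roi(movie["budget"], movie["gross"])
        for genre in movie["genres"]:
            per_genre[genre].append(value)
    return {genre: sum(vals) / len(vals) for genre, vals in per_genre.items()}


genre_roi = average_roi_by_genre(movies)
# Print genres from highest to lowest average ROI.
for genre, value in sorted(genre_roi.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{genre}: {value:.2f}")
```

Because a multi-genre movie contributes to every genre it is tagged with, the genre averages are not independent of one another.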

We can see that the genres with the highest average ROI are Horror, Mystery and Thriller. These may offer the greatest ROI, but do they also offer the greatest profit or gross revenue from ticket sales? And if not, which genres do?

Profit

We can see here that the highest-profiting genres are different, and that profit appears to be closely correlated with gross revenue from ticket sales and with production budget. The highest-ROI genres tend to have much lower average production budgets, whereas the highest-profiting and highest-grossing genres tend to have much higher budgets.

Let’s take a look at the correlation plots for these relationships. These data are averaged by genre, the same as the data above.

Production Budget

The Pearson r-values for these plots are r=0.9815, r=0.9654 and r=-0.2115, respectively.

Across genres, both worldwide gross and gross revenue are strongly correlated with production budget. There is no significant correlation between ROI and production budget.
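For reference, Pearson’s r can be computed directly from its definition. The sketch below uses invented genre-level averages rather than the real data, and `pearson_r` is a hypothetical helper written for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented per-genre averages in millions of USD (not the article's data).
avg_budget = [15, 20, 35, 60, 90]
avg_gross = [50, 45, 110, 170, 300]

r_gross = pearson_r(avg_budget, avg_gross)
print(f"budget vs gross: r={r_gross:.4f}")
```

A value of r near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or no linear relationship. With only a handful of genre averages, a moderate r should be read cautiously.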

Popularity

So why is so much more money spent on genres of films that offer lower average returns on investment? I took a look at the popularity data from TMDb. Note that these are not box office data, but a score calculated from clicks and saves for a given movie on the TMDb website on a given day. TMDb explains how it calculates the score.

Popularity is correlated with production budget (r=0.5997), worldwide gross (r=0.6292) and profit (r=0.5931).

We can see a significant correlation between which genres are the most popular and which genres movie studios invest the most in (production budget), which sell the most at the box office (worldwide gross), and which earn the most profit.

So it makes sense that movie studios invest so much in these films: while lower in ROI, they offer a greater average reward.

Keep in mind that most movies have multiple genres. The biggest box office hits tend to be a combination of 3–4 genres, with at least 2–3 coming from the highest-grossing genres. Think Avatar, which was Action, Adventure, Fantasy and Sci-Fi.

Conclusion

If I were advising a new film studio on what types of films to focus on, my recommendation would depend on budget. With a low budget, I would recommend the genres with the highest ROI: Horror, Mystery and Thriller. With a high budget, I would recommend the highest-grossing, highest-profiting and most popular genres: Adventure, Action, Fantasy, Family and Animation, the last of which was the highest profiting. Note again that most movies are a combination of genres.

Further Research

It would also be worth examining whether the number of popular lead actors contributes to a movie’s success, and whether the time of year a movie is released does.

Data Used for this Analysis

  1. The-Numbers.com
  2. https://developers.themoviedb.org/3/discover/movie-discover
  3. https://www.imdb.com/interfaces

Software Used for this Analysis

  1. Python
  2. Jupyter Notebooks

Jessica Forrest-Baldini
Analytics Vidhya

Data Scientist who’s passionate about cleantech, product development, startups and the environment.