The NBA Finals and Home Court Advantage

  • Writer: Andrew Giocondi
  • Jan 27, 2022
  • 8 min read

Updated: Jan 30, 2022

Introduction: Defining the Problem


The NBA Finals are where the two best teams in the National Basketball Association that year compete for the Larry O'Brien Championship Trophy. As a sports fanatic, I follow and watch these games every year. This series is the most exciting of the season and, more importantly, the most intriguing, because the winner is often harder to predict as a spectator. There seem to be a lot more surprises and unexpected finishes in these games.


Through not only watching but also playing sports throughout my life, I have always encountered the common belief that the home team has an advantage over the away team. Curious about whether this phenomenon exists and how much of an impact it has on a game, I will be exploring data on the NBA Finals over a span of almost 40 years. The Finals are a best-of-7 series in which the winner is the first team to reach 4 wins. Under the current format, the team with the higher seed or the better record plays at home for games 1, 2, 5, and 7, while the other team plays at home for games 3, 4, and 6. (From 1985 to 2013, the Finals instead used a 2-3-2 format in which the higher seed hosted games 1, 2, 6, and 7.)


The goal of this project is to find out whether home court advantage exists and whether it leads to better performances. It will uncover which statistics improve the most when playing at home in the NBA Finals and how that effect has changed over the years.


The following are questions that will guide the project:

  1. Does the home team win more often compared to the away team in the NBA finals games?

  2. Does home court advantage lead to improvements in team statistics in comparison to the away team?

  3. Has the performance of the NBA finals teams at home changed over time?



The Data


The dataset includes game data for both the championship team and the 2nd place team of every NBA Finals series from 1980 to 2017. I chose 1980 as the starting point because the 3-point line was introduced in the 1979-1980 season. Even though not many 3-point shots were taken in those early years, starting there means every game in the data was played with the 3-point line in place, which keeps the statistics comparable across years.



The file 'champs.csv' contains game-by-game team totals for the championship team from every Finals game between 1980 and 2017. The file 'runnerups.csv' contains game-by-game team totals for the runner-up team (2nd place team) from every Finals game between 1980 and 2017. The data in both files was obtained from Basketball Reference, which is part of the statistical tracking company Sports Reference.


In total, the champion and runner-up data each contain 24 variables and 220 rows. They both contain the same variables and features, including year, team, wins or losses, home or away, game number, and additional statistics regarding each game. These include minutes played, field goal attempts, makes, and percentage, 3-point attempts, makes, and percentage, free throw attempts, makes, and percentage, offensive rebounds, defensive rebounds, total rebounds, assists, steals, blocks, turnovers, personal fouls, and total points scored.



Pre-processing the Data


For the analysis to be reliable, the data first needs to be cleaned. It needs to be organized, free of errors and unneeded elements, and contain everything required to produce the visualizations and complete the analysis.


First, I noticed a few null values. There were 6 in the 3-point percentage column for the champion data and 3 in the same column for the runner-up data, all in the early 1980s. Once I had a deeper understanding of the data, I realized that these were games in which no 3-point shots were attempted. This was understandable, as the 3-point line was newly introduced and was not used frequently until the late 1980s. If no 3-pointers are made in a game, whether any were attempted or not, a 0% 3-point percentage carries the same meaning: none were made. Therefore, I changed the null values to 0 to represent a 0% 3-point percentage. There was also an error in the runner-up data where one game listed 40 minutes played rather than 240. I corrected this directly in the Excel file before importing it.
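The two fixes described above could be sketched roughly as follows. This is only an illustration in plain Python, not the code used for the project: the sample rows are made up, and the column names "TPP" (3-point percentage) and "MP" (minutes played) are assumptions. The minutes check relies on a regulation game being 5 players times 48 minutes, with 25 team minutes added per overtime period.

```python
# Illustrative rows only; the column names "TPP" (3-point percentage) and
# "MP" (minutes played) are assumptions, not taken from the original files.
games = [
    {"Year": 1980, "Game": 1, "TPA": 0, "TPP": None, "MP": 240},
    {"Year": 1984, "Game": 2, "TPA": 3, "TPP": 0.333, "MP": 265},
    {"Year": 1986, "Game": 3, "TPA": 5, "TPP": 0.2, "MP": 40},
]

REGULATION_MINUTES = 240  # 5 players x 48 minutes
OVERTIME_MINUTES = 25     # 5 players x 5 minutes per overtime period


def fill_missing_three_point_pct(rows):
    """Return copies of the rows with a missing 3-point percentage set to 0."""
    return [{**row, "TPP": 0.0 if row["TPP"] is None else row["TPP"]} for row in rows]


def has_plausible_minutes(row):
    """Team minutes should be 240 plus a whole number of 25-minute overtimes."""
    extra = row["MP"] - REGULATION_MINUTES
    return extra >= 0 and extra % OVERTIME_MINUTES == 0


cleaned = fill_missing_three_point_pct(games)
suspect = [row for row in cleaned if not has_plausible_minutes(row)]
print(suspect)  # rows whose minutes look like a data-entry error
```

A check like this only flags suspicious rows; deciding what the correct value should be still takes a look at the source data.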


Next, for the sake of simplicity and readability, I replaced the 0 and 1 values in both the home/away variable and the win/loss variable with descriptive labels. In the home/away column, the 0's represented away games and the 1's represented home games. Now, the column contains either 'Home' or 'Away'. In the win/loss column, the 0's represented losses and the 1's represented wins. Now, the column contains either 'Win' or 'Loss'. This also made the data easier to display in graphs.
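As a rough sketch of this relabeling step, assuming hypothetical column names "Home" and "Win" that hold the 0/1 codes described above:

```python
# The column names "Home" and "Win" are assumptions made for illustration.
HOME_LABELS = {0: "Away", 1: "Home"}
WIN_LABELS = {0: "Loss", 1: "Win"}


def label_row(row):
    """Return a copy of the row with 0/1 codes replaced by readable labels."""
    labeled = dict(row)
    labeled["Home"] = HOME_LABELS[row["Home"]]
    labeled["Win"] = WIN_LABELS[row["Win"]]
    return labeled


coded = [
    {"Year": 1980, "Game": 1, "Home": 1, "Win": 1},
    {"Year": 1980, "Game": 3, "Home": 0, "Win": 0},
]
labeled_games = [label_row(row) for row in coded]
print(labeled_games)
```

Using a dictionary lookup rather than an if/else means any unexpected code, such as a stray 2, raises a KeyError instead of being silently mislabeled.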


Next, I counted the number of home wins, home losses, away wins, and away losses, and used these counts to calculate the percentage of each. This step let me display the home and away differences as frequency percentages, and also to plot the champion and runner-up data together in a single graph.
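One way to compute these counts and percentages is sketched below. It assumes the 'Home'/'Away' and 'Win'/'Loss' labels from the previous step, and it assumes each percentage is taken within a location (home wins out of all home games, and so on). The sample games are invented purely to make the arithmetic easy to follow.

```python
from collections import Counter


def outcome_percentages(rows):
    """Percentage of wins and losses within home games and within away games."""
    counts = Counter((row["Home"], row["Win"]) for row in rows)
    result = {}
    for location in ("Home", "Away"):
        games_played = counts[(location, "Win")] + counts[(location, "Loss")]
        for outcome in ("Win", "Loss"):
            share = 100 * counts[(location, outcome)] / games_played if games_played else 0.0
            result[(location, outcome)] = round(share, 1)
    return result


# Invented sample: 4 home wins, 1 home loss, 3 away wins, 2 away losses.
sample_games = (
    [{"Home": "Home", "Win": "Win"}] * 4
    + [{"Home": "Home", "Win": "Loss"}] * 1
    + [{"Home": "Away", "Win": "Win"}] * 3
    + [{"Home": "Away", "Win": "Loss"}] * 2
)
percentages = outcome_percentages(sample_games)
print(percentages)
```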


Lastly, I wanted a way to represent how well a team played in a game without having to use each game stat individually. I decided to take a simple player efficiency formula and apply it to the team stats. For each game in both the champion data and the runner-up data, I added the total points, total rebounds, assists, steals, and blocks, and then subtracted the number of missed field goals, missed free throws, and turnovers. I added this value to each dataset as a total efficiency column. Then, to remove the advantage of teams that played overtime and therefore recorded higher totals than in other games, I divided the total efficiency value by the number of minutes played. I added this new team efficiency rating (TER) to each dataset. The final formula I used was:


TER = (PTS + REB + AST + STL + BLK − Missed FG − Missed FT − TO) / MP
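The formula can be written as a small function. This is a sketch under assumed column names (PTS, TRB, AST, STL, BLK, FG, FGA, FT, FTA, TOV, MP), with missed shots derived as attempts minus makes; the sample box score is invented.

```python
def team_efficiency_rating(game):
    """Apply the TER formula above to one team's totals for a single game."""
    missed_fg = game["FGA"] - game["FG"]
    missed_ft = game["FTA"] - game["FT"]
    total_efficiency = (
        game["PTS"] + game["TRB"] + game["AST"] + game["STL"] + game["BLK"]
        - missed_fg - missed_ft - game["TOV"]
    )
    # Dividing by minutes keeps overtime games comparable to regulation games.
    return total_efficiency / game["MP"]


# Invented box score; column names are assumptions for illustration.
regulation_game = {
    "PTS": 100, "TRB": 40, "AST": 20, "STL": 8, "BLK": 5,
    "FG": 38, "FGA": 80, "FT": 18, "FTA": 24, "TOV": 13, "MP": 240,
}
overtime_game = {**regulation_game, "MP": 265}

print(team_efficiency_rating(regulation_game))
print(team_efficiency_rating(overtime_game))
```

With identical totals, the overtime game receives a lower rating, which is the intended effect of dividing by minutes played.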



Data Understanding and Visualization


To answer the research questions, I created a variety of visualizations along with corresponding statistics. It is important to note that most of the graphs are presented in pairs, with one representing the champion teams and the other representing the 2nd place teams. I kept these separate for simplicity, but also because the champion teams have more wins and are therefore more likely to have better statistics than the 2nd place teams. Combining them could have skewed the comparison and made it harder to analyze.



Examining the advantages of playing at home:

These graphs display the number of wins and losses at home and away. We can quickly see that the champion teams have significantly more wins than losses at home, while the split is closer to even on the road. For the 2nd place teams, unsurprisingly, there are more losses than wins. However, their record is better at home than away. For the champion teams, the gap between wins and losses is just over 3 times larger at home than away. For the 2nd place teams, the gap is just over 3 times smaller at home, meaning that their wins are closer to their losses there. It is also no surprise that the graphs are approximately mirror images: whenever one team records a win, its opponent records a loss.

This graph is similar to the previous ones. However, it displays the percentage of each situation occurring, with the champion team and 2nd place team results on the same plot. It shows that the most common result for a game in the NBA Finals is for the home team to win and the away team to lose.


The calculated percentages for the champion teams were about 81% for home wins, 19% for home losses, 61% for away wins, and 39% for away losses. The values for the 2nd place teams mirror these, since a champion home win is a 2nd place away loss: about 19% for away wins, 81% for away losses, 39% for home wins, and 61% for home losses.



Observing any improvements in game statistics when a team is playing at home:

These boxplots summarize all field goal percentages while also showing the distribution of the exact values. Field goal percentage was one of the variables most highly correlated with whether a team won or lost. Visualizing the overall field goal percentages was a priority, as the ability to make shots is crucial to a team's success. It is apparent that, for both the champion and 2nd place teams, the field goal percentage is higher when playing at home. Although the range of values is wide for home games and engulfs most of the values for away games, the difference can be seen in the interquartile range, which contains the middle 50% of the data, and in the mean. It is interesting to see that the difference is larger for the 2nd place teams than for the champion teams. As expected, the champion teams have higher field goal percentages.
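The numbers behind a boxplot comparison like this one can be computed directly. The sketch below uses Python's statistics module to get the quartiles, median, and mean of a statistic for home and away games. The column name "FGP" and the sample percentages are invented for illustration and are not values from the dataset.

```python
from statistics import mean, quantiles


def box_summary(values):
    """Five-number summary plus the mean, as a boxplot would display them."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values), "q1": q1, "median": median,
        "q3": q3, "max": max(values), "mean": mean(values),
    }


def summarize_by_location(rows, stat):
    """Group one statistic by the 'Home'/'Away' label and summarize each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["Home"], []).append(row[stat])
    return {location: box_summary(values) for location, values in groups.items()}


# Invented field goal percentages (in percent); "FGP" is an assumed column name.
sample_rows = (
    [{"Home": "Home", "FGP": v} for v in (42, 45, 47, 49, 52)]
    + [{"Home": "Away", "FGP": v} for v in (40, 43, 44, 46, 47)]
)
summary = summarize_by_location(sample_rows, "FGP")
print(summary["Home"])
print(summary["Away"])
```

Comparing the "q1" to "q3" span and the mean for each location gives the same reading as comparing the boxes by eye.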

These plots show the difference in distribution for total points scored. Total points was another variable highly correlated with a team's wins and losses. Again, the plots show that playing at home in the NBA Finals comes with a statistical advantage for both the champion and 2nd place teams. The overall range as well as the mean is higher for games played at home. The distribution of points scored is more clustered around the mean for the 2nd place teams, while the champion teams have generally scored more in this sample of games. Overall, points are another common measure of team success, and scoring more increases the probability of winning.

The last set of boxplots illustrates the difference between team efficiency ratings. This is likely the most noteworthy representation of team success, as it takes many statistical factors into consideration. As in the previous visualizations, playing on the home court is advantageous in the NBA Finals. The minimum and maximum values, as well as the mean team efficiency rating, are higher at home. They also seem to be slightly higher for the champion teams. As in the plots of total points scored, the values are more clustered around the mean for the 2nd place teams.



Observing any improvements in game statistics over time when a team is playing at home:

Using the team efficiency rating again, we can view not only the superiority of playing at home but also the trends from 1980 to 2017. There are only a few years where the rating at home was lower than away. In almost every year, then, teams in the NBA Finals performed better at home. It is also surprising to see that the champion teams and 2nd place teams follow approximately the same trend over time. The overall team efficiency rating for both seemed to dip from around 1985 to the early 2000s.
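A year-by-year comparison like this could be produced along the following lines. This is a sketch only: it assumes each game row already carries a "TER" value and a 'Home'/'Away' label, and the sample ratings are invented.

```python
from collections import defaultdict
from statistics import mean


def yearly_home_away_ter(rows):
    """Average TER per year, split into home and away games."""
    by_key = defaultdict(list)
    for row in rows:
        by_key[(row["Year"], row["Home"])].append(row["TER"])
    years = sorted({year for year, _ in by_key})
    return {
        year: {
            "Home": mean(by_key[(year, "Home")]),
            "Away": mean(by_key[(year, "Away")]),
        }
        for year in years
        if (year, "Home") in by_key and (year, "Away") in by_key
    }


def years_home_worse(yearly):
    """Years in which the average home rating fell below the away rating."""
    return [year for year, avg in yearly.items() if avg["Home"] < avg["Away"]]


# Invented ratings for two seasons.
sample_ratings = [
    {"Year": 1980, "Home": "Home", "TER": 0.50},
    {"Year": 1980, "Home": "Home", "TER": 0.54},
    {"Year": 1980, "Home": "Away", "TER": 0.46},
    {"Year": 1981, "Home": "Home", "TER": 0.40},
    {"Year": 1981, "Home": "Away", "TER": 0.44},
    {"Year": 1981, "Home": "Away", "TER": 0.48},
]
yearly = yearly_home_away_ter(sample_ratings)
print(years_home_worse(yearly))
```

Running this separately on the champion and runner-up data would give the two lines per team that the trend comparison relies on.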



Conclusion: What’s the story?


Following the creation of the visualizations, there are many takeaways. I was able to answer each of the research questions stated in the introduction. In the NBA Finals, playing at home is clearly advantageous. The number of wins and losses, along with the percentages of each situation, show substantial differences between playing at home and away. A team is more likely to win at home whether it is the eventual champion or the 2nd place team. Moreover, an NBA Finals team is likely to perform better on its home court. Not only do points scored and field goal percentages tend to be higher, but overall team performance and efficiency are likely to be better as well, leading to a higher chance of winning. Over time, the narrative of home-court dominance stayed consistent. Although the team efficiency rating dipped around the same time for the champion and 2nd place teams, teams performed better at home in almost every year.


After analyzing games in the NBA Finals, it is safe to say that playing at home helps. Home court is influential even on the biggest and most consequential stage of a basketball season.



Code and References


The code for this project:


The following resources were used in the process of this project.




 
 
 
