Technion researchers have developed a new method for predicting basketball player performance. It is based on analyzing a player’s past performance and pre-game interviews.
A computational method developed at the Technion significantly improves the prediction of basketball players’ performance. The study was led by doctoral students Amir Feder and Nadav Oved under the supervision of Professor Roi Reichart of the William Davidson Faculty of Industrial Engineering & Management.
Predicting an athlete’s performance is a challenge that researchers around the world have long pursued, using tools from psychology, statistics, computer science, and more. Until now, predictions have relied mainly on a single, limited factor: the athlete’s past performance. The Technion researchers have added a new predictive factor: “out-of-game” information, specifically transcripts of pre-game interviews with the players. The concept and study have been published in the journal Computational Linguistics.
The researchers hypothesized that pre-game interviews contain important information that can improve predictions about a player’s behavior and performance in an upcoming game. The rationale is that a given player’s in-game behavior is very difficult to predict, as the activity takes place in a complex and dynamic space. Performance is influenced by the environment, rational decisions, and internal emotions. In turn, the dynamic environment at a game also influences those emotions. These dynamics cannot be predicted solely based on past performance.
The study was based on a dataset of pre-game and post-game media interviews paired with in-game performance metrics from the game following each interview. The dataset comprised 5,226 interview-performance pairs from 36 prominent NBA players. For each pair, the researchers examined the relationship between the interview and the performance, measured as the correlation between the interview transcript and deviations in the performance indicators in the game: risk-taking, behavior, and strategic decisions. An example of risk-taking is attempting a long-range (three-point) shot. An example of behavior and strategy is the choice of defensive approach.
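One way to make the notion of a "deviation" concrete is to compare a player's metric in a given game with that player's average over earlier games. The sketch below does only that. The article does not say how the study computed deviations, so the function names, the use of a simple mean, and the numbers are all illustrative assumptions.

```python
from statistics import mean


def deviation_from_average(past_values, game_value):
    """Return game_value minus the mean of the player's earlier games.

    Assumption: a plain arithmetic mean is the reference point; the
    study may have defined deviations differently.
    """
    if not past_values:
        raise ValueError("need at least one past game")
    return game_value - mean(past_values)


def above_average_label(past_values, game_value):
    """Binary label: True when the game exceeds the past average."""
    return deviation_from_average(past_values, game_value) > 0


# Hypothetical numbers: three-point attempts in five earlier games,
# followed by 8 attempts in the game after the interview.
past_three_point_attempts = [4, 6, 5, 3, 7]
print(deviation_from_average(past_three_point_attempts, 8))
print(above_average_label(past_three_point_attempts, 8))
```

A label of this kind gives a model a yes/no target ("will the player attempt more three-pointers than usual?") rather than a raw count.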
The researchers designed several models that use state-of-the-art deep neural networks to predict players’ actions from the language in their open-ended interviews. The models can make predictions from interview text alone or from a combination of interview text and past-performance metrics. The text-based models outperformed strong baselines based on performance metrics alone, demonstrating the importance of language for action prediction. The models that used both interview texts and players’ past performance metrics improved on some of the most challenging predictions and produced the best results.
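To illustrate the difference between a text-only input and a combined input, the sketch below builds a feature vector from an interview transcript and then appends past-performance numbers to it. It uses simple word counts as a stand-in for the deep neural networks described above, so it is not the researchers' method. The vocabulary, function names, and sample values are hypothetical.

```python
import re
from collections import Counter

# Hypothetical vocabulary; a neural model would learn its own representation.
VOCABULARY = ["confident", "focused", "tired", "pressure"]


def text_features(transcript):
    """Count how often each vocabulary word appears in the transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(words)
    return [counts[word] for word in VOCABULARY]


def combined_features(transcript, past_metrics):
    """Append past-performance metrics to the text features."""
    return text_features(transcript) + list(past_metrics)


# Hypothetical interview snippet and past averages (points, rebounds).
snippet = "I feel confident, focused and confident."
print(text_features(snippet))
print(combined_features(snippet, [27.5, 7.0]))
```

A text-only model would see just the first vector, while a combined model would see both the word counts and the past averages.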
For example, in a pre-game interview before the 2016 NBA Finals, LeBron James, then with the Cleveland Cavaliers, was asked about his mental state and how he was feeling in light of his personal history (James was born in nearby Akron, Ohio, and had returned to the team to bring Cleveland its first championship). James described a positive mental state, concentration, and a feeling of ease going into the games. Prof. Reichart explained, “Our models processed the text and guessed that James’ offensive performance would be better than his past averages. In practice, the 2016 Finals series ended with Cleveland’s first – and only – winning championship. In these games, James surpassed himself and starred throughout the series, as our models predicted.”
Chart (see below): NBA player performance prediction accuracy. Columns from left to right: dataset majority baseline (a naive prediction method); metric-only baseline (prediction based on past performance only); prediction based on interviews (the method developed by the Technion researchers); prediction based on both interviews and past performance.
Figure (see below): Prediction accuracy of the model per player, relative to its accuracy for all players (black line), for each prediction task. Points to the right indicate better than average prediction.
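The "dataset majority baseline" named in the chart caption is described only as a naive prediction method. A common reading, assumed here, is that it always predicts the most frequent label seen in the training data. The sketch below shows how the accuracy of such a baseline could be computed. The function name and the labels are hypothetical.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return correct / len(test_labels)


# Hypothetical labels: 1 = above the player's average, 0 = not above.
train = [1, 1, 0]
test = [1, 0, 1, 1]
print(majority_baseline_accuracy(train, test))
```

Any model worth using has to beat this number, which is why it appears as the leftmost column of the chart.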