A curated list of awesome machine learning applications in the sports domain. An up-to-date version of this awesome list can be found here (I'll be writing in this Notion page for the time being).
Check the contribution guidelines.
Or DM me on Twitter for a fast response.
- Kyle Boddy - sabermetrics
- Patrick Lucey - Chief Scientist at Stats Perform
- David Sumpter - Soccermatics Author
- William Spearman - Liverpool Analytics
- Javier Fernandez - Barcelona Analytics
- Luke Bornn - Simon Fraser University
- Keita Watanabe - Japanese Volleyball
- Tom Decroos - Soccer data analytics researcher
- Handbook of Statistical Methods and Analyses in Sports (Chapman & Hall/CRC Handbooks of Modern Statistical Methods) 1st Edition
- Actions Speak Louder than Goals: Valuing Player Actions in Soccer (KDD 2019) Best Paper, Applied Data Science Track
- Player Vectors: Characterizing Soccer Players’ Playing Style from Match Event Streams (ECML PKDD 2019)
- Automatic Discovery of Tactics in Spatio-Temporal Soccer Match Data
- Spatio-temporal Analysis of Tennis Matches
- DeepBall: Deep Neural-Network Ball Detector (VISIGRAPP 2019)
- Towards Real-Time Detection and Tracking of Basketball Players using Deep Neural Networks (NIPS 2017)
- Predicting soccer highlights from spatio-temporal match event streams (AAAI 2017) [link]
- A Context-Aware Loss Function for Action Spotting in Soccer Videos (CVPR 2020) [link]
- Predicting Wide Receiver Trajectories in American Football (IEEE WACV 2016)
- Coordinated Multi-Agent Imitation Learning (ICML 2017)
- Neural Relational Inference for Interacting Systems (ICML 2018)
- Long Range Sequence Generation via Multiresolution Adversarial Training (NIPS 2018)
- Where Will They Go? Predicting Fine-Grained Adversarial Multi-Agent Motion using Conditional Variational Autoencoders (ECCV 2018)
- Generating Defensive Plays in Basketball Games (ACM MM 2018)
- Generating Multi-Agent Trajectories using Programmatic Weak Supervision (ICLR 2019)
- Stochastic Prediction of Multi-Agent Interactions from Partial Observations (ICLR 2019)
- Diverse Generation for Multi-agent Sports Games (CVPR 2019)
- DAG-Net: Double Attentive Graph Neural Network for Trajectory Forecasting (2020)
- VAIN: Attentional Multi-agent Predictive Modeling (NIPS 2017)
- Winning a Tournament by Any Means Necessary (IJCAI 2018) [link]
- Python - High-level programming language. The norm for ML/DL research
- R - Language for statistical computing and graphics
- D3.js - JavaScript library (nearly a language of its own) for cool visualizations
- Tableau - Data analysis software
- Excel - Spreadsheet software from Microsoft
Soccer
- StatsBomb Open Data [link]
- football.db [link]
- FIFA 19 complete player dataset [link]
- Fifa 18 More Complete Player Dataset [link]
- FIFA World Cup [link]
- International football results from 1872 to 2020 [link]
- Wyscout (paid)
Basketball
- NBA shot logs [link]
- NBA player of the week [link]
- Daily Fantasy Basketball - DraftKings NBA [link]
- NCAA Basketball [link]
American Football
- Detailed NFL Play-by-Play Data 2009-2018 [link]
- NFLsavant.com [link]
Baseball
- Lahman’s Baseball Database [link]
Hockey
- NHL Game Data [link]
Other
- FiveThirtyEight [link]
- Sports-1M [link]
- 120 years of Olympic history: athletes and results [link]
- International Workshop on Computer Vision in Sports at CVPR [2013] [2015] [2017] [2018] [2019]
- AAAI Workshop on AI in Team Sports [2020]
- MIT Sloan Sports Analytics Conference [2020]
SSAC doesn't seem to archive past conferences, so search Google Scholar with
source:MIT Sloan Sports Analytics Conference
like this and you should get around a hundred results.
- Workshop on Machine Learning and Data Mining for Sports Analytics [2019] [2018]
- Workshop on Large-Scale Sports Analytics [2016]
Leave out stuff like:
- Robo-Sports: So many papers, especially RoboCup, not sure which are important.
- Coaching/Physiology/Medicine: Fast twitch, ATP, periodization, creatine etc. New methods + lots of data might provide insight but it's too broad.
- Pose estimation/Object Detection: Too general. There are awesome lists specifically for those areas already.
- Low impact research: I can't include everything!