NFL Ghosts: A framework for evaluating defender positioning with conditional density estimation

football
player-tracking data
random forests
conditional density estimation
ghosting
Big Data Bowl
expected points
A ghosting framework that evaluates players using high-dimensional tracking data features, with distributions estimated by flexible random forests for conditional density estimation.
Authors
Affiliations

Ronald Yurko

Department of Statistics & Data Science, Carnegie Mellon University

Quang Nguyen

Department of Statistics & Data Science, Carnegie Mellon University

Konstantinos Pelechrinis

School of Computing and Information, University of Pittsburgh

Published

June 25, 2024

arXiv

@article{yurko2024nfl,
  title={NFL Ghosts: A framework for evaluating defender positioning with conditional density estimation},
  author={Yurko, Ronald and Nguyen, Quang and Pelechrinis, Konstantinos},
  journal={arXiv preprint arXiv:2406.17220},
  year={2024}
}

Abstract

Player attribution in American football remains an open problem due to the complex nature of twenty-two players interacting on the field, but the granularity of player tracking data provides ample opportunity for novel approaches. In this work, we introduce the first public framework to evaluate spatial and trajectory tracking data of players relative to a baseline distribution of “ghost” defenders. We demonstrate our framework in the context of modeling the nearest defender positioning at the moment of catch. In particular, we estimate how much better or worse a defender's observed positioning and trajectory are compared to the expected play value of ghost defenders. Our framework leverages high-dimensional tracking data features through flexible random forests for conditional density estimation in two ways: (1) to model the distribution of receiver yards gained, enabling the estimation of within-play expected value, and (2) to model the 2D spatial distribution of baseline ghost defenders. We present novel metrics for measuring player and team performance based on tracking data, and discuss challenges that remain in extending our framework to other aspects of American football.
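As a rough illustration of the comparison the abstract describes, the Python sketch below computes a within-play expected value from a conditional density of yards gained, then compares the observed defender against a weighted baseline of ghost defenders. It is not the authors' implementation. The paper fits random forests for conditional density estimation, whereas here a toy Gaussian density (`toy_yards_density`, a hypothetical stand-in) replaces the fitted model, the single feature is the defender's distance to the receiver, and the ghost distances and weights are made-up numbers standing in for the estimated 2D ghost distribution. The sign convention (negative means fewer expected yards allowed than the ghost baseline) is also an assumption.

```python
import math


def toy_yards_density(grid, defender_distance, sd=3.0):
    """Hypothetical stand-in for a fitted conditional density f(yards | features).

    Assumption: expected yards grow linearly with the defender's distance
    to the receiver. A real model would be estimated from tracking data.
    """
    mean = 2.0 + 0.5 * defender_distance
    norm = 1.0 / (sd * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-0.5 * ((y - mean) / sd) ** 2) for y in grid]


def expected_value(grid, density, value_fn=lambda y: y):
    """Integrate value_fn(y) * f(y | x) over an evenly spaced grid (Riemann sum).

    The density is renormalized so that it sums to one on the grid.
    """
    step = grid[1] - grid[0]
    mass = sum(density) * step
    return sum(value_fn(y) * d for y, d in zip(grid, density)) * step / mass


def ghost_relative_value(observed_ev, ghost_evs, ghost_weights):
    """Observed expected value minus the weighted average over ghost defenders."""
    total = sum(ghost_weights)
    baseline = sum(w * ev for w, ev in zip(ghost_weights, ghost_evs)) / total
    return observed_ev - baseline


# Grid of possible yards gained, from -10 to 40 in half-yard steps.
yards_grid = [-10.0 + 0.5 * i for i in range(101)]

# Observed nearest defender is 1 yard from the receiver at the catch.
observed_ev = expected_value(yards_grid, toy_yards_density(yards_grid, 1.0))

# Made-up ghost defenders: candidate distances and their density weights.
ghost_distances = [0.5, 1.0, 2.0, 4.0]
ghost_weights = [0.1, 0.4, 0.3, 0.2]
ghost_evs = [
    expected_value(yards_grid, toy_yards_density(yards_grid, d))
    for d in ghost_distances
]

delta = ghost_relative_value(observed_ev, ghost_evs, ghost_weights)
print(f"observed EV: {observed_ev:.3f} yards, vs. ghosts: {delta:+.3f} yards")
```

Passing a different `value_fn` (for example, a hypothetical mapping from yards gained to expected points) would express the same comparison in units of play value instead of yards.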