Resolving inconsistencies between columns in load_player_stats() #454

Open
1 task done
john-b-edwards opened this issue Jan 14, 2024 · 3 comments


john-b-edwards commented Jan 14, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Is your feature request related to a problem? Please describe.

There are some notable inconsistencies in how player biographical or contextual information is represented for different stat_types in load_player_stats().

nflreadr::load_player_stats(stat_type = "offense") |>
    colnames()
#>  [1] "player_id"                   "player_name"                
#>  [3] "player_display_name"         "position"                   
#>  [5] "position_group"              "headshot_url"               
#>  [7] "recent_team"                 "season"                     
#>  [9] "week"                        "season_type"                
#> [11] "completions"                 "attempts"                   
#> [13] "passing_yards"               "passing_tds"                
#> [15] "interceptions"               "sacks"                      
#> [17] "sack_yards"                  "sack_fumbles"               
#> [19] "sack_fumbles_lost"           "passing_air_yards"          
#> [21] "passing_yards_after_catch"   "passing_first_downs"        
#> [23] "passing_epa"                 "passing_2pt_conversions"    
#> [25] "pacr"                        "dakota"                     
#> [27] "carries"                     "rushing_yards"              
#> [29] "rushing_tds"                 "rushing_fumbles"            
#> [31] "rushing_fumbles_lost"        "rushing_first_downs"        
#> [33] "rushing_epa"                 "rushing_2pt_conversions"    
#> [35] "receptions"                  "targets"                    
#> [37] "receiving_yards"             "receiving_tds"              
#> [39] "receiving_fumbles"           "receiving_fumbles_lost"     
#> [41] "receiving_air_yards"         "receiving_yards_after_catch"
#> [43] "receiving_first_downs"       "receiving_epa"              
#> [45] "receiving_2pt_conversions"   "racr"                       
#> [47] "target_share"                "air_yards_share"            
#> [49] "wopr"                        "special_teams_tds"          
#> [51] "fantasy_points"              "fantasy_points_ppr"         
#> [53] "opponent_team"

nflreadr::load_player_stats(stat_type = "defense") |>
    colnames() 
#>  [1] "season"                        "week"                         
#>  [3] "player_id"                     "player_name"                  
#>  [5] "player_display_name"           "position"                     
#>  [7] "position_group"                "headshot_url"                 
#>  [9] "team"                          "def_tackles"                  
#> [11] "def_tackles_solo"              "def_tackles_with_assist"      
#> [13] "def_tackle_assists"            "def_tackles_for_loss"         
#> [15] "def_tackles_for_loss_yards"    "def_fumbles_forced"           
#> [17] "def_sacks"                     "def_sack_yards"               
#> [19] "def_qb_hits"                   "def_interceptions"            
#> [21] "def_interception_yards"        "def_pass_defended"            
#> [23] "def_tds"                       "def_fumbles"                  
#> [25] "def_fumble_recovery_own"       "def_fumble_recovery_yards_own"
#> [27] "def_fumble_recovery_opp"       "def_fumble_recovery_yards_opp"
#> [29] "def_safety"                    "def_penalty"                  
#> [31] "def_penalty_yards"

nflreadr::load_player_stats(stat_type = "kicking") |>
    colnames()
#>  [1] "season"              "week"                "season_type"        
#>  [4] "team"                "player_name"         "player_id"          
#>  [7] "fg_made"             "fg_missed"           "fg_blocked"         
#> [10] "fg_long"             "fg_att"              "fg_pct"             
#> [13] "pat_made"            "pat_missed"          "pat_blocked"        
#> [16] "pat_att"             "pat_pct"             "fg_made_distance"   
#> [19] "fg_missed_distance"  "fg_blocked_distance" "gwfg_att"           
#> [22] "gwfg_distance"       "gwfg_made"           "gwfg_missed"        
#> [25] "gwfg_blocked"        "fg_made_0_19"        "fg_made_20_29"      
#> [28] "fg_made_30_39"       "fg_made_40_49"       "fg_made_50_59"      
#> [31] "fg_made_60_"         "fg_missed_0_19"      "fg_missed_20_29"    
#> [34] "fg_missed_30_39"     "fg_missed_40_49"     "fg_missed_50_59"    
#> [37] "fg_missed_60_"       "fg_made_list"        "fg_missed_list"     
#> [40] "fg_blocked_list"

For instance, stat_type = "defense" lacks the season_type column, and player_display_name and position are present for offense and defense but not for kicking. For kicking, position = K is presumably assumed, but that does not always hold (see Dare Ogunbowale's kicking exploits for an example).
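To make the gaps explicit, the identifier columns from the three outputs above can be compared with set operations. This is only a sketch: the vectors are copied by hand from the printed column names (stat columns omitted), and `id_cols`, `all_ids`, and `missing_ids` are names made up for illustration.

```r
# Identifier/context columns copied from the outputs above (stat columns omitted).
id_cols <- list(
  offense = c("player_id", "player_name", "player_display_name", "position",
              "position_group", "headshot_url", "recent_team", "season",
              "week", "season_type", "opponent_team"),
  defense = c("season", "week", "player_id", "player_name",
              "player_display_name", "position", "position_group",
              "headshot_url", "team"),
  kicking = c("season", "week", "season_type", "team", "player_name",
              "player_id")
)

# Union of every identifier column seen in any stat_type.
all_ids <- Reduce(union, id_cols)

# For each stat_type, list the identifier columns it lacks.
missing_ids <- lapply(id_cols, function(cols) setdiff(all_ids, cols))
missing_ids
```

Besides the missing season_type in defense, this comparison also surfaces the naming difference between recent_team (offense) and team (defense and kicking).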

Describe the solution you'd like

I think we should standardize how biographical and contextual information for player stats is represented in these columns.

Describe alternatives you've considered

No response

Additional context

No response

mrcaseb transferred this issue from nflverse/nflverse-data on Jan 15, 2024

mrcaseb (Member) commented Jan 15, 2024

Transferred to nflfastR, as we should resolve this directly in the underlying functions.

mrcaseb (Member) commented Jan 15, 2024

After cross-checking, it seems some of this has already been resolved in nflfastR. I guess we need to trigger the workflow to rebuild all data in nflverse-pbp at some point.

season_type is currently missing in def, so we need to add it before the rebuild.

pbp <- nflreadr::load_pbp(2023)

off <- nflfastR::calculate_player_stats(pbp, weekly = TRUE)
def <- nflfastR::calculate_player_stats_def(pbp, weekly = TRUE)
kick <- nflfastR::calculate_player_stats_kicking(pbp, weekly = TRUE)

colnames(off)
#>  [1] "player_id"                   "player_name"                
#>  [3] "player_display_name"         "position"                   
#>  [5] "position_group"              "headshot_url"               
#>  [7] "recent_team"                 "season"                     
#>  [9] "week"                        "season_type"                
#> [11] "opponent_team"               "completions"                
#> [13] "attempts"                    "passing_yards"              
#> [15] "passing_tds"                 "interceptions"              
#> [17] "sacks"                       "sack_yards"                 
#> [19] "sack_fumbles"                "sack_fumbles_lost"          
#> [21] "passing_air_yards"           "passing_yards_after_catch"  
#> [23] "passing_first_downs"         "passing_epa"                
#> [25] "passing_2pt_conversions"     "pacr"                       
#> [27] "dakota"                      "carries"                    
#> [29] "rushing_yards"               "rushing_tds"                
#> [31] "rushing_fumbles"             "rushing_fumbles_lost"       
#> [33] "rushing_first_downs"         "rushing_epa"                
#> [35] "rushing_2pt_conversions"     "receptions"                 
#> [37] "targets"                     "receiving_yards"            
#> [39] "receiving_tds"               "receiving_fumbles"          
#> [41] "receiving_fumbles_lost"      "receiving_air_yards"        
#> [43] "receiving_yards_after_catch" "receiving_first_downs"      
#> [45] "receiving_epa"               "receiving_2pt_conversions"  
#> [47] "racr"                        "target_share"               
#> [49] "air_yards_share"             "wopr"                       
#> [51] "special_teams_tds"           "fantasy_points"             
#> [53] "fantasy_points_ppr"
colnames(def)
#>  [1] "season"                        "week"                         
#>  [3] "player_id"                     "player_name"                  
#>  [5] "player_display_name"           "position"                     
#>  [7] "position_group"                "headshot_url"                 
#>  [9] "team"                          "def_tackles"                  
#> [11] "def_tackles_solo"              "def_tackles_with_assist"      
#> [13] "def_tackle_assists"            "def_tackles_for_loss"         
#> [15] "def_tackles_for_loss_yards"    "def_fumbles_forced"           
#> [17] "def_sacks"                     "def_sack_yards"               
#> [19] "def_qb_hits"                   "def_interceptions"            
#> [21] "def_interception_yards"        "def_pass_defended"            
#> [23] "def_tds"                       "def_fumbles"                  
#> [25] "def_fumble_recovery_own"       "def_fumble_recovery_yards_own"
#> [27] "def_fumble_recovery_opp"       "def_fumble_recovery_yards_opp"
#> [29] "def_safety"                    "def_penalty"                  
#> [31] "def_penalty_yards"
colnames(kick)
#>  [1] "season"              "week"                "season_type"        
#>  [4] "player_id"           "team"                "player_name"        
#>  [7] "player_display_name" "position"            "position_group"     
#> [10] "headshot_url"        "fg_made"             "fg_att"             
#> [13] "fg_missed"           "fg_blocked"          "fg_long"            
#> [16] "fg_pct"              "fg_made_0_19"        "fg_made_20_29"      
#> [19] "fg_made_30_39"       "fg_made_40_49"       "fg_made_50_59"      
#> [22] "fg_made_60_"         "fg_missed_0_19"      "fg_missed_20_29"    
#> [25] "fg_missed_30_39"     "fg_missed_40_49"     "fg_missed_50_59"    
#> [28] "fg_missed_60_"       "fg_made_list"        "fg_missed_list"     
#> [31] "fg_blocked_list"     "fg_made_distance"    "fg_missed_distance" 
#> [34] "fg_blocked_distance" "pat_made"            "pat_att"            
#> [37] "pat_missed"          "pat_blocked"         "pat_pct"            
#> [40] "gwfg_att"            "gwfg_distance"       "gwfg_made"          
#> [43] "gwfg_missed"         "gwfg_blocked"
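
One possible way to attach season_type to the defense stats is to look it up per season and week and join it on. The sketch below uses toy data frames (`pbp_toy`, `def_toy`) rather than real play-by-play, and it is an assumption about the approach, not necessarily how nflfastR implements it.

```r
# Toy stand-ins; the real inputs would be play-by-play and weekly defense stats.
pbp_toy <- data.frame(
  season = c(2023L, 2023L, 2023L),
  week = c(18L, 18L, 19L),
  season_type = c("REG", "REG", "POST")
)
def_toy <- data.frame(
  season = c(2023L, 2023L),
  week = c(18L, 19L),
  player_id = c("00-0000001", "00-0000001"),
  def_tackles = c(4L, 6L)
)

# One row per season/week with its season_type.
week_types <- unique(pbp_toy[, c("season", "week", "season_type")])

# Left join so every defense row is kept even if a week has no match.
def_with_type <- merge(def_toy, week_types,
                       by = c("season", "week"), all.x = TRUE)
def_with_type
```

Note that `merge()` appends season_type as the last column, so a separate reordering step would still be needed for a consistent layout.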

mrcaseb (Member) commented Jan 15, 2024

season_type has been added to the defense stats. We could define a consistent column order to finish this off.
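
A consistent column order could be enforced with a small helper that moves the identifier columns to the front in a fixed sequence. The following is a minimal sketch in base R; `id_order` and `reorder_id_cols()` are hypothetical names, and the order shown is only one candidate, not a decision made in this thread.

```r
# Hypothetical preferred order for identifier columns; not an nflfastR decision.
id_order <- c("season", "week", "season_type", "player_id", "player_name",
              "player_display_name", "position", "position_group",
              "headshot_url", "team")

# Move identifier columns to the front in `order`, keep the rest as they were.
reorder_id_cols <- function(df, order = id_order) {
  front <- intersect(order, names(df))
  rest <- setdiff(names(df), front)
  df[, c(front, rest), drop = FALSE]
}

# Toy data frame standing in for weekly defense stats.
def_toy <- data.frame(
  def_tackles = 5L, team = "KC", player_id = "00-0000001",
  week = 1L, season = 2023L, season_type = "REG"
)
names(reorder_id_cols(def_toy))
```

Because the helper only reorders columns that are present, it could be applied to the offense, defense, and kicking outputs alike without failing on columns a given table lacks.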
