Transform the EIA-861 short form table #3540

Open
6 tasks
zaneselvans opened this issue Apr 3, 2024 · 0 comments · May be fixed by #3565
Labels
data-cleaning Tasks related to cleaning & regularizing data during ETL. data-loss data that we expect should exist seem to be missing or dropped in pudl tables eia861 Anything having to do with EIA Form 861 good-first-issue Good issues for first-time contributors. Self-contained, low context, no credentials required. new-data Requests for integration of new data.

Comments

@zaneselvans
Member

Smaller utilities report an abbreviated version of the EIA-861, called the short form. This data is currently extracted into an asset named raw_eia861__short_form, but as noted by @gschivley it is not being transformed or written to the database, so the information we have about these smaller load-serving entities is currently being lost. Short form reporting began in 2012 and continues through the present (the most recent EIA-861 data is for 2022).

Complications

  • In 2019, the EIA-861 Short Form data was intermingled with the Long Form data. This leaves a gap in the short form reporting, so either the 2019 Short Form data will need to be extracted from the various other tables, or the Short Form data would need to be put into those tables for internal consistency.
  • Note that because the EIA-861 tables are not yet being harvested to extract utility IDs, no foreign key relationship can be defined between this table and the existing utility entity tables.
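
One way to confirm the 2019 gap described above is to check which report years have no rows at all. The following is a minimal sketch on a toy DataFrame, not the real asset; the helper name `missing_report_years` and the toy values are hypothetical and only the `report_year` column name comes from the table itself.

```python
import pandas as pd


def missing_report_years(df: pd.DataFrame, first_year: int, last_year: int) -> list[int]:
    """Return the years in [first_year, last_year] that have no rows in df."""
    # report_year arrives as float64 in the raw table, so cast to plain ints.
    observed = {int(year) for year in df["report_year"].dropna()}
    return [year for year in range(first_year, last_year + 1) if year not in observed]


# Toy stand-in for the raw short form table (illustrative values only).
toy = pd.DataFrame({"report_year": [2017.0, 2018.0, 2020.0, 2021.0]})
gap_years = missing_report_years(toy, 2017, 2021)
print(gap_years)  # [2019]
```

Run against the real table with the range 2012 to 2022, a check like this would be expected to flag 2019, assuming no short form records for that year made it into the raw asset.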

Tasks

from dagster import AssetKey
from pudl.etl import defs

# Load the raw (untransformed) short form asset as a DataFrame
short_form = defs.load_asset_value(AssetKey("raw_eia861__short_form"))
short_form.info()
RangeIndex: 13015 entries, 0 to 13014
Data columns (total 16 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   ba_code                 11449 non-null  object 
 1   demand_side_management  13001 non-null  object 
 2   early_release           0 non-null      object 
 3   entity_type             9648 non-null   object 
 4   green_pricing           1124 non-null   object 
 5   net_metering            12999 non-null  object 
 6   report_year             13015 non-null  float64
 7   state                   13015 non-null  object 
 8   time_based_programs     13001 non-null  object 
 9   total_customers         13015 non-null  object 
 10  total_revenue           13015 non-null  object 
 11  total_sales             13015 non-null  object 
 12  utility_id_eia          13015 non-null  float64
 13  utility_name_eia        13015 non-null  object 
 14  water_heater            11891 non-null  object 
 15  data_maturity           13015 non-null  object 
dtypes: float64(2), object(14)
memory usage: 1.6+ MB
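
The dtypes above suggest some obvious cleanup: `report_year` and `utility_id_eia` are stored as float64 although they hold integers, and the `total_*` columns are stored as object. A minimal sketch of that coercion follows, using a toy DataFrame rather than the real asset. The function name `coerce_short_form_dtypes` is hypothetical, and treating unparseable values as missing is an assumption, not the project's confirmed approach.

```python
import pandas as pd

ID_COLS = ["report_year", "utility_id_eia"]
NUMERIC_COLS = ["total_customers", "total_revenue", "total_sales"]


def coerce_short_form_dtypes(df: pd.DataFrame) -> pd.DataFrame:
    """Cast ID-like floats to nullable integers and object totals to numbers."""
    out = df.copy()
    for col in ID_COLS:
        # Nullable Int64 keeps integer semantics even if values are missing.
        out[col] = out[col].astype("Int64")
    for col in NUMERIC_COLS:
        # Assumption: anything that cannot be parsed as a number is missing.
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out


# Toy rows loosely modeled on the sample output (illustrative only).
toy = pd.DataFrame(
    {
        "report_year": [2014.0, 2018.0],
        "utility_id_eia": [5036.0, 11790.0],
        "total_customers": [3056, "3586"],
        "total_revenue": [5670, 8206],
        "total_sales": ["53658", "."],
    }
)
clean = coerce_short_form_dtypes(toy)
print(clean.dtypes)
```

Using `errors="coerce"` silently turns bad values into NaN, so a real transform would probably want to count how many values were lost before accepting the result.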
short_form.sample(10)
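|      | ba_code | demand_side_management | early_release | entity_type | green_pricing | net_metering | report_year | state | time_based_programs | total_customers | total_revenue | total_sales | utility_id_eia | utility_name_eia | water_heater | data_maturity |
|------|---------|------------------------|---------------|-------------|---------------|--------------|-------------|-------|---------------------|-----------------|---------------|-------------|----------------|------------------|--------------|---------------|
| 2338 | WACM | N | nan | nan | nan | Y | 2014 | CO | N | 3056 | 5670 | 53658 | 5036 | City of Delta - (CO) | . | final |
| 6980 | MISO | N | nan | Municipal | nan | Y | 2018 | IL | N | 3586 | 8206 | 59329 | 11790 | City of Mascoutah - (IL) | . | final |
| 1757 | CPLE | N | nan | nan | nan | N | 2013 | NC | N | 1122 | 2038.5 | 18427 | 11205 | Town of Lucama - (NC) | . | final |
| 9029 | PNM | N | nan | Cooperative | nan | Y | 2020 | NM | Y | 4313 | 7456 | 45100 | 23326 | Sierra Electric Coop, Inc | . | final |
| 696 | nan | N | nan | nan | N | N | 2012 | MN | N | 1931 | 3182.9 | 27943 | 13681 | North Branch Water & Light Comm | nan | final |
| 807 | nan | Y | nan | nan | N | N | 2012 | IA | N | 377 | 540 | 4563 | 15751 | City of Readlyn - (IA) | nan | final |
| 5460 | ERCO | N | nan | Municipal | nan | N | 2016 | TX | N | 3019 | 8932 | 69610 | 2050 | City of Bowie - (TX) | . | final |
| 8653 | MISO | N | nan | Municipal | nan | N | 2020 | MN | N | 210 | 238 | 2087 | 16451 | City of Rushmore - (MN) | . | final |
| 8659 | MISO | N | nan | Municipal | nan | N | 2020 | MN | N | 1245 | 2940 | 20852 | 17836 | Springfield Public Utils Comm - (MN) | . | final |
| 4479 | WACM | N | nan | Municipal | nan | N | 2015 | WY | N | 772 | 938.5 | 10273 | 15051 | City of Pine Bluffs - (WY) |  | final |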
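
The sample shows flag columns such as `net_metering` and `time_based_programs` holding Y/N strings, while `water_heater` mostly holds "." or is empty. A rough sketch of converting such a column to a nullable boolean is below. The helper name `yes_no_to_boolean` is hypothetical, and treating every value other than Y or N (including ".") as missing is an assumption that would need checking against the EIA-861 documentation.

```python
import pandas as pd

YES_NO = {"Y": True, "N": False}


def yes_no_to_boolean(col: pd.Series) -> pd.Series:
    """Map Y/N strings to a nullable boolean Series; other values become NA."""

    def convert(value):
        # Assumption: non-strings and unrecognized strings (e.g. ".") are missing.
        if isinstance(value, str):
            return YES_NO.get(value.strip().upper())
        return None

    return col.map(convert).astype("boolean")


# Toy column mixing the kinds of values seen in the sample (illustrative only).
raw_flags = pd.Series(["Y", "N", ".", None, " y "])
flags = yes_no_to_boolean(raw_flags)
print(flags)
```

The nullable `boolean` dtype is used here so that "not reported" stays distinct from "N", which a plain `bool` column could not represent.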
Projects
Status: Backlog