Fix Issue #3484 and #3540: Updated 'four' to 'six' and Transformed EIA-861 short_form table #3550

Closed
wants to merge 11 commits into from


@Nancy9ice Nancy9ice commented Apr 8, 2024

Closes #3484 and #3540

What problem does this address

  • Changed the number of ways to access PUDL data from 'four' to 'six' in the data_access.rst documentation.
  • Added the new core_eia861__yearly_short_form() asset

#3540 To-do List

  • The column mapping that lets us extract this data from the spreadsheets should be updated to use the same naming conventions as the rest of PUDL. E.g. ba_code should be balancing_authority_code_eia. Most of these columns should already be defined in pudl.metadata.fields. The column mappings are stored in src/pudl/package_data/eia861/column_maps/short_form_eia861.csv

  • A new core_eia861__yearly_short_form() asset will need to be defined in pudl.transform.eia861

  • Columns will need to be cleaned up to have appropriate NA values and to conform to the appropriate data types. See the other transform functions for examples. It looks like there are a number of boolean columns which currently contain Y or N. There may also be columns where only certain values are valid, which need to be cleaned up (e.g. do all state abbreviations actually correspond to a real state?)

  • The new short_form table should be integrated into the processes downstream which compile the table of balancing authorities

  • Either the 2019 short form data will need to be extracted from the other tables where it was reported and added to this one, or the contents of this table will need to be added to all those other tables.

  • If we're adding a new table to the PUDL DB, it will need to have a schema defined in pudl.metadata.resources.eia861
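The clean-up described in the third item above (Y/N flags, NA values, state abbreviations) can be sketched roughly as follows. This is only an illustration in plain Python, not the code from this PR: the helper names, the `has_green_pricing` column, and the truncated state list are all assumptions made for the example. In PUDL the same logic would presumably be applied to DataFrame columns.

```python
from typing import Optional

# Illustrative subset only; a real check would use the full list of valid codes.
KNOWN_STATES = frozenset({"CA", "NY", "TX"})


def yes_no_to_bool(value: Optional[str]) -> Optional[bool]:
    """Map 'Y'/'N' (any case, padded) to True/False; anything else becomes None."""
    if value is None:
        return None
    cleaned = value.strip().upper()
    if cleaned == "Y":
        return True
    if cleaned == "N":
        return False
    return None  # treat blanks and unexpected values as NA


def clean_state(value: Optional[str]) -> Optional[str]:
    """Normalize a state abbreviation; return None if it is not a known state."""
    if value is None:
        return None
    cleaned = value.strip().upper()
    return cleaned if cleaned in KNOWN_STATES else None


# Hypothetical raw rows; "has_green_pricing" is a made-up column name.
rows = [
    {"state": "ca", "has_green_pricing": "Y"},
    {"state": "ZZ", "has_green_pricing": "n"},
    {"state": "TX", "has_green_pricing": ""},
]
cleaned_rows = [
    {
        "state": clean_state(r["state"]),
        "has_green_pricing": yes_no_to_bool(r["has_green_pricing"]),
    }
    for r in rows
]
```

Returning `None` for anything unexpected keeps bad values from silently becoming `False`, which seems to match the intent of "appropriate NA values" in the to-do item.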

Nancy9ice and others added 10 commits April 8, 2024 12:34
Closes Task 1 of #3540.

This PR aligns the column names in 'src/pudl/package_data/eia861/column_maps/short_form_eia861.csv' with the column names defined in 'pudl.metadata.fields'. I had already done this in commit 2 of #3550, but after checking the raw short form data files I realized that 'green_pricing_revenue' was wrong, so I changed it to 'green_pricing'. Note that some of the columns in 'src/pudl/package_data/eia861/column_maps/short_form_eia861.csv' are not defined in 'pudl.metadata.fields'.
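For reviewers, here is a minimal sketch of the kind of check behind this renaming. Only the `ba_code` to `balancing_authority_code_eia` rename comes from issue #3540; `DEFINED_FIELDS` is a toy stand-in for the names in 'pudl.metadata.fields' and does not reflect which fields PUDL actually defines, and both helper functions are hypothetical.

```python
# Rename taken from the example in issue #3540.
RENAMES = {"ba_code": "balancing_authority_code_eia"}

# Toy stand-in for the field names defined in pudl.metadata.fields.
# It is NOT the real list and says nothing about what PUDL defines.
DEFINED_FIELDS = {"balancing_authority_code_eia", "utility_id_eia"}


def rename_columns(columns, renames):
    """Apply the rename mapping, leaving unmapped column names unchanged."""
    return [renames.get(col, col) for col in columns]


def undefined_columns(columns, defined):
    """List the columns that have no matching field definition."""
    return [col for col in columns if col not in defined]


raw_columns = ["utility_id_eia", "ba_code", "green_pricing"]
renamed = rename_columns(raw_columns, RENAMES)
missing = undefined_columns(renamed, DEFINED_FIELDS)
```

With this toy input, `missing` lists whatever still needs either a rename or a new field definition, which is the situation described in the last sentence of the commit message.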

## Closes Task 2 & 3 of #3540

## What was done?

Transformation tasks for the core_eia861__yearly_short_form() table were added to 'pudl.transform.eia861'.

## Closes Task 4 of #3540

## What was done?

The 'core_eia861__yearly_short_form' table was added to 'pudl.output.pudltabl' to enable consolidation with the balancing authority tables.

## Closes Task 6 of #3540

## What was done?

The schema of the new core_eia861__yearly_short_form table was defined in pudl.metadata.resources.eia861
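As a rough idea of what such a schema entry might contain, here is a sketch that assumes resource definitions are plain dictionaries with a `schema` entry holding `fields` and `primary_key`. The description, field list, and primary key below are guesses for illustration and are not the schema committed in this PR; `primary_key_problems` is a hypothetical sanity check, not a PUDL function.

```python
# Hypothetical sketch: the field list and primary key are guesses for
# illustration, not the schema that was actually committed.
SHORT_FORM_RESOURCE = {
    "core_eia861__yearly_short_form": {
        "description": "Annual data reported by utilities on the EIA-861 short form.",
        "schema": {
            "fields": [
                "report_date",
                "utility_id_eia",
                "state",
                "balancing_authority_code_eia",
            ],
            "primary_key": ["report_date", "utility_id_eia", "state"],
        },
    },
}


def primary_key_problems(resources):
    """Return {table: [primary key columns missing from fields]} for bad tables."""
    problems = {}
    for name, meta in resources.items():
        schema = meta["schema"]
        missing = [
            col for col in schema.get("primary_key", []) if col not in schema["fields"]
        ]
        if missing:
            problems[name] = missing
    return problems


problems = primary_key_problems(SHORT_FORM_RESOURCE)
```

A check like this catches the easy mistake of naming a primary key column that was renamed in the column map but not in the schema.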
@Nancy9ice Nancy9ice changed the title from "Fix Issue #3484: Updated 'four' to 'six'" to "Fix Issue #3484 and #3540: Updated 'four' to 'six' and Transformed EIA-861 short_form table" Apr 10, 2024
@Nancy9ice Nancy9ice closed this by deleting the head repository Apr 12, 2024
Development

Successfully merging this pull request may close these issues.

An incorrect number of ways to interact with PUDL data is mentioned in the data_access.rst documentation file.