Fix Issue #3484 and #3540: Updated 'four' to 'six' and Transformed EIA-861 short_form table #3550
Closes #3484 and #3540
What problem does this address
#3540 To-do List
The column mapping that lets us extract this data from the spreadsheets should be updated to use the same naming conventions as the rest of PUDL. For example, ba_code should become balancing_authority_code_eia. Most of these columns should already be defined in pudl.metadata.fields. The column mappings are stored in src/pudl/package_data/eia861/column_maps/short_form_eia861.csv
A new core_eia861__yearly_short_form() asset will need to be defined in pudl.transform.eia861
Columns will need to be cleaned up to have appropriate NA values and to conform to the appropriate data types. See the other transform functions for examples. It looks like there are a number of boolean columns which currently contain Y or N. There may also be columns where only certain values are valid, which need to be cleaned up (e.g. do all state abbreviations actually correspond to a real state?)
The new short_form table should be integrated into the downstream processes that compile the table of balancing authorities
Either the 2019 short form data will need to be extracted from the other tables where it was reported and added to this one, or the contents of this table will need to be added to all those other tables.
If we're adding a new table to the PUDL DB, it will need to have a schema defined in pudl.metadata.resources.eia861
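As a rough illustration of the renaming and cleanup items above, here is a minimal pandas sketch. It is not the transform from this PR: the function name clean_short_form, the boolean column has_green_pricing, the state column name, and the short list of valid states are all assumptions made for the example. Only the ba_code to balancing_authority_code_eia rename comes from the to-do list.

```python
import pandas as pd

# Only this rename is taken from the to-do list; the real mapping lives in
# short_form_eia861.csv.
RENAME = {"ba_code": "balancing_authority_code_eia"}

# Hypothetical Y/N column name, used for illustration only.
BOOL_COLS = ["has_green_pricing"]

# Illustrative subset; a real check would use the full list of state codes.
VALID_STATES = {"CA", "NY", "TX"}


def clean_short_form(raw: pd.DataFrame) -> pd.DataFrame:
    """Rename columns, convert Y/N flags, and null out unknown states."""
    df = raw.rename(columns=RENAME)

    # Map Y/N strings to a nullable boolean; anything else becomes <NA>.
    for col in BOOL_COLS:
        flags = df[col].str.strip().str.upper().map({"Y": True, "N": False})
        df[col] = flags.astype("boolean")

    # Normalize state abbreviations and replace unrecognized ones with <NA>.
    state = df["state"].astype("string").str.strip().str.upper()
    df["state"] = state.where(state.isin(VALID_STATES))
    return df


raw = pd.DataFrame(
    {
        "ba_code": ["CISO", "ERCO", None],
        "has_green_pricing": ["Y", "n ", None],
        "state": ["ca", "TX", "ZZ"],
    }
)
cleaned = clean_short_form(raw)
print(cleaned.dtypes)
```

Using the nullable boolean and string dtypes keeps missing values as NA instead of coercing them to False or to the literal string "nan", which matches the goal of having appropriate NA values per column.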