Document all columns in mart_gtfs_quality & review YAML #3103

Open
5 tasks
lauriemerrell opened this issue Nov 13, 2023 · 0 comments

As an analytics engineer, I want all of our columns to be documented in dbt so that future maintainers and users of the warehouse will understand what each column is and how it should be used.

AC:

  • Add documentation for all columns in mart_gtfs_quality currently lacking documentation (see query below)
  • Audit associated dbt YAML:
    • In YAML files with more than ~10 models, define common anchors (those used by more than ~3 models) at the very top of the file, as done here: https://github.com/cal-itp/data-infra/blob/main/warehouse/models/mart/gtfs/_mart_gtfs_dims.yml#L3
    • Check that anchors are being used appropriately. If a common field has an equivalent description across models, it should use an anchor. If a field shares a name with an anchored field but doesn't use the anchor, consider using the anchor and overriding only the part that differs, or add a comment explaining why this instance can't use the anchor.
    • Review field documentation and evaluate for completeness/correctness
-- identify columns missing documentation 
SELECT *
FROM `cal-itp-data-infra`.`mart_gtfs_quality`.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS
WHERE description IS NULL
ORDER BY table_name
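
For reference, the anchor conventions in the audit checklist above might look roughly like the sketch below. This is only an illustration: the model names, the `feed_key` column, its descriptions, and the `x-common-fields` holder key are all made up for the example, and whether dbt accepts an arbitrary top-level key like this may depend on the dbt version, so follow whatever the linked `_mart_gtfs_dims.yml` actually does.

```yaml
version: 2

# Shared column definitions live at the very top of the file.
# `x-common-fields` is a placeholder name (an assumption, not a dbt keyword).
x-common-fields:
  - &feed_key
    name: feed_key
    description: Foreign key to the feed this row describes.

models:
  # Hypothetical model: the field means the same thing here, so reuse the anchor as-is.
  - name: example_model_a
    columns:
      - *feed_key

  # Hypothetical model: same field name, slightly different meaning.
  # Merge the anchor and override only the description.
  - name: example_model_b
    columns:
      - <<: *feed_key
        description: Foreign key to the feed that was active on the service date.
```

The merge key (`<<`) copies `name` from the anchor, and the explicit `description` takes precedence over the anchored one, so only the part that differs needs to be restated.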