Releases: dbt-labs/metricflow

MetricFlow 0.205.0

02 Mar 00:18
2631fea

This release is an intermediate update to MetricFlow and can be used with existing dbt-core 1.7 installations. The breaking change listed here affects only users who rely on the SQL-comment tags that the MetricFlow CLI previously added to queries (e.g., the mf_rid_* comments).

Notable improvements include more flexibility when querying the metric_time dimension and more consistent handling of metrics that request a join against a time spine to fill in missing values on a time axis.

MetricFlow 0.205.0 - February 29, 2024

Breaking Changes

  • Remove SQL-Comment-Based Tags (#1034)

Features

  • Enable querying metric_time without metrics. (#928)
  • Enable querying cumulative metrics with their agg_time_dimension. (#1000)
  • Enable offset metrics to be queried with agg_time_dimension. (#1006)
  • Add Support for Consistent SQL Query Generation (#1020)

Fixes

  • Validate that there are metrics or group by items in each query. (#1002)
  • For measures that join to time spine, allow joining when agg_time_dimension is queried. (#1009)
  • Join to time spine if requested for conversion metric input measures. (#1048)
  • Enable querying offset metric with multiple agg_time_dimensions at once. Also fixes a bug when filtering by a different grain than the group by grain. (#1052, #1053)
  • Bug fix: if measure joins to time spine, apply filters again after that join. (#1039)
  • Improve error message for metrics/queries with missing inputs (#1051)

Docs

  • change group-bys to group-by in the tutorial message

Under the Hood

  • Add test for nested derived metric filter rendering bug fixed in 0.204.0 (#920)

MetricFlow 0.204.0

13 Jan 02:11
ad70bd7

To take full advantage of this release, please upgrade to dbt-core 1.7.4 or later. Earlier versions of dbt-core 1.7.x will continue to work with any config that does not include Conversion Metrics.

Notable improvements include:

  • Support for Conversion Metrics
  • Support for Trino (huge thanks to @sarbmeetka for the contribution!)
  • A variety of important fixes and adjustments, particularly around time offset metrics

MetricFlow 0.204.0 - January 11, 2024

Features

  • Add Trino support to MetricFlow. (#207)
  • Implemented date_part in where filter. (#None)
  • Resolve Ambiguous Group-By-Items (#887)
  • Support for Conversion Metrics (#252)
  • Add a Query Validation Rule for Repeated Metrics in a Query (#943)
  • Expose label on Metric & Dimension for use in APIs. (#956)

Fixes

  • Apply time offset for nested derived & ratio metrics (#882)
  • Fix Incorrect SQL Column Name Rendering for WhereConstraintNode (#908)
  • Unable To Satisfy Query Error with Cumulative Metrics in Saved Queries (#917)
  • Fixes a bug in dimension-only queries where the filter column is removed before the filter has been applied. (#923)
  • Bug fix: Keep where constraint column until used for nested derived offset metric queries. (#930)
  • Fixes incorrect time constraint applied to derived offset metrics. (#925)
  • Remove default time constraint for queries with cumulative metrics. (#917)
  • Return exit code 1 for failed validations (#867)
  • Optimizer Does Not Deduplicate Common Metrics (#941)
  • Duplicate input measures after combiner optimizer (#969)

Under the Hood

  • Test to ensure Dimension and TimeDimension syntax are identical in the case of time dimensions
  • Fixed typo in error message

Dependencies

  • Remove unnecessary MarkupSafe dependency (#950)

MetricFlow 0.203.1

15 Nov 23:21
863352b

MetricFlow 0.203.1 - November 15, 2023

Fixes

  • Fix error in cumulative metric output when start-time and end-time are specified (#869)

Dependencies

  • Remove Rudderstack client and associated dependencies (#866)
  • Relax version pin on typing extensions to allow >=4.4, <5 (#875)

MetricFlow 0.203.0

14 Nov 21:41
08ae532

This release requires an upgrade to dbt-core 1.7, which is now the minimum requirement for future minor version updates to MetricFlow.

Notable improvements include:

  • Support for saved queries
  • Fixes for issues where time granularity was not being applied to columns with finer-grained input than what was listed in the config. This may cause performance regressions for users of partitioned datasets in certain engines, which we are committed to resolving.
  • Support for configuring a way to fill null value measure outputs to prevent implicit null filtering.
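
As a rough illustration of the null-filling idea, the sketch below joins a measure onto a daily time spine and fills the gaps. It is plain Python over hypothetical data, not MetricFlow's implementation; the helper names `build_time_spine` and `join_to_time_spine` and the fill value of 0 are assumptions made for the example.

```python
from datetime import date, timedelta


def build_time_spine(start: date, end: date) -> list[date]:
    """Return every day from start to end, inclusive (a daily time spine)."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days + 1)]


def join_to_time_spine(
    measure_by_day: dict[date, int], spine: list[date], fill_value: int = 0
) -> list[tuple[date, int]]:
    """Left-join the measure onto the spine and fill days that have no value."""
    return [(day, measure_by_day.get(day, fill_value)) for day in spine]


# Hypothetical data: no bookings were recorded on January 2.
bookings = {date(2024, 1, 1): 3, date(2024, 1, 3): 5}
spine = build_time_spine(date(2024, 1, 1), date(2024, 1, 3))
filled = join_to_time_spine(bookings, spine)

for day, value in filled:
    print(day.isoformat(), value)
```

Without the spine, January 2 would simply be absent from the output, which is the implicit null filtering the configuration is meant to prevent.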

Complete changelog follows. Please take note of the breaking changes.

MetricFlow 0.203.0 - November 13, 2023

Breaking Changes

  • Use FULL OUTER JOIN to combine input metrics for derived metrics. This is a change from using INNER JOIN and may result in changes in output. (#842)
  • Update Dependencies to Use dbt-semantic-interfaces~=0.4.0 (#846)
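
To see why the join change can alter output, the sketch below combines two input metric results both ways. It is a plain-Python analogy over hypothetical per-country values, not the SQL MetricFlow generates; `inner_join` and `full_outer_join` are names invented for the example.

```python
def inner_join(left: dict, right: dict) -> dict:
    """Keep only dimension values present in both metric results."""
    return {key: (left[key], right[key]) for key in left if key in right}


def full_outer_join(left: dict, right: dict) -> dict:
    """Keep every dimension value from either side; a missing metric is None."""
    keys = sorted(set(left) | set(right))
    return {key: (left.get(key), right.get(key)) for key in keys}


# Hypothetical results for two input metrics of a derived metric.
bookings = {"CA": 10, "US": 25}
refunds = {"US": 2, "MX": 1}

inner = inner_join(bookings, refunds)
outer = full_outer_join(bookings, refunds)

print(inner)
print(outer)
```

Rows for CA and MX appear only in the full outer result, each with one missing input, so a derived expression over them may evaluate to null where previously there was no row at all.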

Features

  • Enable DATE PART aggregation for time dimensions (#770)
  • Support Saved Queries in MetricFlow (#765)
  • Support for sort order in query interface (#None)
  • Support for the Dimension(...).grain(...) syntax for the where parameter (#None)
  • Support querying dimensions without metrics. (#804)
  • Join to time spine and fill nulls when requested on metric input measures. (#759)
  • Fill nulls for multi-metric queries (#850)

Fixes

  • Removing methods and reordering parameters for Query Interface. (#None)
  • Coerce time granularity to configured value to prevent finer-grained timestamps from causing unexpected query behavior (#714)
  • Prioritize source nodes based on correct cost (#801)
  • Enables case insensitivity for various query params. (#802)
  • Ensure extract calls return consistent results across engines (#792)
  • The --order param was being dropped from CLI saved queries. (#835)
  • Fix query validation for metric_time requirements (#825)
  • Use FULL OUTER JOIN for dimension-only queries. (#863)

Under the Hood

  • A simple update to make the where filter query parameter objects more accurate (#None)
  • Expose underlying where clause error message (#None)
  • Remove query interface and depend on DSI protocol instead (#None)
  • re-categorize TypeErrors that arise from create_from_where_filter into InvalidQueryException (#None)
  • Add the ability to use distinct select in sql nodes (#None)
  • Removed DatePart Enum and change imports to depend on DSI version instead. (#None)

Dependencies

  • Update to dbt-semantic-interfaces~=0.3.0. (#809)
  • Update typing-extensions minimum version to 4.4 (#823)
  • Update dbt dependencies to ~=1.7.0 (#860)

MetricFlow 0.202.0

06 Sep 22:57
bd4a236

This release adds support for granularity adjustments in window offsets and fixes an issue with BigQuery granularity adjustments for YEAR level granularities.

BigQuery users take note - YEAR granularity now truncates to calendar year start (1st January), which was the original intention, rather than ISO year start, which is the Monday of the week containing the first Thursday of the year. This is now consistent with other engines.
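
The difference between the two truncations can be sketched with Python's standard library. This only illustrates the calendar arithmetic; it does not call BigQuery, and the function names are made up for the example.

```python
from datetime import date


def calendar_year_start(d: date) -> date:
    """Truncate to January 1 of the calendar year."""
    return date(d.year, 1, 1)


def iso_year_start(d: date) -> date:
    """Truncate to the Monday that begins ISO week 1 of the date's ISO year."""
    iso_year = d.isocalendar()[0]
    return date.fromisocalendar(iso_year, 1, 1)


# January 2, 2021 falls in ISO week 53 of ISO year 2020.
sample = date(2021, 1, 2)
print(calendar_year_start(sample))
print(iso_year_start(sample))
```

For dates near a year boundary the two results can differ by more than a year, which is why the earlier behavior could place rows in an unexpected YEAR bucket.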

MetricFlow 0.202.0 - September 06, 2023

Features

  • Adds the option for users to specify group by parameters with object syntax matching the where/filter expressions. (#717)
  • Expose measures for metrics on MFEngine with agg_time_dimension (#735)

Fixes

  • Update dataflow plan to support different granularities with time offset metrics (#726)
  • Removes unneeded JoinOverTimeRangeNode step from dataflow plan. (#743)
  • Update BigQuery YEAR granularity truncation to use January 1st instead of ISOYEAR start (#755)

Dependencies

  • Allow tabulate versions >= 0.8.9 (#762)

MetricFlow 0.201.0

18 Aug 22:37
530b1bb

This release adds duckdb support and provides a partial fix for user-reported issues with partition pruning in BigQuery.

MetricFlow 0.201.0 - August 17, 2023

Features

  • Add dbt-duckdb as a supported adapter and remove legacy DuckDB sql client (#583)

Fixes

  • Remove barrier to partition pruning certain time partition filter predicates (#712)

Under the Hood

  • Make duckdb the standard for all dev-env environment runs, including make test (#723)
  • (#728)

Dependencies

  • Update pandas to 1.5.x (#719)
  • Relax version pins for MetricFlow dependencies (#720)

v0.200.0

07 Aug 21:05
6ff60fc

This release represents a complete overhaul of MetricFlow targeted at a first class integration with dbt! Complete changelog below:

MetricFlow 0.200.0 - August 02, 2023

Breaking Changes

  • License Change - Versions 0 through 0.140.0 were covered by the Affero GPL license. Versions 0.150.0 and later are covered by the BSL license. (#465)
  • Removing time_format from DimensionTypeParams (#494)
  • Use Templates For Defining Metric Filters (#505)
  • Rename Metric.constraint to Metric.filter (#511)
  • Deprecate and refactor CLI commands
  • Removes async query and query cancel methods from SqlClient protocols (#577)
  • Remove time spine introspection and table creation, which may break cumulative metric queries (#592)
  • Remove SqlEngineAttributes construct from SqlClient interface in favor of dialect rendering and engine type properties (#577)
  • CLI needs to be run in a dbt project root directory
  • Remove expr & ratio metrics and bundle with derived metrics. (#504)
  • explain_get_dimension_values & get_dimension_values take a list of metrics parameters
  • Remove MetricFlow config file - all future configuration must originate with the dbt project (#624)
  • Update to dbt-semantic-interfaces==0.1.0.dev8. (#634)
  • Changed the --group-bys option in mf query to be --group-by
  • Add Support for primary_entity in Semantic Models (#694)

Features

  • Added new entity calls to CLI/MetricFlowEngine
  • Script to Generate Snapshots for Supported SQL Engines. (#609)
  • Add dbt adapter support for postgres and enable it for tests (#578)
  • Use dbt adapter to run queries and warehouse validations from MetricFlow CLI (#624)
  • Enable Snowflake queries in dbt <-> MetricFlow CLI integration (#579)
  • Refactor mf tutorial to work alongside a dbt project.
  • New package dbt-metricflow which bundles dbt-core and metricflow and dbt-adapters
  • Add Support for Python 3.10 / 3.11 (#659)
  • Include metric_time in List Dimensions Output Where Appropriate (#673)
  • Enable support for Redshift queries in dbt-metricflow integration (#582)
  • Enable Databricks support for the dbt-metricflow integration (#580)
  • Enable support for BigQuery for dbt metricflow integration users (#581)

Fixes

  • Removes MySQL from SqlEngine and SqlDialect options since it is not supported. (#0)
  • Derived metrics were not respecting the constraint defined in the original input metric's definition.
  • Fixes type error in BigQuerySqlExpressionRenderer (#536)
  • Fix broken type signature for log_call decorator
  • Apply transformations to dbt-generated serialized model to fix issue with query generation (#624)
  • Improve error message rendering in MetricFlow CLI (#646)
  • Added --version and fix manifest transformer rules for dbt-core-=1.6.0b8 (#650)
  • Include granularity suffix on time dimension name rendering for all time dimension granularities
  • Clean up list dimensions outputs

Under the Hood

  • Adding Changie (#457)
  • Ensure use of ValidationIssue instead of ValidationIssueType. ValidationIssueType was from a time before ValidationIssue classes had proper inheritance, and its continued use was becoming problematic for typing.
  • Removing model from the ModelValidator.validate return type. The model isn't altered, and thus doesn't need to be returned.
  • Moving AggregationType enum into dbt-semantic-interfaces
  • Moving errors relevant to dbt_semantic_interfaces to dbt_semantic_interfaces
  • Migrating to RapidFuzz (#470)
  • Matching dbt-core issue templates (#457)
  • Removing the transform CLA (#450)
  • Pinning dbt-core to 1.4 (#475)
  • Removing YamlLint (#472)
  • Add ObjectToReference class in preparation of removing the .reference calls (#463, #464)
  • Moving all *Reference objects to dbt-semantic-interfaces.
  • Add pytest flag to use a persistent source schema for faster repeat testing. (#482)
  • Renamed instances of and related to Identifiers to Entities. (dbt-semantic-interfaces#9)
  • Improves typechecking coverage by upgrading to MyPy 0.942 and removing blanket ignore all imports setting (#536)
  • Push mypy to run using local environment packages in pre-commit. Developers should always use a clean virtual environment to ensure consistency with CI. (#530, #536)
  • Update mypy to 1.3.0 (#546)
  • Migrate from Poetry -> Hatch for Project / Package Management (#549)
  • Enable the ability to return only dimensions requested in the query, specifically used for dimension values queries.
  • Raising an UnsupportedEngineFeatureError instead of a generic RuntimeError when a data platform doesn't support a feature
  • Remove SqlIsolationLevel constructs and other vestigial remnants of defunct SqlClient features (#577)
  • Raise a more specific exception when a Metric isn't found during linking.
  • Update test environment configuration to allow for more streamlined dependencies
  • Remove DDL and other unused methods from SqlClient protocol

Dependencies

  • Switches MetricFlow SemanticManifest dependencies from the local dbt semantic interfaces package to the initial dev release (#540)
  • Clean up unused dependencies, relax tabulate version pin (#545)
  • Update dbt dependencies to support development on the new integration (#571)
  • Move SQLAlchemy and SQL engine dependencies out of the production package (#672)
  • Update dependencies and attribution file in preparation for 0.200.0 release (#703)

v0.140.0

26 Jan 23:10
a3af95b

Highlights

We've added a number of new features, including:

  • Derived metrics
  • Support for joining against versioned dimensions (Slowly Changing Dimensions, or SCD)
  • Percentile measures
  • dbt Cloud support

Breaking Changes

  • Result layout is changing from one row per metric/null dimension valued pair to one row per null dimension value regardless of number of metrics in the query. This only affects queries for multiple metrics where the requested dimensions contain null values. See the description on the relevant PR for more detailed information and an example illustrating how the output will change.
  • Updates to the required SqlClient protocol could cause typechecking failures for users injecting a custom SqlClient implementation into the MetricFlowClient
  • Version minimum changes in SQLAlchemy and snowflake-sqlalchemy could cause dependency conflicts when installed in python environments with libraries requiring an older version of either of these dependencies.
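
The layout change for null dimension values can be pictured with the sketch below. The row shapes are an assumption based on the description above (the PR remains the authoritative example), and `merge_rows_by_dimension` is a hypothetical helper, not part of MetricFlow.

```python
def merge_rows_by_dimension(rows: list[dict], dimension: str) -> list[dict]:
    """Collapse rows that share a dimension value, treating None as a normal key."""
    merged: dict = {}
    for row in rows:
        key = row[dimension]
        target = merged.setdefault(key, {dimension: key})
        for column, value in row.items():
            if column != dimension and value is not None:
                target[column] = value
    return list(merged.values())


# Assumed old layout: one row per metric for the null dimension value.
old_layout = [
    {"country": None, "bookings": 5, "views": None},
    {"country": None, "bookings": None, "views": 7},
    {"country": "US", "bookings": 2, "views": 9},
]

# New layout: a single row for the null dimension value, carrying all metrics.
new_layout = merge_rows_by_dimension(old_layout, "country")
for row in new_layout:
    print(row)
```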

New Feature Details

Derived Metrics (@WilliamDee)

MetricFlow now enables the user to reference metrics in the definition of a metric - an expression of metrics. This feature will further simplify and DRY out code by removing the need to create pre-aggregated subqueries in the data source definition or duplicated measure definitions. For example:

metric:
  name: net_sales_per_user
  owners:
    - nick@company.com
  type: derived
  type_params:
    expr: (gross_sales - cogs) / active_users
    metrics:
      # these are all metrics (can be a derived metric, meaning building a derived metric with derived metrics)
      - name: gross_sales
      - name: cogs
      - name: users
        constraint: is_active # Optional additional constraint
        alias: active_users # Optional alias to use in the expr
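
Grouping matters in an expr like this one. The sketch below evaluates both groupings in plain Python with hypothetical metric values; it assumes the expression follows ordinary arithmetic precedence, where division binds tighter than subtraction, as it does in SQL and Python.

```python
# Hypothetical metric values for a single row.
gross_sales = 1000.0
cogs = 400.0
active_users = 20.0

# Without parentheses, only cogs is divided by active_users.
unparenthesized = gross_sales - cogs / active_users

# With parentheses, net sales is computed first and then divided per user.
parenthesized = (gross_sales - cogs) / active_users

print(unparenthesized)
print(parenthesized)
```

A metric named net_sales_per_user almost certainly wants the parenthesized form.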

Versioned dimension (SCD Type II) join support (@tlento)

MetricFlow now supports versioned dimension (Slowly Changing Dimension (SCD) Type II) joins!
Given an SCD Type II table with an entity key and dimension values with an appropriate start and end timestamp column, you can now fetch the slowly changing dimension from an SCD Type II table through extra configurations in your data source.
For specific details and examples, please see the documentation on slowly changing dimensions.

Percentile measures (@kyleli626)

MetricFlow now supports percentile calculations in measures! Simply specify percentile for the agg type in your data sources and input the desired percentile within agg_params as seen in the documentation for configuring measures. This feature also provides a median aggregation type as a convenience around the appropriate percentile configuration. For example:

measures:
  - name: p99_transaction_value
    description: The 99th percentile transaction value
    expr: transaction_amount_usd
    agg: percentile
    agg_params:
      percentile: .99
      # True will calculate the discrete percentile and False will calculate the continuous percentile
      use_discrete_percentile: False
    create_metric: True
  - name: median_transaction_value
    description: The median transaction value
    expr: transaction_amount_usd
    agg: median
    create_metric: True

Note that MetricFlow allows for choosing between continuous or discrete percentiles via the use_discrete_percentile parameter.
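
The two percentile flavors can be sketched as follows. This follows the usual SQL PERCENTILE_CONT and PERCENTILE_DISC definitions and is an illustration only; the result a given warehouse returns may differ in edge cases. Both helpers assume a non-empty list and 0 <= p <= 1.

```python
import math


def percentile_cont(values: list[float], p: float) -> float:
    """Continuous percentile: linearly interpolate between neighboring values."""
    ordered = sorted(values)
    position = p * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile_disc(values: list[float], p: float) -> float:
    """Discrete percentile: the first actual value whose cumulative share reaches p."""
    ordered = sorted(values)
    rank = max(math.ceil(p * len(ordered)), 1)
    return ordered[rank - 1]


# Hypothetical transaction amounts.
amounts = [10.0, 20.0, 30.0, 40.0]
print(percentile_cont(amounts, 0.5))
print(percentile_disc(amounts, 0.5))
```

The discrete form always returns a value that exists in the data, while the continuous form may return a value between two observations.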

dbt Cloud support (@QMalcolm)

MetricFlow is now available to use with dbt Cloud. Instead of requiring additional MetricFlow config yamls to enable dbt in your MetricFlow model (as in the previous dbt metrics release), the MetricFlow CLI can work off of a semantic model built from dbt Cloud. To use, simply follow these two steps:

  1. Install our dbt Cloud package (e.g., by running pip install "metricflow[dbt-cloud]")
  2. Add the following to .metricflow/config.yaml:
dbt_cloud_job_id: <job_id>
# The following service token MUST have dbt Metadata API access for the project containing the specified job
dbt_cloud_service_token: <dbt_service_token>

Other changes

Added

  • Support for querying metrics without grouping by dimensions (@WilliamDee)
  • A cancel_request API in the SQL client for canceling running queries, with the necessary support for SQL isolation levels and asynchronous query submission (@plypaul)
  • Support for passing in query tags for Snowflake queries (@plypaul)
  • DataFlowPlan optimization to reduce source table scans (@plypaul)
  • Internal API to enable developers to fetch joinable data source targets from an input data source (@courtneyholcomb)

Updated

  • Improved readability of validation error messages (@QMalcolm)
  • Made Postgres engine tests merge-blocking in CI to reduce cycle time on detecting engine-specific errors (@tlento)
  • Updated poetry and python versions in CI to align with our build process and verify all supported Python versions (@tlento)
  • Eliminated data source level primary time dimension requirement in cases where all measures have an aggregation time dimension set (@QMalcolm)
  • Extended support for typed values for bind parameters (@courtneyholcomb)
  • Removed the optional Python levenshtein package from build dependencies in order to streamline package version requirements (@plypaul)
  • Consolidated join validation logic to eliminate code duplication and speed development (@plypaul)
  • Factored join building logic out of DataflowToSqlQueryPlanBuilder to streamline development (@tlento)
  • Improved visibility on underlying errors thrown by sql client requests (@courtneyholcomb)
  • Updated SQLAlchemy and snowflake-sqlalchemy minimum version requirements to resolve a version incompatibility introduced with SQLAlchemy 1.4.42 (@tlento)
  • Added CI coverage for Databricks SQL Warehouse execution environments (@tlento)

Fixed

  • Resolved error encountered in Databricks whenever table rename methods were invoked (@courtneyholcomb)
  • Fixed bug with warehouse measure validation where an error would be inappropriately thrown when users with measure-specific agg_time_dimension configurations attempted to run the full validation suite (@WilliamDee)
  • Issue with parsing explain output for Databricks SQL warehouse configurations (@courtneyholcomb)
  • Floating point comparison errors in CI tests (@tlento)
  • Issue with incomplete time range constraint validation that could result in invalid queries (@plypaul)
  • Resolved GitHub upgrade warnings on use of deprecated APIs and node.js build versions (@tlento)
  • Resolved python-levenshtein optimization warning on CLI startup (@jzhu13)
  • Resolved SQLAlchemy warning about the impending deprecation of the engine.table_names method (@Jstein77)
  • Improved error message for queries with time range constraints which were too narrow for the chosen time granularity (@kyleli626)
  • Eliminate SQL rendering error in BigQuery which would intermittently produce invalid GROUP BY specifications (@tlento)

v0.130.1

19 Oct 20:50
5d6b709

Highlights

We've improved safeguards for proper model development and added support for profile and target overrides for dbt queries!

Added

  • Support for overriding dbt profile and target attributes when querying dbt models (@QMalcolm)
  • Validation to block use of DISTINCT keyword in COUNT aggregation expressions, as this can lead to incorrect results if optimized queries relying on partial aggregation attempt to do something like SUM(counts) to retrieve a less granular total value. (@tlento)
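
The problem with summing distinct counts can be shown with a few rows. The sketch below uses hypothetical event data and plain Python sets; it does not reproduce MetricFlow's query optimization, only the arithmetic that makes partial aggregation unsafe here.

```python
# Hypothetical (day, user) events; alice is active on both days.
events = [
    ("2023-10-01", "alice"),
    ("2023-10-01", "bob"),
    ("2023-10-02", "alice"),
]

# Partial aggregation: distinct users per day.
users_by_day: dict[str, set[str]] = {}
for day, user in events:
    users_by_day.setdefault(day, set()).add(user)
daily_distinct_counts = {day: len(users) for day, users in users_by_day.items()}

# Rolling up by summing the daily counts double-counts alice.
sum_of_daily_counts = sum(daily_distinct_counts.values())

# The correct total counts distinct users across all rows.
true_distinct_count = len({user for _, user in events})

print(sum_of_daily_counts, true_distinct_count)
```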

Updated

  • Made minor improvements to safeguards for internal development (@tlento)

v0.130.0

19 Oct 20:43
9d9d10c

Highlights

Introducing query support for dbt metrics!

With this release you can now use MetricFlow to run queries against your dbt metrics config! If you wish to use the MetricFlow toolchain to query your dbt metrics, you can now do this with a simple configuration change. To use, reinstall MetricFlow with the appropriate dbt package (see below for supported installations) and make sure the following is in your .metricflow/config.yaml:

model_path: /path/to/dbt/project/root
dbt_repo: true

From there you can use all of MetricFlow's tools to query that model!

Our supported installations can be added as follows:

  1. BigQuery: pip install "metricflow[dbt-bigquery]"
  2. Postgres: pip install "metricflow[dbt-postgres]"
  3. Redshift: pip install "metricflow[dbt-redshift]"
  4. Snowflake: pip install "metricflow[dbt-snowflake]"

Packaging changes

  • Use of the new dbt integration requires installation of extended package dependencies. These should not be pulled in by default.
  • Developers experiencing module not found errors with dbt models will need to run the expanded installation via poetry install -E dbt-<data_warehouse> where <data_warehouse> is one of the supported extras noted above.

Updated

  • Internal refactor to use the more appropriate MetricReferences as lookup keys in place of MetricSpec classes. (@WilliamDee)