Semantic layer in dbt – dbt Docs

Posted on by By admin, in DBT | 0

The dbt Semantic Layer, powered by MetricFlow, simplifies the process of defining and using critical business metrics, like revenue in the modeling layer (your dbt project). By centralizing metric definitions, data teams can ensure consistent self-service access to these metrics in downstream data tools and applications. The dbt Semantic Layer eliminates duplicate coding by allowing data teams to define metrics on top of existing models and automatically handles data joins.

Moving metric definitions out of the BI layer and into the modeling layer allows data teams to feel confident that different business units are working from the same metric definitions, regardless of their tool of choice. If a metric definition changes in dbt, it’s refreshed everywhere it’s invoked and creates consistency across all applications.

What is MetricFlow?

MetricFlow, which powers the dbt Semantic Layer, helps you define and manage the logic for your company’s metrics. MetricFlow handles SQL query construction and defines the specification for dbt semantic models and metrics. It allows you to define metrics in your dbt project and query them with MetricFlow commands whether in dbt Cloud or dbt Core.

Create a semantic model

Semantic models consist of entities, dimensions, and measures.Semantic models are nested under the semantic_models key

semantic_models:
  #The name of the semantic model.
  - name: orders
    defaults:
agg_time_dimension: ordered_at
    description: |
      Order fact table. This table is at the order grain with one row per order. 
    #The name of the dbt model and schema
    model: ref('orders')

Define your entities. These are the keys in your table that MetricFlow will use to join other semantic models. These are usually columns

  #Entities. These usually correspond to keys in the table.
    entities:
      - name: order_id
        type: primary
      - name: location
        type: foreign
        expr: location_id
      - name: customer
        type: foreign
        expr: customer_id

Define your dimensions and measures. Dimensions are properties of the records in your table that are non-aggregatable. They provide categorical or time-based context to enrich metrics. Measures are the building block for creating metrics. They are numerical columns that MetricFlow aggregates to create metrics.

    #Measures. These are the aggregations on the columns in the table.
    measures: 
      - name: order_total
        description: The total revenue for each order.
agg: sum
      - name: order_count
        expr: 1
agg: sum
      - name: tax_paid
        description: The total tax paid on each order. 
agg: sum
      - name: customers_with_orders
        description: Distinct count of customers placing orders
agg: count_distinct
        expr: customer_id
      - name: locations_with_orders
        description: Distinct count of locations with order
        expr: location_id
agg: count_distinct
      - name: order_cost
        description: The cost for each order item. Cost is calculated as a sum of the supply cost for each order item. 
agg: sum
  #Dimensions. Either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.  
    dimensions:
      - name: ordered_at
        type: time
type_params:
time_granularity: day 
      - name: order_total_dim
        type: categorical
        expr: order_total
      - name: is_food_order
        type: categorical
      - name: is_drink_order
        type: categorical  

We at Helical have more than 10 years of experience in providing solutions and services in the domain of data and have served more than 85+ clients. We are also DBT partners, hence in case if you are looking for certain assistance, consulting, services please do reach out on nikhilesh@Helicaltech.com

0 0 votes
Article Rating
Subscribe
Notify of
0 Comments
Inline Feedbacks
View all comments