Skip to main content

Command Palette

Search for a command to run...

Approximate Aggregate Functions in BigQuery

Updated
1 min read
Approximate Aggregate Functions in BigQuery
C

Senior Data Engineer • Contractor / Freelancer • GCP & AWS Certified

Sometimes you don't need perfect, but just good enough. Take approximate aggregate functions in BigQuery, for example.

These are a type of aggregate functions that produce approximate results instead of exact ones but have the upside of typically requiring fewer resources for the computation.

When would I use one? This would be suitable where we can live with an uncertainty or small difference, especially for huge tables, during a preliminary check or data exploration.

Let's look at a practical example. Suppose we have the following data:

APPROX_TOP_COUNT will compute the approx top N elements and their value counts

SELECT 
    APPROX_TOP_COUNT(value, 5) AS top_value_counts 
FROM `learning.data_source`

APPROX_COUNT_DISTINCT will compute the approx distinct count (also can be grouped)

SELECT 
    APPROX_COUNT_DISTINCT(value) AS approx_distinct_value_count 
FROM `learning.data_source`

You can discover more approximate aggregate functions in the documentation.

Thanks for reading!

Found it useful? Subscribe to my Analytics newsletter at notjustsql.com.


Enjoyed this? Here are some related articles you might find useful:

Practical SQL

Part 1 of 50

Short, practical posts on SQL and BigQuery — from core language features to advanced query patterns. A reference for data practitioners at every level.

More from this blog

D

Datawise — SQL, BigQuery & Python for Data Engineers

205 posts

Data Engineer with a passion for transforming complex data landscapes into insightful stories. Here on my blog, I share insights, challenges, and the ever-evolving dance of technology and business.