Skip to main content

Command Palette

Search for a command to run...

Why you should care about partition pruning in BigQuery

Updated
1 min read
Why you should care about partition pruning in BigQuery
C

Senior Data Engineer • Contractor / Freelancer • GCP & AWS Certified

When it comes to performance improvements and cost savings, handling only as much data as we need is very important. And partitioning is a cornerstone here.

Now, when working with tables that are partitioned, BigQuery tries to exclude the partitions it does not need (akin to pruning a tree) based on filters (WHERE clause) and JOINS, thus saving you time and processing power (and money).

But it's not always that simple. If you perform operations on the partitioned field (say the date field in a date-partitioned table), Big Q might not be able to prune the table accordingly. So you'll end up processing the entire massive table, even though you were only after one single day.

There's an example below with this happening when converting the date to a different timezone, but I've seen it happen with other operations. Pruning would not work in MERGE statement sourced from two UNIONed partitioned tables.

Check the number of rows read from the table in the examples below.

Found it useful? Check out to my Analytics newsletter at notjustsql.com.


Enjoyed this? Here are some related articles you might find useful:

More from this blog

D

Datawise: A blog on SQL, BigQuery, Analytics, Python Tips and Tricks

204 posts

Data Engineer with a passion for transforming complex data landscapes into insightful stories. Here on my blog, I share insights, challenges, and the ever-evolving dance of technology and business.