Skip to main content

Command Palette

Search for a command to run...

Pay attention to cardinality & grain when UNNESTING in BigQuery!

Updated
1 min read
Pay attention to cardinality & grain when UNNESTING in BigQuery!
C

Senior Data Engineer • Contractor / Freelancer • GCP & AWS Certified

Whenever you're UNNESing an ARRAY, you're getting a Cartesian product between the row and the array contents. If you were to unnest another array, you'll get another Cartesian product between the output of the previous unnest and the elements in the current array.

Let's look at an example. A student has their grades stored in an ARRAY as well as their food allergies in another ARRAY.

If we are to UNNEST both array we'll end having count_of_grades x count_of_allergies rows for each student, 4x3 in this case.

Why this happens? Well the allergies and grades have no relationship between each other, they just refer to the same student row.

Take this into account when you're working with nested data.

Found it useful? Check out to my Analytics newsletter at notjustsql.com.


Enjoyed this? Here are some related articles you might find useful:

More from this blog

D

Datawise: A blog on SQL, BigQuery, Analytics, Python Tips and Tricks

204 posts

Data Engineer with a passion for transforming complex data landscapes into insightful stories. Here on my blog, I share insights, challenges, and the ever-evolving dance of technology and business.

BigQuery UNNEST Multiple Arrays: Avoid Cartesian Products