How often do you use STRUCTs in BigQuery? I do a lot, and here's an interesting use case.
First, what is a STRUCT?
So, if you're not familiar with them, a STRUCT is a data type used to represent an object, allowing us to group related fields within one data cell.
Think of it as the ability to store complex 'things' inside a BigQuery row, in addition to the 'primitive' types like INT64 or STRING. So you can have a STRUCT 'person' that has attributes like name, age and salary. This can of course be REPEATED to obtain an ARRAY of person STRUCTs.
Now, STRUCTs are useful for many things, one of which I'm going to present today.
Here is a trick I use when working with window functions like LEAD and LAG that involve STRUCTs in BigQuery.
Problem statement
Did you ever have to retrieve the historical attribute (previous or next) of an entity in a temporal table (SCD-type 2)?
Here's how an example could look.
SELECT
id,
value_int,
value_text,
valid_from,
valid_to,
LEAD(value_int) OVER (PARTITION BY id ORDER BY valid_from) AS next_value_int,
LEAD(value_text) OVER (PARTITION BY id ORDER BY valid_from) AS next_value_text,
LAG(value_int) OVER (PARTITION BY id ORDER BY valid_from) AS prev_value_int,
LAG(value_text) OVER (PARTITION BY id ORDER BY valid_from) AS prev_value_text
FROM `learning.input_data`
ORDER BY id, valid_from
Okay, but what if you have a dozen attributes? Instead of writing tens of LEAD or LAG functions, leverage STRUCT and look up an entire STRUCT of attributes.
The solution
Here's an adapted example that uses STRUCTs.
WITH input_data AS (
SELECT
id,
value_int,
value_text,
STRUCT(value_int, value_text) AS value,
valid_from,
valid_to
FROM `learning.input_data` )
SELECT
id,
value_int,
value_text,
valid_from,
valid_to,
LAG(value) OVER (PARTITION BY id ORDER BY valid_from) AS prev_value,
LEAD(value) OVER (PARTITION BY id ORDER BY valid_from) AS next_value
FROM input_data
ORDER BY id, valid_from
This way, you can use LEAD or LAG only once, regardless of how many attributes you need to look up.
There are of course other interesting use cases for STRUCTs, which we will explore in upcoming posts. Stay tuned!
Found it useful? Subscribe to my Analytics newsletter at notjustsql.com.