In one of the previous posts about BigQuery labels, I provided an example showcasing the usage of the INFORMATION_SCHEMA JOBS view when analyzing query statistics per label.
Now what are these views? These are system-generated views that provide metadata about your datasets, tables, columns, jobs, partitions, constraints and more.
Just wanted to highlight that INFORMATION_SCHEMA view (JOBS included) can do much, much more than that.
Look into your table storage costs, query resource consumption, build dynamic queries, and leverage partition metadata for incremental pipelines - all of this can benefit from the INFORMATION_SCHEMA views.
Let's look at a couple of interesting use cases.
JOBS is one of the most powerful views when you’re interested in the queries run in that project. It can provide information about:
slot time used
bytes processed and billed
errors if any and so much more!
SELECT creation_time, job_type, query, total_bytes_billed, total_bytes_processed, total_slot_ms, referenced_table.table_id AS referenced_table_name FROM `region-eu`.INFORMATION_SCHEMA.JOBS, UNNEST(referenced_tables) AS referenced_table
This view contains information about columns, such as:
tables where they reside
whether they are nullable or not
default values if any
Need a dynamic UNPIVOT? Fetch the column names dynamically using this view.
DECLARE myunpivot STRING; SET myunpivot = ( SELECT CONCAT('(', STRING_AGG( column_name, ','), ')'), From( SELECT column_name FROM learning.INFORMATION_SCHEMA.COLUMNS where table_name ="Customer_Data" and column_name not in("CustomerId") )); EXECUTE IMMEDIATE format(""" SELECT * FROM ( SELECT * FROM learning.Customer_Data ) unpivot ( value FOR keys in %s ) """, myunpivot);
CONSTRAINT COLUMN USAGE
If you’re interested in column usage in constraints (newly added BigQuery feature), there is this dataset-level view, showcasing which constraints are applied to columns.
SELECT * EXCEPT(table_catalog, constraint_catalog) FROM testing.INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE;
Looking to dive into table storage, including physical and logical storage, so you can estimate your costs? Use one of the table storage views (per project or region, as below).
SELECT * FROM region-eu.INFORMATION_SCHEMA.TABLE_STORAGE;
But there's more!
Building an incremental pipeline and you’d like information about the last time a Partition has been modified? There’s the PARTITIONS view.
Looking to find object access grants? There’s OBJECT_PRIVILEGES
Want to see a list of Table Snapshots? There is a view for that too.
Be sure to check the BigQuery documentation for the latest list of INFORMATION SCHEMA views you can use