What are Scheduled Queries?
Scheduled Queries in BigQuery allow users to run SQL tasks on a predefined, automated schedule.
Instead of manually initiating a query each day or week, BigQuery can do it for you. This is perfect for light orchestration of repetitive data transformations, updates, and regular reporting tasks.
Creating a Scheduled Query manually
A scheduled query can be created manually using the Schedule button in the Query window.
A dialog box is presented where we are prompted to provide the following information:
a name for the scheduled query
a scheduling option or the option to run it manually (on demand)
start date and end date for which this query would run
optionally, a destination dataset and table for the query result (including the possibility of adding a partitioning field)
region selection
encryption options
principal (user or service account) to run the query under
write disposition, in order words what to do with the contents already in the table: keep (and append new data) or overwrite (and replace with the new data)
notification options
Creating a scheduled query using Terraform
A schedule can also be created using Terraform, as per the following example.
This option provides a subset of options from the manual method - at the time of writing, for example, the write disposition cannot be set up here.
variable "envConfig" {
type = map(object({
project_id = string
service_account_name = string
}))
}
variable "config" {
type = map(object({
name = string
query = string
destination_table_name_template = string
}))
}
resource "google_bigquery_data_transfer_config" "query_config" {
for_each = var.config
display_name = each.value["name"]
location = "europe"
data_source_id = "scheduled_query"
schedule = "every saturday 05:00"
destination_dataset_id = "learning"
params = {
destination_table_name_template = each.value["destination_table_name_template"]
write_disposition = "WRITE_APPEND"
query = each.value["query"]
}
service_account_name = var.envConfig["dev"].service_account_name
project = var.envConfig["dev"].project_id
}
Creating a Scheduled Query using the API
It's also possible to create a scheduled query using one of the BigQuery APIs or the bq CLI command - check the GCP documentation here.
Viewing scheduled queries
Viewing the scheduled query can be done by Accessing the 'Scheduled queries' option in the BigQuery subgroup.
It will display a list of queries, their schedule, region, destination (if any) and next run time.
Click on a particular query would present a Run history and its output. There is also an option for scheduling a backfill (which we can use for a Manual Run).
Clicking on the Configuration tab would display the configurations used to create the query.
In conclusion, BigQuery Scheduled queries can be a useful tool in your toolset, as a quick and easy way to do light orchestration of SQL tasks.
Thanks for reading!
๐ Delving into BigQuery, Google Cloud and Analytics? Stay connected with me for valuable insights, handy tips, and strategies to effortlessly traverse the vast universe of cloud computing and data analytics!
Found it useful? Subscribe to my Analytics newsletter at notjustsql.com.