Spreadsheets are a central part of a modern workplace. Pretty much every office workplace uses one. It's also quite used for personal use cases. One such product is Google Sheets.
I use it for many things - organizing finance, planning travel itineraries, collecting data and expenses.
Today I want to illustrate how we can integrate data from Google Sheets to BigQuery.
In BigQuery, a Google Sheets page can be represented as an external table - a type of table where the data is stored outside of BigQuery.
Where to use it?
I've seen them used for holding static (seed) data and mapping tables, an improvised Master Data Management of sorts, allowing analysts to edit data as business realities change.
What to pay attention to?
Security - who can view and edit Sheets
History - how do you handle change in sheets? Do you need snapshots?
Recovery - what if someone deletes data from the worksheet?
Let's see an example of how to set it up.
Setting up a Google Sheet as an External Table
We start by adding a new data source.
We pick the Google Drive as the source.
We provide the necessary configuration:
Google Sheets URL
Destination Table: project, dataset, table name
We provide the schema: column names and types
We provided the number of header rows to be skipped
After we hit Create Table the external table is created. We can verify that the data has been properly created.
Viewing the DDL of an (external) table
We might need to see the DDL (data definition language) statement for a table. This can be achieved using one of the INFORMATION_SCHEMA tables.
Using the query below you can see how the (external in our case) table is defined.
SELECT
table_name, ddl
FROM
learning.INFORMATION_SCHEMA.TABLES;
CREATE EXTERNAL TABLE `learning.example_table`
(
FirstName STRING OPTIONS(description="First Name"),
LastName STRING OPTIONS(description="Last Name"),
City STRING OPTIONS(description="City"),
Country STRING OPTIONS(description="Country")
)
OPTIONS(
sheet_range="A1:D4",
skip_leading_rows=1,
format="GOOGLE_SHEETS",
uris=["https://docs.google.com/spreadsheets/d/**edited_text**/edit#gid=0"]
);
Thanks for reading and keep enjoying SQL!
Found it useful? Subscribe to my Analytics newsletter at notjustsql.com.