Have you ever worked with sharded tables in BigQuery?

I encountered them in a project a long time ago and haven't seen them around much since.

Does the name not ring a bell?

Well, think of it as pseudo-partitioning: a way to store data split across several tables that share a name and differ only in their suffix, as in `dataset.table_name_{your_suffix}`.

You end up with separate tables, but they can be queried as one by putting a wildcard `*` in the table name. The query then retrieves data from all the tables matching the wildcard, like having an invisible UNION ALL behind the scenes.
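To make the "invisible UNION ALL" idea concrete, here is a minimal Python sketch that spells out what a wildcard query is conceptually equivalent to. The table names and the `expand_wildcard` helper are made up for illustration, and BigQuery resolves wildcards internally rather than literally rewriting the query like this.

```python
def expand_wildcard(wildcard_table: str, existing_tables: list[str]) -> str:
    """Build the UNION ALL query that a wildcard query conceptually stands for.

    Illustration only: this is a hypothetical helper, not a BigQuery API.
    """
    if not wildcard_table.endswith("*"):
        raise ValueError("expected a table name ending in '*'")
    prefix = wildcard_table[:-1]
    selects = [
        # In a real wildcard query the matched suffix is exposed through the
        # _TABLE_SUFFIX pseudo-column; here it is emulated as a plain column.
        f"SELECT *, '{name[len(prefix):]}' AS table_suffix FROM `{name}`"
        for name in sorted(existing_tables)
        if name.startswith(prefix)
    ]
    return "\nUNION ALL\n".join(selects)


# Hypothetical tables living in one dataset.
tables = ["dataset.events_20240101", "dataset.events_20240102", "dataset.users"]

# Roughly what SELECT * FROM `dataset.events_*` fans out to.
query = expand_wildcard("dataset.events_*", tables)
print(query)
```

In a real query you would just write the wildcard and, if needed, narrow the shards with a filter such as `WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240102'`.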

In practice, the suffixes I've seen were mostly dates in the form YYYYMMDD. For that case, the BigQuery docs discourage sharding, citing the overhead of storing a separate schema and metadata for every table, plus the permission checks on each table queried, compared to just using a date-partitioned table.

So you're better off using a partitioned table in this case.

The docs even offer a quick way to convert a group of date-sharded tables into a regular date-partitioned table (the `bq partition` command).
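As a rough picture of what such a conversion amounts to, the sketch below maps each date-sharded table to the daily partition it would land in, written with the `table$YYYYMMDD` partition decorator. The `shard_to_partition` helper and the table names are assumptions for illustration; the actual conversion is done by BigQuery tooling, not by code like this.

```python
from datetime import date, datetime


def shard_to_partition(shard_name: str, prefix: str, target_table: str) -> tuple[str, date]:
    """Map one date-sharded table name to a partition of the target table.

    Hypothetical helper for illustration; it only manipulates names.
    """
    if not shard_name.startswith(prefix):
        raise ValueError(f"{shard_name!r} does not start with {prefix!r}")
    suffix = shard_name[len(prefix):]
    # Require exactly eight digits, because strptime alone would also accept
    # shorter strings such as '2024011'.
    if len(suffix) != 8 or not suffix.isdigit():
        raise ValueError(f"suffix {suffix!r} is not in YYYYMMDD form")
    day = datetime.strptime(suffix, "%Y%m%d").date()
    return f"{target_table}${suffix}", day


# Hypothetical shards and target table.
shards = ["dataset.events_20240101", "dataset.events_20240102"]
mapping = [shard_to_partition(s, "dataset.events_", "dataset.events") for s in shards]
for partition, day in mapping:
    print(partition, day.isoformat())
```

The point of the mapping is that every shard becomes one partition of a single table, so the schema and metadata are stored once instead of once per day.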

Have you ever encountered any interesting use cases for sharding?

![](https://miro.medium.com/v2/resize:fit:1400/0*z0MMYxfNVgEss3Ca align="left")

*Found it useful? Check out my Analytics newsletter at* [*notjustsql.com*](https://www.notjustsql.com)*.*

Sharded tables in BigQuery
