Write better SQL queries faster using Mock Data

Photo by Thao LEE on Unsplash

Write better SQL queries faster using Mock Data


2 min read

๐Ÿ› ๏ธ Here's a technique I frequently employ to streamline my query development process.

Ever find yourself stuck in the complexity of writing non-trivial queries, especially with voluminous tables? ๐Ÿค” The amalgamation of query complexity and business problems can sometimes make writing your #SQL take more time than necessary.

When exploring a new coding approach, utilizing an unfamiliar function, or navigating through unseen data, I create a simplified, representative example using dummy data and a few test cases.

In a blank #BigQuery window, I write a couple of Common Table Expressions (CTEs) to simulate input, then focus on mimicking only the pivotal columns - maintaining the grain and incorporating one representative value, along with the join columns. A single value can often represent a surrogate key for the grain of the tables.

๐Ÿš€ This approach enables me to:

- Iterate rapidly, considering edge cases irrespective of the data subset.
- Sharpen the technical requirements by applying the solution to the available data and addressing the business question.
- Contemplate how the technical solution can be optimized for data scale and manage real-world corner cases or unexpected scenarios.
- Save processing costs, since I'm working only on kilobytes of data until I have a solution that is >80% ready

๐Ÿ‘€ So, the next time you have to write a big query (pun unintended), especially when dealing with unfamiliar data or functions, take a moment to code a swift example in a blank BigQuery window. It pays off to think simply first.

๐Ÿ”„ Is this Test-Driven Development (TDD) for BigQuery? Perhaps. I like to term it Iterative Development.

Did you find this article valuable?

Support Constantin Lungu by becoming a sponsor. Any amount is appreciated!