Skip to main content

Command Palette

Search for a command to run...

Comparing ranking functions in BigQuery

Updated
2 min read
Comparing ranking functions in BigQuery
C

Senior Data Engineer • Contractor / Freelancer • GCP & AWS Certified

One of the most common sightings in SQL code is using ranking functions. It's simple but we must surely get it right. I use ROW_NUMBER very often for de-duplication, also used DENSE_RANK a couple of times - but I've never used RANK. So how are they different?

- ROW_NUMBER - sequential row number, regardless of equal values. If no order is provided, the results might be different every time you run it (aka non-deterministic): 1,2,3,4,5

- DENSE_RANK - also a sequential row number, but takes into account peer rows (with equal values). next rank is the immediate following: 1,2,2,3 (no gaps)

- RANK - also a sequential row number, takes into account rows with equal values, but next rank is incremented: 1,2,2,4 (with gaps)

See below for an example of how it all works.

SELECT 
  store_code, 
  country, 
  sales_usd, 
  ROW_NUMBER() OVER country_sales AS row_no,
  DENSE_RANK() OVER country_sales AS dense_rnk,
  RANK() OVER country_sales       AS rnk

FROM input_data

WINDOW country_sales AS (PARTITION BY country ORDER BY sales_usd DESC)

Thanks for reading!

Found it useful? Subscribe to my Analytics newsletter at notjustsql.com.


Enjoyed this? Here are some related articles you might find useful:

More from this blog

D

Datawise: A blog on SQL, BigQuery, Analytics, Python Tips and Tricks

204 posts

Data Engineer with a passion for transforming complex data landscapes into insightful stories. Here on my blog, I share insights, challenges, and the ever-evolving dance of technology and business.