Photo by frank mckenna on Unsplash
A portable Data Analytics stack using Docker, Mage, dbt-core, DuckDB and Superset
Just wanted to share a small learning-by-doing project of mine. It's a containerized Data Analytics suite covering the end-to-end analytics process for a small (imaginary) company.
We're talking about:
- generating example data in parquet files using Python
- ingesting data into DuckDB
- modeling data using dbt-core
- loading a DuckDB datamart
- orchestrating the pipeline using MageAI
- displaying it all in a Superset dashboard.
Each component runs in a separate Docker container, all tied together with docker-compose.
I've previously set up similar projects with Airflow and Dagster.
It's pretty bare-bones (somewhat as intended) and has some rough edges, but it should be a good starting point for a demo, a template, or for learning how all these components work together.
I would of course appreciate any feedback or suggestions on how to make it better.
Found it useful? Check out my Analytics newsletter at notjustsql.com.