Dagster ETL pipeline

In this tutorial, you'll build a full ETL pipeline with Dagster that:

- Ingests data into DuckDB
- Transforms data into reports with dbt
- Runs scheduled reports automatically
- Generates one-time reports on demand
- Visualizes the data with Evidence

Prerequisites

To follow the steps in this guide, you'll need:

- Python 3.10+ and uv installed. For more information, see the Installation guide.
- Familiarity with Python and SQL.
- A basic understanding of data pipelines and the extract, transform, and load (ETL) process.
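
As a quick way to confirm the first prerequisite, the following minimal sketch checks the running Python version against the 3.10 minimum and looks for a `uv` executable on your PATH. The helper names `python_is_supported` and `tool_on_path` are hypothetical and exist only for this illustration; they are not part of Dagster or uv.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the prerequisites


def python_is_supported(version=None):
    """Return True if a (major, minor, ...) version meets the minimum."""
    # Default to the interpreter that is running this script.
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def tool_on_path(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("uv found on PATH:", tool_on_path("uv"))
```

If either check prints `False`, revisit the Installation guide before continuing.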

Step 1: Set up your Dagster environment

Follow either the uv instructions or the pip instructions below.

Using uv:

Open your terminal and scaffold a new Dagster project:
```
uvx -U create-dagster project etl-tutorial
```
When prompted, respond y to run uv sync after scaffolding.
Change to the etl-tutorial directory:
```
cd etl-tutorial
```
Activate the virtual environment:
- MacOS/Unix: source .venv/bin/activate
- Windows: .venv\Scripts\activate

Using pip:

Open your terminal and scaffold a new Dagster project:
```
create-dagster project etl-tutorial
```
Change to the etl-tutorial directory:
```
cd etl-tutorial
```

Create and activate a virtual environment:

MacOS/Unix:

python -m venv .venv
source .venv/bin/activate

Windows:

python -m venv .venv
.venv\Scripts\activate
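
To confirm that the virtual environment is active after either set of instructions, one common check compares `sys.prefix` with `sys.base_prefix`; the two differ when Python runs inside a venv. The sketch below wraps that comparison in a hypothetical helper, `in_virtualenv`, which is introduced here for illustration and is not part of Dagster.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter prefix:", sys.prefix)
```

Run it with the same `python` you activated; if it reports that no environment is active, repeat the activation command for your platform.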

Step 2: Launch the Dagster webserver

To make sure Dagster and its dependencies were installed correctly, navigate to the project root directory and start the Dagster webserver:

dg dev
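
Once the command is running, you can check from a second terminal that the webserver is accepting connections. The sketch below assumes the server listens on 127.0.0.1 port 3000; that address is an assumption not stated in this guide, so use the address that dg dev prints in your terminal if it differs. The helper `port_is_open` is hypothetical and only attempts a TCP connection.

```python
import socket


def port_is_open(host="127.0.0.1", port=3000, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    # The default host and port are assumptions; adjust them to match
    # the address shown in the dg dev output.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("Webserver reachable:", port_is_open())
```

A `True` result only shows that something is listening on that port; open the address in a browser to confirm that it is the Dagster UI.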

Next steps

Continue this example with the extract data step.
