Zus
Zus offers tabular FHIR data in the form of data marts delivered via Snowflake data share, and makes its schema publicly available.
This is an early release of the connector; as with any connector, we welcome and look forward to community contributions to improve it.
Getting Started
Step 1: Prerequisites
Before you begin, ensure you have the following:
- Access to your data warehouse: Credentials and network access to your data warehouse instance (e.g. Snowflake, BigQuery).
- Zus FHIR Data Marts: Your raw Zus data must be loaded into specific tables within your data warehouse.
- dbt CLI Installed: You need dbt (version 1.9 recommended) installed in the environment where you'll run the transformations. See the dbt installation guide for help with installation.
- Git: You need Git installed to clone this project repository.
- Authentication Details: These are needed to connect dbt to your warehouse via a profiles.yml file (see Step 5).
Step 2: Clone the Repository
Open your terminal or command prompt and clone this project:
git clone https://github.com/tuva-health/zus_connector.git
cd zus_connector
Step 3: Create and Activate Virtual Environment
It's highly recommended to use a Python virtual environment to manage project dependencies. This isolates the project's packages from your global Python installation.
- Create the virtual environment (run this inside the zus_connector directory):
# Use python3 if python defaults to Python 2
python -m venv venv
This creates a venv directory within your project folder.
- Activate the virtual environment:
- macOS / Linux (bash/zsh):
source venv/bin/activate
- Windows (Command Prompt):
venv\Scripts\activate.bat
- Windows (PowerShell):
.\venv\Scripts\Activate.ps1
- Windows (Git Bash):
source venv/Scripts/activate
You should see (venv) prepended to your command prompt, indicating the environment is active.
Step 4: Install Python Dependencies
With the virtual environment active, install the required Python packages, including dbt and the warehouse-specific dbt adapter (e.g. dbt-snowflake, dbt-bigquery).
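As a minimal sketch, assuming a Snowflake target (swap in dbt-bigquery for BigQuery), the install might look like this; the pinned version is illustrative and should match the dbt version this project supports:
# Upgrade pip, then install the Snowflake adapter
# (dbt-core is pulled in automatically as a dependency)
python -m pip install --upgrade pip
python -m pip install "dbt-snowflake~=1.9.0"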
Step 5: Configure profiles.yml for Data Warehouse Connection
dbt needs to know how to connect to your data warehouse. In general, this is done via a profiles.yml file, which you need to create. This file should NOT be committed to Git, as it contains sensitive credentials.
- Location: By default, dbt looks for this file in ~/.dbt/profiles.yml (your user home directory, in a hidden .dbt folder).
- Content: See the dbt docs.
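For illustration only, a Snowflake profile might look like the sketch below. Every value is a placeholder, and the top-level profile name (zus_connector here) is an assumption; it must match the profile: setting in this project's dbt_project.yml.
zus_connector:        # assumed profile name; check dbt_project.yml
  target: dev
  outputs:
    dev:
      type: snowflake
      account: your_account_identifier
      user: your_username
      password: your_password
      role: your_role
      warehouse: your_warehouse
      database: your_database
      schema: your_schema
      threads: 4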
Step 6: Install dbt Package Dependencies
This project relies on external dbt packages (The Tuva Project and dbt_utils). Run the following command in your terminal from the project directory (the one containing dbt_project.yml):
dbt deps
This command reads packages.yml and downloads the necessary code into the dbt_packages/ directory within your project.
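For reference, a packages.yml declaring these dependencies typically looks something like the sketch below; the version ranges are placeholders, so rely on the packages.yml committed in this repository rather than these values:
packages:
  - package: tuva-health/the_tuva_project
    version: [">=0.14.0", "<1.0.0"]   # placeholder range
  - package: dbt-labs/dbt_utils
    version: [">=1.1.0", "<2.0.0"]    # placeholder range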
Step 7: Test the Connection
Before running transformations, verify that dbt can connect to your data warehouse using your profiles.yml settings:
dbt debug
Look for "Connection test: OK connection ok". If you see errors, double-check your profiles.yml settings (account, user, role, warehouse, authentication details, paths).
Running the Project
Once setup is complete, you can run the dbt transformations:
Full Run (Recommended First Time). The dbt build command will:
- Run all models (.sql files in models/).
- Run all tests (.yml, .sql files in tests/).
- Materialize tables/views in your target data warehouse as configured.
dbt build
This might take some time depending on the data volume and warehouse size.
Run Only Models:
If you only want to execute the transformations without running tests:
dbt run
Run Only Tests:
To execute only the data quality tests:
dbt test
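If you later want to limit a run or test pass to a subset of the project, dbt's standard node selection syntax applies here as well; the model name below is hypothetical:
# Run one model plus everything downstream of it
dbt run --select my_model+
# Test a single model
dbt test --select my_model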