Canvas

Code on GitHub

This is an early release of the connector. As with any connector, we welcome community contributions to improve it.

Getting Started

Step 1: Prerequisites

Before you begin, ensure you have the following:

Access to your data warehouse: Credentials and network access to your data warehouse instance (e.g. Snowflake, BigQuery).
Canvas EHR data: Your raw Canvas data must be loaded into specific tables within your data warehouse.
dbt CLI Installed: You need dbt (version 1.9 recommended) installed on the machine or environment where you'll run the transformations. See the dbt Installation Guide for help with installation.
Git: You need Git installed to clone this project repository.
Authentication Details: You will need your data warehouse authentication details to connect dbt to the warehouse through a profiles.yml file.

Step 2: Clone the Repository

Open your terminal or command prompt and clone this project:

git clone https://github.com/tuva-health/canvas_connector.git
cd canvas_connector

Step 3: Create and Activate Virtual Environment

It's highly recommended to use a Python virtual environment to manage project dependencies. This isolates the project's packages from your global Python installation.

Create the virtual environment (run this inside the canvas_connector directory):

# Use python3 if python defaults to Python 2
python -m venv venv

This creates a venv directory within your project folder.

Activate the virtual environment:

macOS / Linux (bash/zsh): source venv/bin/activate
Windows (Command Prompt): venv\Scripts\activate.bat
Windows (PowerShell): .\venv\Scripts\Activate.ps1
Windows (Git Bash): source venv/Scripts/activate

You should see (venv) prepended to your command prompt, indicating the environment is active.

Step 4: Install Python Dependencies

With the virtual environment active, install the required Python packages, including dbt and the warehouse-specific dbt adapter (e.g. dbt-snowflake, dbt-bigquery).
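
The step above does not spell out the install command. A minimal sketch, assuming Snowflake as the target warehouse and the recommended dbt 1.9 release line, might look like the following. The ADAPTER variable and the version pin are illustrative assumptions, not something this project prescribes; substitute the adapter that matches your warehouse (e.g. dbt-bigquery).

```shell
# Assumption: Snowflake is the target warehouse. Change ADAPTER to match yours,
# e.g. dbt-bigquery.
ADAPTER="dbt-snowflake"

# Upgrade pip inside the active virtual environment.
python -m pip install --upgrade pip

# Install dbt Core on the 1.9 release line plus the warehouse adapter.
python -m pip install "dbt-core~=1.9.0" "$ADAPTER"

# Confirm which dbt Core and adapter versions were installed.
dbt --version
```

If dbt --version lists both dbt Core and your adapter, the environment is ready for the next step.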

Step 5: Configure profiles.yml for Data Warehouse Connection

dbt needs to know how to connect to your data warehouse. In general, this is done via a profiles.yml file, which you need to create. This file should NOT be committed to Git, as it contains sensitive credentials.

Location: By default, dbt looks for this file in ~/.dbt/profiles.yml (your user home directory, in a hidden .dbt folder).
Content: The required fields depend on your warehouse adapter. See the dbt docs on connection profiles for your adapter.
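
As a rough illustration of what such a file can contain, the sketch below writes a Snowflake-style profile that reads every credential from environment variables through dbt's env_var function, so no secret is stored in the file itself. Everything specific here is an assumption: your_profile_name is a placeholder that must match the profile key in this project's dbt_project.yml, the target name dev and the SNOWFLAKE_* variable names are hypothetical, and other warehouses need different fields.

```shell
# Sketch only: creates a Snowflake-style profiles.yml if none exists yet.
# Assumptions: profile name, target name, and env var names are placeholders.
PROFILES_DIR="${DBT_PROFILES_DIR:-$HOME/.dbt}"
mkdir -p "$PROFILES_DIR"

if [ -e "$PROFILES_DIR/profiles.yml" ]; then
  # Never overwrite an existing profile; edit it by hand instead.
  echo "profiles.yml already exists in $PROFILES_DIR; leaving it untouched" >&2
else
  # The quoted 'EOF' keeps the shell from expanding anything in the template.
  cat > "$PROFILES_DIR/profiles.yml" <<'EOF'
your_profile_name:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: "{{ env_var('SNOWFLAKE_ACCOUNT') }}"
      user: "{{ env_var('SNOWFLAKE_USER') }}"
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"
      role: "{{ env_var('SNOWFLAKE_ROLE') }}"
      warehouse: "{{ env_var('SNOWFLAKE_WAREHOUSE') }}"
      database: "{{ env_var('SNOWFLAKE_DATABASE') }}"
      schema: "{{ env_var('SNOWFLAKE_SCHEMA') }}"
      threads: 4
EOF
fi
```

Export the referenced environment variables in your shell before running dbt so they are resolved when the profile is loaded.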

Step 6: Install dbt Package Dependencies

This project relies on external dbt packages (The Tuva Project and dbt_utils). Run the following command in your terminal from the project directory (the one containing dbt_project.yml):

dbt deps

This command reads packages.yml and downloads the necessary code into the dbt_packages/ directory within your project.

Step 7: Test the Connection

Before running transformations, verify that dbt can connect to your data warehouse using your profiles.yml settings:

dbt debug

Look for "Connection test: OK connection ok". If you see errors, double-check your profiles.yml settings (account, user, role, warehouse, authentication details, paths).

Running the Project

Once setup is complete, you can run the dbt transformations:

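Full Run (Recommended First Time):

The dbt build command below will:

Run all models (.sql files in models/).
Run all tests (generic tests declared in .yml files and singular .sql tests in tests/).
Load any seeds and run any snapshots the project defines.
Materialize tables/views in your target data warehouse as configured.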

dbt build

This might take some time depending on the data volume and warehouse size.

Run Only Models:

If you only want to execute the transformations without running tests:

dbt run

Run Only Tests:

To execute only the data quality tests:

dbt test

Running Specific Models:

You can run specific parts of the project using dbt's node selection syntax. For example:

Run only the staging models: dbt run -s path:models/staging
Run a specific model and its upstream dependencies: dbt run -s +your_model_name
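
The same selection syntax works with other dbt commands. The sketch below shows a few variations that may be useful; SELECTOR is a hypothetical shell variable used only for illustration, and your_model_name is the same placeholder as above.

```shell
# Hypothetical selector; replace with any path or model name in your project.
SELECTOR="path:models/staging"

# List the nodes the selector matches without running anything.
dbt ls -s "$SELECTOR"

# Build (run and test) only the selected nodes.
dbt build -s "$SELECTOR"

# Run a model plus everything downstream of it (note the trailing +).
dbt run -s your_model_name+
```

Previewing a selection with dbt ls first is a cheap way to confirm that a selector matches what you expect before spending warehouse compute.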
