Medicare CCLF Connector

The Medicare CCLF Connector is a dbt project that transforms raw CMS Comprehensive Claims and Line Feed (CCLF) files into the Tuva Project Input Layer. It is the primary connector for running Tuva analytics on MSSP ACO data and expects raw data organized to the latest CCLF layout documented in the connector repo.

What Is CCLF Data?

CMS provides monthly CCLF files to MSSP ACOs containing claims for their assigned beneficiaries. The connector currently maps these source tables:

Part A — Institutional claims (inpatient, outpatient, SNF, home health, hospice), including header records, revenue center detail, diagnosis codes, and procedure codes
Part B — Professional and DME claims (physician/supplier and durable medical equipment)
Part D — Pharmacy claims
Beneficiary demographics (beneficiary_demographics) — Age, sex, race, dual eligibility status, and other beneficiary attributes
MBI cross-reference (beneficiary_xref) — Historical Medicare Beneficiary Identifiers used to link beneficiaries across file vintages
Enrollment (enrollment) — Supplemental enrollment source used for coverage spans or member months

CCLF files use a naming convention that encodes the file date (for example, P.A1234.ACO.ZC1Y18.D240101.T000000). The connector expects a parsed file_date field and uses it to deduplicate data across regular and run-out files.

Model Layers

The connector uses a three-layer dbt architecture:

Staging (views)

Type-casting only — one model per CCLF source table:

Model	Source
`stg_parta_claims_header`	Part A header
`stg_parta_claims_revenue_center_detail`	Part A revenue center
`stg_parta_diagnosis_code`	Part A diagnoses
`stg_parta_procedure_code`	Part A procedures
`stg_partb_physicians`	Part B physician claims
`stg_partb_dme`	Part B DME claims
`stg_partd_claims`	Part D pharmacy claims
`stg_beneficiary_demographics`	Beneficiary demographics
`stg_beneficiary_xref`	MBI cross-reference
`stg_enrollment`	Enrollment input (from ALR connector or custom)

Intermediate (tables)

The current repo includes a larger intermediate layer that handles normalization, deduplication, ADR (Add/Drop/Revision) logic, and code pivoting. Representative models include:

Model	Purpose
`int_beneficiary_demographics_normalized`	Normalized beneficiary demographics source fields
`int_beneficiary_demographics_deduped`	Deduplicated demographics using latest `file_date`
`int_beneficiary_xref_deduped`	Deduplicated MBI cross-reference
`int_enrollment`	Processed enrollment dates
`int_eligibility_member_months_combined`	Combines enrollment member months before eligibility output
`int_institutional_claim_deduped`	Part A claims after initial dedup
`int_institutional_claim_adr`	Part A claims after Add/Drop/Revision resolution
`int_physician_claim_deduped`	Part B physician claims deduped
`int_physician_claim_adr`	Part B physician claims after ADR
`int_dme_claim_deduped`	Part B DME claims deduped
`int_dme_claim_adr`	Part B DME claims after ADR
`int_pharmacy_claim_deduped`	Part D pharmacy claims deduped
`int_medical_claim`, `int_medical_claim_winning_keys`	Medical claim assembly and final winning-key selection
`int_*_pivot`	Diagnosis and procedure codes pivoted to wide format

Final (tables)

Three Tuva Input Layer tables:

Table	Description
`eligibility`	One row per member per month (or per enrollment span)
`medical_claim`	Standardized medical claims from Parts A and B
`pharmacy_claim`	Standardized pharmacy claims from Part D

ADR (Add/Drop/Revision) Handling

CMS CCLF files use an Add/Drop/Revision system to manage claim corrections across monthly files:

A (Add) — New claim record
D (Drop) — Cancel a previously submitted record
R (Revision) — Replace a previously submitted record

The connector applies ADR logic to produce a final, corrected set of claims by matching Drop and Revision records against their corresponding Add records using claim identifiers. This is critical for accurate cost and utilization analytics.

Enrollment Options

The connector supports two ways to provide member enrollment data:

Option 1: Member months from the ALR Connector (recommended for MSSP)

Run the CMS ALR Connector first. The CCLF connector can then reference that dbt model directly when the cms_alr_connector variable is enabled.

vars:
  cms_alr_connector: true

Option 2: Enrollment spans from a custom source

If you have enrollment spans or member months from another source, you can provide them directly in the enrollment source table. The connector will expand spans into member months automatically, and if you supply member months instead of spans you should also set member_months_enrollment: true.

Configuration

Set these variables in dbt_project.yml or via --vars:

vars:
  # CCLF source data
  input_database: "your_database"
  input_schema: "your_mssp_schema"

  # Optional behaviors
  demo_data_only: false
  cms_alr_connector: true
  member_months_enrollment: false
  # Optional schema prefix for multi-tenant deployments
  tuva_schema_prefix: "optional_prefix"

How to Run

cd medicare_cclf_connector

# Install dbt dependencies
dbt deps

# Build all models and run tests
dbt build

# Run tests only
dbt test

# Run a specific model
dbt run --select medical_claim

Output

The three output tables conform to the Tuva Input Layer schema. Once populated, you can run the full Tuva Project on top of them to generate:

Core Data Model (encounters, conditions, procedures, labs)
Financial PMPM
Chronic Conditions
Quality Measures
Readmissions
CMS-HCCs
And all other Tuva data marts

See the Tuva Input Layer documentation for the full column specifications.

The repo currently depends on the Tuva dbt package via packages.yml, so dbt deps should be run before building.

Supported Databases

Database	Supported
BigQuery	Yes
Redshift	Yes
Snowflake	Yes

Project Structure

medicare_cclf_connector/
├── dbt_project.yml
├── models/
│   ├── staging/           # One view per CCLF source table
│   ├── intermediate/      # Normalization, dedup, ADR, and pivot models
│   └── final/             # eligibility, medical_claim, pharmacy_claim
├── macros/                # Cross-database adapter dispatch macros
├── descriptions/          # Doc blocks used in model/source descriptions
├── integration_tests/     # End-to-end test suite
└── docs/                  # Generated dbt docs artifacts

What Is CCLF Data?​

Model Layers​

Staging (views)​

Intermediate (tables)​

Final (tables)​

ADR (Add/Drop/Revision) Handling​

Enrollment Options​

Option 1: Member months from the ALR Connector (recommended for MSSP)​

Option 2: Enrollment spans from a custom source​

Configuration​

How to Run​

Output​

Supported Databases​

Project Structure​