Claims Mapping
Claims mapping is the process of transforming raw claims data sources into Tuva by converting the data format to match the Tuva Input Layer. The Input Layer acts as an API for Tuva. Once a healthcare data source has been mapped to the Input Layer you can run all of Tuva on that data source with a single command: dbt build
.
Every claims dataset is different. Every health plan has their own data model they store their adjudicated claims data in.
Mapping to the Input Layer involves creating dbt models (i.e. SQL statements in a dbt project). While it is necessary to create a model for every table in the Input Layer, it is not necessary that you have source data to populate every table or column in the Input Layer. For example, if you don't have pharmacy claims you still need to create the pharmacy_claim
table in the Input Layer, but you can mapp null to every column.
This spreadsheet shows what columns in the Input Layer are needed for the various Tuva Data Marts. Not mapping to a column needed for a data mart will result in no data being produced in that mart.
The notes that follow describe advice and heuristics for mapping claims data sources to the Input Layer. Consult the Input Layer data dictionary (link above) for a complete list of fields in the Input Layer.
Medical Claim
The medical_claim
table contains the billing information submitted to health insurers for medical services, supplies, and/or procedures rendered to a member of the health plan. Adjudicated claims from payers, health plans, self-insured employers, brokers, and third party administrators are the most common source of data.
Admit Source and Type
admit_source_code
is used in institutional claims to indicate where the patient was located prior to admission. The field does not exist in professional claims. The field exists at the header-level, meaning there should be only 1 distinct value for this field per claim.
admit_type_code
is used in institutional claims to indicate the priority of admission, e.g., urgent, emergent, elective, etc. The field does not exist in professional claims. The field exists at the header-level, meaning there should be only 1 distinct value for this field per claim.
Admit source, along with admit type, is generally not considered very reliable because the accuracy of the code is not verified during the claims adjudication process (other than verifying that the code is in fact a valid code).
Despite this, it's possible to use admit source to help identify things like:
- transfers from another hospital
- inpatient stays that came through the emergency department
Admit type is commonly used to identify things like elective procedures.
Admit source and type codes are maintained by the National Uniform Billing Committee (NUBC).
Bill Type
'bill_type_code' is by far the most complex of the administrative codes in medical claims. Each digit has a distinct purpose and meaning:
- 1st digit: This is always "0" and often omitted.
- 2nd digit: Indicates the type of facility, e.g., skilled nursing facility
- 3rd digit: Indicates the type of care, e.g., inpatient part A
- 4th digit: Indicates the sequence of the bill (also referred to as the frequency code)
The thing that makes this code complex is that the possible values of the 3rd and 4th digits depend on the value of the 2nd digit. As a result, some claims datasets will separate out the digits of bill type code into distinct fields. However, we find it preferable to work with bill type code as a single field.
Despite the complexity of this field, it's extremely useful. 'bill_type_code' is used extensively in the creation of service categories, including the identification of acute inpatient, outpatient, skilled nursing, and emergency department services, among many others. The field is generally considered reliable because the accuracy and suitability of the code is verified during the claims adjudication process, i.e., a claim may be denied if the code doesn't make sense.
'bill_type_code' values are maintained by the National Uniform Billing Committee (NUBC).
Claim ID
claim_id
is intended to be a unique identifier for a set of services and supplies rendered by a healthcare provider that has been billed to insurance. It is the most fundamental data element in the medical_claim
table and every row in the table must have a claim_id
populated. If certain records do not have a claim ID populated these records should not be mapped to the Input Layer.
Claim Line Number
A claim is often made up of multiple records. claim_line_number
is intended to be a unique identifier within a claim that distinguishes these records (i.e. each distinct service, supply, or procedure rendered on the claim).
Every record on a claim should have a unique claim_line_number
and it should be a positive integer. Generally speaking claim_line_number
should be sequentially increasing starting with the number 1. For example, a claim with 10 records with have claim_line_number
populated starting a 1 and going up to 10 on the tenth record. However we often encounter claims with strange line numbers e.g.:
- Not starting with 1
- Not sequential
- Repeating numbers
This weirdness isn't ideal and can indicate other problems in the dataset (e.g. duplicate records). However it's not a problem in isolation.
claim_line_number
can be created manually if it’s unavailable in the source data or if it’s not sequential positive integers. For example:
row_number() over (partition by claim_id order by claim_end_date) as claim_line_number
Claim Type
claim_type
is the categorization of a claim based on the specific claim form used in billing i.e. institutional or professional. It's an important field used in Claims Preprocessing to assign service categories and group claims into encounters.
Each claim_id
must have a unique claim_type
which should be one of these values:
- institutional: For claims rendered using a UB-04 claim form
- professional: For claims rendered on a CMS-1500 claim form
Often claim_type
is available in the source medical claim data. If claim_type
is not present, other fields on the claim can be used to determine whether the claim is institutional
or professional
.
In theory, the following fields should only be found on institutional claims:
bill_type_code
revenue_center_code
ms_drg_code
orapr_drg_code
admission_date
anddischarge_date
discharge_disposition_code
andadmit_type_code
andadmit_source_code
The following field should only be found on professional claims:
place_of_service_code
Often a claims data source will have all of these fields populated simultaneously. Technically this should not be possible, since the data elements are separated by claim form. However, payers will merge claims from the two claims forms into a single table and then populate missing data elements based on logic they develop internally. This can make it difficult to separate professional from institutional. In these circumstances we start by looking for institutional data elements (listed above). If bill_type_code
,revenue_center_code
, or ms_drg_code
or apr_drg_code
are present, we assign the claim an institutional
claim type.
If none of these fields are present then we assign the claim a professional
claim type as long as a place_of_service_code
is present.
If it’s not possible to determine the correct claim_type
, we assign a claim_type
of undetermined
.
Whatever claim_type
is assigned to a claim, it must be the same for all records (i.e. lines) within that claim.
Dates
All dates should be formatted as YYYY-MM-DD
.
To understand the key date fields in medical claims, it's useful to consider an example of a patient who's been receiving care in a long-term care (i.e. skilled nursing) facility for 1 year, from January 1st to December 31st, and suppose the facility bills the insurer every month on the beginning of the month.
claim_start_date
: The start date of the billable period for the claim. In the example above this date would always be the first date of the month.claim_end_date
: The end date of the billable period for the claim. In the example above this date would always be the last date of the month.admission_date
: The date the patient was first admitted to the facility. In the example above this date would be January 1st. This field only exists on institutional claims, not professional.discharge_date
: The date the patient was discharged from the facility. In the example above this date would be December 31st. This field only exists on institutional claims, not professional.paid_date
: The date the claim was paid by the insurance company. This date could be any date after the claim_end_date. Often this date is within a couple weeks of claim_end_date.
There are 2 other date fields in medical claims. They are claim_line_start_date
and claim_line_end_date
. These date fields are less important - in fact we don't currently use them in any analytics in Tuva.
Discharge Disposition
'discharge_disposition_code' indicates where the patient was discharged following a stay at a facility. The field only exists on institutional claims. The field is sometimes called discharge status or patient status. The field exists at the header-level, meaning there should be only 1 distinct value for this field per claim.
The code is commonly used to identify things like:
- Patients that died during an institutional stay
- Patients who were transferred
- Patients who were discharged to home or home w/ home health services
- Patients who left against medical advice (LAMA)
Discharge disposition codes are maintained by the National Uniform Billing Committee (NUBC).
DRG
ms_drg_code
is a classification system used by Medicare to categorize inpatient hospital stays and group them based on a patient’s diagnosis, procedures performed, age, sex, and complications or comorbidities. It is necessary for Medicare reimbursement but often used by hospitals as a standard for all inpatient stays.
apr_drg_code
stands for "all patient refined DRG". It was developed by 3M to extend DRGs to a more general patient population.
Financial Data
paid_amount
is the dollar amount that the health insurer paid for the covered service.
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
allowed_amount
is the maximum dollar amount a health insurer will reimburse for a covered service.
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
coinsurance_amount
is the dollar amount a member has paid for a covered service as part of cost-sharing with the health insurance provider. After a deductible is met, covered services may still require a member to pay for a percentage of the cost (e.g. 80/20 - 80% paid by the health insurer and 20% paid by the member)
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
deductible_amount
is the dollar amount a member has paid for a covered service before the health insurer will pay the cost for covered services.
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
total_cost_amount
is the total amount for a member’s cost of care. Based on the source data set, it may equal the sum of the other payment fields or it may include Medicare’s claim pass-through per diem amount.
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
HCPCS
HCPCS codes indicate the services and supplies rendered by providers to patients. These codes are used in both institutional and professional claims forms. These codes exist at the line-level, meaning there can be many HCPCS codes on a single claim. There are codes for many different types of supplies and services including:
- physician visits
- lab tests
- imaging reads
- durable medical equipment
- remote patient monitoring devices
And many many other types of things. There are thousands of HCPCS codes spread across two levels. Level 1 codes, also called CPT codes, are maintained by the American Medical Association (AMA). Level 2 codes are maintained by CMS.
Professional contracted rates between payers and providers are established using HCPCS codes. These rates are referred to as a fee schedule. Conversely, institutional rates are often paid on a per encounter (e.g. DRG) or per diem basis.
HCPCS Modifiers include the following fields: hcpcs_modifier_1
, … , hcpcs_modifier_5
A hcpcs_modifier
can be an additional code to hcpcs_code
that provides more information about the circumstances surrounding the service. They represent specific details about the service. Some examples are:
- LT - service was performed on left side of body
- 76 - service was repeated by the same physician on the same day
- 77 - service was repeated by another physician on same day
Patient ID
patient_id
is a unique identifier that is designed to unify a patient’s records and provide a consistent reference for the specific individual. It allows for the linking and tracking of a patient’s healthcare journey across different source data sets.
member_id
is an identifier specific to the health insurer or health plan. It is assigned by the insurance company to uniquely identify a specific individual only within their system.
patient_id
andmember_id
are populated for every row in the input layermedical_claim
table.patient_id
andmember_id
have the same value for all lines within the sameclaim_id
.patient_id
is unique across all data sources
Payer and Plan
payer
contains the name of the health insurance payer of the claim (Aetna, Blue Cross Blue Shield, etc). This field should can be populated manually if not available already as a field in the source data.
plan
contains the specific health insurance plan or sub-contract the member is enrolled in (e.g. Aetna Gold, Aetna Bronze 4, BCBS Chicago, etc).
plan
may not be available in the source data and should be hardcoded (e.g. select 'aetna bronze 1' as plan
) and it can be the same as the payer if no plan is needed for analytics.
Place of Service
Place of service codes indicate the type of care setting professional claim services were delivered in. This field only exists on professional claims. Place of service is coded at the line-level to reflect the fact that services during a particular encounter can occur in different locations. Because of this, a single professional claim can have multiple place of service codes.
Place of service codes are used to assign claims to services categories. For example, place of service code 11 indicates an office visit.
CMS maintains place of service codes.
place_of_service_code
is contains a 2 digit code that specifies a specific location where the medical service was provided.
place_of_service_code
is only found on professional
claims and is a line-level field that should be populated on all claim lines.
Revenue Center
Revenue center codes are used to account for the services and supplies rendered to patients in institutional care settings. These codes are only used in institutional claims. Typically these codes will correspond to a facility's chargemaster, which is a listing of all charges used by the institution in billing. Although a hospital will use these codes to "charge" the health insurer, they have no bearing on the contracted payment amount, i.e., the amount paid to the provider by the payer. The payment amount is entirely determined by MS-DRG for inpatient claims and often a per diem rate for skilled nursing.
Many different categories of revenue center codes exist including for example:
- Room and Board
- Emergency
- IV Therapy
For a given institutional claim there may be dozens of revenue center codes used. These codes are submitted at the line-level of the claim, so there is no limit to the number of revenue center codes that may be used on a given claim.
Revenue center codes play an important role in identifying different types of insitutional claims, including acute inpatient, emergency department, and others.
Revenue center codes are maintained by the National Uniform Billing Committee (NUBC).
revenue_center_code
is a 4 digit code that used to classify and identify different departments or units within a healthcare facility.
revenue_center_code
is only found on institutional
claims and is a line-level field. It is generally required for billing so payers understand the nature of a service and can determine proper reimbursement but it can be omitted from source data sets.
Diagnosis Codes
diagnosis_code
represents the patient’s medical conditions and/or diagnosis and communicates information about the their health and why healthcare services were provided.
diagnosis_code_1
contains the primary condition for which the patient is seeking care. It is the primary medical issue that is the focus of the visit.
diagnosis_code
is a header-level field for all claim types which means they should be unique per claim.
The number of diagnosis_code
fields available in the data will vary by source and data provider. There can be up to 25 codes
but it is not unexpected to see only 1-5 codes. When mapping diagnosis_code_1
any field in the source data labeled ‘primary’ or as ‘diagnosis_1’ is acceptable.
- data type is
string
diagnosis_code_type
contains the coding system for the diagnosis codes contained on the claims.
On October 1, 2015 the U.S. healthcare industry switched from using ICD-9 to ICD-10. If the source data does not contain information about the diagnosis_code_type
then the switch over date can be used to classify the diagnosis codes. It should be compared against the claim_end_date
for example:
case
when claim_end_date < 2015-10-01
then 'icd-9-cm'
else 'icd-10-cm'
- data type is
string
diagnosis_code_type
is popualated when anydiagnosis_code
is populateddiagnosis_code_type
is a value from Tuva’s code type terminology file
Present on Admission
diagnosis_poa
refers to the patient’s condition at the time they were admitted to the hospital. It indicates whether the condition was already present and active or if it developed during their hospitalization.
diagnosis_poa
is a header-level field for all claim types which means they should be unique per claim.
The number of diagnosis_poa
fields available in the data will vary by source and data provider. There can be up to 25
codes to describe each diagnosis_code
but it is not unexpected to see only 1-5 codes or none at all.
- data type is
string
Procedure Codes
procedure_code
represents inpatient surgical, diagnosis, or therapeutic procedures rendered during a patient’s visit.
procedure_code
is a header-level field for all claim types which means they should be unique per claim.
The number of procedure_code
fields available in the data will vary by source and data provider. There can be up to 25 codes but it is not unexpected to see only 1-5 codes.
- data type is
string
procedure_code_type
contains the coding system for the procedure codes contained on the claims.
On October 1, 2015 the U.S. healthcare industry switched from using ICD-9 to ICD-10. If the source data does not contain information about the procedure_code_type
then the switch over date can be used to classify the diagnosis codes. It should be compared against the claim_end_date
for example:
case
when claim_end_date < 2015-10-01
then 'icd-9-pcs'
else 'icd-10-pcs'
- data type is
string
procedure_code_type
is populated when anyprocedure_code
is populateddiagnosis_code_type
is a value from Tuva’s code type terminology file
Procedure Dates
This includes the following fields: procedure_date_1
, …, procedure_date_25
procedure_date
represents the date the corresponding procedure occurred (e.g. procedure_date_1
is the date for procedure_code_1
).
procedure_date
is a header-level field for all claim types which means they should be unique per claim.
The number of procedure_date
fields available in the data will vary by source and data provider. There can be up to 25 codes but it is not unexpected to see only 1-5 codes.
- data type is
date
in the formatYYYY-MM-DD
Provider NPI
rendering_npi
, facility_npi
, and billing_npi
is populated with the Nation Provider Identifier (NPI) for a provider.
rendering_npi
represent the practitioner who performed or rendered the specific service. This value can be populated on either institutional or professional claims.
facility_npi
represents the healthcare facility or institutional where a specific service was rendered. This value should only be populated on institutional claims.
billing_npi
represent the individual or organization responsible for billing and receving payment for healthcare services.
If only one NPI field is provided in the source data, then reference the NPI in Tuva’s provider terminology file to
determine if it is a person or a place. If it is a person, then the NPI should be mapped to rendering_npi
in the input
layer. If it is a person and it’s an professional claim then also map to billing_npi
. If it is a location and the claim type is institutional, then map tofacility_npi
.
Eligibility
The eligibility table contains enrollment and demographic data of health plan members. Eligibility data typically exists in one of two formats:
- Enrollment Spans
- Member Months
The Tuva Input Layer Eligibility table uses the Enrollment Span format. This is the most common format used in claims data. If your eligibility data uses this format it will be relatively easy to complete the mapping. If on the other hand your data uses the member months format, you'll need to convert it.
The primary keys for this table are:
patient_id
member_id
payer
plan
enrollment_start_date
enrollment_end_date
data_source
Gender
gender
represents the biological sex of a member. It is mapped to either male, female, or unknown.
Race
race
represents the physical race of a member.
Birth Date
This field represents the birth date of a member. data type is date
in the format YYYY-MM-DD
Death Date
death_date
contains the day a member died.
- data type is
date
in the formatYYYY-MM-DD
death_flag
death_flag
contains a flag indicating if a member died; 1 for yes 0 for no.
death_flag
should be 1 if a death_date
is populated. death_flag
can be 1 and death_date
NULL if only an indicator is available in the source data.
- data type is int
death_flag
is populated with a 1 or 0
enrollment_start_date and enrollment_end_date
The grain of this table will affect how these fields are populated:
- One row per member month -
enrollment_start_date
will be the beginning of the month andenrollment_end_date
will be the last day of the month.- e.g.
enrollment_start_date
= 2023-01-01enrollment_end_date
= 2023-01-31
- e.g.
- One row per enrollment span -
enrollment_start_date
will be the first day of enrollment andenrollment_end_date will
be the last day of enrollment.- e.g.
enrollment_start_date
= 2023-01-01enrollment_end_date
= 2023-12-31
- e.g.
In the source data, enrollment end date may be NULL
to indicate that the member is actively enrolled. After confirming
this with the data provider, enrollment_end_date
should be populated with the last day of the current year.
- data type is
date
in the formatYYYY-MM-DD
enrollment_start_date
andenrollment_end_date
are populated in every row
payer
payer
contains the name of the health insurance payer of the claim (Aetna, Blue Cross Blue Shield, etc)
payer
may not be available in the source data and should be hardcoded (e.g. select 'aetna' as payer
)
payer
is populated for every row- data type is
string
payer_type
payer_type
contains the type of insurance provided by the payer.
- data type is
string
payer_type
is populated for every row- value is mapped to one of the values found to Tuva’s payer_type terminology file.
plan
plan
contains the specific health insurance plan or sub-contract the member is enrolled in (e.g. Aetna Gold, Aetna Bronze 4, BCBS Chicago, etc).
plan
may not be available in the source data and should be hardcoded (e.g. select 'aetna bronze 1' as plan
and it can be the same as the payer if no plan is needed for analytics.
- data type is
string
plan
is populated for every row
original_reason_entitlement_code
original_reason_entitlement_code
contains a member’s original reason for Medicare entitlement.
original_reason_entitlement_code
is helpful for the CMS HCC mart to provide a more accurate risk score.- If it's unavailable,
medicare_status_code
is used. If neither are available, the mart will use a default value of “Aged”. - data type is
string
- value is mapped to one of the values found to Tuva’s OREC terminology file.
dual_status_code
dual_status_code
indicates whether a member is enrolled in both Medicare and Medicaid.
dual_status_code
is helpful for the CMS HCC mart to provide a more accurate risk score.- If unavailable, the mart will use a default value of “Non” (i.e., non-dual).
- data type is
string
- value is mapped to one of the values found to Tuva’s dual status terminology file.
medicare_status_code
medicare_status_code
indicates how a member currently qualifies for Medicare.
medicare_status_code
is helpful for the CMS HCC mart to provide a more accurate risk score.- It’s used when
original_reason_entitlement_code
is missing. - data type is
string
- value is mapped to one of the values found to Tuva’s medicare status terminology file.
data_source
data_source
is populated with the name of the entity providing the data. It may come from the health insurer directly (e.g. Aetna, BCBS) or a third party (e.g. HealthVerity, Datavant).
- data type is
string
data_source
is populated for every row
Pharmacy Claim
The pharmacy claim table contains the billing information submitted to the health insurer for medications dispensed to a member of the health plan. The primary keys for this table are:
claim_id
claim_line_number
data_source
claim_id
claim_id
is a unique identifier for a set of services and supplies rendered by a healthcare provider that have been billed to insurance. It is the most fundamental data element in the pharmacy_claim
table, and every row in the pharmacy_claim
table should have a claim_id
. If the source data does not have claim IDs or is missing claim IDs for some rows in the data, then those rows should not be mapped to Tuva’s input layer.
- data type is
string
claim_id
is populated for every rowclaim_id
is unique across all data_sourcesclaim_id
is unique across all lines within a claim
claim_line_number
claim_line_number
is a unique identifier within a claim that distinguishes each distinct service, supply, or procedure rendered.
Every row should have a claim_line_number
; it must be a positive sequential integer. claim_line_number
can be created manually if it’s unavailable in the source data or if it’s not sequential positive integers. For example:
row_number() over (partition by claim_id order by claim_end_date) as claim_line_number
The max(claim_line_number
) for a given claim_id
must be equal to the number of claim lines for that claim_id
.
When mapping to the input layer the following expectations must be met or else The Tuva Project will not run and produce errors. Any row of data that does not meet the requirements must be omitted from the input layer.
Expectations in the input layer:
- data type is
integer
claim_line_number
is populated for every rowclaim_line_number
is a positiveclaim_line_number
is sequential (1,2,3,…)- The maximum value of
claim_line_number
for is equal to the total number of lines in a claim
patient_id and member_id
patient_id
is a unique identifier that is designed to unify a patient’s records and provide a consistent reference for the specific individual.
It allows for the linking and tracking of a patient’s healthcare journey across different source data sets.
member_id
is an identifier specific to the health insurer or health plan. It is assigned by the insurance company to uniquely identify a specific individual only within their system.
patient_id
andmember_id
are populated for every row in the input layermedical_claim
table.patient_id
andmember_id
have the same value for all lines within the sameclaim_id
.patient_id
is unique across all data sources
payer
payer
contains the name of the health insurance payer of the claim (Aetna, Blue Cross Blue Shield, etc)
payer
is populated for every row- data type is
string
plan
plan
contains the specific health insurance plan or sub-contract the member is enrolled in (e.g. Aetna Gold, Aetna Bronze 4, BCBS Chicago, etc).
If no plan information is available, the payer should be populated in this field.
- data type is
string
plan
is populated for every row
prescribing_provider_npi
precribing_provider_npi
is populated with the national provider identifier (NPI) of the provider who prescribed the medication.
The National Plan & Provider Enumeration System (NPPES) is used to create Tuva’s provider terminology file. (This file is blank in GitHub due to its size. The data is stored in a public that is referenced in the dbt_project.yml.)
- data type is
string
prescribing_provider_npi
is a value from Tuva’s provider terminology file
dispensing_provider_npi
dispensing_provider_npi
is populated with the national provider identifier (NPI) of the provider who dispensed the medication. This NPI may represent the pharmacist or the pharmacy.
The National Plan & Provider Enumeration System (NPPES) to used to create Tuva’s provider terminology file. (This file is blank in GitHub due to its size. The data is stored in a public S3 bucket.)
- data type is
string
dispensing_provider_npi
is a value from Tuva’s provider terminology file
dispensing_date
dispensing_date
is the date that the medication was given (i.e. filled).
- data type is
date
in the formatYYYY-MM-DD
ndc_code
ndc_code
is the National Drug Code assigned to prescription and over-the-counter drugs. NDC can be a 10 or 11 digits, which are broken out into 3 segments:
Labeler (1-5) - The manufacturer or labeler of the drug
Product (6-9) - The specific drug and it’s strength
Package (10-11) - The package size and type
data type is
string
paid_date
paid_date
is the date that the health insurer processed the claim for payment. It should coincide with the date that the pharmacy received reimbursement from the health insurer.
- data type is
date
in the formatYYYY-MM-DD
paid_amount
paid_amount
is the dollar amount that the health insurer paid for the covered medication.
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
allowed_amount
allowed_amount
is the maximum dollar amount a health insurer will reimburse for a covered medication.
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
coinsurance_amount
coinsurance_amount
is the dollar amount a member has paid for a covered medication as part of cost-sharing with the health insurance provider. After a deductible is met, covered services may still require a member to pay for a percentage of the cost (e.g. 80/20 - 80% paid by the health insurer and 20% paid by the member)
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
deductible_amount
deductible_amount
is the dollar amount a member has paid for a covered medication before the health insurer will pay the cost for covered services.
- data type is
numeric
with two decimal points (e.g.numeric(38,2)
)
data_source
data_source
is populated with the name of the entity providing the data. It may come from the health insurer directly (e.g. Aetna, BCBS) or a third party (e.g. HealthVerity, Datavant).
- data type is
string
data_source
is populated for every row