While this description seems complicated, it is greatly simplified. For simplicity, it excludes all discussion of episodes of care, which are like a program on top of other programs.
Terminology
Member | The person covered by the insurance (i.e. the patient) |
Partner | The insurance company. |
Provider | The entity that contracts with the value based program. In some programs, this may be a doctor’s office, while in others it may be a larger health system (e.g. the Mayo Clinic). |
Basic Considerations
Basic considerations of value based healthcare are:
- Individual members do not sign up for value based programs – it’s really just a contract between the providers and the insurance company. Hence, you don’t know if a member participates in a value based health program until you first determine who their doctor is (a process called attribution).
- Each value based program is designed for a specific type of provider.
- Most commonly, they are based on the primary care provider. If the provider is an independent office, there’s usually one type of program that works best for them, and if the provider is part of a larger health system, a different type of program might be best for them.
- Other programs target behavioral health providers.
- Others target specialists, etc. I won’t get too descriptive about episodes of care/bundled payments as they become more complicated.
- An individual member may have multiple providers and be part of multiple value based programs. For example, I may have a primary care doctor, a behavioral health doctor/therapist, and also see a specialist for my glaucoma.
- While most parts of a value based program are not meant to be negotiable, the reality is that a very influential provider system could theoretically override any part of a program’s logic. In practice, this is more likely to happen for certain types of programs, but any system that we create must allow any contract to modify portions of the program.
Pre-stage: Receiving files
The partner can either send us all of the files at once, or they can send them piecemeal as they are available. Some of the files are meant to overwrite any previous data that we had, while others are meant to be added to existing data (e.g. claims data normally falls into this category).
The data that they send us normally falls into three categories:
- Claims data, which always has the same basic information (but each partner sends it in their own format).
- Other data that is mostly common across partners, but it is less structured than claims data. This includes provider rosters, membership data, and quality measure data.
- Other data that’s specific to a program.
It would be nice if everybody sent us data according to a spec that we provide, but:
- This is what they pay us to do.
- Even if they followed a specification, it’s our job to ensure that they follow it correctly. Sometimes it’s more reliable if we do the work.
- There is often some amount of program-specific data that we cannot foresee and include in our spec.
In this pre-stage, we do the following:
- Maintain file configuration data for every file type that they send us.
- Perform data quality checks on these files. As these files are in the raw format, we cannot easily share these data quality checks across partners. Hence, this is labor intensive.
- Maintain a dashboard allowing them to verify which files we have received and to view errors. For example, it’s relatively common for them to provide a file that doesn’t match the normal naming convention, or to add columns to a file that we don’t understand.
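As a rough sketch of what per-file configuration and the two checks mentioned above (naming convention and unknown columns) might look like. This is not the actual pipeline code; `FileConfig`, `check_file`, the field names, and the example file name pattern are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class FileConfig:
    file_type: str
    name_pattern: str       # regex the full file name must match
    expected_columns: list  # columns we know how to interpret
    load_mode: str          # "overwrite" replaces prior data, "append" adds to it


def check_file(filename, header, configs):
    """Return (matching config or None, list of error messages)."""
    for cfg in configs:
        if re.fullmatch(cfg.name_pattern, filename):
            errors = []
            unknown = [c for c in header if c not in cfg.expected_columns]
            missing = [c for c in cfg.expected_columns if c not in header]
            if unknown:
                errors.append(f"unexpected columns: {unknown}")
            if missing:
                errors.append(f"missing columns: {missing}")
            return cfg, errors
    return None, [f"{filename} does not match any known naming convention"]


# Hypothetical configuration for one partner's claims file.
claims_cfg = FileConfig(
    file_type="claims",
    name_pattern=r"claims_\d{8}\.csv",
    expected_columns=["claim_id", "member_id", "paid_amount"],
    load_mode="append",
)
```

The error lists returned here are the kind of thing the dashboard would surface to the partner.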
This is mostly a fixed cost for each partner and the cost is amortized across programs as new programs are added (although each program often has a smaller incremental cost to check any new program-specific data).
Step 1: Data ingestion
We store our data in a common data model (CDM), which is a little bit of a misnomer. Some data has a common schema and meaning (e.g. claims data). For program-specific data (and for most other data generated by the pipeline), we define some generic structure, but the meaning of the data is often program/contract specific.
In this stage we do the following:
- Convert the data from the partner-specific raw format into our CDM.
- Run some additional data quality checks to ensure that all of the primary keys match.
The conversion to CDM is a fixed cost for each partner and the cost is amortized across the programs as new programs are added (although each program often has a smaller incremental cost to ingest any new program-specific data).
Many of the data quality checks can be shared across partners (as the claims data, provider roster, membership data, and measure data all share a common schema and meaning).
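Because these checks run against the common schema, a key-matching check can be written once and reused for every partner. A minimal sketch, assuming rows are plain dictionaries and using a hypothetical helper name `orphaned_keys`:

```python
def orphaned_keys(child_rows, parent_rows, key):
    """Return the sorted key values found in child_rows but absent from parent_rows."""
    parent_keys = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows if row[key] not in parent_keys})


# Example: claims that reference a member who is not in the membership data.
membership = [{"member_id": "M1"}, {"member_id": "M2"}]
claims = [
    {"claim_id": "C1", "member_id": "M1"},
    {"claim_id": "C2", "member_id": "M3"},
]
missing_members = orphaned_keys(claims, membership, "member_id")
```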
Many people are looking at AI to handle data ingestion. This may be viable, but if the source data is ingested incorrectly, all subsequent data is wrong. To verify that AI ingested the data correctly is probably as expensive as ingesting the data without using AI.
Step 2: Grouping
We start out with a bunch of data. Program logic will often treat certain types of claims differently than others, so we first need to associate each claim/claim line with the appropriate groups/tags.
There can be multiple groupers.
- Prometheus (not the cloud based one used for tracking events) is a third party grouper that is commonly used.
- Partners may define grouping rules indicating how to classify claims into various categories, including behavioral health, inpatient/outpatient, pharmaceutical, associated with an ED visit, etc. Often there is a group hierarchy.
I haven’t dived as deeply into every program’s grouping rules, but the claims grouper rules for the few programs I investigated were a list of relatively straightforward rules:
If a primary diagnosis, place of service, pharmacy HIC code, etc. matches a value and if the claim is of a certain type, either group the claim line or group every non-grouped line remaining in the entire claim.
If this is true across most programs, creating configuration rules for these is pretty straightforward. We also need configuration defining:
- Categories and subcategories.
- Which groups are mutually exclusive.
- Whether the rules are executed depth first or breadth first.
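To make the shape of such a rule concrete, here is a minimal sketch of a first-match-wins grouper. It is an illustration under simplifying assumptions, not any program's real logic: `GroupRule` and `group_claim` are hypothetical names, the codes in the example are placeholders, and the group hierarchy, mutual exclusivity, and depth-first vs. breadth-first ordering are not modeled.

```python
from dataclasses import dataclass


@dataclass
class GroupRule:
    group: str                 # group/tag to assign
    field_name: str            # claim-line attribute to test
    values: frozenset          # values that trigger the rule
    claim_type: str            # rule only applies to claims of this type
    whole_claim: bool = False  # if True, tag every still-ungrouped line in the claim


def group_claim(claim_type, lines, rules):
    """Return one group name (or None) per claim line; the first matching rule wins."""
    groups = [None] * len(lines)
    for rule in rules:
        if rule.claim_type != claim_type:
            continue
        for i, line in enumerate(lines):
            if groups[i] is None and line.get(rule.field_name) in rule.values:
                if rule.whole_claim:
                    # Tag all remaining ungrouped lines, then move to the next rule.
                    groups = [g if g is not None else rule.group for g in groups]
                    break
                groups[i] = rule.group
    return groups


# Placeholder rules: a diagnosis-based rule, then a place-of-service rule
# that claims every line not already grouped.
rules = [
    GroupRule("behavioral_health", "primary_dx", frozenset({"F32.9"}), "professional"),
    GroupRule("emergency", "place_of_service", frozenset({"23"}), "professional", whole_claim=True),
]
lines = [
    {"primary_dx": "F32.9", "place_of_service": "11"},
    {"primary_dx": "J02.9", "place_of_service": "23"},
    {"primary_dx": "J02.9", "place_of_service": "11"},
]
result = group_claim("professional", lines, rules)
```

In this example the first line is tagged by the diagnosis rule, and the second rule then tags both remaining lines because one of them matched with `whole_claim` set.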
Step 3: Program eligibility
This stage indicates whether a member meets the minimum requirements to be eligible for a specific program and must be run for all member/program combinations within a partner to determine if the member is eligible for each program.
Eligibility requirements vary by program, but they are commonly simple rules such as:
- They need to have a specific type of insurance policy.
- They needed to have been insured continuously by the partner for a minimum amount of time.
- They need to live in a specific region.
I’m not an expert on eligibility requirements, but defining basic configuration that handles 80% of all programs is probably relatively straightforward.
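A minimal sketch of what that basic configuration could look like, covering only the three example rules above. The names (`EligibilityConfig`, `Member`, `is_eligible`) and fields are hypothetical, and continuous coverage is simplified to a single start date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class EligibilityConfig:
    policy_types: set         # insurance policy types the program accepts
    min_continuous_days: int  # minimum continuous coverage with the partner
    regions: set              # regions the program covers


@dataclass
class Member:
    policy_type: str
    coverage_start: date      # start of the current continuous coverage span
    region: str


def is_eligible(member, cfg, as_of):
    """Apply the three simple rules; every one must pass."""
    continuous_days = (as_of - member.coverage_start).days
    return (
        member.policy_type in cfg.policy_types
        and continuous_days >= cfg.min_continuous_days
        and member.region in cfg.regions
    )


cfg = EligibilityConfig(policy_types={"commercial"}, min_continuous_days=365, regions={"NE"})
member = Member(policy_type="commercial", coverage_start=date(2023, 1, 1), region="NE")
```

This would be evaluated once per member/program combination, with one `EligibilityConfig` per program.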
Step 4: Attribution
We normally run this stage for every member and program combination for which the member passed the eligibility stage.
Attribution answers the question of “who is your doctor” as defined by the program rules. Each program may look at specific types of claims to answer this question. For example, a behavioral health program will usually only look at behavioral health claims, whereas a primary care program will look at claims generally associated with a primary care visit (such as a physical or a sore throat).
In some cases, the partner will tell us who the member is attributed to for specific programs. In other cases (where we compute attribution), the partner will override our choice to tell us that a member is not attributed to a specific doctor.
When attribution is determined by the pipeline, 80% of the attribution programs rely on similar information, which is usually some combination of:
- The doctor that they have visited the most looking back over a specific window of time.
- The doctor that they have visited the most recently within a specific period of time.
To determine the above rules requires configuration defining:
- The filters used to define which claims constitute a visit. This may be based on codes and other claim properties, or on grouping that has been done previously.
- The look back window to apply.
The attribution logic must also take into account:
- Excluding doctors that they tell us are not attributed.
- Special tie-breaking rules (e.g. if they saw two doctors the same number of times within the lookback period), which often take into account the most recent visits or the longest relationship with the doctor.
These common rules probably handle 80% of the programs, but some programs have very complex rules that leverage lots of side data. So 80% of program attribution config is pretty reasonable, whereas the remaining 20% is much more difficult.
If a member has not visited a doctor for a certain amount of time, it is common for the member to not be attributed.
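The common case can be sketched in a few lines. This is only an illustration: it assumes the visit filters have already been applied, picks the doctor with the most visits in the lookback window, breaks ties by the most recent visit, and leaves the member unattributed if there are no qualifying visits. The function name and parameters are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def attribute(visits, as_of, lookback_days, excluded=frozenset()):
    """visits: iterable of (doctor_id, visit_date) that already passed the visit filters.

    Returns the doctor with the most visits in the window (ties go to the
    doctor seen most recently), or None if there are no qualifying visits.
    """
    window_start = as_of - timedelta(days=lookback_days)
    counts = defaultdict(int)
    latest = {}
    for doctor, visit_date in visits:
        if doctor in excluded or not (window_start <= visit_date <= as_of):
            continue
        counts[doctor] += 1
        if doctor not in latest or visit_date > latest[doctor]:
            latest[doctor] = visit_date
    if not counts:
        return None
    return max(counts, key=lambda d: (counts[d], latest[d]))


visits = [
    ("X", date(2024, 1, 10)),
    ("X", date(2024, 3, 1)),
    ("Y", date(2024, 2, 1)),
    ("Y", date(2024, 5, 1)),
    ("Z", date(2022, 1, 1)),  # outside the lookback window
]
```

With a 365-day lookback ending 2024-06-30, X and Y tie at two visits each and Y wins on recency. Passing `excluded={"Y"}` models the partner telling us the member is not attributed to Y.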
It is possible for different programs to use different rules and to produce different results.
Example 1:
Program 1: Primary care practice
Looking for: Primary care provider
Attribution rules say that the member is attributed to Dr. X.
Program 2: Behavioral health
Looking for: Behavioral health doctor
Attribution rules say that the member is attributed to Dr. Y.
The member is attributed to two different programs, but that’s OK because they are not mutually exclusive: one is based on the primary care provider and the other is based on the behavioral health doctor.
Example 2:
Program 1: Primary care practice
Looking for: Primary care provider
Attribution rules say that the member is attributed to Dr. X.
Program 2: ACO health system
Looking for: Primary care provider
Attribution rules say that the member is attributed to Dr. Y.
The member is attributed to two different programs and this is not OK since the two programs are mutually exclusive.
To address these situations, we need configuration indicating:
- Which programs are mutually exclusive and which are not.
- A set of tie breaking rules favoring one program over another. These rules are defined by the provider, and not by the program.
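One way to picture this configuration is a list of mutually exclusive program groups plus a priority order. The sketch below assumes the tie-breaking rules can be reduced to a simple ranked list, which real contracts may not allow; the function and program names are hypothetical.

```python
def resolve_conflicts(attributions, exclusive_groups, priority):
    """Keep at most one program from each mutually exclusive group.

    attributions: {program: doctor}
    exclusive_groups: list of sets of program names that cannot coexist
    priority: program names ordered from most to least preferred
    """
    rank = {program: i for i, program in enumerate(priority)}
    result = dict(attributions)
    for group in exclusive_groups:
        present = [p for p in group if p in result]
        if len(present) > 1:
            # Programs missing from the priority list rank last.
            winner = min(present, key=lambda p: rank.get(p, len(rank)))
            for p in present:
                if p != winner:
                    del result[p]
    return result


attributions = {"primary_care": "Dr. X", "aco": "Dr. Y", "behavioral_health": "Dr. Z"}
resolved = resolve_conflicts(
    attributions,
    exclusive_groups=[{"primary_care", "aco"}],
    priority=["aco", "primary_care"],
)
```

Here the ACO program wins over the primary care program, while behavioral health is untouched because it is not in an exclusive group.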
Step 5: Enrollment
We now know which doctors a member is attributed to for each program. With this data, we:
- Map the doctor to the correct provider.
- Determine if the provider has contracted with the value based program.
- If so, get the contract.
If the provider has contracted with the value based program, there may be additional program rules that we need to check to verify that the member is still eligible for the program.
If the member passes these additional rules (which may or may not exist), then the member is officially part of that value based program.
Step 6: Cohort building
This is related to enrollment and may not be needed for every program, but programs can choose their scoring and payment populations in different ways. For example:
- It’s possible that payment is based on a smaller population than scoring. For example, you can contract to be paid for a specific age group or line of business, but not all age groups or all lines of business. But the provider may still be scored against all lines of business and/or age groups.
- If scoring is based on a percentile ranking that the pipeline computes, we may include all other providers, only providers within the value based program, only providers within a certain region, or various combinations of these options.
I don’t recall all of the cohort scenarios so I’m unclear exactly what configuration is required to define these cohorts. I’d imagine:
- There might be optional config further defining the scoring population.
- We could build the payment population based on contract parameters: if the contract only includes certain populations, we’d only include attributed members that meet those criteria.
Step 7: Metric Generation
The program probably requires a variety of metrics to be generated at the member level. Common metrics include:
- Cost of care. This sounds like just the total claim payments for the member, but programs often have rules for which claims to include and exclude. There are often per-member caps that are also applied, so the total cost of care is capped if the member exceeds that amount.
- Risk scores based on the members’ claims and other factors.
- Member months, which is the number of months in the program year for which the member is enrolled in the program.
- Capitation payments or care coordination fees.
- If the program uses segmentation/stratification (e.g. does different things for different age groups, lines of business, regions, etc.), we need to ensure that all metrics generated are properly segmented/stratified.
- This includes ensuring that all quality measures received for a member are also properly stratified.
- Essentially anything that a program would like to see reported that happens at the member level or requires segmentation/stratification must be computed as a metric.
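Two of these metrics are simple enough to sketch. The code below is illustrative only: it assumes the claim inclusion/exclusion rules have already been applied, that the program year is a calendar year, and that any partially enrolled month counts as a full member month. Real programs may define each of these differently.

```python
from datetime import date


def capped_cost_of_care(claim_payments, per_member_cap):
    """Total the included claim payments, then apply the per-member cap."""
    return min(sum(claim_payments), per_member_cap)


def member_months(enroll_start, enroll_end, year):
    """Count the calendar months of `year` touched by the enrollment span (inclusive)."""
    start = max(enroll_start, date(year, 1, 1))
    end = min(enroll_end, date(year, 12, 31))
    if start > end:
        return 0
    return end.month - start.month + 1


cost = capped_cost_of_care([1200, 300], per_member_cap=1000)
months = member_months(date(2023, 11, 15), date(2024, 3, 10), 2024)
```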
This stage has other requirements/complexities that I will describe later.
Step 8: Rollup
The metrics computed in step 7 need to be aggregated at the provider level. For example, the total cost of care for each member attributed to the provider will get summed up. We need to keep these aggregations properly segmented/stratified, however, so we can continue to track the aggregates for each segment.
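As a sketch, the rollup is a grouped sum keyed by provider and segment. The row layout and the metric names below are assumptions made for the example.

```python
from collections import defaultdict


def rollup(member_metrics):
    """Sum member-level metrics per (provider, segment) pair."""
    totals = defaultdict(lambda: {"cost": 0, "member_months": 0})
    for row in member_metrics:
        key = (row["provider"], row["segment"])
        totals[key]["cost"] += row["cost"]
        totals[key]["member_months"] += row["member_months"]
    return dict(totals)


member_metrics = [
    {"provider": "P1", "segment": "adult", "cost": 500, "member_months": 12},
    {"provider": "P1", "segment": "adult", "cost": 300, "member_months": 6},
    {"provider": "P1", "segment": "pediatric", "cost": 100, "member_months": 12},
]
provider_totals = rollup(member_metrics)
```

Keeping the segment in the key is what lets later stages score or pay each segment separately.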
Step 9: Provider actuals
This is a stage that allows us to adjust the provider’s numbers before they are scored. Some programs require this, while others do not.
For programs that require us to compute percentiles, I found this to be a good place to do this (for reasons that I won’t get into).
This is also a good place to compute completion factors, which I will describe elsewhere.
Step 10: Scoring
What constitutes a “score” depends very much on how payment is defined in the program. For example, a score in a shared savings program might simply be the amount of money saved.
But most value based programs also have a quality score and a quality gate, where no bonus is paid unless the quality score meets the quality threshold defined by the quality gate. Scoring quality almost always involves one or more lists of measures. It’s common for these lists to be different between age groups, and the entire scoring functions between age groups can also be completely different.
Most measures are only scoreable if they have a certain number of samples, so it’s common for measures to not be scored. How does that work if you have three measures (A, B, C) and A accounts for 50% of the score, and B and C both account for 25% of the score? If you pass A and B but can’t score C, is it fair to give you a 75% score? In this case we usually re-weight the scores so that A accounts for 67% of the score and B accounts for 33% of the score.
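The re-weighting in this example is just a renormalization over the measures that can be scored. A minimal sketch, with a hypothetical function name:

```python
def reweight(weights, scoreable):
    """Drop measures that cannot be scored and rescale the rest to sum to 1."""
    kept = {m: w for m, w in weights.items() if m in scoreable}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {m: w / total for m, w in kept.items()}


# A = 50%, B = 25%, C = 25%, but C does not have enough samples to be scored.
new_weights = reweight({"A": 0.50, "B": 0.25, "C": 0.25}, scoreable={"A", "B"})
```

With C removed, A becomes 0.50 / 0.75 (about 67%) and B becomes 0.25 / 0.75 (about 33%).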
The scoring is often hierarchical: A, B, and C in the above example can each be a list of measures rather than a single measure, where each measure in a list accounts for a certain percentage of A, B, or C.
Individual measures can be scored pass/fail, or sometimes they are a percentage of a potential payment number. The thresholds can be predetermined or can be generated on the fly using percentiles (i.e. compared against other providers).
Step 11: Payment
Payment is usually just a series of financial calculations that can often be defined using a spreadsheet, but can also involve measure lists and re-weighting as well.
For example, they can say measures A = $10, B = $5, and C = $5. If you pass all three, you get $20 per member, per month. But C didn’t qualify for scoring, so we also have to re-weight the dollar amounts so that A = $13.33 and B = $6.67.
Some programs can actually charge you money for poor scores, but some of those programs have a condition where they won’t if your score improved by a certain amount over the previous year (even if you still did poorly). I’ve also seen programs that pay you a fixed amount for every passing HEDIS measure that otherwise doesn’t qualify for scoring.
Sometimes a percentage or percentile score maps to specific values based on the range that it falls in. Sometimes the scores map to the percentage of money you get to keep from a fixed amount.
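A range lookup like this is easy to express as a sorted list of lower bounds. The sketch below uses made-up thresholds and payout shares purely for illustration.

```python
from bisect import bisect_right


def lookup_payout(score, lower_bounds, payouts):
    """Map a score to a payout by the range it falls in.

    lower_bounds must be sorted ascending. payouts has one more entry than
    lower_bounds: payouts[0] applies to scores below the first bound.
    """
    return payouts[bisect_right(lower_bounds, score)]


# Hypothetical table: below 50 keeps nothing, 50-74 keeps 25%,
# 75-89 keeps 50%, and 90 or above keeps 100% of the fixed amount.
bounds = [50, 75, 90]
shares = [0.0, 0.25, 0.5, 1.0]
```

For example, `lookup_payout(80, bounds, shares)` falls in the 75-89 range.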
In one case, a program defined a series of descending buckets, where each bucket has a dollar amount and a number of payments. The highest percentile score gets to allocate a certain number of payments from the top buckets. The next highest percentile gets to allocate payments from the highest remaining buckets, etc. This is the one payment scenario I’ve encountered that cannot be defined using a spreadsheet.
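My reading of that bucket scheme can be sketched as follows. This is an interpretation, not the program's actual rules: it assumes every provider allocates the same number of payments, that providers are already sorted from highest to lowest percentile, and that a provider simply receives less if the buckets run out.

```python
def allocate_buckets(buckets, ranked_providers, payments_each):
    """Hand out payments from descending buckets to providers in rank order.

    buckets: list of (dollar_amount, number_of_payments), highest amount first
    ranked_providers: provider ids, highest percentile first
    payments_each: how many payments each provider gets to allocate
    """
    remaining = [[amount, count] for amount, count in buckets]
    payouts = {}
    idx = 0  # index of the highest bucket that still has payments left
    for provider in ranked_providers:
        total, needed = 0, payments_each
        while needed > 0 and idx < len(remaining):
            amount, count = remaining[idx]
            take = min(needed, count)
            total += amount * take
            remaining[idx][1] -= take
            needed -= take
            if remaining[idx][1] == 0:
                idx += 1
        payouts[provider] = total
    return payouts


payouts = allocate_buckets(
    buckets=[(100, 2), (50, 3), (10, 5)],
    ranked_providers=["P1", "P2", "P3"],
    payments_each=2,
)
```

In this example P1 takes both $100 payments, P2 takes two $50 payments, and P3 takes the last $50 payment plus one $10 payment. Each provider's result depends on what the higher-ranked providers already consumed, which is what makes this awkward to express as spreadsheet formulas.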
For payments that involve hospital systems, there are usually formulas that define how much money goes to the hospital vs. the individual practices.
So payment is essentially a list of formulas, but may include re-weighting based on non-scorable measures, table lookups, and more complex scenarios that don’t map well to a spreadsheet.
Target Setting
Target setting is done before the program year starts, but it’s hard to describe without first understanding more about how the programs work.
For shared savings programs, target setting includes long calculations over multiple lookback windows to estimate how much money a provider will spend in the upcoming program year.
It can also be used to determine thresholds for the upcoming program year, as well as data snapshots. For example, if the program needs to compare a current measure score with a previous year, the previous year’s score would be part of this snapshot.