Health care organizations are increasingly engaging in population health management (PHM) to address the performance demands of value-based care programs. These programs encourage organizations to achieve high-quality metrics while mitigating cost.
The first part of this three-part series discussed how the interoperability of data between providers and health systems ensures an accurate longitudinal view of the patient. The second part showed that accessing data from a variety of sources is critical to effective PHM. This third and final article covers the optimal approach to data infrastructure: adapting and scaling data services with flexibility.
Overcoming the challenges
Health care organizations have made significant investments in electronic health record systems and, to a lesser degree, enterprise data warehouses to address patient care and reporting demands. Many organizations, however, are finding these investments inadequate for population health management and rapidly evolving regulatory mandates (e.g., the Medicare Access and CHIP Reauthorization Act). The demands are especially high when health systems are creating clinically integrated networks and forming accountable care organizations.
EHRs within clinically integrated networks often cannot exchange information with one another without losing meaning. Those EHRs also can’t handle the volume and variety of data needed for effective PHM. Furthermore, traditional EDW solutions require highly structured data in designated formats. Much of the data critical for PHM, however, is not structured, nor is it typically available in EHRs, so traditional EDWs have limited usefulness.
A “data lake” approach, which employs “big data” principles, overcomes these challenges. Such an approach can serve as the basis of a PHM data infrastructure, complementing existing investments in EHRs and EDWs.
A data lake is a repository that employs parallel computing and storage techniques for speed, scale and redundancy. It holds vast amounts of structured and unstructured data. The data model and relationships between data are not typically defined until the data is needed for different uses.
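This "schema-on-read" idea can be made concrete with a minimal Python sketch (all record shapes and field names below are hypothetical, not drawn from any particular product): raw records from different sources are stored untouched, and a structure is imposed only when a specific use case asks for one.

```python
# Minimal schema-on-read sketch: heterogeneous raw records are kept as-is;
# a "lab result" view is defined only at read time, when it is needed.
# All sources, fields and values here are hypothetical.

raw_lake = [
    {"source": "ehr_a", "mrn": "12345", "obs": {"code": "GLU", "value": 110}},
    {"source": "claims", "member_id": "M-987", "cpt": "99213", "paid": 84.50},
    {"source": "ehr_b", "patient_id": "P-555", "note": "free-text progress note ..."},
]

def project_lab_results(records):
    """Impose a lab-result structure at read time, skipping non-matching records."""
    for rec in records:
        if "obs" in rec:  # only EHR observation records fit this view
            yield {"patient": rec["mrn"],
                   "code": rec["obs"]["code"],
                   "value": rec["obs"]["value"]}

print(list(project_lab_results(raw_lake)))
```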
Population health management requires such a repository, one that will adapt to evolving needs. But leaders must ensure other core capabilities. PHM data infrastructure must have (1) a comprehensive view of individual patients in different venues of care; (2) a common profile of the provider with a transparent system for assigning patients; (3) cleansing, standardization and harmonization of clinical and claims data; (4) regulatory compliance with data governance; and (5) the skills and resources to support the programs.
Comprehensive view of individual patients
To succeed in PHM, a health care organization needs to accurately assess gaps in an individual patient’s care, calculate the total cost of care for the patient and identify the appropriate resources to manage the patient’s care. Assembling this view of the patient requires a master person index that works across care providers and venues, uniquely identifying each patient so that clinical and claims data can be linked to a single person’s record.
Most health systems have an enterprise master patient index. This index, though, may fail to address population health management for a community, particularly when patients are managed in a clinically integrated network or an accountable care organization. Various data sources may have disparate patient identifiers and differing demographic components. Ideally, a system should employ both traditional deterministic matching and probabilistic matching. Doing so provides a greater chance of matching the same person from various settings, and it guards against matching two different individuals to the same record.
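As a rough illustration of the hybrid approach (the field names, weights and threshold below are invented for this sketch, not taken from any EMPI product), a matcher might first attempt a deterministic match on a shared identifier and fall back to a weighted probabilistic score on demographics:

```python
from difflib import SequenceMatcher

# Hypothetical hybrid matcher: deterministic match on a shared identifier
# first, then a weighted probabilistic score across demographic fields.

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(rec_a, rec_b):
    # Deterministic: an identical medical record number is treated as a match.
    if rec_a.get("mrn") and rec_a.get("mrn") == rec_b.get("mrn"):
        return 1.0
    # Probabilistic: weighted agreement; weights here are illustrative only.
    score = 0.0
    score += 0.35 * similarity(rec_a["last_name"], rec_b["last_name"])
    score += 0.25 * similarity(rec_a["first_name"], rec_b["first_name"])
    score += 0.40 * (1.0 if rec_a["dob"] == rec_b["dob"] else 0.0)
    return score

a = {"first_name": "Jon", "last_name": "Smith", "dob": "1970-01-02"}
b = {"first_name": "John", "last_name": "Smith", "dob": "1970-01-02"}
print(match_score(a, b) > 0.85)  # True; threshold chosen arbitrarily here
```

In a production EMPI, scores in a middle band would typically be routed to a human steward rather than auto-merged, which is what guards against linking two different individuals to the same record.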
Common provider profile, flexible attribution
For clinicians to be accountable for the patients they manage, the health care organization should ensure that the process it uses to attribute patients to clinicians is transparent. The data infrastructure must be able to support a variety of assignment models, including those relevant for primary care as well as for specialty care.
These models may be based on frequency of visits, specific events or procedures, or explicit assignments. Organizations may need multiple attribution models for different initiatives and quality metrics, and those models may change over time. A common, unified provider profile, perhaps keyed to the National Provider Identifier, is also necessary to capture demographics, locations, care metrics and so on, regardless of where the provider practices.
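A plurality-of-visits model, one of the simpler attribution approaches mentioned above, might look like the following sketch (the visit data and the tie-breaking rule are hypothetical):

```python
from collections import Counter

# Hypothetical plurality-of-visits attribution: assign each patient to the
# provider (by NPI) who saw the patient most often in the measurement
# period, breaking ties in favor of the most recent visit.

visits = [
    # (patient_id, provider_npi, visit_date)
    ("P1", "1111111111", "2023-01-05"),
    ("P1", "1111111111", "2023-03-10"),
    ("P1", "2222222222", "2023-06-01"),
]

def attribute(visit_list):
    counts, latest = Counter(), {}
    for patient, npi, date in visit_list:
        counts[(patient, npi)] += 1
        latest[(patient, npi)] = max(date, latest.get((patient, npi), date))
    assignment = {}
    for (patient, npi), n in counts.items():
        best = assignment.get(patient)
        if best is None or (n, latest[(patient, npi)]) > (best[1], best[2]):
            assignment[patient] = (npi, n, latest[(patient, npi)])
    return {p: npi for p, (npi, _, _) in assignment.items()}

print(attribute(visits))  # {'P1': '1111111111'}
```

Keeping the rule this explicit is what makes attribution transparent: a clinician can trace exactly why a given patient landed on his or her panel.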
Clean, high-quality data
Gathering data from a variety of sources, especially EHRs, can create challenges when the quality of the data and the way it is represented differ. Even when the data adhere to standard codes, they may include a wide variety of “local” codes. The ideal data infrastructure must account for several data management needs, including cleaning, standardizing and imputing missing data when appropriate.
Although a health system should never remove data, data specialists should censor or mark suspect data so that it is not inadvertently used. Examples include flagging clinically or physiologically implausible values and identifying artifacts generated by the source systems.
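For instance, a simple plausibility screen might flag, rather than delete, out-of-range values (the ranges below are illustrative only, not clinical reference ranges):

```python
# Hypothetical plausibility screen: suspect values are flagged, never
# removed, so analytics can exclude them while the raw record is preserved.
# Ranges are illustrative only.

PLAUSIBLE = {
    "heart_rate_bpm": (20, 300),
    "height_cm": (20, 250),
    "temp_c": (25, 45),
}

def screen(observation):
    low, high = PLAUSIBLE[observation["type"]]
    observation["suspect"] = not (low <= observation["value"] <= high)
    return observation

print(screen({"type": "heart_rate_bpm", "value": 1200}))
# {'type': 'heart_rate_bpm', 'value': 1200, 'suspect': True}
```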
Standardization should map local, coded data to common terminologies to facilitate analysis at scale. Ideally, these should be terminologies widely adopted for interoperability: LOINC for laboratory tests, results and patient-reported outcomes; RxNorm for medications; and SNOMED CT, ICD-9 and ICD-10 for diagnoses and other clinical concepts.
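In practice, standardization often reduces to maintaining crosswalks from each site’s local codes to these terminologies. A toy version follows (the local codes are invented; 2345-7 is the LOINC code for glucose in serum or plasma):

```python
# Toy terminology crosswalk: map site-specific "local" lab codes to LOINC
# so results from different EHRs can be analyzed together.

LOCAL_TO_LOINC = {
    ("site_a", "GLU"): "2345-7",
    ("site_b", "GLUC-SER"): "2345-7",
}

def standardize(site, local_code):
    loinc = LOCAL_TO_LOINC.get((site, local_code))
    if loinc is None:
        # Surfacing unmapped codes is itself part of data quality work.
        raise KeyError(f"unmapped local code {local_code!r} from {site}")
    return loinc

print(standardize("site_a", "GLU") == standardize("site_b", "GLUC-SER"))  # True
```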
Missing values may signal omissions in documentation or omissions in care, or they may be perfectly appropriate given a specific patient’s diagnoses and care history, so data specialists should take great care to interpret the situation correctly. In other cases, a missing value can be imputed from other data with a high degree of certainty (for example, a body mass index for an adult patient with a known height and weight), and the data infrastructure should support such derivations. Specialists should clearly mark imputed values in the data.
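The BMI example translates directly into code. The sketch below computes BMI as weight (kg) divided by height (m) squared and marks the result as imputed so it is never mistaken for a documented value (field names are illustrative):

```python
# Impute BMI from documented height and weight, and flag it as imputed.
# BMI = weight_kg / height_m ** 2; field names are illustrative.

def impute_bmi(record):
    if record.get("bmi") is None and record.get("height_m") and record.get("weight_kg"):
        record["bmi"] = round(record["weight_kg"] / record["height_m"] ** 2, 1)
        record["bmi_imputed"] = True  # clearly mark the derived value
    return record

print(impute_bmi({"height_m": 1.75, "weight_kg": 80, "bmi": None}))
# {'height_m': 1.75, 'weight_kg': 80, 'bmi': 26.1, 'bmi_imputed': True}
```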
Regulatory compliance
In addition to complying with federal security and privacy rules such as the Health Insurance Portability and Accountability Act, the data infrastructure should allow for local privacy rules that may constrain the exchange of sensitive data (e.g., HIV test results or mental health diagnoses). Moreover, given the variety of clinicians involved in a person’s care, health care organizations should ensure that only specific individuals are able to access restricted information.
Some of these constraints may mean that clinicians without a direct treatment relationship have limited access to detailed clinical data, even as aggregate care gaps and performance data are shared within the organization or with the ACO. Nevertheless, the data governance process requires that the data infrastructure capture access history for audit purposes.
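A minimal sketch of such access mediation (the roles, relationship check and audit format are all hypothetical) could combine a treatment-relationship test with an append-only access log:

```python
from datetime import datetime, timezone

# Hypothetical access mediation: detailed sensitive data is released only
# to users with a treatment relationship, and every decision is logged so
# the data governance process can audit access history.

TREATMENT_RELATIONSHIPS = {("dr_lee", "P1")}  # (user, patient) pairs
AUDIT_LOG = []                                # append-only in this sketch

def access_sensitive(user, patient, field):
    allowed = (user, patient) in TREATMENT_RELATIONSHIPS
    AUDIT_LOG.append({
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user, "patient": patient, "field": field, "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"{user} lacks a treatment relationship with {patient}")
    return f"<{field} for {patient}>"  # stand-in for the real data fetch

access_sensitive("dr_lee", "P1", "hiv_test_result")
print(AUDIT_LOG[-1]["allowed"])  # True
```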
Skills and resources needed
Organizations need to couple data infrastructure technology with the resources to make the most of the investment. In addition to software engineers, database experts and data analysts, a PHM data lake infrastructure requires data scientists and informaticians. In addition, given regulatory needs, complex clinical organizations will need individuals well-versed in HIPAA and data governance.
A PHM data infrastructure built on a data lake and these core capabilities will facilitate an organization’s pursuit of population health, including advanced analytics, risk stratification, identification and closure of care gaps through outreach, provider performance measurement, care management and more. Finally, each organization must decide whether it makes sense to build each capability itself or to partner with companies that can help jump-start the process.
Editor’s note: This article is the third in a three-part series on common barriers to population health management. In parts 1 and 2, Dr. Jain addressed the challenges of data interoperability and access to actionable data.
Anil Jain, M.D., is a vice president of IBM Watson Health and former senior executive director of information technology at the Cleveland Clinic. He continues to practice and teach internal medicine at the clinic.
The opinions expressed by the author do not necessarily reflect the policy of the American Hospital Association.