National Committee on Vital and Health Statistics
Subcommittee on Quality

The Meaningful Measure Supply Chain:
Building Measures That Matter 
for Our Nation’s Health and Healthcare Priorities

October 13-14, 2009

National Center for Health Statistics

Hyattsville, Maryland


Executive Summary

The Subcommittee on Quality of the National Committee on Vital and Health
Statistics (NCVHS) held a hearing on October 13-14, 2009, on the Meaningful
Measure Supply Chain. The hearing featured eight sessions covering two broad
areas—1) the nature of meaningful measurement and the status of
performance measurement today; and 2) the challenges and lessons in developing
and using meaningful measures in four priority areas. Those areas are care
coordination, disparities, value, and population health/health status. The
presentations are summarized in the next section; the agenda and list of
participants are in the Appendix. (The speakers’ slides and a transcript of the
hearing are posted on the NCVHS Website.[1])

Over the two days of the hearing, there were numerous opportunities for rich
discussion among the presenters and NCVHS members and staff. This executive
summary focuses on the ideas generated by those discussions.

The emerging contributions of health information technology (IT) to quality
measurement and improvement, combined with the opportunities created by new
federal Stimulus legislation, provided the context and impetus for the hearing.
Electronic health records (EHRs) make possible the collection of clinically
relevant quality data as an integral part of the health care process, drawing
on multiple data sources and linking, ultimately, to real-time decision
support. The
HITECH[2] provisions of the
American Recovery and Reinvestment Act of 2009 are designed in part to harness
technology in these ways to improve health care quality. One HITECH criterion
for earning an incentive for EHR adoption is using EHRs to report clinical
quality measures.

The NCVHS Subcommittee on Quality was motivated to hold the October hearing
partly by a concern that in health care, ease of measurement too often takes
precedence over measuring what matters. Believing that it is necessary to
understand the supply chain for measure development and improvement to
determine how best to take advantage of current opportunities, the Subcommittee
invited measure developers, endorsers, system developers, and reporters to
present their perspectives. On the basis of its findings, it plans to draft
recommendations to the Secretary for consideration by the full Committee.

The NCVHS hearing addressed four questions:

  • How do we approach building meaningful measures?
  • What is the current process for developing measures? Does it adequately
    address measure development for key national priorities and subpopulations?
  • How do we introduce new data sources—clinical data from EHRs,
    user-generated data, etc.—into the measure-development process?
  • How do we maintain and update measures? What are the health IT system
    implications?

The current landscape, the emerging vision, and the way forward

The hearing testimony painted a picture of a quality landscape undergoing a
gradual transition toward needed changes such as a focus on episodes of care
and increased use of outcomes data and composite measures. Collaborative
efforts were described that are developing simplified data sets, retooling
existing measures for use with EHRs, and laying the groundwork for expanded use
of health IT and multiple data sources in future quality measurement.

While recognizing that this transition is an evolutionary process that will
take several years, NCVHS members were concerned that current efforts do not
seem driven by the sense of urgency and opportunity felt by the National
Committee. Subcommittee members noted the number of separate measure
development efforts, producing what Dr. Helen Burstin of the National Quality
Forum called “a cacophony of measures.”

The Subcommittee talked with the panelists about the kind of breakthroughs
needed to meet current opportunities, and what conditions would facilitate the
breakthroughs. The group identified the need for a parsimonious set of
harmonized, standardized quality measures that represent agreement among
providers, evaluators, payors, and others about what to measure. Participants
stressed the importance of embedding quality data collection in the care
process, with rapid feedback mechanisms so the resulting information supports
health care decisions. A recurring theme was the idea of creating measures that
are both useful and “aspirational” for physicians, thus contributing
to a continuing learning process. A related suggestion was to increase the
quality data available for internal and local use, unconnected to large-scale
reporting and accountability initiatives.

To facilitate these processes, participants agreed that health care
organizations will need data stewards responsible for aggregating and cleaning
the data. In view of the slow pace of change, one suggestion was to focus in
the short term on collecting the data elements that are the building blocks
for measures, rather than on measures per se. Another innovative approach that
garnered interest was the AHRQ-funded DARTnet demonstration project, which
uses a distributed network instead of a centralized data warehouse.
There was broad agreement that the needed breakthroughs are not likely to
happen without a national policy framework and strategy for quality measurement
and improvement, an overarching architecture, and a new accountability and
governance framework.


Summary of Presentations

Introduction

Dr. Carr and Dr. Tang, Subcommittee Co-Chairs

Dr. Carr pointed out that ease of measurement too often has taken precedence
over measuring what matters. New data sources, including electronic health
records (EHRs) and the reliance on measures of meaningful use under HITECH
provisions of ARRA, underscore the relevance of understanding the supply chain
for measure development and improvement. The Subcommittee invited measure
developers, endorsers, system developers and reporters to present their
perspectives in this hearing. She reviewed the questions guiding the hearing
(see above).

Dr. Tang, who is a member of the HITECH Policy Committee and chairs its
Meaningful Use Group, discussed the HITECH context for the hearing. One of its
four criteria for earning an incentive for EHR adoption is using EHRs to report
clinical quality measures. The Policy Committee will recommend criteria for
evaluating whether hospitals and eligible professionals qualify for incentives.
It has chosen to focus on clinically derived quality measures; but they and
current practices have many limitations, which he reviewed. He concluded,
“In short, we lack meaningful measures.” That is the motivation
behind this hearing.

Setting Priorities for Measurement

Helen Burstin, M.D., National Quality Forum (NQF)

Dr. Burstin focused on NQF’s National Priorities Partnership (NPP) and the
work toward meaningful use measures. NQF has worked for a decade to endorse
national consensus standards and, more recently, to publicly report on
performance, and has added to its mission setting aligned national priorities
and goals. Its evaluation criteria now state that what is measured should
enable a significant improvement in a priority area. Under an HHS contract, NQF
is evaluating the 20 highest priority conditions for Medicare in terms of nine
criteria. Measures that do not meet the criteria will not be evaluated.

NQF has been trying to push the field toward higher performance, and toward
composites and away from narrow process measures. It measures disparities in
all contexts, and seeks to harmonize measures across sites and providers. The
idea is to promote shared accountability and measurement across patient-focused
episodes of care. All of this moves toward measuring outcomes and
appropriateness, which will soon be combined with measures of costs and
resource use. A population health perspective and the concept of a population
at risk (i.e., those for whom prevention is/was possible) are involved. NQF
also would like to see the longitudinal assessment of care.

Dr. Burstin observed that “the cacophony of hundreds and hundreds of
measures is not getting us where we want to go.” Thus, the NPP was
convened to work for agreement on high-leverage areas in which harmonized
“effector arms” around goals could drive improvement more rapidly.
The goal of the NPP, in which 32 leadership organizations participate, is to
establish national priorities and goals for public reporting in four “key
aims”: providing effective care, eliminating harm, removing waste, and
eradicating disparities. She outlined and commented on the six national
priorities: patient/family engagement in managing health and making care
decisions, population health, safety, care coordination, palliative care, and
overuse. Several objectives are specified in each priority area. Eradicating
disparities is a seventh, cross-cutting priority. NQF has classified an initial
set of measures as disparity-sensitive in the ambulatory care setting, with an
eye to stratifying these and other measures identified in this way.

Discussion

In the discussion period, Dr. Burstin was asked about the timeline for the
projects she described. She said the national priorities will be filled out
within two years. This will have to be “overlaid” with health IT
(which Dr. Eisenberg will describe on day two of this hearing). She noted that
her next presentation (below) will talk about feasibility in regard to quality
measures, and she emphasized the need for expanded funding for measure
development.

What makes a measure meaningful?

Dr. Burstin noted that the definition of “meaningful” has taken on
many layers in the last couple of years, and there has been a huge growth of
measures. The questions are whether there are too many, or too few, and whether
they are the right ones. The transition to EHRs can be expected to be
transformative in the way we look at what is a meaningful measure.

The criteria for NQF evaluation are importance to measure and report,
scientific acceptability of the measurement properties, usability, and
feasibility (with greater emphasis on health IT). There must be evidence for
the measure, it must be related to a priority area, and there must be
opportunity for improvement. There is increasing focus on outcomes and
intermediate outcomes. Structural measures continue to be important, and
efficiency is increasingly being emphasized.

Dr. Burstin commented on some of the problems that clinical guidelines
create for measurement because they lack specificity, precise definitions, and
precise terminology. They also focus measurement on “measureable branch
points” that she likened to “searching for the keys under the lamp
post.” She noted the need to develop standards for “computable
clinical guidelines” that are usable for measurement and improvement. The
first question should be, “What is the most important thing to
measure?” followed by “Where can the data be found to do that?”

With new measures moving into the field and the advent of EHR-enabled
measures, there is interest in demonstrating the comparability of different data
sources. Dr. Burstin does not expect this to happen in the short term.
Exclusions increase the complexity and burden of measurement and limit the
ability to use electronic sources. In general, the best approach with EHRs is
not yet clear. Regarding usability, NQF requires that the measures it endorses
are usable both for public reporting and for informing quality improvement. The
most challenging requirement to implement is that measures are harmonized, to
ensure that they have distinct or additive value. Feasibility is of particular
importance from today’s perspective. NQF plans to require specifications for
EHRs for measure submission within a year or two.

Dr. Burstin then reported on NQF’s work on the Quality Data Set (QDS). NQF
believes the way to make measures more meaningful is to ensure harmonization
and the acquisition of “the right kind of clinical data.” This, she
said, represents “a real transition” for NQF. The vision is for the
QDS to be built into a measure-authoring tool for measure developers. The
Health IT Expert Panel (HITEP) developed a list of questions for making
measures meaningful from the IT perspective, including one on the use of
patient-centered data sources. She stressed that thinking about meaningful
measurement is an evolutionary process. Finally, she pointed out that measures
will be more meaningful when data streams pull in information from a range of
sources including pharmacies, labs, EHRs, and PHRs.
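
To make that idea concrete, the following is a minimal sketch (in Python,
with entirely invented feed names, field names, and sample values) of
assembling one patient’s quality-relevant data from several feeds while
preserving the source of each element:

```python
# Illustrative sketch only: the feed names, field names, and sample values
# below are invented. It shows one patient's quality-relevant data being
# pulled from several feeds while the source of each element is preserved.

feeds = {
    "pharmacy": [{"patient": "p1", "element": "statin prescription filled"}],
    "lab":      [{"patient": "p1", "element": "LDL 98 mg/dL"}],
    "ehr":      [{"patient": "p1", "element": "diabetes, active diagnosis"}],
    "phr":      [{"patient": "p1", "element": "home BP 128/82"}],
}

def merged_record(patient_id):
    """Combine elements across feeds, tagging each with its source."""
    record = []
    for source, items in feeds.items():
        record.extend(
            {"source": source, "element": item["element"]}
            for item in items
            if item["patient"] == patient_id
        )
    return record

for entry in merged_record("p1"):
    print(f'{entry["source"]:8} -> {entry["element"]}')
```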

Dr. Reuben focused on the meaningfulness criteria of validity, importance,
and longevity; on the roles of, and differences between, certifying boards and
specialty societies in improving quality; and on how their efforts align with
others. He posed a series of questions related to the three criteria. For
example, on validity, does the measure capture what it is intended to and
discriminate performance among providers? Does improvement in the measure
result in improved outcomes? He called special attention to the
meaninglessness of measures derived from responses forced simply to move to
the next screen. Regarding importance, he contrasted weighing a patient vs. providing
nutritional counseling to illustrate the different impacts of ways to satisfy a
quality measure. He also commented on the relative values of individual
measures versus composite scores, and the merits of looking at composite
outcomes rather than “a couple of markers.” On longevity, he noted
that guidelines remain current for about three years, which suggests that
quality measures should not be “there forever.” Moreover, people
learn to “game the system.”

Turning to physician organizations, Dr. Reuben described the different
types—medical societies, licensing boards, and certifying boards. The
third group (which includes ABIM) consists of not-for-profit oversight
organizations with an established role in defining specialties in “the
field” of medicine. The umbrella body for the 24
certifying boards is the American Board of Medical Specialties. Improving
quality is an important part of the ABIM mission, and the organization has a
maintenance of certification program designed to make sure that physicians
“keep up.”
He described how it operates, and stressed that it is important on the quality
landscape because of the evidence that board-certified physicians, especially
recently certified ones, provide better care and have better outcomes. The ABIM
maintenance of certification (MOC) process is designed to evolve into a process
of continuous certification.

The certifying boards have historically assessed medical knowledge, clinical
judgment and diagnostic acumen – all critical to quality care and
difficult to discern via performance measures. ABIM and other boards have now
broadened their programs to include assessment of performance in practice,
using NQF measures where available. ABIM’s Practice Improvement Modules (PIMs)
include a patient survey, chart review, and practice survey rolled up into a
feedback report which provides physicians a rich and varied view of their
practice strengths and weaknesses. Physicians are then required to design and
implement a QI intervention in response to the feedback report and to
re-measure to gauge the impact. He then illustrated this with ABIM’s diabetes
PIM, showed how the organization is deriving composite measures, and discussed
the differentiation among performance levels that these enable.

Finally, he described a number of board alignment efforts in the private and
public sectors. He stressed that ABIM is aligned with “where the quality
field is headed”; that PIMs change physician behavior and are readily
adoptable; and that public and private payors can leverage this infrastructure
to accelerate improvement.

Discussion

In the discussion period, the topics covered by NCVHS members and the
presenters included: how to create a more “aspirational” performance
measurement system; how to make measures more directly relevant to clinical
practice and get more physician buy-in; the need for performance improvement
tools aimed at system redesign to make health care more efficient in delivering
evidence-based care; the manifold challenges for physicians in improving care
and how to address them in measurement; the ABIM vision for combining
continuous quality improvement and certification; and what to do about
proprietary measures.

Dr. Reuben said it was important to make it easier for physicians to do the
right thing, which is partly about creating better systems. Regarding the
daunting pace of change in medicine and the short life of measures, Dr. Burstin
said the solution lies in making sure one has the right elements and data
types; these are not going to significantly change, though the measures may. As
the shift continues toward outcomes measures, what are now process measures can
be folded into clinical decision support. She also advised acceptance of the
notion that the environment will probably remain “uncomfortable” for
at least the next five years.

Several comments dealt with the importance of research—for example, to
evaluate different data sources, and (more long term) to “tune the
measures” to account for the variations in genetics, behavior, community
exposures, and other determinants of health outside health care. There was
considerable discussion of how to use the performance measurement system to
encourage high performance, not just to identify “the bottom
feeders.” In the context of a discussion of P4P programs, Dr. Burstin
noted that huge improvements over time, not just absolute scores, should also
be rewarded. She also pointed out that internal data sharing among peers can be
used to motivate improvement within organizations, with real-time monitoring
and feedback being key to getting knowledge into practice. Dr. Fitzmaurice
summed up this session as having “painted a picture of how difficult it is
and yet how far we have come.”

Current Measure Development, Endorsement, and Adoption Process

This session focused on the current process for developing and
endorsing measures and the prospects for encouraging, promoting, and pulling
more good measures out of the system. Dr. Tang reiterated that this is “a
big moment of opportunity” because of HITECH.

  • Bernie Rosof, M.D., Physician Consortium for Performance Improvement
    (PCPI)

Dr. Rosof presented an overview of PCPI’s work. PCPI is convened and staffed
by the American Medical Association (AMA) and consists of 125 national and
state medical groups, quality organizations, and others, including a health
care consumer/purchaser panel. PCPI and others are beginning an effort to
advance the alignment of performance measures, for integration into maintenance
of certification programs. The pressures to accomplish this come from CMS, the
PQRI program, and other sources. The selection process includes developing
specifications for multiple data sources, including EHRs; a protocol for
testing measures; and an implementation plan, among other factors. Dr. Rosof
stressed that practicing health care professionals are the “essential
drivers” of quality and they must not be left behind in this process. The
PCPI process aims to increase the number of physicians who receive process of
care and clinical outcomes data, from the current status of only 1 in 5. Its
goal is to engage all physicians and health care professionals in meaningful
use of measures (meaningful for all stakeholders) and of EHRs. Among its
criteria for topic selection, the “high value characteristics” are
care coordination, patient safety, and appropriateness/overuse.

The current PCPI work plan is to fill the gaps to include measures related
to six appropriateness topics and care coordination, and to “round
out” measurement sets. Dr. Rosof gave an illustration for the heart
failure measurement set, a current focus, which will include outcomes,
intermediate outcomes, processes (with attention to appropriateness and
underuse), and both inpatient and outpatient arenas. The work plan also
involves integration into EHRs and working with NCQA and NQF, EHR vendors, and
physician users of EHRs.

Dr. Kmetik elaborated on the areas introduced by Dr. Rosof. She presented
the four-stage model of PCPI’s measure development process. It is designed to
take advantage of EHRs, “a new rich clinical source,” in a way that
involves all PCPI stakeholders. The process:

  1. Develop and maintain clinically-relevant quality measures
  2. Develop and maintain EHR specifications for measures (2 levels—see
    below)
  3. Evaluate EHR specifications with vendors and physician users
  4. Implement real-world “incubator groups” to test feasibility and
    validity

Dr. Kmetik illustrated the process with a new measurement set developed for
heart failure. It includes both inpatient and outpatient arenas and a
functional assessment measure, and a broad-based team has been involved in
developing and testing it. EHR specifications (step 2 above) have two levels:
level 1 provides “all available code sets, algorithms for calculations,
and ‘rules’”; level 2 presents them in an “unambiguous
SDO-approved format.” Step 3 is based on conversations with physician
users about how they use EHRs. Step 4, testing by incubator groups, is vetting
national specifications and level-1 specs. PCPI first put the groups together
four years ago to test EHR products. They represent different specialties and
practice sizes, and they test different EHR products. She stressed the
importance of testing—”actually trying it to see what happens.”

PCPI has also developed a tool for tracking progress on measure sets. Using
a clinical example, she showed a matrix that has process, intermediate outcome,
outcome, and cost/utilization on one axis and a series of stages on the other.
(See slide 10.) She noted the need to track the readiness to “go live in a
national meaningful use program,” and the value of this tool for doing so.
In conclusion, Dr. Kmetik said PCPI would continue all of the activities
outlined above, and would expand incubator groups for testing and track
progress to determine readiness.

Ms. Scholle focused on NCQA’s work to expand measurement opportunities using
EHRs and health information exchanges. She contrasted current measurement and
data environments with what will be possible in the future, when both the
capabilities and the data sources will be different. The future vision is for
measurement that is concurrent with clinical services, linked to real-time
decision support, using data sources available across settings and using more
clinically relevant measures. The “dream environment” will have
claims data from all plans and electronic clinical data from all providers,
linked to a rich clinical decision support environment, with all data
collection web-based or via e-survey. Also, it will be both
patient-centered and population-based.

To get there, she outlined what is needed from measure developers and
“evaluators” such as NCQA, noting that the answer is strongly related
to emerging electronic capabilities. In the short run, they need to convert
existing measures into ones that can be used in the electronic environment,
while thinking about new measures and evaluation models that can capitalize on
the new capabilities.

She highlighted issues to be addressed related to the formats for EHR-based
measures, where to look for the data, the hierarchy for data searches, and what
code sets to use; whether the data should be concurrent, retrospective, or
both; whether they should be visit- or population-based, or both; and what
updating process to have. She then illustrated how measure development will
change in a meaningful use environment, and added that NCQA is looking at
moving into that process. A proposed draft standard, the Health Quality
Measurement Framework (“eMeasure”), is the model for getting to
electronic measurement, using XML to tag elements. NCQA has outlined a path for
retooling its existing measures, and it is “actually doing this right
now” with support from HHS and NQF, and working with PCPI to convert
specifications to EHR value sets and logic. It is looking at 35 high-priority
measures that would be available for use in 2011. In addition, there are new
opportunities. Ms. Scholle outlined an “evidence stewardship” model
of multiple uses for an enhanced evidence base using electronic data systems
(slide 13). The systems enable more effective and dynamic feedback loops among
the evidence base, decision support, clinical care, and evaluation for improved
quality and safety.
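
As a rough illustration of the XML-tagging idea behind the eMeasure model,
the sketch below builds a toy measure fragment; the tag names, attributes,
and structure are invented for illustration and do not follow the actual
draft standard’s schema:

```python
import xml.etree.ElementTree as ET

# Toy fragment only: tag names, attributes, and codes are invented and do
# not follow the actual eMeasure/HQMF schema. The point is simply that each
# data element of a measure is tagged so software can evaluate it.

measure = ET.Element("measure", id="hf-beta-blocker",
                     title="Heart failure: beta-blocker therapy")
population = ET.SubElement(measure, "initialPopulation")
ET.SubElement(population, "criterion", category="diagnosis",
              codeSystem="SNOMED-CT", code="84114007")  # heart failure
numerator = ET.SubElement(measure, "numerator")
ET.SubElement(numerator, "criterion", category="medication",
              codeSystem="RxNorm", valueSet="beta-blockers")

print(ET.tostring(measure, encoding="unicode"))
```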

She then outlined NCQA’s activities on Meaningful Use measure priorities in
11 areas. In areas of particular importance, she reported on a national working
meeting on measurement of overuse and appropriateness that NCQA sponsored in
June 2009. There was interest, but also caveats to proceed with caution. The
Archimedes model, another opportunity, combines clinical decision support with
outcomes measurement; Kaiser-Hawaii is helping to test it. Finally, Ms. Scholle
talked about the opportunities to make measure updating more feasible. NCQA
formally re-evaluates all HEDIS measures at least every three years. It will
coordinate with EHRs and HIEs for planned and unplanned updates.

Dr. Opelka is involved in a ten-hospital system in Louisiana that has a
learning network for quality improvement. He also is involved with the American
College of Surgeons’ National Surgical Quality Improvement Program (NSQIP),
which encompasses multiple measurement systems. He focused on procedural-based
care, which he called “a whole different realm of performance
measurement.” NSQIP captures about 135 data elements over 30 days and uses
risk adjustment methodologies. Thirty-five of the elements are risk adjustors,
though he said only 6 or 8 would be sufficient. (He added that the risk
adjustment needs to be updated.) They sample 20 to 25 percent of multiple
procedures.
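
As an illustration of the risk-adjustment idea, the sketch below computes a
facility’s observed-to-expected complication ratio from a toy logistic model;
the risk factors and coefficients are invented, and NSQIP’s actual models are
far richer:

```python
import math

# Toy observed-to-expected (O/E) sketch. The risk factors and coefficients
# are invented; NSQIP's actual models use many more variables and different
# estimation methods.

COEFFS = {"intercept": -3.2, "age_over_75": 0.9,
          "emergency_case": 1.1, "asa_class_3plus": 0.8}

def expected_risk(patient):
    """Predicted complication probability from a toy logistic model."""
    logit = COEFFS["intercept"] + sum(COEFFS[f] for f in patient["risk_factors"])
    return 1.0 / (1.0 + math.exp(-logit))

def oe_ratio(cases):
    """Observed complications divided by the sum of expected risks."""
    observed = sum(c["had_complication"] for c in cases)
    expected = sum(expected_risk(c) for c in cases)
    return observed / expected  # >1.0 suggests worse-than-expected outcomes

cases = [
    {"risk_factors": ["age_over_75"], "had_complication": False},
    {"risk_factors": ["emergency_case", "asa_class_3plus"],
     "had_complication": True},
]
print(f"O/E ratio: {oe_ratio(cases):.2f}")
```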

The NSQIP is a facility-level measurement system, and it is trusted and
meaningful to providers for quality improvement. It has demonstrated the ability to create
large learning networks within regions and to prevent a significant number of
complications, representing “astronomical” savings. One focus is
appropriateness of care. Dr. Opelka asserted that the best approach is
condition-specific and evidence-based. With respect to appropriateness,
patient-specific factors introduce confounding beyond the capacity of current
measures to sort out. He concluded with comments on provider-level composite
measures combining structure, process, outcomes, appropriateness, and
efficiency, as well as on the results of a surgical CAHPS survey that the
American College of Surgeons is about to launch.

Finally, he said that measures should not only be based on priorities but
developed in conjunction with the payor community. The College of Surgeons has
gotten useful information from payors about variance and gaps. It also works
with vendors who run the clinical databases, and they are “dropping
patches into the EHRs.”

Discussion

This discussion focused on the tension between the time it will take to come
up with appropriate EHR-derived measures and the urgency brought on by the
current situation—not only HITECH, but the fact that the Medicare Trust
Fund is going bankrupt. Dr. Rosof noted that adequate funding would help move
the process along.

Dr. Middleton asked the speakers to identify the breakthrough opportunities
they would like to see to accelerate development and implementation of
measures. These were suggested:

  • Harmonization across payors, so that providers would not have to deal with
    multiple quality programs—which Dr. Opelka predicted would require
    “government intervention.”
  • An up-front investment in quality measurement, quality improvement and
    learning networks by all payors. Dr. Opelka said there is “overwhelming
    evidence” in the surgery world of the return on investment in good quality
    care.
  • Information that clinicians can use and explain to patients; meaningful
    reports; and, to accomplish that, HIT systems containing structural measures of
    the needed capacities, starting with “a few really good measures” and
    growing from there.
  • National standards, agreed upon by all stakeholders—providers,
    payors, consumers, specialty societies, and boards—that harmonize current
    performance measures.
  • Getting timely data into the hands of physicians (rather than just
    exporting them, with no feedback).

Dr. Middleton asked about the relative merits of a “floor of
competency” versus an aspirational goal in performance measurement, and
the role of learning networks. Dr. Opelka said both are needed—a floor or
baseline at the national level, and “the upper 10 percent” at a
personal level. The key to improvement, he said, is systems that create
standardization and reliability. The measures should be “specific to our
issues, and then we have to own them”; and a team effort is needed to
figure out how to change. Learning networks help in all these areas.

Dr. Tang proposed as a breakthrough making sure the quality measures
“align with the physician’s mind.” In addition, he pointed to the
idea of the quality data set and asked the panelists if their organizations
would be willing to draw from such a consensual set of measures. He and others
observed that now, organizations are engaged in good but separate processes
that lead to different quality measures, when what is needed is a core set of a
few. Dr. Opelka observed that it would be useful for such a data set to allow
further data mining, to derive locally-relevant information for quality
improvement.

Mr. Reynolds commented on the potential value for payors of having an
agreed-upon set of measures for all patients, providers, and payors, thus
making it possible to say to vendors, “You should do this.” Dr. Tang
added that there is a moment of opportunity now, and the hearing is “a
plea to measure developers to hit the breakthrough measures,” with
meaningful use as “the excuse to do it.” Dr. Rosof noted that this is
not a new discussion. The question is who would organize the national standards
and who would make the rules. He observed that this could be accomplished
within an existing framework, such as PCPI.

Dr. Middleton asserted that first, what is needed is a sustainable business
case for quality, once the stimulus money has been spent. Panelists commented
on the need to realign incentives to pay for quality rather than volume; and on
the need for system-level accountability. Dr. Kmetik agreed with the idea of
leveraging “the moment of meaningful use,” noting that it is enabling
a different dialogue with vendors and payors. The conversation needs to be
about what everyone has to gain by “doing it right.” In conclusion,
Dr. Tang told the panelists, “measure developers have a lot of power at
the moment,” and they were encouraged to use it.

Building Meaningful Measures: Adoptability

Dr. Eisenberg presented a model of information for quality developed by
HITEP, which is funded by AHRQ and chaired by Dr. Tang. Dr. Middleton
co-chaired a workgroup. HITEP’s task was to develop a quality data set and
identify the workflow for quality in a clinical setting. Dr. Eisenberg noted that
everything starts with guidelines and evidence. Illustrating with diabetes, he
showed the four elements in the specifications of the HITEP QDS: the concept,
the code list, the data type, and finally the data flow. Together these make up
a model of a QDS element. He noted that while quality is the current focus of
use of the QDS, it is also useful for public health reporting and illness
identification and for research. Workflow involves the data source and the
setting. Multiple QDS elements together make up a measure, and all the measures
are stored in a measure database. There will be a mechanism for finding the
elements in EHRs, where they will be stored in consistent ways. This has to be
linked with health IT, and HITSP has done work on how to link each of the data
elements to an information model that EHRs might use—called the Quality
Interoperability Specification. It will take the HITEP quality data elements
and definitions and make sure each one is represented in a setting in the
electronic record.
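
The following sketch (a Python data structure with assumed field names and
example codes, not the actual QDS schema) illustrates the four parts of a QDS
element he described:

```python
from dataclasses import dataclass, field

# Field names and the example codes below are assumptions for illustration,
# not the actual QDS schema. The structure mirrors the four parts of a QDS
# element described above: concept, code list, data type, and data flow.

@dataclass
class QDSElement:
    concept: str                                   # the clinical idea
    code_list: list = field(default_factory=list)  # standard codes defining it
    data_type: str = ""                            # context, e.g. "lab result"
    data_flow: dict = field(default_factory=dict)  # source and care setting

hba1c = QDSElement(
    concept="HbA1c laboratory result",
    code_list=["LOINC 4548-4"],
    data_type="laboratory test, result",
    data_flow={"source": "lab system feed", "setting": "ambulatory"},
)
print(hba1c)
```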

There are gaps on the HITSP side, he said. One was increasing the
granularity of standards harmonization and specification; another was the need
to identify appropriate standards or recommend new ones for such things as
functional status, care experience, communication, and “patient
declined.” Another gap relates to structural measures to indicate an
office’s HIT capacity. As a next step, NQF will soon call for nominees for a
panel to determine what kinds of information routine EHR use can yield to
verify that an EHR is being used for this purpose. He showed a slide of an early
prototype for applying data elements to individual measures. The intent is for
the measure authoring tool to create an electronic measure. It is now being
reviewed by HL7; then it will be tested with meaningful use measures. HITEP was
asked to retool 72 measures.

Dr. Eisenberg reviewed new data types and sources, many of which correspond
to the gaps mentioned above. One issue is how to get information from the
patient. In general, he said, it will be important to be able to identify the
source. For example, some new data sources will come directly from devices.
Finally, he showed a model of data collection, with different levels of
collection from different sources. One issue with the QDS is “maintaining
currency of the very atomic particles for those measures.” Maintenance of
the QDS is “our next job,” he said. NQF has a regular endorsement
process. He described its evaluation process as measures are retooled
electronically. Finally, regarding code list issues, he said NQF is talking
with the National Library of Medicine, CDC, and NCI to get together with
stakeholder groups and figure out how to do this best to avoid creating another
silo.

Dr. Middleton, an NCVHS Subcommittee on Quality member, said he had
experience as both an EMR developer and an implementer and would present an
implementer’s point of view. He identified four components of quality measure
adoptability in HIT:

  • the quality of the measure,
  • its “implementability” in HIT,
  • the practicality of its use in clinical practice with HIT, and
  • its maintainability in implemented HIT.

He then offered and discussed a number of questions for each of the four
areas. (See slides for complete lists.)

The first step is to start with a high-quality measure, which he defined
with a long list of attributes including being well specified, clinically
meaningful, representative, and unbiased. Implementability involves using
standard data elements in the numerator and denominator, and making sure that
HIT implementation does not bias the measure and that the HIT functional
requirements are considered in the measure specification, among other criteria.
The criteria for practicality include whether the standard data elements are
captured automatically in the process of care, and whether the method of data
capture or the data source biases the measure. Also of interest is whether the
measure report can be implemented in a useful way for each user, and whether it
can scale for multiple uses.
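
To illustrate how a well-specified measure reduces to standard data elements,
the sketch below evaluates a toy numerator/denominator pair over hypothetical
patient records (all names and values invented):

```python
# Hypothetical records and element names, for illustration only: a measure
# whose numerator and denominator are defined over standard data elements
# reduces to two simple tests applied to each patient record.

def measure_rate(patients, denominator_test, numerator_test):
    """Share of denominator-eligible patients who meet the numerator."""
    eligible = [p for p in patients if denominator_test(p)]
    if not eligible:
        return None  # measure not reportable for this population
    met = sum(1 for p in eligible if numerator_test(p))
    return met / len(eligible)

patients = [
    {"diagnoses": {"heart_failure"}, "medications": {"beta_blocker"}},
    {"diagnoses": {"heart_failure"}, "medications": set()},
    {"diagnoses": {"diabetes"}, "medications": set()},
]

rate = measure_rate(
    patients,
    denominator_test=lambda p: "heart_failure" in p["diagnoses"],
    numerator_test=lambda p: "beta_blocker" in p["medications"],
)
print(f"Performance rate: {rate:.0%}")  # 50% in this toy example
```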

Dr. Middleton noted that maintainability/adaptability is something that EHR
implementers “wrestle with on a daily basis.” Criteria for this
attribute include whether the measure supports quality reporting at the point
of care and whether it can be updated easily and practically, plus
considerations related to semantic and syntactic integrity, and knowledge
management and curation. He described some of the challenges in the last area.
He noted the efforts to define a common information model for quality
reporting, and said it might be beneficial to have such a meta-model.

Discussion

Dr. Carr pointed out the need for an “intermediary data
aggregator,” and both panelists agreed that having someone in this role is
“a fundamental piece of the picture.” Dr. Carr wondered if national
resources and entities were focused on this function, and on risk adjustment, a
question that prompted further discussion. She proposed that this body of work
will be critical to meaningful measurement. NQF was mentioned as a likely
entity to carry it out at a national level. Dr. Middleton noted the need for a
national architecture for quality data reporting and management. A standard or
cardinal list will help when it is available, he said; in the meantime, it will
be necessary to keep cleaning up the data. The next question is, “Who pays
for that?”

Dr. Tang commented on the proliferation of measures and the existence of
“multiple siloed measure developers.” He noted the possible need for
“some kind of coordinating body over the measure developers” and the
relevance of the QDS in this regard. Dr. Eisenberg agreed that the QDS will
help with consistency and in other ways. NQF is funded to convene the measure
supply chain to start talking about the use and population of the QDS.

Meaningful Measures for Care Coordination

  • Kathryn McDonald, Stanford Health Policy

This session began a series on the development and use of meaningful
measures in several high-priority domains and the challenges in each context.
(The other sessions were on day two.) Dr. McDonald conducts research on health
care quality and patient safety, much of it for AHRQ. Her past work laid the
foundation for a new project on care coordination measure development for
ambulatory care, supported by the AHRQ Quality Indicators Program. She reviewed
the history of that program, which has expanded to address new priorities
including care coordination measures. All its measure development is grounded
in evidence-based medicine, coupled to users’ needs. She likened gaps in
measurement, and the need to pay special attention to them, to gaps in care.
In both cases, it is important to “pay attention to what is
missing,” not just to evaluate what is present.

She then addressed the hearing questions (see executive summary) with
reference to measuring care coordination. Regarding building meaningful
measures, she identified four main concerns: a working definition, conceptual
frameworks, research evidence, and adequately covering the areas most likely to
drive improvement. She discussed two examples of conceptual frameworks to
illustrate the importance of using “some logic regarding the connection
between what might be measured and how the measure could monitor the
situation.” She illustrated by describing the care transitions work of Dr.
Eric Coleman. To answer the question about the measure development process, she
described AHRQ’s QI process as applied to care coordination. One slide
illustrated the steps in evaluating each indicator on the candidate list. She
noted that patients are an important information source in this area; and in
some cases, notably transitions across settings, they are the only source.
Regarding introduction of new data sources, she stressed the importance of
drawing from a variety of data sources. Finally, she cited the experience of
the AHRQ QI program as a model for developing, maintaining and updating
measures, noting that it is a stable reference point and resource.

Ms. Scholle reported on a project on care coordination that NCQA is
conducting with Johns Hopkins and Park Nicollet researchers. She started and
ended her presentation with these key points:

  • Care coordination measures should address structure, process and outcomes.
  • Process measures are most actionable, but are lacking.
  • Process measures should be routine by-products of the care process.
  • Care coordination measures depend on HIT systems that track essential data
    elements and effective workflows for clinicians and staff.

She stressed that care coordination is about information sharing. Structural
measures are an important starting place; process measures get at what is
happening; and while outcomes are most relevant to families and policy makers,
they must be risk-adjusted and are difficult to attribute. NCQA has a grant to
think about this measurement framework with respect to vulnerable children. She
showed a matrix laying out the care coordination measurement approach across
the stages and sites of care.

The aforementioned project has produced a model for ambulatory care
coordination. It also has identified a number of measurement issues, which she
discussed. The major ones concern the urgency and expected timing of the
referral, how to assess effective communication with patients and families,
where accountability belongs, and the differences between integrated and
non-integrated settings. In addition, with EHR-based measurement the project
team is concerned about underreporting of numerators and identification of the
eligible population. She noted that with EHRs, it is important to have
structural measures showing the capability and the workflow.

Discussion

Dr. Carr expressed concern that “we have so much complexity that we
become paralyzed and lose sight of the immediacy” of the questions at
hand.

Dr. Green observed that “it is essential that the development of
meaningful measures move in lockstep with explicitly articulated statements
about the care process.” He illustrated two clinical situations in which
measures that were not “contextualized into the care process” would
be meaningless. Dr. McDonald affirmed that it is necessary to think about
“the exact setting and the exact patient, and what the interdependencies
are.” She added that for this reason, she believes the linear Donabedian
framework is less appropriate for care coordination. The group discussed the
difficulty of assigning accountability related to care coordination.

Dr. Tang returned to the earlier theme of breakthroughs and expressed
concern about “the cost of complying with reporting” and his doubts
that the mechanics of care coordination, as important as they are, need to be
monitored nationally rather than handled locally. He stressed the availability
of new tools to streamline local processes to enhance coordination. Dr.
McDonald noted the possibility of simply asking the patient his or her
perception of coordination. Ms. Scholle acknowledged the validity of the
question about the costs vs. the benefits of the proposed measures; but she
stressed the importance of knowing that the structures are in place that enable
communities to know whether coordination is taking place. This is especially
important in non-integrated settings.

Dr. Carr reiterated her concern that “the complexity is so overwhelming
that we might never get there.” She noted the benefits of simply asking a
question, which can “catalyze a universe of systems that will answer in a
way that is right for them.” Dr. Tang wondered whether we need measure
coordination as well as care coordination.

Mr. Quinn commented on the possibility of using the social networking
paradigm and tools to observe how much “traffic” there is around
patients and to look at the patterns related to care coordination.

Re-Cap and Discussion

Subcommittee members named the following major themes in the hearing thus
far as lessons about measurement that can be applied to national priorities:

  • Transitioning from the way things were done in the past to where we would
    like to go, given new opportunities—for example, the idea of ongoing
    feedback to physicians on how they are doing and how to change
  • Real-time quality management, feedback, and reporting
  • An aspirational approach to quality, rather than a pejorative one,
    possibly combined with a new kind of CME
  • Harmonizing payors around a core set of national standard quality measures
  • Actionable quality—making reports and data actionable
  • A sustainable business case for quality after the Stimulus money is gone
  • Parsimony of quality measures and approach, and what kind of national
    architecture would enable getting quality data from the point of care to CMS
    for insights into payment reform and back to the provider
  • The need for an aggregator/data cleaner
  • The need for concurrency of measurement, clinical decision-making, and care
    delivery
  • Measuring social networks

While praising the expertise of the presenters relative to their respective
fields, Mr. Reynolds expressed disappointment in the absence of any
“grouping up” around a feasible, comprehensive quality measurement
and improvement process that looks at a bigger picture and is appropriate to
current conditions in the U.S. He commented on the pressing need to bring the
industry up to speed on the concepts and practices discussed during the
meeting. He noted that HIPAA, for all its limitations, had the merit of saying,
“This is it”; and ultimately, HIPAA will make a difference because it
“grouped everybody up.” He speculated that if there were a standard
data set, some payors, at least, would “do everything possible to push
people there.” Finally, he stressed that the Committee, with its
understanding of timeframes and national policy, can and should pull together
observations and/or recommendations to help move the ball forward.

Several members echoed Mr. Reynolds’ sentiments. Dr. Middleton reiterated
the need to think about the bigger picture and identify the big issues that
should be put on the table. He wondered about the need for a “quality
czar.” Mr. Reynolds said he would like to see “excitement to go to
the same place and make a difference.”

Meaningful Measures for National Priority Aspects of the U.S. Health Care
System—Carolyn Clancy, M.D., Director, AHRQ

Dr. Clancy began by characterizing the present moment as a mixture of
excitement about the possibilities ahead and humility due to the weaknesses
that must be faced. She noted that feasibility drives existing measurement
activities, sometimes driving out what is important. Now, “right over the
horizon” lies a health care world with ubiquitous data and the possibility
of actually determining what we want to measure. The current phase of
“retooling existing measures or retooling EHRs to capture imperfect
measures” is a step in that direction. She noted the tension regarding the
functions of measurement that arose in the previous day’s discussion, a tension
between “information that makes my job easier” and “pure
old-fashioned accountability.” The field of medicine is headed toward
taking scientific knowledge and tailoring it for the unique needs of individual
patients. Leading-edge institutions are starting to get there, but the
measurement enterprise does not yet have the methods to do so.

Dr. Clancy outlined some of the challenges in this area. Measuring and
reporting clinical quality has “gotten so granular” that it may now
be producing too much information. There is interest in making the information
actionable; for example, efficiency measures are a “huge gap.” The
Keystone Project and NSQIP are closer to the vision of collecting enough
actionable data to guide clinical efforts, which she noted is “the
fundamental purpose for collecting the data.” There are huge challenges
with attribution; for example, measures focus on individual efforts, while the
field is moving toward promoting teamwork. Furthermore, EHRs are now used to
support a transaction-based system that rewards volume rather than quality; and
the information on groups of patients is “primitive.”

The Recovery Act, in both comparative effectiveness and health IT, is
“one bright, promising start,” a down payment on the infrastructure
needed to make health reform sustainable. Health IT can make everything easier
once data collection is part of routine care. This issue needs more study, she
said, along with “the structure in terms of how we record
information,” something that has not changed in 50 years or more.

She called attention to the DARTnet Project as groundbreaking work on a new,
distributed network prototype for data acquisition and aggregation, funded by
AHRQ. In general, the agency is trying to make EHRs more useful for comparative
effectiveness research. The DARTnet project is giving insight into the
incentives for physicians to participate. She noted that data stewardship must
be factored into any data collection/aggregation strategy. AHRQ is attentive to
the pitfall of focusing on data rather than information—”an area
where the Federal government excels.” Another program AHRQ is excited
about is the development of an initial core health care quality measure set and
roadmap for children, funded under the Children’s Health Insurance Program
Reauthorization Act of 2009 (CHIPRA). States will have a list of measures they
can voluntarily report by January 2010. The process has been a highly creative
and collaborative one, with intense stakeholder discussions.

Overall, Dr. Clancy said, now there is an important opportunity for synergy,
and the present NCVHS/Subcommittee on Quality initiative comes at a very good
time. New funding provides a one-time opportunity to focus on infrastructure.
The Institute of Medicine gave guidance on priority topics, including
infrastructure needs, with its Initial National Priorities for Comparative
Effectiveness Research
(Jan. 2009). She noted the IOM recommendation for a
prospective registry to compare strategies in some areas, and said we have to
figure out how EHRs can pre-populate registries in a manner organically
connected with health care delivery.

She outlined a vision for 21st century health care (“Using
information to drive improvement: Scientific infrastructure to support
reform”) with these attributes:

  • Information-rich, patient-focused enterprises
  • Information and evidence transform interactions from reactive to proactive
  • Actionable information available to clinicians and patients “just in
    time”
  • Evidence continually refined as a by-product of care delivery

In addition, quality and disparity must be linked very tightly, which is a
strong focus of the comparative effectiveness investments. She stressed that
this requires local data and solutions customized to particular communities,
while also extracting generalizable knowledge. Finally, she noted the need for
the health IT and quality assessment/improvement communities to interact much
more than they have in the past—something the meaningful use incentives
will facilitate. She added that she is “thrilled that you are thinking
about social networking.”

Discussion

Dr. Tang asked Dr. Clancy to comment on these observations from the hearing
thus far: First, measures need to get simpler, and possibly just focus on
outcomes. Second, consumers are becoming more activated and health-literate;
perhaps they can just be asked how they are doing. Third, physicians need
bi-directional information exchange with relevant, timely content. Finally,
measure developers have transformative power right now because of ARRA.

Dr. Clancy expressed agreement with these statements. As a candidate
“killer app,” she pointed to information on which of their patients are having
challenges with medication adherence, something many doctors want. Also, we need a
strategy to look systematically across an organization or practice to identify
where mistakes were made and balls dropped in the previous week. She stressed
that “losing people” (i.e., records, referrals, etc.) is a major
problem that the system is not tracking or addressing adequately. Also needed
is something to enable greater responsiveness to patients. She highlighted the
“teach-back” approach as a good example.

Dr. Scanlon called attention to the sense of urgency associated with health
reform and contrasted that with the “evolutionary process” in which
the quality world is now engaged. He wondered what EHR vendors are supposed to
do in the interim. In view of this situation, MedPAC is considering a
recommendation to ask not for measures but for “the building blocks for
measures”—e.g., lab values—to provide a faster path while the
measures continue to change. Dr. Clancy responded positively to the idea, and
also noted the HITEP effort to identify a subset of priority measures to
endorse. She noted the training issues involved, given that doctors are not
trained to “look back” or engage in peer review.

Mr. Reynolds invited Dr. Clancy to identify focal areas for NCVHS in which
it could make the greatest difference. She noted the need for more work on the
level of attribution and on how to get at disparities for small
sub-populations; but beyond that, she said she would have to think about it. As
a short-term focus, she suggested looking at the drafts of health reform bills
and identifying common elements among them. Often, when legislation directs the
Secretary to do something, a lot of work is needed to figure out how to do
whatever is directed—in this case, for example, the ideal data collection
strategy for quality measures.

Dr. Green returned to the DARTnet project, noting the unstated assumption in
much measurement work that data must be gathered and stored somewhere. This
project provides an alternate approach, in which EHRs are queried and there is
no permanent data warehouse. Dr. Clancy pointed out that even so, an entity is
needed to facilitate the queries. She speculated on the possibility of merging
the ideas of a distributed data network and prospective registries.
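
The following sketch illustrates the distributed-query idea in miniature:
each site evaluates the query locally and returns only aggregate counts, so
the coordinator never sees raw records and no central warehouse is built.
The sites, records, and query are invented for illustration:

```python
# Invented sites, records, and query, for illustration: each site runs the
# query against its own records and returns only aggregate counts, so the
# coordinator never sees raw patient data and no warehouse is built.

def local_counts(records, denominator_test, numerator_test):
    """Run the query locally; only these two counts leave the site."""
    eligible = [r for r in records if denominator_test(r)]
    return {"eligible": len(eligible),
            "met": sum(1 for r in eligible if numerator_test(r))}

def federated_rate(sites, denominator_test, numerator_test):
    """Pool per-site aggregates into a network-wide rate."""
    eligible = met = 0
    for records in sites:
        counts = local_counts(records, denominator_test, numerator_test)
        eligible += counts["eligible"]
        met += counts["met"]
    return met / eligible

site_a = [{"dx": "diabetes", "a1c_tested": True}]
site_b = [{"dx": "diabetes", "a1c_tested": False},
          {"dx": "asthma", "a1c_tested": False}]

rate = federated_rate([site_a, site_b],
                      denominator_test=lambda r: r["dx"] == "diabetes",
                      numerator_test=lambda r: r["a1c_tested"])
print(f"Network-wide testing rate: {rate:.0%}")  # 50%
```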

Dr. Fitzmaurice commented on the need for leadership “to put it
together” and wondered where the leadership should come from and if the
right kinds of partnerships were in place to move quality ahead. He also
wondered how NCVHS could help catalyze the process. Dr. Clancy stressed the
value of multiple stakeholder input, and in particular the merits of having
articulate consumers at the table who can serve as “game changers,”
changing the conversation from “This is so hard” to “Yes, we
can.” She stressed that consumer engagement is critical, especially for
chronic illness. Finally, there was discussion of how to measure the activity
and success of using science to provide individualized treatment. AHRQ is
investing in studying this over the next few years.

Meaningful Measures of Disparities

Mr. Moy works on the National Health Care Quality and Disparities Reports.
He approaches the meaningfulness question within ONC’s perspective, i.e., in
the context of quality improvement: specifically, data capture and sharing on
subgroups and populations experiencing health care disparities in ways that are
amenable to quality improvement by improving clinical processes. He pointed out
that to reduce disparities, disadvantaged groups have to improve at a greater
rate than advantaged groups. He asserted that it is meaningful and necessary to
look at disparities to improve quality, and to combine data on quality and
disparities to target interventions. He cited many gaps, starting with the
insufficiency of data capture by subgroup. Disparities in outcomes are not
narrowing, he reported.

Disparities data can be used to guide quality improvement activities by
targeting the problem to increase efficiency, guiding interventions to increase
effectiveness, and tracking progress to make sure the interventions are really
having an effect. Disparities in specific areas (e.g., colorectal cancer
screening) can vary widely across states, so it makes sense to work on the
states with the biggest gaps. There are also wide variations, not just among
cities but in areas within them. In general, the smaller unit you can get to,
the more valuable and actionable the data are. Mr. Moy also called attention to
the wide variations among ethnic subgroups and the need for targeting along
those lines; and the same is the case with variations related to language
proficiency.
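
The targeting logic he described can be illustrated with a small sketch that
stratifies a measure by subgroup and ranks groups by their gap from the best
performer (all figures invented):

```python
# All figures invented. Stratify a measure by subgroup, then rank groups by
# their gap from the best performer so improvement work can be targeted.

rates_by_group = {  # e.g., colorectal cancer screening by subgroup
    "group_a": {"screened": 720, "eligible": 1000},
    "group_b": {"screened": 540, "eligible": 1000},
    "group_c": {"screened": 610, "eligible": 1000},
}

def rate(counts):
    return counts["screened"] / counts["eligible"]

best = max(rate(c) for c in rates_by_group.values())
gaps = sorted(((best - rate(c), name) for name, c in rates_by_group.items()),
              reverse=True)
for gap, name in gaps:
    print(f"{name}: {gap:.1%} below the best-performing group")
```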

Finally, he offered these personal recommendations based on the above
observations: It is important to support collection of disparities data
consistent with OMB recommendations, and to collect information on English
proficiency.
For the granular data, he recommends identifying the areas most meaningful for
national tracking. States and localities need to define what ethnicities and
languages are most relevant to local circumstances. Targeting specific
subgroups in specific areas may be an efficient way of improving quality.
Finally, we need to assess our disparity data measurement activities to see if
they actually change processes and improve quality.

Dr. Taylor-Clark focused on three topics: current measures of equity,
opportunities and challenges of current measurement strategies, and what makes
disparities measurement meaningful. Equity is one of the IOM’s six domains of
quality, but there are virtually no measures for it. To measure it, clinical
effectiveness measures (or others) are stratified. One challenge in creating
meaningful disparities measures is that “we simply do not have a
discipline.” There are no standard race, ethnicity, and language data
across organizations. In addition, we need the ability to integrate data
systems from demographic, claims, clinical, and lab sources, among others.
Another challenge is that there are no incentives to collect, report, or
utilize the data.

On the first challenge (no standard data), an IOM recommendation issued in
August acknowledged the need for standard data. This raises the tension between
national and local realities—that is, the need for the large OMB
categories at the national level and for more detailed demographic data at the
local level. Thus, we need a standard way to roll up the data into standard
categories, which the IOM has endorsed. Still needed is a map to help health
care organizations and locales make valid comparisons and roll up the data.
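
A minimal sketch of such a roll-up map, assuming a simple crosswalk from granular, locally meaningful categories to the broad OMB categories; the mapping below is illustrative only, not the IOM's or OMB's actual crosswalk.

    # Hypothetical roll-up of granular local ethnicity categories into
    # broad OMB categories, preserving local detail while keeping national
    # comparisons valid. The mapping is illustrative, not an official one.
    from collections import Counter

    GRANULAR_TO_OMB = {
        "Hmong": "Asian",
        "Vietnamese": "Asian",
        "Somali": "Black or African American",
        "Dominican": "Hispanic or Latino",
        "Mexican": "Hispanic or Latino",
    }

    def roll_up(patient_categories):
        """Collapse granular categories into OMB categories for national reporting."""
        rolled = Counter()
        for cat in patient_categories:
            rolled[GRANULAR_TO_OMB.get(cat, "Unmapped (needs local review)")] += 1
        return rolled

    local_data = ["Hmong", "Somali", "Dominican", "Hmong", "Karen"]
    print(roll_up(local_data))
    # Local analyses keep the granular codes; only rolled-up counts feed
    # national tracking.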

Dr. Taylor-Clark then commented on the need for an integrated data system.
The challenge is that data systems do not speak to each other well enough to
combine data from different sources, including non-traditional ones such as
employers. A major difficulty for health care organizations is getting race and
ethnicity data from their patients. Regarding aligning incentives for data
collection, reporting and use, she described activities by the Joint
Commission, NQF and NCQA to develop measures of equity. In addition, the
Brookings Institution has a Racial/ethnic Health Care Equity Initiative,
connected to its High-value Health Care Project.

Discussion

In response to a question, both speakers said they look at socioeconomic
disparities and effects in their work as well as at race and ethnicity. Mr. Moy
said the important thing is to be able to identify groups experiencing
difficulties in order to target them efficiently. Dr. Taylor-Clark noted that
disparities still exist based on race and ethnicity after controlling for
socioeconomic position.

Asked to comment on the idea of patient-centered customization with respect
to cultural differences, she recommended targeting medical education to help
students learn to develop culturally appropriate interventions using
patient-centered measures. Mr. Moy added that it would be most efficient to
cluster population subgroups and share the interventions and tools with other
parts of the country where they could be useful.

Dr. Scanlon noted that NCVHS has issued recommendations on race and
ethnicity reporting, including a recommendation for granular data—not at
the provider level, but collected once. He asked if IOM was recommending,
similarly, that the data be collected once and transmitted. Dr. Taylor-Clark
commented on the need for data transfer protocols among health care providers.
Mr. Moy reiterated that he encourages providers to start collecting information
on the groups that are particularly meaningful to their practice.

Dr. Green praised Mr. Moy’s graphic presentations for the Disparities Report
as a model of simple, pragmatic information reporting. Finally, the group
discussed the principle of patient-centeredness, including patient preferences,
and what it might entail and how to measure it within care. Dr. Taylor-Clark
stated that outcomes analysis shows that preferences are not driving many
disparities in outcomes. Mr. Moy noted the importance of patient education to
ensure that “preferences” represent informed choice.

Meaningfully Measuring Value and Efficiency

Dr. Roski began by reviewing the basic nomenclature of his field, citing the
AQA definition of value of care: “a measure of a specified
stakeholder’s preference-weighted assessment of a particular combination of
quality and cost of care performance.” The other key terms are cost
and efficiency. The cost of care is viewed and measured differently
depending on the perspective of cost to whom (consumer, plans/employers,
providers, or society), so it is important to be clear about which perspective is
adopted. Approaches to units of cost are unit-based, episode-based, per capita
(person-based), or a combination. Dr. Roski noted that “episode of care
measures are not a panacea,” partly because of the potential for bias. He
discussed different measure development approaches, including proprietary
episode-based approaches, transparent (public domain) episode-based approaches,
and transparent condition-specific per capita approaches. He noted problems
with the first category, whose episode definitions are proprietary and not publicly known.
Brookings is working with ABIM on a similar but transparent approach for 12 (of
500) conditions.
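
To make the unit-of-analysis distinction concrete, the sketch below contrasts a condition-specific per capita cost with an episode-based cost; the 90-day episode window and all field names are simplifying assumptions, not any developer's actual algorithm.

    # Illustrative contrast between two of the cost units discussed here:
    # condition-specific per capita cost vs. episode-based cost. The 90-day
    # episode window after a trigger event is an arbitrary assumption.
    from datetime import date, timedelta

    claims = [  # hypothetical administrative claims for one patient
        {"patient": "p1", "date": date(2009, 1, 5),  "cost": 120.0},
        {"patient": "p1", "date": date(2009, 2, 20), "cost": 450.0},
        {"patient": "p1", "date": date(2009, 8, 1),  "cost": 95.0},
    ]

    def per_capita_cost(claims, n_patients):
        """Total condition-related spending divided by the condition population."""
        return sum(c["cost"] for c in claims) / n_patients

    def episode_cost(claims, trigger_date, window_days=90):
        """Sum only claims falling inside one episode window after a trigger."""
        end = trigger_date + timedelta(days=window_days)
        return sum(c["cost"] for c in claims if trigger_date <= c["date"] <= end)

    print(per_capita_cost(claims, n_patients=1))   # 665.0 per person
    print(episode_cost(claims, date(2009, 1, 5)))  # 570.0 for one episode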

He then enumerated the key challenges for measure developers pertaining to
focus (costs, resource use, or paid amounts?), methods, data sources,
standardization, comprehensiveness, risk adjustment, linking measures of cost
and quality, consensus, and wide-scale implementation. The first question with
episode-based measurement is defining an episode, and then determining how to
“approximate out of administrative data some clinical concepts of an
episode of care.” He discussed issues with linking clinical and administrative
data, and pointed to a project to link registry data with WellPoint
administrative claims in all California hospitals. On the concept of
efficiency, although NQF has not yet endorsed any cost measures, some are
expected early next year; then work will be needed on how to link cost and
quality measures to get to efficiency.

Regarding consensus, Dr. Roski noted the many questions about how to
implement efficiency measures consistently, and about what kind of
infrastructure will be pragmatic. He predicted increasing data sharing between
providers but different arrangements and capacities in different environments,
making it difficult to achieve the ideal of communitywide health information
exchange. The challenge with such diversity is to avoid being too prescriptive
while still consistently extracting numerators and denominators and giving
providers value-added information out of data exchange. Dr. Roski and
colleagues think the practical path forward involves distributive data models.
They are experimenting with a way to consistently query health plan data for
performance information. He asserted that creation of the right environment for
this work “requires leadership on the federal side or on the public sector
side in terms of coordination and planning.” He noted the role of NCVHS in
figuring out the strategic vision so that implementation can happen in a
coordinated way in the public and private sectors.
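
One way to read the consistent-extraction challenge is to specify the numerator/denominator logic once and let each environment supply its own adapter from local data into a small common record, as in this sketch (all schemas and names are hypothetical):

    # Sketch of one possible approach: the measure's numerator/denominator
    # logic is written once and shared, while each health plan supplies its
    # own adapter from local formats into a common record. Hypothetical.
    from typing import Iterable

    def screening_measure(records: Iterable[dict]):
        """Shared specification: denominator = eligible, numerator = screened."""
        denom = [r for r in records if r["has_condition"] and r["age"] >= 50]
        numer = [r for r in denom if r["screened"]]
        return len(numer), len(denom)

    def plan_a_adapter(row: dict) -> dict:
        """Plan A stores data its own way; only the adapter is plan-specific."""
        return {"age": row["member_age"],
                "has_condition": row["dx_code"].startswith("250"),
                "screened": bool(row["screen_flag"])}

    plan_a_rows = [{"member_age": 62, "dx_code": "250.00", "screen_flag": 1},
                   {"member_age": 55, "dx_code": "250.02", "screen_flag": 0}]
    num, den = screening_measure(plan_a_adapter(r) for r in plan_a_rows)
    print(f"{num}/{den}")  # 1/2 -- the same logic can run behind any plan's adapter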

  • Michael Rapp, MD, JD, CMS: Value-Based Purchasing–Combining Cost and
    Quality

Dr. Rapp noted that CMS always has to focus on “practical ways
forward” because of its role in implementing legislation. CMS defines the
term value in the context of value-based purchasing (VBP) more broadly
than the AQA definition cited by Dr. Roski. Currently, CMS is only authorized
to pay differentially for better quality for end-stage renal disease. There is
broad support for and active work on VBP at CMS, which implements it in several
demonstrations. The basic concept is to bring together cost and quality. Dr.
Rapp reviewed the major challenges involved, related in particular to the
level of attribution and the definition of the episode of care. Another
question concerns what constitutes a valid cost measurement. He noted that at CMS, “we always have to
think about the adverse consequences of anything that we do.”

CMS would like to move toward outcome measures; it now uses about
74. Dr. Rapp outlined some of the agency’s measurement and reporting projects
and initiatives, notably on 30-day mortality and readmission data for AMI,
heart failure, and pneumonia. CMS also plotted and mapped geographic
variation in these measures, showing significant differences among states.
activity and data have received considerable public attention. CMS was required
to submit to Congress in 2007 tentative measures for VBP, and they included a
30-day mortality measure. He noted that for another initiative, on
hospital-acquired conditions, the projected savings are not large enough to “save
the Medicare program.” PQRI has about 170 measures, including a few
outcome measures. Many hospital measures concern outcomes, and home health
measures are mostly outcome measures. As for how CMS acquires data, Dr. Rapp
said it does not use a distributive model; “We collect the data.” CMS
has 74 registries, all of which have to work “exactly the same.”

Regarding moving to episodes, he commented on the advantages of using
hospitalization as a “bright line,” including as a starting point for
post-hospitalization measures. He noted that payment in the post-acute world is
based on functional status assessment – which “could tell us
something about the future.” He described the Care Instrument being used
in a demonstration project. In the future, EHRs could include items such as
functional status to enable assessment and care coordination. On coordination,
he noted that the 9th QIO scope of work has the theme of care
transitions, related to the re-hospitalization issue.

Discussion

NCVHS members commented on what role CMS might play in the transition to
meaningful use of EHRs, episode-based measurement, and other policy directions.
They stressed the power of any measures selected by CMS, because of its size.
Dr. Rapp declined to “speak to the future,” noting that CMS does what
Congress directs it to do. He also stressed that CMS has been interested for a
long time in moving in the direction now being pushed by the HITECH
legislation. He cautioned that it will take a long time to get there.

Dr. Scanlon called attention to the MDS/OASIS model as a good one for the
future, in contrast with “the measurement approach,” because OASIS
started with what a provider should know about individuals to provide good
care. If Medicare moved in that direction, he said, it could be a powerful
force and contribute to meaningful use of EHRs.

Dr. Green expressed hope that the field can move toward a clinically
meaningful definition of episodes as part of patient-centered care and
measurement. Dr. Middleton commented on the goal of introducing the idea of the
velocity of disease into data and measurement through the focus on episodes. He
and Dr. Rapp agreed on the importance of including the patient’s perspective in
approaches to health care quality.

Dr. Tang commented on the potential of the proposed CMS acceptance of
electronic data in PQRI (Dr. Rapp said a decision is expected in November) as
a model that might be scaled up into a broader pathway.

Finally, in response to a question, Dr. Rapp noted the significant impact of
public reporting in motivating quality improvement by providers. He cautioned
that it remains to be seen whether such improvements will translate into cost
reductions in health care.

Meaningful Measures of Integration, Population Health and Health Status:
Healthy People 2020

  • Linda Harris, Ph.D., HHS Office of Disease Prevention and Health
    Promotion

Dr. Harris said her team could partner with NCVHS “as we learn together
about how to measure the life and health status of Americans and join in
understanding how public health, population health and clinical outcomes might
be viewed together.” She characterized Healthy People, which is in its
fourth decade, as combining loftiness and a grounding in rigor. Its priorities,
goals and objectives are truly national in that they are developed by a
coalition of grassroots participants. The list of objectives is still being
developed; about 2,000 objectives will soon be released for public
comment. Healthy People 2020 will be launched around December 31, 2010.

Healthy People 2020 will for the first time take an “ecological
perspective.” Instead of limiting its objectives to conditions, it will
look at the following determinants of health: health services, biology and
genetics, individual behavior, social environment, and physical environment.
Dr. Harris observed that NCVHS in its current process is more focused on
illness, conditions and health care, while Healthy People “is about
preventing conditions from happening in the first place.” In the area of
health communication and health IT, the goal is to move toward a learning
system, using an “interactional view” of health.

Every topic area has a working group. The group working on health
communication/health IT has tried to create an integrated view. She reviewed
the health communication and health IT objectives within each health
determinant area, to provide insight into the Healthy People approach, noting
that surveys are its main data source. The objectives include:

  • Individual behavior: improved health literacy and greater use of
    electronic personal health management tools
  • Social environment: increased social support
  • Physical environment: increased best practices in risk communication
  • Biology and genetics: increased personalized health guidance
  • Health services: improved patient-provider communication, which is very
    important (ODPHP hopes to measure the extent to which patients feel
    “part of the process”), plus increased use of health IT and advanced
    connectivity

ONC is part of the team developing these objectives.

The team looks at communication as a series of interactions between systems,
“which is where decisions are made and change happens.” Dr. Harris
discussed Dr. Ed Wagner’s Care Model, which her team sees as a promising way to
understand “productive interactions between an informed, activated patient
and a prepared, proactive team,” something that Dr. Wagner regards as a
goal for health care. She proposed that her team’s health communication and
health IT perspective might contribute to the work of NCVHS as it contemplates
meaningful measures, adding that she hopes for continued conversations and
shared learning. The bottom line, she noted, is that “we have to have the
measures.”

Discussion

Dr. Carr asked about accountability and how to identify the causes of
improvements using survey data. Dr. Harris said Healthy People planners want
some measures in which providers have some accountability; she also noted the
benefits of a social determinants approach where data on several areas are
available. She agreed with Dr. Fitzmaurice about the merits of
“piggybacking” on the meaningful use measures now under development,
particularly to understand “the sweet spot between population health and
health service delivery.” ODPHP is working on personal health management
and social networking, collaborating with the Pew Foundation around Pew’s
understanding of “the social life of information.”

Dr. Green noted the learning available to NCVHS from the Healthy People
focus on measures independent of conditions. He talked with Dr. Harris about
measures of feedback for people about their health. She commented on the
“provider-focused” approach to episodes in an earlier NCVHS
discussion and put forward a more person-centered concept as an alternative,
noting that “people can pretty reliably give us what an episode means for
them.” Asked how to determine whether people are getting reinforcement for
promoting their own health, she described new types of PHRs that are designed
with feedback mechanisms. Dr. Ed Wagner is very interested in feedback; this is
an area, she said, in which “we need to learn together.”

Dr. Tang drew attention to the 2,000 Healthy People measures being
circulated for public comment and asked about any efforts to condense them and
create, for instance, a survey tool with 20 or 50 items. He wondered how NCVHS
could help. Dr. Harris said everyone wants this kind of reduction and it is
important, but no one has stepped up to do it. The challenge
is to prioritize and identify the 10 or so most important measures. She
stressed the current “opportunity for cross-fertilization” among the
Secretary’s advisory committees and suggested that NCVHS talk to Jonathan
Fielding, the chair of the Healthy People FACA.

  • Floyd Eisenberg, M.D., NQF

Dr. Eisenberg noted that NQF has the set of priorities set by the National
Priorities Partnership (see earlier NQF presentations), and it plans to work
with Healthy People 2020 on coordinating work across the six priorities. Dr.
Harris added that the Institute of Medicine has offered similar assistance.
Groups are being brought together over the next several months to call for
measures in the six priority areas, and some Healthy People objectives may fit
in.

NQF is trying to encourage use of available electronic data. Some of this,
Dr. Eisenberg said, may require research—for example, on how to measure
social networking to show engagement and education of patients and families.
NQF is trying to coordinate measures from the electronic data stream. A first
group of recommended measures is being retooled for EHRs; the process will be
finished by March 2010, and they will be available for 2011. NQF also plans to
add clinical decision support and coordination of care elements to the Quality
Data Set (QDS). He reviewed the information he presented the previous day
regarding the QDS framework. Altogether, there is a set of data types; for
each, the developers are considering where in an electronic information model
the information would be found and where it would be shared. (He agreed with Dr.
Carr that this idea resonates with Dr. Scanlon’s idea of a repository of
queriable data elements.) The framework has been created; now it has to be
implemented. It includes multiple data elements, which can be reused.
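
As a rough illustration of the reusable-element idea (not NQF's actual QDS schema), a data element might couple a standard category, a code list, and an expected location in the EHR information model:

    # Hypothetical illustration of a reusable quality data element in the
    # spirit of the QDS framework; the structure and field names are
    # assumptions for illustration, not NQF's actual schema.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class QualityDataElement:
        name: str
        category: str          # e.g. "Diagnosis", "Medication", "Lab Result"
        value_set: frozenset   # codes defining the clinical concept
        ehr_location: str      # where the element is expected in the record

    DIABETES_DX = QualityDataElement(
        name="Diabetes diagnosis",
        category="Diagnosis",
        value_set=frozenset({"250.00", "250.02"}),  # illustrative ICD-9 codes
        ehr_location="problem_list",
    )

    def element_present(patient_record: dict, element: QualityDataElement) -> bool:
        """Reuse: the same element can serve many different measures unchanged."""
        codes = patient_record.get(element.ehr_location, [])
        return any(code in element.value_set for code in codes)

    patient = {"problem_list": ["250.00", "401.9"]}
    print(element_present(patient, DIABETES_DX))  # True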

Dr. Eisenberg also agreed with Dr. Carr that this could represent a
blueprint for vendors. He said vendors are expressing concern that their
ability to innovate will be constrained if the data elements are prescribed,
and he asked NCVHS members for their views on this. In response, Dr. Scanlon
critiqued the notion “that standardization is holding back
innovation” and asserted that “that argument needs to be ignored very
quickly.” Dr. Tang agreed, saying that “we ought to just move on and
say it has got to be done.” Dr. Eisenberg said this is the direction NQF
has taken. The innovations, several suggested, can be around improving
workflow.

Summary, Discussion and Next Steps

To conclude the hearing, NCVHS members reviewed the major themes they had
heard and discussed the Subcommittee’s next steps.

Dr. Tang noted that while the Subcommittee had learned a lot about current
activities, it also became aware of the lack of an overarching strategy or
framework to move measures into the new era of richer data and tools. Returning
to the notion of a “killer app,” he highlighted the goal of
bidirectional data flow, in which data are rapidly analyzed and returned to
those making everyday health care decisions. In addition to an overarching
vision and framework, he proposed that a coordinating body may be needed to
guide measure development.

Dr. Scanlon noted the difficulty of knowing whether the huge amount being
spent on health care and health IT is making any difference. He pointed to
current IT capacities that make it possible to get information back to
providers within hours. Noting the hard work focused on measures, which leads
to greater proliferation of measures and complaints from vendors, he called for
creation of “a Swiss Army Knife” of data that is amenable to multiple
uses. He called attention to the potential linkage between such information
flows and HIPAA, including its privacy dimension, and urged that NCVHS move
forward to bring the full breadth of the Committee’s perspectives to bear on
the current problem.

Dr. Tang pointed out that in addition, the rapid movement on meaningful use
policy provides an opportunity to help get “some good measures out
there.” He and the other members agreed on writing a letter to the
Secretary with the Committee’s advice, for approval at the November full
Committee meeting. In addition to relating the meaningful use proposal to the
findings from this hearing, the Subcommittee can consider what measures
would be meaningful for assessing the health system and influencing its
direction in health reform.

Dr. Middleton encouraged attention to Dr. Harris’ suggestion about
interfacing among the FACAs; he noted that this could lead to higher quality as
well as more coordinated and effective advice to the Secretary.

Dr. Carr pointed to the lack of a “sense of urgent action” in the
hearing presentations, although “timeliness is an undercurrent of
everything.” Mr. Quinn proposed that while many people are doing good
work, some of it collaboratively, there is a leadership vacuum. The urgency of
the current situation calls for a new governance and accountability framework.

The group agreed that members would individually list the ten most important
themes, which will be grouped into categories and provide the foundation for developing a
letter for November.

The Co-Chairs then adjourned the hearing.


Appendix: Agenda, Presenters, and Participants

Agenda and Presenters

Moderators: Justine Carr, M.D. and Paul Tang, M.D., Subcommittee Co-Chairs

Setting priorities for measurement

NQF National Priorities Partnership and NQF work towards meaningful use
measures:

Helen Burstin, M.D., MPH, Senior Vice President for Performance Measures,
The National Quality Forum (NQF)

What makes a measure meaningful?

  • Development process
  • Adoptability
  • Right measures
  • Outcomes vs. process measures
  • Structural vs. behavioral measures
  • Subject Areas

Helen Burstin, M.D., NQF

David Reuben, M.D.,
UCLA School of Medicine; Chair-elect, American Board of
Internal Medicine (ABIM) Board

Current measure development, endorsement, and adoption process

  • Participants and roles
  • Data sources
  • Strengths
  • Shortcomings
  • Linkage with EHRs
  • What aspects of the current process support development of meaningful
    measures?
  • Which don’t?
  • Addressing sub-populations
  • Use of new data sources (e.g. EHRs and user-generated)

Bernard Rosof, M.D.,
Senior V.P. for Corporate Relations and Health Affairs, North Shore Long
Island Jewish; Chair, Physician Consortium for Performance Improvement

Karen Kmetic, Ph.D.,
Director of Clinical Performance Evaluation, American Medical
Association

Sarah Scholle,
Assistant V.P. for Research, NCQA

Frank Opelka, M.D.,
Louisiana State University Healthcare Network; National
Surgical Quality Improvement Program (NSQIP)

Building meaningful measures – adoptability

  • Specifications
  • Linkage with Health IT
  • New data sources
  • Data collection
  • Update/keeping measures current

Floyd Eisenberg, M.D.,
Senior VP, Health Information Resources, NQF

Blackford Middleton,
M.D., Chairman, Center for Information Technology, Partners Healthcare;
NCVHS Subcommittee on Quality

Meaningful measures for care coordination

  • Current measures
  • Strengths
  • Weaknesses
  • What makes a measure meaningful?
  • Recommendations

Kathryn McDonald, Executive Director and Senior Scholar, Stanford Health
Policy (by phone)

Sarah Hudson Scholle,
NCQA

Meaningful measures for national priority aspects of the U.S. health care
system

Carolyn Clancy, M.D.,
Director, AHRQ

Meaningful measures of disparities

  • Current measures
  • Strengths
  • Weaknesses
  • What makes a measure meaningful?
  • Recommendations

Ernie Moy, AHRQ

Kalahn Taylor-Clark,
Ph.D., MPH, Brookings Institution

Meaningful measures of value, including efficiency

  • Current measures
  • Strengths
  • Weaknesses
  • What makes a measure meaningful?
  • Recommendations

Joachim Roski, Ph.D.,
MPH, Managing Director, High-Value Healthcare Project, Engelberg Center for
Healthcare Reform, Brookings Institution

Michael Rapp, M.D.,
Director, Quality Measurement and Health Assessment Group, Office of
Clinical Standards and Quality, CMS

Meaningful measures of integration, population health and health
status

  • Current measures
  • Strengths
  • Weaknesses
  • What makes a measure meaningful?
  • Recommendations

Linda Harris, Ph.D.,
Lead, Health Communication and ehealth Team, ODPHP/DHHS

Floyd Eisenberg, M.D., NQF (by phone)

Summary, Discussion and Next Steps

Meeting participants and attendees:

NCVHS Subcommittee on Quality members:

  • Justine Carr, M.D., Co-Chair
  • Paul Tang, M.D., Co-Chair
  • Larry A. Green, M.D.
  • Blackford Middleton, M.D.
  • William Scanlon, Ph.D.
  • Harry Reynolds (NCVHS Chair)

NCVHS Staff and Liaisons

  • Debbie Jackson, NCHS
  • Katherine Jones, NCHS
  • Matt Quinn, AHRQ
  • Mike Fitzmaurice, AHRQ liaison

Others (not including presenters)

  • Cynthia Sydney, NCHS
  • Zoe Hruban, ODPHP
  • Kristin Anderson, HHS
  • Sean Arayasirikul, HHS
  • Jennifer Shevchek, AMA
  • Rebecca Zimmermann, AHIP
  • Mari Savickis, AMA
  • Shari Ling, CMS
  • Allison Viola, AHIMA
  • Ann Greiner, ABIM
  • Rod Piechowski, AHA

[1] ncvhs.hhs.gov (attached
to the agenda on the calendar posted on the home page; use appropriate date to
locate)

[2] Health Information
Technology for Economic and Clinical Health Act