[This Transcript is Unedited]

Department of Health and Human Services

National Committee on Vital and Health Statistics

Workgroup on Quality

Use of Administrative and Clinical Electronic Data for
Quality Assessment Hearing

June 19, 2007

Hubert H. Humphrey Building
Room 505A
200 Independence Avenue, SW
Washington, D.C. 20201

Proceedings By:
CASET Associates, Ltd.
10201 Lee Highway, Suite 180
Fairfax, Virginia 22030
(703) 352-0091


P R O C E E D I N G S [8:30am]

Agenda Item: Introductions and Purpose

DR. CARR: I think we are ready to begin. Good morning. This is a hearing of
the Quality Workgroup of the National Committee on Vital and Health
Statistics. I am Dr. Justine Carr, from Beth
Israel Deaconess Medical Center, chair of this committee and member of NCVHS. I
have no conflicts on what is being discussed today. I will ask as we go to the
right each person to introduce themselves.

[Introductions around the room.]

DR. CARR: Welcome everyone, and thank you for being here. I know we have a
couple of workgroup members who are splitting their time with the Privacy
Committee and we have another workgroup member in a cab arriving momentarily. I
think we want to make sure that everyone has their time that is due them today.
So, with that, we would like to stay on time. I actually had a pre-introductory
slide to help frame the day. As we know, there is strong momentum from many
sectors to measure and report quality of healthcare. As we will hear more about
today, the burden of collection and reporting is high. With the goal of
electronic health records by 2014, the hope is that some of the burden will be
lifted. Our focus today is a discussion of how we are functioning in the
hybrid era, where quality reporting is derived from both administrative and
electronic sources. Our key question is how are we doing, and how is care
improving? The questions for the speakers are briefly: one, describe your
initiative, but then what data did you select and why? What resources were
required? How did you ascertain data reliability?

A second question, and the most important, how did you use your data? What
interventions were triggered, and how did it affect the quality of care? So in
other words, as we hear about the burden of collection and the various
configurations of data collection, we do not want to lose sight of the fact
that, however we get there, the "there" is quality of care. We will be very
interested in lessons learned. What works and what things might inform the
configuration of the electronic health record going forward?

This one you just have to keep tapping. This is my picture of measuring
quality in the hybrid world. Up in red there is safe, effective care. As you
know, we create and define measures. We collect the data and put it in
electronic format. As you can see, we sometimes abstract from paper records,
abstract even from electronic records. We depend on administrative data and we
are adding and will hear today about how adding some electronic elements such
as lab or medications has helped us.

The goal is that the data can be aggregated, reported back, and acted upon
with a final goal of improving safe and effective care. I think we are well
aware of the fact that there is a lot of activity going on about aggregating
data and the electronic health record. I just wanted to focus on what we
want to talk about today. What is getting better? What is the burden of
collecting this data? What is the return once we have this data?

So, I want to give special thanks to Marybeth Farquhar from AHRQ and
Cynthia Sydney who were indispensable in getting all of this organized, and of
course the members of the Quality Workgroup for their insight and
recommendations. So, again, as a reminder, I ask the speakers to keep to
twenty-minute presentations and leave ten minutes for discussion. I am also
reminded to ask you to keep your Blackberries away from your speakers. If you
do not, you will find out what happens. I would also like to welcome two
additional members. Carol, do you have any additional comments? And then Simon?

MS. MC CALL: My name is Carol McCall. I am a member of the Quality
Workgroup as well as a member of the NCVHS full committee. I just want to thank
you for taking the time to be with us today. We are very excited to hear the
stories. As Justine has said, it is not about the data. It is about how you use
it. So, we are anxious to hear the stories about what works, what you have been
able to achieve that can inform some of the policies and processes as we move
forward. So, thank you.

DR. COHN: I am Simon Cohn and I chair the full committee. I am here as a
guest of the workgroup today. I will be here for most of the day.

DR. CARR: Okay, well I think we are even a little ahead of schedule, but I
would like to invite Crystal and David and Allison to move forward then with
their framing of the testimony that we will hear today.

Agenda Item: Framing the Testimony – AHIMA

MS. VIOLA: Dr. Carr, members of the Quality Workgroup, and ladies and
gentlemen, good morning. I am Allison Viola, Director of Federal Relations at
the American Health Information Management Association (AHIMA). Joining me this
morning are Crystal Kallem, Director of Practice Leadership at AHIMA, and Dave
Gans, Vice President of Practice Management Resources at the Medical Group
Management Association (MGMA). They will be providing detailed testimony
regarding the issues surrounding healthcare data collection and reporting. On
behalf of AHIMA and MGMA and their members, thank you for allowing us this
opportunity to
provide input on the issues and challenges associated with collecting and
reporting healthcare data. Although we have developed written testimony, and
you should have this documentation with your handouts, I would like to turn the
discussion over to Crystal, who will delve a little bit deeper into the
issues to provide a more practical overview of the challenges faced by
increased quality measurement and reporting initiatives.

MS. KALLEM: Thank you, Allison. Thank you, Dr. Carr for inviting us. AHIMA
and MGMA are very aware of the current environment related to healthcare
quality. A large number of our members are managing the data collection and
reporting responsibilities within healthcare facilities on a daily basis and
continue to express concerns surrounding the ever-mounting requests for data.
As a result, we began to see the need to highlight this critical issue on a
broad scale. So, AHIMA and MGMA formed a partnership and approached the Agency
for Healthcare Research and Quality with a proposal to gather key stakeholders
from the industry to help us identify solutions and direct change. We greatly
appreciate the decision by AHRQ to fund both an invitational conference and a
task force of our members to develop supporting conference materials. The
taskforce was composed of a group of health information and office management
experts who helped identify the issues and the variations associated with
performance measurement, data collection, and reporting.

The findings from the taskforce address the impacts on healthcare providers
and organizations forced to respond to the ever-increasing reporting
requirements, including the lack of uniform data collection and analytic
specifications, the lack of qualified staff to support the requirements for
data, technological challenges, organizational challenges, economic pressures,
and other competing priorities.

These findings laid the foundation for productive dialogue during the
conference. Although the taskforce did not have enough time to quantify the
specific costs of the current obligations on providers at the time we created
the report, we were able to categorize and describe the scope of these issues
that contribute to the increased costs and demands. This invitational
conference was held last November. We brought together over 50 experts from
public and private healthcare organizations to address how best to collect and
report data for quality, public health, and performance initiatives. The
participants represented a wide array of stakeholders, including hospital and
physician organizations, payers, employers, government agencies, accrediting
agencies, and other stakeholders with performance measurement and data
management background and expertise. The briefing paper developed by the
taskforce and the full conference report can be obtained at the link provided
on this slide.

So, as the industry moves forward there are a large number of issues and
challenges related to this topic. With the widespread adoption of electronic
health records, interoperability, and pay-for-performance programs, the need to
align these initiatives is becoming vital. Dr. George Isham from
HealthPartners allowed us the opportunity to share this slide with you that provides
a visual depiction of the various demands on healthcare organizations and
providers as they deal with the increasing and disparate requests for data.

At the same time, providers continue to struggle with staffing shortages,
tighter reimbursements, and pressures to accomplish more with less, making their
ability to meet these various requirements an increasing concern. Not only
are there a large number of organizations demanding data, but each requestor of
data has its own set of requirements and specifications to comply with. One
quality measure could have varying specifications among two or more requesting
organizations. A healthcare provider must decipher each measure’s numerator and
denominator statements, data elements and abstraction specifications, allowable
data sources, data submission deadlines, analytic specifications, and the
list goes on. So, you can have one measure (for instance, diabetes
hemoglobin A1c) that is requested by five different performance measurement
requestors, and they could all have different specifications.

These issues are present in both the paper and electronic environments. In
an electronic environment, providers must map the data from their existing
systems to the various performance measurement data requirements in an effort
to obtain appropriate high quality data. Not only does this data need to be
mined from their electronic systems, but it has to be formatted to comply with
each requestor’s data submission requirements.

In preparing for today’s discussion, I wanted to provide some specific
details regarding how the data collection process works. Hackensack University
Medical Center allowed me the opportunity to share with you a copy of their
manual data collection workflow diagram. This diagram depicts the actual
process used by the organization to manually collect and report data for the
CMS and Joint Commission measures. In addition, this organization voluntarily
participates in a CMS/Premier project as well. Thankfully, in this particular
example, the CMS and Joint Commission measures are more closely aligned than
some of the other performance measurement initiatives. This flow depicts four
topic areas that are being monitored: heart failure, acute myocardial
infarction, community acquired pneumonia, and surgical infection prevention.

Each quarter, the organization identifies the sample of cases that
need to be abstracted for each topic. Some topics require 100 percent of the
cases be abstracted while others allow for sampling. Even for sampling
requirements, different organizations could require different mechanisms for
sampling of the data. This step alone can be a confusing and complicating

After the population is identified, patient lists are created so that the
medical records can be located and pulled for abstraction. On average, it takes
approximately 27 to 43 hours per month, per topic to extract charts in
preparation for data extraction. Then, approximately two weeks after
the end of the discharge month, charts are ready for data abstraction. It
takes just over three weeks to complete the actual data abstraction
requirements. It does not end there. After the charts are abstracted, the data
are grouped for data submission. Following data submission, variances and data
errors are identified and corrected and then resubmitted.

Additional requirements are depicted on this particular flow chart as well.
Some of these additional steps required include the validation activities.
Periodically CMS identifies a sample of cases that are requested for data
validation. Hospitals are required to submit paper copies of their medical
records to the CMS central data abstraction center, which then re-abstracts that
data to identify whether or not the data was abstracted correctly. All of these
steps contribute to the process of this manual data collection activity.

In September of 2004, AHIMA testified before this very workgroup regarding
the challenges associated with quality measurement data collection and reporting.
Barbara Seagle, Director of Health Information at Hackensack University Medical
Center and a member of AHIMA testified to this workgroup about current
organizations’ experiences with voluntary and mandated reporting requirements.
Three years later, the issues remain the same and the demands for data continue
to increase. Hackensack University Medical Center allowed me the opportunity to
share with you some of the numbers from 2004 compared to 2007.

Barbara reported in 2004 that the number of cases required for manual data
abstraction was 500 cases per month. Since then, this number has increased to
over 650 cases per month that they abstract for the data reporting
requirements. The number of full time staff required to collect and report the
data has increased by two full-time equivalent staff. Barbara’s department is
in the process of requesting an additional FTE to support the increasing
performance measurement demands. Fortunately, Barbara has managed to maintain
her highly qualified and trained staff throughout the years, but salaries
are now averaging $42.00 an hour, which is an increase of $10.00 since 2004. In
all, Barbara’s organization has experienced a 72 percent increase in the
financial resources required to collect and report these demands in data

In preparation for today’s discussion, I reviewed the data collection
specifications for four of the hospital quality heart failure measures. There
are a total of 34 data elements required to calculate the results for four
heart failure measures. I identified 13 key clinical elements for each of these
four measures. There are 24 different data sources from which you are allowed
to abstract data within a medical record. When manually abstracting these data,
data abstractors must have a clear understanding of the data element’s
specifications to know where they can pull information and when. This does not
include all of the corresponding inclusion and exclusionary criteria that the
abstractors must also be familiar with, including all of the synonyms and
varying terms that they must be knowledgeable about as well. In a small hospital,
one abstractor may be abstracting data for all measures for all topics and will
have to be familiar with all of the requirements for all of the data elements.
You can see that there is a potential for error in this situation.

The current environment does not make electronic retrieval of data any
easier. In an electronic environment, healthcare providers must identify how
the data are stored in their electronic systems and map the data according to
the corresponding data abstraction guidelines and variables. Without broadly
agreed-upon standards for defining data content, variations in the taxonomy of
terms among performance measurement systems are difficult to interpret and
often require costly and laborious data mapping activities to link and extract
data from these electronic systems.

I hope that some of these examples have been helpful as you prepare for
today’s discussion. I will now turn the presentation over to Dave Gans who will
describe the remaining challenges and our recommendations. The information that
I have provided so far has been primarily focused on a hospital setting.
Although hospitals are the focus of today’s meeting, Dave will highlight a few
of the key challenges faced by physician practices and hopes that this
committee will consider addressing these challenges during a future meeting.

MR. GANS: Thank you, Crystal. Our meeting last November examined the
needs of healthcare data collection and reporting. One important element
was to examine what happens if we were to look at the clinical data or, as many
health care payers have, at administrative information predominantly from
the billing record. What we know from looking at clinical information, as
Crystal described, is that it oftentimes has data not available on the clinical
insurance claim form. It has the laboratory test results. It has imaging
results. It has the nursing notes. It has diagnostic information that may or
may not be replicated on the billing form. Also, while this information is
extremely rich, in many cases it must be manually obtained from the medical
record. Even in an automated environment, that still requires personal
intervention to acquire information. Contrasting with that would be the ease of
collection of administrative information, because this is already intact in the
billing record. That makes it very attractive for many payers and
others looking at quality data. However, there are problems, insofar as
there may or may not be uniform coding rules. There may or may not be uniform
conventions, guidelines, and definitions and in fact brought out during
discussions at our meeting was the fact that different payers have different
standards, different definitions for the same topics. They use the same terms to
mean different activities, so there is a definite need that you will see in a
recommendation for standardization in the data collection and reporting

Also, information in the billing record is not necessarily complete. For
example, consider examining the use of prescription drugs by patients. If the
patient fills a prescription at the Veterans Administration, that will
typically not appear; it will not be shown in the payer record. If it is
filled at a military treatment facility because this is a military dependent
or a beneficiary retiree,
that will not show. If the patient chooses to purchase their drugs utilizing a
generic drug discount at a store such as Wal-Mart, it will not appear in the
billing record. Also, it will include patients who may refuse therapy.
Discussions with an MGMA member, Cardiovascular Associates in Clearwater,
Florida described the problems that they have experienced with patients who may
be in a protocol that would normally require a blood-thinning therapy, but the
patient has contraindications to blood thinners. Consequently the medical
record will show the contraindication, will show the need for an alternative
therapy. However, the billing record will show a deficiency. So, the billing
record has opportunities, but also has substantial disadvantages in examining
quality care information because it is oftentimes incomplete.

What we know is that there are substantial variations in performance
measurement systems and reporting standards. While the intentions of each of
the payers and each of the data and quality collecting
organizations are good, the many different standards and many different
organizations have caused many of these problems to occur. Each often has
required formats, and you can see the problems: updates to performance
measures are not streamlined,
are not standardized. Also there are oftentimes updates in performance measures
required by a payer and certain payers utilize so-called “black box”
systems, where the actual protocol may be hidden behind a computer algorithm
known only to the payer. It cannot be replicated by the providers for reporting
data. So you only know the results of the black box edit; you do not know the

All of this goes on at a time of increased economic pressures on physicians
and hospitals. We have higher costs of doing business. The Medical Group
Management Association has a long series of surveys examining the economic costs
and efficiencies of medical group practice. In the past year, we have observed
increases in cost of 6-8 percent among physician practices even with the
lessening of pressures due to malpractice insurance expenses.

At the same time, we are seeing declining reimbursement. Medicare currently
pays physicians at the same rate as in 1999. We are potentially looking at
substantial decreases in Medicare payment in 2008. At the same time we have an
expectation, in fact in many cases a mandate. Electronic prescribing systems
have been shown to have substantial benefits in quality of care. The use of
electronic health records has potential to improve quality of care. However,
the cost of implementing these systems is borne by the doctor without subsidy.
Oftentimes with increased inefficiencies, not necessarily increased
efficiencies, physicians and hospitals and all providers have an expectation to
do more with less.

Examining the physician office environment, the first thing we observe is that
most doctors in the United States are in solo practice and small
medical groups, especially among primary care physicians. These organizations
are relatively unsophisticated. We are talking small business, and in many cases
we are talking family business. These organizations do not have electronic
health records. One in seven medical groups has an electronic health record,
based on a study that the Medical Group Management Association Center for
Research conducted under AHRQ funding two years ago. We are observing that
there is increased interest in electronic health records, but still,
small practices and especially solo physicians live in a manual

Also, very few medical group practices have a certified coder on staff.
Certified coders are very expensive. They are un-reimbursable expenses, and
they are not necessary in that small primary physician practice. These same
small practices do not have a chief information officer. They typically will not
have an information staff, which makes even more complex the use of electronic
health records or obtaining information from the medical record. As to that
comment regarding sophistication, even in a moderate-sized medical group
where you have a medical record supervisor, this is a staff supervisor, not
necessarily a trained medical records expert. Consequently, data
extraction, oftentimes used for research purposes, for clinical or device
research, is oftentimes designed by the practice administrator or the
physicians because of the lack of sophistication and trained staff in medical
record extraction.

Even in electronic health record environments, software is oftentimes
maintained by the electronic health record company, not by the practice.
Consequently access to the databases of the electronic health record is
maintained by your EHR company even though it may be held physically on servers
in the practice, and the practice may not even have access to that information
without the permission and authority and codes passed on to it by the electronic
health record company.

As I mentioned, looking at the cost of providing health care services today
and the relatively low level of reimbursement, medical practices
oftentimes examine cost first, as long as the system performs the minimal accepted function.
We will look at maximizing functionality especially when it comes to measuring
quality because it is un-reimbursed activity for the most part today. Practices
are so concerned with minimizing costs and increasing efficiency.

Our taskforce at our conference last year had a series of recommendations.
These recommendations occurred at the conclusion of our November
11th meeting and were codified over the next two months in a series
of communications with our attendees. We had three major recommendations. The
first was to form a public/private entity to oversee and evaluate policies and
procedures for the collection and reporting of healthcare data performance
measurement information.

We also had recommendations to provide funding to support research on the
quality of data and also to provide funding to support additional research on
the cost associated with performance measurement data collection. I was most
pleased that the Agency for Healthcare Research and Quality stepped forward to
further gather information in these three areas. These recommendations occurred
last November. On the 4th of June, AHRQ issued an RFI under
the health data stewardship title; I saw a copy of the Federal Register
announcement in the handouts. The health data stewardship RFI is going to allow
AHRQ to gather information to foster broad stakeholder discussions on the
topics. While outstanding in its data collection, the same RFI notes that there
are currently no intentions or plans to issue a related request for proposal.

MGMA and AHIMA encourage this committee to give a recommendation that such
an RFP be issued following an appropriate time to understand the dimensions of
healthcare data collection and the creation of this public/private entity. Also,
on June 6th AHRQ issued task orders. They are going to
fund examining the cost of acquiring healthcare data in primary care
practices. We are extremely pleased to see an affirmative action in this area
and wait with great interest to see the results.

We noted several opportunities for action in our report. We feel first
that there is an opportunity for the public/private entity to provide the
policies for healthcare data measurements. We need an organization that is
impartial, that can create core data content standards as a prerequisite for
reliable and
consistent data collection and reporting. We feel it is extraordinarily
important to assist providers in their data collection efforts to standardize
performance measurement systems. Also, we feel it is extremely important to
continue the collaboration among critical stakeholders in healthcare data

This entity needs to be empowered. It also needs to be held accountable to
collect and prioritize input from key stakeholders, to facilitate the process
to obtain regular input regarding measurement standards, to develop a plan for
short, mid, and long term goals and tactics, to reach a national consensus on a
starter set, a basic set of uniform data that measure healthcare quality and
performance. Also, to coordinate health information exchange and quality
initiatives at all levels, national, state, and local for both data integrity
and responsible use of information. Last, to conduct all business in a very
public and transparent manner.

Crystal and I anticipate questions, and we welcome them. Thank you so much.

DR. CARR: Thank you, that was a very thorough, powerful, and sobering
presentation. I appreciate the time you took on it.
It was of particular interest to get the update on Hackensack Hospital, because
as you say they were here a few years ago. They enlightened us at that time on
the burden.

My question is what your thoughts are on how we get to quality. I think
there are two approaches. When we are starting in the clinical setting, and we
read the literature, and we know what is the evidence based best practice. Then
we have the challenge of finding a way to get that data. So, one is just
combing through records. We also use administrative data, which is easy. It
helps us get to categories of patients. There is a danger in sort of measuring
what is available.

So, I see this dichotomy of measure what is available versus take the data
on quality and find a way to measure it. I am thinking in particular about
where we will go. We will hear from Kelly this afternoon, but should we be
thinking that some day the electronic health record will be able to remove this
burden and give us everything we need, or will we always have a need to be in a
hybrid state for the issues that you described about a particular individual
that makes them the exception.

MR. GANS: I will make one comment and then let Crystal conclude. Looking at
the electronic health record, that may give us
a substantial improvement in the collection of quality data. In discussions with
MGMA members who have electronic health records, we heard that developing the
query can take
as long as three weeks to understand the multiple databases that the EHR
maintains to understand how to extract the data from the electronic health
record. Once the queries are written, as long as they remain standard, they can
be repeated with relatively little cost to the organization in efficiency or
time. The major problem that organizations have had is the lack of
standardization in developing the queries. As I said, looking at the very
sophisticated practice, because of the complexity of the electronic health
record and the multiple databases, which will maintain information, it can take
a substantial time to write the necessary queries to gain the information you
need. Once they are written, they can be maintained. So, consequently having an
electronic health record and having a standardized set of guidelines for
clinical data measurement and data collection, we have an opportunity to make
data collection comprehensive and efficient. When the guidelines change or when
different groups of providers or different organizations measuring quality have
different standards, the cost of acquiring the information is so excessive that
even with an EHR it is going to be difficult and too complex.

MS. KALLEM: I concur with what Dave has mentioned. I would like to further
expand upon that. The use of electronic
health records will be very beneficial when providers adopt them. There is that
level of risk associated with adopting and implementing these electronic
products. In turn, EHR vendors need direction when it comes to the
incorporation of the performance measurement requirements into their products.
They need to have clear guidance on how the data should be collected within
their products and how the data should be exported from their products. There
are some initiatives going on within the industry to start tackling those
issues. We need to continue to move those efforts forward, and then that
information needs to be standardized.

So, measure developers need to develop their measures in the standardized
format so that the vendors can then take that information and incorporate it
into their EHR products. In addition, there is a need to standardize the data
itself, the clinical data within the EHR products. We need to identify the key
clinical elements and then formalize and standardize the clinical data that
needs to be captured within the EHR products and also share that information
with the vendors.

MS. MC CALL: I have a question. I would like to build on the comment you just
made. I would preface it by a statement saying
that I completely agree with the need for two things. One is to look at the
design of measurement itself and the fact that there seems to be a lot of
variation in that. There is also the variation and the lack of standardization
within the data. Those are very different things. If you think about the
analogy of putting together a meal, you have ingredients for your recipe and
then you have the actual act of putting it together and then you have the joy
of eating it. We will call that quality. My question is not about EHRs. My bias
is that I think we have talked too much about the record, and the record just
provides the raw ingredients, but what we seek is a delightful meal. How much
variation, when you go out and talk to people, how much natural variation do
you think there will be in the measures themselves? Obviously, we want to first
try to standardize, and then take the standard plan-do-study-act approach. How
much natural knowledge creation should we anticipate as we build entities to
design measures?

MS. KALLEM: I have been spending quite a bit of time actually evaluating the
different performance measures to assess what
variations exist between those different measures. There are a lot of
similarities. A lot of the differences are with regard to the specific ways
that the data should be collected and reported. The actual overall measure
itself has the same goal in mind. The challenge that I found is actually in
locating some of the specifications for the measures. They provide such a
high-level overview of how the information should be collected, and there are
specific requirements for which it is difficult to know how I would collect
that information within an office. I cannot imagine small physician
practices having to jump in and try to collect information and report that
information when they have to spend hours trying to find the specifications
required. Having an overarching entity to help direct that and even provide
central locations for where information about the measures should be stored and
can be gathered would be useful. I am not sure that has appropriately addressed
your question, but that is what I have found from my experience.

DR. GREEN: I would like to ask you a question about something I am going to
call the Charlie Effect. The Charlie Effect
goes like this, you have a group of providers who want to have improved quality
in their care related to behaviors and unhealthy behaviors like smoking
cigarettes, being physically inert, and eating unhealthy diets. They work with
a large electronic health vendor and had a delightful design for how to create
prompts in the record and to drop out of those prompts measures of whether or
not they get to their goals, the measures, of where they want to arrive at. The
vendor decides that it is just too complicated, and it cannot be done for at
least two years. They go 112 miles away to where there is a guy named Charlie
who is the informatics program person for a hospital-based information health
system, and 25 minutes later it is over. They have got it. They are ready to
rock and roll and go on to implement. What is your opinion about the
size of the impediment that relates to simply not having the capacity on site
in the hospital or in the practice to do a little bit of programming?

MR. GANS: I will make the comment. I concur exactly with your comment on the
Charlie Effect. In interviews with practices, I talk to organizations that hack
their own electronic health record to get quality information out because they
are not authorized by their vendor to have that full access and unlimited
access to their own data.

What they will do is find that individual who has great programming skills
and buy their time, to be able to implement and obtain the information that
they need for their own purposes. Also what happens is that as the practices
become more sophisticated, as you move from a manual paper record, you exchange
staff: you no longer have file clerks, and instead you have information system
staff members in the practice. Those information staff members are invaluable.

We have examined organizations that have implemented electronic health
records. The ones that have the greatest success, one of their key factors has
been having IS staff in-house that work for the practice, work for the doctors,
and consequently are available to meet the standards of the physicians in that
organization. I concur exactly that IS staff is essential. However, it is a
problem that they are expensive. They are not a reimbursable cost to the
practice. They individually may add very little to the patient environment, but
they are absolutely essential to updating information on electronic health

DR. CARR: I want to be cognizant about time. Crystal, did you want to add a
comment? Okay, and Carol briefly?

MS. MC CALL: I had one more question. The question is, in your discussions
with hospitals and physician groups, what do they say regarding the emerging
world of personalized medicine and whether or not they believe or you believe
that that will materially change the burden or if the burden will essentially
remain the same?

MR. GANS: My contacts have been very concerned about personal health records
because of the difficulty of getting information into that personal health
record. I presume that is where you are going?

MS. MC CALL: It was really around personalized medicine and what is
happening with certain types of tests.

MR. GANS: First of all, on the medicines and the testing of patients to
identify an appropriate regimen customized to their genome: the thinking has
been that, if the patients will agree, the next goal of medication is to
personalize the treatment to the specific genome of the patient.

DR. CARR: Let us hold a little bit more on that and just continue. Actually
Bill Scanlon has a question. I would like to invite if there is anyone in the
back who would like to ask a question just to come up.

DR. SCANLON: I wanted to ask about this public private entity to build a
consensus because I feel this is a little déjà vu. It could be
1995 that you are talking about, with the promise of administrative
simplification through standardization that was enacted in HIPAA. I think some
people would argue we have not necessarily realized all of the promises of
HIPAA. I guess I am a little worried that we need to identify what the teeth
part needs to be that this entity has in order to make that consensus really be
effective. A consensus may be a necessary condition but might not be
sufficient. How do we get everyone to participate or play in this process? You
are actually speaking from a provider’s perspective of how do we get everyone
to play? I think we are going to have to at some point listen to what the
payers have to say about this, because part of why HIPAA has not been fully
realized is that they had a different perspective of what they need and want.

DR. CARR: Thank you very much. Now, I invite Denise Remus to speak with us.
Denise?

Agenda Item: Performance Measurement and Quality
Improvement – BayCare Health System

DR. REMUS: Good morning. I appreciate the opportunity to speak with you this
morning. I actually did not create a presentation. I have some written notes
that I will be speaking from and actually taking a prerogative to try to expand
on some of the comments made by some of the speakers earlier this morning.

I am the current Chief Quality Officer of BayCare Health System. I am a
registered nurse with my doctorate in nursing and have spent the last 15 years
of my career focused on quality measurement and quality improvement using data
from a variety of sources, primarily a hybrid of clinical and administrative
data. I have worked with hundreds of hospitals across the country. Prior to
taking the BayCare CQO position a few months ago, I was Vice President of
Clinical Informatics at Premier, where I was responsible for the analytics and
methodologies underlying their performance measurement products and oversaw the
analyses conducted for the CMS/Premier Hospital Quality Incentive Demonstration
Project which evaluated the impact of pay for performance on quality
improvement. You will hear more about that from my colleague, Dr. Wynn. I
conducted research referred to as the Performance Pay Study, which identified
the relationship between reliable care and improved outcomes.

BayCare is a nine-hospital system located in Florida, specifically the
Tampa/St. Petersburg area. It was formed ten years ago as a larger health
system made up of three independent health systems. In 2006, our hospitals had
2,707 beds; 121,700 inpatient discharges; nearly 49,000 outpatient surgeries;
over 350,000 emergency room visits; and over 500,000 home health visits. We
have 17,000 team members. We are the largest private employer in the Tampa Bay
area. While the majority of my comments are based on my professional experience
since I have only been in the CQO role for the last two months, they will be framed
within the BayCare base.

BayCare’s experience in quality measurement and improvement: BayCare is
committed to clinical excellence and quality improvement. Several years ago
BayCare created a clinical outcomes warehouse comprised of all patient
administrative, clinical, and financial data. The administrative data elements
are pulled from internal systems such as TSI and Envision. We maintain patient
data regardless of whether there is a bill generated. We derive it straight
from our administrative systems, which serve as a secondary billing source. The
clinical data are based on national definitions and abstracted by more than a
dozen dedicated team members based at the hospitals. Actually, in thinking
about the Hackensack story I realize that we probably have about 16 FTEs across
the system that are dedicated full-time to just clinical record extraction.
These individuals support the clinical data needs associated with all
regulatory and accreditation requirements including the Joint Commission Core
Measures, Joint Commission Centers of Excellence, CMS Annual Payment Update,
Quality Alliance, and the State of Florida reporting requirements. They also
abstract data for other measurement programs including the Society of Thoracic
Surgeons, American College of Cardiology, American Heart Association “Get
With the Guidelines,” and American Nurses Association’s Magnet program as
well as clinical trials and clinical research and of course additional registry
data. BayCare is an active participant in the CMS/Premier Hospital Quality
Incentive Demonstration Project, which evaluates the impact of financial
incentives on quality improvement.

The clinical outcomes warehouse provides critical information to support
our internal quality improvement efforts and operations. We use administrative
data to evaluate the outcomes – that is the primary source of looking at
mortality, readmissions and complications, and we support the Agency for
Healthcare Research and Quality (AHRQ) Quality Indicators.

There are many challenges associated with maintaining this clinical data
system. The billing data that we use, the administrative data that we pull from
is dynamic, with updates and modifications occurring frequently based on ICD-9
coding and processes, time lags, different decisions made on reevaluation of
the records, changes in charges, and payments. Any data pulled, even for our
own internal purposes, represents a snapshot in time. We do not have an
electronic health record system yet and are dependent on medical record
abstraction for all clinical data. It is difficult to abstract data during the
care delivery process due to management of a paper record and the operational
definitions of measures. In fact, even after the patient is discharged, I
frequently hear from the abstractors, "We cannot find the record. It is
somewhere; we are trying to find it within our time frame." Getting all of
the components of the record together remains a challenge.

The operational definitions are also difficult. When the patient is actually
in house it is a challenge to understand if they are going to end up in that
clinical population or not, what are the definitions. The clinicians are often
saying, just tell me where the patient is going to fall(?). So it is the dynamic
balance between evidence-based care and what we are actually trying to measure.

Most abstraction is done after the patient is discharged and the clinical
documentation has been coded into the HIM system and transferred into ICD-9
codes. Inter-rater reliability of abstractors is a continued concern. There are
many challenges in interpreting the national definitions and applying the
algorithms to patient records within which clinical documentation rarely
follows an ideal path.

We have nine hospitals; each of them has a standard form and structure of a
medical record. But any time that I go on a unit and pull a chart you will find
there is variation in how the information is compiled, what is documented
where. So the abstractors and their training and education remain key to help
them understand where to look in a medical record and how to interpret the
different components, to follow the national guidelines and algorithms.

We have case managers and quality team members who review patient lists and
records, including during rounds when the patient is actually in the hospital, in an
attempt to proactively identify patients who will fall into clinical
populations such as Heart Failure and AMI to ensure delivery of evidence-based
care in a timely manner. We utilize standardized order sets and reminders. We
have medication reconciliation forms. We still have a challenge ensuring that
all of the right care is delivered at the right time.

We have extensive educational programs and we maintain a dedicated team of
educators regionally as well as within each of the hospitals to try to focus on
understanding what evidence-based, high quality care means. Performance on
evidence-based measures is incorporated into our organization’s Key
Performance Indicators and Quality and Safety Plan goals which are reported to
senior management and our Board of Trustees on a regular basis. We do
incorporate our performance metrics into our management performance evaluation.
So compensation for management is tied to a component of these metrics. However
our performance on evidence-based measures, while very good, is not as high as
we would like it to be. One of the things that we recognized early on from the
comparative data from the hospital quality demonstration is that we did very
well in the first year of the project and actually earned incentive payments in
several of the areas. We were not able to sustain that improvement, and as we
stepped back and looked at the processes we found too many of them were
dependent on a case manager dogging(?) the patient through the system, reminding
the physicians. We had not established reliable systems of care. That has
become a huge focus for what we are doing currently.

We recognize the need to approach our improvement efforts in a
methodological manner. So two years ago, BayCare implemented Six Sigma across
our organization: we have seven permanent black belts and master black belts
that are employed, 20 black belts in training, and have provided, or are in the
process of providing, Green Belt training to all Directors. The goal of the
current CEO is to ensure that all of our team members understand we need to be
a data driven organization and that we need to inform our decision making with
the best
information possible. We have conducted over 115 Six Sigma projects in the last
two years that have focused on patient satisfaction, throughput, clinical
quality, and financial/operational processes. Last year the projects identified
more than $7 million in forecasted savings.

We continue to expand the use of this methodology, and as I mentioned,
trying to make sure we focus on building reliable care and stable systems,
which is not easy to do. One of the challenges is that clinicians in the
delivery of care do not think about quality measures. They are taking care of
that one patient at a time. Which is giving the individualized focus that we
want them to consider but their forethought is not necessarily always on the
evidence-based measures. So, we are continuing to try to bring that into
alignment. The physicians are eager to understand where the variances are. We
do generate physician-specific profiles on all of the measures, but we
really would like to strike a better balance between the care at that point in
time to that patient and immediate feedback and triggers. We have been a paper
environment. That has been difficult to do.

What are some of our future steps? Education, education, education is
always key. Continuing to try to help the clinicians understand the evidence
based measures, best practices and how to build reliable systems of care
delivery. For example, this year, for our Key Performance Indicator that
focuses on clinical quality, we are looking at the overall appropriate care
score. That covers all patients that fall into the populations of AMI, CABG (we
have two hospitals that do CABG), pneumonia, heart failure, and hip and knee,
rather than individual quality measures or the composite quality score. So by
helping them understand
where the care delivery system breaks down and ensuring that that patient gets
the appropriate care from start to finish, we are sharing the insight on where
the system failures are occurring and that way we can begin our interventions
more quickly.
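
As an illustration of the distinction drawn above (a minimal sketch that is not part of the testimony), the appropriate care score can be read as an all-or-none measure at the patient level, while the composite quality score pools every care opportunity. The class, function names, and patient figures below are hypothetical, and the exact formulas used by BayCare or the demonstration project are an assumption here.

```python
from dataclasses import dataclass


@dataclass
class PatientMeasures:
    patient_id: str
    eligible: int  # care processes this patient qualified for
    met: int       # of those, how many were actually delivered


def composite_quality_score(patients):
    """Opportunity-based score: delivered processes / eligible processes."""
    opportunities = sum(p.eligible for p in patients)
    delivered = sum(p.met for p in patients)
    return delivered / opportunities if opportunities else None


def appropriate_care_score(patients):
    """All-or-none score: share of patients who received every eligible process."""
    eligible_patients = [p for p in patients if p.eligible > 0]
    if not eligible_patients:
        return None
    perfect = sum(1 for p in eligible_patients if p.met == p.eligible)
    return perfect / len(eligible_patients)


# Hypothetical heart failure patients, invented for illustration only.
patients = [
    PatientMeasures("A", eligible=4, met=4),
    PatientMeasures("B", eligible=4, met=3),  # e.g. missed discharge instructions
    PatientMeasures("C", eligible=2, met=2),
    PatientMeasures("D", eligible=5, met=4),
]

composite = composite_quality_score(patients)  # 13 of 15 opportunities met
all_or_none = appropriate_care_score(patients)  # 2 of 4 patients fully treated
```

In this made-up example the composite looks strong while only half of the patients received complete care, which is why a patient-level score can point more directly at where the care delivery system breaks down.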

We are working on enhancing documentation in a paper world. Paper records
often contain incomplete and inconsistent documentation and are difficult to
read. I have had, I am not sure I would say the privilege, of reviewing
records for different opportunities, and I have established a new admiration
for the abstractors as I tried to make my way through physician writing,
documentation, and use of abbreviations, just their unique abbreviations for
how they describe the clinical presentation of a patient. One of the ongoing
challenges is documentation of times and reconciliation across multiple forms.
As we look at many of the timing metrics, what we find is that throughout our
system, depending on what clock you look at on the wall, they vary. We have
instructed our clinicians not to rely on their watch for anything other than
the second hand if they need to use that, but to actually use the standardized
clock that we have in the system. What we find is that the clock that is on the
lab machine that punches or indicates the time of test may vary dramatically
from the wall clock in the critical care unit. I mean literally ten minutes.
When you are looking at timing measures where you are trying to meet something,
it is difficult to do that kind of reconciliation.
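
To make the clock problem concrete, here is a minimal sketch (not part of the testimony) of how a ten-minute difference between two clocks changes a timing measure. The timestamps, the function name, and the idea of correcting with a known offset are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def interval_minutes(start, end, end_clock_offset=timedelta(0)):
    """Minutes from start to end, after removing how far the end clock runs fast."""
    corrected_end = end - end_clock_offset
    return (corrected_end - start).total_seconds() / 60


# Hypothetical times: arrival read from the wall clock, test time stamped by the
# lab machine, whose clock is assumed to run ten minutes ahead of the wall clock.
arrival_wall_clock = datetime(2007, 6, 19, 10, 0)
test_lab_clock = datetime(2007, 6, 19, 10, 35)

uncorrected = interval_minutes(arrival_wall_clock, test_lab_clock)
corrected = interval_minutes(
    arrival_wall_clock, test_lab_clock, end_clock_offset=timedelta(minutes=10)
)
```

With these invented values the same event measures 35 minutes uncorrected and 25 minutes corrected, so a measure with a 30-minute target would pass or fail depending only on which clock was read.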

Some of the lessons learned that have helped us as we start to transition
to the future state: BayCare is implementing an electronic medical record
system. We are in the 2nd year of a 7-year project. It has been very
interesting to me to step into the project when it has already been underway
because I actually come with an unbiased perspective and a more reality based
perspective than the team members who are actually involved in the
implementation. They are so eager and enthusiastic, and knowing what I know
about some of the data challenges, just trying to help them with the reality

The projected costs are in excess of $200 million for complete
transformation across our nine hospitals. We do have approval for a
10th hospital; again, it can add $30 million to the cost of building
a new hospital to ensure that today’s technology is put in place to support the
electronic medical records. Our EMR will integrate rules and alerts into
clinical workflow to enable better clinical decisions; the project includes
CPOE; electronic physician, nursing, and other clinical documentation; and
clinical order entry. One of the things I want to emphasize is that in looking
at the system, which is not the first install for this vendor (this is a
national system), I am astonished at how much build is going on in the
install and recognize that the electronic medical record is only as good as the
design. It is only as good as the programmer. It is only as good as the
forethought of those who are building and designing this system and to
understanding what do we mean by clinical data that we need to store, that we
need to consider for a delivery of care as well as retrospective evaluation,
and that is not always at the forefront of their mind. The alignment of the
data that we are going to be storing with the performance measure remains a
challenge. When you look at the programming, there is an electronic medical
record that is put in place that is a vendor program and they will make all
kinds of decisions along the way about what data fields are allowable, the
ranges, etc. and then the decision needs to be made on what to store. What will
we actually dump into our clinical warehouse, our clinical outcome data base
that we can use for subsequent analyses? It is those subsequent analyses that
require some of the highest programming resources that we referred to. We
maintain several full time programmers already to support our clinical
warehouse. Pulling those queries is not as complicated as you would think from
a programmer perspective. The biggest challenge is the translation of what you
need from a clinical standpoint and helping them understand what we really mean
by some of the inclusion/exclusion. What are the ICD-9 codes we need to look
at? What are these other characteristics of the patient we need you to look at?
That is where I am finding myself continuing to go back and visit with the
programmer to see what did you really catch or hear because that translation
between the clinical need and the information system programmer is extremely

One of the other things that I am finding is that the vendors will often
have specifications. For example, they don’t allow certain fields to have a
null value even though that would be appropriate in the real world of clinical
care, so defaults are put in that, depending on how they are defined, could be
seen as a true value. Those are things that unless you watch it with due diligence, you
can end up with a lot of information that is not very relevant and in fact
could give you a false picture of the care of the patient.
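
The effect of a forced default can be sketched as follows (an illustration that is not part of the testimony). The field, the default of zero, and the values are hypothetical; no particular vendor's behavior is being described.

```python
from statistics import mean

# Hypothetical default written when a field is not allowed to be null.
DEFAULT_EJECTION_FRACTION = 0


def mean_ignoring_missing(values):
    """Average only the values that were actually recorded."""
    recorded = [v for v in values if v is not None]
    return mean(recorded) if recorded else None


# What was truly documented: two measurements and two patients never measured.
documented = [55, None, 35, None]

# What ends up stored when the system substitutes the default for "not measured".
stored = [v if v is not None else DEFAULT_EJECTION_FRACTION for v in documented]

mean_of_documented = mean_ignoring_missing(documented)
mean_of_stored = mean(stored)
```

In this invented case the documented values average 45, while the stored values average 22.5, because the defaults are indistinguishable from real measurements once they reach the warehouse.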

The project has also required a thorough evaluation of our physical
environment. When you think about it, we have existing structures, and we are
trying to place within them an electronic world when they were set up for the
paper chart. We have to think about the PCs, laptops, monitors, and
printers that we need. Where are they placed? What is portable? How many of
those portable units do we need? Where are the plug-ins? How do we store them
for recharge? What are the handhelds of the future? Where are we going there?
How do we communicate throughout our system and transfer information in a more
live state so that if that patient actually goes from the critical care down to
x-rays, that information is readily available in real time electronically.
Those are all challenges within an existing physical plant of trying to identify
that. We have had to look at our work stations. We believe that with the scope
of information the clinicians will be looking at that they need very large
monitors or two monitors. Again, trying to find that in your existing work
space is extremely difficult.

Change is difficult under the best of circumstances. One of the other
things that we are doing is training all of our team members in change
management methodology. We have adopted Kotter’s(?) eight steps with a focus on
the simple phases of: Prepare, Engage, and Sustain.

Another change that BayCare is making is to be fully transparent in what we
consider to be quality of care. We will be publishing our quality measures
publicly, including tests of statistical significance against national
comparative data when available. That has been one of the real challenges in
finding national comparative data to help evaluate our opportunities for

I have some suggestions that I thought of in preparing this presentation, and I
will run through them. Please continue to enhance the administrative data. The
administrative data will remain the base of all health care information in your
future. I do not think it is going to go away in my career, and I doubt it will
go away much after that. We have a lot of opportunity to look at how we can use
the ICD-9 coding system more effectively. Obviously moving to ICD-10 will be
extremely helpful. How do we think about it? What are the clinical data
elements that we could actually move into a code that would move us away from
that medical record abstraction? One of the frustrations I have had is that you
look at certain medications for the patient in heart failure and you look to
see what was their ejection fraction, why cannot we categorize that like we do
many other things and move that into an ICD-9 code so that we know what that
patient’s ejection fraction is? We do not have to rely on pulling in an old
record. If there is a question, we certainly do the appropriate thing, but if
we already have that information we could follow the patient through and be much
more effective than continually trying to find these records.

Present-on-admission is an enhancement to the current ICD-9 codes and will
assist greatly in distinguishing comorbidities from complications. However,
early experience in implementation of this in Florida has identified challenges
in definitions, and we are really struggling with the HIM team to help them
understand what we mean by present-on-admission. The guidelines that have been
published are not sufficient for them to really understand and move that. When
you actually look at them and the initial data that has been submitted, you
find very few codes that are not present-on-admission. If everything was
present-on-admission, where are we in capturing our complications? That is a real
concern for the way that something has been set up.

The other piece is the clinical complexity of the patient, and at what point
can we actually identify some kind of a system that will allow us to understand
a sequela of disease that is not necessarily a potentially avoidable
complication. It might be a sequence of information, but it involves new
changes to that patient that we want to be able to identify, but if the
present-on-admission is not done appropriately, it could be considered a
complication. So, that distinction is making the clinicians extremely nervous.
I think it is moving them more toward saying we believe it was
present-on-admission. We need to take a real look at that.

Continue to expand the detail in operational definitions. Focus on
inter-rater reliability. We need to standardize definitions across
measurement programs and encourage the public sector to do so. There has been
great effort on the part of CMS and the Joint Commission to standardize their
definitions, but there is still a gap in other national programs. For example, the
Joint Commission definition for Stroke Centers of Excellence is different from AHA’s
stroke data in Get With the Guidelines. So we continue to have to maintain two
separate data collection systems for those two different definitions.

We need to support measures that have transparent operational definitions
and are in the public domain. Please do not force hospitals to have to adopt
proprietary databases. We need to encourage those organizations to put the
measures that are clinically relevant and evidence based within the public
domain. Please let them compete on something else, but not on the operational
definitions of the measures. If they can do a better job collecting and
reporting the information, that is wonderful. If we can do it with someone else
or internally, we would prefer the opportunity to do that as long as again,
there are the appropriate checks for validity.

We need to focus on standards for health information technology. Again, my
astonishment that we are doing so much build with a system that has supposedly
been implemented across the country. I am astonished that what I see is lack of
health information technology guidelines. It appears to be vendor driven and
vendor centric. There is a gap between data needed for clinical care delivery,
quality measures, and performance improvement. Will it all come together in the
future? At what cost?

Evidence-based measures and practice should drive what is in the health
systems, not the other way around, and too often we have been forced to create
measures by what we have available. That is looking at quality of care
and clinical delivery of care in the wrong way.

One of the other areas that I want to talk briefly about is the challenges
in identifying the patient. An electronic health record and even the best
quality measures are all based on identifying the right patient, making sure
that that patient’s history is pulled in, that we have good clinical
information. I can tell you story after story where we have individuals coming
into the system using a veteran’s(?) ID card as the identification of the
patient, and we cannot link our records across studies and across time. We
have a true
integrated delivery system within BayCare. We provide home care services,
behavioral health, inpatient and outpatient in these areas, and we are
challenged with tracking patients across time. We have two large primary care
groups that we work with across our system and we can’t link the information.
We thought HIPAA would save us. We thought we might come up with a national
patient identifier. It remains a concern. We are starting to look at
biometrics. There is a healthcare system that we have talked with that uses
palm prints to try to identify individuals. Again, when we move into an
electronic world we have to be very careful to make sure we are pulling
together the right information. One mistake, a typo in a date of birth or a
middle initial, could easily pull in another individual. It is a real challenge.
It is going to be harder to track that through in an electronic world than it
is with a paper record where there might be other cues that help you understand
that you are linking information from the wrong person.
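
The linkage risk described above can be sketched in a few lines (an illustration that is not part of the testimony). The record layout, the names, and the three-way link/review/no-link rule are hypothetical and do not describe any real master patient index.

```python
from dataclasses import dataclass

FIELDS = ("last_name", "first_name", "middle_initial", "date_of_birth")


@dataclass(frozen=True)
class Registration:
    last_name: str
    first_name: str
    middle_initial: str
    date_of_birth: str  # ISO format, e.g. "1950-03-12"


def agreeing_fields(a, b):
    """Names of the demographic fields on which two registrations agree."""
    return [
        f for f in FIELDS
        if getattr(a, f).strip().lower() == getattr(b, f).strip().lower()
    ]


def link_decision(a, b):
    """Link only on full agreement; send one-field disagreements to a person."""
    matches = len(agreeing_fields(a, b))
    if matches == len(FIELDS):
        return "link"
    if matches == len(FIELDS) - 1:
        return "review"
    return "no link"


# Invented registrations for illustration only.
on_file = Registration("Smith", "John", "A", "1950-03-12")
same_person_typo = Registration("Smith", "John", "A", "1950-03-21")
different_person = Registration("Smith", "John", "B", "1950-03-12")
unrelated = Registration("Jones", "Mary", "C", "1962-11-02")

decisions = [
    link_decision(on_file, same_person_typo),
    link_decision(on_file, different_person),
    link_decision(on_file, unrelated),
]
```

Both the mistyped date of birth and the genuinely different patient come back as "review" here, which is the point of the sketch: from demographics alone the two cases look identical, so an automatic rule either misses the true match or pulls in the wrong individual.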

We do need to look at expanding measures in other clinical areas. BayCare
is not opposed at all to the concept of value-based purchasing. We want to
recognize and acknowledge when we are doing well and when we are not. We want
to identify our opportunities for improvement. There is only a small set of
metrics that we have that actually cover the clinical population. We know that
expanding them will create some additional burden. To the extent that we can
create those quality measures using information in administrative data or
enhanced administrative data, help us gather the information more effectively,
and give us more detailed operational definitions, that will make a big
difference in how we can ensure that we deliver high quality care. We do have
gaps in a lot of other clinical areas. We are looking at oncology right now,
for example, and there is just a huge variance in the clinical information that
could be collected there, what those quality measures might look like.
Behavioral health is another challenging area for us where we know we have a lot
of opportunity and it is a real challenge to come up with sound evidence-based
measures and good information.

I appreciate the time to talk with you today. I would certainly be open to
questions. Thank you.

DR. CARR: Denise, thank you very much. That was very informative. Carol?

MS. MC CALL: Thank you very much. I would echo Justine’s comments. This is a
delightful set of testimony. I have some
specific questions, which go to some of what we are trying to learn throughout.
You obviously have a very robust health statistics enterprise. You recognize
this is what you want to do. It sounds like you have a real good handle on what
you are doing and where you want to go. Within that, my question is, can you
pick one or two areas where you have made a specific effort and you have really
seen either dramatic improvement or you have learned something just really
profound. I was really intrigued by your comment that you had created some
evidence based measures and you have done very well, and you found that you
were not able to sustain them. So maybe part of your comment could also
include, do you and the folks at BayCare, think of that as a success or a

DR. REMUS: In the world of continuous quality improvement, we see this as an
opportunity. One of the examples I used was heart failure. We did much better
in heart failure in the initial year of the demonstration than we have
subsequently. We have drilled into that. We have found that most of our
failures started around the discharge instruction. In the model of ensuring
that medication reconciliation occurred, that the patient had all of the
appropriate discharge instructions, and that the right prescriptions were sent
home, all of those other components, we certainly also looked at whether we could help
facilitate a follow-up visit. We found that in looking at the model for one of
our hospitals that was extremely challenged for the second year, they actually
had one dedicated case manager that followed all of these patients through
the hospital and made sure that everything was done. I had no idea what
happened when the poor woman had the day off. What we found was that role
changed. She left the organization, and they restructured a little bit how they
handled things, and again, without the recognition of some of the system-ness
and the challenge of that when someone quit watching it, it fell through the
crack. We really fell down on that measure of heart failure discharge

So, one of the things we have done is look at it and have implemented
across the system the standard medication reconciliation form. We have had
educational programs for our physicians on what do we mean and why is it
critical to ensure that when that patient goes home, we have an understanding
of what they were using at home, how that might differ from what they are going
home with, the dynamic of changing the medication dose, the doctor doesn’t
often think about it and they will prescribe a different dose – well, the
patient looks at it and goes I’ve got the 20 milligrams at home I am not going
to change it again. I have already paid for that. I will try to split it or I
will try to do whatever I can and you end up with a patient that is readmitted.
We need to look at the systemness.

We do have some wonderful success in some of the clinical areas, for
example, MRSA infection, ventilator associated pneumonia that we have been
doing improvement projects on. What is interesting is that as we look at what
we call our report cards of the board and we are enhancing that. We have been
presenting this information for a while. I started to drill into it because
these are not national measures. What I found, even across our own system, we
did not have apples to apples. Some of the systems were not too bad, one was a
red apple and one was a green apple. We were close, but one of the challenges
as we move to national benchmarks is saying, how can we make sure that we
consistently internally use an operational definition that we can compare
across ourselves as well as nationally? That is one of the areas where we have
done well, but we need to enhance how we do the measures. We need to keep that
flow going. We have some great records that we have put in place in the
critical care unit. We have a new software tool we are implementing called IQ
Trackers. That is something that helps us understand more clearly the exact time of
the admission of the patient to the ICU and helps us collect information so that
we can look more clearly at all the patient mix and see where opportunities
are. We have done a lot of training around all of the bundle of ventilator
associated pneumonia. So there has been some good success there.

DR. GREEN: I would like to ask you to return to something you said about
fifteen minutes ago where you were pointing out how that you thought for the
rest of your career and your lifetime that we would be dealing with an
administrative dataset.

DR. REMUS: Some administrative data, yes.

DR. GREEN: Could you just elaborate your own thinking a little more about
what the constraints are that we face in trying to get to quality measures and
clinical phenomenology as long as we are bound by administrative data that has
basically, as I hear it, the analysis of a claim in a commercial transaction? Could
you say how you think about that?

DR. REMUS: The way I think about administrative data is not necessarily as
much billing as it is the internal data that is collected in the hospital
system operations as well as subsequent billing. So when I think of
administrative data, I think of the patient demographic, discharge status, date
of birth, and the physician that took care of them. That key demographic and
operational data element, but additionally all of the ICD-9 codes. Unlike a
bill for example, our internal system will hold unlimited ICD-9 diagnosis codes
and procedure codes. So the accuracy and integrity of ICD-9 codes, when you
look at them in the clinical definitions there is just as much variance there.
The ICD-9 codes that we have provided with some comparability to look at the
patient history, things that were done to them that are translated into a
future code, so I see all of that information continuing in the near future
which is that we still need the demographic information on a patient. We need
to know who is paying. We need to know the physicians that took care of them.
We need to enhance that. I want to know the doctor that did the procedure. I
want to know the date and time of the procedure. We can’t vector that
administratively in the system. That may never end up in a bill that is being
submitted but to the extent that I have an internal to my care delivery system
and I can use that to evaluate that administrative data has a lot of value. On
top of it, the clinical data will have to deal with lab value. That has to do
with medication administration. Has to do with some of the patient’s response.
Has to do with pain management and other things that we can collect clinically
that enhances that additional information phase. I think we are still going to
be capturing patients and looking at them in ICD-9 codes for a while. Does that
answer your question? I see it is different – the administrative data that
they have in hospitals is much more intensive than what is in a secondary administrative
data set.

DR. GREEN: Do you have anything you would like to say about how sticking
with a core of administrative database actually enables the quality?

DR. REMUS: Enables? To the extent—one of the advantages that we have of
administrative data today is that it is a known entity of consistency. We have
good training. We have good systems that capture the information. There is a
lot of integrity. The concern that I have about some of the transition in
clinical data is that we do not have that experience and history. We do have
vendor centric systems. We have a lot of variability there. We have challenges
with the abstraction and inter-rater reliability of the data but until we have
more experience with that world, we still have the base to be able to pull our
patients to identify heart failure, patients that have had hip and knee
replacements, patients that had AMI —there are certain things we can pull
from that that make it easy to then say, oh, in that patient population this is
what we want to look at retrospectively for quality measures. The clinical data
is going to be critical, however, there is no question as we move to a better
system in electronic health records that can inform our care delivery,
hopefully we will have that panacea of clinical alert. We will know very
quickly this is a pneumonia patient they should have the antibiotic. This is a
patient who is eligible for a PCR(?) we need to get him in there and we need to
acclimate all the systems. That we won’t rely on human memory and human factors
to start some of those better processes going. We will have some alert that
when the patient cherkonan(?) assay is being elevated, what is critical and
what do we need to pay attention to. Those are the things that a good clinical
system can do. They help us deliver higher quality of care more reliably as
well as look at it retrospectively. Maybe it is just my experience, but I am a
little suspicious that it is going to take us a while to build a system that
will do that well that we can maintain and can use to retrospectively evaluate
quality and care.

DR. CARR: I will take my privilege of making the final comment before we ask
Mark Wynn to speak. What I am impressed with
from all of the speakers this morning is that independent of the precision of
getting at specific quality issues, the act of collecting and reporting data
has had a transformational change and is pointing us not just to electronic
health record, but to a new workforce that is sophisticated in IS as well as
clinical and coding and beyond to individual definitions and exclusions, and to
your point, geographical change in institutions. Then I think most compelling
is that this measurement is here for the long haul. You cannot study for the
exam. You can do it for a year, but you cannot sustain it unless you understand
it. I think that in terms of quality, coming out of our current state, this
transformational quality that you need Six Sigma and that you need to change
what you do and look more carefully at what you do. I think that is huge
quality impact.

We will move now to Mark Wynn. Thank you. Mark?

Agenda Item: Performance Measurement and Quality
Improvement – CMS

DR. WYNN: Thank you very much. I will briefly talk about what we have
learned in CMS about incentives for hospital quality and especially what we
have learned from our Premier demonstration that Denise was involved with and
where we are going from here. I will be talking a little bit about the hospital
pay for performance agenda, the Premier payment model. Some of the changes we
have made in that payment model which we have learned a great deal from and some of
the challenges and issues as we move forward.

First of all we have a very active demonstration program within CMS on the
Medicare side. Those demonstrations have included a number of trials of things
that are now rolled out into the program at large including the DRG system,
the initial Medicare managed care, our Critical Access Hospitals in rural
areas, and a number of other areas.

We currently have about 30 demonstrations operational with another dozen or
so under development, most of which has been required by law and which we are
busily implementing. We have a number of pay for performance demonstrations
ongoing, not only the Premier demonstration with hospital quality incentives, but
also a large physician group practice demonstration, the PGP
demonstration, and then a smaller physician group demonstration called the Medicare Care
Management Performance Demonstration. Another demonstration is in development on
nursing homes and home health quality. The Deficit Reduction Act of two years
ago requires a CMS report on how CMS would implement Medicare Pay for
Performance in the fiscal year 2009. We have been busily working on that
project with a good deal of attention, subcontractors and contractors support
and so forth. Tom Velick(?) is in charge of that. I am a member of that
committee and it is not entirely a coincidence that a number of the things that we
have learned in the Premier demonstration have been proposed for
implementation in the Medicare Pay For Performance Policy value-based purchasing
program. We expect to send a report to Congress, we hope in August of 2007,
depending on clearances and further developments.

In addition, we are continuing to work on closing out the Premier
Demonstration Phase 1 with data collection and verification and so forth, and
we have just started the second phase of the Premier demonstration. I will go
into some of the differences there in a moment.

The basic background issues on the Premier Hospital Quality Incentive
Demonstration. This is a demonstration with Premier Incorporated. We use
financial incentives to encourage hospitals to demonstrate high quality
inpatient care. We report quality measurement data on our CMS website for
roughly 260 hospitals in the demonstration. It is intended to test the impact
of quality incentives. The initial implementation was in October of 2003. The
performance period of the first phase of the demonstration ended in September of
2006. We just started phase 2 which is for another 3 years, which is 2006
through September of 2009. The initial phase of the demonstration, we are
scoring hospitals on quality measures related to each of the five clinical
conditions. We are rolling up the individual measures into overall scores for
each of those clinical conditions, scoring them to determine top performers by
each of the conditions and pay incentives for those conditions.

Our recognition and financial rewards system for the initial phase of the
demonstration is to give a two percent bonus for the hospitals in the top
decile for each of those five conditions and a one percent bonus for the
second decile. It is a single check in an annual bonus amount.

Here we are using the incentives in proportion to the Medicare payment
amounts. That is the one or two percent, for the basic DRG payment adjusted by
area wages but not including direct or indirect medical education,
disproportionate share and other add-ons to the system. We are looking at cases
defined by the principal diagnosis or procedure and if there is a distinction,
for example, somebody who has both that they are presenting with AMI, but they
have a CABG, the procedure will trump the diagnosis. We have some hospitals
that are not paid under the DRG system such as the hospitals in Maryland or the
Critical Access Hospitals. In those cases, we simulate what they would have
been paid under the DRG system and pay them under the same type of procedures.
All of the Medicare Fee for Service cases in the clinical category are included
in the payment base. There is sometimes a disconnect between these two so let
me briefly go over that. For example, all hip and knee cases are brought in but
not all of them are measured under clinical poling. For example, accidents,
secondary procedures, are excluded from the quality measures but all of the
Medicare beneficiaries who are in the Fee For Service and come in and get a hip
or knee replacement are included in the payment amount.

The steps we use to determine the payment amount is simply listing out the
ICD-9 codes for each of the clinical categories, running the data to determine
the payment amounts for all the Medicare discharges with those principal
diagnoses or procedures. We determine which are the highest quality hospitals
and then calculate the two percent or one percent amounts.
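
[Editorial illustration, not part of the testimony. The phase one bonus
arithmetic described above can be sketched roughly as follows. This is a
minimal sketch, assuming each hospital has a single composite quality score and
a known Medicare payment base for one clinical condition. The names Hospital,
quality_score, payment_base, and phase_one_bonuses are hypothetical, and tie
handling is ignored. It is not CMS's actual scoring code.]

```python
from dataclasses import dataclass


@dataclass
class Hospital:
    name: str             # hypothetical identifier
    quality_score: float  # composite score for one clinical condition
    payment_base: float   # Medicare DRG payments for that condition, in dollars


def phase_one_bonuses(hospitals):
    """Return {hospital name: bonus dollars} for one clinical condition.

    Assumed rule, taken from the testimony: the top decile of hospitals by
    quality score receives 2% of the payment base, the second decile 1%.
    """
    ranked = sorted(hospitals, key=lambda h: h.quality_score, reverse=True)
    n = len(ranked)
    bonuses = {}
    for rank, hospital in enumerate(ranked):
        # Integer comparisons avoid floating-point surprises at decile edges.
        if rank * 10 < n:          # top 10% of hospitals
            rate = 0.02
        elif rank * 10 < 2 * n:    # next 10% of hospitals
            rate = 0.01
        else:
            rate = 0.0
        bonuses[hospital.name] = round(hospital.payment_base * rate, 2)
    return bonuses


# Ten illustrative hospitals with made-up scores and a $1,000,000 base each.
example = [Hospital(f"H{i}", 100.0 - i, 1_000_000.0) for i in range(10)]
example_bonuses = phase_one_bonuses(example)
```

[In this made-up example, only the single best hospital falls in the top decile
and only the second best falls in the second decile; the other eight receive no
bonus.]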

Now one of the interesting things about this demonstration is that we do have a
penalty box performance area. That is our shorthand for saying that in the
3rd year, the quality score for each of the hospitals must exceed
the baseline lower two deciles. We set the clinical thresholds for the year one
scores at the lower 9th and 10th deciles and we insist
that the hospitals must exceed those performance scores in the 3rd
year of the demonstration. Otherwise, they have a penalty of one or two

Here is our anticipated scenario. In this chart, on the left-hand side you
see in the first year the 9th and 10th decile are
determined for one of these clinical areas, let's say AMI. You must exceed 80
quality points to exceed that 9th decile. Then we measure that. It
takes us several months. So, we are into the 2nd year by the time we
can report the clinical areas that define this 9th and
10th decile. We hope that all of the hospitals exceed this baseline
in the 3rd year and that there is nobody in that penalty box. If
there is anybody below that 9th or 10th decile in quality
score as measured in the first year, there is a penalty of one or two percent.
We have not done our final measurements there, but I think the glass is 95/96
percent full. There are very few hospitals that are potentially in that penalty
box. I think almost all of them have exceeded that and that is I think, good
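
[Editorial illustration, not part of the testimony. The penalty box rule
described above can be sketched as follows. This is a minimal sketch under
stated assumptions: the cutoffs are taken as the highest year-one scores inside
the bottom 10 percent and bottom 20 percent of hospitals, and the mapping of
two percent to the lowest decile and one percent to the next decile is an
assumption, since the testimony says only "one or two percent". The function
names are hypothetical.]

```python
def decile_cutoffs(year_one_scores):
    """Return (ninth_cutoff, tenth_cutoff) fixed from year-one scores.

    tenth_cutoff: best score among the bottom 10% of hospitals in year one.
    ninth_cutoff: best score among the bottom 20% of hospitals in year one.
    """
    ordered = sorted(year_one_scores)
    n = len(ordered)
    tenth_cutoff = ordered[max(n // 10 - 1, 0)]
    ninth_cutoff = ordered[max(n // 5 - 1, 0)]
    return ninth_cutoff, tenth_cutoff


def penalty_rate(year_three_score, ninth_cutoff, tenth_cutoff):
    """Penalty as a fraction of payments; a hospital must EXCEED each cutoff."""
    if year_three_score <= tenth_cutoff:
        return 0.02  # assumed: still within the year-one bottom decile
    if year_three_score <= ninth_cutoff:
        return 0.01  # assumed: still within the year-one second-lowest decile
    return 0.0


# Made-up year-one scores for ten hospitals: 50, 60, ..., 140.
ninth, tenth = decile_cutoffs([50 + 10 * i for i in range(10)])
```

[The key design point is that the cutoffs are frozen at year-one values, so a
hospital is compared against a fixed baseline rather than against its peers in
the third year.]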

Another thing about the demonstration is we have really seen quarterly
improvements. We have seen each quality score for AMI, CABG, pneumonia, that is
community acquired pneumonia, heart failure and hip and knee replacement. As
you can see, in each of the quarters here reported for each of the 5
categories, we see continued quality improvement. I think that is very good
news. I would argue, on a basis of research, we have seen quality improvement
not only in these measured areas but also in these areas that have not been measured.
There is this spillover effect throughout the hospital and to continued quality
improvement; we are not just teaching to the test.

Another thing that we have seen that I think is very good news is that all
the hospital categories have improved. As you might expect from quality
improvement literature, we have seen reduction in the variance, improvement in
the mean score, improvement in the bottom scores and improvement in the top
scores. We think it has been a very successful project so far.

The first year and second year of the demonstration, we have distributed a
little bit less than $9 million to about 123 hospitals in the first year, 115
hospitals in the second year. These top performers have represented large and
small facilities across the country.

One of the things that we do in our demonstrations is formal analyses. We
think that is very important to learning what we can learn in transferring what
we have learned to the rest of the program. In addition, a lot of what we learn
is informal, ongoing implementation seat of the pants type of learning. Despite
the fact that we have not completed our formal and objective evaluations, we
did learn that there were some opportunities for improvement in our incentive
types of policies. So, in the second phase of the demonstration, we changed the
incentive policies a little bit. We think it is going to be fair. It is going
to be broader distribution. It is going to do a little bit better moving from
what you might call a tournament model to something which is a little bit more
broad based and especially one that encourages improvement of all hospitals,
including some of those that were below the average hospital in the

Therefore, we have a 3-part payment system in the second phase of the
demonstration. First of all, we are going to be paying incentives to all
hospitals that exceed the baseline mean as defined a few years earlier. We are
reserving 40 percent of our funds for that. Secondly, similar to what we have
right now, we are going to be paying for high attainment. That is the top 20
percent attainers will get a bonus. Thirdly, objectively speaking, you really
get more bang for the buck if you are able to take some mediocre hospital and
raise them from the 10th percentile to the 55th
percentile. That is an enormous change. That is compared to taking what we were
talking about in Hackensack or some other hospital that has very good quality
scores in the past and changing them from the 95th to the
96th percentile. That is good. Hats off to them. It is really great.
We want to encourage that but really you get more improvement in the total
system if you include and encourage those mediocre performers to move up. I
have visited some of those hospitals and I have to tell you that some of them
have done a very good job in really getting behind this and using this not only
looking for the money, frankly, that is a minor part of it but using this as a
focus and ability to improve quality.

What sorts of lessons have we learned? Well, Pay For Performance can work.
It provides focus and incentives to improve quality and we have seen
substantial quality improvement in the hospital area. Secondly, we think it is
inevitable. It is coming. We want to prepare ourselves to move forward in this
area. Thirdly, very modest dollars can have big impacts. We have substantially
less than one percent of the payments these hospitals are going into. We are
seeing big impacts just from minor dollars. We have seen continued improvement
overall in the 2nd and 3rd and 4th years of
the demonstrations. As Denise has said, there are some difficulties, but we have
seen continued improvement overall. Finally, I would argue that the precise
methodologies are somewhat important. Frankly, it is less important than
getting the signal out there in terms of the choices of measures and overall
perception of fairness. The exact role of percentiles and so forth may be less
important than just doing the demonstration and getting it started.

We have a number of challenges to Pay For Performance. I am not going to
take the time to go into all of it. I just want to acknowledge that we are very
aware that there are operational issues, financial issues, scoring issues,
measure selection issues, and lots and lots of issues as we work through these
policies and it will take continued work both from the hospital quality
measurement community, from CMS and from all of the stakeholders to move forward in
this quality area. Thank you very much for the opportunity to get to talk to
you this morning.

DR. CARR: Thank you. That was very efficiently presented and very
informative. Again, it goes back to this theme that having the measures is
creating improvement within that measure, but transformational spillover into
other areas. I think that is impressive. Do I have any questions?

DR. W. SCANLON: I think in terms of what we have heard this morning, this
whole issue of there are so many different requests for information that that
becomes problematic and then becomes more problematic as that shifts over time.
I guess I am wondering about where you are in terms of thinking about changing
measures. I do not know whether they have changed at all in phase two or
whether you deliberately avoided that with the issue of retiring measures,
which has come up, when you reach a peak in terms of performance. Should a
measure be retired or should it be retained? Then there is the second issue,
which is something important from a payer’s perspective, which is the integrity
of the data, and what kinds of investments need to be made to assure that one
has a good, reliable information.

DR. WYNN: Thanks, Bill. A couple of things here. First of all, yes, we are
adding measures in the second phase to the demonstration. We are looking at a
number of these measures right now. Some of those areas include HCAHPS as
a clinical experience from questionnaires of patients. I think that is really
good to include and we are going to start to roll that out in a few months for
all of the hospitals and we are looking at how we can include it next year in
the demonstration. I think there is a lot of interest in including it in the
Medicare systemwide policy proposal as well.

Some of the reasons I think that is good is that there are some very
difficult to measure areas that I wish we could just look at, say, proficiency
of discharge planning. At Acme Hospital it is 92 points and at St. Mary’s it is
95 points. Well, that sort of a system doesn’t exist. So this is at least a way
to get at some of those very important issues regarding discharge planning,
information that is given to the patient, patient perception, and quality of
care which is so important, not just from perception, but also the measurement
of these objective areas that are hard to measure otherwise.

In terms of the retirement of measures that peaked out, that is a continuing
discussion. I do not know if there is a single answer to that. I personally
think it is a good idea to retain some of these measures that are relatively
high. The reason for that is that it retains continued attention to these
measures. Some of these measures, I have been told by some of my physician
friends, are cookbook medicine. Giving an aspirin to a patient who comes in
with potential AMI. It may be a cookbook, but if you look back a couple of
years to some of the work that came out of the QIO programs, reported by Steve
Jenks(?) about 3 or 4 years ago, I saw shockingly low attention to doing these
simple and cookbook types of things. A continued attention, especially if it is
relatively easy to measure, seems to me to be a good idea. We are back and
forth on that, and I know there is some tension on retiring some of these

In terms of where we go in the future, we will have a report hopefully
published and sent to Congress – we are hoping in August. We never know
about clearances. We are hopeful that will go through because there is a lot of
support for it. I think that even if it does not go through instantaneously,
sometimes it takes a little longer to get these policies put into law and
moving forward. We will be working on a continuing basis in terms of expanding
our pay for performance demonstrations, testing new measures, and looking at
adjustments and potential policies from which we can learn for future
nationwide policies.

DR. CARR: Just to add on to that, has there been any thought to putting —
if you are a high achieving hospital and you report less frequently is it
almost — given the cost we heard this morning of the burden of data
collection. I think it is a kind of incentive if the high performing hospitals
could report less frequently?

DR. WYNN: We have looked into a number of these areas and sampling is part of
it, too, but even using sampling procedures,
you still have a very high data burden, even if you are having 400 cases in a
large hospital. Well, most hospitals do not have 400 cases in a given clinical
area. So, it takes a pretty large hospital to get to that point where you are
sampling. In terms of frequency, no we have not been looking at that — I
am not quite sure how to do that and still hold them to the same standards that
other hospitals are in the system.

Quality of data has been quite good. We have been doing random sample
evaluations for the data, the inter-rater reliability has been over 90 percent
consistently despite the challenges of working with paper records that Denise
talked about a moment ago. We have heard very few reports of problems in this
area. It is a challenge that I think we have been able to meet. We do have some
proposals of how you can do data verification on a more targeted basis, perhaps
some random sampling and some more focused sampling for those hospitals that
have shown rapid improvements or aberrant data. Those are under continuing
discussion. We do not have a published plan on that yet.

DR. W. SCANLON: This issue of giving some relief to a higher performer. I
am afraid that Denise’s testimony indicated how fragile high performance may

DR. CARR: It was defined year over year.

DR. W. SCANLON: Right, yes. Changes not measures may be something that
triggers a significant change in the performance.

DR. CARR: Carol?

MS. MC CALL: Congratulations on some early success on what I think is
something that is pretty exciting. I have a couple of very specific questions.
You had on your last slide outlined a number of challenges. I was assuming
in those challenges a number of plans to address as many as possible. So, my
question for you is a two-part question really. Some of these things will
happen naturally and some of them won’t. And some of them hold greater
opportunities than others. If you have to pick just three things out of there,
which do you think won’t happen naturally and therefore you need our support to
remove barriers? Where can we help, and specifically, how? Second, which hold
the greatest opportunity for synergy with something else that if we could
somehow – it could happen naturally – but if we could amplify it
would be greater than if we did not?

DR. WYNN: That is a tough couple of questions. In terms of what is going to
happen naturally, nothing happens naturally. It takes continued effort on every
single one of these issues. In terms of where we need to work in terms of
synergy, I think that to me one of the areas that the government has a special
role to play is the standardization of the measures in the reporting systems.
Certainly, CMS is well aware of that and every time there are differing
measures, there is an attempt to precisely standardize that. There is nothing
that drives a coder crazier than these minor changes as much as a comma or a
point or something like that. That really speaks to the continued need for
standardization and I know there is an enormous amount going on in that area. I
think that is a particular area, both standardization of measures and the
standardization of the reporting system and the incentive system. That is, we
do have a responsibility as the largest payer for health services to lead the
way here and, to the extent possible, to lead in such a way that private
employers, anyone else, any of the other stakeholders can join with us, not
mandated to but can if they wish to join the train and move in the same
direction and incentivize in much the same way for their own programs.

MS. MC CALL: Thank you very much.

DR. GREEN: Mark, I am wondering if you could teach me something about the
long-established persistent pattern in Medicare beneficiaries of expenditures
being concentrated in a relatively small percentage. It looks to me as if you
have enough demos in your package that you might know what the impact of this
quality measurement is on that subgroup of beneficiaries. Can you tell us
anything about that?

DR. WYNN: Well, certainly there is an 80/20 rule that works in spades in
health care. Everybody knows that. So far, number one, in some cases quality
measurement and quality improvement may not do a lot of good. In some cases,
these are patients that are so frail and so much in need of health care that
even the best types of preventive care, quality improvements, and so forth may
have marginal impacts on them, at least in many clinical categories.

Another thing that we have learned is that – unfortunately this is
especially true for the Medicare beneficiaries. In many cases these folks are
so frail, hard to reach, hard to engage, and they have not only mental
impairments that it is hard to engage them in the types of case management that
may do them the most good. It is relatively straightforward to do it in an
employed(?) population. A third thing that we have learned is that 20 percent
does not stay put. What you see in one year is not what you see in the next
year. There is a lot of regression to the mean. What I am afraid a lot
of folks do not quite understand is that the extreme groups in time one,
in most cases, in the measurement of almost any type of phenomenon, turn out to be
less extreme in time two. In order to really measure what you are doing for
these folks in terms of quality services, support, that they reduce future
expenditures. You cannot just do a simple time series analysis. You really have
to do a demonstration that has random assignment of control groups and some of
that is hard to do. Sometimes the answers do not turn out as well as they seem
to if you don’t do that kind of control group experiment. Those are some of the
things we learned. It is a hard nut to crack.
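
[Editorial illustration, not part of the testimony. The regression to the mean
effect described above can be demonstrated with a small simulation. This is a
minimal sketch with invented numbers: each beneficiary has a stable baseline
cost plus independent year-to-year noise, and nobody receives any intervention.
The distribution parameters and the function name simulate_top_group are
assumptions chosen only for illustration.]

```python
import random
import statistics


def simulate_top_group(n=10_000, seed=0):
    """Return (year-one mean, year-two mean) for the top 20% of year-one spenders."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        baseline = rng.gauss(100.0, 20.0)            # stable part of each person's cost
        year_one = baseline + rng.gauss(0.0, 30.0)   # year-specific noise
        year_two = baseline + rng.gauss(0.0, 30.0)   # fresh, independent noise
        people.append((year_one, year_two))

    # Select the "high cost" group using year-one spending only.
    people.sort(key=lambda p: p[0], reverse=True)
    top = people[: n // 5]

    mean_year_one = statistics.mean(p[0] for p in top)
    mean_year_two = statistics.mean(p[1] for p in top)
    return mean_year_one, mean_year_two


top_year_one, top_year_two = simulate_top_group()
```

[Even with no program at all, the group selected for high spending in year one
shows lower average spending in year two, because part of what put them in the
top group was one-time noise. A simple before-and-after comparison would
wrongly credit a program with that drop, which is why a randomly assigned
control group is needed.]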

DR. GREEN: Thank you very much.

DR. CARR: I would like to thank our first panel of speakers. It was
incredibly informative and you answered our questions and you made it very
understandable. It really set the stage for the questions that we want to ask
so thank you very much. We are going to break now. It is 10:20 and we will
break until 10:35. We will be five minutes behind, but hopefully not too much
more than that.

[Brief recess.]

DR. CARR: Okay I think it is time to regroup and take your seats. Okay,
welcoming Dr. Tang who is also doing double duty with the privacy hearing along
with Dr. Cohn. So, thanks both for coming up. We will start off now with Dick
Johannes about the hybrid clinical data model.

Agenda Item: Performance Measurement and Quality

DR. JOHANNES: Thank you very much and good morning. My name is Dick
Johannes. I am a clinical gastroenterologist with a master’s level training in
computer science. I have had academic appointments at Johns Hopkins and still
hold one at Harvard Medical School where I still practice gastroenterology. But
80 percent of my effort goes into supporting an outcomes research group and
clinical research group within Cardinal Health that supports quality
initiatives in all hospitals in the State of Pennsylvania as well as many
hospitals beyond Pennsylvania. My comments this morning are going to be
principally related to inpatient data and reporting at the hospital level.

Shortly before his death in 1934, William H. Welch, who was one of the
founders of the Johns Hopkins School of Medicine and the only one to live long
enough to be recorded on film, credited his success on that film to the
intersection of good fortune and preparedness. That somewhat familiar twist
describes how we at Cardinal Health Medical actually have come to have data
that is relevant to today’s discussion.

In the early 1980s, several meetings occurred between the founder of
MediQual, Donald Fetterolf, who was then the Chief Medical Officer at Highmark
Blue Cross Blue Shield, and Ernie Sessa, who founded the National Association of
Health Data Organizations and was to become the first Executive Director of the
Pennsylvania Health Care Cost Containment Council or PHC4.

All three of these people shared a belief, which was then only a belief,
that validity and precision of adjustment models used for public reporting,
depended on clinical data that went beyond claims data. PHC4 acted on this
belief. Pennsylvania is the only state to perform uninterrupted public
reporting of hospital performance for over 20 years across more than 50 medical
conditions. Cardinal Health has supported the data definition, data collection,
and risk adjustment methodology over the entire time. Through this pathway, we
have had data on all discharges from roughly 190 Pennsylvania hospitals with
the standard UB-92 data, but coupled with laboratory results, vital signs,
and selected abstracted data elements such as the presence or absence of
pleural effusions, assessment of level of consciousness, or left
ventricular ejection fraction. It is worth noting that both the Fine
pneumonia severity index and the AHRQ failure to rescue metric were developed
using these data.

I will try to cover four central themes during my time this morning. First,
the timing of clinical data is important to admission-based severity
stratification. Second, laboratory data and to a lesser extent vital signs are
highly objective and powerful predictors. Third, laboratory data is becoming
widely available in electronic format. Finally, the influence of clinical data
on face validity should not be underestimated.

Turning to the clinical data and timing, since vital signs and laboratory
data are both time stamped, they provide better identification of risk in the
peri-admission period. They free risk adjustment models from the criticism that
late hospital events are used in the adjustment process and hence make it
impossible to separate complications from comorbidities. The issue is one of
being able to discern not just whether an event is in an ICD-9 code but when,
because if that ICD-9 code were also related to mortality, then the C statistic
could actually improve. This is why I think the methods describing this need to
move away from the battle of C statistics. We need to continue to pursue
methods that identify mortality attributable to patient admissions.

One new approach already mentioned on the near horizon is the use of
present on admission coding or POA Coding to identify whether secondary
diagnoses were or were not present at the time of hospital admission. While an
important addition, some pause is warranted. Recall that for a POA flag to be
added, the code must first be collected.

Let us examine how this might work for a situation such as hyponatremia.
Hyponatremia, or low serum sodium, is defined as a sodium level below 135
milliequivalents per liter. There is an ICD-9 code for it, and if used it can
actually advance cases into higher-level DRGs for reimbursement. Hence, there
is actually an incentive for getting it right. These data show the results from
578,878 cases from 83 hospitals reporting data electronically over the years
2002-2004. We asked the question, what fraction of cases with
laboratory-documented hyponatremia on admission also carry a secondary
diagnosis of hyponatremia? The answers run down that column.

As can be seen, the sensitivity is only 11.8 percent and only improves
marginally when looking across the full length of the hospital stay. Nearly
nine of ten cases lack a secondary diagnosis for which to affix a POA code.
Since patients with abnormal laboratory results often have them repeated as
their clinical course unfolds, we have also looked at cases where the diagnosis
has been repeatedly confirmed by laboratory data and again asked, what is the
detection rate? Sensitivity for hyponatremia only rose to 30 percent when
more than ten determinations of this abnormality were in fact known. Similar
results can be seen on this slide for a variety of other laboratory findings.
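
[Editor's illustration: the sensitivity calculation described above can be
sketched as follows. This is a minimal sketch, not the speaker's actual code.
The record layout and field names are hypothetical, the four sample cases are
made up, and 276.1 is used as the ICD-9-CM code for hyponatremia. The laboratory
value is treated as the reference standard and the coded secondary diagnosis as
the test being evaluated.]

```python
HYPONATREMIA_THRESHOLD = 135.0   # serum sodium, mEq/L
HYPONATREMIA_ICD9 = "276.1"      # ICD-9-CM: hyposmolality and/or hyponatremia


def coding_sensitivity(cases, threshold=HYPONATREMIA_THRESHOLD,
                       code=HYPONATREMIA_ICD9):
    """Fraction of lab-documented cases that also carry the diagnosis code.

    `cases` is a list of dicts with two hypothetical fields:
      "admission_sodium"    - first sodium value on admission (mEq/L)
      "secondary_diagnoses" - set of ICD-9-CM codes on the claim
    Returns None when no case meets the laboratory definition.
    """
    lab_positive = [c for c in cases if c["admission_sodium"] < threshold]
    if not lab_positive:
        return None
    coded = sum(1 for c in lab_positive if code in c["secondary_diagnoses"])
    return coded / len(lab_positive)


# Made-up example: four lab-documented cases, only one of them coded.
example_cases = [
    {"admission_sodium": 128.0, "secondary_diagnoses": {"276.1", "428.0"}},
    {"admission_sodium": 131.0, "secondary_diagnoses": {"428.0"}},
    {"admission_sodium": 133.5, "secondary_diagnoses": set()},
    {"admission_sodium": 126.0, "secondary_diagnoses": {"486"}},
    {"admission_sodium": 140.0, "secondary_diagnoses": {"486"}},
]

if __name__ == "__main__":
    print(coding_sensitivity(example_cases))
```

[A case with no secondary diagnosis code has nothing to attach a POA flag to,
which is the limitation the speaker is pointing to.]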

Let me now turn to which laboratory studies our group has relied on over the
years. Two-hundred-thirty laboratory studies are represented in this table.
They are all common tests that, with the exception of cardiac enzymes and blood
gases, are collected in over 90 percent of inpatient admissions, which means
that the issue of missing data is minimized. At present we have used these data
primarily to examine clinical status on admission. However, using automated
laboratory data collection, there are opportunities to examine changes during
the hospital stay. For example, tracking creatinine longitudinally could
provide insight into renal function throughout the hospital course.

Having now looked at which laboratory values we have found useful, I wanted
again to ask the question, what is the predictive power of laboratory values?
Whereas a particular ICD-9 code tells you whether something such as renal
failure was or was not present, laboratory data also scale with severity. One of
the steps in constructing our risk adjustment models entailed examining
potential variables in a univariate manner prior to testing them in
multivariate manner.

Here are the results for one such analyte, serum albumin. This is 2003 data
on roughly a million patients from 218 hospitals and plots serum albumin levels
against mortality. The dotted blue line shows the normal range and the dotted
red lines are the cutoff points used to transform the continuous data into five
discrete ranges. As you can see, there is markedly elevated mortality risk once
the albumin falls below the normal level, rising to nearly 17 percent for
albumins lower than 2.5 grams per deciliter. To help put this into perspective,
the in-hospital mortality risks for several diseases – heart failure, myocardial
infarction and sepsis – are shown as reference.
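
[Editor's illustration: the univariate step described above, turning a
continuous laboratory value into discrete ranges and looking at mortality in
each range, can be sketched as follows. This is a sketch under stated
assumptions. Only the 2.5 g/dL cutoff is mentioned in the testimony; the other
cut points and the sample records are hypothetical.]

```python
from bisect import bisect_right

# Four cut points give five ranges. Only 2.5 g/dL comes from the testimony;
# the remaining values are placeholders for illustration.
ALBUMIN_CUTOFFS = [2.5, 3.0, 3.5, 5.0]


def bin_index(value, cutoffs=ALBUMIN_CUTOFFS):
    """Index of the range a value falls in (0 = below the lowest cutoff)."""
    return bisect_right(cutoffs, value)


def mortality_by_bin(records, cutoffs=ALBUMIN_CUTOFFS):
    """Observed mortality rate per range.

    `records` is an iterable of (albumin_g_per_dl, died) pairs.
    Returns a list with one rate per range, or None for an empty range.
    """
    totals = [0] * (len(cutoffs) + 1)
    deaths = [0] * (len(cutoffs) + 1)
    for albumin, died in records:
        i = bin_index(albumin, cutoffs)
        totals[i] += 1
        if died:
            deaths[i] += 1
    return [d / t if t else None for d, t in zip(deaths, totals)]


if __name__ == "__main__":
    sample = [(2.0, True), (2.2, False), (4.0, False), (4.2, False)]
    print(mortality_by_bin(sample))
```

[The per-range rates are what a plot like the one described would show, and the
range index can then enter a multivariate model as a categorical variable.]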

Having now begun to define curves like this one, we are moving on to do it
for a specific disease. That is, does a creatinine of 3.5 carry the same risk
in a patient with chronic renal failure who has had it for three years as
someone with acute renal failure who has had it for three days? There are
strong arguments for using disease-specific cutoffs in these laboratory values.

Lisa Iezzoni is a major contributor in this area and used the term
“dimensions of risk” to characterize the various classes of data used
in the risk adjustment process, and I borrowed that term and concept from her.
This can be shown in this stack of cylinders, each of which represents data of
a different type. As you go up that chain, the cost and difficulty of collecting
the data and collecting it accurately clearly increase. It is often thought of
as a relatively thin slice of added clinical data atop a larger base of claims
data. We will come back to that formulation near the end of my comments.

There has been a recent rekindling of interest in attempts to quantify the
benefits of clinical data. A series of studies that are the result of an AHRQ
sponsored contract led by Dr. Anne Elixhauser, who is with us today, is making
this occur. The contract was let to Abt Associates with a subcontract to
Michael Pine and Associates. The first results were reported in December
publicly at the annual NAHDO meetings and two publications appeared, one in the
American Surgeon and the other in the Journal of the American Medical
Association at the beginning of this year.

What was done in these studies? To begin to appreciate this literature, it
is important at least at some level to understand the design. Three years of
data was used across 2000 and 2003. Multiple models were developed for
several disease groups, and eight conditions were examined. Five were medical
and three were surgical: myocardial infarction, congestive heart failure,
stroke, gastrointestinal hemorrhage, pneumonia, and on the surgical side
abdominal aneurysm repair, coronary artery bypass surgery, and craniotomy.
Models were built up progressively, forming a family of models beginning with
age only, then standard administrative data, then administrative data that was
complemented by an imputed POA flag created from clinical data, which in my mind
may represent
an upper bound for that performance. Then laboratory data, vital signs,
clinical elements, and compound clinical elements such as the level of
consciousness and left ventricular ejection fraction were progressively added.

An example of the results can be seen in this slide which were presented at
the NAHDO meetings. Despite its deceptively simple appearance, I find this slide
usually takes some time to understand. The goal was to examine the degree to
which systematic bias associated with inadequate risk adjustment could
result in misclassifying hospitals. Since hospitals are often compared with
standardized rather than absolute differences, the observed-to-expected
differences for each model level against the gold standard are shown in
terms of standard deviation units. The Y axis represents the percentage of
hospitals that would exceed any selected upper boundary of standard deviation.
It could be thought of as fraction of hospitals that are subject to
misclassification as a result of systematic bias in the models.
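
[Editor's illustration: one reading of the slide described above is that, for
each model level, every hospital has a standardized difference between its
result under that model and its result under the gold standard, and the Y axis
is the share of hospitals whose difference exceeds a chosen number of standard
deviations. The sketch below follows that reading only. It is not the study's
published method, and the numbers are made up.]

```python
def fraction_exceeding(standardized_differences, threshold_sd):
    """Share of hospitals whose shift from the gold standard exceeds a bound.

    `standardized_differences` holds one value per hospital: the difference
    between the hospital's standardized result under the simpler model and
    under the gold-standard model, in standard deviation units.
    """
    if not standardized_differences:
        raise ValueError("need at least one hospital")
    count = sum(1 for d in standardized_differences if abs(d) > threshold_sd)
    return count / len(standardized_differences)


# Made-up shifts for four hypothetical hospitals.
example_shifts = [0.2, -1.5, 0.9, 3.2]

if __name__ == "__main__":
    for bound in (1.0, 2.0, 3.0):
        print(bound, fraction_exceeding(example_shifts, bound))
```

[Evaluating this at several bounds for each model level gives one curve per
model, which is the form of the chart the speaker walks through next.]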

For example, consider what is meant at the three standard deviation level.
This would represent a hospital that was actually one standard deviation from
the mean as a positive outlier and moved three full standard deviations in the
other direction, to become a negative outlier. Even raw, unadjusted data can
handle that.

However, as you move that difference back to smaller differences, particularly
toward one standard deviation, where a hospital is in fact at the cusp of the
inter-quartile range, you can see that to keep the bias below a 10 percent
level, we need to add the green line or a laboratory value, to get it down from
unadjusted age, administrative, administrative plus DRA(?), vital signs for

It is also clear that if you wanted to keep it below 10 percent at any level
of standard deviations you would have to add vital signs as well. Full clinical
models perform across the board at all levels. The question here is no longer
one of whether they work but how practical it is, and Dr. Elixhauser will speak
to this point I think at some point.

Since my group assisted in the generation of this dataset, but we did not know
how the study would take form, we chose to ask a separate and somewhat different
question. Namely, what are the relative values of the various blocks of data?
We collaborated with Dr. Jeffrey Silber at the University of Pennsylvania, who
was instrumental in introducing us to a method called the Omega Statistic,
which measures the relative contribution of two different groups of explanatory
variables to the overall power of the prediction. This is of interest because it
could be argued that, rather than adding clinical data to claims data, one might
ask, why not maximally use the objective and least gameable laboratory data
first and add other data elements above it. We studied six conditions,
ischemic and hemorrhagic stroke, congestive heart failure, pneumonia and sepsis,
and there is considerable overlap with the Pine work. Hierarchical
models using random intercept logistic regression were constructed and the
relative contributions of the various blocks were compared, using this omega
statistic. This work has been accepted to Medical Care and will appear in the
August issue.

You can see the results here. If we compare the relative contribution of the
laboratory data to ICD-9 you get 7, 2.6, 3.6, 57, 14 and 8. Interestingly
enough, the vital signs across the board, usually around a level of three, also
seems to hold true for situations where one might expect vital signs to move
values such as cardiac conditions. One of the last elements, which is difficult
to obtain, assessment of mental status, is terribly important for neurologic
situations, where those events carry the day.

One way to look at this is that these results should not be too
surprising. Laboratory data are commonly used clinically because they support
robust estimation of the function of critical organ systems. For example, the
kidney is typically evaluated by BUN and creatinine and to a lesser extent
sodium and potassium, and the liver by standard liver function tests. Where do
lab values
need some help? The answer is perhaps in the heart and brain. BNP and Cardiac
enzymes are valuable. A lot of clinicians would argue that a few changes might
occur for impact from that time and neurologic issues of laboratory data are
really quite limited.

With all this said, let’s go back to my cylinder, using congestive heart
failure as an example. This shows what one of our actual models looked like in
terms of the distribution of the data that are actually used. As you can see,
it is really quite clinical. The distribution tends to be a large number of
laboratory values with smaller contributions coming from the other three

One final point before I summarize: there is distrust amongst physicians,
rightly or wrongly, for results that derive exclusively from claims data. By
providing models with greater transparency and consistency, we may be able to
do a better job of recruiting the full clinical community into the quality
agenda. This has certainly been true in Pennsylvania where the Pennsylvania
Medical Society is an outspoken supporter of the public reporting done in
Pennsylvania and largely because of the coupled method of using claims data in
combination with clinical data.

In conclusion, clinical data has several advantages. They are objective,
precise, time stamped, they suffer from few missing data, they are not
susceptible to gaming, they are easily verified in the medical literature, much
accepted by the clinical community, and they have a tremendous opportunity
particularly in terms of the laboratory data for automated data collection. In
my mind, at least for data collection, we should be moving the discussion
from a question of whether to a question of when and how. Thank you.

DR. CARR: Thank you very much. Questions? Go ahead. Well, I wanted to ask on
following up on Mark Wynn’s testimony where he talked about the 80-20 rule and
the 20 percent of the frail — how would this synergize or be applied to the
CMS measures or have you done that?

DR. JOHANNES: We have not at this time. We would love to do a comparison of
several of the methods to look at just that question. The low albumins in
congestive failure would represent cardiac cachexia and that puts them at one
far end of the severity spectrum. It is one of the reasons that I think the lab
data are so helpful in identifying the particular type of patient that you are
trying to severity stratify.

DR. CARR: In terms of the issue you raised about physician buy-in, can you
say a little bit more about that in terms of having this clinical data added to
the claims data and what that response is.

DR. JOHANNES: As you might imagine, the hospitals in Pennsylvania, since they
are publicly reported annually across a variety of these diseases, if they
happen to fall into an outlier range, particularly on the negative side,
are quick to contact PHC4 and ourselves and first go through that whole process
of questioning the data, questioning the method, questioning the coding, and
then eventually getting to a point where they do or do not recognize that they
may have a care issue and effect a change. It is much easier to get the
clinical group on board, if they argue that the cases were or were not sicker
than their neighboring hospital, to show them hard data regarding the
proportion that had renal failure and those that do not and how that goes
compared with their peers. It just makes that part of the argument go smoother.

DR. CARR: Just one other question before we get to Carol is that you have
been collecting for 20 years and were there data elements that you were
collecting initially that you ultimately excluded?

DR. JOHANNES: That is a great question. Thank you for asking, Justine. I
almost put some comments about that into this presentation. The answer is
unquestionably yes. Not only some, but many, which is what surprised me so much
that people organizing things such as core measure data went to what we
consider the most difficult data to collect, KCF data first.

DR. CARR: What is KCF?

DR. JOHANNES: Key Clinical Finding elements such as ejection fraction,
something that really requires chart extraction. Those are the ones
where maintenance of the glossary and hospital-to-hospital inter-rater
reliability, which we measure in Pennsylvania, are difficult to do. The clinical
elements that I believe are closest to being ready for widespread
use are the laboratory data. They do not share some of those problems that
others do. Yes, in 1996 all of the data was collected
through abstraction, and there were almost 200 data abstraction
elements. We are planning to get that down. Say vital signs represent
four or five, if you call blood pressure two numbers. We are probably going to
take that down to no more than five others. It used to be, when I got there, it
was 210. Five years ago it was down to 67. I think we will take it down to
somewhere around 10.

DR. CARR: Thank you. Carol?

MS. MC CALL: Thank you for sharing the information. It was absolutely
fascinating. I for one, being kind of a data dog, could spend all afternoon
learning about things like that. I will try to hold back. A couple of
questions. A lot of this was focused around the topic of risk adjustment and
getting things apples to apples. My question would be, are there additional
uses outside of or in addition to performance measurement where there could be
intermediate measures related to quality or evidence-based medicine where the
measure itself is used for more than making sure that they are comparable but
can be used to make decisions or provide clinical guidance to a physician? Can
you talk about some of these?

DR. JOHANNES: I can tell you a couple of them. We have been examining a
number of disease states to look at what could be more bedside methods that you
could use to affect stratification to affect care as well as on the backend
than using it to do observed and expected ratios for comparison. We have been
looking heavily at pancreatic disease and that area, and we just presented our
results at the spring HEA meetings where we came up with a 5-element approach
that we call BISAP. It is a wonderful double entendre because it stands for
bedside index of severity in acute pancreatitis, but it also means BUN,
impaired mental status, serum albumin, advanced age and pulmonary findings(?).
That is just easy to remember.

Using that, each element is scored 0 or 1, and the total is a score between 0
and 5 and used in that manner. In addition, we have used data to answer a
variety of clinical questions surrounding quality over the years. I mentioned
two of them. Another one we had looked at is the relationship between
hemoconcentration and mortality in a variety of diseases. Once you have these
data, there are a variety of clinical questions you could ask beyond immediate
quality measures.

MS. MC CALL: I know a lot of this was around inpatient. How much would
extend to an ambulatory setting? Part two of the question is, can you get into
a frame where with a person or a patient that becomes kind of a know your
numbers challenge?

DR. JOHANNES: Well, for people with chronic diseases, I think if you have
been educating them well, they do know their numbers. The idea that you can get
people closer to these numbers I think is a very sound idea. I think it is

DR. TANG: Thank you so much for the presentation.

It was very cogent and well presented and compelling. I do have one quick
question. If you could enumerate the ten you think you are going to, or the 5
additional besides the vital signs you are going to, hold up. The other is,
recognizing that LVEF or cardiac dysfunction is so central to a lot of the
cardiac measures, is there some other substitute for that which is right now
uncoded and not easy to abstract from electronic systems?

DR. JOHANNES: I will try to answer both of those.

First the five. I think we would keep evidence of level of consciousness at
the time of admission. We would still collect left ventricular ejection
fraction, to answer your question, because I do not think there is a substitute.
I do not even think BNP is yet a totally good substitute even in congestive
heart failure. The presence of ascites and the presence of certain drugs in
ambulatory settings, particularly immunosuppressives, anticoagulants, insulin for

DR. TANG: Did you answer Carol’s question about how this would apply to the

DR. JOHANNES: I will answer that question. I think that is a reach. I think
vital signs in any of the clinical data in today’s world, collected broadly,
are a reach. I think it needs to move in that direction. I think there are
great opportunities there. I have not seen any data, and I have very limited
data in the ambulatory setting, to be able to answer that. I am now becoming
more interested in looking at laboratory data as a function of time through the
hospital stay in an effort to understand a potential complications mid stay. I
must say I still remain heavily focused on the inpatient side.

DR. CARR: Bill?

DR. W. SCANLON: I think this is fantastic in terms of the potential here.
Where I am interested in going is whether we are drawing an artificial
distinction between administrative and clinical data and whether we should be
redefining the administrative data to include some of these clinical measures
as long as they are not ones that you have to abstract or ones that are subject
to a lot of variability. It seems that the power that is there is something
that we should not be forgoing. I guess I am wondering, if this was a proposal
for hospitals you are dealing with where we were going to require this, how would
that be received? Maybe you have to think back in time because these hospitals
you deal with mostly have become accustomed to reporting.

DR. JOHANNES: I think if you constrain it to the laboratory data, you will
have a substantially different discussion. I fully agree. I actually hate the
term administrative versus clinical data. I think they are all clinical. It is
a question of what information is in each of those buckets that the other
bucket does not have. In no way would I argue that a purely physiologic measure
of severity would supersede one that was also coupled with classical
administrative ICD-9 data.

I think that ICD-9 can and should continue to be probed, but I would argue
that I would rather see more time going into improving understanding how to
recognize lower gastrointestinal hemorrhage from different causes and get
those codes right than try to code hyponatremia. Given that it is not coded
well, I think that the pendulum may be tipping to the point where attainment of
the lab data is less expensive in both time and money and accuracy than
improvement of the ICD-9 codes.

DR. CARR: Thank you very much. With Bruce’s permission, we are going to
switch the order and ask Anne to present now because of the synchrony with
Dick’s presentation. So thank you Bruce and thank you Anne.

Agenda Item: Performance Measurement and Quality
Improvement – AHRQ

MS. ELIXHAUSER: Okay, thank you very much. My name is Anne Elixhauser and I
am a Senior Research Scientist at the Agency for Healthcare Research and
Quality. I am going to be talking about an initiative that we have been working
on for the past several years on improving the value of administrative data,
specifically for the purpose of reporting quality of care at the hospital

Here is what I want to be talking about for the next 20 minutes. I will
first provide you with some background on the issues, and then I am going to
summarize a research study that was conducted by Michael Pine who was also
referred to by Dr. Johannes earlier. The study was conducted by Michael
Pine and his staff at Abt Associates. I was the AHRQ lead on that, but Dr. Pine
was the principal investigator. Then I am going to describe where we are going
next to actually implement the results of these research studies.

Hospital administrative data as we have talked about or hospital bills or
claims data are available for a near census of hospital discharges from
currently about 45 states in the United States. These data provide information
on every hospital stay in those states including basic patient demographic
information, how the patient was admitted — was it a routine admission or a
nursing home, what happened to the patient during the stay, what sort of
resource use occurred, and how they were discharged: routine, to a facility, or
whether they died during the stay. Now, AHRQ has been working in
collaboration with 38 of these states to collect all of these discharge
abstracts in each state to convert them into a uniform format and to make them
available for research. This is what is called the Healthcare Cost and
Utilization Project or HCUP. As part of our work with the HCUP data, AHRQ has
sponsored the development of a set of measures that use solely hospital
administrative data to assess quality of care. These are the AHRQ quality
indicators. There are currently four modules. Prevention quality indicators are
measures of ambulatory care sensitive conditions. Inpatient quality indicators
look at mortality, utilization, and volume of services. Patient safety
indicators look at potential safety problems that occurred in the hospital, and
the newest module is the pediatric quality indicators, which are focused solely
on children. Since these measures have been released for public use, a number
of organizations have adopted them for purposes of quality assessment. About
nine states currently use the quality indicators for publicly reporting quality
of care for hospitals in their state.

It has long been known that despite the wealth of information that is
provided by hospital administrative data, the data do have some critical
limitations. One is that while the administrative data do contain some clinical
information, it is limited to what is contained in ICD-9-CM codes. So, while we
may know that a patient has uncontrolled diabetes, we do not know how badly out
of control their diabetes is. We know the patient has hypertension, but we do
not know the blood pressure reading.

Furthermore, while we have a list of diagnoses, we do not know whether those
diagnoses were present on admission or whether they developed during the stay.
So, we know the patient had pneumonia in the hospital, but we do not know if they
were admitted with pneumonia or if it occurred as a complication of their care.
Algorithms like the AHRQ quality indicators can do a lot of good with this
information. For example, if a patient had a major elective surgery on day one
or day two of a hospital stay, if they had a diagnosis of pneumonia, we can
assume that they were not admitted with the pneumonia since elective surgery
would presumably not be performed on a patient with active pneumonia. The AHRQ
QI’s in this situation would assume that this case has a hospital-acquired
pneumonia. Nonetheless, because of concern about inadequate risk adjustment,
because of concern about penalizing providers who have the sickest patients,
questions have been raised about using solely administrative data for quality
reporting.

How do we get more clinical detail? We have been talking a little bit about
that already. Two states, California and New York already are collecting
information whether the diagnoses are present on admission. Now CMS is
mandating that this information be collected for all Medicare patients starting
in January of next year. In one state, Pennsylvania, clinical information is
being manually abstracted from the medical record using outcome systems.
Questions have been raised by hospitals about the cost and the burden of such
manual data collection. Although EMRs are being more widely adopted, we are
still years away from routinely being able to rely on EMRs to provide us the
information that we need for quality assessments. The one exception is lab
data which are available electronically in about 80 percent of all hospitals.

Given this context, AHRQ sponsored a study that was conducted by Michael
Pine and his associates to assess the impact of adding clinical information to
the administrative record for purposes of quality reporting. We examined
progressively more complex and more expensive to obtain data to identify the
most cost effective
enhancements to administrative data. Now, because POA information is collected
at the same time and by the same personnel who would abstract a medical record
for the claim, and who code diagnoses for ICD-9 codes, we added POA information
early in the modeling process. We then added lab values at the time of
admission assuming that numeric information from a single point in time would
be relatively easy to obtain and relatively inexpensive to obtain, given that
lab data is available from the majority of hospitals.

We also assessed the impact of simply increasing the number of diagnosis
fields to see what kind of impact that would have. We then examined the impact
of improving the documentation of diagnostic information using ICD codes.
Currently, coding rules stipulate that when there is a final diagnosis, for
example stroke, that symptoms like coma would no longer be coded. We wanted to
see what would happen if these findings were coded in addition to the final

We then added information on vital signs at admission. Again, numeric
values at one point in time but that are less routinely available
electronically. Finally, we added more clinical information that was more
difficult to obtain. Then, through cost effectiveness analysis we assessed the
most cost effective enhancements to administrative data. These studies have
been reported in a number of manuscripts already. One that Dick mentioned that
I didn’t include here – but there has been one published in JAMA in
January of this year, one that came out in June in the Journal of Patient
Safety. Another one will be coming in the Annals of Surgery shortly, and we
have submitted a fourth for publication.

The results I present here will highlight some of these findings and will go
a little bit beyond what Dick provided as well.

The data that we used for the study was supplied by the Pennsylvania Health
Care Cost Containment Council. We really do appreciate their generosity in
providing the data. They provided us with all administrative data from 188
hospitals over a three-year time period spanning 2002-2003. In addition, for
all of these records, they also supplied detailed clinical data that were
abstracted from medical records using this outcomes system, which records a
hospital day corresponding to each data element. We also used New York and
California claims data, which identifies those conditions that are
comorbidities – that is, present on admission versus complications that
originated during the stay. We applied what we learned on the New York and
California data to the Abt data so that we could model a POA modifier for the
Pennsylvania data. We studied eight mortality measures and four patient safety
measures. There were three surgical measures, five medical, and four patient
safety measures that you see here.

As I mentioned, we developed incrementally more complex models. The
sequence that I outline today is actually one of a number of different
sequences of models that we tested that were reported in the various
manuscripts. The ones that I report today are illustrative of the findings.

We began with a model that was based just on routine administrative data and
up to eight diagnosis fields. We then added POA information. Third, we
increased the number of diagnosis fields to 24 to see if more diagnostic
information was helpful. Fourth, we then added information on conditions that
were present in the Atlas(?) data, but did not appear in the ICD codes because
of coding rules. This includes conditions such as coma, immunosuppression, chest
diffusion, and history of chronic lung disease. It also included some coding of
numeric values like hyponatremia.

We then added numerical laboratory data that were obtained on the day of
admission to the model that included POA and both 8 and 24 diagnosis fields.
Then we added lab data to the model that assumed improved coding of the claims
data. Finally, we added full clinical information, vital signs, other lab data,
key clinical findings, and composite clinical scores like the ASA

Now, other analyses that were done as part of this study broke out these
clinical models in more detail. We ran separate models for vital signs and
lab results, for clinical findings, and for composite scores. Details are
available in some of the other papers that I mentioned.

Here are the C-Statistics for the Mortality Models. These are mean
C-statistics across the eight mortality models that we looked at. The
C-statistic measures the discriminative ability of a model. So 0.5 is a pure
guess and 1 is perfect discrimination; 0.7 to 0.8 is good, and 0.9 is pretty
excellent. So the C-statistic for the routine administrative model was about
0.79, which is really quite good. When we added the POA information we
increased to about 0.84. When we added lab data, that is about 0.86. The full
clinical model was 0.88 for C-statistics. So, what we see here is the biggest
jump in the C-statistic when you add the POA information. The next biggest jump
is when we add lab data or when we model a revision in ICD-9 coding to allow
for coding of symptoms like coma or immune compromised data. We got another
smaller jump when we combined improved coding and lab values together. The full
clinical model added relatively little additional discriminative ability. I do
not present the patient safety models here, but it is a very similar sort of
pattern.
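
As an illustration of the C-statistic described above, here is a minimal Python sketch. It treats the C-statistic as the share of (death, survivor) pairs in which the model assigned the higher predicted risk to the patient who died, counting ties as half. The function name `c_statistic` and the toy inputs are hypothetical and are not taken from the study.

```python
def c_statistic(predicted_risks, died):
    """Concordance (C) statistic for a binary outcome.

    predicted_risks: model-predicted probabilities of death, one per patient.
    died: matching sequence of outcomes (truthy = died, falsy = survived).
    """
    deaths = [r for r, d in zip(predicted_risks, died) if d]
    survivors = [r for r, d in zip(predicted_risks, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for risk_dead in deaths:
        for risk_alive in survivors:
            if risk_dead > risk_alive:
                concordant += 1.0   # model ranked this pair correctly
            elif risk_dead == risk_alive:
                concordant += 0.5   # a tie counts as half
    return concordant / (len(deaths) * len(survivors))
```

On this definition a model that gives every patient the same risk scores 0.5 (a pure guess), and a model that ranks every death above every survivor scores 1.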

We used another measure of model performance which we termed Hospital Level
Bias. Basically, what we did here is that for each hospital, we calculated the
difference between the number of adverse events that were predicted by the full
clinical model where we had the most information and the number of adverse
events predicted by each alternative model that had less than full clinical
information. Then we expressed this difference in terms of standard deviation
units by dividing the difference between the number of adverse events in the
full clinical model and the number of adverse events in the less complete model
by an estimate of the standard deviation of the number of events predicted by
the full clinical model.

So, the use of these standard deviation units basically takes into account
variations in the number of cases at the various hospitals as well as the
predicted events of those cases in the full clinical model. So we are basically
always comparing a less-than-complete model to the full clinical model and seeing
what kind of bias we are introducing by using less than complete information.
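
The hospital level bias calculation described above can be sketched in Python as follows. This is a sketch under stated assumptions, not the study's actual code: the testimony does not say how the standard deviation was estimated, so the sketch assumes each patient's outcome is an independent yes/no event and uses the square root of the sum of p(1 - p) over the full clinical model's predictions. The function name `hospital_bias` is hypothetical.

```python
import math


def hospital_bias(full_model_probs, alt_model_probs):
    """Bias of an alternative model at one hospital, in standard deviation units.

    full_model_probs: per-patient event probabilities from the full clinical model.
    alt_model_probs: per-patient event probabilities from the less complete model.
    """
    expected_full = sum(full_model_probs)  # events predicted by the full model
    expected_alt = sum(alt_model_probs)    # events predicted by the alternative

    # ASSUMPTION: independent Bernoulli outcomes, so the variance of the
    # predicted event count is the sum of p * (1 - p) over the hospital's cases.
    sd_full = math.sqrt(sum(p * (1.0 - p) for p in full_model_probs))
    if sd_full == 0.0:
        raise ValueError("standard deviation is zero; bias is undefined")

    return (expected_full - expected_alt) / sd_full
```

Under this assumption the denominator grows with the number of cases and depends on their predicted risks, which matches the point that the standard deviation units account for both.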

We then reported the number of hospitals with differences greater than 0.5,
1.0, 1.5, two standard deviations, just in order to provide readers different
thresholds to see how much error they can live with. Then we examined
improvements in model performance in terms of reduction in the percentage of
hospitals with unacceptable bias. These hospital level predictions really are
the most relevant measures for this study because they show how hospital
rankings will change under various models. For simplicity, let’s focus on the
first column that corresponds to a threshold of 0.5. What we see here is that
the mean percentage of hospitals with bias compared to the full clinical model,
was just under 70 percent for the raw data – that’s the red line at the
top. That means if we just use raw data, 70 percent of the hospitals are
classified inappropriately compared to the full model.
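
To illustrate the threshold reporting described above, here is a minimal Python sketch that takes one bias value per hospital, already expressed in standard deviation units, and returns the percentage of hospitals beyond each threshold. Treating bias in either direction as a problem (the absolute value) is an assumption, and the function name `percent_exceeding` and sample values are hypothetical.

```python
def percent_exceeding(hospital_biases, thresholds=(0.5, 1.0, 1.5, 2.0)):
    """Percentage of hospitals whose bias exceeds each threshold.

    hospital_biases: one bias value per hospital, in standard deviation units.
    thresholds: cut-offs in the same units.
    """
    if not hospital_biases:
        raise ValueError("need at least one hospital")
    n = len(hospital_biases)
    # ASSUMPTION: over- and under-prediction count equally, so compare |bias|.
    return {
        t: 100.0 * sum(1 for b in hospital_biases if abs(b) > t) / n
        for t in thresholds
    }
```

For example, four hospitals with biases of 0.2, -0.7, 1.2 and -2.5 would give 75 percent beyond 0.5, 50 percent beyond 1.0, and 25 percent beyond both 1.5 and 2.0.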

For the basic administrative model, the light blue line, this is the basic
administrative model based on just eight diagnosis fields. About 45 percent of
hospitals had bias exceeding 0.5 standard deviation units. When we added POA to
the model, regardless of whether we used 8 diagnoses or 24 diagnosis fields
– so this is the yellow diamond superimposed on the blue square, about 38
percent of hospitals had bias exceeding 0.5 standard deviations. What this is
telling us is that adding POA is really the most important factor here. It is
not the number of diagnoses that we are adding.

When we added improved coding, which is the pink line, about 22 percent of
hospitals still had unacceptable bias, and when lab data was added to the POA
model, about 18 percent of hospitals were still biased. Then when lab data was
added to the POA model, with improved ICD coding – that is the very bottom
line, only about 5 percent of hospitals still had unacceptable bias.

We see a very similar pattern for the patient safety measures, but we did
not perform the same depth of analysis in terms of improved ICD coding for
patient safety.

Which specific variables were of most importance here? What we found was
that results of 22 lab tests entered at least one model. The results of 14 of
these tests entered 4 or more models. These are the lab tests that were
important and the number of models that they entered into. All vital signs
entered 4 or more models, but these vital signs were the most important.
Ejection fraction and culture results entered 2 or more models. The composite
scores entered 4 or more models.

In terms of abstracted key clinical findings, there were 35 clinical
findings that entered at least one model. Only 3 findings entered more than 2
models, and those were coma, severe malnutrition, and immunosuppression. What was
really interesting was that we found that 14 of these clinical findings
actually have existing corresponding ICD codes associated with them. It is only
because of coding conventions and coding rules that those symptoms are no
longer coded.

We also looked at the marginal cost associated with incremental additions
of clinical data. The top line summarizes the hospital level bias that we just
saw. The bottom two lines are two different ways of looking at cost
effectiveness. We did sensitivity analysis on the cost effectiveness analysis
across 3 different cost assumption scenarios. One scenario which was our high
cost scenario, we interviewed clinicians and medical record abstractors and got
information on what were the costs associated with abstracting specific types
of data elements.

A low cost scenario was based on studies that were sponsored by PHC4 in
order to see what were the costs associated with abstracting data through the
outcome system. We did a midrange scenario as well. What you can see here is
that cost remained relatively low for collecting administrative data, for
collecting the present on admission information, for collecting the lab values,
and the ICD codes but increased dramatically once we added the full clinical
information. These findings really held across all cost scenarios that we

So what we found was that administrative data can be improved at relatively
low cost by adding POA modifiers, by adding numerical lab data on admission,
and by changing our coding convention and coding rules.

In order to implement these findings and to encourage adoption of the results
of these studies, AHRQ this month released two RFPs aimed at expanding the data
capacities in the statewide organizations that currently participate in HCUP.
These are the 38 state data organizations that collect data from the hospitals
in their states and provide data to AHRQ. These 38 states comprise about 85
percent of all hospital discharges in the United States. In one RFP, AHRQ is
going to be supporting pilots in up to two states to add clinical information
to their administrative data. The other RFP is going to support planning
efforts in up to five states who are interested in enhancing their
administrative data who are not yet ready to engage in a pilot. Every state is
different. There are different relationships. There are different coalitions
that need to be built. These planning projects are really intended to help
those states at the initial stages of enhancing their administrative data.

Let me just provide you a little bit of information about what the pilots
are going to be doing. The major objectives of the pilot studies are to
establish the feasibility of linking clinical and administrative data in the field
in the hospital, then to develop a reproducible approach that can be exported
to other states, and then to set the stage for integrating the clinical and the
administrative data streams in the future because we do not really see this as
simply the addition of a few additional data elements to administrative data.
We really see this as a way of identifying what sort of information from the
EMR is going to be most valuable for quality assessment and quality reporting
and then helping to specify how the analytic capacity of the EMR needs to be
developed to allow for easy access to these key clinical data elements.

The specific activities that the pilots are going to be involved in will be
to identify and select data elements, to translate the clinical data from the
electronic format except for the POA information. We specifically want to avoid
manual abstraction of data elements in these cases. They have to figure out a
way to electronically transfer the data from a minimum of 5 hospitals to the
data organization, and process that data into a multi-hospital database. During
this entire process, they are going to be required to collaborate with
stakeholders, hospital representatives, and state government agencies, and they are
expected to work with researchers and quality measurement professionals, with
healthcare quality organizations and with regional health information exchanges
of a program. Then, they are going to be engaging in peer-to-peer learning,
information sharing, and dissemination in order to allow pilot and planning
states to learn from one another and then to disseminate what they have learned
to other states in the future.

In conclusion, what we found was the judicious addition of just a few
clinical data elements can significantly improve our ability to do quality
assessment using administrative data. By just adding POA information, lab
values, potentially vital signs, and improved ICD coding, we can get close to
the full clinical model for the mortality and patient safety measures that we
looked at here. And also through pilot and planning contracts, we hope to
jumpstart the process of adding clinical data elements to statewide hospital
data. Thank you. I will take your questions now.

DR. CARR: Thank you that was very exciting, very promising, and very

MS. ELIXHAUSER: We were hoping for logic.

DR. GREEN: That was gorgeous. Thank you for coming. Just a moment ago, you
said except for POA data we want to avoid manual data collection. My question
is why go to manual data collection for Present On Admission?

MS. ELIXHAUSER: I guess because the same people
who will be doing the same abstraction off the medical record for ICD codes,
there has to be a process of converting the written text in a medical record
into ICD codes. Those same people who do that conversion of text to ICD codes
are the ones who will also be assigning POA information for each of those
diagnoses on the record. Do you see what I am saying?

DR. GREEN: I understand it. It seems to me that that is the way the world
has always worked before. It makes no sense with where we are going, and by the
time your pilots are done it will be irrelevant at least if we do our jobs
right. The overview of the entire NCVHS and Populations Committee and all this
stuff we are looking at, surely we individuals are going to hold some of our
own personal health information. Surely it is going to be in electronic format,
and surely when we get admitted to hospitals someone will have the human
decency to look at it.

MS. ELIXHAUSER: I think you are absolutely right.

I think that once we actually have personal information that a person
carries around with them so we know when their diabetes was POA, but there are
always going to be some conditions like pneumonia that are not going to be
a part of that patient’s health record until the time that they may admit or
present to the emergency room and be admitted to the hospital. There will still
be some need for some evaluation at the time of admission for certain
conditions to distinguish complications from conditions that were present at
the time of admission to the hospital.

DR. GREEN: My apologies for getting too far into advocacy. I just so hope
that AHRQ—this is so exciting. As you move into these other pilots, I so
wish that you would look at the opportunities to capture those highly personal
data in electronic formats in the mode and mean in which we think we are trying
to design a nationwide electronic health information structure and begin to
plan and pilot for the future as opposed to planning for the past.

MS. ELIXHAUSER: I agree with you 100 percent. Even in the short term, even
before we reach a point of having a real personal health record, we can improve
how we link our data. For example, we can link past admissions to the current
admission so that we know what conditions were present during that prior
hospital stay and apply that to the current hospital stay. There is not
only—we should not even wait until we get the personal record, but to look
at the data capacities that we have available right now and try to link those
together. Thank you. I agree 100 percent with you.

DR. CARR: We have Simon and then Paul and then Marjorie.

DR. COHN: Thank you for your presentation. I fear that I might be sounding a
lot like Larry. I remember coming onto my administrative roles. I have
practiced medicine for many years back in the late 80’s. At that point, one of
my mentors was a gentleman named Mark Umberg who at that point he was able to
establish in California this POA capability in the mid 80’s. Now we are talking
about it as new and different and exciting. I guess he lived long enough to see it
happen again. My question here is, are we at the point where we need pilots or
are we really at planning contracts and implementation? How long does it take
to begin to see some of this happen?

MS. ELIXHAUSER: There are several reasons that we are doing planning and
pilots. One is that states are all at different points. They are all at
different stages of decision making about these sorts of efforts. My
understanding is that there are 10 states that will be adopting POA coding
within the next year and we are going to see it. There are other states that
are lagging very far behind that. There are still some states that do not have
statewide data collection efforts. We would hope to see those
develop in the near term as well. Secondly, there is limited funding within AHRQ
to support full-scale implementation. The best we can do at this point with the
limited funding we have for this project is to provide seed money to get going
on something everybody agrees needs to be done and hopefully put together the
information so it can be disseminated more broadly in the future. I agree with
you. That it is taking 17 years to implement POA coding is just unconscionable.
Hopefully things will speed up from now on.

DR. CARR: Then we will have ICD-10. Paul?

DR. TANG: I want to thank you also, Anne for such a wonderful presentation.
I am also going to come to Anne’s defense in terms of this, why don’t we just
get on with it. It is like saying, why don’t we have EHRs now? There is so
much to learn and so much culture to get through that we have to learn how to
do it in ways that can be scaled to the rest of the country. I think she is on
the right track.

I have two questions. One is just a confirmation. When you talked about the
bias graphs, as you went along, when you got to lab that meant it was
cumulative with the POA measures, etc. Is that correct?

MS. ELIXHAUSER: Not all of them were cumulative.

DR. TANG: Lab in particular I am interested in, is that cumulative?

MS. ELIXHAUSER: You know what it actually varies.

There are two lines for lab, the green line is lab with POA coding and the
black line is lab with POA coding plus the improved coding that we are talking
about.

DR. TANG: So the one question is whether lab alone would also get you almost
as much since that involves human effort of that definition question.

MS. ELIXHAUSER: I do not believe we ever tested lab alone. That is something
that we could do. We really assumed that POA was sort of on the threshold of
being here and that given how powerful it was–

DR. TANG: Yes, I certainly agree on its value and all those things. The
piece is when you talked about the cost effectiveness, I noticed that the ICD
coding enhancement and POA had a very low cost. Did you include the cost of
retraining all of the folks that have to do this abstraction?

MS. ELIXHAUSER: That was sort of in our high cost estimate.

DR. TANG: So even with that, it was still low?

MS. ELIXHAUSER: It is primarily because the information that you have to get
is so targeted. You do not have to go through very difficult parts of the
medical records. It is usually in one place.

DR. TANG: It is easy to get, but we have to retrain the entire workforce.

MS. ELIXHAUSER: We did not do a national cost estimate as to what it was
going to be.

DR. CARR: Marjorie did you have something to add?

MS. GREENBERG: I wanted to add to the chorus of thanks and not only for your
really excellent presentation but for doing this study. I think we heard that
this study was being launched in June of 2004 or something. It is really
exciting to see the results of it. It is sort of this déjà vu in
the sense that in 1992 this committee, based on a recommendation from your
mentor, recommended adding POA. I think we need to think about that. It is a
long time to move some of these things forward. I hope that what you learn from
this not only will enhance administrative data, but will really be factored
into the architecture of electronic health records. Those things that really
are useful for quality of performance measurement and everything need to be
structured in a way that they can easily be put into that. Also, I just wanted
to mention that, because you were talking about ICD-9 and I am emailing with
Donna Pickett about whether she has discussed some of this with you. She is
responsible for ICD-9-CM and worked on the coding guidelines as well. Some of
these are coding guideline issues and we will follow up with you about that.
Some of them actually, under current coding guidelines, could be collected. We
will follow up with you on that.

DR. CARR: Thank you, Marjorie. And cognizant of our time, Carol has a
closing comment and then we will get two additional very exciting

MS. MC CALL: Thank you very much. I will try to be brief. It is not a
question as much as it is a request. That is, as you go forward into these
pilots, two things call out my attention. One is that in addition to building
something that is reproducible that it be built to be intentionally dynamic and
to be a learning process. What I mean by that is we have heard a lot today
about the need for standards and those are important. We also need to have
intentionally dynamic processes that recognize the fact it will change. What
you introduced today was a third level. We have heard about data. It needs to be
standardized, but then it needs to change. We heard about measures. They need
to be designed and standardized, but they too will change. The third is models.
I would ask that as you go through this that you think about the dynamism and
the transparency needed in getting them set and then how they will change and
how the process will be.

The second was you talked about engaging in peer-to-peer learning,
information sharing, and dissemination. I would also add collaboration and to
think about brand new architectures. Architectures that include kind of the
collaborative wisdom of crowd environments, information markets, and take into
account what it actually means to discover something and have peers who are
truly knowledgeable and essentially aggregate their opinion and consensus and
to think about new architecture for that so that things can in fact proceed
apace much more quickly than before.

MS. ELIXHAUSER: Thank you.

DR. CARR: Thanks very much. Bruce?

Agenda Item: Performance Measurement and Quality
Improvement – Niagara Health Quality Coalition

MR. BOISSONNAULT: My name is Bruce Boissonnault and I am President and CEO
of Niagara Health Quality Coalition. I am also a publisher of
myhealthfinder.com which we will talk about very briefly. There are two prongs
to what I am going to talk about. The first one is our work from the very
beginning as a beta test site with AHRQ QI’s, which I think are a great step
forward. That is a context for a view that our employers have and our
collaboration for a multi stakeholder collaboration. It is probably one of the
longest running in the country that is actually data savvy, and that is to move
away from the notion of administrative data toward the notion of a data highway
that can provide not only quality data, but also population screening. Again,
the notion is that in any other industry, if you do something but do not
provide the necessary metrics to the appropriate folks – I had a corporate
background with Disney and McKinsey — if you are in an operating area and you
do a good job, but you do not provide the data, you did not do a good job. We
are in an industry where not providing the data is the norm. We assume
it is still a good job even though we have to take your word for it that it was
okay. So, that is the two-pronged approach.

We have 2,000 employers statewide. Some of them are like General Motors and
General Electric. We work with 30 health plans, government, state, regional,
and federal and hospitals and health plans. We also do disease management. For
those of you who start to see the estimated GFR calculation, that was our
project. They piloted it in our region and expressly mentioned in media that it
was us who sort of helped them sort out which of the 32 measures ought to be on
the lab.

Let us jump forward. We are in a new era. People are looking at data. As
long ago as 1998, we started publishing individual hospitals’ risk-adjusted
mortality rates with the Ford General Motors hospital-profiling project where I
was one of the National Advisory Panel members.

I want to launch right into this for the sake of time. I often am asked the
question, do people actually use health care performance reports? It is sort of
funny because while that debate was raging back in 2001-2002, we were getting
up to 15,000 discrete users per hour to our website. That is up to 3 million
hits a day. That is for a report on one state’s statistics. Keep debating the
question. I am not challenging you about that. I just think it is a funny thing
to be talking about.

Public performance measurement systems must be judged by the degree to
which they affect positively the status quo and as an illustration, we had the
longest running patient survey project. We have 100 percent voluntary hospital
participation. The hospitals paid for their own surveys. We started, when aggregated,
as one of the worst regions in the United States, and our region consistently
has hospitals that are among the worst financial performers in the United
States. We did not throw a lot of money at this, but I think we did this right.
We did understand Pay for Performance. We went from one of the worst regions in
the United States to one of the best. Our numbers have not moved much since
2004. We are continuing.

New York State Hospital Report Card, most of you know the coverage is
widespread. I am going to zip through. You can pretty much go to it at
myhealthfinder.com, but we typically are the front page of every regional
newspaper. We get a lot of electronic when we release the report. We have
credibility with the media, which was established before we were involved in
health policy. We do not sensationalize the report, and on the other hand we do
not let politics get it so obtuse that no one can understand it. I think we
found the right balance of stakeholders.

All of you know who Don is. I believe ours was the first public report that
Don had ever endorsed that had specific outcomes measures by provider up until
our report.

One of the questions I think you were searching for was, why did we select
administrative data? It is not based on secret input factors. It is superior
for public reporting because it is tied to something and is therefore more
difficult to game. That is, it is tied to the billing information. So, if you
cheat too far you are committing fraud and you are in danger of having more
consequences than just a talking to. It is a sustainable data platform because
it is exempt from what I consider to be the extensive secrecy provisions of the
Patient Safety and Quality Improvement Act. I would caution you, it is a risky
venture to not define data as billing and discharge data because it looks like
provider identifiers remain a little bit at risk, especially for those outside
of government. I do not think state governments have to worry, but you have to
look at folks like us. We have value too. We have been publishing these data
for five years and we were used by the NQF as what should not happen. What I am
saying is, there is room for diversity of thought and the key to the database
is not the definition of perfect measures. It is very cost effective, so I am
not going to spend a lot of time on things you know.

What resources were needed to make the project worthwhile? I will just say
that money is not the key factor. Because administrative data are being used, I
think the key factor is integrity and independence. I do not remember who said
it, but someone said it is hard to be flexible in your thinking if you are paid
to think one way. We are designed to be outside of that paradigm.

When I was at Disney one of the partners that reported to me was a large
segment of the data management function and so I use some of the – from
industry to make sure that we don’t publish mistakes in the New York Times. It
is not hundreds of thousands of numbers that are out there over the course of the

How do you use the data? We use it for all of them. I was on the Institute
of Medicine, I was one of the external reviewers for the IOM Pay for
Performance Report. One of the things that I am a long-standing advocate for is
understanding the difference — Pay for performance is part of the spectrum of
how you change behavior, but it really is the 600-pound gorilla because it is
expensive, sort of politically difficult, so you do not want to do pay for
performance on everything. So, we used a spectrum on what we recommend from
attaboy letters at the light end to public reporting all the way over to pay
for performance. The key for us has been getting CPO’s compensation tied to it.
So, I would add one other comment, which is, if you can get the CPO’s
compensation tied to a measure, you no longer need to do Pay for Performance
for it.

What interventions were triggered? I am just going to use one illustration.
Again, when we published these measures I feel a little like rock and roll
because when we started this, we were the cause of the end of civilization
according to some of the provider community. Today, we are very mainstream. We
aren’t quite elevator music yet, but we are definitely mainstream. This kind of
thing still happens weekly, at least. When we looked at the data, I did not go
to the media, but I noticed that one of the regional stroke centers had
terrible results. We were driving ambulances up to 20 minutes extra to get to
this hospital that appeared to have risk adjusted results that were
statistically significantly below the norm. We always approach this with some
caution because notwithstanding everything else, we know we do this in a
confidence interval. Meaning, even if the underlying data are perfect, this
could be due to random chance. So, we put the data up, and we invited the
employer leaders and all the hospital leaders and all the health plan leaders
and some community leaders to a closed meeting. We said, these are the data and
we are concerned about the stroke mortality statistic. Because the physicians
and all the people who knew the situation were in the room, it became apparent
that 24-7 radiology for stroke had sort of fallen off the cart due to a
negotiation problem with radiology. This was a longstanding on-again-off-again
problem. Long story short, it was fixed within three days via offsite reading
radiology at this hospital, and never again has surfaced and the number
improved. I am not going to go through them, but I could give you hundreds of
anecdotes of people calling me and saying, our system is ten times more
expensive than yours and seems to be designed to tell us what we want to hear. You
sometimes seem to be telling us what we need to hear.

We are seeing a drop in overall statewide mortality. Obviously some of that
is related to the improvement in clinical care. We also have indication that
some of it is due to transparency. However, on the volume problem, hospitals doing
AAA or carotid artery surgery at low volumes, because New York is a
place where there are too many hospitals, these folks just don’t want the beds
to be empty. I am not as happy with the progress even though there is some on
the hospitals getting out of procedures they should not do at low volumes. We
have done some research and the issue is more real than I think people realize
the difference between those performing a threshold. I know people realize it
is big but our data suggests that it is bigger in some instances.

I will just get into one example of the savings. Again, for the sake of
time, I want to make sure we hit part two. We get calls from hospitals saying,
you put several stroke patients as mortalities charged against us and they were
hospice patients. My comeback still is, okay if we rerun the whole report card,
are you going to reimburse the people that you over bill? They said, we did not
over-bill anyone. I said, this is the billing system. You bill them as ICU
patients. How could they have been ICU patients when you are billing them as
hospice patients? Nothing sanitizes like sunshine, even if you get missed data.
I think the key here is even if the data are a little misleading sometimes, and
I do believe these times and good people about their data. Sometimes good comes
out of that too.

I want to move on to the macro suggestions.

Again, I have a unique background in that I am not really affiliated
with a hospital or an association or government. We are truly independent. We
are structured to be somewhat independent. This is just my view as someone
uniquely familiar with the details of risk adjustment and how the data come
together. We do the computations ourselves, so we don’t outsource this. I am
not talking as someone who hires someone. We actually are hired by states
asking us how to do this.

Winston Churchill was one of my favorite characters in history and a great
man because he lived in a great time. The first lesson that you must learn,
Winston Churchill said, is that when I call for statistics about the rate of
infant mortality, what I want is proof that fewer babies died when I was Prime
Minister than when anyone else was Prime Minister. That is a political statistic.

We do not want to run healthcare based on political statistics. This is not
a—this is something that I think we all can embrace. Government, when it
operates in the dark, I think it is susceptible to undue special interest
pressures. So, the policies that you all create, I hope will focus on what you
are doing, but I hope you do it transparently because, at least on behalf of our
employers, we found that some of the government statistics that were coming out,
when we double-checked them and understood how to run the numbers, had some problems.
So, measures which cannot be replicated independently based on public data
sets, I believe will remain in doubt and deserve to.

Current US healthcare policy is – failing is a strong word. Not living
up to expectations did not fit, but anyway, I know there are a lot of different
sciences and arguments about who was right, Deming or Six Sigma or whatever. Deming
observed that if you focus on quality first, then over time quality will improve
and cost will go down. If you focus on cost first, then cost will go up and
quality will go down over time, eventually leading to loss of trust in the
system. Again, having worked in corporate strategy, I remember when everybody
was into let’s-get-the-cost-down cost cutting. It is a terrible metaphor, but it
will work. When I consulted for the tractor industry, I remember at one point,
for a tractor that we sold 200 of, we had 36 different foot pedals. No one was
counting the cost of all the inventory and all the other things. It was just a
short-term thing.

Quality equals result of effort times cost. I think sometimes in our
measurement, we confuse effort with results. That can drive up cost.

Where are we today? Today, I think what we have is this
disconnected—we do not have a US Health care data highway, which I think
is where we are headed. We should think about creating a data highway the same
way we created our national highway system in the forties and fifties. What we
have is dirt roads and some paths and some unexplored woods. Each of the groups
on the left provides some information of some sort into some components of what
is on the right, into this sort of disjointed thing, much of it after the fact.
That has been discussed today at some length. What we really need is data to be
a byproduct of care, under the assumption that without adequate data, you are
not providing adequate care. I would add that it is not just the EHR and
performance reporting; it is also population monitoring.

In the interim, I think there is a place for a hybrid system. How many of
you are familiar with normalization? What you are talking about is building
databases. Normalization is to the science of database management what supply
and demand is to economics. You cannot study economics or implement economic
policy without understanding supply and demand. Third normal form, I think,
should be required now. If you do not know what that means, just talk to one of
the mathematicians involved in your database management. That needs to be a
policy thing. Needs to be billing and discharge data. I think data should be
defined as billing and discharge data first. A case needs to be made that it
should be secret rather than the reverse. Sometimes what I see in our work is a
willingness to compromise things that should not be compromised. The system of
data should support the system of care, not the other way around. We are
willing to accept what data exists as a real limitation. As a data guy, it is
political, but the data issues are within our grasp quickly.
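
[Editor’s illustration, not part of the spoken record: a minimal sketch of what third normal form could look like for discharge data. The table and column names are hypothetical and are not taken from any state or federal data set; the point is only that each fact, such as a hospital’s name, is stored once and referenced by key.]

```python
import sqlite3

# Hypothetical schema for illustration only. Each non-key column depends on
# the key of its own table and nothing else (third normal form).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hospital (
    hospital_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE patient (
    patient_id  INTEGER PRIMARY KEY,
    birth_year  INTEGER NOT NULL
);
CREATE TABLE discharge (
    discharge_id INTEGER PRIMARY KEY,
    patient_id   INTEGER NOT NULL REFERENCES patient(patient_id),
    hospital_id  INTEGER NOT NULL REFERENCES hospital(hospital_id),
    died         INTEGER NOT NULL CHECK (died IN (0, 1))
);
""")

conn.execute("INSERT INTO hospital VALUES (1, 'Hospital A')")
conn.executemany("INSERT INTO patient VALUES (?, ?)", [(10, 1940), (11, 1952)])
conn.executemany(
    "INSERT INTO discharge VALUES (?, ?, ?, ?)",
    [(100, 10, 1, 0), (101, 10, 1, 1), (102, 11, 1, 0)],
)

# Because the hospital name lives in one row, a correction touches one row,
# no matter how many discharge records point at that hospital.
conn.execute("UPDATE hospital SET name = 'Hospital A (renamed)' WHERE hospital_id = 1")

rows = conn.execute("""
    SELECT h.name, COUNT(*) AS discharges, SUM(d.died) AS deaths
    FROM discharge AS d
    JOIN hospital AS h ON h.hospital_id = d.hospital_id
    GROUP BY h.hospital_id
""").fetchall()
print(rows)
```

In a flat billing file the hospital name would be repeated on every discharge row, so a single correction could leave inconsistent copies behind; the normalized layout avoids that.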

What could we do today? I hope there is a charge that will decide to take a
look at what we really have now. I will end with this. I think by the middle of
next week we could sit down with the VA’s electronic medical record and say,
which of these do we have now, and yellow highlight those we do not. I am not
saying the VA is the end of the road, but it is a starting place. It is public.
The week after that, look at population screening and do the same thing, yellow
highlight what we are missing. I think you would see a lot of yellow missing
data elements, but at least we would sort of start on the road. In week three, I
think you could take week three off. I don’t get any funding from AHRQ, but I
really feel they have built the best possible performance reporting system for
hospitals and some community-wide measures that can happen with the available
data. So, there is more in here, but I will stop now for the sake of time and
just end with I really went back and forth on whether to get into some of our
data issues because we are very data savvy but I thought this was more

DR. CARR: Thank you, Bruce. I appreciate your very detailed slides. They
were very helpful in your overview. I think what we will do, unless there is one
question, is follow your segue right into the VA. Why don’t we go ahead and get
right back on track. Thank you.

Agenda Item: Performance Measurement and Quality
Improvement – EHR

DR. EISEN: While Cynthia is getting this set up I will talk a little bit
about myself. I am an internist and rheumatologist. I have been at Washington
University in St. Louis since the late 1970’s and have been at St. Louis VA
since the early 1980’s. My areas of research interest are focused on psychiatric
epidemiology. Because I started at the VA in the early 1980’s, I was there when
the VA was still an all paper system. I have lived through the transition from
the all paper system to an all-electronic record. The difference is really
quite remarkable. Because the VA is an all-electronic medical record system,
the databases are used extensively by our researchers for a wide variety of our
research projects.

So just a little about the VA itself. The VA is one of the largest health
care systems in the United States. There are 1300 facilities across the US, 153
medical centers, and 105 of them are affiliated with academic institutions. You
can see that we train 81,000 professionals and support 9,000 residency positions.
There are some freestanding counseling centers, particularly for individuals
with emotional problems associated with the Vietnam War and with the first
and second World Wars, all of the wars. I think one aspect worth noting about the
large size of the VA system is the potential for having substantial influence,
if only by frame.

There are about 6 million veterans; maybe 20 percent of them still come to
the VA as users. They are getting older, although the recent wars are decreasing
the age. They are primarily male, with a small percentage female. It is
predicted that by 2015, 12 percent of users will be female. They have
lower sociodemographic characteristics. Our patients tend to be complex, with
many psychiatric as well as medical comorbidities.

The name of our record is VistA, Veterans Health Information Systems
and Technology Architecture. A brief background: the VA actually started
getting computers in 1969. In the late 1970’s investigators started
creating VA databases for their medical center administrators. A big
advance occurred in 1982 when Congress endorsed the development of
VA computer systems. In 1985, a hundred VAs started using
computers for various administrative functions, and by 1999 what we call
CPRS, Computerized Patient Record System, was introduced. The major advance of
this was that it had a graphical user interface, which really
made it much more widely acceptable. I remember when this was first introduced at
the St. Louis VA. The physicians were initially told they could use it or not.
It was up to them. It was never said that at some point they would have to
begin using the computers. I think everyone knew what was coming. After a year
or so, it happened. Paper records stopped appearing in clinics. Since I am a
pretty good typist, I was one of the early adopters. Within the VA system,
some of the older physicians aren’t typists, so there was a lot of
resistance for many, many months. It is just now you do not hear of it any
longer. The younger physicians all are used to typing. It can be time
consuming, but that is the process that is used.

Also initially, when use of the system was mandated, there were times when
the computers were pretty slow. They crashed, and so I would say there was
maybe a period of a year and a half where there was a lot of unhappiness with the
computer system. Eventually, better equipment was installed. Rarely it

It has virtually all of the medical record: all vital signs, all hospital
discharge diagnoses and procedures, all outpatient diagnoses by ICD code,
progress notes by all the health professionals, all orders, all lab test
results, all radiology reports, and also all x-rays, which are digital. The
quality is not as good because of the lower resolution of the monitors in the
individual offices, but it is commonly satisfactory for what individual
physicians do, and it can be useful in illustrating to patients what is going
on and the concerns. It also has all consultation requests and results, all
pathology reports, all medications since 1997, procedure consents, and Medicare
data (for research), and all data is available from any place in the VA system.
This was dramatically demonstrated with the event in New Orleans and the
Southern Coast, when people were displaced, including many veterans. Their
records could be picked up almost as if they were at their home medical center.

So, this is a screenshot. Some things to note are, these are the tabs. This
is the face page. These are the active problems. This is allergies where there
is a specific note citing a description of what happened. This indicates that
the patient sometimes has behavioral problems. The VA has an extensive list of
clinical reminders, most of the patients I took care of do not have many of the
clinical reminders. The physician would click on this and see what procedures
are necessary. This indicates that there are lab results. These are
appointments, both past and future. This is in the so-called coversheet.

This is the list of problems with some detailed information about the
activity of the problems when they were first diagnosed and when they were most
recently dealt with medically. I should point out that there is this remote
data available. This is one of the ways that the clinician can gain access to
data that is remote. Something like 20 percent of our patients are seen at
more than one VA medical center. Patients do travel to Florida and spend
winters there and summers in other climates. We have patients who abuse
medications, and they go to different medical centers to obtain pain
medications, and this would be available through this mechanism.

This is the medication, the list of medications and the status. This is
non-VA medications and this is in-hospital medications. When the patient is
hospitalized, this window expands and this window is narrow. These windows are
moveable with a mouse. This is the order window. There are also summary

DR. CARR: In the interest of time, I want to make sure we get to a little
bit about how we use this and how we are better for it. Obviously, it is
phenomenal. I do not want to forgo the discussion of what got better with this
as we are describing the many goals.

DR. EISEN: These are some of the laboratory windows and how it can be
presented. One of the things that has been
implemented as a result is these reminders that I described briefly earlier.
This is an example of reminders for fecal occult blood. So, for example
with the implementation of this reminder, the appropriate follow-up for colon
cancer screening has decreased substantially. This is our VA corporate data
warehouse. This is a bit of the ideal. Eventually, this will be implemented on
a national basis. It has been implemented regionally, in several regions, and
you can see the sources of the data that go into a data warehouse, which is made
available for both research and non-research purposes.

The data has been used extensively to address a number of issues. For
example, racial disparities within the VA, so that changes and programs can be
implemented to improve them. There are a number of disease-based studies that
have been published, for example evaluating the prevalence of psychiatric
disorders, the quality of care associated with psychiatric disorders,
psychiatric comorbidity, particularly depression are major areas of interest to
the VA.

Quality of care standards have been examined in terms of utilization of
medications for treatment of hypertension and appropriate control of
hypertension. Pharmacovigilance
is a major area of investigation within the VA because of the VA’s
integrated system and because almost all of our patients get their medicine
through the VA because of the economic advantages of doing that. We are finding
more of our patients are receiving medications outside the VA system. There
have been studies demonstrating the use of reminders in improving immunization
success, particularly with flu immunization, and in implementing improved
methods for monitoring dermatologic disorders. There is a major issue within
the VA of access to care. One of the specialties in short supply is
dermatology, and the issue is how do you get the dermatologists and patients
together. It is typically very challenging. VA researchers have evaluated using
teledermatology. Teledermatology is relatively easy to implement within our
electronic medical records system, with the photographs becoming part of the
electronic medical record. Remote psychiatric consultation has been effectively
implemented as well.

The strength of the data is that, for both organizational evaluation and
efficiency, it certainly facilitates medical care. I think that the major issue
related to the VA’s electronic medical record is that because it is so large
now, there is an inhibition on innovation in terms of improvement of the medical
record, in part because of data security issues that have occurred with the
VA’s databases over the last couple of years. Some researchers
consider that they have lost some control over the electronic medical record, and
responsibilities are increasingly transferred to the technocrats. Depending
upon what your interests are, that can be considered to be a really good thing.
From a researcher’s point of view and from that of innovation, I think it is a
difficulty. I think another major issue relates to the fact that a lot of the
data in the electronic medical record is in free-text form. Investigators have
certainly done simple string searches with the free-text data. That is quite
feasible, but it is very limited. One of the research programs now
being developed within the VA’s research division has the
programmatic goal of encouraging researchers to do sophisticated text searches
because of the incredible amount of information that is locked away in medical
records. That is not only in progress notes; there is also no ready way of
getting access to radiologic report results, pathology, etc. Finally, the
existence of large data sets increases our vulnerability to data loss. Just
a simple hard drive you can carry in your pocket can contain millions of pieces
of identifiable information.
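
[Editor’s illustration, not part of the spoken record: a minimal sketch of why a simple string search over free-text notes is limited. The notes, the negation phrases, and the function names are all invented for this example; real clinical text search would need far more than this.]

```python
import re

# Invented example notes; not real patient data.
notes = [
    "Chest film shows right lower lobe pneumonia.",
    "No evidence of pneumonia on chest film.",
    "Patient denies cough; lungs clear.",
]

def simple_search(term, texts):
    """Return indexes of notes that contain the term anywhere."""
    return [i for i, text in enumerate(texts) if term.lower() in text.lower()]

# A tiny, hypothetical list of negation cues.
NEGATION = re.compile(r"\b(no evidence of|denies|without|negative for)\b", re.IGNORECASE)

def negation_aware_search(term, texts):
    """Return indexes of notes where the term appears without a negation cue
    earlier in the same sentence or clause."""
    hits = []
    for i, text in enumerate(texts):
        for match in re.finditer(re.escape(term), text, re.IGNORECASE):
            # Look only at the clause leading up to the matched term.
            clause = re.split(r"[.;]", text[:match.start()])[-1]
            if not NEGATION.search(clause):
                hits.append(i)
                break
    return hits

plain_hits = simple_search("pneumonia", notes)
careful_hits = negation_aware_search("pneumonia", notes)
print(plain_hits, careful_hits)
```

The plain search flags the note that says there is no evidence of pneumonia; the second function drops it, which is one small step toward the more sophisticated text searching described above.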

DR. CARR: Thank you. I have a question which is, what advice would you give
today as you are building electronic health records outside the VA system? What
would be the three things that you would advise to make this system flexible
and usable in ways that the VA has not?

DR. EISEN: I think that from the very beginning a mechanism has to be
developed so that the data that characterizes the patient, which goes far
beyond just the administrative data, is readily available to access. It can be
done in a way that puts it in a formatted process, which makes
it easy for the programmers. But the health care providers and the health system
are so variable that I think it is
not feasible to create a formatted method to collect all the data that researchers
and organizational individuals would want in order to be able to adequately
evaluate the quality of care. I think it is necessary to develop some sort of
sophisticated free-text search methodology as an integral part of the record.

DR. CARR: Okay, as opposed to structured fields?

DR. EISEN: I think structured fields are also relevant, but the structured
fields will not cover everything that is going to be necessary, some of which
we know already.

DR. CARR: One other question, from your problem list, do you use ICD-9

DR. EISEN: Yes, the healthcare provider is presented with—the
healthcare provider cannot close out the record until he or she goes out of the
problem list. The problem lists were narrative in structure. Behind that are
ICD-9 codes. The quality of the diagnoses varies, typically with the diagnosis.
There are some that have very low validity and others that have high validity.
There have been a number of studies that examine the validity of ICD-9
diagnoses by doing detailed evaluations of the medical record with reasonably
predictable results.

DR. GREEN: Process question. I want to thank you, Seth, for making a point of
that. Structured data is not quite sufficient.
Bruce, going back to you, you had a slide that you did not show that we have in
our handout here. It starts out with, without HHS leadership, and there are
several things there. I want to ask you to comment about one of your bullets
that says, make population-based surveillance mandatory as a byproduct of
care. Could you say what you mean?

MR. BOISSONNAULT: All I am saying is everyone is focused right now on
electronic health records. I think as we define what a
data superhighway might look like, we should think not only about quality
measures, cost measures, and the EHR, but also about population-based screening.
What brought it to light for me was work I did with the IOM on the safety of
medical devices for children. That system does not need to be totaled. It could
be a byproduct of care. Feeding the information in when you have a device that
fails should be a byproduct of care.

DR. CARR: I think we are not too far behind schedule. We will break for
lunch now and resume then at 1:30 back in this room. Thanks very much.

[Whereupon, the meeting adjourned for lunch.]


DR. CARR: Good afternoon. Welcome back.

We have an equally exciting afternoon panel of speakers. I would like to
stay on time as much as possible. We are starting on time now, at 1:30. We will,
the same as this morning, look to have 20 minutes of presentation, followed
by 10 minutes for questions.

Seth, thank you for coming back this afternoon, wearing another hat — same
hat, but different topic. I will turn it over to you.

Agenda Item: Performance Measurement and Public
Reporting – NSQIP

DR. EISEN: These slides were provided to me by Bill Henderson, who is the
director of NSQIP, based in Denver, Colorado. I made some modifications, so I
share responsibility for them. Bill Henderson has been affiliated with NSQIP
for at least the past 15 years.

The basic goal of the National Surgical Quality Improvement Program is to
develop a standardized methodology that will permit evaluation of the quality
of surgical care within a medical center, within a medical center across time,
and across medical centers. The key points of that are that it is a
standardized methodology that is collected independently, for the most part, of
the surgeons and the other staff who are actually doing the procedures, and it
can be used to effectively measure the most critical outcomes — surgical
morbidity, mortality, length of stay, complication rates.

It provides patient risk-adjusted surgical outcomes to surgical programs
that permit evaluation with other programs. The data collection has to be
reliable and believable. That means that, for the most part, it is collected by
individual nurses who are trained and committed to collecting the data in an
unbiased fashion.

It empowers surgeons to review the quality. This is presented to the
surgeons in a supportive manner. It’s critically important for them to use this
information, of course, to identify where the problems might be in terms of the
surgical quality and to figure out what to do about it.

The NSQIP data is primarily intended for programmatic uses. It’s not really
intended to provide feedback to individual surgeons. In that way, it hopefully
is less threatening, at least to individuals, although it could certainly be
potentially threatening to programs.

NSQIP develops performance measures for surgery used by the program
administrators. It also maintains a registry of major operations and makes this
data available to researchers. One of the important byproducts of the process
— not only is it internal for use by the program surgeons, but it’s also
external and contributes to the overall knowledge about what is necessary to
provide high-quality surgical care.

Just a brief history. In the mid-1980s, Congress mandated the VA to compare
their surgical outcomes to the private sector. The problem at the time was that
there was no methodology for assessing surgical quality, not only within the
VA, but there was no method for assessing surgical quality outside the VA.

A couple of years later, the VA funded a couple of health-services research
investigators to develop risk-adjusted quality outcome data for cardiac
surgery. Initially, they focused on administrative data, but it became clear
that administrative data just wasn’t sufficient. The administrative data they
wanted the most commonly didn’t exist, and the data that was available didn’t
provide the outcomes that they were interested in.

In 1991, there was the start of what we consider the National Surgical Risk
Study. By 1994, there were 132 VAs that were participating. AHRQ also joined
the group, and now there are a number of non-VA hospitals that are
participating, and the American College of Surgeons is encouraging surgical
programs to participate nationwide.

The primary groups involved — the greatest interest is in major operations
that require general, spinal, or epidural anesthesia. But minor operations are
also of interest, although those that are well-known to be associated with
very, very low morbidity are generally excluded. For some of the more common
operations, such as TURPs and inguinal hernia repairs, while this data is
collected for these procedures, the number of patients on whom data is
collected is limited.

Finally, for high-volume programs, there is a limit to the number of cases
for which data is collected. But within the higher-volume ones, an appropriate
systematic sample is taken, with the intent that the data collected is truly
representative of surgical procedures.
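
[Editor’s illustration, not part of the spoken record: a minimal sketch of one common form of systematic sampling, taking cases at a fixed interval from a random starting point. The transcript does not describe NSQIP’s actual sampling rule, so the function name, the interval scheme, and the numbers here are assumptions for illustration only.]

```python
import random

def systematic_sample(cases, sample_size, seed=None):
    """Pick sample_size cases at evenly spaced positions in the case list,
    starting from a random offset within the first interval."""
    n = len(cases)
    if sample_size <= 0:
        return []
    if sample_size >= n:
        return list(cases)
    step = n / sample_size              # spacing between sampled positions
    rng = random.Random(seed)
    start = rng.random() * step         # random offset in [0, step)
    # min() guards against a floating-point index landing exactly on n.
    return [cases[min(n - 1, int(start + i * step))] for i in range(sample_size)]

# Hypothetical high-volume program: 1,000 eligible cases, 100 to be abstracted.
case_ids = list(range(1000))
sampled = systematic_sample(case_ids, 100, seed=1)
print(len(sampled), sampled[:5])
```

Because the positions are spread evenly through the case list, a sample drawn this way covers the whole period rather than clustering in one stretch of it, provided the list is ordered by date.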

I have been told that some surgeons can game it a bit. But I have also been
told that it is difficult, but possible.

A number of risk factors are collected in a standardized fashion. Nurses who
are committed to the program collect the data independently from the medical
records. These are the variables that are collected in preoperative risk
factors and the basic laboratory values. Also data is collected about variables
associated with the operative intervention and postoperative outcomes — vital
status, length of stay, whether or not the patient has to return to the
operating room, and complications. All the laboratory data is automatically
downloaded without further intervention. It’s downloaded into the database.

There are statistical programs that have been developed. The most important
are the mortality and morbidity associated with the surgical procedures. It’s
put into sort of a standard observed-and-expected-events kind of evaluation.
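
[Editor’s illustration, not part of the spoken record: a minimal sketch of an observed-to-expected ratio. It assumes each patient already has a predicted probability of the event from some risk model; the transcript does not specify the model, and the numbers below are invented.]

```python
def observed_expected_ratio(outcomes, predicted_risks):
    """Observed events divided by expected events.

    outcomes: 1 if the event (for example, 30-day death) occurred, else 0.
    predicted_risks: model-predicted probability of the event per patient.
    The expected count is the sum of the predicted probabilities.
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("outcomes and predicted_risks must be the same length")
    observed = sum(outcomes)
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return observed / expected

# Invented data for ten patients at one hypothetical program.
deaths = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
risks = [0.05, 0.10, 0.60, 0.02, 0.40, 0.03, 0.20, 0.10, 0.25, 0.25]

ratio = observed_expected_ratio(deaths, risks)
print(round(ratio, 2))
```

A ratio near 1 means the program saw about as many events as its patient mix predicted; above 1 means more than expected, below 1 fewer.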

This is an example of some of the data analyses that have been performed.
This is a 30-day predictor of mortality and overall morbidity. This is a
ranking of the risk factors. You can see that for both mortality and morbidity,
serum albumin is actually the most significant. This is a surgical class
evaluation. You can see some of the characteristics for mortality —
disseminated cancer, emergency operation, age, renal insufficiency, et cetera.

Several feedback mechanisms have evolved. There are quarterly reports, which
focus primarily on the observed/expected ratios for the various operations.
There are more summary annual reports. Because they have more data, they are
more accurately representative of the surgical morbidity and mortality. Chart
audits are performed as standard by the NSQIP nurses for patients who experience unexpected
adverse events. Occasionally, surgical programs ask for onsite evaluations to
help figure out why certain goals are not being attained.

This is an example of the unadjusted 30-day mortality rate for major
non-cardiac surgery. This is a function of time from 1996 to last year. You can
see how there has been a progressive decrease in programs that have been
participating in the NSQIP process.

This is unadjusted 30-day morbidity rate. You can see, shortly after the
program began, there was a marked drop, and now it seems like it’s a fairly
flat line. There seems to be a more dramatic impact of the program on mortality
than on morbidity.

NSQIP, as I mentioned, is also available for research use. There are now
over 1 million cases. So even rare surgical procedures can now typically be
found in the database. There are standardized procedures for applying for
access to the data. Over 100 scientific publications have so far resulted.

The publications are in a wide variety of areas. This is an attempt to
summarize the scope of the publications. Of course, mortality and morbidity is
a primary outcome, but also the relationship between volume and outcome,
surgical outcome, and teaching versus non-teaching hospitals, and modeling risk
factors for various operative procedures.

There have been a number of articles published about specific risk factors
and various complications and surgical outcomes in certain comorbidities or
subsets. Again, because of the large number of surgical procedures that have
been collected over the last 10 years, it’s feasible to do this sort of

This is a summary of some of the highlights of some of the research that has
resulted. There is good evidence that a strong level of feedback and
programming to the surgical participants indeed does have an impact on
morbidity, but there is not a significant impact on mortality. Laparoscopic
cholecystectomy has been evaluated. With the introduction of laparoscopic
cholecystectomy within the VA, the indications for actually performing the
surgery apparently have not changed, because the volume has not increased,
whereas in private care, the number of laparoscopic cholecystectomies has

Serum albumin has continued to be by far the single most important predictor
of surgical morbidity and mortality. Surely that reflects the underlying health
of the individual who is undergoing the procedure.

There is no relationship between surgical volume and risk-adjusted outcomes
in eight major types of operations within the VA system. Not surprisingly,
administrative data is not nearly as useful in terms of evaluating surgical
risk factors as the systematically collected NSQIP data. This helps ensure that
NSQIP will continue. They have proven their initial justification.

NSQIP reasonably predicts postoperative morbidity and mortality, both in VA
and in non-VA hospitals. Also of interest, postoperative morbidity and
mortality is higher early in the academic year compared to late in the academic
year. So try not to have your surgical procedure in July or August; wait until
March or April, at least if you are going to an academic facility.

It does cost money to collect this data. Whether you think it’s costly or
not depends on your point of view: about $40.00 for a major surgical
case done within the VA. Compared to other surgical costs, this is a small
amount. Presumably, the data that is collected and its impact on quality of
care offsets this cost, although, as far as I know, there has been no analysis
to demonstrate that that is actually true. Of course, how you cost the benefits
depends on how you cost the outcomes. If you include patients being able to go
back to work and their satisfaction with care, et cetera, the cost of data
collection relative to the benefits decreases.

This is the surgical cost at an anonymous VA, but it’s a real one. This is
another example of the kind of data that can be collected and analyzed from
NSQIP. This is one modest-sized VA, the accumulated data over the last several
years. Not surprisingly, there is an increased hospitalization cost associated
with postoperative complications — infection, cardiovascular complications,
thrombotic complications, and respiratory complications.

This is an article that either was just published or is about to be
published in JAMA. It is looking at the collected data. The
investigators looked at the relationship between hematocrit at the time of
surgical procedure and the subsequent postop 30-day mortality. I believe
surgeons try to transfuse to a hematocrit of 30. But the data from these
investigators indicates that there is an improvement in surgical mortality
with transfusing preoperatively at least to a hematocrit of 35.

Of course, this is a retrospective study and would likely not be the basis
for introducing national policy. But it’s this kind of longitudinal data that
provides the basis for justifying more expensive prospective randomized

Overall, NSQIP has been around for a long time now. It’s a well-established,
well-oiled machine. I think there is good data that it has been effective in
helping surgical programs evaluate their quality of care, and when it doesn’t
match their own standards or standards by comparison to other surgical groups,
to encourage them to try to figure out what the problem is.

Another indication of success is its widening use nationally.

DR. CARR: Thanks. That was a great summary.

I think it has taken on very much nationally. Many programs outside of the
VA are now taking it on.

One thing that strikes me is that we heard this morning about the importance
of physician involvement in the Pennsylvania project, that without that buy-in,
it doesn’t have credibility. Now we are hearing it from the other side. It
began with the physicians developing what they wanted, and they have tremendous

I feel, as we hear the different themes of the day, here is a great example
of an incredible embrace of this within and outside of the VA, and very, very
important data coming out of it.

It’s interesting. What you were talking about as sort of administrative data
would be other things added in the labs. It makes me wonder if there wouldn’t
be a way to ultimately capture some of the things in NSQIP, even though NSQIP
began by saying administrative data is not sufficient. But with the
improvements that we have heard about, the refinements, you wonder if there
isn’t a middle ground that we could come to.


DR. EISEN: It sounds like you are raising the question — potentially, with
the increasing use of the electronic medical record, the risk factors that have
been identified can be automated, rather than designating a person to actually
physically collect the data.

DR. CARR: Yes, I think that’s right. Also what we heard this morning from
Anne is that there is logic that can be embedded into administrative data sets.
If you have pneumonia and you are here for elective hernia, you probably didn’t
have it on admission. Some of that sophisticated logic, I think — I am not
saying we are there yet, but I see two parallel universes beginning to

MS. MCCALL: First, a comment. What a lot of this suggests — you made a
comment at the very end, with the new paper that — did you say it’s soon to be

DR. EISEN: I have seen the preprint. I don’t know whether it has been
published yet.

MS. MCCALL: There is a kind of circle of life here. We have been talking a
lot about practice. Yet this is something that actually begins as research and,
through what might be called a pilot study, might suggest prospective research.
It’s not enough to actually be used to sufficiently change practice. But the
prospective, then, could ultimately get put into an electronic record that had
some automatic data capturing, once some of those decisions have started to be

What it seems to suggest overall — and I would like your comment on it —
are different approaches and policies around intentionally creating the
research-to-prospective-study-to-decision-to-policy move. I would like your
thoughts on that. I would like to know if you think that there are things that
you would recommend to take this to the next level, to close that loop.

DR. EISEN: I don’t fully understand your question. To some extent, research
certainly can drive the elements of data collection in a database like this.
But researchers are so creative that I think that the greater the flexibility
in a database, the greater the potential for really getting informative

MS. MCCALL: Are you talking about a research database or —

DR. EISEN: A database such as this.

DR. GREEN: I would like to ask you to teach us a little more about two of
your slides. You showed a slide that showed trends in adjusted 30-day morbidity
rate, which, for about 10 years, looked like they are insensitive to anything.
It was pretty much a flat line going across there.


DR. GREEN: While mortality was going down. Later on you talked about
selected findings, that in surgical services with a high level of feedback in
the program, you had lower morbidity observed-versus-expected ratios.

DR. EISEN: I’m aware of the conflict.

DR. GREEN: It sort of looks like it flipped.

DR. EISEN: I can’t adequately explain why those two observations were made.

DR. GREEN: Do you know anything about what those high-feedback programming
events were?

DR. EISEN: I don’t know the details of how the surgical programs went back
to their participating surgeons and provided them with feedback information.

DR. STEINWACHS: I will take you off in another direction. Since you show
data like this to researchers — back when Congress mandated that the VA do
this, which was probably the era when Jack Wennberg was producing all the small
area variations, which he still does, around surgical procedures and so on.
Today the issue is talked about in terms of comparative effectiveness of
alternative treatments.

Has there been anything done using this — you are looking at a set of
severity measures that could be applied to a person who doesn’t get surgery,
even though they are potentially eligible, or could be potentially applied to
trying to — if you could put a denominator population on it, in terms of high
surgical rates adjusted for specific risk factors versus low. Has anyone tried
to use this severity measure to broaden out and deal with the concern about
whether you should or should not have operated or whether it’s timely, and so

DR. EISEN: As far as I know, the data collection does not permit examining
that kind of issue, that is, broadening the basic denominator, which is having a
surgical procedure. No one has looked at the advantages or disadvantages of
undergoing a particular procedure versus not undergoing a particular procedure.

MS. GREENBERG: Thank you for your testimony. I am going in yet a different,
more techie direction, back to your pre-lunch presentation. You mentioned how
there is a lot of free text in the records. I just wondered if you could tell
us whether the VA is in the process of implementing a structured terminology or
some interface to that, what the status of that is.

DR. EISEN: There have been some initial attempts to address the issue of
free text. In one recently published article, an analysis of free text was
performed to evaluate the quality of examination for posttraumatic stress
disorder. So there are those skills and interests.

This has not been a focus of VA research, but I think it’s a very important
one. The VA has researchers in informatics scattered in various parts of the
country. The locations include Indianapolis, Salt Lake City, and
Ann Arbor, as well as other sites. One of the issues is, can we somehow join
this intellectual and experiential group into a coherent and focused research

So I would hope that that would go forward beginning in the calendar year.
How long it might be before we have some initial useful results, I don’t know. But
my guess is that this is an area of research that would continue for many
years, unless the VA decides to abandon its free-text structure, which I think
is doubtful. At least there is nothing on the horizon right now.

MS. GREENBERG: Thank you.

DR. CARR: Thank you very much, Seth, for doing double duty. We appreciate it.

Now we will hear from Michael Lundberg on state initiatives.

Agenda Item: Performance Measurement and Public
Reporting – State Reporting Initiative

MR. LUNDBERG: I am Michael Lundberg. It is a privilege to be here today to
speak with you.

While the title has to do with consumer health transparency, actually I want
to talk to you about the underpinnings of this — or PSP. For those of you who
have teenage sons or daughters, I am not talking about the game console. What I
am really talking about is politics, science, and public reporting. Without all
three of those combined, it’s very difficult to move forward.

In order to do that, I need to give you a little history about the
organization. I would like to talk to you about some of the guiding principles
that we have had through our years with things that we have done. I would like
to talk to you about some of the specific things to do. While I do a little
round robin about some of the things we do with HMOs and others, I want to stay
focused on hospital reporting. Then I want to talk a little bit about the
direction we are moving, as well as what we are seeing on the national

For those of you who received our PowerPoint presentation, it’s rather
extensive. So while I am going to try to avoid “death by PowerPoint,”
we will go through this as quickly as we can and hopefully give you a little
background and maybe stimulate some questions.

Our mission statement has been unchanged for 14 years. We are around in
order to create health information, in order to help Virginians make more
informed health-care purchasing decisions, consumers and business, and to
enhance the quality of health-care delivery through the information that we
provide to hospitals and physicians and others.

We have been around since 1993, which is something that we are pleased to
say to start off with. We are in a political atmosphere. There are all types of
things — not only public reporting, but funding. There is also vitality and a
changing landscape. We have been fortunate to have the support of the General
Assembly, our health-care stakeholders, and others throughout the process, to
help us move forward and do some things that we are proud of and do some things
that we want to change and improve.

We do work through contracts with the state health commissioner, private
contracts, sales and services, and other government agencies, here in D.C., as
well as in Richmond and others.

I work for a board of directors. These folks are nominated by their trade
associations. I want to point out, that is a very important thing. While you
can have gubernatorial nominees that look and smell like a business
representative or a hospital representative, it’s critical to actually have
folks that are nominated by those associations, because the decisions that are
made on this board affect their direction. So these are done in order to
honestly give their input and help guide what we do.

When we started out in 1993, we were 100 percent funded by general
appropriations, which is a four-letter word, which means taxpayer dollars. We
were around 12 percent in the last fiscal year. We will be up a little bit more
because they did provide some funding for transparency, which we have been
working towards for a number of years. They did provide some funding for additional
information. I will talk about that.

For the balance, provider fees help support something that is called the
EPICS system. It is, I think, the first time there has been a hospital/nursing
home/ambulatory surgical center efficiency and productivity ranking. I have
copies of some of those. That is called special dedicated revenue. Basically,
they pay for the privilege of us using their data. It’s a nice position to be
in. But it also supports them in many different ways, both through contractual
negotiations with the Anthems and the Medicaids of the world, as well as with
their own internal quality improvement and performance improvement.

Products and sales I will talk a little bit more about. That has to do with
licensing the databases, special services. We work a lot with Anthem Blue
Cross/Blue Shield in their multi-state pay-for-performance program, and there
are a number of other things that we do.

The information we get:

  • We start out with hospital discharge data. There are about 860,000
    discharges a year, plus the mental health data.
  • The EPICS, which I just referenced. Again, there are copies over there.
  • Outpatient surgical data. It has information, regardless of whether the
    surgery took place in a hospital outpatient department or a freestanding
    ambulatory surgical center or a physician’s office. We get some administrative
    claims data. We are an administrative claims data shop, plus we have financial
    and operational information, plus we collect direct primary data on certain
    things. I will talk more about the high degree of excitement that we have about
  • HMO data from the State Corporation Commission and NCQA, about 70 measures.
    We publish that.
  • Long-term care. There are about 1,600 providers — where they are, how to
    get to them, what their fees are, services, quality measures, and other
    financial and operational.
  • We collect information for the Commonwealth of Virginia for their
    licensure surveys that support their COPN. We are a CON state. So we have a lot
    of information on the utilization of those services that are regulated.

We have a series of performance measures that we have put out over the years
on hospitals, coronary care mortality, 30-day readmissions. We have
actual/expected length of stay and charges for different service lines. We also
have information on obstetrical care. We tend to put these things out in
consumer guides. So while we do have reports within those, it’s primarily an
educational process.

I think there may be some copies of this. We are on our third version of
this now. This is the second version. It is hospital- and physician-specific.
The first ones were pre-Web, pre use of Web. This is a hybrid, in that it was
published both on the Web and in hard copy. It has rates of cesarean delivery,
length of stay by physician, about 600, and by hospital, about 90 in Virginia.
We are in the process of updating that information as we speak.

Nursing facilities, CMS quality measures, which we applaud — efficiency
ratings, private dollars per day, Medicaid participation, profits, HMOs,
quality and satisfaction measures, which a lot of things are actually being
based on today.

So we have a number of things. What really guided us back in 1993, when we
decided how we were going to decide what to do — again, we were funded at a
rate of $300,000 in 1993. That was the big year. We are today a swollen and
bloated staff of six and a half. We are good to seven, and we are working
towards that.

But what was important was to have things that would be meaningful to folks,
and we could translate that as affecting significant portions of the
population, which is really why we started with OB as the very first one,
because that is about one in four hospital admissions. Actually, it starts out
as one in eight where the mother is admitted; one in four where there is a
discharge. So it depends on how you look at it.

Cost should either be a high-cost condition or the total cost as far as a
burden to the society.

Variation: The outcome of interest should actually have variation, or why
bother to look at that? If everybody is all the same, then you just say it’s
great and you pack up.

The ability to adjust for severity has always been a key and driving force.
That is particularly an area in which we have been limited, primarily with
administrative claims data, and then the tweaks, I think, that someone was
mentioning — the ability to use the data in different ways. We happen to
exclude hospice patients in certain cases. We exclude certain transfers that
are very, very high-risk, expanding the number of diagnoses, secondaries, from
nine to 24. All those things are intended to enhance something that was not
intended for measuring outcomes, but is being used nonetheless.

In developing the report, I really want to stress something that we learned,
which is that collaboration is important. There have been instances where
people have sat in a room and designed a report and published the report and
come out and they have done that once. You can usually do that once. But you
lose, because you don’t have involvement by the folks that are involved with
the process, who know the data as much as possible, who have the links to the
support that you actually need for this. So collaboration is key to this.

It takes longer, but it can result in better information, because you have
the input of all the different folks. They ask questions that you could never
think of. They know this information.

Good science: We tend to work with health-services researchers. I mentioned
the very swollen and bloated staff of six and a half. That does not include six
Ph.D. statisticians or physicians. We tend to contract with folks like that,
either at the University of Virginia or Virginia Commonwealth University’s
Williamson Institute. We find and work with the folks that can lend their
expertise, so as to help us make things as good as possible.

Surprises — just don’t have surprises. You can do that by keeping people
involved. When you work with physicians, when you work with hospitals, HMOs,
nursing homes, whatever, you keep them apprised of the process. It is hard to
do this, and it is slower. We are in the process of returning outpatient
surgical records to 2,600 physicians in the state of Virginia. Most of them
have never heard of this because a lot of those were reported by the hospitals
and/or surgical centers. So you have interesting discussions with them when
they see this.

But it’s important to keep them apprised so they are not surprised. We work
with them, and they know where we are going. There are no surprises.

Follow-up: Every time we publish cardiac care mortality rates and
readmission rates, we will hear from the facilities. It’s not usually the
physicians who look better than expected. It’s normally the folks that are not
as good as they think they should be. There are challenges. They would like
medical record listings. They want the statistical properties. They want
everything that we have done on this.

Although we publish most of this, we will follow up on each and every one.
We will give them the listings. We will give them whatever. We will sit down
with our scientists and go over them. We have never had an issue which they
didn’t understand and didn’t appreciate and then didn’t respect the approach.
There may still be some issues, some concerns, but that helps keep everything

I started off with collaboration. It’s absolutely critical, especially with
a small organization. Here is an example of good collaboration — people
sitting down, looking around, working together. There are no surprises here.
They are listening. They are paying attention.

Here is an example of collaboration that is not so good. If you don’t work
with people, they will work over you. So it’s really, really critical that when
you work with people, you are honest and you are open and sincere. Yes, I have
been the cat; I have been chased from the cake.

The data is easy to get hold of today. That is one of the things that we
hear about administrative claims data. It is relatively easy to get hold of.
It’s true. But it’s not so easy to use. Collaboration is important. You can be
guided in the right direction by working with people. I have never not had
someone work cooperatively with us, in everything we have ever done, if you are
honest about what you are going to do and you are honest about your approach.

Very briefly, Anthem has a pay-for-performance. It’s in multiple states. We
have nurses do the medical review. We have Web tools to collect the data.
Nurses do the evaluation. They used administrative claims data in the past. Now
they are doing workforce hybrids, some primary data collection.

We have a series of consumer publications. Some of these are over on the
desk. We tend to go print and then online. Some of the things are just online
— anything from HMOs to hospitals to cardiac care.

I mentioned this earlier, the efficiency and productivity. Essentially, this
is not a consumer product. It is designed for large employers who purchase
care. It does have contractual allowances, which is another way of saying
discount rates. It has the profits. It has charity care and others.

The reason I mention this is that we use some of this in some of our

We rank hospitals in their area by their cost per day. We are adjusting this
using the APR-DRG severity index. We have service lines and other things that I
will show you, too.

Cardiac care: Again, it’s an open process. We work with researchers. It took
a number of years to get people to accept this process. We are currently
working towards expanding to 30-day mortality by linking data from vital
records. We use a modified approach to the APR-DRG risk of mortality and then
severity index for the readmissions.

That is about one in seven hospital admissions, so it is significant.
Cardiac care, medical cardiology, is the single largest. There are about 6,000
in Virginia. There are about 860,000 admissions. There are about 30,000
angioplasties. So we are looking at volume here. By wrapping all this under
cardiology, cardiac care, we are providing information that is more significant
for the population.

Essentially, when you go through there, you would pick it by the service
line and the region and the hospital, make a report. Essentially, this is a
consumer focus, where you would start off — what you see is really what the
readmission rate is. If you click on “Show detailed view,” it will
then take you to what the actual-to-expected ratio is. That’s what is
reflected in the consumer report.
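
As a rough illustration of the actual-versus-expected comparison described above, the following Python sketch computes the two rates and their ratio for one hospital. The function name, the record layout, and the risk values are hypothetical; the testimony says only that an APR-DRG-based adjustment is used and does not describe the model itself.

```python
def actual_to_expected(outcomes, predicted_risks):
    """Return (actual rate, expected rate, ratio) for one hospital.

    outcomes: 1 if the patient had the event (e.g. a readmission), else 0.
    predicted_risks: model-predicted probability of the event per patient.
    These are hypothetical inputs; a real risk model would supply them.
    """
    if not outcomes or len(outcomes) != len(predicted_risks):
        raise ValueError("need one predicted risk per patient")
    n = len(outcomes)
    actual = sum(outcomes) / n            # observed event rate
    expected = sum(predicted_risks) / n   # rate the risk model predicts
    if expected == 0:
        raise ValueError("expected rate is zero; ratio is undefined")
    return actual, expected, actual / expected


# Made-up example: 10 patients, 2 observed readmissions.
outcomes = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
risks = [0.30, 0.10, 0.10, 0.20, 0.25, 0.10, 0.10, 0.15, 0.10, 0.10]
actual, expected, ratio = actual_to_expected(outcomes, risks)
print(f"actual={actual:.2f} expected={expected:.2f} ratio={ratio:.2f}")
```

A ratio above 1 would mean more events than the model predicts for that patient mix, and a ratio below 1 fewer; whether a given difference is statistically meaningful is a separate question this sketch does not address.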

We also take the medical cardiology and break it down into subgroups that
people can understand a little bit more, like heart attack or AMI and other

The service line report grew out of what we do for our industry report that
has length of stay and charge information. Consumers have said that they like
to know what percentage of hospitals do — they are looking for areas in which
the volume is significant.

I think we all know that volume isn’t always important, but it sure doesn’t

So they have that. That is the consumer version. We put that up because the
consumers said they wanted it, which surprised us.

Here is a version from the CD. This actually started out as a 1,200-page
report. Now it’s 34 pages, and everything is on CD. That is not consumer
information, but it’s something that researchers and providers and businesses
use that takes the information and gives you the length of stay and the

I just wanted to show you the different flavors.

I mentioned briefly outpatient surgery. We collect seven procedure groups
that are based on their volume, their cost, their actual and perceived risk,
their likelihood of moving to the outpatient basis — things like colonoscopy
and laparoscopic surgery, facial surgery, which actually gets you into the
retail market of physician services, as well as liposuction and others, knee
surgery and others.

A focus group that we held in a private company asked the five most
important things they would like to know. So we geared the way we will be
flavoring this thing based on those things. An example is laparoscopic, which
is a wide variety of procedures. This is written at a sixth-grade level, which
is not easy. You can’t use the word “abdomen.” You use the word
“belly.” How many people cringe when they hear that?

The point is, if you get down to those levels for consumers, you are
tailoring it to something that is very easy, very short sentences, few
syllables. We will tend to allow people to get more information than they want.

They are interested in the risk. They are interested in how it’s done, why
it is done, and others, and recovery.

If you were looking for it, we tend to use this flavor. In this case, you
would pick information by an area, pick your procedure, and then from there, in
this case, physician’s office. I skipped a step just to try to make this a
little shorter. This is the type of thing. You would have been presented with a
list of physicians that met your criteria. You would pick the one you want, and
you could hear how many they did in their office and their average charges,
compared to the minimum and maximum for other physicians. If they did them in a
hospital, it would actually let you know that they did them in a hospital. Then
you could link over to their performance within the hospital. Now you would
compare the charges with hospital charges. They are very, very different.

I mentioned the EPICS. We have information on contracted discounts. That’s
the difference between the gross charges, which you typically get in
administrative claims, and the payment amount. So the gross minus that is the
discount rate. We have that for Medicare. We have that for Medicaid. We have
that for all commercial wrapped together. We have that for all other, which
happens to be primarily self-pay. We don’t know what United’s discount rate is.
We know all commercial together. But we separate those.
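
The arithmetic behind the discount figure can be sketched in a few lines of Python. This is a minimal illustration, assuming the rate is expressed as the contractual allowance (gross charges minus payments) divided by gross charges; the payer groups follow the testimony, but every dollar amount and every name in the code is made up.

```python
def discount_rate(gross_charges, payments):
    """Contractual allowance as a share of gross charges."""
    if gross_charges <= 0:
        raise ValueError("gross charges must be positive")
    allowance = gross_charges - payments  # the contractual allowance
    return allowance / gross_charges


# Hypothetical whole-book-of-business totals for one hospital,
# as (gross charges, payments) per payer group.
book_of_business = {
    "Medicare": (1_000_000, 450_000),
    "Medicaid": (400_000, 140_000),
    "All commercial": (800_000, 480_000),
    "All other": (200_000, 60_000),
}

rates = {
    payer: discount_rate(gross, paid)
    for payer, (gross, paid) in book_of_business.items()
}
for payer, rate in rates.items():
    print(f"{payer}: {rate:.0%}")
```

Because the inputs are totals across the entire book of business, the result says nothing about the discount on any single procedure.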

It’s an important addition to what we are doing. It’s the first time we have
done it. We have been working with hospitals and others about using this
information. We just try to do that carefully. But we are trying to get
consumers better information.

That discount is based on their entire book of business. We don’t really
know that their discount rate is 46 percent for inguinal hernia. We know for
the whole book of business.

When we field-tested this with consumers, we really didn’t know what they
would think about it. They said this was a lot better — they understood the
retail world. They understood the difference between a car sticker and what you
pay. If they know the discount, then they can do the math and they can use it.
It gives them information to discuss further.

We are working to change our website to a consumer health portal. Right now
we serve business, consumers, and providers. We try to do everything for
everybody. We have information that is geared to all those. We are focusing
primarily on consumers. We will port the other folks off, not to another site,
but another section. So it’s primarily with consumers.

Consumer interest in price and quality goes well beyond hospitals. A
sixth-grade reading level makes the information easy to find. Integrate
information from other sources — as far as I mentioned, with contractual
allowances, but also the CMS websites and other websites that have been vetted
for different groups that we have.

Other languages — are they compliant? That is another whole deal all by
itself, a lot of effort associated with it.

For those folks who don’t have access to the Web (there are still quite a few
of them), we have always had, for a number of years, a toll-free number.

Later this year or early in 2008, we will have another consumer guide to
obstetrical care that will be all this and more. It does have years in
practice. It has information on board certification for physicians. There is a
pretty detailed survey on hospitals, on their capabilities, on what they can do
as far as taking care of different levels of babies, their educational process,
and their breastfeeding and others. We are looking to update that. We will also
have rates of cesarean delivery, length of stay, and charges.

We are looking towards incorporating some of the AHRQ trauma indicators,
which are also very similar to the JCAHO indicators. We are also looking at
episiotomy rates to include on this. That would be hospital- and

That takes just a little while to do. There are a lot of issues on
episiotomy and some of the other ones.

Cardiac care and 30-day mortality — we are linking that information. We
apply the risk-adjustment factor.

We are looking to test independently AHRQ indicators for public reporting.
We are waiting for the feedback from the NQF. We are looking for them,
hopefully, to do a great job of doing some more work for us.

We are looking for the portal.

The other thing we are very excited about is the present-on-admission, lab
values. We think those are realistic to be working on. The other clinical
information is going to be more difficult, because it’s harder to get those

I think you are probably aware that the all-payer data, primarily
administrative, is available in 48 states. This actually complements an
organization that I have worked with in different capacities called the
National Association of Health Data Organizations. They represent organizations
such as mine, as well as the other groups that do this work.

A number of states do produce at least one sort of quality report. Medical
error reporting is becoming a big deal, the adverse events, like they have in
Minnesota, as well as the hospital-acquired infections.

Speaking of that, here is some information also that is hospital-acquired
infection legislation in the states. Those that are in the gray are looking to
use the CDC NHSN system for reporting. Virginia is one of those. We are also
looking towards other things.

This is an example of an AHRQ area-wide indicator on selected infections due
to medical care, which has actually done a lot of work to try to adjust out
infections that were brought in from an outside source. Without accurate POA,
it’s still hard to do.

We can very well say that we can see that this hospital treated this number
of people with infections, but we really don’t know if they acquired it. You
can do some things with nursing homes and others. POA, with the right training
of physicians and hospitals and follow-up, will be a good thing.
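
To make the role of a present-on-admission flag concrete, here is a small Python sketch that separates infection diagnoses by that flag. It assumes the indicator values Y (present on admission), N (not present), U (unknown), and W (clinically undetermined); the record layout, the placeholder diagnosis labels, and the function name are hypothetical and do not describe any state's actual data system.

```python
def split_infections(diagnoses):
    """Group infection diagnoses by present-on-admission (POA) status."""
    groups = {
        "present_on_admission": [],
        "hospital_acquired": [],
        "undetermined": [],
    }
    for dx in diagnoses:
        if not dx["is_infection"]:
            continue  # only infection diagnoses are of interest here
        poa = dx.get("poa", "U")  # treat a missing flag as unknown
        if poa == "Y":
            groups["present_on_admission"].append(dx["code"])
        elif poa == "N":
            groups["hospital_acquired"].append(dx["code"])
        else:  # "U", "W", or anything unexpected
            groups["undetermined"].append(dx["code"])
    return groups


# Made-up discharge record with placeholder diagnosis labels.
record = [
    {"code": "DX-A", "is_infection": True, "poa": "Y"},
    {"code": "DX-B", "is_infection": True, "poa": "N"},
    {"code": "DX-C", "is_infection": False, "poa": "Y"},
    {"code": "DX-D", "is_infection": True},  # flag not reported
]
print(split_infections(record))
```

The "undetermined" group is where the point about training and follow-up bites: if the flag is missing or coded unreliably, most cases land there and the split tells you little.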

Certainly, legislating it is a good start. We will work towards doing that.
If you don’t have the training, if you don’t have the follow-up, it’s not going
to be very good.

Here is a little information, just simply taking similar patients with and
without infections, looking at length of stay — a fourfold increase for length
of stay with infections — percent died, as well as the total charges.

The same thing by hospital. You can see the variation. The red bar shows the
statewide rate. You can see hospitals up and down, and you can see their volume.

Postoperative sepsis is another one that has had a lot of control from the
folks at AHRQ working on it. They are doing what they can with what is there.
It’s showing similar differences.

The fact is whether or not you want to hang a hospital for this, there need
to be ways to control infections before they get into the hospital or before
they occur in the hospital. It’s clear what happens when people have

Again, just showing hospital variation. We have seen some volume

Here are things you have all been hearing about today.

The other thing you don’t hear about is, it’s actually mandated to be
accurate. There are conditions of participation that speak to some of these. I
had a business rep tell me that one time. He said, “At least here you have
some underlying sense of accuracy,” which is interesting. We like that
because it supports what we do. But, honestly, there is something to be said
for that.

In Virginia, it takes about six months from the close of a discharge to
where we have the data up and ready. That lag time is something that bothers —
it’s a lot better than some states, but it’s worse than others.

I saw this quote recently. It’s clear that moving to clinical data will be
like this. A hybrid is a good way to start. I think a lot of people here have
been thinking about that, and the audience and others.

NAHDO has come up with a vision, that by 2010, all states would have some
form of POA or lab values and others, and works very hard with its
stakeholders, sponsoring workshops, sponsoring discussion groups, sponsoring
legislative forums, and others, to help support this.

National standards are important. Very little of what we are seeing started
out because of national standards. They started out because of individual
efforts, with the exception of the UB-92. But the idea is, there is a lot of
innovation that takes place in the states. It scares some folks when they see
that there is going to be someone who is going to come and help us by having a
national standard for everything. It’s good to have standards. But the fact is,
you need to have local innovation.

This is intriguing. Those of you who know Dr. Goldbeck know that he has been
big with the Washington Business Group on Health for many, many years. The
statement that he said here was actually in 1985, about assumptions about waste
and variations and the importance to do things right and to have standards, and
how important volume can be and how social disparities drive different
health-status indicators. This has been found to be true. About 12 years ago,
he came back and reminded us about that. It continues to be true.

The simple fact of the matter is that health is something that has been a
concern to many for many, many years, including Thomas Jefferson. It’s
something we need to pay attention to, and it’s the reason I came here and you
were here, too.

Thank you.

DR. CARR: Thank you very much.

May I ask, this wonderful work — we didn’t talk about what happened when
you publicly reported it. Do you have a couple of stories or highlights of what
got better with this when it is published?

MR. LUNDBERG: It’s always so difficult to say. We know that when we
published the 2005 cardiac care overall mortality rate, it dropped 12.7 percent
in the last three years. Should we take full credit for that? Of course not.
There are so many different things that are going to address that.

I would fall back on the growing body of research that is showing that
pay-for-performance — that public reporting enhances pay-for-performance. I
would say public reporting enhances everything. People want to look better. But
I cannot give you a measurable response to that.

I can say that length of stay has dropped since we published that. But can
we take credit for that? Of course not. We are not scientists ourselves, so
it’s very hard to tease out this.

Just the cardiac care stuff itself, beta blocker use and all these other
process measures have certainly had an effect.

MS. MCCALL: I know it’s difficult to measure, but have you had a chance to
look at patterns of information consumption behavior, and what people are
looking at, what they are paying attention to, and if those are the things
that, in fact, seem to be changing? Is there a relationship at all?

MR. LUNDBERG: I can tell you more about patterns in what people look at.
What people love to look at is — we have physician information on education
and years in practice. We have about 1,000 visitors a day. Sometimes up to 40
to 60 percent are looking at physician information.

Right behind that is the hospital information and nursing home information.

MS. MCCALL: Is there a relationship between the specific things, and pages
and metrics they are looking at, and that drop in mortality?

MR. LUNDBERG: I can’t answer that.

Of course, we see, whenever we have a good press release or something like
that, everything goes way up.

What is in the news is what — physician information is huge. So is nursing
facility information.

DR. GREEN: Michael, could you go further with your future slides and what
you are anticipating and say a few words about your thinking about public
reporting and performance of the insurance industry?

MR. LUNDBERG: Public reporting by the insurance industry?

DR. GREEN: No, of the insurance industry.

MR. LUNDBERG: We are thrilled that NCQA is stimulating PPOs to participate
in performance measurement. In Virginia, there are still 1 million people in
HMOs. There are 70 measures there: Are you happy with your doc? Are you happy
with the HMO?

So we think that the direction towards PPO information is very exciting. We
will be embracing that.

DR. GREEN: I’m not talking about PPOs. I’m talking about the rate at which
an insurance company pays a claim, the amount of money that they pay compared
to another insurance company — the very same sort of information for public
consumption about the payer side of the health-care industry that you are now
doing such a nice job of reporting about the provider side. What is your
thinking about where that goes?

MR. LUNDBERG: Where we have what would actually be paid? We currently have
per member per month, which is as close as we can get right now to the data we

Are you talking about provider reimbursement?

DR. GREEN: No, I’m not talking about providers at all. I’m talking about the
payers. I’m talking about the performance of Medicare, the performance of
Virginia’s Medicaid program, the performance of Anthem in Virginia, and what
its performance measures are for its members. Are you doing anything about
public reporting about that?

MR. LUNDBERG: So United’s performance measures for its clients? I’m sorry.
I’m having a hard time.

DR. GREEN: I think that’s the answer. I got it.

DR. STEINWACHS: I like very much your focus on consumers and trying to
provide information that consumers might use. It sounds like you have at least
been successful in getting them engaged in your website. I am not surprised
that they like to know something about physicians, since it’s hard to get any
information most places about physicians.

As you look to the future, are there ways to make the information more
relevant to consumers? From what little I have done in the area, sometimes
consumers — at least with mental illness — tell me they would like to know
about the treatment or the outcomes for people like themselves, and some way to
be able to go into a database, possibly, or to be able to characterize your
database: Here are people who have certain sets of conditions or problems. How
their outcomes look is different from someone else who doesn’t have that when
they go into a surgical procedure.

MR. LUNDBERG: By payer right now for mental health outcomes, you can see
information, at least for HMOs, on how well they do at any medication
management or other things, which are actually very important process measures.
Crossing those with readmissions and recidivism is something that — we do
readmission rates for mental health care and other things like that.

Does that mean a lot to the consumer now? I don’t know.

DR. STEINWACHS: I was trying to push you out to your plan for the future. Do
you foresee being able to structure things, certainly with a database, and
sometimes structure inquiry, where if I sat down and told you I had
schizophrenia and I also had diabetes and congestive heart failure, and my
physician has recommended that I have X procedure, what the outcomes would be
for other people like me having that procedure? You might not be able to do it
by hospital, because the numbers might be too thin. But you can at least go in
the aggregate and say, “You are likely to have worse outcomes in certain
ways or better outcomes in certain ways.”

MR. LUNDBERG: Given that psychosis is the leading mental health reason for
admission, I think it would be very easy to show differences in length of stay
and things like that, without jumping backwards into the outpatient arena,
talking about the different therapies and interventions and combination of
medical versus psychiatric.

The only thing we are really doing specifically on mental health right now
is developing this psychiatric-bed registry to help folks who place folks know
where there is an empty bed.

DR. CARR: Thank you very much.

Betsy Clough is on the line now. We will present her slides.

Agenda Item: Performance Measurement and Public
Reporting – Public Reporting, WCHQ

MS. CLOUGH: Just briefly, the purpose of my presentation and discussion
today: I will go over a really quick background about WCHQ, talk a little bit
about how we collect and compile the data for public reporting, and how
organizations are using the information.

This slide just briefly gives an outline of our mission. We are a voluntary
consortium of organizations — hospitals, physician groups, health systems,
health plans, and various employers and purchasers from around the state
working together to improve the quality and cost-effectiveness of health care
for the state of Wisconsin. We do this by developing and publicly reporting
measures of health-care quality.

The four buckets on the bottom half of the slide depict our strategic
priorities for 2007 and 2008 — primarily a focus on performance measurements
and reporting, continuing to develop our portfolio of measures, as well as
using the data for improvement and then having other stakeholders use it,
whether that’s consumers, purchasers, or payers.

Just as a reminder, the collaborative was founded in late 2002 by nine
health systems from around the state of Wisconsin, with the goal of being
transparent, publicly reporting outcomes, as well as improvement.

Since our inception in 2002, we have grown to represent about 40 percent of
all of the physicians in the state and 21 hospitals around the state. Many of
them are in competing markets. It is our goal by 2010 to have over 75 percent
of the primary care physicians in the state represented.

As we were founded, each of the health systems that founded the
collaborative brought with them a business partner. The purpose of bringing
them into the mix was really to have them aligned with our effort, rather than
having multiple initiatives occurring within the state regarding transparency
and improvement. We thought that it would be best to get everybody aligned,
rather than having separate or competing initiatives. Most of these business
partners are thoroughly engaged in our work. Many representatives serve on our
various board groups. There are two business partners on our board of
directors. They all regularly attend our monthly meetings.

This is just a brief history of WCHQ. I won’t go through that.

As I mentioned, the first meeting of the CEOs to create the collaborative
was in the fall of 2002. By the fall of the following year, we had released a
public report.

As we think about the catalytic sparks that really spurred the development
of WCHQ, there are a few things. One was just transparency overall. We knew
that hospital reporting was coming sooner or later for us, so we needed to do
something about this, to be part of the solution rather than part
of the problem. There was also a lot of internal pressure, as well as market
pressure to improve and be transparent. We had our business partners kind of
pushing us — every other industry is transparent, and we know how everyone
performs, but we don’t know much about health care.

Also during that time, there was a state-mandated physician claims database,
where physician groups had to submit data regarding outpatient visits to a
claims database.

So that was occurring, as well as simply the vision, that physicians had to
know that unless they did something and created it, someone else would do it to

I would say that the physician leadership and vision really was key to this.
That it was created by physicians and has complete physician engagement has
been important. We started small, representing geographically distinct and
separate markets, and have grown to represent most of the state of Wisconsin.

As I mentioned earlier, it was about a year from when the first meeting
occurred to when our first report was released. It wasn’t until about February
of 2003, when the CEO pulled in the quality folks and said, “Okay, we’re
going to release a report. Go figure it out.” Initially what we did was to
develop a set of criteria to be able to evaluate which measures to use. The
criteria included feasibility of data collection or harvesting it, the impact
on populations, the potential for improvement for the measures, whether there
was clinical evidence, and then the value to various stakeholders, including
employers and consumers and providers.

So we used those sort of as the focus point as we evaluated the measures. We
also realized that we had about seven months to get a report ready. It was
about what data and what information we had in place that we could look to.

After our release of the first report in the fall of 2003, there was a fair
amount of pushback from physicians. They didn’t necessarily believe that we were
using the best data. So we set about developing a methodology that would
represent all physicians, all payers.

Also there was a desire, at that same time, from the medical community, as
well as the employer communities, to begin to look at some efficiencies, in
addition to effectiveness of care. They didn’t really define efficiency. It was
a hard thing to define.

As we started developing those measures, it was really about kind of
engaging the volunteer army, if you will, the data staff and
quality-improvement staff and medical directors and other clinical
professionals to begin to develop these measures.

What we found was really important was to make sure that no matter what we
were talking about, whether it was the efficiency measures or developing the
ambulatory care measurement, we had physicians engaged in the process all the
way along the measure selection and development.

This just depicts the public reporting that occurs with WCHQ at the
physician group level. The methodology that we developed does enable the
physician group to be able to harvest the denominator administratively and then
really go on a treasure hunt for the numerator, to complete the clinical data.
The net result is online, using a secure Web-based tool. They have aggregated
results at this time, but we are working towards moving to an individual
patient-level submission.

At that same time, once data are submitted, everyone goes through a
data-validation process to make sure that we are all measuring apples to
apples, and results don’t go live until that process has been completed.

The next portion of my set of slides is really focused on the improvements
that we have seen and the results that we have seen over the last four years.

If you go to WCHQ.org and click on “Reports,” you will be taken to
the most important part of our website. Here you can view our measures by
either the type of provider or clinical topic, and we also have them separated
by category.

Just a quick overview for each of the measures that we do display. We
publicly report at the system level, when we are talking about a physician
group, and then at the hospital level when we are talking about the inpatient
side. For each measure, whether we are talking about inpatient or outpatient,
we make sure that we are displaying the name of the system, obviously, and then
the population for each measure we are talking about.

If you click on the historical data link, you will be able to see historical
results for as long as we have the data for.

We made a decision early on to publicly report at the group level.
Initially, we just thought that there were simply too many political issues, as well
as scientific issues, with reporting at a more granular level publicly. Due to
how we measure, groups are able to report internally at the individual provider
level. But until we have more empirical evidence about whether or not we should
publicly report at an individual level, we have chosen not to do so.

I would say that there really haven’t been any complaints or any issues with
reporting at the higher level.

This slide just depicts Bellin Medical Group. They are a group based out of
Green Bay, Wisconsin, that has seen quite dramatic improvements with diabetes
patients who are under good control regarding their hemoglobin A1c. As we have
spoken with them and have begun to understand how they are seeing such
improvements, it was really about using the data that they have. They used the
WCHQ measurement methodology sort of as a framework to begin to really
understand their practices around how they are taking care of patients with
diabetes. Rather than just measuring for measurement’s sake, they are really
studying it, understanding why these patients didn’t have a hemoglobin A1c and
then why they weren’t under control. They are feeding those results back to the
individual providers.

Another example of a dramatic improvement in diabetic care is Advanced
Healthcare. For them, the story really starts even before the results went
public. This really started when they decided to participate in the
collaborative. It’s about a strong commitment to transparency and also a strong
commitment to improvement. They really started to align their board, their
leadership, and their quality-improvement staff, as well as their medical
staff, around this transparency and improvement.

Similarly to what Bellin has done, they have also used the WCHQ measures as
a framework. Instead of kind of floundering around saying, “Where do we
start? What measures do we use? What disease do we start on?” they have
said, “This is what we have decided with WCHQ, and here’s the list,”
diabetes, hypertension, preventive cancer screening measures.

Regarding diabetes, they simply used the data to, number one, build a
registry using the WCHQ methodology, and then are publishing internally the
comparative reports for each physician and then are starting to prioritize
follow-up lists for patients. They have developed patient-notification

What also developed as a result of this was a way for them to better
understand their data, understand how to better use the EHR for both collecting
and reporting data, and then also the issues around documentation by providers.

We will release another round of results for our diabetes measures in the
fall. I am quite confident that we will continue to see an improvement in the
results that we are reporting.

This slide just depicts the population focus. A really big, important focus
for us is the impact that we are having on the overall population of Wisconsin
and those patients being treated by providers. For every measure that we report
on the ambulatory side, we give population results.

If you click on the historical link, what you will see is the improvement
that has been made over the last three years. When you look at the entire WCHQ
population, I must say, it’s quite impressive. Even with the fact that we have
continued to add physician groups and providers, we see an improvement.

This is just another example of overall population improvements.

This slide just depicts the pneumonia composite score summary. I just want
to talk for a minute about the work that we have done regarding efficiency
measurements on the hospital side. As I mentioned earlier, there was a desire
by many of our stakeholders to begin to understand not only how effective we
were at taking care of patients, but also how efficient. So we convened a
workgroup with a lot of different stakeholders. They met for over a year,
trying to decide how we would measure and then subsequently report the
information publicly. We looked at EPCs and different risk-adjustment models.

What we concluded was that, due to the fact that we are just physician
groups and hospitals, we are missing some data. Where we landed was what we
referred to as our attempt at an efficiency measure.

What we start with, as a first step, is the Joint Commission’s measures for
different conditions. We report for three conditions: congestive heart failure,
pneumonia, and heart attack.

Using the composite score methodology developed by Premier, we calculated a
composite score for each participating hospital. What is shown on this slide are
the features that make up the pneumonia composite score.

We partner with MetaStar, our state’s QIO. They actually harvest the
measured results for each hospital. They submit that to WCHQ, and then a
composite score is calculated in our database automatically.

The second step is to harvest the length of stay and charges. A business
partner does that for us in a similar way. They have access to our state
database, and so they harvest that information. Then the results are
risk-adjusted by one of our business partners using APR-DRG risk-adjustment

Both sets of data are then combined to create a dot for each hospital. I
would also add that both pieces of data are validated. We rely on the core
measures to serve as our validation. We also do validate and audit the process
that our business partner uses to harvest and risk-adjust the length of stay
and charges.

On this slide what you see is the quadrant for pneumonia and the resulting
dot, if you will, for each hospital, their composite score plotted against the
risk-adjusted length of stay.

Because we don’t have a good way to display the results historically, I have
shown the improvement for one hospital, Gundersen Lutheran, the improvements
that they have seen from 2004 through 2006. Really, what happened for this
particular hospital — on their composite score, they were in the low 60s, and
they said, “This isn’t acceptable.” So they convened a
multidisciplinary team and really began to understand the processes around
taking care of a patient with pneumonia when he or she comes to the hospital.
By understanding the data, they realized that not much was standardized. They
implemented changes and were able to show improvements.

This group continues to meet monthly to evaluate outliers and patients that
don’t meet the criteria, and then they work on various improvements.

This is just another example of the quadrant for the heart failure.

Then an example for heart attack as well.

This next set of slides just kind of wraps up and talks a little bit about
some lessons learned that we have been reporting over the last four years.

I would say our lessons learned are in a couple of categories, one being
about the data and the information that we are reporting:

• Data must be equally available and accessible, and if we are already
capturing it for something, if there is a way to tap into information that is
sitting in a state database, to get it from there.

• Whatever we are going to be reporting must be supported by sufficient

• Similarly, if we are going to report something, it must be something
that we can improve upon.

• It’s also important that we are thinking about different audiences
when we are publicly reporting, so if they read it, they can interpret it and
get some meaning out of what they are seeing without significant explanation.

I have said a couple of times that multi-stakeholder involvement and buy-in
is key. Obviously, physician engagement has been important for us, as we have
moved throughout the last four years of our evolution.

One important thing that we learned, probably the hard way, regarding data
that we are reporting is that before anything goes live on a website or in a
paper report format, it’s incredibly important that physicians have the ability
to see the results in context with everybody else’s. Otherwise, if they haven’t
seen it and they thought they were on the top and they are really on the
bottom, people get nasty phone calls. We don’t want that to happen.

There are lots of issues with display that I think we have had to work
through. Nomenclature is important, that we are defining what we mean by
efficiency, what we mean by charges or the costs. There are lots of issues

The most important takeaway message here is really about making sure that we
have credible, reliable data. If we don’t have that, then we lose the
engagement of all of the different stakeholders.

Lastly, just a constant reminder of vision, that we are doing this not just
for measurement’s sake, but for improving the quality of care that we provide
to our patients.

Just briefly, some plans we have for 2007:

We are continuing to refine and further reengineer our audit and validation
process. That is kind of ongoing, and we continue to work on that.

We continue to expand our measures portfolio. We are currently working on
measures for CAD. Then we will move it to asthma and depression, on the
physician group side. We are trying to align, on the hospital side, with what
our hospital association is doing in the state of Wisconsin in terms of
measuring and reporting on that.

Obviously, we will continue to update measures

We are working really hard to develop a formal quality-improvement
infrastructure. What has happened so far has really been more organic, if you
will. We have had some quality-improvement workgroups kind of formed on the
side. We have fostered networking and collaborating, but we haven’t focused on
projects. So it’s our goal to figure that out this year.

We do have plans for a number of research projects to be implemented around
this idea of a regional coalition, as well as looking at ways to begin to
understand the physician-level reporting piece.

In summary, I would say that the work we are doing is really all about
improvement and making sure that we have all of the right stakeholders engaged,
particularly the physician community, and making sure we are all about
collaboration rather than competition.

With that, I would be happy to answer questions.

DR. CARR: Thank you very much.

We are running behind. I think what we are going to do is move on to Dr.
Yandell and invite you to stay on, if that’s possible.

MS. CLOUGH: That’s fine.

DR. CARR: Great.

Agenda Item: Performance Measurement and Public
Reporting – Public Reporting, Norton Healthcare

DR. YANDELL: I’m Ben Yandell. I work at Norton Healthcare in Middleburg,

I want to talk to you a little bit about some work we have done in public
reporting. Every now and then, I have to sort of stop and think about the path
I have walked to where I am. Those of you who have worked in hospital settings
probably have an experience somewhat like mine in quality, where you go to a
quality meeting and something gets passed around — maybe the copies are
numbered — and at the end of the meeting, you pull the copies back up and you
make sure you have accounted for them all. That’s the world I grew up in, in

I see a lot of smiles. You guys have lived it.

I knew things had changed when I was starting to work on our public report.
Toward the end of 2004, we decided that if we knew all this information about
clinical quality, about ourselves, the public that we talk about all the time
had the right to know it, too. Without doing collaboration and without doing
extensive work with stakeholders, we basically told a couple of key folks we
were going to do this. We contacted the local newspaper and told them we were
going to do it, which then committed us to do it when we started to get cold
feet, along about January of 2005.

I knew the world had changed when I had pulled together into a report all
the indicators I could find that claimed to be saying something about clinical
quality, about our hospitals. I shared that internally. I made one change to
that report before it went public. I took the words “proprietary” and
“confidential” off the piece of paper.

For me, that was a defining moment, to suddenly realize that I am in a very,
very different world than I was in 20 minutes ago.

With that said, what I want to do is tell you a little bit about what we
did, first of all, so that you have a feeling about that. I think this is a
report from maybe August 2008 that I am giving you, because I think we are a
little ahead of the curve; that’s about how far ahead of the curve I think we

I believe we have learned a little bit about trying to live with these
measures. We don’t invent indicators. In fact, it’s a rule of ours that we do
not invent indicators. I cannot tell you how much grief that saved me. I agree
with you, it’s stupid. It’s not my rule. But the conversation is over.

What I want to do is start by acknowledging some folks whose primary role in
this, I guess, was, when everybody was telling them this was a dumb idea, to
say, “Okay, do it anyway.” One of those is Steve Williams, who is the
president and CEO of Norton Healthcare. He has been interested in quality in
the hospital setting since the late 1980s. He had the courage and the
leadership to say, “I know nobody else is doing this. I still think it’s
the right thing to do. Let’s do it.” I just want to acknowledge that,
because all the fun I have had since then would not have occurred if he had not
said okay.

Bob Goodin is a physician and the chairman at the time of our board of
trustees and the quality committee. I said we did this without collaboration.
We didn’t do it without approval. These are the folks that said, “Yes, do
this.” Even when the going got a little bit tough, as we got closer and
closer to our launch in March of 2005, they still said, “Do this.”

In particular, I want to recognize the work of Dan Varga, who was a
physician. I think I probably learned more about leadership in working with him
than practically anybody I have worked with in a long career in health care. It
comes down basically to deciding to do the right thing because it’s the right
thing to do, not because everybody else thinks it’s a good idea and so forth.

With that said, a quick background on Norton Healthcare. We are tiny. We are
a little hospital system in Louisville, Kentucky. We currently have three adult
hospitals. We are building a fourth. We have Kentucky’s only designated
children’s hospital.

By the way, I haven’t heard it said today, so I’m going to say it while I’m
thinking about it: As bad as all of the work is in the world of adults, it’s
terrible in the world of pediatrics — in fact, shamefully terrible. The work
that has not been done in developing indicators and doing background work has
to get fixed, for a population that everybody agrees we need to be paying
attention to.

We have both owned physician practices and an independent medical staff.

What we did: We were determined to publish an objective evaluation of our
performance and make it public. We initially went public with about 200 quality
indicators. By the way, I don’t know how to count some of these. Is that one
indicator or six? I don’t know.

Being conservative, we are currently putting data out on about 400 quality
indicators. We are in the process of adding some others. I will tell you why so

I guess the thing that surprised people so much was that nobody made us do
this. We did this voluntarily. Part of the reason we did it was some of the
same frustration that I have heard around the room today. Let’s get on with it.
We have been talking about this forever. We thought, what can we do? The one
thing we can do is, with the part of the world that we control, which is our
data and what we do with it, why don’t we go on and do our part to move this
agenda along?

By the way, one of the things we have managed to do is to be kind of an
object lesson for doing this kind of work: We are still in business after
having done this. We were told this was going to be a field day for plaintiff’s
attorneys, that we were putting this out there, that this was a terrible thing
to do and so forth. We are still in business.

Another thing is, I know how Neil Armstrong feels. I have been in meetings
about this topic where some speaker said, not knowing I was in the audience,
“If little Norton Healthcare in Louisville, Kentucky can put all these
indicators out in the public, don’t tell me that the logistics of this are such
a barrier that you can’t do it.” It used to be, “If we can put a man
on the moon”; now it’s, “If Norton Healthcare can put quality
indicators out in public.” Okay, I may have delusions of grandeur.

Anyway, what does it look like? If you go to nortonhealthcare.com, you get
our flash page. One of the things I am really proud of is that we have real
estate on our home page. For those of you who have ever lived in the world of
Web, to have any real estate on a home page is a pretty impressive thing. We
live there permanently. You can always get straight to us from our home page
just by clicking on “Quality Report.”

When you click on it, what you come up with is a page that is obviously
designed by a statistician, not by a graphic artist, which is a list of all of
the different areas that we publish things about.

I remember when we were trying to convince the local newspaper that what we
were about to do was something interesting. We actually, with the reporter in
the room, called the National Quality Forum, for the first time ever, and said,
“We’re about to publish every single one of your quality indicators.”

There was this long pause, and the person said, “Every one of

I said, “Yes.”

He said, “Just a minute.” I heard shuffling paper. “So you’re
going to do” — and he literally went through every single one of their
groups — “you’re going to do every one of the hospital consensus


“Every one of the cardiac surgery.”

He just worked down the list, and we said we were going to do every one of
them. He was just flabbergasted. That was the first time I actually saw the
reporter believe we were doing something other than a PR kind of thing, that we
were about to turn this thing loose.

So we do all the NQF stuff. We do the AHRQ quality indicators. CDC doesn’t
have a lot of guidance about what to do in infection control, but they do have
a position paper about what states should do. We took that position paper and
tried to do what they had to say about that.

In the world of pediatrics, I was desperate to do something. We have a
children’s hospital. So we put some ORYX indicators out there, and so forth.

We ended up adding patient satisfaction. We actually have our antibiogram,
our antibiotic susceptibility chart, on our public website, which is kind of

By the way, CDC really does say to do that. I can show you where they say to
do that.

Patient satisfaction is now out there. We put our balance sheet out there
for the public to see. We do the ambulatory indicators, cancer survival rates
— you get the idea.

Basically, what has changed in the time that we have done this, starting in
2005 to the present — when we started, the conversation was, why would we put
that out there? The conversation very quickly became, why wouldn’t we put that
out there?

It’s very interesting. After I launched the website, the first phone calls I
got from affected parties at Norton Healthcare were not, “You idiot, what
have you done?” They were, “Where am I? I can’t find myself in the
report,” which was fascinating to me. I thought I was going to get the
phone calls, “How could you put this out there.” “I’m trying to
market this service line. Why would you put something out there like

Just to let you know, a lot of the doomsday scenarios that you hear about
public reporting are not true.

What does it look like? This is a cardiovascular procedure page. We report
our STS data. We report our ACC data. We are now members of NDNQI on the
nursing-sensitive stuff. We report that.

It’s interesting, by the way, to be where we are. One of the things you run
into with this is finding something to compare yourself to. That’s tough. The
second thing that’s tough is being allowed to tell anybody else that you have
something to compare yourself to. A lot of the databases do not allow you to
publish anything out of their database. You are allowed sometimes, with a lot
of coercion — I have had some unusual phone conversations with owners of
databases: “Can I at least put my data out there? If I don’t put anybody
else’s, can I put my data out?” “I don’t know. I’ll have to talk to
our attorney.” It’s my data. Why do I have to ask permission?

What we can’t do is display anybody else’s. I can show you that we are red
or green, but I can’t tell you why and you can’t audit my books. That really
bothers me, that I am not allowed to show you the national average.

There are a couple of things that I want to point out that I like about what
we do. We use some interesting words, like “better” and
“worse.” “Worse,” I think, is an interesting word for a
hospital system to use about itself, but we do. When we are significantly worse
than the national average, we say so. In general, what we have tried to do with
this report is be blunt and not pretty it up and not put spin on it, but — Joe
Friday — just the facts. Put it out there and show people.

A couple of things I am proud of. You are looking at a page that takes you
two clicks to get to. Some folks who publish their data — I felt like I needed
a machete to get through all of the marketing, to get to any actual data.
Sometimes I gave up and went to the next website.

It takes two clicks.

The other thing is, it’s data; it’s not text. I am not telling you what you
are seeing; I am just showing it to you. I think that’s valuable.

There is text. There are questions that people need to have answers to. We
put that in pop-ups. If I click on something and I want to know the definition
of it, for example, it tells the definition. We have tried to divide these
definitions into a relatively publicly oriented description and a technical

Why would you do that on a public website? I just want to use this chance:
Public reporting isn’t just about the public. Our own staff did not know these
statistics until we made them public. I actually consider the public our third
audience. Our first audience is our own staff. Our second audience is our
medical staff, who also did not know these numbers until we went public. The
third audience is the public.

I get challenged a lot by somebody who is looking at this for the first
time, who says, “How much does the public really care about this
anyway?” I don’t know. Do they care if they live or die? Do they care if
they get an infection or not? If they do, then they care that we are doing
this, whether they ever look at the report or not. I do think that’s what
public reporting does for you. It moves the agenda along. It gives it a kind of
urgency that it doesn’t have if you are not public.

By the way, one of the things that people struggle with — I struggle with
it — composite versus a bundle versus an index. How on earth do you combine
indicators? I get told, usually in the same sentence, “That’s way too many
notes. Four hundred indicators is way too many. And, oh, by the way, I can’t
find what I’m looking for. You don’t have any indicators about it.” And
that’s usually in the same sentence.

One of the things that is interesting, if you do what we have done, where
you have this confusing and terrible way of presenting data to the public,
which is this matrix — if you traffic-light it and it’s red and green, you can
kind of blur your eyes and not even read the words, and it begins to create a
kind of composite, a kind of bundle score. If you flip around our website a
little bit, it’s interesting.

I also want to point out the bottom line down there. Because as soon as we
have it and we trust the data we make it public, we are already ahead of
releasing our 30-day mortality. We happen to have a single provider number.
Unfortunately, I can’t separate the data by our hospitals on this, which is
kind of frustrating to me, because when CMS does its analysis, it does it by
provider number. So we are all one group. So when the report comes out, I am
told that you are not going to be able to tell individual hospital
performances, but in our report you can.

We have developed some principles about the report as we have gone along. I
talked with our quality committee of the board about this. I think that you
have to have principles something like this, or what you are doing is advertising;
it’s not transparency. The principles are these sorts of principles:

For example, we don’t decide what to make public on the basis of how it
makes us look. I actually have a standing order from the board of trustees: If
the National Quality Forum endorses an indicator, I have a mandate to measure
it, get the data right, and get it on our website. I am not supposed to ask
anybody, including the board of trustees. That’s the kind of commitment that,
to me, is about transparency as opposed to bragging about how good you are. We
have been doing this for a little over two years now. We update it at least
once a month. We don’t know yet some of the things that are going to be coming
out, and we are already committed: When they do, we are going to publish them.

We give equal prominence to good and bad results. I have warned the board.
One of the things that is frustrating to people who don’t understand
significance testing is to be really, really good at something and not be
significantly different from the national average. That’s very frustrating. How
can I not be significantly different? I didn’t have any deaths. I had no
deaths. How could I not be significantly different? Well, it’s a really rare
thing to have deaths in this particular area.
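
[Editorial illustration, not part of the testimony.] The arithmetic behind this point can be sketched under a simple binomial model. The 1 percent national rate and the case counts below are hypothetical figures chosen only for illustration; the speaker gave no numbers. The sketch asks how often a hospital performing exactly at the national rate would see zero deaths by luck alone.

```python
def prob_zero_events(national_rate: float, cases: int) -> float:
    """Chance of zero events in `cases` patients if the hospital performs
    exactly at the national rate (simple binomial model, an assumption)."""
    return (1.0 - national_rate) ** cases


def significantly_better_with_zero(national_rate: float, cases: int,
                                   alpha: float = 0.05) -> bool:
    """One-sided exact test: zero events counts as 'significantly better'
    only if an average hospital would rarely get zero by chance."""
    return prob_zero_events(national_rate, cases) < alpha


# Hypothetical inputs: a 1 percent national death rate.
small_volume = prob_zero_events(0.01, 100)   # roughly a one-in-three chance
large_volume = prob_zero_events(0.01, 500)   # well under 1 percent
print(round(small_volume, 3), round(large_volume, 4))
```

With a rare event and modest volume, zero deaths is what an average performer would often see anyway, so the result cannot be called significantly different. Only at much higher volume does a zero stand out.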

So we now give them a quality ribbon. If you are the best possible outcome
— if 100 percent of the time you get the right medication to somebody, or 0
percent of the time you get an infection — we give them a quality ribbon.

I have warned them that this second bullet point about equal prominence
means that if we ever kill everybody or fail to give anybody the right drug,
there is going to be a little bomb or something like that in the report.

You see the other principles. They come down to not picking and choosing. I
get a lot of phone calls from other hospitals about trying to do a public
report. A very common first question is, “How do I pick which indicators
to report?” I say, “You know what? As soon as you start picking
indicators, you are in danger that you are into the world of marketing and
advertising, not in the world of transparency.”

You are going to find that the indicators that don’t make you look good are
also the ones that aren’t that valid. That’s why, even though there are so many
of these indicators that we don’t agree with, that we do find fault with the
definition of, we report them anyway.

That’s how we ended up with a big report. People ask me sometimes how this
report got so big. Because we report whole lists of things. We don’t pick and
choose. That’s why it’s big. It’s big to be unbiased, not because I really like
a big, long report — although, by the way, I do like a big, long report, and
we are going to make it a little bigger.

You can’t read this. You don’t have to. This is a list of SPC charts,
statistical process control charts, that are part of an internal report. We
have spent most of our time since we launched the public website on internal
reporting, which is interesting to me. Supposedly, the report is too big, and
that’s one of the issues with it for the public, and, mainly, it’s nowhere near
big enough to do the internal work that you have.

So what we routinely do on all of the indicators is statistical process
control charts, a patient listing. It’s not a special report. It’s available
every month. It’s a patient listing of everybody who hit the numerator of the
indicator. We always break things down by physicians. We are about to launch a
physician intranet site so that every physician can get to their own data on
everything that is in our quality report.
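
[Editorial illustration, not part of the testimony.] One common form of statistical process control chart for a rate-type indicator is the p-chart. The sketch below assumes monthly numerator and denominator counts and conventional three-sigma limits; the function names and the sample figures are hypothetical, and the testimony does not say which chart type is actually used.

```python
import math


def p_chart_limits(events, cases):
    """Return the center line and per-period (lower, upper) three-sigma
    limits for a p-chart built from event counts and case counts."""
    center = sum(events) / sum(cases)
    limits = []
    for n in cases:
        sigma = math.sqrt(center * (1.0 - center) / n)
        lower = max(0.0, center - 3.0 * sigma)  # a rate cannot go below 0
        upper = min(1.0, center + 3.0 * sigma)  # or above 1
        limits.append((lower, upper))
    return center, limits


def out_of_control(events, cases):
    """Indexes of periods whose observed rate falls outside the limits."""
    _, limits = p_chart_limits(events, cases)
    flagged = []
    for i, (e, n) in enumerate(zip(events, cases)):
        lower, upper = limits[i]
        rate = e / n
        if rate < lower or rate > upper:
            flagged.append(i)
    return flagged


# Hypothetical six months of data: numerator hits out of 100 cases each month.
monthly_events = [5, 4, 6, 5, 20, 5]
monthly_cases = [100, 100, 100, 100, 100, 100]
print(out_of_control(monthly_events, monthly_cases))
```

In this made-up series only the fifth month is flagged. The patient listing described above would then be the place to look for what happened in that month.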

I also get asked, has this made any difference? What you are looking at is a
slide that is not from our quality report. I am trying to condense it even
more. You see a lot of red on this slide. Remember, red means we are
significantly worse than the national average. This slide is our data from the
last half of 2003 compared to 2005-2006 statistics. I am about to show you what
it looks like now, and I want to make it apples to apples.

There was actually a version of this before this that I call “the
Bloody Mary slide.” We actually started collecting these data in the
middle of 2002, and our data quality was just awful. Most of the changes from
that data collection to the slide that you see now — it does have a spattering
of green — were not changes in care, but changes in data quality. I want to
show you things that I think might actually have something to do with what
happens to patients, not just what happened on the database somewhere.

So this is what it looked like then. Watch this. I think this is very cool.
It’s mostly green. That is 2005. I will tell you — not because they turned
green, but because I know what we did — that’s real. We really did change what
happened with patients, at least in part, because the data were public.

I had the experience of sending out reports before and after the data were
public. I will tell you, sending the exact same report to the exact same set of
managers, it was a very different reaction in the world of “it’s not
public” versus “it is.” In the “it’s not public,”
it’s, “Let me get this straight, Ben. You’re the only one looking at
this?” “Yes.” “Thanks. Nice report.”

This is the last half of 2005.

Now I want to show you both good news and bad news. We have maintained,
though we have not improved since 2005. You see a little bit of random
variation, but it’s essentially the same thing. I liken this to being on a diet
and trying to lose the last five pounds. With a lot of this stuff, that’s where
we are. We have done the stuff that — “you moron, you didn’t have
something in place to accomplish this?” So we put that thing in place and
we get better. What we are working on now is tougher stuff.

By the way, we still struggle with making this stuff stay that way. Our
folks have kind of gotten used to public reporting and the Hawthorne effect
that comes from the initial public report. They are used to it now. In fact,
they are more surprised when something is not public.

Limitations: Obviously, because this is just our self-report, this is not a
model that we can use for quality improvement for the whole country. I don’t
think every hospital putting out its own personally developed public quality
report is the way to go. I wouldn’t advocate that.

One of the things that I do want to point out is that we compare ourselves
to the state of Kentucky and to the United States in our report. What we don’t
do is compare ourselves to competitors, because I am not going to decide for
them to publish their data, obviously. But that obviously limits the usefulness
of the report, if what you are trying to do is comparative shopping.

By the way, we seem to keep forgetting that the risk-adjustment methods that
we use do not allow hospital-to-hospital comparisons. We keep forgetting that.
At their root, they are indirect risk adjustment, which means we are all being
adjusted to our own personal population of patients. If I adjust to a different
population from you, I can’t compare my results directly to yours. So rank
orderings and the things we want to do with these — we are actually a little
outside of what the science says you ought to do.
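
[Editorial illustration, not part of the testimony.] The limitation described here can be sketched with an observed-to-expected ratio, the usual output of indirect adjustment. All rates and counts below are hypothetical. The two hospitals are built to have identical performance within each risk group, differing only in patient mix.

```python
def observed_to_expected(hospital, national_rates):
    """Indirect adjustment: observed deaths divided by the deaths expected
    if national stratum rates applied to this hospital's own patient mix.

    hospital: {stratum: (cases, deaths)}
    national_rates: {stratum: death rate}
    """
    observed = sum(deaths for _, deaths in hospital.values())
    expected = sum(cases * national_rates[stratum]
                   for stratum, (cases, _) in hospital.items())
    return observed / expected


# Hypothetical national rates for two risk groups.
national = {"low_risk": 0.02, "high_risk": 0.20}

# Both hospitals: 1 percent mortality in low-risk patients (half the
# national rate) and 20 percent in high-risk patients (same as national).
hospital_a = {"low_risk": (900, 9), "high_risk": (100, 20)}    # mostly low risk
hospital_b = {"low_risk": (100, 1), "high_risk": (900, 180)}   # mostly high risk

ratio_a = observed_to_expected(hospital_a, national)
ratio_b = observed_to_expected(hospital_b, national)
print(round(ratio_a, 3), round(ratio_b, 3))
```

Each ratio is a fair comparison of that hospital with the national average for its own patients. Ranking A ahead of B would be misleading, because the care in each risk group is the same and only the mix of patients differs.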

Some quick thoughts about some concerns: I think these are the wrong
indicators. We have 400 of them, and I think probably 380 of them won’t be here
five or 10 years from now. They are mostly the wrong ones because we spend all
of our time trying to define the indicator into nonexistence. It’s like,
“But you haven’t thought about this. We need to eliminate this. It’s not
100 percent yet. It’s not zero yet. We obviously have some definitional work to
do. We know we have hit nirvana when it becomes like transfusion reactions or
‘left surgical instrument unintentionally in a patient’ or we gave an
aspirin to somebody who had a heart attack that shows up in the hospital. Those
are great indicators, because they are zero or 100 percent.”

No, they are not. Those are lousy indicators. They are lousy indicators.
They don’t distinguish among anybody. You might as well say, “How many
fingers does your surgeon have on his right hand,” as an indicator.

You need indicators that are much closer to 50 percent. That’s a good
indicator. The things that we currently call known complications of care —
when I hear that, I don’t hear, “Exclude it.” What I hear is, wow,
that will make a great indicator, because that is where the quality frontier
is. That’s where the difference is between a decent hospital and a great
hospital, managing those things that are, quote, known complications of a

I don’t know how much we can trust these definitions yet. A lot has already
been said about the loose definitions. I will tell you, as the guy who is
trying to live by them, in a very obsessive-compulsive way, you just go nuts
trying to figure out what they really mean here. It’s very hard sometimes to
know who is in, who is out, what to measure.

Incidentally, at the root of it all is physician documentation, about which
there are precious few standards.

I want to say that I think we have the wrong mental model for a lot of what
we are doing right now. We think that what we are doing is building a
comparative shopping guide for consumers. Someday, yes, that will be wonderful.
We are not ready. Do you really want consumers deciding not to go to a hospital
today on the indicators that we have today? I am not sure they should be using
them that way. We tell them not to, in the very first page of our report. I
don’t think it’s ready for that yet.

I do think it’s very important to think about a model like Consumer
Reports. You do not have to ever have read Consumer Reports to be
able to buy a better microwave, because Consumer Reports exists. You
drive a safer automobile whether you have ever read an automobile crash test
result in your life or not. To me, that’s the right model to think of. Yes,
publicly reported, a lot of attention to the science behind it, but not,
“Kill it because the public doesn’t understand it.” The public
doesn’t understand crash test results, but the public benefits. I think that is
the key question.

By the way, how about the people who actually deliver the care? Do they have
the data, the feedback loop, to tell them it’s working or it’s not working?

One last thing. Some of the things that we are all worried about —
everybody worries about the unintended side effects of this stuff,
“teaching to the test,” if you want to call it that. If I only
measure these six things, which, by the way, I think are relatively trivial,
does that mean I now ignore all the things that are really, really important?
We haven’t found that. I would say our performance-improvement efforts at the
hospital are probably only — maybe a fourth of them were driven by the quality
report and three-fourths of them were driven by the same things they were
always driven by: We think we can do a better job clinically on this, so let’s
get to work on it.

I think we all worry about how real this is. Are we measuring real quality
yet or are we just improving the indicators? I think that’s a legitimate
concern. My gut, but I have no data to offer, tells me it’s both. We are both
doing better data — some of which isn’t trivial data, by the way. Some of the
important improvements in capturing the core measures were capturing
contraindications. We act like that is just a data improvement. That’s an
improvement in what is in the chart about that patient for the next caregiver
who encounters that patient. That is not just data improvement. That’s quality

I guess the thing I worry about the most — and I will stop with this — is
that the problems that we see in this will make us kill it too early. I think
this has incredible promise. It has already shown some early returns that I
find very promising, in terms of informing the public, informing the people who
deliver care, and improving care. I don’t want things like a concern about
administrative burden or that sort of thing — or the science isn’t quite there
yet — I don’t want to kill it too early. That’s my biggest concern about all
this stuff.

I am very optimistic about what we have done so far. I think it won’t look
anything like this a few years from now, because it’s so embryonic. But I think
it’s really, really important to stay on the path we are on.

I would be happy to answer any questions.

DR. CARR: Thanks. That was inspiring, as well as informative. I have to say
that because in my institution, Beth Israel Deaconess, our CEO led the charge
by beginning a blog called “Running a Hospital.” On any given day,
anybody’s outcome, project, or initiative might appear. So we start the day
reading that.

But, in fact, last Friday, we also just went to our report. It is empowering
and it changes the culture.

A question I have is, although I know you don’t want to be concerned about
the administrative burden, could you say a little bit about what it takes to
get all this done?

DR. YANDELL: Sure. It’s a whole lot easier than it seems. It really is. I
won’t say that it doesn’t take some money and staff, because it does. We
launched our initial report — we launched it and we maintain it — with
probably two to three FTEs. It’s hard to quantify that, because this isn’t
anybody’s only job. Everybody involved in this is doing something else.

The reason for that is, most of this we were already doing. It’s either
administrative data, like the AHRQ stuff, or it’s core measures that you had to
do if you were a Joint Commission hospital, or it’s infection-control data that
your infection-control nurses are already collecting, so forth. So most of it
was data that we already had, and what we needed was somebody to put it into
some sort of shape and put it out there.

We literally did this in a giant Excel spreadsheet. That’s how we launched

DR. CARR: Carol?

MS. MCCALL: Absolutely fabulous. I have not so much a commentary on the
specific content, but a question for you generally. You have three wishes for
things that we might be able to help with, to help you further your cause,
whatever that is. What are they?

DR. YANDELL: Number one is to keep the pressure up. If there isn’t the
external pressure that we have to do this, “we” being hospitals —
and other folks, too — then we will quit. So whatever it takes to keep that
pressure going, I think that is absolutely number one on the list.

The second thing, I guess, is all the things that you have heard about
already in terms of very clear specifications, trying to get things aligned, so
that I am not having to keep a slightly different variation on this analysis,
because it’s organization A versus organization B. That’s an unnecessary
addition to the administrative overhead.

If I have a third one, I guess it’s to help get the message out that this
does not have a monolithic purpose, that this creates all sorts of benefits
beyond public reporting, beyond pay-for-performance. Keep always tying it back,
which, I know, anybody clinical does, to the fact that a real human being got a
beta blocker that they might not have gotten otherwise and had a better
clinical course because they got it, because some bureaucrat somewhere said,
“I think you ought to have to report on whether or not you give people
beta blockers.”

It doesn’t have to be that the public understands it. It doesn’t have to be
that physicians buy in to 100 percent of it. It creates databases that people
live off of. It has so many desirable side effects that I think it’s important
not to come up with any criterion to judge it against and say thumbs up or
thumbs down based on that one criterion.

MS. MCCALL: I also heard you say in there to personalize it, so it’s the
Consumer Reports. You may have been in a car wreck, but it was
yesterday, not 20 years ago, and you walked away and you still celebrated
Father’s Day.

DR. YANDELL: Right. Here is what I would love to do with ours, for example.
In fact, I am doing some work right now to try to figure out how to do this. I
would love to create a front end that is essentially a natural-language search,
à la Google, where I come to the website and I don’t have to have Ben
Yandell’s vision of how you display quality. I type in “knee surgery”
and up comes anything relevant I have on the website about knee surgery. Better
yet, I then start asking you some questions that you can answer, or not: How
old are you? What kind of knee surgery? What have you been told? Here are some
drop-downs about it. By the way, do you happen to be diabetic?

That all sits on top of a database that then gives back to the person,
“You know what? At our place, you have a 50-50 chance that you are not
going to be any better off six months from now, with your particular set
of” — things that are helpful in the clinical decision-making process. We
are a long way away from doing that.

But we have an internal agreement to try. In case you want to know where we
are trying to go next, that’s where we would like to go with it. But, my gosh,
do we have some work to do to get from here to there.

By the way, just so you know, we do not display and don’t intend to display
anytime soon physician-specific data. That is not just because we are wimps.
That is because when we try to even produce internal reports that are
physician-specific, the logistics of that are just maddening. It has nothing to
do with politics.

DR. CARR: Thank you very much.

Our next two speakers are going to tell us about nurse-sensitive measures.
We will start back at 3:50.

(Brief recess)

DR. CARR: We are ready to reconvene. Isis, please.

Agenda Item: Performance Measurement and Public
Reporting – NDNQI

MS. MONTALVO: Good afternoon, Madame Chair and members of the Quality
Workgroup. It’s a pleasure to be here to be able to share with you the rest of
the story regarding acute-care setting and measures that affect patient

I am Isis Montalvo, a registered nurse and manager of Nursing Practice &
Policy at the American Nurses’ Association in Silver Spring, Maryland. I
provide oversight of the National Database of Nursing Quality Indicators,
NDNQI, which collects and reports on nursing-sensitive measures.

Thank you for the opportunity to share our experience that we have had over
the last nine years in collecting and reporting nursing-sensitive measures

The ANA is the only full-service professional organization representing the
interests of the nation’s 2.9 million registered nurses. Our members include
RNs working and teaching in every health-care sector across the entire U.S.

The ANA’s work in nursing quality measurement really predates the buzzwords
that we hear frequently in this day and age regarding quality and
performance improvement. In 1994, ANA launched the Patient Safety and Quality
Initiative to evaluate and explore linkages between nursing care and patient
outcome. ANA fully funded the multiple pilot studies that were done across the
United States, in seven states, to evaluate those linkages and subsequent
nursing-sensitive measures. Multiple publications were generated as a result of
this work. The Nursing Care Report Card for Acute Care proposed 21 measures of
hospital performance with an established or theoretical link to the
availability and quality of nursing services in acute-care settings. A final 10
measures were recommended as being nursing-sensitive.

In 1998, ANA established NDNQI, which currently is administered by the
University of Kansas Medical Center, under contract to ANA. The database is the
only national-level database that provides nursing data and patient outcomes at
the unit level. Data collected are structure, process, and outcome indicators,
which are based on Donabedian’s quality framework.

As of June 11, over 1,100 hospitals of all sizes participate in NDNQI in all
50 states and the District of Columbia. We also have international hospitals
participating in the database.

NDNQI’s mission is to aid the registered nurse in patient safety and
quality-improvement efforts by providing research-based national comparative
data on nursing care and the relationship to patient outcome.

We started off with 30 hospitals, which joined when we established the
database in 1998 and continue participating, and we are up to 1,100 hospitals for 2007.
So you can see the growth.

The NDNQI participants are voluntary. They are interested in quality, or
they might be interested to satisfy the Magnet requirements related to the
reporting of nursing-sensitive measures. Participating in the database is a
primary quality-improvement tool, can aid the hospital and the
nurses in facilitating Magnet requirements, and also help to aid in meeting
regulatory requirements.

Forty-eight percent of the hospitals in NDNQI are academic teaching, 86
percent of them are not-for-profit, 20 percent of them Magnet, 80 percent of
them urban, and we have good distribution across all bed categories, from fewer
than 100 to greater than 500 in bed size.

The NDNQI program is multifaceted. It is database participation, which
includes indicator development, Web-based data submission, and other
significant data, a high level of accuracy in reporting, on-time electronic
reports, acceptability of many NQF-endorsed nursing-sensitive measures. There is
an optional RN satisfaction survey for all RNs.

The program also includes pilot testing. Because the indicator development
is research-based, the hospitals have the opportunity to participate in the
development and implementation of an indicator. Not only do we want to ensure
data validity and reliability, but we also want to ensure feasibility from a
data-collection perspective.

There is education and research that is ongoing with NDNQI. There are
quarterly conference calls that are held with all the facilities to support
them in their work. There is an annual conference that we have started. In
January 2007 was our first conference, which I will talk about momentarily.
There are publications where we have started publishing best-practice
exemplars, sharing experience of those hospitals, and those hospitals that had
a sustained improvement, and how they did it.

There is also internal and external research done on NDNQI via NINR, NIOSH,
as well as internal studies that are done by internal researchers.

The NDNQI measures that we include are multiple. Indicator development and
implementation is ongoing in NDNQI. Currently, data is collected on 13
indicators, with four more scheduled for implementation in 2007. Several of the
NDNQI indicators were submitted to the National Quality Forum and were accepted
as part of their consensus measure process in evaluating nursing-sensitive
measures, and they collaborated with the Joint Commission, via a grant they
received from the Robert Wood Johnson Foundation to develop the micro
specifications of the NQF measures.

Other NQF measures have also been included in the database. The indicators
that we currently have:

  • Patient falls;
  • Patient falls with injury;
  • Nursing hours per patient day;
  • Staff mix;
  • Percent nursing hours supplied by agency staff;
  • The practice environment scale, which is one of the options for the RN
    satisfaction survey;
  • Restraints;
  • Hospital-acquired pressure ulcer prevalence;
  • RN satisfaction;
  • RN education and certification;
  • Completeness of the pediatric pain assessment, intervention, and
    reassessment cycle;
  • Pediatric peripheral intravenous infiltration rate;
  • Psychiatric physical/sexual assault rate.

Indicators in development are voluntary turnover, which is scheduled to be
implemented in quarter three of 2007; and the three nosocomial infection
indicators, which are scheduled to be implemented in quarter four of 2007.

NDNQI requests that hospitals provide data from administrative record
systems or from special studies. Some data elements come from medical record
review, and hospitals with electronic health systems may pull some of the data
from those systems. Usually we ask them to take a look at our definitions. A
lot of times they are already collecting the information. So it’s just to
realign their processes to be able to meet the reporting mechanisms for NDNQI.

Some examples:

  • Nursing hours from payroll or staffing systems that collect actual, not
    just budgeted, hours;
  • Patient days from census data systems;
  • Pressure ulcer data and restraint use from prevalence studies and medical
    record reviews.

What we did this year, actually, to facilitate the acquisition of this
information is, when the quarterly prevalence study is done, we encourage the
hospitals to then do the restraint prevalence at the same time. That way, it
minimizes the frequency of data collection that they have to do. They do it
once and then they can report it at one time.

Data submission to NDNQI is done via a secure website. Hospitals may enter
their data by hand in Web forms or upload their files via XML. The programmers
will work with their IT people at the hospitals to give them the necessary

These data sources were selected for two reasons. They contain the
standardized information required by NDNQI, which facilitates data reliability
and validity when it comes to data-collection processes, and they have a known
level of reliability.

Specific processes were established in order to attain the project goals of
collecting standardized reliable data from hospitals across the nation, in
order to provide the hospitals with comparative reports that they can use in
quality-improvement initiatives and use in analyses of the relationship between
aspects of the nursing workforce and nursing-sensitive patient outcomes. We use
standardized definitions and data-collection guidelines to collect comparable
data from each hospital. Tutorials need to be completed prior to any data
collection, so that we can ensure consistency.

We also use in-person interviews with hospital site coordinators to
correctly classify units into unit types. The reports are done based on patient
and unit type, and hospital size or academic teaching status. To make sure that
we are collecting information on a medical unit, the hospital will actually
have a conversation with an NDNQI liaison, to ensure that they are allocating
those units appropriately.

We solicit input from hospitals about data that they would like in the
reports they receive from NDNQI. We appreciate that they are the end users of
this information. How is it meaningful? How is it relevant? What information
are they looking for? What is on the horizon? What are those adjustments that
we need to make?

We guarantee the confidentiality of data so that hospitals are motivated to
provide accurate data.

The resources that are utilized are pretty extensive:

  • There is investment capital required for development.
  • We have a volunteer advisory panel for indicator selection.
  • There is an expert literature review to identify nursing-sensitive
  • There is a secure website established.
  • Nurse liaisons with hospital experience to provide technical assistance.
  • The interdisciplinary team, consisting of nurse researchers, outcome
    indicator experts, statisticians, database and Web programmers, statistical
    analysts, and survey researchers.
  • Experts in database development and maintenance.
  • There is also third-party database management, for hospitals to feel that
    their data are secure and confidential. They are reporting their data to a
    third party.

Ascertaining data reliability: We initially used the ANA indicators that we
had identified because of the research that had already been done.
Subsequently, we have been incorporating the NQF indicators, which have been
through expert review for reliability and validity.

There are annual reliability studies that are done on the indicators that
include a survey on data-collection practices, rater-to-standard reliability
assessments or audits of reported data against original records. There was a
pressure ulcer reliability study that was done recently that actually
demonstrated moderate to near-perfect reliability when it came to the data
collection. The question is, how can you be sure that what a hospital reports
is reliable compared to another hospital? That information was published.

We also learned that certified wound ostomy continence nurses demonstrated
better reliability in wound assessment. That is kind of common sense — more
certification, more education, greater expertise when it comes to assessing
your patients and wound surveillance. What is meaningful about that is, if you
can assess the patient more accurately, you can then report those findings and
intervene more appropriately.

We also recognize that there is an opportunity to educate the everyday staff
nurse. So we actually created a computer-based learning module on pressure
ulcer assessment and evaluation. We disseminated that tutorial to all the
hospitals, so they could incorporate that into their own educational medium.
Then we posted it on the website publicly, so any staff nurse could go on to
the NDNQI website and learn more about pressure ulcers. It was meaningful to
educate the nurse.

The response has been very favorable. We have had 5,000 to 6,000 nurses
who have already completed this tutorial. It has been very well received.

Data use: Hospitals primarily use the data for quality-improvement purposes.
We provide them quarterly reports on the indicators. It provides them trend
data. There are eight rolling quarters with an average for those quarters.
The size of the report depends on the hospital size. It can be
anywhere from 50 to 200 pages. There can be 26 tables altogether if you are
reporting on every quarterly indicator.
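
The eight-rolling-quarter trend described above can be sketched in a few lines. This is only an illustration: the function name and the sample rates are hypothetical and are not taken from NDNQI reports.

```python
from statistics import mean

def rolling_quarters(rates, window=8):
    """Return the most recent `window` quarterly values and their average."""
    recent = rates[-window:]
    return recent, mean(recent)

# Hypothetical quarterly rates for one unit, oldest first (ten quarters).
unit_rates = [3.9, 3.7, 3.8, 3.5, 3.4, 3.6, 3.2, 3.1, 3.0, 2.9]

recent, average = rolling_quarters(unit_rates)
print(recent)             # the eight quarters that would appear in the trend
print(round(average, 4))  # the average reported alongside them
```

Each new quarter pushes the oldest one out of the window, so the average always covers the same span of time.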

The quarterly reports are separate from the annual survey report, because
that is administered annually.

The reports provide statistical significance, means, quartiles, and national
comparisons at the unit level where care occurs.

This is a significant finding. In the research that we have done, it does
make a difference on the different types of units, when you look at workforce,
when you are looking at skill mix, when you are looking at those structural
indicators that need to be considered when you are evaluating patient quality,
as recommended by Donabedian’s quality framework.

So there are details on structure and process measures in the quarterly

The annual survey is an annual report, again, and there is a lot of pre-work
that is done for the administration of the survey.

The reports really help to aid the staff, the nurse manager, the CNO, the
CEO in the decision making and help them measure sustained changes and improved

We also provide specialty and system reports, as a separate service. We are
also contracting with states to provide statewide reporting for public
reporting. If there is a state that has mandated public reporting and there are
hospitals that are participating in the NDNQI, then we will facilitate the
reporting of that information to the state subsequently, so there isn’t dual
reporting. The hospitals only have to enter that information once. It minimizes
the burden that they experience.

The national comparison data, again, is at the unit level where care occurs.
The reports are provided by unit type. It can be critical-care, step-down,
medical, surgical, combined medical/surgical, rehabilitation, psychiatric,
pediatric. It is grouped by hospital size or teaching status.

The database has really grown. When we take a look at RN satisfaction, when
we started off with our pilot, and then in 2002, we initially had a response
rate of 55 percent, with 64 hospitals participating and close to 20,000 nurses
responding to the survey. It’s a confidential survey. When it comes to the unit
reporting, they don’t get the unit-level information if there are fewer than
five staff in that area, so it is not easily identifiable.
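
The small-unit rule just described (no unit-level results when there are fewer than five staff) amounts to a simple suppression check. The sketch below assumes the count is applied to survey responses per unit; the unit names and scores are made up for illustration.

```python
MIN_STAFF = 5  # threshold mentioned in the testimony

def unit_level_report(scores_by_unit):
    """Return the mean score per unit, or None where the unit is too small to report."""
    report = {}
    for unit, scores in scores_by_unit.items():
        if len(scores) < MIN_STAFF:
            report[unit] = None  # suppressed so individuals are not identifiable
        else:
            report[unit] = sum(scores) / len(scores)
    return report

# Hypothetical satisfaction scores for two units.
survey = {"medical-4W": [4, 5, 3, 4, 4, 5], "step-down-2E": [3, 4, 5]}
report = unit_level_report(survey)
print(report)
```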

Our response rates have been pretty stable, 63, 64 percent over the last few
years, with 494 hospitals participating in 2006 and 176,000 nurses completing
the RN satisfaction survey. There are some hospitals that have a 95 percent
response rate. They do work for that and they are very proud of

The number of units reporting: This is just a sample of the number of units
reporting that we have for the RN satisfaction survey, over 7,000 adult units,
over 1,000 pediatric units. When you look at the quarterly indicators, in any
given quarter, about 9,000 nursing units are reporting on any given indicator.

The outcomes: The research done on NDNQI has demonstrated significance at
the unit level. Studies done related to falls and pressure ulcers demonstrated
which staffing or workforce element was statistically significant at the unit
level for the patient outcome. These are a few examples:

  • Higher nursing hours on step-down, medical, and medical/surgical units
    were associated with fewer falls.
  • A higher percentage of RN hours on step-down and medical units was
    associated with fewer falls.

The difference is, when you are looking at higher nursing hours, that
component takes into account RNs, LPNs and LVNs (licensed practical nurses and
licensed vocational nurses), and unlicensed assistive personnel.

You are also able to drill down to skill mix. With the higher percent
of RN hours, it was statistically significant for the step-down and medical
units in helping them to have fewer falls.

In another study that we did, there was higher reliability with certified
nurses assessing wounds. For every percentage-point increase in percent RN
hours, the pressure ulcer rate declined by 0.3 percent. This is just a sample
of some of the findings. Staffing does make a difference. The workforce does
make a difference.

When you look at quality indicators, you need to look at the entire package.
You need to look at the structural elements related to nurse staffing or
certification or education, as well as skill mix.

Other outcomes: The program has grown. As I mentioned previously, we had our
first national conference in January and had over 900 people attend. It was
just fabulous and exciting to feel the energy in the room, with all the nurses
being able to walk away with helpful hints, with tools that they can walk away
with: This is what I can implement on my nursing unit to make a difference
today when it comes to patient outcomes.

That focus was transforming nursing data into quality care.

Our second conference is going to be scheduled for January of 2008. The call
for abstracts is open currently. It is workforce engagement and using data to
improve outcomes.

The other thing that we did is, we published a best-practice exemplar, which
profiled 14 hospitals in the database that had a sustained improvement for a
specified nursing indicator, and they shared their stories. They shared with us
how it was to use the data, what those things were that they need to
incorporate to get staff buy-in, how successful they were, what those lessons
learned were. We published a lot of the helpful tools that they had to use
within their own practice settings.

Future plans for NDNQI: Methodology development is one of them. We believe
that we need to develop methodology for unit-based acuity or risk adjustment.
This information is needed to include mixed acuity units and universal beds,
critical access hospital and hospital rollup. We appreciate that there is a
difference at the unit level based on the different type of patient population
and unit type. But for these other areas, there needs to be some further
stratification to make that comparison more meaningful. It also gives the
opportunity for other types of facilities and hospitals to participate.

Hospitals like to have a hospital rollup. We appreciate that the statistical
significance is at the unit level where care occurs. But that’s the reality. So
it’s something that we want to be able to provide.

Indicator expansion: We have been adding indicators every year since we
started the database. What we are focusing on over the next 18 months is to
really expand the current indicators to other relevant units. Since it is
research-based, indicators developed and implemented are based on the
appropriateness for that particular unit. For example, it would be highly
inconceivable to think that you would implement a fall indicator on a neonatal
unit. No one should be dropping babies. So that is not appropriate.

It’s a bad example, but I think it makes the point.

So when we take a look at other indicators, it’s really looking at the
appropriateness of that particular unit. For example, the assault indicator is
very appropriate for the emergency department. That is what we are looking to
implement over the next 18 months.

Report enhancement: As the database grows and hospitals grow, they like to
be more sophisticated in their reporting. Currently, you can download the
reports via PDF or Excel file. So you can actually take your information and put it
in whatever graphic medium you need internally for your organization,
which is very meaningful. But what we hear from the hospitals is that they want
to be able to be more granular with their comparisons. They want to be able to
compare a coronary care unit with a coronary care unit, not a medical ICU. So
that’s something that we are looking to work on in the database over the next
18 months.

Lessons learned: You should never underestimate the level of staffing required
to operate a national database. Accurate data collection requires a high level
of technical assistance and diligence and monitoring when it comes to managing
the database. There needs to be ongoing quality monitoring checks.

Indicator development and implementation require time and resources to
ensure data validity and reliability. It takes time to develop an indicator. We
just can’t implement it in a month. It takes time because it does go through
its process.

The significance and importance of implementing and evaluating indicators at
the unit level where care occurs should not be underestimated.

NDNQI is in a state of continuous quality improvement:

  • Web systems require continuous monitoring and testing.
  • The database design, statistical programs, Web data entry screens, and
    some indicators have a life span of about three years, before needing review
    and revision.
  • Hospital environments and operations change, and we need to adapt to
    maintain the relevance of the data definitions and report design.
  • New information technologies emerge and must be incorporated for
    efficiency and to maintain interoperability with participating hospitals.

Collecting structure, process, and outcome indicators provides a
comprehensive means for evaluating the quality of nursing care and patient
outcomes. There is good distribution and representation of all bed sizes in the
database to provide meaningful comparisons at the unit level. With that, it is
very important to have a definition of a hospital to maintain data
comparability and validity.

Thank you very much for this opportunity.

DR. CARR: That was a great presentation, very informative.

It, in a way, harks back to one of our opening speakers talking about
BayCare and how a data element prompted a look back at what was going on. It
wasn’t just the care; it was the systems of care. I think that you have really
driven that home very much.

I like the blended aspect of what you have. It’s not just ulcers; it’s
ulcers in a care unit with this staffing. I think it takes us to a new level. I
commend you for it.


[No response]

Sharon Sprenger.

Agenda Item: Performance Measurement and Public
Reporting – JCAHO

MS. SPRENGER: Thank you. I’m Sharon Sprenger from the Joint Commission.

I should note for all of you that in January I can’t even use that acronym.
We are the Joint Commission. That’s one thing I want you all to think of today.

I did want to comment, when Dr. Yandell started, he talked about,
“Remember in the days of quality assurance or improvement, we would pass
the report around.” While I started in quality improvement in grade
school, I remember the days when it wasn’t just passing the report around, but
it was important, the amount of paper that you had, because the more paper,
obviously, the better you were doing. I think that’s important for today’s
conversation, as we look to electronic health records.

What I would like to talk about is the nursing project we are working on in
alternative data sources. But before I do that, I just want to talk a little
bit from the perspective of the Joint Commission and hospitals, what we see as
some of the barriers and challenges to electronic data.

I am sure you have heard many of these things, as you have had different
presentations today. First of all, one of the challenges is the fragmented
health-information exchanges that we really need to address, looking across
different physician practice areas, different settings, et cetera.

We also need to be very concerned with the privacy of health information. We
need to be sure that we effectively protect privacy, while assuring broad
access to meaningful and relevant performance-measurement data, and ways to
provide information that provides longitudinal views of quality and safety
across the continuum of care.

Data quality is a very important issue at the Joint Commission. I will speak
to that in just a few moments.

Also there is a need for national measurement priorities, with a
standardized data dictionary with common data elements and definitions across
multiple venues of care. In the time I have been here today, I have clearly
heard that message.

I would really like to applaud the work of the American Health Information
Community and the commission that they gave to the National Quality Forum, which,
at the end of May, convened an expert panel that Dr. Tang chaired that I had
the opportunity to sit on, to begin to identify core data elements for the
electronic health record and to begin to prioritize those performance measures
that we should start with. So I look forward to the work coming from that

In many ways, the current measure specifications are not designed for an
electronic health record. I think we have heard clearly today, for example,
that as we move to electronic health record, identifying patients concurrently
will be very challenging in terms of how we are going to identify the
population that we are going to measure, and that we need automatic exclusion
of all data issues.

Then there are measure-construct issues. I could spend days talking about
this topic. We have heard from some speakers on the issue of measure
exclusions. They are all over the board with the different measure developers
and how you identify them, et cetera.

But again, there are some efforts that you may be aware of that are going
on. Just this past Friday, there was actually a meeting at the National Quality
Forum with a workgroup that NQF has convened of the major measure developers
that included the AMA Physician Consortium for Performance Improvement, CMS,
the Joint Commission, and NCQA. What we were trying to do was to advise them in
terms of measure construct: What should be some of the rules of the road that
every measure developer needs to follow? What is the minimum information on a
measure that should be submitted to NQF? Wouldn’t it be interesting if every
measure submitted to NQF looked the same way, so that as you went to the form
to find the denominator or the exclusions, you could see that.

We even discussed some guidelines, if every measure had an algorithm. It’s
one thing to standardize on paper. It’s another when you bring a measure
through a calculation algorithm. If you use a different sequence of how you
retrieve that data, you can, in fact, end up with a very different measure

So I think there may be some very important work.
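
To illustrate the point about calculation sequence, here is a minimal sketch with four made-up patient records. It is not any measure developer's actual algorithm. Both functions use the same definitions of "excluded" and "treated" but apply them in a different order, and the resulting rates differ.

```python
# Hypothetical records: the same definitions, applied in two different sequences.
patients = [
    {"id": 1, "excluded": False, "treated": True},
    {"id": 2, "excluded": True,  "treated": True},
    {"id": 3, "excluded": True,  "treated": False},
    {"id": 4, "excluded": False, "treated": False},
]

def rate_exclude_first(records):
    """Remove excluded patients first, then check who met the measure."""
    eligible = [p for p in records if not p["excluded"]]
    met = [p for p in eligible if p["treated"]]
    return len(met) / len(eligible)

def rate_numerator_first(records):
    """Credit everyone who met the measure, then apply exclusions only to the rest."""
    met = [p for p in records if p["treated"]]
    unmet = [p for p in records if not p["treated"] and not p["excluded"]]
    return len(met) / (len(met) + len(unmet))

print(rate_exclude_first(patients))    # 1 of 2 eligible patients
print(rate_numerator_first(patients))  # 2 of 3 counted patients
```

Patient 2, who is both excluded and treated, is the one who moves between the two results. That is the kind of case a shared algorithm would have to settle.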

With respect to exclusions, I think there is some hope on the horizon. We
actually agreed, as the major measure developers, that there needed to be some
other people in the room to really talk about this issue of exclusions, which
is important. So the NQF will look to actually convening a separate meeting,
and the whole topic will be the issue of exclusions and seeing if we can
standardize that. Hopefully, there is some sunshine on the horizon.

Then there is the ability to capture and link various data sets. Hopefully,
as I talk about the nursing measures — I think you have already seen some of
that from the last presentation — looking at not just clinical, but financial,
administrative, or human resource-type data.

There is also, I think, a need for process changes in terms of
documentation. For example, isn’t it interesting, the measure in heart failure
that looks at left ventricular ejection fraction and receiving appropriate
medication. But in many EHRs, we can’t even identify what the patient’s
ejection fraction is. So again, some opportunities for some process changes.

We need to, as much as we can, minimize human error associated with manual
worksheets, record review, data abstraction.

We cannot forget the technology and implementation costs, in terms of
developing the functionality of an EHR, but then the tools needed to capture
performance data.

Also I think we have to keep in mind that hospitals are all over the
map in terms of their sophistication. For some hospitals, it may, for example,
be easy to collect some of these measures, but for others it’s very difficult.
They don’t have the resources. So we have to keep in mind that health-care
organizations will need to adopt IT before the electronic health record can
support performance measures.

To date, the pace of change to electronic data is slow. I think some people
think that everyone will be automated tomorrow. It’s not happening quite that
fast. I think we are seeing some very positive things happening, but it is a
very slow change.

I also think we need to think out of the box for future needs. The one thing
I would really like to share with you and leave with you today is that I think
we need to be really careful as we move to the electronic data — even if it
can really help us, on the other hand we need to be careful that we don’t go
backwards. We used to always tell people that a really bad way to develop
measures is to say, what data do I have; thus, what could I measure? We need to
be very patient-centered, and we need to look at the measures that are
important to improve care, and thus what data we need. We cannot lose sight of
that as we move to the electronic record.

I always like to illustrate all of the activities that are confronting
hospitals with respect to quality and patient-safety efforts. This is a slide
that Nancy Foster at the American Hospital Association put together a couple of
years ago. I think it’s really important to think. I often tell audiences at
particular hospitals, if you are tired, this is why.

These are all extremely important efforts, but we need to find more
efficient ways to manage our patient-safety efforts.

Just to give you an example of how fast this environment is moving, just
look at CMS. When they started out with the Medicare Modernization Act in 2006,
there were 10 measures. Now, if we look in 2007, there is a total of 21
measures. There are more coming in 2008. In 2009, we will have a total of 32.
Right now we are looking at adding hospital outpatient measures. We are not
sure what that number is. But that is just one initiative facing hospitals.

In terms of the Joint Commission requirements, I am sure that many of you
are aware that the Joint Commission has done a lot of work in the hospital
environment with CMS. We right now have four measure sets that we are aligned
on. By aligned, I mean we have the exact same measure specifications, whether
you go to our website or theirs. We also have two measure sets that are
unique to the Joint Commission.

Currently, for our accredited hospitals, we require them to collect data on
three full measure sets. We will actually be moving, in 2008, to four sets.
Currently, we have more than 3,800 hospitals that are collecting and reporting
data to the Joint Commission on a quarterly basis.

I just want to talk for a moment about data quality, because data quality is
extremely important to the Joint Commission. To date, in terms of collecting
our measures, we use what we call a core vendor measurement system. They are
really the data intermediary between the Joint Commission and the health-care

I just want to highlight some of the things that we have done with respect
to data quality.

The Joint Commission has always been attentive to data quality and the
growing national dependency on the quality of performance-measurement data
being used for accreditation, for payment, for public reporting. But as there
is a substantial dependency on the functions of our Joint Commission vendors
for hospital data — if you are not aware, our vendors are data intermediaries
currently for the data that is reported through the Hospital Quality Alliance.
Ninety-two percent of that data comes through the Joint Commission vendors.

So we have a number of activities that we look at in terms of data quality.
But the thing I really want to stress today is that we are actually ramping up
our efforts this year. We do a lot of education. We do webcasts. We do monthly
phone calls. We have always done vendor audits, but starting this year, we are
actually beginning to watch the data on a quarterly basis. We will actually be
assigning points to our different vendors. Depending on how many points one has
in a quarter, you could get a call from us, we could do a desk audit of you, or
we could do an onsite audit. So while we have always been attentive, we are
really ramping up our efforts.

I just want to stress that even as we look to electronic data, we cannot
lose sight of how important the data quality will be in the electronic record.

In terms of how the Joint Commission uses the data that we receive through
our accredited hospitals with performance measures, first of all, we use it in
our accreditation process. We have what we call the priority focus process,
where we look at data that we have available to us on an accredited
organization. Right now, for hospitals, about 50 percent of the data that we
look at to really help us focus the survey is coming through our core measures.
We are actually able to use the priority focus and our performance-measurement
data to really drive surveys. If you have heard about our accreditation
process, we are doing patient tracers, where we actually identify patients and
follow their care through from where they entered. We
are actually using our ORYX data or our core data to do that.

We also have what we call the ORYX performance-measure report. Since the
Joint Commission moved to unannounced surveys, four times a year (because,
again, we get our measure data quarterly) we post the data on a secure
extranet site for the hospitals, approximately a month after we receive all the
data, so that the hospitals know how they are doing and there
are no surprises. This is also the data that our surveyors are given
about two weeks prior to an unannounced survey.

The Joint Commission also has what we call Quality Check. It is a report
that is available that tells you about our various accredited organizations, as
well as how they are doing on our measures for hospitals and our national
patient safety goals.

Then we have a new report that was just published in March that contains
data for 2005, which shows you how our hospitals have been doing since we
implemented our core measures, as well as how they are doing on national
patient safety goals, et cetera.

DR. CARR: Sharon, I’m just keeping an eye on the clock, and also on this
material, which is terrific. But if you could look to highlight the most
important things —

MS. SPRENGER: Okay. What I want to do — and I’m just going to piggyback on
what Isis said — I am just going to give some real-world examples. If you do
not have an electronic health record, while the nursing set is an extremely
important set, it can really be challenging to collect.

Just quickly, the nursing-sensitive care is a unique approach in assessing
the quality of care. This is a very interesting set, as we do not have a single
common population in the set. We are looking at patients, we are looking at
nursing care, and we are looking at system factors in the set.

Isis already mentioned that we have funding from the Robert Wood Johnson
Foundation to actually test this as an integrated set.

Just to highlight, there are 15 measures in the set. They came from eight
different measure developers. What the Joint Commission did was to create
standardized specifications and really put the 15 together as a set.

There are the 15 measures that we are going to test. Again, I think we have
talked about how they have a different focus — clinical nursing, system. We
have different populations.

Then we have different data-collection approaches. Some are aggregate
counts, some use survey data, and some use clinical assessment, as noted,
looking at pressure sore prevalence.

The other thing that is really interesting about this set that is different,
too, is the frequency of data collection. Some measures are monthly, some are
quarterly, and one is annual.

I just quickly want to give you some examples to think about why it’s so
challenging to collect this data, depending on the systems that you have in

If we look at the measure, falls with injury (you can look later at the
numerator and denominator), the denominator is just patient days by type of
unit. For the numerator, we need falls and an injury level.

Just think if you are in a hospital and you have no automation and you are
collecting this monthly. First of all, you have to keep track of these five
different units for every patient that falls. You have to keep track of the
injury level. If you had an electronic health record, it could all be embedded
in there — the definitions of the fall, injury level.
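
As a rough sketch of the falls-with-injury calculation described above, the code below counts falls that carry an injury level and divides by patient days for each unit type. The per-1,000-patient-days scaling, the injury-level labels, and all of the sample numbers are assumptions for illustration, not the measure's published specification.

```python
from collections import defaultdict

def falls_with_injury_rate(falls, patient_days):
    """Falls with any injury per 1,000 patient days, by unit type."""
    injured = defaultdict(int)
    for fall in falls:
        if fall["injury_level"] != "none":
            injured[fall["unit_type"]] += 1
    return {
        unit: 1000 * injured[unit] / days
        for unit, days in patient_days.items()
        if days > 0
    }

# Hypothetical month of data.
falls = [
    {"unit_type": "medical", "injury_level": "minor"},
    {"unit_type": "medical", "injury_level": "none"},
    {"unit_type": "surgical", "injury_level": "major"},
]
patient_days = {"medical": 800, "surgical": 500, "step-down": 400}

rates = falls_with_injury_rate(falls, patient_days)
print(rates)
```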

This would be information that isn’t used to calculate the measure, but to
start thinking about, if you had an EHR, the supplemental information you could
collect to understand and use your data. For example, if you knew that the
patient who fell had had a risk assessment prior to the fall, if you knew that
patient had been assessed for fall risk, did you implement any type of
fall-prevention protocol?

I just want to highlight the infection measures that are in the set. They
are all derived from the CDC. There is central line infection, the VAP
(ventilator-associated pneumonia), and the UTI, urinary tract infection. But keep in mind, again, if you
are doing all of this manually, you report these by ICU location. So depending
on how many ICUs you have, just think about the data-collection effort to do
that. Again, if you had an electronic health record, it would be transparent.

Again, you have to keep track of the number of patients every day if you are
doing this manually that have a urinary catheter, a central line. Then you use
that so that monthly you can determine, by adding up your log manually, how
many patient days you have, cath days, et cetera.
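
The manual log described above reduces to adding up daily counts and dividing. The sketch below assumes a daily census record with hypothetical field names and uses the per-1,000-device-days convention for the rate; the numbers are invented and the log is shortened to three days.

```python
def monthly_totals(daily_log):
    """Add up the daily census counts into patient days and device days."""
    totals = {"patient_days": 0, "catheter_days": 0, "central_line_days": 0}
    for day in daily_log:
        for key in totals:
            totals[key] += day[key]
    return totals

def rate_per_1000_device_days(infections, device_days):
    """Device-associated infections per 1,000 device days."""
    return 1000 * infections / device_days

# Hypothetical daily counts for one ICU.
daily_log = [
    {"patient_days": 10, "catheter_days": 4, "central_line_days": 3},
    {"patient_days": 12, "catheter_days": 5, "central_line_days": 3},
    {"patient_days": 11, "catheter_days": 6, "central_line_days": 2},
]

totals = monthly_totals(daily_log)
print(totals)
print(rate_per_1000_device_days(infections=2, device_days=500))
```

With an electronic record, the daily counts would come from the system rather than from a hand-kept log, and the same arithmetic would apply.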

Again, I think you can see the inherent issues with some potential for human
error and data-quality issues, et cetera. Think about going back and auditing
this. Quite a challenge.

I just want to point out to you that within these measures, the infection
measures, we also looked at babies in NICUs. So we not only look at the type of
device, but in this particular measure, the CDC, the way they designed the
measure, wants the devices broken down five different ways. So again, just
think about if you are doing this manually and think what an electronic record
could do for you.

Then I put in here the example of the nursing-care hours per day. Again, you
are collecting this manually — some of the worksheets that one would have to
use to do this.

Again, the pressure ulcer prevalence. Again, we are assuming that everyone
is using the National Pressure Ulcer Advisory Panel definition of what a
pressure ulcer is. I think that is one of those opportunities that we have been
hearing about throughout the day for standardization and everyone using the
same definition, et cetera.

Again, just to give you an idea, if you were doing it manually, how you are
keeping track of all these things.

Again, Isis mentioned the voluntary turnover. This is to give you an idea
of, again, if you were doing this manually, what it starts to look like.

I think the last presentation certainly demonstrated that these
measures make a difference. Wouldn’t it be great if these measures could really
be rolled out on a national level? I think, in order to do that, they certainly
would be supported by electronic data.

So one of the things that I want to leave you with, just kind of a different
way to think about an electronic record and what you could do — just think
about the opportunities that a health-care organization could have to assess
and improve, with a standardized electronic data system. You really could start
moving to systems thinking and really understanding the relationship among a
system’s parts, rather than the parts themselves.

I just give you a couple of little examples. Within this measure set there
is a measure that looks at whether the patient was provided smoking cessation
advice and counseling. If you had the opportunity, through electronic data, to
explore some of those relationships, you could look at your smoking measure,
but you could see, did you not do well because of your nursing skill mix? Was
it due to the nursing-care hours per day? Is it because you have a lot of
turnover? Is it because your nurses are dissatisfied?

Then if you start thinking longitudinally, again using a smoking-cessation
measure, if the patient was readmitted and you had the opportunity to see that
they were still smoking, you would have the opportunity to look and say, who
provided the counseling on the previous admission? Should we use the same
person? Do we need a different person? What type of counseling did we provide?
Do we need to reassess our education plan?

I think one of the things that is so exciting, and the thing that we have to
be really careful of — you can see that there are so many things facing a
hospital in terms of what they have to collect. But as we continue to add
measures, we really need to have efficient systems so that they have time to
use the data, because it really is all about the patient and improving the

DR. CARR: Thanks very much.

How many people who are collecting this data have this in electronic format?

MS. SPRENGER: For our test — and we will actually be starting data
collection July 1 — we are looking for approximately 54 sites. Of the 54 sites
that we selected — we had over 200 volunteers — we asked questions of
different characteristics in terms of the electronic health record. Did they
lab system, patient assessment, et cetera.

Of those 54 sites, approximately 20 percent have 50 percent or more of the
things we ask automated. It will vary in terms of what they have
electronically. Some may have a full electronic health record. Some may have
certain pieces. But within their institutions, at least 50 percent of the
things that could be electronic are.

DR. CARR: So that means that they are findable electronically and that they
are in a relational database?

MS. SPRENGER: If you ask me a year from now, I will be able to tell you. One
of the things that was interesting, when we were asked to present today, is
that we are just beginning the data collection. But one of the things that we
want to assess during the 12-month period of data collection is the use of
electronic data sources and impact on reliability, whether databases are
linked, et cetera. Data collection ends next June, so we should have more

DR. CARR: For now, all the data goes to you and you have it, whether they
got it electronically or manually.


DR. CARR: Then you put it in a relational database and report back.

MS. SPRENGER: Right. For purposes of the test, keep in mind that one of the
things that we are doing is assessing how they work together and the
reliability of the data. Then we will report back during the 12-month period on
how they did. But we will have organizations in the test who are doing it
electronically. That’s one of the things that we want to assess and also,
through that, have the opportunity to see, whether they are doing it manually
or electronically, any changes that we would have to make to the
classifications, et cetera.

DR. CARR: Very interesting. Thank you for updating us on this.

Actually, Dick, what we chatted about at the halftime there, what you were
saying about mortality and publicly reporting hospitals versus others —

DR. JOHANNES: The question arose as to whether there were data that related
to the efficacy of public reporting as it relates to mortality rates. Many
people showed slides today that multiple disease rates are, in fact, seeing

We presented some work at the Academy Health meetings last year. I am going
to have to take you back to this morning. Think of the cylinder I showed you
that described the clinical parameters. Those are the parameters that are used
to build the risk assessment. So you plot a particular patient’s values and
calculate that person’s predicted mortality from the time of admission.

We have hospitals, both in and out of Pennsylvania, and hospitals in states
besides Pennsylvania that do public reporting. By matching patients with equal
risk of mortality on admission, you can do something called multivariate
matching. What will happen is that the clinical parameters for those two
populations of patients will appear very similar. They will have the same pulse
rates, the same blood pressures, the same BNP levels, and all that.

We did that for states in and outside of Pennsylvania and then did the
pre/post analysis using a difference-in-differences method, and showed that the
hospitals with intensive public reporting had lower mortality rates than those
that did not.

We have submitted that as a manuscript to the American Journal of Medical
. But I can make the presentation available.
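
[Editor's illustration, not part of the testimony: the pre/post comparison Dr. Johannes describes is commonly called a difference-in-differences estimate. The sketch below shows only the arithmetic, assuming the two groups have already been matched on admission risk. All counts and names are hypothetical and are not results from the study.]

```python
# Hypothetical sketch of a difference-in-differences calculation.
# Assumes patients in the two hospital groups were already matched on
# predicted mortality at admission. All numbers are invented.

def mortality_rate(deaths, admissions):
    """Crude in-hospital mortality rate for one group in one period."""
    return deaths / admissions


def difference_in_differences(treated_pre, treated_post,
                              control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hospitals with intensive public reporting (hypothetical counts).
reporting_pre = mortality_rate(deaths=50, admissions=1000)
reporting_post = mortality_rate(deaths=40, admissions=1000)

# Matched comparison hospitals without it (hypothetical counts).
comparison_pre = mortality_rate(deaths=50, admissions=1000)
comparison_post = mortality_rate(deaths=48, admissions=1000)

did = difference_in_differences(reporting_pre, reporting_post,
                                comparison_pre, comparison_post)

# A negative value means mortality fell more in the reporting group
# than in the comparison group over the same period.
print(f"difference-in-differences estimate: {did:.4f}")
```

[In this invented example both groups improve, so the estimate isolates the extra improvement in the reporting group rather than the overall trend.]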

DR. CARR: All right. I will take the chair’s prerogative and just try to
summarize some of the themes that we heard today and ask for comment on things
that I didn’t include.

Agenda Item: Wrap-Up

We began with, where are we today with the hybrid medical record, partially
administrative, partially in some cases electronic, some of it data
abstraction? What is the state of affairs today?

The things that I heard were:

  • Outcomes are getting better.
  • Transparency is increasing.
  • The public is becoming engaged.
  • Metrics are becoming refined.
  • Physicians, nurses, and other clinicians, as well as senior leadership —
    CEOs, boards of directors — are becoming involved.
  • Collaboration is recognized as critical and appears to be increasing.

We began the day, however, hearing about the administrative burden. It’s
still large, and growing. We also heard about the financial commitment, which
is large and growing, both for FTEs to do data extraction and for development
and implementation of electronic health records.

Those are highlighted themes. I will just say a few more words about some of
these things.

I think, number one, in terms of achieving quality, public reporting helps
the consumer, but, more importantly, it informs the people who can actually do
something about the quality we hope to measure. With Virginia, a lot of times
the physicians are seeing their own data for the first time when it is publicly

Bay Care, who spoke this morning, said how they had had great success in
their first year of the Premier initiative, only to find themselves falling
behind and then having to look at themselves, realizing that they didn’t have
the systems and the infrastructure to support achieving the outcomes that they
had sprinted to in the previous year. As a result, they had a major cultural

In terms of achieving quality, I think we heard that transparency has given
credibility. Actually, this is something Simon and I heard last week. The
commitment to transparency by institutions, regardless of what those measures
are, demonstrates that there are measures and that an institution is committed
to improvement.

So that was achieving quality.

The second thing is, there is no perfect system now and there is none in
immediate sight. So we need to tweak our existing systems. It’s not about all
administrative or all electronic.

I think we heard very strong support of administrative data today. First of
all — something I hadn’t thought about before — administrative data is
bounded by coding rules and billing rules, ensuring reproducibility and some

We heard about the use of the AHRQ patient-safety indicators and quality
indicators that have been very helpful in institutions looking at safety.

We heard about very elegant research on risk-adjustment models, I think
strongly supported by publications in peer-reviewed academic journals.

Let me go on to convergence of approaches. We are getting more
sophistication in the manipulation of the administrative data. We are making it
better by supporting information such as present-on-admission, labs, vital
signs, and timing.

Importance of acceptance — I think we alluded to it before, having a CEO
accept and have a willingness to report performance, as was the case in New
York and also in Bay Care in Florida. In Kentucky, the same thing, including
the board of trustees.

I think a key thing that we heard today throughout was clinician engagement
and acceptance. If we think about the NSQIP data, that began with physicians
who walked away from administrative data, frustrated by that, and made their
own data set, which turned out to be very powerful.

But what we are finding is that many of the measures that NSQIP includes
are now probably achievable through this blended hybrid and, in some cases,
administrative data.

In Virginia, the emphasis for success was collaboration, science, no
surprises, open process, and follow-up on all inquiries. We heard that also
from Wisconsin. When clinicians whose data is being reported are questioning
it, it calls for a sober, thoughtful response that reassures the clinician or,
in the case of some error, improves the measurement, because it has been looked
at from all sides.

Things that have gotten better:

  • As we just heard, multivariate matching, in and outside of Pennsylvania,
    has demonstrated lower mortality in institutions with public reporting.
  • Premier programs showed an absolute frame shift in the lower deciles,
    hospitals performing in the lower deciles moving up.
  • Bay Care I mentioned.
  • Norton data was very impressive. Their report card went from red to green
    with the advent of public reporting, not just because it was reporting, but
    because it focused initiatives.

On to administrative burden: I think it was completely outlined this morning
by MGMA. I think one burden that goes across all these areas is the challenge
of uniformity of definitions, exclusions, as we have just heard, and uniformity
of the reporting format.

Another issue is financial burden. Hackensack Medical Center — we heard
about their costs in 2004. This year they found that their costs are rising.

Bay Care told us that switching to an electronic health record throughout
their nine hospitals, they expect, will be a $200 million investment.

One thing that I think was very interesting is that I think we are hearing
about a new job description related to this — data abstractors. These are no
longer just clerical people looking at a particular note. There is really a
growing sophistication in terms of understanding the data definitions,
understanding exclusions and nuance, knowing where to look in a medical record
and how to find it. That was perhaps a hidden thing. We think someone in
medical records will get this, but it’s not just someone; it’s someone very

A second thing was IS expertise. With some store-bought electronic health
records, you have to create your own internal hacking to get in to find the
reports that are not canned reports, but reports that you want. We heard that
programs that have IS resources and decision-support resources on site have a
much easier time — also building that IS-quality interface, knowing where the
data is and making it query-able in the data set.

We talked about the resources needed for data validity, reliability, review,
and revision, and then decision support for evidence-based care.

So I think one thing that is important as we move toward an electronic
health record, as we refer to the work that NQF is doing, the conference that
Paul Tang chaired, is that we are not going to have all query-able fields. An
electronic record is not going to have everything in a query-able field. As we
heard with the presentation about the VA, perhaps the technology that we need
is the software for word finding in documents. That may get us from here to

Some of the next steps I heard:

  • Keep the pressure up for public reporting. I think that was a message that
    came through very loud and clear this afternoon. Public reporting, in and of
    itself, changes things. It draws attention, it puts pressure, it sets
    expectations, and saves lives.
  • Keep our eye on the mission, which is delivering better care.
  • Continued work on measure development, exclusions, and specifications.
  • Alignment of reporting.

I think I will stop there and invite any others who might want to add

DR. GREEN: That was terrific. Congratulations.

One thing I would like to call attention to that I didn’t hear you mention
was the relative neglect of children in the use of administrative and clinical
electronic data for quality assessment. It seems to me we should draw attention
to that in some way.

To just reinforce where it seemed to me you came down with your last
sentence, the abiding message for me, from virtually everyone who testified
today, was that we can make progress right now with the administrative and
clinical data sets we have. We should not be using their imperfections and the
problems that they have to delay.

I heard several people ask the National Committee on Vital and Health
Statistics to do whatever it could to motivate action.

DR. CARR: Don?

DR. STEINWACHS: Since I was only here in the afternoon, I probably missed
some things. But it seemed to me that there were some very useful comments
about how much of this information is useful to consumers, and to think about
where we are in that process and maybe where we are going. I heard less about
where we ought to be going than where we are.

In Virginia, they found a great interest, as you might expect, in learning
something about individual physicians. Largely absent from public reporting
systems, in my experience — and others probably know better — is information
that helps you know a little bit more about a physician, other than asking your
neighbor, your relative, or your best friend. That might be one thing that is
worth taking away.

I think the question is, what else is really valuable to a consumer? Even
more so, as you think about the possible growth of consumer-directed health
plans that are asking consumers to make choices in economic terms, as well as
quality terms, what should we be holding out there?

I think all of us see public reporting as a way to drive the provider system
to be more accountable. But I’m not sure public reporting is really connecting
that well with the consumer, who ultimately wants to know, what is this going
to do for me, and what are the risks I’m facing? They may also be interested in
the economic consequences, in some cases.

DR. CARR: I agree. I was struck by the comment that it seems like it’s about
the consumers, and ultimately it is about the consumers, because care will get
better. But the actual public reports do more to change providers and
administrators than to change consumers.

DR. STEINWACHS: I would argue that part of that is because the construct,
the way in which we collect information and provide it, fits into the medical
paradigm. It doesn’t necessarily fit into, “I’m a person. How well am I
doing? What is this going to do for me as a person?” That is different
from a disease, an operation, a disorder.

The only reason I am raising it is that it seems to me that we ought to
recognize that there is a challenge out there still to make this useful to all
stakeholders. Particularly, what can we do to try to lay a path to help the

DR. CARR: I think we heard an example of that, where you could go on to a
site — was it the Virginia program — where you would type in “hip
replacement” and then get to a field and then be interactive, ask you some
risk factors, and then say, “In our institution, this is how you would

MR. SCANLON: I think that the initial burden is real. It’s partly related to
the fact that we are, in some respects, starting a process to improve in terms
of what we know about care and how we can use that to effect better care.
Whenever there is a transition, probably the most painful is starting the
transition. Maybe it’s a physics phenomenon, that it takes the most amount of
energy to overcome inertia.

We shouldn’t let that be the barrier and cause us to abandon this. But at
the same time, there is a real question of whether some of this burden is
totally unnecessary. How can we, in the short term, eliminate some of that
unnecessary burden?

All those different reporting requirements, the fact that different
requesters are demanding different things — and many of them are in a position
to demand it and various actors have to comply — it is an issue.

We have had things like VHS versus Beta. It served no purpose, in some
respects. It’s unclear exactly how we can cut through this to try to create
some kind of standardization that is protective of innovation, but at the same
time, does not tolerate a lot of waste in the process. I think that’s very

The other thing is that I think we need to really try to maximize what we
can get out of the electronic record, in terms of what is retrievable. It
actually makes me nervous to create an occupation of data abstractors. If you
look at the demographics of the country, we are going to have labor shortages
in the future. Health care is already sopping up an incredible amount of labor.
We need to figure out ways to make health care less labor-intensive and more
capital-intensive. I think there is going to be a lot of restructuring of jobs.
Part of that restructuring of jobs in the future is going to be how you use
capital like IT. It’s going to start at the top with the most highly trained
physician and it’s going to work its way down, through different occupations.

I think we would like to avoid having to have too much labor support, as
opposed to capital support. That’s going to be an important thing for us to
think about.

DR. CARR: Carol?

MS. MCCALL: I thought today was great. I loved everybody’s comments.

A couple of themes, just to build on what you said and what Bill just said.

I think there is a paradox about how we go about this, and it leads us to a
nation of data abstractors. I think sometimes how we think about this is maybe
10 or 15 years behind technologically and what is, in fact, possible. I think
it’s important to kind of bring what is here today into what is possible around
health care.

But I do think we are at a new point. I think we are entering a new age. I
think the pain of not doing something is going to become greater than the
difficulty of doing it. I think we may finally be there.

I would echo that I think one of the biggest roles that we can play is how
we can limit the unnecessary burdens so that we can, in fact, move forward.

I heard Denise this morning talk about systems. I heard a lot about
“systemness.” But there was something that actually kind of disturbed
me about the discussion this afternoon. It was around some of the
nursing-sensitive things. I think there is so much opportunity there, it’s just

Systems, to me, have three things. If the system is big, it has a way for me
to get from A to B to C to D, and all the way around. It also has feedback
loops. It also has explicit and built-in means to learn and adapt. Those are

Back to the first one, these hooks that allow us to get from A to B, I think
we need to make sure that our databases have hooks, common data elements that
allow me to move from one view of the world to another. For example, these
exquisite databases around nursing-sensitive processes are not built from the
bottom up today — not all of them — with the patient as a key point of that
unit of analysis. I would love nothing more than to take the nursing data and
turn to Ben and Norton, can we marry that stuff up? I don’t think today that we
can. I would love to link it to socioeconomics.

So we have to think about hooks — not just the measures, but how we then
link it.
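
[Editor's illustration, not part of the testimony: the "hooks" Ms. McCall describes amount to a shared key that lets two separately collected data sets be joined. The sketch below assumes both sets carry a common patient identifier. The field names and records are hypothetical.]

```python
# Hypothetical sketch: linking two data sets through a common data element.
# Field names and records are invented for illustration only.

nursing_records = [
    {"patient_id": "P001", "unit": "4 West", "nursing_hours_per_day": 6.5},
    {"patient_id": "P002", "unit": "ICU", "nursing_hours_per_day": 12.0},
]

quality_records = [
    {"patient_id": "P001", "smoking_counseling_given": True},
    {"patient_id": "P003", "smoking_counseling_given": False},
]


def link_on_key(left, right, key):
    """Return merged records for rows whose key appears in both sets."""
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Combine the fields from both sources into one record.
            linked.append({**match, **row})
    return linked


linked = link_on_key(nursing_records, quality_records, "patient_id")
for record in linked:
    print(record)
```

[Only the patient present in both sets is linked here. Without the shared identifier, neither source could be related to the other at the patient level.]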

That was a big theme.

The second theme that I heard today was the hidden power of transparency. It
does have hidden power. It’s not just about turning people into good shoppers.
There is a hidden power that comes by just simply revealing this to the people
that are creating it. So there is a desire for everybody to find themselves in
the data. Whether I want to find my twin, as a consumer — a 44-year-old woman
with kids, some woman in Illinois — or whether it’s for physicians to find
themselves, who say, “Hey, I can’t find my data” — because we heard
that, too.

I also heard about transparency. I think we need to push to have it
extended, not just data, but also measures. Also when Anne spoke, I found
myself wanting to make transparent the methodologies, the analytics, the
algorithms, the models themselves. I think that becomes an important theme,
because I think they need to become standard. We can have the same measure, and
yet I can run it through a bunch of weird science and come up with a different

I think we can also harness more open models, à la Linux. I think
there is an opportunity to engage the experts that today are not connected and
to do so in an open and innovative form. I think that could be how we preserve
standardization and innovation. It is an operating system of a type, and at
some point, there are updates to it, at the right time and in the right way.
But there are ways to innovate in that world and there are models to get that

I think it taps the engagement of the people whose world they are creating.
Engagement with something, whether you are designing a metric or a measure or
an algorithm, or trying to change care, fundamentally changes your relationship
with that thing. If providers and nurses and the people who are delivering care
had some forum in which they might have influence on the next generation of
measures or whatever, they would feel a different relationship to that thing.
It just changes everything.

With that, I will turn it over.

DR. CARR: Please.

DR. BICKFORD: Carol Bickford, American Nurses Association.

In the conversations that we have had this afternoon and the summary
session, I would like to invite you to consider that there is another
population that is not well represented in this conversation, and that is those
in skilled nursing facilities and long-term care. We have no
information-technology support for them, except in very rare cases. We are not
taking a look at the indicators from the acute-care environment that could
improve the quality of care in some of those settings.

The second thing is, we are focusing on pathology and what is broken. We
have not taken a look at the significant valuable commodity known as prevention
and health promotion. If we put that in our thinking caps, it perhaps creates a
new framework for what our quality initiative is. Rather than dealing with the
broken stuff, let’s look at how we can make sure that it doesn’t get broken.

DR. CARR: Thank you. Very good.

Thank you, everyone. I have learned a great deal. I think we have had a
great collaboration and sharing. I thank you for taking the time to come today.

(Thereupon, at 5:08 p.m., the meeting was concluded.)