AHCPR Conference Center
6010 Executive Boulevard
Rockville, Maryland
Call to Order and Introductions
Overview of Clinical Vocabularies and Issues
Overview of Terminologies and Issues
Statistical Classifications and Code Sets
MR. BLAIR: I think it is time for us to convene. Please take your seats.
This is the National Committee on Vital and Health Statistics, the Computer-Based Patient Record Work Group. The meeting is within the context of the mission of the work group, which is to study uniform data standards for patient medical record information and the electronic exchange of that information.
These hearings are part of a set of hearings going on throughout 1999, which will lead up to recommendations that this committee will make to the NCVHS and on to Donna Shalala by August of the year 2000.
I am Jeff Blair. I am with the Medical Records Institute. I am chair of the CPR work group. I think the next thing I would like to do is have all of the committee members introduce themselves.
DR. COHN: I'm Simon Cohn, a member of the committee and the chair of the Subcommittee on Standards and Security, which is the parent subcommittee for this work group, and welcome you to this hearing.
MR. MAYES: I'm Bob Mayes, Health Care Financing Administration, lead staff to the work group.
DR. FYFFE: Kathleen Fyffe, member of the committee. I work for the Health Insurance Association of America.
DR. FITZMAURICE: Michael Fitzmaurice, Agency for Health Care Policy and Research. I am liaison to the National Committee on Vital and Health Statistics and co-lead staff to the Computer-Based Patient Record Working Group.
DR. FERRANS: Richard Ferrans. I am a consultant to the VA and chief of informatics at LSU Medical Center. I am staff to the committee.
MR. GARVIE: Jim Garvie with the Indian Health Service, also staff to the committee.
DR. CHUTE: I'm Chris Chute, professor of medical informatics, Mayo Foundation.
DR. CIMINO: Jim Cimino from Columbia University.
(Whereupon, introductions were performed by members of the audience.)
DR. GREENBURG: I'm Marjorie Greenburg, National Committee on Vital and Health Statistics, CDC, and executive secretary to the committee.
MR. BLAIR: Is that everyone? Just to put this in perspective for you, we indicated that this particular work group was focusing on uniform data standards for patient medical record information and the electronic exchange of that information.
We have pursued that activity by defining a set of focus areas. One focus area is message format standards. Another is medical terminologies. Another is the quality and accountability of data. Another is the requirement to address the proliferation of different standards in different states related to electronic health records, and some infrastructure issues and some cost benefit issues related to these subject areas.
These are the first two days that we are having hearings on medical terminology. We expect that we will probably have one or maybe two more days later on. You may notice that as you glance through the agenda, today and tomorrow are essentially developers of medical terminologies. As we begin to look a little bit more into medical terminology, we will be looking at vendors and users of medical terminology, so that we can try to understand these issues a little bit better.
We have attempted to craft today and tomorrow into some areas of focus. For those guests that are here, we have a diversity on our committee of folks. Some of them have more familiarity with medical terminology and some have less. So these first sessions here this morning are really intended to educate the committee as a whole with respect to the principles and structures and objectives of terminology and in particular, medical terminology.
For that reason, we have asked Dr. Jim Cimino from Columbia Presbyterian Medical Center and Dr. Chris Chute from Mayo Foundation to begin the education process for us. That will be followed by Mark Tuttle of Lexical Technologies and Keith Campbell from Kaiser.
Jim, could you begin for us with our basic education?
DR. CIMINO: Sure. Good morning. I'm not sure whether I am a developer or a user. I try to use whatever I can and develop the rest, so I kind of come from both sides.
Mike asked me to come down and talk a little bit about some educational things, but also to say whatever the heck I wanted. So I have the recency effect and I am mostly doing the second.
I actually gave testimony to the full committee, I think it was back in March two years ago. In there, I covered a number of topics that were pretty well described in the minutes, talking about the multiple needs for controlled vocabulary, who needs them, who the different groups were, different constituents. Talked about capturing clinical data for multiple purposes and re-using clinical data. I covered some of the inadequacy of current systems, and then talked about desired characteristics in a model for the future.
I'm not going to try to go through all of that today. That was about 45 minutes worth of stuff, and some of it is fortunately outdated. For instance, some of the current systems have started to evolve to the point where they are addressing some of the former inadequacies. I'm not so sure what the model for the future should be anymore, so I'll just cover a couple of these.
First of all, this is one of the slides from a couple of years ago on what I thought was still relevant and important for the committee to keep in mind.
In the center here, for Jeff's benefit, is a box that says, collect patient data. That feeds into a clinical repository. Then in health care institutions, what typically happens is that we have to then recode the patient data. Then that goes into a financial database. So there is this process of collecting data and then recoding it.
The recoding is actually done by people other than the people that collect it. So there are problems with accuracy of the recoding. Then the terminology that is used to recode is not the same terminology that is used to capture the data, so there are often translation problems that occur as the data are transferred in this process.
Meanwhile, in many institutions like Columbia, for instance, we have another box for you of collect patient data, and that goes into a research database. So we are actually collecting data twice on the same patient and putting it into two disparate databases, using two different processes, really three processes of coding patient data. All we need now is to have somebody come along and tell us there is yet another reason to start coding data. Then we'll have to start adding more of these boxes.
So at least to me, the obvious solution is to collect the patient data once, put it in the clinical repository, and based on the terminology that you have used for coding the data, then be able to generate your financial data and your research data from there, instead of having three or more separate processes for data collection.
I wanted to cover a few of the basic, what I have come to call desiderata for controlled terminology. There are about 12 of them. I don't have time to go through all of those here, but I wanted to leave some impressions of what I think some of the most important ones are.
The single most important one, I have come to believe, is concept orientation, which is the notion that each term isn't just a name, that you say, this is the right name to use for this patient. It is a concept. There is a meaning behind that term, and that concept is the actual focus of what you are coding, not the name that you are coding.
The things that go along with that is that there is one meaning meaning -- I've got my redundancy in there already -- one meaning per code. So there is only one -- each code has only one meaning underneath it, one concept. That is to avoid ambiguity in your coding system.
There should be a clear semantics. That is, when you say gunshot in atrium, are you talking about a wound to a part of the heart or are you talking about an event in a part of a building? You have got to be clear in what your concepts mean and what context they can be used in.
Meaning and concept are interchangeable notions. That is, a concept is something that is an object that embodies a particular meaning, so you can't ever change the meaning of a concept. That goes against the definition of what a concept is. But you can easily change the meaning of a code, and that happens all the time. ICD-9 does it every year. They will take the codes and they will rename them, and in the process of renaming, they will change the meaning. What happens then of course is that you have got a code that has two meanings over the course of time. If you are looking at the world as one year snapshots, I guess that is okay. But when you are collecting data in a clinical database, in a longitudinal database and trying to take care of patients over time, it is critical that you can retrieve information over time and look at trends and look at changes to be able to say, when did this patient first have this problem, or find me all the patients with this particular problem. If the meaning of your code changes over time, it becomes impossible.
The last thing is concept permanence. That is, once you have said that a concept exists, you can't go back later and say it doesn't exist anymore.
I pick out ICD-9 CM all the time, because they do that every once in a while. At one point, they added lots of concepts about different kinds of smoking, and then they said we don't care about those anymore. They had lots of concepts about HIV and then they got rid of them.
These concepts are permanent. The data are in the database with those codes. You can say that these codes are no longer used, that they are retired, or Randy Miller coined the phrase emeritus terms.
We don't use the term non-A non-B hepatitis anymore, but we still have lots of data that are coded non-A non-B hepatitis. We can't simply pretend that those data don't exist anymore. They are still very useful, especially with the patients who have the data associated with them.
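As an illustration only, concept permanence can be sketched in Python. The class names, the numeric identifiers and the `retire` operation are assumptions made for this sketch, not part of any real terminology system. The point it shows is that retiring a concept flags it rather than deleting it, so old data stay interpretable.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Concept:
    concept_id: int                    # meaningless identifier, never reused
    name: str                          # display name; not the identifier
    retired: bool = False              # retired ("emeritus") concepts are kept
    replaced_by: Optional[int] = None  # optional pointer to a successor concept


class Terminology:
    def __init__(self) -> None:
        self._concepts: Dict[int, Concept] = {}

    def add(self, concept_id: int, name: str) -> None:
        if concept_id in self._concepts:
            raise ValueError("concept identifiers are never reused")
        self._concepts[concept_id] = Concept(concept_id, name)

    def retire(self, concept_id: int, replaced_by: Optional[int] = None) -> None:
        # Retirement flags the concept; it never deletes it.
        concept = self._concepts[concept_id]
        concept.retired = True
        concept.replaced_by = replaced_by

    def lookup(self, concept_id: int) -> Concept:
        # Old data coded with a retired concept can still be interpreted.
        return self._concepts[concept_id]

    def selectable(self) -> List[Concept]:
        # Only non-retired concepts are offered for new data entry.
        return [c for c in self._concepts.values() if not c.retired]


terminology = Terminology()
terminology.add(1001, "non-A non-B hepatitis")  # identifiers are hypothetical
terminology.add(1002, "hepatitis A")
terminology.retire(1001)
```

After the retirement, `terminology.lookup(1001)` still resolves for historical records, while `terminology.selectable()` no longer offers the term for new entries.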
I tried to boil down some of the do's and don'ts that I would like to recommend, and in the process try to explain as part of this educational mission that I have been given what these things mean.
It turns out when I looked at the structural aspects, they all turned into things, don't do this, don't do that. It's a lot easier to explain that way.
The first is, when you have a terminology, don't limit the depth or breadth of a hierarchy. Don't say you can only have 10 things at each level, or you can only have three things in a hierarchy, because as soon as you do that, somebody will come up with some nuance that they want to express as the lowest level of a hierarchy, or they will come up with a new version or something at a particular level and you have run out of codes and you have to put it into Other or you have to put it somewhere else. Limiting the breadth and depth over and over again will run into trouble.
When I started at Columbia, Paul Clayton recruited me. He came from the HELP system. I was trying to convince him that we needed a vocabulary that didn't limit the breadth and depth of a hierarchy. He had the HELP terminology, and the HELP terminology seemed to have plenty of room, because they used 256 codes at any level, and I said that is not enough. When we looked, it turned out there were places in the HELP terminology where 256 simply wasn't enough things to have in a particular level. That is one don't.
Another don't is, don't limit to a strict hierarchy. Strict hierarchies are fine in some cases where you have mutually exclusive things. So for instance, the axes of a terminology, where you are saying, these are diseases and these are organisms, organisms are never diseases. Semantically, it is just not possible for an organism to be a disease. It can cause a disease. The concept of a disease may have the same name as an organism, but they are different concepts.
So you can have a strict hierarchy that separates these mutually exclusive classes. But there are many places where you can't settle for a strict hierarchy.
For example, you look in ICD-9 CM, which is a strict hierarchy, and you say, I want to find all the cancer terms, you can't find a class that embodies all the cancer terms, because they are scattered around in the lung diseases and the GI tract diseases and so on. You can't even say, get me all the lung diseases, because when you go to the lung category, some of the lung diseases are off in the infectious diseases category.
So if you want to start looking at terms in the classification, using classifications, multiple hierarchies are critical in medicine. You take the example of hepatorenal syndrome. Where does that go? Does it go under liver disease or under kidney disease? The answer is, it goes under both. If you put it under one, somebody who is saying get me all the patients with liver disease will miss the hepatorenal syndrome if they have been classified as kidney disease. Or if you say, get me -- well, that is sufficient for that one. I only have 20 minutes, so I don't want to use it all up on that.
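A minimal sketch of the multiple-hierarchy idea, in Python, assuming a simple parent-to-children table. The `Hierarchy` class and the handful of is-a links are hypothetical and only illustrate that a concept with two parents is found from either side.

```python
from typing import Dict, Set


class Hierarchy:
    """Is-a links where a concept may have more than one parent."""

    def __init__(self) -> None:
        self._children: Dict[str, Set[str]] = {}

    def add_is_a(self, child: str, parent: str) -> None:
        self._children.setdefault(parent, set()).add(child)

    def descendants(self, concept: str) -> Set[str]:
        # Depth-first walk; the 'seen' set copes with shared descendants.
        seen: Set[str] = set()
        stack = [concept]
        while stack:
            for child in self._children.get(stack.pop(), ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


hierarchy = Hierarchy()
hierarchy.add_is_a("hepatorenal syndrome", "liver disease")
hierarchy.add_is_a("hepatorenal syndrome", "kidney disease")
hierarchy.add_is_a("cirrhosis", "liver disease")

liver = hierarchy.descendants("liver disease")
kidney = hierarchy.descendants("kidney disease")
```

Here a query for liver disease and a query for kidney disease both return hepatorenal syndrome, which a strict single-parent hierarchy cannot do.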
Don't put meanings into codes. There are different ways of expressing this. There is saying you have non-semantic codes or meaningless identifiers. They all sound like the codes don't mean anything, because there is no meaning associated with it. That is not what is intended.
What is intended is, when you look at the code, it shouldn't be telling you the meaning of the concept. The reason is that you may change, for instance, where you want to put the term in a hierarchy. In ICD-9 CM, for instance, the location in the hierarchy is specified by the coding system.
So if you say, we are going to now put peptic ulcer in the infectious diseases category, you have got to give it a different code if you want to do that, because you can't keep the same code. So now you have discarded one code and produced another one.
Also, when you start putting meaning in the code, you force yourself to say, okay, I'm going to have four letters, or I am going to have 10 letters or whatever, and you run out of room. You can't come up with enough abbreviations for the different things and cram them into some little coding system.
Some people would say, why not use the name? Just use the name as a code, and then you don't run out of room. The problem with that is, what if you want to change the name of your term? You are allowed to change the name of a term if you want to clarify its meaning, or it is misspelled or what have you. You don't want to change the meaning when you change the name, but you can change the name. And if you have used the names as the unique identifier, now suddenly you have got two unique identifiers for the same concept.
This is my favorite one to pick on, NEC, Not Elsewhere Classified. I urge at every opportunity to do away with Not Elsewhere Classified as a way of pigeonholing terms. It is not necessary any longer with computer systems now to have to force yourself to say, we've got A, B and C, and all the others are going into D, because we don't know what they are going to be, or we don't have enough codes for them, or we have some excuse that we are going to lump them all in there.
You lose information when you do that, number one. Number two, when you add another term to your terminology, the meaning of what other is changes. That is, if I've got three pneumonias this year and then another thing that is all other pneumonias, next year I have a new name for pneumonia, and previously those people were categorized under the other category, now they get their own code, the meaning of other has changed, so now I can't aggregate my terms. This violates the notion of concept permanence. So I urge getting rid of this.
In my opinion, the way to handle the problem of saying I don't have a code for this, they said pneumonia but it's a different organism and we don't have a code for pneumonia by that organism, the answer is, use the code for pneumonia or bacterial pneumonia, and then add the modifier as either free text or a code, and then you have in effect said, this is a pneumonia not elsewhere classified, because it is pneumonia. It is not just plain pneumonia, it is pneumonia with a modifier. Later, if you develop a term for that meaning of the combination of the pneumonia with a modifier, you can gradually go back and recode your data if you wanted to. But in any case you haven't lost information. You can go back and aggregate your data.
You could say, get me all the patients with pneumonia. You will get all those patients. You can say, get me all the patients with pneumonia not otherwise specified, the NOS, you can do that. You can say, get me all the patients that have pneumonia that is not otherwise specified and it is not a separate code, but it has a modifier. Again, you can do that with this and you haven't lost any information in the process.
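The three queries just described can be sketched as follows, in Python. This is only an illustration under stated assumptions: the `CodedEntry` record, the tiny `IS_A` table and the organism label are all hypothetical, and a real system would use codes rather than names as identifiers.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CodedEntry:
    patient: str
    concept: str                    # the most specific concept available
    modifier: Optional[str] = None  # free text or a code, e.g. the organism


# Hypothetical single-parent is-a links, enough for this example.
IS_A: Dict[str, str] = {
    "bacterial pneumonia": "pneumonia",
    "pneumococcal pneumonia": "bacterial pneumonia",
}


def is_a(concept: Optional[str], ancestor: str) -> bool:
    # Walk up the parent links until the ancestor is found or the root is passed.
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


records = [
    CodedEntry("p1", "pneumococcal pneumonia"),
    CodedEntry("p2", "pneumonia"),
    # No dedicated concept exists, so the parent concept plus a modifier is used.
    CodedEntry("p3", "bacterial pneumonia", modifier="organism X (hypothetical)"),
]

all_pneumonia = [r.patient for r in records if is_a(r.concept, "pneumonia")]
unspecified = [r.patient for r in records
               if r.concept == "pneumonia" and r.modifier is None]
with_modifier = [r.patient for r in records
                 if is_a(r.concept, "pneumonia") and r.modifier is not None]
```

Because the modifier is stored alongside the concept, patient p3 can later be recoded to a new, more specific concept without any loss of information.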
Now some do's, on content. Try to strive for content that matches the -- especially matches the granularity of the primary creators of the data. If you want to successfully have that drawing that I showed in one of the early slides, where you collect data once and then use it for multiple purposes, the coding system has to satisfy the people that are capturing the data.
If you say to those people, look, we really are going to need ICD-9 CM eventually, so why don't you just capture your data in ICD-9 CM. That will be inadequate for the purposes. It would be maybe adequate if all they had to do was represent financial data, and sometimes I think that is all my hospital wants me to do. But we have to capture it for patient care. That is the primary reason we are doing this.
Examples of this for instance would be the terminology, or the terminology used by the pharmacy knowledge base vendors who are creating the actual codes that can go in a record and represent what data are being captured about the patient. That is of primary importance. You can then derive the other things, the aggregations that you need from those.
If ICD-9 CM for instance wants to say, this year we are going to count these as things that we are interested in and all the other things will go in the other category, they can change that by changing the mapping from year to year. But the primary collection of the data has to be of the granularity of the people that need to capture the data and need to use it, primarily.
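One way to picture a mapping that changes from year to year is the Python sketch below. It assumes the clinical data are stored as granular concepts and that each year's classification is just a lookup table; the category codes, years and concept names are invented for illustration.

```python
from collections import Counter
from typing import Dict, List

OTHER = "CAT-OTHER"  # hypothetical residual category

# Hypothetical per-year mappings from clinical concept to reporting category.
MAPPINGS: Dict[int, Dict[str, str]] = {
    1998: {"pneumonia due to organism A": "CAT-A"},
    1999: {"pneumonia due to organism A": "CAT-A",
           "pneumonia due to organism B": "CAT-B"},
}


def classify(concept: str, year: int) -> str:
    # Anything the year's mapping does not single out falls into "other".
    return MAPPINGS[year].get(concept, OTHER)


def tally(concepts: List[str], year: int) -> Dict[str, int]:
    return dict(Counter(classify(c, year) for c in concepts))


# The captured data never change; only the mapping applied to them does.
captured = [
    "pneumonia due to organism A",
    "pneumonia due to organism B",
    "pneumonia due to organism B",
]

counts_1998 = tally(captured, 1998)
counts_1999 = tally(captured, 1999)
```

The same stored records yield different aggregate counts under each year's mapping, and either year's report can be regenerated at any time.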
It is important then to have a feedback mechanism that includes both the creators of the terminology and the users of the terminology, so that you are not simply saying, these are the things we want to create, but the people out in the field may not know how to collect those data. Or they may want to be collecting something that the people that are doing the aggregation and the finances or the statistics are unaware of. So the feedback mechanism has to include everybody, to make sure that what gets added is going to be usable and useful in the primary setting of patient care.
Finally, there needs to be development of a timely concept oriented update process. That is a loaded statement. Both timely and concept oriented are important -- timely because clinical medicine is changing. We add new drugs to our formulary every day, we need new codes for them every day, new lab tests every month, new diseases. As we start to accelerate our understanding of the human genome, the terminology that is going to be associated with that is going to explode. That is going to have to happen in more or less real time.
In my institution, I update my terminology on at least a weekly basis and sometimes on a daily basis, to keep up with the new terms that are in there. If I don't do that, my systems will fail. So if we are going to have standards for this, they have to match that.
The concept oriented part refers to the notion that when you do an update, it is no longer adequate to say, this term is no longer used, this is the new term. You have to have more information about what is happening with these terms. Why is this term not used anymore? Was it incorrect? Has it been subsumed by another term? So the old terms now -- maybe there used to be two terms that mean the same thing, so don't use that one anymore, just use this one. When you find the old one, you know that you can map it to this new one.
If you change the name of something, say why you are changing the name of it, so we can say, you changed the name and you also changed the meaning. That is not allowed. We need to be able to critique these updates by looking at them from the concept oriented viewpoint.
Some lessons to learn. Lessons from ICD. Everything that ICD does, don't do it. I use that as my example in my classes all the time. Maybe it will get HCFA mad at me, but what can they do, make me document more? It is a strict hierarchy, it is a number coding system, it is just -- and they change their codes every day, they don't tell us how they update them, why they do the updates. Everything they do messes up those of us who are trying to capture data and use it for the primary purposes of patient care.
SNOMED has learned a lot of lessons over the last three or four years, and has started to evolve into SNOMED-RT. I assume that -- Keith, are you going to talk about SNOMED-RT here? Good. So you will hear from them some of the changes that are being made to SNOMED-RT to for instance get rid of NEC, for example, to adopt a multiple hierarchy and make it explicit in the terminology, that sort of thing. So lots of lessons to be learned from SNOMED.
Lessons to be learned from LOINC. Clem can give you those lessons better than I. The lessons from LOINC are that, in order to create this terminology, they did two things that I think were unique.
The first was, they got the people that actually created the terminologies and did the terminologies into the same room and said, let's talk about what we have in common, let's try to understand what it is that we all do, and let's come up with a terminology that covers everything that we do in our laboratory systems. If we do that, then we will have something that is going to be really useable. We will capture data at the point of care, and then be useable for other purposes from there.
The second thing that they did is, they created a knowledge model. I don't think they thought they were creating a knowledge model at the time. What they thought they were creating was a fully specified name. They said, we want to have really good names for these things.
So let's say, the first part is going to be the thing that the test measures, and the second part is going to be the thing that it measures it in, like serum or urine or something, and the third part is going to be, what period of time, is it a point measurement or a timed measurement, is it quantitative or qualitative. They felt that by creating these expanded names, they would be able to create unique names that everybody would recognize and understand.
That was a great idea from the standpoint of just saying, look, we don't have a standard, we just have a standard naming system, a nomenclature. If I give this to people, they will understand what it is. As they started to enumerate these things they came up with codes for them, maybe unique identifiers.
They said, everybody agrees that there is this mean cell hemoglobin concentration. Why don't we put a code on that? Then we can ship the code around. Meanwhile, if we don't have a code for something, we can still use the fully specified name.
What they were doing though is, they were creating a knowledge model for representing the laboratory terminology, which will allow them ultimately to do some very interesting things, like look for redundancy in the terminology, look for -- do multiple classifications of data, aggregation of data, and so on. They created their terminology by looking at what was the knowledge, what were the concepts they were actually trying to model.
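To make the knowledge-model point concrete, here is a hedged Python sketch. It uses only the parts named in the testimony (what is measured, the specimen, the timing, and quantitative versus qualitative), which is a simplification. The field names, the colon-joined name format and the local codes are assumptions for illustration, not the actual format of any laboratory terminology.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LabTestName:
    analyte: str   # the thing the test measures
    specimen: str  # what it is measured in, e.g. serum or urine
    timing: str    # point measurement or timed measurement
    scale: str     # quantitative or qualitative

    def fully_specified(self) -> str:
        return ":".join((self.analyte, self.specimen, self.timing, self.scale))


# Hypothetical local laboratory codes described with the structured name.
local_tests: Dict[str, LabTestName] = {
    "LAB-17": LabTestName("glucose", "serum", "point", "quantitative"),
    "LAB-42": LabTestName("glucose", "serum", "point", "quantitative"),
    "LAB-58": LabTestName("glucose", "urine", "point", "qualitative"),
}


def find_redundant(tests: Dict[str, LabTestName]) -> List[List[str]]:
    # Two codes with identical parts describe the same concept.
    by_name: Dict[LabTestName, List[str]] = defaultdict(list)
    for code, name in tests.items():
        by_name[name].append(code)
    return [sorted(codes) for codes in by_name.values() if len(codes) > 1]


def group_by(tests: Dict[str, LabTestName], axis: str) -> Dict[str, List[str]]:
    # Any part of the name can serve as a classification axis.
    groups: Dict[str, List[str]] = defaultdict(list)
    for code, name in tests.items():
        groups[getattr(name, axis)].append(code)
    return dict(groups)


duplicates = find_redundant(local_tests)
by_specimen = group_by(local_tests, "specimen")
```

Because each part of the name is a separate field, redundancy checks and multiple classifications fall out of simple comparisons rather than string matching on free-text names.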
I think that's it. That's all I had.
MR. BLAIR: Jim, thank you. I'd like to bring up Dr. Chris Chute next. Then after Chris, please save whatever questions you might have had from Jim's session, and we should have about 10 or 15 minutes for questions to both Jim and Chris.
DR. CHUTE: Thank you, Jeff. I'm just going to try valiantly to switch little boxes here.
Simon Cohn, my colleague in many activities and on this committee, asked that I address in the context of these introductory remarks what the heck we have been doing in CPRI the past few years, since many of the activities that have surrounded terminology development have had a forum among others that has included the CPRI activities.
I want to begin with some reminders and motivations, and three or four slides of why anybody would care about this sort of thing. This is a quote which basically says, those with more detailed, reliable and comparable information for cost and outcome studies, so on and so forth, are going to win in the marketplace. There is a lot of drivel in between, but you get the idea.
The principle is, this is not a pedantic or academic activity. This is a significantly important activity to the business of health care. Any failure to recognize the crucial importance of this as a business characteristic in health care will be to the detriment of those that fail to recognize it.
This of course is my famous -- I think you have actually seen this one. This is an old slide, where we begin with the notion of patient information on one side of the circle, with a big sweep through clinical information, observational data, to the notion of medical knowledge, that medical knowledge continues to sweep around through guidelines and expert systems to continuously improve patient care. So this grand circle, where better patient care yields more knowledge, more knowledge yields better patient care.
Then of course, what holds this great circle together, what is the hub around which this rotates. Many of you know, because you have seen this slide 50,000 times before. But at the center of course is the notion of terminology. It is the substance which describes patient information. It is the substance which describes medical knowledge, and it is the glue which can link knowledge back to patient care, and from which knowledge can be derived. Not a new idea.
So to summarize this motivation very quickly, terminology is a crucial requirement without which health data is not comparable, health systems cannot meaningfully or efficiently or effectively interchange information. The secondary uses that we are all coming to recognize as more crucially important, of quality and the like, are not practical or possible, and decision support linkage is not efficient or effective absent a commonality to link knowledge and information.
The last slide in this little series. People thought of this before. My friend Bill Farr in 1839 said, look, unless we have a standard way of representing health information, we won't be able to engage in this as a practical science, because analogous to the physical sciences having weights and measures, the nomenclature is to health care a crucial and fundamental metric that is necessary.
This is just to review. There are three papers listed on this slide. They include the content coverage paper, done by the CPRI work group back in '96, the phase two evaluation published in '97, and the framework for comprehensive health care systems, published in '98.
The notion here is that the CPRI working group has been addressing these problems for some period of time, and borrowed heavily from the work of others. Half of Jim's ideas were in the framework. It is not as though we invented them. But nevertheless, it was a coalescence of something in the field.
Among the conclusions of the study published in '96, although they were in final form back in '94, was that surprise, most coding systems lose information, and some remarkably lose more than others.
The notion that ICD even augmented with CPT loses over half of the clinical detail was news to some parties. But the consequence is that misclassification of the information one would seek to study, or bias in studies that are premised exclusively on administrative data, is highly likely, along with loss of information in the ways that Jim described.
The framework that was published last December was very analogous to some of Jim's desiderata, where the characteristics of terminology, structure of terminology, maintenance details of terminology and administrative desiderata were more or less outlined at a relatively high level. It reflects again the thinking of the large community, and was done jointly by the CPRI working group along with the ANSI HISB working group. Membership on that working group included many people sitting around the table here and many people in the audience. So it was indeed the work of many.
Back in November of 1996, CPRI conducted a national summit, not focused on terminology; it was on health care as such. But one of the conclusions from that summit was that there should be a national conference devoted to the topic of terminology, at which health care providers, CPR system vendors, payers, government and terminology developers would meet in a common forum to address the concerns that I think are becoming increasingly well recognized.
That meeting was one of a long series. There was a national conference in November of 1997 that emerged from that recognition. There was the joint conference on lexical solutions for the GCPR, cohosted by HL-7, which occurred about a year later, in August of '98. Then there was the second terminology conference establishing the consensus lessons from experience, which occurred several weeks ago in Tyson's Corner.
The materials that have emerged from this series of CPRI activities are now consolidated to a single website which points to everything you would ever want to know. I know this because I wrote the darn web page yesterday; it is now out there.
So every slide presentation that occurred in these fora is publicly available. Substantial summaries. I acknowledge Patricia Gibbons at Mayo for really doing a superb job to summarize and generate synopses. Whether 50 pages constitutes a synopsis or a treatise we could argue, but nevertheless, they are comprehensive summaries of these meetings and are publicly available. The references to the work that I have alluded to in this presentation are also on that particular web page with pointers to all the other activities.
For those of you that can't see the slide, it is www.cpri.org/terminology.
The first conference in '97 had as its goals industry defined terminology requirements, the goal of agreement upon a common framework for progress, a prioritization of requirements and the intention to generate knowledgeable input for national debates.
In fact, I followed Jim, as I tend to do, at the NCVHS meeting several weeks later and reported on some of the findings of that particular conference.
Among the agreed-upon issues from that was a tentative definition of what constitutes a clinical terminology. Essentially, it talks about standardized terms and their synonyms with a clinical granularity, with the notion that these terms can map to higher level classification systems.
This dichotomy between clinically granular information and higher level summary classification systems was also made explicit, and I think many a hatchet was buried at that first meeting. Prior to that, there had been a lot of fuss and hullabaloo about which was better, detailed formalized systematic classification systems or highly granular, specific, acyclic, multiple hierarchical clinical representation systems.
The answer to the question is both. Both are necessary, both are good for you. They serve different purposes, and they are not fundamentally incompatible. It is perfectly reasonable to have detailed clinical information mapped seamlessly to broader level classifications. Indeed, some of the desiderata that have been written, including those that emerged from our own framework, may have been somewhat too focused on a one size fits all mentality. The desiderata that pertain to a higher level classification may not be identical to the desiderata that pertain to a detailed nomenclature. One has to be careful about ensuring that one is targeting concerns appropriately, whereas in fact the functional characteristics of these systems are intrinsically different.
But the recognition that they exist along a continuum I think was widely accepted at the conference. Also, a notion that parallels many of the issues that Jim raised is this idea of an entry level terminology that can in turn be an entry point to an underlying reference terminology. Again, the properties of what a human being actually has to interact with versus the highly technical and computer science oriented properties of what constitutes a reference terminology may and probably should be fundamentally different.
Similarly, the ideas or concepts that emerged are stored within a reference terminology and can spill out into administrative terminologies du jour, which this week as best I can tell are ICD-9 CM and CPT and ICD-9 CM procedure codes. But clearly, as we evolve through ICD-10 CM or ICD-10 PCS or CT5 or whatever else may be emerging, the same reference terminology could contain an immutable, unchanging conceptual purity, clinical concepts we seek to preserve with their mapping into administrative system du jour, as I have said.
This is Simon's favorite slide, although I have cleaned it up, I swear. The notion at the bottom, Jeff, is a big red block of nomenclature, and emerging from it are different colored pipes that change hue from this pale red into a color of their own substance. The idea is that if you have an underpinning clinical representation of patient events, one should be able to derive from that a higher level classification system that would characterize patient data in the way that the clinical classification system is intended.
For example, ICD would have a different characteristic and substance and content from a procedure code, or from an image transfer code or from messaging tables and content. The idea of mapping from one system to another then, instead of having awkward tables that try to provide incongruous mapping, which is almost impossible, one would go back from these detailed or highly evolved differentiated classification systems to their clinical root and remap at the level of the underlying clinical representations to find an analogous representation in a classification of a different color, simplistically.
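The remapping idea described above, going back from one classification to the underlying clinical concepts and then out to a second classification, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system names CLASS_A and CLASS_B, every code, and every "REF:" concept identifier are hypothetical and are not drawn from ICD, CPT, SNOMED or any real code set.

```python
# Hypothetical data: which reference concepts each classification code covers.
# None of these identifiers come from a real code system.
TO_REFERENCE = {
    ("CLASS_A", "A-100"): {"REF:pneumonia-bacterial", "REF:pneumonia-viral"},
    ("CLASS_A", "A-200"): {"REF:asthma"},
}

# Hypothetical data: where each reference concept lands in a target classification.
FROM_REFERENCE = {
    "CLASS_B": {
        "REF:pneumonia-bacterial": "B-17",
        "REF:pneumonia-viral": "B-17",
        "REF:asthma": "B-42",
    },
}


def remap(system, code, target):
    """Map a code to another classification by way of the reference concepts.

    Returns a set because one source code may cover concepts that land in
    more than one target code, or in none at all.
    """
    concepts = TO_REFERENCE.get((system, code), set())
    target_table = FROM_REFERENCE.get(target, {})
    return {target_table[c] for c in concepts if c in target_table}


print(remap("CLASS_A", "A-100", "CLASS_B"))
```

The design point is that each classification maintains one mapping to the shared reference layer, rather than a separate table for every pair of classifications.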
The final point that emerged from that first conference was dealing with an administrative issue having to do with cost. The notion that recognition and fair recovery of the development and ongoing maintenance costs are justifiable and necessary, absent a mechanism for public funding. This could and perhaps should be revisited if the underlying funding mechanisms for clinical system maintenance might change, but the idea that all terminologies should be free and available, while highly desirable, is just impractical, given no other reasonable funding mechanism to support the very involved and elaborate process of creating and maintaining them.
I want to skip to the second conference which occurred just last month, a few weeks ago, where the emphasis at that conference was case studies and practical examples of what has worked through the interim. While one could posit this ideal of interlocking terminologies and how we should all work together, while a fanciful and desirable notion, there was precious little detail on exactly how one would do that.
So the conference focused on examination of a number of case studies where bilateral coordination between a classification and a coding system or nomenclature was already in existence, where there was experience to be drawn upon and learned from. The hope was that we could generalize these experiences as examples to proceed toward the goal that I think is common to all of us.
Among the recent coordinations that have occurred since the first terminology conference is the NLM facilitation of bilateral mapping between and among CPT, ICD and SNOMED. While arguably not fully mature and not ready for prime time, they do illustrate examples of activities and a working model wherein these linkages can be formalized and developed.
Similarly, there is a different kind of coordination between LOINC and SNOMED. I keep using the phrase, LOINC is included by reference, and the SNOMED people haven't completely challenged me. It is a bit more elaborate than that, I recognize. But the general principle that SNOMED would not presume to develop laboratory codes, but would essentially use the existing LOINC codes, that is evidently spelled out in excruciating detail on page one of some document -- is it the LOINC manual? -- if you want to know of the nitty gritty.
Similarly at the level of tool developers or fundamental resources for terminology, there was the merger of Lexical Technology and Ontext announced as a letter of intent. I don't think it is a done deal, but this consolidation in the field, where similar strengths are recognized and put together in a synergistic way.
MR. BLAIR: Chris, some of the committee members may not quite realize that you are talking now about tools and enablers to develop these coding systems with Ontext and Lexical Technologies. Maybe you could clarify that.
DR. CHUTE: That's right. The under title here is terminology component vendors. These are organizations that produce materials and resources that facilitate the development of terminologies. Since they were doing really the same thing, but using highly complementary techniques and technologies, their merger made logical sense.
Then finally and perhaps most spectacularly was the recent announcement of the merger between SNOMED owned by the College of American Pathologists and the UK Clinical Terms project that was formerly known as the Read Codes, to generate a common hybrid product that would derive essentially from the content of both, further unifying the major nomenclatures, at least in the Western world, that address clinical patient information.
The centers of gravity have become more focused and clear since the last conference as well. Specifically, the ISO working group on terminologies has been created and I think has emerged as a viable forum to address many of the meta standards around terminology development. There are newly emerged public and private efforts that I just alluded to. The HL-7 vocabulary working group has emerged as the predominant forum for practical consideration of terminology issues. Their intention to register terminology systems was raised at the last meeting, I think a positive step towards again bringing some sort of order and orchestration to the process. Finally, NCVHS has provided a valuable activity.
The last meeting provided detailed updates on many of the major systems, including CPT-5, ICD-10 PCS, SNOMED-RT, LOINC and ICD-10 CM. The major points that emerged from that work were four.
One is the increasing recognition of the business relevance of terminology in health care. They are now recognized as enablers to quality improvement and outcomes, and they can enhance clinical efficacy and provide reliable linkages for decision support. This is no longer questioned and no longer argued, but is a done deal.
Furthermore, the second point that emerged is the obvious demonstrated coordination and emerging spirit of cooperation as evidenced by the case examples that were presented at the conference, and a positive consolidation in the field, which appeared to be escalating.
The third major point was not simply a recognition of this continuum from nomenclatures to classification, but evidence of their proactive adoption by many of the classification developers and many of the nomenclature developers in a synergistic, cooperative and collaborative way. This is not a war, this is now a party.
Finally, the recognition that public funding and the announcement by Betsy Humphreys of the experiment to support LOINC in its initial development, where public dollars are used for the development and maintenance as -- and this is my quote -- an infrastructure for the public good, with an aim toward reducing the end user costs, is both welcome and encouraged.
We are not done in this community. The remaining tasks in my mind are to fully engage the payers and the providers, since the vendors of systems and information are not going to implement these things unless there is demand for the vendors to do so, which is not fully mature.
Similarly, I think the next conference if we have one should focus on the notion of where does a terminology model, where does a structure and representation of information as characterized within a terminology meet the health information systems model or the reference information model, if you will, and how do we reconcile distinctions between those complementary, but nevertheless at the present different views of the information.
Then finally, we have to complete the transition of what I still think remains in the hearts and minds of many as a perception that terminology is still an esoteric interest that only a small community worries about and has no bearing or no direct application in patient care to a recognition that these are really a crucial infrastructure for the public good and for effective and quality health care to be delivered efficiently.
I think that's it.
MR. BLAIR: That's good. We have a good 10 or 15 minutes for questions. I would encourage not only our committee members, but members from the audience as well that if any of the basic concepts or premises or terminologies that were used in this first educational session, if you don't feel comfortable, this is your opportunity to get these points clarified. So I encourage you to ask questions.
Bob Mayes, sitting to my right, is going to assist me as you raise your hands.
MR. MC DONALD: I just wanted to make sure as I heard the two talks whether you were both saying the same thing. You are both nodding.
There was a statement about the continuum between classifications and terminology, and another slide that had entry systems and classifications on this side. Were they supposed to be equivalent, entry systems and classifications?
DR. CHUTE: No, I think entry systems are closer to the nomenclature axis. There is nomenclature down in the lower left corner.
MR. MC DONALD: The question really is, is this blended issue -- are you saying the same things, about the blending? It sounds -- I kind of had a sense there might be a couple of axioms that didn't match. I wonder if you could comment. If you both say you are saying the same thing, then you are saying the same thing.
DR. CHUTE: In my professional career, I have always agreed with Jim. But the only nuance that might have emerged here is whether the desiderata that he has characterized and that I have more or less borrowed apply equivalently to the extreme end of the spectrum of classifications and to the other end of the spectrum, nomenclatures.
Fundamentally, I am posing the question, does one size fit all. I think for many of the desiderata, the answer is probably yes. But I think it could bear examination on a point by point basis.
DR. CIMINO: I think that is a good point. And actually when you were saying that, it occurred to me that the laws of physics apply differently in the macro world than the sub-atomic world. I think that there may be differences.
One example is the notion of the hierarchy. A strict hierarchy may work at a higher level classification scheme, but a lower level may not work, or certainly doesn't work. So I agree that it certainly does bear examination, but otherwise I think we are pretty much saying the same thing in terms of -- there are other things I didn't go into for lack of time, but the notion of different levels of granularity -- the high level of classification schemes, that is the place where we need things for aggregation for statistics, for reimbursement, for decision support, even, and then down at the low level we need it for trying to capture what is actually going on.
MR. MC DONALD: One slight followup. The classifications have started. The world is different than it would be if you were inventing the world by mathematics. But wouldn't you say the classifications would be defined in terms of the elements -- a set is defined by its elements? They wouldn't stand alone? How else will you define a set?
DR. CHUTE: In an ideal world, that was the notion I tried to display graphically with a big red brick of nomenclatures essentially providing the underpinning and the atomic elements of classifications.
It ain't that way right now, but in a grand perfect future, one would recast existing classifications, so that they were in fact premised on a combinatorial or an explicit combination of reference terminology elements.
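The notion of a classification premised on an explicit combination of reference terminology elements can be illustrated with a small sketch in which each category is nothing more than a named set of reference concepts. This is a hedged illustration, not a description of any existing system: the category names and the "REF:" concept identifiers are hypothetical.

```python
from collections import Counter

# Hypothetical classification: each category is defined purely by its members.
CATEGORY_DEFINITIONS = {
    "CAT-RESP": frozenset({"REF:asthma", "REF:pneumonia-bacterial"}),
    "CAT-ENDO": frozenset({"REF:diabetes-type-2"}),
}


def categories_for(concept):
    """Return every category whose defining set contains the concept."""
    return sorted(
        name for name, members in CATEGORY_DEFINITIONS.items() if concept in members
    )


def aggregate(recorded_concepts):
    """Count recorded clinical concepts per category, for statistics or reporting."""
    counts = Counter()
    for concept in recorded_concepts:
        for name in categories_for(concept):
            counts[name] += 1
    return dict(counts)


print(aggregate(["REF:asthma", "REF:asthma", "REF:diabetes-type-2"]))
```

Concepts that no category claims are simply left uncounted here; a real classification would more likely need an explicit residual category.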
MR. KOLODNER: This is an excellent summary. I look forward to having it out on the Web where people can hear your explanation, and we can have the slides out there, so we can bring people up to speed.
One of the things that continues to be a source of confusion to people stepping into this is all the different things we use, the terms we use, nomenclature and classification codes. It might be useful to incorporate the definition of that. I came in a little late, so I don't know if that was in your first slides, Jim.
We are immersed in it, but I think we need to provide some clarity for people when they first come into it, to differentiate those.
DR. CHUTE: The CPRI summaries have appended to them a fairly large glossary which addresses to some extent those things.
MR. KOLODNER: Also, I know that we will have made it, Chris, when the $100 million five-year investment of the government in GCPR is included as one of the citations that you talked about, in terms of the activities. But that is our current estimate of what we plan to put into that effort.
DR. COHN: First of all, I want to thank both of our presenters. I think you have done wonderful jobs.
I also want to remind the committee that the committee has been part of a planning activity related to both terminology one and terminology two conferences. So I think we have obviously been participating in the effort.
Now, Chris, you had commented about the fact that we need to complete the transition from the esoteric view of terminology and esoteric interests to a crucial information structure for the public good. Certainly, on one hand one would observe that the public and society as a whole recognize diagnosis and procedures as not esoteric. They are fundamental to most things that happen in the outside world.
I would ask the question, is it terminology in general? Is it the fact that we need additional elements that are really next steps that need to be considered as standardized terminologies? What are the next steps on that?
DR. CHUTE: If you want to think of it graphically, I am in a graphic mood this morning, if you think of the continuum from classifications to nomenclatures, I agree. I think the public grasps clearly the notion that diagnosis and procedures are the substance of what is done in health care. But it is sliding that slider closer to the nomenclature end which at present is considered the esoteric realm. So there is an endorsement and a recognition that detailed, specific, granular representation of these events and diagnoses and concepts are as important if not more important than the larger categories that they see on their hospital bills.
DR. CIMINO: If I could just add to that, I think the other aspect is the notion of how disciplined the approach to terminology is. Most people may acknowledge the fact that there is a need for terminology, but really don't understand -- they may think that it is already done. They say, don't you have this? You have been doing medicine for hundreds of years, don't you have the terminology now?
There is the notion of how disciplined this terminology is. So they may be thinking about what Chris was referring to as the entry terminology, the collection of things that we say, and boiling that down in some disciplined way into the collection of things that we mean. That is something that escapes most people.
DR. COHN: I think really what I am mulling about is, this work group is intending next year to make recommendations to the government having to do with issues that merit government attention or other actions to move things forward. And certainly, there are various ways of approaching the issues of health care terminology.
One is to focus on the issue of an overall framework and getting the framework right, and then letting the pieces evolve in. The other would be to take more of a piecemeal approach and say, gee, there are certain areas that need some help or need some standardization, or need something to help with comparable patient medical record information.
Obviously, we can do both, but the question is, which is the better approach and what are your views on that?
MR. BLAIR: We have three other folks that are in the question queue here. There is someone standing at the microphone from the audience. There is Richard Ferrans and Michael Fitzmaurice.
DR. COHN: Can I ask them to answer the question first?
MR. BLAIR: Oh, I'm sorry.
DR. CIMINO: I wouldn't want to pick one method over the other. I think both are crucial. If you just do it top down, you end up with something that looks like ICD-9 CM, which is saying, here are the kinds of things that we have got to be able to ultimately say at the end of the day. If you do it from bottom up, you end up with all these disparate little pieces that are going to overlap, are not coordinated.
That is the real danger, and that makes things hard to adopt. You say, I'm using this thing, and it doesn't quite map to the thing that somebody else is using, so I can't use what they have got.
I think that both are crucial. I think there are well defined domains, like laboratory test terminology, that don't overlap with anything else. You are not going to confuse that with the domain of drugs or anything else, although the terminologies used within LOINC definitely come from other domains. So the things that are measured from tests come from different domains. There are chemicals and organisms and cells and things like that, physical properties and whatever they are.
I think there needs to be a high level organization to figure out where all those pieces are going to be. But they can develop together and meet in the middle somewhere.
MS. GIANNINI: This is Melinna Giannini from Alternative Link. I have a hopefully helpful suggestion on separating when to do the mapping to the code sets.
I think that a clear delineation of maybe the life of a claim on the administrative side, and then the use of the claim on the clinical side would -- if that was very well delineated as to who is using the information for what purpose, then I think that your comparability would be more important to everybody than it would be if you just tried to merge the information all at once. So that is just a suggestion as to keep from doing a lot of crossover work that may or may not be necessary in the short term.
On the administrative side, I think that I keep not hearing what happens there and what the information is used for by which people. I think that we tend to overlook the life cycle of a claim for administrative purposes between providers and payers, and who has to view the information and what they have to do with it.
MR. BLAIR: Any responses to that by Chris or Jim?
DR. CIMINO: The only thing I would say is, I heard something that I wasn't quite sure what you meant there. You said something about how you would use claims data for clinical purposes. I wouldn't use claims data for clinical purposes.
I think the point is to collect the clinical data, and then if you need to generate claims data, generate it from the clinical data, but don't try to go back and say, this patient had pneumonia not otherwise classified last year, I had better give him this antibiotic this year based on it.
There is an inherent loss of information when you go across into a different realm like that. I think the key thing to remember is what we are trying to do is not pay for patient care, we are trying to do patient care. So we have got to collect data for patient care primarily, and everything else has to be seen as an outgrowth of that.
If we say, yes, it is important, we have to pay because if we don't pay then the hospitals go under, -- yes, it is important that all these pieces work. But the primary business that we are in, although it feels sometimes like the primary business I am in is filling up paperwork to get reimbursed, the primary business I am in is to take care of the patients, generate the data, and those data then can be used to generate the paperwork and so on and take care of everything else.
So it is a matter of perspective that has to be kept. When I was at the meeting two years ago, the person who preceded me -- she was from Mayo Clinic, I don't remember her name -- gave a beautiful talk about all the uses of data and all these other things. The one thing she left out was patient care, because she was all caught up in the reimbursement and the statistics and all these other things that are also important. It is just a matter of perspective, though.
MS. GIANNINI: I agree with you totally on what information needs to be attached to the patient. But I still believe strongly that until you understand who is using the information for what purpose, you are going to work too hard to get attachments and crossovers and crosswalks made to information where it perhaps is meaningful and where it is not.
A point in case. The payers -- I have encoded alternative medicine -- the payers have to understand for alternative medicine from a code level what is legal and what is not legal. That doesn't need to be the whole patient record. That needs to be, can they diagnose, can they prescribe, are they able to dispense. Those kinds of issues are the things that they need to know for a legal issue, and it has nothing to do with the patient.
So I just caution everybody that there are reasons to compress the information in some instances. When you are looking at patient care and the patient record, then you need the expanded definition.
DR. CIMINO: I don't think anybody would argue.
DR. FERRANS: I want to thank you both for your presentations.
I wanted to focus on a three-part question on three different dimensions of the problem. The first one has to do with vocabularies and the trends in particular, Chris, that you pointed to about consolidation, about mappings between. If you wrote a followup to your paper on the clinical coverage of classifications, a year or two from now, at the time where the committee is supposed to be making recommendations, where do you think we are going to be? Given the various trends, can you extrapolate forward how close are we going to be to having an 80 or 90 percent solution, realizing that there is going to be a significant amount of time to clean up overlapping.
In that question, does it really matter whether it is a single terminology, or whether we have good coverage when you combine multiples that are mapped together?
As part of that, Jim mentioned about organization and how to put together -- the importance of having some organization. What do you all see as the role of the National Library of Medicine in doing this? Obviously in the terminology arena, we would certainly look to them. How do you see their role evolving?
DR. CHUTE: Let me answer the first question, if I might. I see continued consolidation and precision and uniformity at the levels of higher level classifications and reference terminologies.
Frankly, I think there will continue to be -- and perhaps this is good -- variability, market products differentiation at the level of entry terminologies, what interfaces the clinicians actually see, language processing modules and whatnot, that would generate these underlying reference terminologies. I think that is an area that is too highly variable and dynamic to speculate much about at this time.
DR. CIMINO: I think I don't have a good vision for how these things can evolve. I think that work groups like CPRI have a real potential to develop consensus, and I think that is going to be as important as anything that can be dictated from anywhere else.
I think the National Library of Medicine has a lot of experience in collecting, producing, disseminating this information. I don't know that they would see it as their role to take on the task of deciding what the top level should be for trying to organize that.
Don Lindberg said repeatedly that what they have is books, they don't have patient records, and they don't see themselves as experts in the area of representation of clinical data. But I think that there is certainly coordination with the NLM that will be crucial to do dissemination. I would see the UMLS for instance evolving into something that is the distribution mechanism for whatever we end up using.
DR. FERRANS: The last part of the three-part question was about the funding of the maintenance. I guess this is a particularly important topic as recommendations go forward. How do you see the model with LOINC and other -- how would you envision a role for funding or for public funding of maintenance of vocabularies? What do you think is appropriate?
DR. CIMINO: I think it depends on how the pieces are being developed. In the LOINC it is mostly volunteer, so as far as I know, the funding that is being used is mostly for paying the travel for the people to get together in the same room. But those people are not paid for doing the work. They are volunteers or they are contributed by their companies.
That model works in some cases. There may be other places, for instance, that have been working with the drug knowledge vendors to see if there is some consensus terminology that we can come up with that will help all of them conduct their business. They all have a vested interest in being able to see their terminology being transferred to other systems and vice versa. So there may be different models for different pieces of the puzzle.
I think that where you really have to roll up your sleeves and pay content developers to do that, then you look at things like the SNOMED model, where there are license fees to support that activity, and actually hire people who are experts in microbiology, who figure out what is missing and what needs to be added and so on.
DR. CHUTE: Personally, I think one of the roles of government is to fund an infrastructure that individuals or small organizations would not independently choose to pay for, because these are common infrastructures for the public good.
I see terminology development -- which turns out, if you look under the hood, to be a relatively expensive and difficult process if it is done well, and if it is done scalably and maintainably, as an ideal target for appropriate government support. I think the experiment with LOINC is one that will be looked at very carefully, and hopefully one from which we can learn a great deal.
It is my hope that it will prove to be a successful and prudent experiment that might recommend extensions of similar public resources to other HIPAA approved terminologies.
MR. BLAIR: We have one last question from Michael Fitzmaurice.
DR. FITZMAURICE: I too want to thank the presenters for their very cogent presentations. I learned an awful lot, and learned more than I need to know.
(Words lost) an agency where we try to improve quality measures and develop new measures. We need to be concerned about the efficiency with which we can get information for those quality measures. If you need to describe clinical performance measures such as are found in HEDIS, to show business they are getting value for the dollar, to guide providers to a better quality of care, the question I want to ask you is, where is the starting point for new useful measures of quality of care, to be able to pull them out efficiently from documentation for medical care?
A couple of examples would be, did a pregnant woman see a physician in the first trimester? Is this a coding problem? Or did a young child with asthma receive the proper medication to avert a hospitalization? Is this a matter of classification, a matter of terminology? Is it a matter of coding systems with an axis for clinical performance measures?
How would you approach this from the terminology and the nomenclature and the coding system?
DR. CHUTE: I actually have strong feelings on this one, Mike, so hold me back.
It gets very deeply into the issue of where an information model about the health record meets terminology. For example, I think coding which trimester one happens to be in into a terminology system is a very misguided notion. I would far rather last menstrual period and date of visit be recorded in the information model, so that one could compute whatever appropriate time interval was pertinent to the quality metric, and then have as fundamental clinical concepts a reference terminology for coding fundamental clinical concepts.
It requires however a harmonious synthesis of information model about patient record with information structures of terminology. That is an activity which has yet to take place. If government were to encourage, fund or support conferences -- not that I would ever raise such a notion -- to address or improve these activities, I think that would be fertile ground.
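The trimester example lends itself to a short worked sketch: if last menstrual period and visit dates are stored as dates in the record, the quality measure can be computed rather than coded. The following is a minimal illustration under stated assumptions. The function names are hypothetical, and the 14-week cutoff for the end of the first trimester is an assumed parameter, since conventions for that boundary vary.

```python
from datetime import date

# Assumption: the first trimester is treated as ending at 14 completed weeks.
# The cutoff is a parameter because conventions differ.
FIRST_TRIMESTER_END_WEEKS = 14


def gestational_age_days(lmp, visit):
    """Days elapsed between last menstrual period and a visit date."""
    return (visit - lmp).days


def seen_in_first_trimester(lmp, visit_dates, cutoff_weeks=FIRST_TRIMESTER_END_WEEKS):
    """True if any recorded visit falls within the assumed first-trimester window."""
    limit_days = cutoff_weeks * 7
    return any(0 <= gestational_age_days(lmp, v) < limit_days for v in visit_dates)


lmp = date(1999, 1, 1)
print(seen_in_first_trimester(lmp, [date(1999, 3, 1)]))
```

Because only the raw dates are stored, a measure that later adopts a different cutoff can be recomputed from the same record by changing `cutoff_weeks`.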
DR. CIMINO: I think that your example of the asthma, is the child on appropriate asthma medication, one way to record that in the record is to ask somebody, is the patient on asthma medication, and create a concept of appropriate medication. That could be a concept, right?
The problem with that of course is that what it means changes over time. This year it is one thing, the next year it is another. This points to why it is crucial to put in the record the level of terminology that the clinicians are dealing with. So they would record what medication the patient was on, and then in whatever year you decide to do your survey, you decide which standards to apply, then you create an ad hoc classification, saying, I am interested in these medications, was the patient on any of these during this period. You can always go back and do that from the primary record, because the data are still there.
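The ad hoc classification described here can be sketched as a query over the primary medication record. This is an illustration only: the record layout (name, start date, end date) is an assumption, and the medication names are placeholders rather than real drugs or codes.

```python
from datetime import date


def on_any_during(records, medications_of_interest, period_start, period_end):
    """True if any recorded course of a listed medication overlaps the period.

    `records` is an iterable of (name, start, end) tuples, an assumed layout.
    Two date ranges overlap when each one starts on or before the other ends.
    """
    for name, start, end in records:
        if name in medications_of_interest and start <= period_end and end >= period_start:
            return True
    return False


# Hypothetical primary record: what the clinician actually prescribed.
record = [
    ("drug-a", date(1998, 1, 10), date(1998, 3, 10)),
    ("drug-b", date(1998, 6, 1), date(1998, 12, 31)),
]

# This year's standard, expressed as an ad hoc set of medications.
standard_1999 = {"drug-b", "drug-c"}
print(on_any_during(record, standard_1999, date(1998, 7, 1), date(1998, 9, 30)))
```

When the standard changes, only the set passed as `medications_of_interest` changes; the recorded data stay as they were entered.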
DR. FITZMAURICE: So what I hear you saying is that instead of creating codes for it that a physician would put in the record, create the use of fundamental terminology to let you derive these quality measures.
DR. CIMINO: Yes.
DR. FITZMAURICE: Have the physicians record them in the record, that is, encourage that at conferences and otherwise encourage it, so that you can draw the information out of the record, rather than having an abstractor go back to the paper record or have a physician even put a special code, but the fundamental terminology into the record.
DR. CHUTE: But distinguish what belongs in the terminology and what belongs as other attributes of recorded information about the patient. I think the example of trimester of pregnancy is a classic one. That should not derive from the terminology, that should derive from information in the record, such as dates of last menstrual period and date of visit.
MR. BLAIR: We are about to hit a break here. We are running about five minutes late. I'm going to have us reconvene about five minutes late at 10:30, but we will convene promptly at 10:30. Thank you, everybody.
(Brief recess.)
MR. BLAIR: We have Dr. Keith Campbell with us for the next session of the portion of our testimony these next two days, which is to help us get an educational foundation on medical terminologies. Keith, please proceed.
DR. CAMPBELL: Thank you. First, I just wanted to mention that my responsibility within Kaiser Permanente is to try and get Kaiser to have essentially nationally comparable data across the entire Kaiser enterprise. In order to do that of course, we have to have comparable terminologies across our enterprise. We are working on standardizing those terminologies.
We have been working on standardizing these for a number of years. I think we have some lessons that I would like to bring to the table today. So the title of my talk is trying to bring to you an enterprise view of terminology that is more than just an end user view. It is the view of the large enterprise. It is a consumer of data, but also a consumer of information systems in an organization that provides care.
But before I went into specific detail of some of the experiences that we have had over the years in working with terminology, I wanted to share with you Kaiser's vision, and then translate that vision into how it is supposed to relate with information systems and how that then translates down into terminology.
One of the people that we have worked with extensively in developing clinical information systems are our requirements people. They try to emphasize to us that having the traceability of these recommendations is very important.
If for example we say here is an esoteric thing regarding terminology and it is very important, unless we can trace this esoteric requirement back up to the business requirements, as to why we are providing health care in the first place, often we get significant challenges when we ask for budgets to implement esoteric things that haven't been traced to business requirements.
So here is actually a vision statement that came out last week. It was put out by the chief operating officer of the health plan as well as the CEO of the Permanente Federation, which said that KP has strategically committed to differentiate itself from the health care marketplace on the basis of its ability to deliver high quality care and service.
A second part of that statement was that the use of the electronic medical record and other clinical information system tools are critical enablers on the path to significant improvements in health care outcomes and quality service.
To translate this vision statement for the enterprise into clinical information system goals, I put three of them up here. One is that we have to have nationally interoperable information systems that integrate applications from multiple vendors and from multiple application areas.
Because Kaiser Permanente is a national organization with different regions, and has historically grown up in a federated model as opposed to a single corporate entity, we have laboratory information systems from different vendors in different regions. Similarly for pharmacy information systems, we have multiple vendors with multiple terminologies inside of them. We are trying to integrate these all together into a comprehensive health record.
In order to do that, we recognize that interoperable information systems must be founded on robust terminologies. We have even learned that it is actually more important to have robust terminologies than it is to have a single application implemented across the enterprise.
At one point, we had tried to implement common insurance systems or common businesses into Kaiser Permanente, and people thought that would solve the problem of national comparability of data, that if we just implemented the same system in all our different regions, we would be able to pool our data.
It actually turns out that was not the case at all, because when people tried to use these systems in specific regions, they often had to make their own enhancements of the terminology to make the system operate in a way that met the local needs. When that happened, the terminology was not comparable, and we could not pull data from two different regions that had terminologies that were different.
So a number of years ago, in the early 90s, Kaiser decided to make a strategic investment in terminologies. These terminologies must fit within our strategic CIS needs.
Let me just describe to you a summary of what those imperatives are. The first one was that the terminologies needed to be medical record vendor neutral. There are a couple of reasons for that. One is that we are in a period of innovation in the medical record marketplace, where people are trying to differentiate themselves based on unique value.
We want to allow that and take advantage of that, but we need to be able to map to common reference terminologies that will be acceptable to multiple vendors. A number of years ago within Kaiser, we actually had four significant information systems efforts. We had one going on here in the Mid-Atlantic states, one going on in the Northwest region with EPIC, one in Colorado with IBM, and another one in Southern California with Oceana, and there was a fifth one in California with Oasis. We had to get all of these vendors to try and agree that they would make their systems interoperate with the terminology, but yet we had these proprietary battles, saying, but if we make ours work with this system, how do we know that our proprietary information isn't going to go from our work to our competitors' work through this terminology.
That actually was one of the reasons why we founded this notion of separating out the interface portion of the terminology from the reference terminology. It was just a practical matter. We had to do it in order to make our terminology efforts vendor neutral.
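The separation of interface terminology from reference terminology can be sketched as follows. This is a minimal illustration only: the concept identifiers, vendor names and terms are made up, and it assumes each vendor keeps its own private map from its interface terms to shared reference concepts.

```python
# Hypothetical shared reference concepts (identifiers invented for illustration).
REFERENCE_CONCEPTS = {
    "C001": "Hypertension",
    "C002": "Type 2 diabetes mellitus",
}

# Each vendor keeps its own interface terms. Only the reference concept
# identifier is shared, so one vendor never sees another vendor's wording.
INTERFACE_MAPS = {
    "vendor_a": {"HTN": "C001", "High blood pressure": "C001"},
    "vendor_b": {"Hypertensive disease": "C001", "NIDDM": "C002"},
}


def to_reference(vendor: str, interface_term: str) -> str:
    """Translate a vendor-specific interface term to a reference concept id."""
    return INTERFACE_MAPS[vendor][interface_term]


def comparable(record_a: tuple, record_b: tuple) -> bool:
    """Two (vendor, term) records are comparable if they map to one concept."""
    return to_reference(*record_a) == to_reference(*record_b)
```

Data from two regions can then be pooled on the reference identifier even though clinicians in each region saw different wording on screen.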
Another requirement is that we feel that the terminologies have to have scientific validity. So for example, for laboratory terminology, I think the effort LOINC has put forward is a good demonstration of trying to have naming be scientifically valid, where they explicitly state things, like what is the analyte being named, what is the substance measured, and that the people participating in this are actually the users of the terminology, the laboratory technicians as well as the pathologists in charge of the laboratory, trying to make sure that what is being represented in terminology is what actually describes what is going on.
Next, the terminologies have to be well maintained, because if the terminologies aren't well maintained, we get creep in the underlying terminology, because people have to make local enhancements, and that repeats our experience with the business systems. If they are not well maintained, available with daily, weekly, monthly updates, you have drift in the terminology and you no longer have the ability to have comparable data, and then you get idiosyncratic versions of the terminology that may be specific to a particular vendor, and you no longer have that vendor neutrality.
Another issue that we actually think is very important -- I know the licensure issue has been brought up a few times so far, and I'm sure it will be brought up again in the remainder of this conference -- we believe that the terminology itself must be self sustaining by some mechanism.
I'm not trying to adjudicate what I think that mechanism should be, because I think different types of terminologies can have different mechanisms for self sustaining. But given our position, where we have actually seen EMR vendors come and go, and often we have to pick up the pieces when an EMR vendor goes, yet we have developed a corporate dependency upon their software, we often have to internalize that software and develop it ourselves, and keep it up to date.
The data that is in our systems actually has a longer life than any one system, even if we kept with the same vendor. They are going to be upgrading their product, this version for the next version and so on. We need to make sure that our investment in data and resources is preserved as we implement new systems and as time goes on.
Finally, we have to have scalable infrastructure and process control to manage these terminologies. In order to again prevent the problem of having to make local enhancements in a way that violates the semantics of the overlying terminology to meet local needs, if we could have a responsive organization that we could say, we need terminology in this area, they are able to get it back quickly, they are able to do that on a large scale, so that they can have requests coming in from different regions across the country, for maintaining that terminology and getting it up quickly and synchronizing it. This is what we feel is strategically required to meet our enterprise needs for terminology.
Today's situation is that few terminology systems meet our strategic imperatives. I would argue that the only ones that do are the ones that we work very closely with to try and help them to meet these strategic imperatives.
One of the things to note is that we have invested significant resources towards meeting these, and have tried to develop partnerships with terminology organizations to try and meet these requirements. Our challenge right now that we are faced with, an education and collaborative challenge, that we have a lot of experience in developing terminologies, we have an enterprise need of having these terminologies be high quality and robust and well maintained. We are actively trying to work with terminology vendors and EMR vendors to meet our clinical information system imperative.
But why are there so few terminologies that meet our needs today? I would say that the number one reason is failure to invest in robust infrastructure. The support for infrastructure must be part of a self sustaining revenue model for the terminology, and that a scalable infrastructure requires significant resources.
You may be able to take a terminology and work to a certain point where you have a database on a personal computer that doesn't have distributed access, doesn't support distributed development, but it will not scale. You are going to reach a certain point where the abilities of one single individual to manage that terminology are going to be eclipsed by the demands that are being placed on that terminology, as people try to put it into practice. So then you have to start developing distributed systems, dedicated database administrators and replication of database across multiple sites, and being able to handle large transaction loads and so on. Developing that infrastructure is non-trivial.
Another challenge is that there is difficulty forming collaborations. Some of the issues are that collaborations are technically hard to manage, because you have different organizations with different priorities trying to work together on something that arguably is the same, but in many senses is different.
This for example is one of the challenges we had in trying to work with different EMR vendors to try and move them towards working with a common terminology. They would say, we need to differentiate ourselves, all of the issues about wanting to make sure their proprietary aspects of their applications were not compromised as part of working within a collaboration, how do you make that happen.
There is also a problem of, not invented here. Many people say, I have looked at this system over here, and it may have a few concepts that are a little bit better than the system that you are working with, so I'd rather work with this system over here rather than this one.
The message I would have to this, which is one that I tried to give very strongly at the previous GCPR conference on lexicon solutions, was that the process and the scalability are more important than the starting point. You have to have groups that are willing to work together towards creating process and scalable solutions, and that is the foundation upon which your terminology solutions will be solved, not who has the better content today. In point of fact, those efforts could be quickly eclipsed by a group that had better resources, better infrastructure, better scalability, could be sustained in a way that would meet our enterprise needs, as opposed to having today's terminology solution work for five years.
We invest several hundred million dollars in the enterprise, developing information systems that make use of that terminology, and then it fails, and it fails because the revenue model for the terminology was not based on the terminology itself, but was a loss leader of a particular vendor, trying to say, we will put this terminology out for free for people to use, and we are going to make our money on knowledge bases that make use of it. If their knowledge base system fails to make the revenue that they need to sustain the other part of it, then we are really stuck with a $100 million white elephant.
Those types of things are just untenable to us as a large enterprise.
Finally, the last challenge is getting sustained organizational commitment. I think one of the reasons for this is that the challenges, once you open up the hood and look at the intricacies of trying to support terminology development on a large scale, are not intuitively obvious. People get into the collaboration thinking this is going to be easy, and within six months we are going to have this problem licked, because we are working together.
Within six months, you might have had some initial agreements about process and scale, and then after that six months has gone by, and the naive initial view as to what was going to occur within that six months are not realized, trying to sustain organizational commitment beyond that is often a challenge.
One of the things that I would like to do is have two specific examples of systems that could benefit from some of the infrastructure that we have been talking about with regard to distributed scalable terminology solutions.
One that is particularly cogent, because this is something that this committee has endorsed for claims attachments, is the NDC codes. I just wanted to first talk about it with regard to our clinical imperatives, or our imperatives with regards to terminology systems.
First of all, are the NDC codes EMR vendor neutral? Yes, they are. They are a government mandated system, where manufacturers of drug products are required to submit their codes to a process for managing them. It doesn't give proprietary advantage to any particular company who might make use of them.
Are they scientifically valid? I would argue, sort of. They are scientifically valid for the specific purpose for which they were originally intended, but many people take systems and try and make them work for more than what they were originally intended for.
There are many ways that the scientific validity of NDC codes could be improved. That is why I hedged a little bit and said they were sort of scientifically valid. We could improve the validity of them. But are they well maintained? This is one of our criteria. The answer is, no, they are not. There is not one organization that says we will take accountability for publishing on a daily basis a quality reviewed version of the NDC codes that we will make available to everybody, so that then it becomes a credible standard for use in our information systems.
Is it self sustaining? Here again, I feel that there are different models for self sustaining revenue. I think in this case, it is a self sustaining model, in that it is a regulatory requirement for those people that are manufacturing drugs as part of their manufacturing process, they have to properly name those drugs. So the revenue from that regulation with regard to the manufacture of drugs, is the revenue that drives the NDC process itself.
Then our next question is, do we have scalable infrastructure and process control. I think the answer there is clearly no. We do have a specific example of that, in that the Food and Drug Administration, the Health Care Financing Administration and the Veterans Administration all maintain different listings of the NDC in triplicate and these listings are different. If you take the VA version and HCFA's version and you compare them, they are not going to be identical.
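The comparison described here can be sketched in a few lines. This is only an illustration under stated assumptions: each agency's listing is modeled as a plain set of code strings, the function name is hypothetical, and any codes passed in are whatever the caller supplies.

```python
from typing import Dict, Set


def listing_discrepancies(listings: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each source, return the codes that are not present in every listing."""
    # Codes that all sources agree on.
    agreed = set.intersection(*listings.values())
    # Whatever remains in a source is missing from at least one other source.
    return {source: codes - agreed for source, codes in listings.items()}
```

If the listings were truly identical, every returned set would be empty; any non-empty set points to codes that need reconciliation.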
I think that there are solutions to that. There have been different solutions proposed. One was to develop something completely new. I believe developing a completely new solution is a high cost, high risk proposal for solving the problem related to NDC codes and pharmacy.
One of the issues is that it develops new organizational accountability that has to be developed de novo. Also, depending upon what solution is picked, it may provide an unfair proprietary advantage. There are different companies already out there that develop a business model based on providing quality checked NDC codes to business. If you were to pick one or the other as the foundation for the solution, then you are creating a competitive advantage for one.
I would propose that a better solution would be to improve the existing NDC process. I think that this would be a moderate cost way of going forward with low risk. You work to refine existing organizational accountability as opposed to creating new organizational accountability, and incrementally add new functionality as part of this process to improve the scientific validity and solve some of the other imperatives we have regarding our terminology systems.
Here is just an example of how I view an improved NDC process. If you had a distributed development process with robust infrastructure, you could start out with an NDC version that had all of the codes to date. Then if different manufacturers and different repackers were to submit electronically their version of the code to a central site, and this would go quickly through a quality control process, where if the manufacturer had made a mistake in how they submitted it, they would go back to that manufacturer within 24 hours for them to do a rework and then resubmit it. It goes through the quality control process and is improved and then is published as part of the next version of the NDC codes. If this process was done in a scalable way so that it would meet the needs of all of the repackers, all of the drug manufacturers that produce pharmaceuticals within the United States, then you are starting to get to a credible, scalable solution that would eliminate for example the triplication of effort between the Food and Drug Administration, the VA and HCFA.
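The submit, check, rework and publish cycle can be sketched as follows. This is a minimal illustration, not a description of any existing system: the function names are hypothetical, and the only quality checks shown are a format check (assuming the three-segment 4-4-2, 5-3-2 and 5-4-1 NDC layouts) and a duplicate check.

```python
import re
from typing import Iterable, List, Optional, Set, Tuple

# Assumed layouts: labeler-product-package as 4-4-2, 5-3-2 or 5-4-1 digits.
NDC_PATTERN = re.compile(r"^(\d{4}-\d{4}-\d{2}|\d{5}-\d{3}-\d{2}|\d{5}-\d{4}-\d{1})$")


def quality_check(code: str, published: Set[str]) -> Optional[str]:
    """Return a description of the problem, or None if the code passes."""
    if not NDC_PATTERN.match(code):
        return "malformed code"
    if code in published:
        return "duplicate of a published code"
    return None


def next_version(
    published: Set[str], submissions: Iterable[Tuple[str, str]]
) -> Tuple[Set[str], List[Tuple[str, str, str]]]:
    """Build the next published version from (submitter, code) submissions.

    Returns the new version and the list of (submitter, code, problem)
    items to send back to the submitter for rework.
    """
    accepted = set(published)
    returned = []
    for submitter, code in submissions:
        problem = quality_check(code, accepted)
        if problem is None:
            accepted.add(code)
        else:
            returned.append((submitter, code, problem))
    return accepted, returned
```

Because there is one central version and one quality control step, every consumer reads the same published set instead of maintaining its own copy.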
But in order to make that happen, there has to be an implementation of infrastructure that currently doesn't exist today.
Another system that I think is interesting to look at is the LOINC codes. I think that they are a very successful example of what can be done. I think that they are EMR vendor neutral; I think that is one of the things in its favor. They do have high scientific validity, the process that Jim talked about, where you got the users of the terminology and the laboratory systems in the room together to try and figure out what to do, a very important thing.
Are they well maintained? I think sort of, today. People have been making a very good effort. It is a volunteer effort, but it does not have dedicated infrastructure behind it to be able to scale to the system that I think it needs to scale for a large enterprise like Kaiser to depend upon it nationwide for its information needs.
Is it self sustaining? Today it is not. There is a recognition that it does need some funding. It did get some government money, but that government money is not a promise of continuation funding, or is it? The self sustaining model for LOINC is something that needs to be worked out.
Does it have scalable infrastructure and process control? I think the answer again is no.
Now, similarly to how we worked to improve the NDC codes, we could have an improved LOINC process that allowed distributed development with robust infrastructure. Maybe for example the diagnostics vendors, the people that would create new tests, new viral load tests, are the ones that need to propose a name, just like the drug manufacturers and drug repackers are the ones that need to propose NDC codes.
You have a central infrastructure where a diagnostics vendor can submit a new test name that goes through a quality control process, where perhaps it is reviewed by two different individuals for scientific validity. If they agree, it goes through the quality control process on the back end and gets published in the next version. If not, it gets sent back to the diagnostics vendor, where they can then clarify their request and send it back to the quality control process until it is published.
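The two-reviewer loop can be sketched as below. Everything here is a toy assumption made for illustration: the "analyte:specimen" name format, the two reviewer checks and the function names are invented and do not reflect actual LOINC naming rules or procedures.

```python
from typing import Callable, Optional, Sequence, Tuple

Reviewer = Callable[[str], bool]


def reviewer_structure(name: str) -> bool:
    """Toy check: the name has exactly two non-empty parts, analyte:specimen."""
    parts = name.split(":")
    return len(parts) == 2 and all(part.strip() for part in parts)


def reviewer_specimen(name: str) -> bool:
    """Toy check: the specimen part is one of a few known specimen types."""
    return name.split(":")[-1].strip().lower() in {"plasma", "serum", "urine"}


def submit_until_published(
    drafts: Sequence[str], reviewers: Sequence[Reviewer]
) -> Tuple[Optional[str], int]:
    """Try successive clarified drafts from the vendor.

    Returns (published name, round number), or (None, rounds used) if no
    draft satisfied every reviewer.
    """
    for round_number, name in enumerate(drafts, start=1):
        if all(review(name) for review in reviewers):
            return name, round_number
    return None, len(drafts)
```

A vendor whose first draft omits the specimen gets the name sent back, clarifies it, and is published on the second round.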
This type of infrastructure is something that we have been working on for awhile. If I can just distil the problem down a little bit, I think the problem is really one of facilitating collaborative development.
Now, collaborative development is complex. Anybody that has ever written a piece of software can tell you, the easiest way to write a piece of software is for one person to write it, but the problem is that often, systems are so large that you have no choice but to have more than one person try to work with that.
Operating systems are great examples of trying to write a mainframe operating system or a UNIX. Those were things that had pluralistic input from many different individuals with many different skills. They developed a whole complex process around configuration control of source code and the ways of managing versions of source code and branches of source code and releases of source code, and QA processes around software releases.
We need to have the same robust infrastructure around terminology development releases that we have today for software. When I started working at this problem in 1992, there was no process or environment to support collaboration. I was fortunate enough to be around when a critical mass of individuals got together to create something called a convergent medical terminology project.
The initial goals of that project were to develop and evaluate distributed development methodologies that are organizationally scalable, and by organizationally scalable, I mean as more and more developers are brought into the process, the infrastructure does not collapse on itself and make it so complicated that you can't continue to get work done.
Also, this architecture had to be developed on a distributed and scalable computer architecture, so that as the demands on the system increased and the number of people participating in it increased, we had to be able to increase the computing infrastructure so that it could be distributed and meet the needs of the end users.
The CMT project original participants were Kaiser Permanente, Mayo Clinic, the College of American Pathologists and Stanford University. It was mainly funded internally, with some support from the National Library of Medicine and the Agency for Health Care Policy and Research, and we had some software support. Tools were developed for us by IBM, Lexical Technologies and Ontix.
Here, we had this basic framework of evolution. In a sense, you would have a version of a terminology. People would work independently on their own branches of the terminology to make local enhancements that they felt met their needs, and submit those changes and go through a merged quality assurance process to create the new version.
I would argue that this is the same process that can be used for LOINC. It is the same process that can be used for NDC codes, and that the scalable infrastructure that we have shown can work in large systems could be brought to bear to solve some of these problems with regard to the other systems.
Over the years of our project, there is a publicly described methodology for it. There are mechanisms to generate local changes, mechanisms to collect local changes and identify conflicts, processes for evaluating and resolving these conflicts, and mechanisms to disseminate these local changes and global updates.
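The four mechanisms listed here (generate local changes, collect them and identify conflicts, resolve conflicts, disseminate updates) resemble a merge over branches. The following is a minimal sketch under simplifying assumptions: a terminology version is modeled as a dictionary from concept identifier to definition, deletions are ignored, and all names are hypothetical rather than taken from the CMT tools.

```python
from typing import Dict, Tuple

Terminology = Dict[str, str]  # concept id -> definition


def local_changes(base: Terminology, branch: Terminology) -> Terminology:
    """Concepts a site added or redefined relative to the base version."""
    return {cid: text for cid, text in branch.items() if base.get(cid) != text}


def merge(
    base: Terminology, branches: Dict[str, Terminology]
) -> Tuple[Terminology, Dict[str, Dict[str, str]]]:
    """Collect every site's changes, apply the agreed ones, report conflicts."""
    proposals: Dict[str, Dict[str, str]] = {}
    for site, branch in branches.items():
        for cid, text in local_changes(base, branch).items():
            proposals.setdefault(cid, {})[site] = text

    merged = dict(base)
    conflicts: Dict[str, Dict[str, str]] = {}
    for cid, by_site in proposals.items():
        if len(set(by_site.values())) == 1:
            # Only one distinct proposal: accept it into the new version.
            merged[cid] = next(iter(by_site.values()))
        else:
            # Sites disagree: keep the base definition and flag for review.
            conflicts[cid] = by_site
    return merged, conflicts
```

Conflicting concepts stay at their base definition until people resolve them, after which the merged version is what gets disseminated to every site.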
One of the points I am wanting to make here is not that the distributed development methodology is somehow unique or proprietary to the three companies that participated with us in this project, but in fact, it could be the basis for further standards development, and different vendors could create their own solutions, their own editing environments, their own configuration environments that are compatible with this environment.
I do have a document here that is probably the most comprehensive view. It is several hundred pages of some of the background that we have had, which I would like to leave with you for entering into the record. I can give you a PDF version of that, if you wish to put it on the website. I am certainly not going to try and recite the document today.
One of the results of going through this distributed development process is that we were able to improve the scientific validity of the content that we were working on. Again, the foundation of our original work was SNOMED.
Here is just a simple example of one of the things that comes up. SNOMED had a term called cellulitis of the skin with lymphangitis. We had primary care physicians working on the terminology. They looked at this for awhile and they said, we think cellulitis of the skin with lymphangitis should be defined as an infection of the skin and subcutaneous tissues with lymphangitis, and it affects the skin.
Other people had a more anatomic pathologist's view, shall we say, of cellulitis, and they defined it differently. They said it is an infection of the skin and subcutaneous tissues. The associated topo is the subcutaneous tissue and the lymphatic vessel.
The real sticking point that they had was, the primary care docs felt that cellulitis was a feature that could be diagnosed on inspection of the skin that they would palpate, recognize induration, inflammation, and make a clinical diagnosis of cellulitis. The pathologists said, all of the features of cellulitis are visible below the dermis. So here you have two different perspectives trying to be brought together, so that we can reconcile that terminology to meet the needs of all of the users, not just a particular set of users.
So I think in this case, the distributed development methodology, the ability to work at scale, where we can have multiple developers participating in the process may not be the most efficient way to develop a terminology, but I think it is the most appropriate way to develop a terminology that is of general purpose.
I think that is one of the things that Jim commented on with regard to LOINC, that one of the reasons that it worked was that it got all the people in the room together that had a stake in there -- the end users, the laboratory representatives, to try and come up with the right ways of representing it and going forward.
One of the advantages or initial benefits that LOINC had was that the domain they were working on was relatively small with regards to thousands of concepts, whereas here we are working in an environment that has in excess of 100,000 and probably soon to be 200,000 concepts that does require significant infrastructure to be able to manage that.
MR. BLAIR: Keith, for the purpose of keeping in time, I don't want to cut short what you are saying. Just a few more minutes.
DR. CAMPBELL: Sure. One of the other results was, we had scalable infrastructure and process control. This is just a diagram of some of the systems that we have implemented. We do have a master site at Kaiser Permanente in Oakland, we have another master site at the College of American Pathologists in Chicago. We are able to synchronize our databases on an hourly basis. In addition to those, those master sites can have other sites that replicate off of them, one in Portland, one in Colorado, another one at Oregon Health Science University, and hopefully one soon at the National Health Service in the U.K. Also, one of the results is that the system is well maintained.
Just to summarize, I would like to say that I hope that the CMT project has demonstrated that collaborative distributed development is a viable option for trying to solve some of our contemporary terminology problems. This methodology is publicly described and commercially available, and validated distributed development methodology can overcome some of these problems in the other systems, such as NDC and LOINC.
The recommendations I would have is that the government should work to facilitate collaborations, the government should try to collaborate with itself and with industry. I think the example with the NDC codes is particularly prominent. We should have a shared infrastructure so that HCFA, the VA and the FDA can share work on the same database for NDC codes. But we need to invest in infrastructure, and the terminology efforts of other governments should meet uniform standards of data representation, configuration management and reusable tools and processes.
Just to remind you, as we implement these recommendations, we have to make sure that we meet these strategic imperatives of being vendor neutral, scientifically valid, well maintained, self sustaining and have scalable infrastructure and process control.
That concludes my remarks. Thank you.
MR. BLAIR: Thank you very much, Keith. The next presenter is Mark Tuttle, Lexical Technologies. Mark, are you ready? Then right after that we'll have questions.
MR. TUTTLE: Hello out there in Internet land. I always wonder if anybody is out there listening. I guess we'll find out later.
Given the three previous speakers are the all-star hit parade on terminology, I have decided to shift the focus to what are we going to do about this, and give my recommendations to the committee on where we should go with terminology in terms of facilitating the electronic medical record.
I am completely serious about wanting you to think about the way that airmail influenced air travel as a way that terminology might influence the electronic medical record.
We have got a couple of problems here. In the '20s, air travel was primitive, disorganized and chaotic. If you remember the Jimmy Stewart movie or Lindbergh biography or whatever else you read about, Charles Lindbergh spent a year flying the mail from St. Louis to Chicago. If the weather was bad or he had engine failure or had to land, he got on the train.
But nevertheless, the government paid regardless, whether there was one letter, whether there were 50 letters, whether he had to go on the train or whatever. If they had too many engine failures, his company was going to lose the contract.
In the next decade, we are still looking at the beginning of the decade of the same primitive disorganized chaotic situation with the electronic medical record. This is why this committee is meeting and why we are here today.
A couple of notable exceptions. There are what are called EFC's, electronic filing cabinet versions of electronic medical records. Two of these happen to run at Harvard Beth Israel Hospital and Kaiser Northwest. I'll get to defining what I mean by electronic filing cabinet.
So in both cases, there are few standards. Limited economies of scale. The public is poorly served.
Solutions. The government paid for airmail on the margin. In other words, the government didn't pay for all airlines in the United States in the '20s; it gave the airlines mail contracts that typically allowed them to stay in business. I am proposing that the government pay for electronic medical record results, with the emphasis on results, again on the margin. The public we hope got air travel, and we hope it gets electronic medical records. So I want to basically now talk about how and when such a thing might happen.
Where are we in 1999? Most electronic medical records that have ever been built have failed. You figure that most airlines that have ever been started have failed; maybe that's okay, maybe this is a learning process and this is what we have to go through.
Some electronic filing cabinets flourish, so let's define this again. The electronic filing cabinet to me -- I'm not a database person, remember -- is something where you can retrieve the information typically by a single key, like a patient identifier. You can put the stuff in and you can get it out, but it is a blob of stuff retrievable by the patient identifier. Within that blob, it is arranged typically by date, in chronological order, and maybe by category, like lab, Xrays, notes, problem list, whatever.
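The electronic filing cabinet as defined here can be sketched as a simple keyed store. This is a minimal illustration only; the class and method names are hypothetical, and document content is modeled as an opaque string that the system stores and returns without interpreting it.

```python
from collections import defaultdict
from datetime import date
from typing import List, Optional, Tuple

Document = Tuple[date, str, str]  # (date filed, category, free text)


class ElectronicFilingCabinet:
    """Documents are retrievable only by patient identifier."""

    def __init__(self) -> None:
        self._drawers = defaultdict(list)

    def file(self, patient_id: str, when: date, category: str, text: str) -> None:
        """Put a document in the patient's drawer."""
        self._drawers[patient_id].append((when, category, text))

    def retrieve(self, patient_id: str, category: Optional[str] = None) -> List[Document]:
        """Return the patient's documents in chronological order,
        optionally limited to one category such as "lab" or "notes"."""
        documents = self._drawers.get(patient_id, [])
        if category is not None:
            documents = [doc for doc in documents if doc[1] == category]
        return sorted(documents, key=lambda doc: doc[0])
```

The free text is never parsed, so the store can hand a clinician everything about one patient but has no way to compare the content of two documents.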
Where these things flourish, physicians like them because they are available 24 hours a day, seven days a week. They can be accessible from home or anywhere in a hospital or an office, and they are embraced by users. But you all know the problems, which is again why we are here today: the data in these systems is not comparable. We can't ask the question computationally, are there any similarities between this patient and that patient, about what happened to a given patient last year and this year. It is just not an answerable question.
Furthermore, what comparability does exist is not sustainable in the ways that you heard from Jim Cimino and Chris Chute this morning, and it is not scalable in the ways you just heard from Keith Campbell. In fact, these things are highly idiosyncratic. As Keith pointed out, often to make things work locally, they have to be idiosyncratic, so we end up with something that is not going to be used by anyone else anywhere else.
So we have few opportunities for economies of scale. Everything is expensive and difficult. The existing opportunities for economies of scale are not exploited. So for instance, if every physician in the United States had access to an electronic filing cabinet, we would be farther ahead than we are now. We would be talking about the engineering problems of semantic normalization of these things. As it is now, most physicians don't have 24 hour a day, seven day a week access from anywhere to electronic medical records, even if a computer can't understand them.
So my point is that the public is poorly served. So again, let's go back to air travel in the 1920s. Local efforts were sustained through airmail contracts. The U.S. government in the '20s couldn't go to a single company and say we want to have airmail in the United States, or we want to communicate with airmail service around the world. It didn't exist. There were only local mom and pop airlines of varying sizes around the country, and the post office gave various of them on some criteria airmail contracts.
Note that the government did not pay for the planes, the airport, the pilot training, unless you consider military service the training of pilots the government paid for, or passengers. If these airlines wanted to take passengers, that was okay, that was the airlines' business. But the point was that the U.S. government contracted to carry the leather bag with the lock on it with the onionskin letters in it.
Competition, standards and regulation were incremental and reactive. Even in just the decade of the '20s, if you had too many engine failures, you would lose your contract. Or if someone came along and flew more often or more reliably or whatever it was.
Anyway, good things happened. At the beginning it was, anybody could fly anywhere on an everyday basis and get the mail through.
Clearly, this was a loss leader for the post office. No matter how much they charged for those letters, it wasn't going to pay for whatever they had to pay the airlines to get the mail through. But it was clearly the right thing for the post office to do this.
Finishing up with my air travel, the government paid for results on the margin, and it regulated things like what could be airmail. So again, early airmail was highly constrained. It preserved privacy, which is a very important attribute of the post office. The Founding Fathers recognized it. Contrary to mail in Europe, it was a bad thing if the government read your mail. The post office had a tradition with this. It is clearly one of the things that this committee has had to spend a lot of time on. It adapted rapidly. So by 1930, airmail was completely different than in 1920.
The government did not cancel the railroad contracts to carry the mail. This is Keith's point about risk management. Of the railroad contracts to carry mail, probably a few of them still exist today. The bulk of the mail, 99 percent of it, went on the railroad for a long time, and it would have been silly to cancel those things.
So let's look at electronic medical records in the '90s. It is going to look something like this. We are going to have an electronic medical record of some kind. Claims are going to be submitted to the government, and the usual required stuff is going to be in that claim, and you have already started filling out what these requirements are going to be.
I am proposing that we just have an optional add-on that is the comparable stuff, this is sort of going to be the airmail. So for the usual stuff, let's say the government pays X dollars in reimbursement. I am proposing that the government pay a very small fraction of X for comparable good stuff that would come with it. In other words, clinical descriptions that would fit all the criteria that you have heard Jim and Chris and Keith talk about.
So again, we are trying to finance this on the margin. People are already going to be sending stuff to the government in electronic form. We are just looking at an optional add-on. If the optional add-on meets certain criteria which we will get to in a moment, the government gives you a little incentive.
The usual stuff is the railroads, the optional stuff is the airlines. The railroads are predictable and reliable and not risky, and the airlines have the advantage of rapid evolvability. You can take more risks with the airlines if you know that the railroads are still going to be running. It is like when the weather was bad, Lindbergh got on the train.
How and when would we do this? The point is again on the railroad analogy, to do the doable with reimbursement claims attachment. You guys have got a tough job, you have my sympathy, but you are going to do whatever you can do, and that is why the NDC codes are already in there. That is the doable. It was very interesting, reading some of the criticisms of that on the Web that we researched before we put this talk together.
We want to pay on the margin for comparable results. Once we get these results, then the government is in a position to assess the value of comparability. Basically, it is a hypothesis that comparable clinical descriptions are going to be worth the public's while to pay for. We all believe that, we just don't know how much.
As you have heard in three different talks, comparability requires terminology that has certain properties. It has other things, too, for instance, some standards for lab values, the numbers that are in the lab tests. But anyway, there are things that are required for comparability, and for the point of this talk, terminology is the nut that we have to crack first.
Why the government should help. As you have just heard this morning, I think we know what to do intellectually, more or less. It is clear that we are stuck. As we sit in this room, the trajectory towards having any national degree of patient comparability is pretty small, the slope of that curve. It is like air travel in the beginning of the 1920s.
If we had marginal subsidies, it would level the playing field for everyone who is a resource, whether they are in academia, government, the private sector, whether they work for nothing, whether they are expensive, whatever. Again, the government didn't pay rich airlines more to carry the mail than poor airlines, they just paid whatever it was.
We claim more comparability would benefit the public. We don't know how much and we would like to know. The government should spread the marginal dollars among the players. So the terminology providers, as all three speakers have focused on, they need to get some of this. Otherwise, the terminologies are not going to be sustained.
The comparable description suppliers need to get a subsidy too, because there is no financial incentive in the short term right now to do this. In order for the public to get its money's worth out of this, there need to be people who analyze these descriptions.
The point is that we are trying to competitively satisfy government criteria, whatever the government, whether this committee or any other committee, decides is the right thing.
So to go back to this previous diagram again, let's say, how would the terminology providers get subsidized? The answer is that some of the fraction, the minuscule fraction of X -- again, if a normal reimbursement is X dollars the government pays the enterprise for some care, and a very small fraction of X goes back to the enterprise for giving the government some comparable descriptions to go along with that, some of that has to go to the terminology providers, whatever it is.
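The arithmetic of paying on the margin can be written out as a small sketch. The function name and the numbers used (a one percent add-on, a quarter of it passed to the terminology provider) are assumptions chosen only for illustration; the testimony gives no figures beyond "a very small fraction of X".

```python
from fractions import Fraction


def marginal_payment(base_reimbursement, addon_fraction, terminology_share, qualifies):
    """Split the optional add-on between enterprise and terminology provider.

    Returns (enterprise_addon, terminology_provider_addon). The base
    reimbursement X is paid regardless and is not part of the result.
    """
    if not qualifies:
        return Fraction(0), Fraction(0)
    addon = Fraction(base_reimbursement) * addon_fraction
    to_terminology = addon * terminology_share
    return addon - to_terminology, to_terminology


# Hypothetical numbers: X = 1000 dollars, add-on = 1% of X, 25% of the add-on
# goes to the terminology provider.
enterprise, provider = marginal_payment(1000, Fraction(1, 100), Fraction(1, 4), True)
```

With these assumed numbers the add-on is 10 dollars on a 1000 dollar claim, of which 7.50 stays with the enterprise and 2.50 goes to the terminology provider; a claim whose add-on does not meet the criteria earns nothing extra.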
To qualify for these, the government should set up a number of hurdles. Initially in the airlines, it was, could you fly every day, assuming the weather was okay. And of course as the '20s went on, the hurdles got higher. Could you fly every day and get there in a certain amount of time, could you carry passengers, whatever.
So in order to qualify for the dollars, I am suggesting that the terminology providers have got to put up a terminology server that is optionally mirrored for the government, so that the government can't use the terminology for whatever it wants to, but it can certainly use it for what it is paying for, namely, to get these comparable descriptions.
This terminology server has got to be the authoritative archive of all the versions of the terminology since the start of the subsidy. Otherwise, the public isn't going to get its money's worth. It should be on the Web for all licensees. In other words, the terminology people are going to be in the business of selling terminology for whatever consideration. The government should say, you have got to be on the Web so that the entry cost is low for anybody who is trying to license your terminology to prevent the cream skimming.
Obvious things, like it has got to support class based queries like this group of diseases or that group of diseases or lab tests or whatever it is. It has to support aggregation.
The server has to supply changes to the terminology as, say, XML transactions, or in whatever standard makes sense in the future. Most importantly, the terminology server has got to support some notion of longitudinal query, what Jim was complaining about this morning, that certain terminologies don't do today. So the server -- again, this is just a terminology server, it doesn't have any patient information in it, and this is not perfectly specified today, but pretty well specified -- has got to have some way. I want to do a query over three years, and I know the terminology changed in this time. I want to do it either from the point of view of three years ago or from the point of view of today or something; this server has got to tell me how to do it, and it is up to the government to specify the hurdle, how high that hurdle is to get over.
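The hurdles listed for the terminology server (an archive of all versions, class based queries, and support for longitudinal queries) can be sketched in a few lines. This is a toy under stated assumptions: the class, its methods, the version labels and the codes are all invented, and a real server would need far more than a parent table and a replacement table.

```python
class TerminologyServer:
    """Toy in-memory archive of every released version of one terminology."""

    def __init__(self):
        self._versions = {}     # version -> {code: parent code or None}
        self._replaced_by = {}  # (version, code) -> code in the next version
        self._order = []        # versions in release order

    def publish(self, version, hierarchy, replaced=None):
        """Archive a new version; `replaced` maps retired codes to successors."""
        self._versions[version] = dict(hierarchy)
        self._order.append(version)
        if replaced:
            if len(self._order) < 2:
                raise ValueError("the first version cannot replace anything")
            previous = self._order[-2]
            for old_code, new_code in replaced.items():
                self._replaced_by[(previous, old_code)] = new_code

    def in_class(self, version, code, class_code):
        """Class based query: is `code` the class itself or a descendant of it?"""
        parents = self._versions[version]
        while code is not None:
            if code == class_code:
                return True
            code = parents.get(code)
        return False

    def translate(self, code, from_version, to_version):
        """Longitudinal support: carry a code forward to a later version."""
        start = self._order.index(from_version)
        end = self._order.index(to_version)
        if end < start:
            raise ValueError("this sketch only translates forward in time")
        for version in self._order[start:end]:
            code = self._replaced_by.get((version, code), code)
        return code


server = TerminologyServer()
server.publish("1997", {"HEART": None, "MI": "HEART", "ANGINA": "HEART"})
server.publish("1999", {"HEART": None, "AMI": "HEART", "ANGINA": "HEART"},
               replaced={"MI": "AMI"})
```

A query written from today's point of view would first call `translate` on each old code and then ask `in_class` against the current version; a query from the point of view of three years ago would need the reverse mapping, which this sketch deliberately leaves out.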
So let's look at the comparable description providers, namely, the enterprises that are going to try to send in this optional comparable stuff to the government, along with their claims attachment. For them to qualify, they have to put up a description server. In other words, when they send a claim to the government, it has got to go into a data warehouse they keep. Obviously, this has to be highly secure, for the reasons that you know only too well. Otherwise, we lose the public's confidence and Congress is going to be upset, and it is never going to work.
Again, and this is a delicate political point -- I wish Braithwaite were here to talk about this -- if the government is going to pay for it, this server should be optionally mirrored at a site of the government's choosing, a delicate point. Somebody has got to pay for this, because it has got to be secure, it has got to be current and so on.
An incentive here is for the government to say, the server should cover the largest patient population for which the patient normalization problem is solved. In other words, do you have a master patient index or something, because apparently we are not going to have a national patient unique identifier for a while.
It has got to support class based queries, just like the terminology server. It has got to support longitudinal queries, just like the terminology server. In fact, the server may choose to do it by sending a query off to the terminology server to get that job done. And it has got to be current, so that if the terminology world moves on and there is terminology in this warehouse, then either the queries have to be current or the terminology itself has to be current. And it has got to be sustained across all relevant terminology versions.
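The description server's side of this can be sketched the same way. The two lookup tables below stand in for answers a terminology server might return; they, the codes, the patient identifiers and the class name are all hypothetical. The point illustrated is only that each stored description keeps the terminology version it was coded in, so that a query over several years can be answered in current terms.

```python
from datetime import date

# Hypothetical answers from a terminology server (all codes invented).
FORWARD_MAP = {("1997", "MI"): "AMI"}  # (old version, old code) -> current code
CURRENT_PARENT = {"AMI": "HEART", "ANGINA": "HEART", "HEART": None, "FLU": None}


def to_current(version, code):
    """Bring a code recorded under an older version up to the current one."""
    return FORWARD_MAP.get((version, code), code)


def in_class(code, class_code):
    """Class based test against the current hierarchy."""
    while code is not None:
        if code == class_code:
            return True
        code = CURRENT_PARENT.get(code)
    return False


class DescriptionServer:
    """Toy warehouse of coded descriptions, each tagged with its version."""

    def __init__(self):
        self._rows = []  # (patient_id, date, terminology_version, code)

    def add(self, patient_id, when, version, code):
        self._rows.append((patient_id, when, version, code))

    def patients_in_class(self, class_code, start, end):
        """Longitudinal, class based query over [start, end], in current terms."""
        hits = set()
        for patient_id, when, version, code in self._rows:
            if start <= when <= end and in_class(to_current(version, code), class_code):
                hits.add(patient_id)
        return sorted(hits)


warehouse = DescriptionServer()
warehouse.add("p1", date(1997, 5, 1), "1997", "MI")
warehouse.add("p2", date(1999, 2, 1), "1999", "ANGINA")
warehouse.add("p3", date(1998, 1, 1), "1997", "FLU")
```

Here the queries are kept current rather than the stored data: old rows are left as recorded and translated at query time, which is one of the two options mentioned above.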
So we set up a list of hurdles again, so that if you are going to get paid on the margin for this optional comparable stuff, those are the criteria that you have to meet.
Finally, we are going to have description analyzers, either inside or outside the government. They are going to need to be able to get at all this stuff, so they are going to need to be able to build warehouses also that work at web scale, and again have to satisfy all these very strict requirements on confidentiality.
But basically, their job is to do the doable. If they are going to try to synthesize information from more than one site, because that is how they won their contract from the government, it has got to be optionally mirrored, and their job is to help the government focus on opportunities and needs for incremental improvements, whatever those are, and their real job is to try to quantify what is it worth to American health care to have comparable patient descriptions.
A plan. How would we ever do this? First we define what the early objectives are. We have to budget for the seed dollars remember, to level the playing field here. These amount to pre-pays. So basically we have three groups of talented people, the terminology maintainers, the description creators and the analyzers, all of whom are going to need seed money to get started, and that is essentially prepaid towards whatever incremental pay they are going to get for supplying the value that they do. We define a competition amongst them, and the winners get subsidized by the government, again on the margin.
If somebody put me in charge of this, the first thing I would do is, I would put Bill Braithwaite in charge of this. As many of you know, he worked on trying to analyze data at UCSF, I won't even mention how long ago. That was his doctoral dissertation. I think he is probably one of the best qualified in our field to make something like this work.
We need to make the evaluation of all this ongoing, obviously, change it as fast as possible. The whole point is that through communications standards focused on pluralism, we try to bootstrap the thing along, and right now there is no bootstrapping, very little communication, few standards, no focus. We have so much pluralism that you can't really call it pluralism usefully.
In some sense, it is a giant clinical trial, which is going to be a rolling set of objectives.
Budget. What am I thinking about here? We need to see these industrial strength servers. By industrial strength server, I don't mean a PC in a closet. I mean something that has an uninterruptible power supply, redundant web connections, maybe it is mirrored in some web farm somewhere, but in any case, the whole point is that it doesn't have to run fast, but it should be up all the time and fail as little as possible.
MR. BLAIR: Mark, can I give you a five minute alert so we will still have some time for questions?
MR. TUTTLE: All right, just about done. I am thinking that a half a million dollars paid to each of these things would fund the creation of these things, would get them started. These are in prepays, just my estimate.
In some sense, I think the organizations that would do this would put far more value than this into it, because it would reflect their mission in any case, and it would be worth it for them to put more resources into it than this. Then the government would decide how many of these things to fund in any given year, or new each year, or whatever.
Obviously, they would be awarded on merit, sustained on the margin, adjusted on value, meaning if we find out that there is more value in this kind of description -- like when Keith and I were talking about the NDC stuff, a stereotype is that drugs are a major rathole for health care money right now. We'd like to prove that or not, decide if it is manageable or not, and go on.
There would be some sunset provision, so that Congress doesn't get upset about this; you could set some time limit that the government support should expire or be revised. Again, the hypothesis here is that the only reason to do this is because we think the benefit is many, many times the investment.
If you only remember one thing, again, I am proposing to marginally subsidize existing and emerging efforts to achieve economies of scale in the public's behalf. Help us get unstuck, and I predict if we did this, in 10 years the optional part would dominate the part that everyone is submitting now.
That's it.
MR. BLAIR: Could I invite questions both for Mark as well as for Keith? I think, Carol Bickford, you had indicated that you had a comment or question?
MS. BICKFORD: Carol Bickford from the American Nurses Association. When I first began this session, I thought I understood what some of the language meant in relation to classifications and terminologies and nomenclatures. But as the morning has progressed, I have more confusion. So I would like the committee or the work group to begin establishing some standards on the ground that this is the common language that you use to help those of us who are not living in this environment understand what you are talking about.
MR. BLAIR: Could I maybe direct that question especially to Chris Chute. What she is looking for is some kind of definition between classification systems, nomenclatures and terminologies, just to help clarify some of the things that she is hearing.
DR. CHUTE: You and everybody else. The analogy is the cobbler's children have no shoes. The terminology used among terminology developers is, how do we phrase this, idiosyncratic.
In recognition of that, ISO working group three is promoting a meta standard which would be a meta terminology about terminology. That is a work project headed up by Angelo Rossi Mori in Italy, and I think will come forward with what we hope to be the definitive meta terminology, to give some clarity and consistency to these issues that are bandied about, as you have correctly pointed out.
DR. CAMPBELL: If I could very quickly point out, the challenge is as much getting people to use the terminology that has been consistently agreed to as it is to agree to it, because there are standards around the differences that people don't adhere to.
MR. BLAIR: Simon, do you have another question?
DR. COHN: Yes, I actually had a question. I'd just make one final comment about the issue of what terminology is versus what codes and nomenclatures and classifications are.
There is actually a document that was produced a couple of years ago that Chris commented on, the framework document, that at least posits one man's consensus view of how some of this should be described. The view was that the terminology was the overall term that represents and includes nomenclatures and classifications, code sets, things like that.
So I think this will be further refined, but in my simplistic view, I tend to think of it that way.
I actually had a question for the speakers, and specifically for Mark Tuttle. First of all, I wanted to thank you both for what was really a wonderful set of presentations.
Mark, I didn't want to let you get out of here without asking you about something relatively different from what you were proposing. This has to do with your experience with the UMLS and what I think have been probably some of the more aggressive attempts to model terminologies one to another, than probably has ever been attempted, which is what I see a lot of the UMLS is.
Certainly, one of the discussions I think we all have at least as we move into this area is, we are talking about terminologies to meet the needs of health care. We talk about that in a plural sort of term. Usually, we also recognize that there are a lot of overlaps on the edges of any of these terminologies, both in terms of granularity and domains and all of that. We all feel that there are ways to map them all and to make them all comparable, so that you can move back and forth between them to do all the things that we have all been talking about this morning.
Now, I am asking you as someone who has been in this deep level for a couple of years, is that true? What are the requirements around that? You can answer now, and also give us some written information, if you would like to follow up.
MR. TUTTLE: I think you are asking me if the Metathesaurus is the answer. I think the only way that we are going to know is to do something on the scale that I proposed.
In other words, I can argue intellectually that the Metathesaurus creates relationships between terms, and that these are invaluable and shouldn't be re-invented. But having said that, the uses to which coding systems are applied are not universally reconcilable yet.
So on the one hand, we can take the names and coding systems and try to decide if they are synonyms or relatives of other terms. But the Metathesaurus does not yet explicitly represent the meaning of codes.
So a simple example, to borrow Jim's now infamous example, if a terminology says some term, NOS, and somewhere else in the same terminology it uses the same term without the NOS, the Metathesaurus says they are the same concept. So from a term point of view, it is probably the appropriate thing to do. From the code point of view, obviously it is not.
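The difference between the term point of view and the code point of view can be shown with a toy sketch. It is not a description of how the Metathesaurus is built; the normalization rule, the codes and the names below are all invented for illustration.

```python
def term_key(name):
    """Term-level normalization: lowercase and drop a trailing 'NOS' qualifier."""
    words = name.replace(",", " ").lower().split()
    if words and words[-1] == "nos":
        words = words[:-1]
    return " ".join(words)


# Hypothetical entries from one terminology: (code, name).
entries = [("X10.9", "Anemia, NOS"), ("X10", "Anemia")]

# Group codes by normalized term, the way a term-oriented view would.
codes_by_term = {}
for code, name in entries:
    codes_by_term.setdefault(term_key(name), []).append(code)
```

Both names collapse to the same key, so the term view sees one concept, yet the two codes remain distinct entries in the source terminology. In this invented example `X10.9` is the residual "not otherwise specified" slot, whose meaning depends on what else sits beside it, and a claim coded with it does not say the same thing as a claim coded `X10`.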
So we are left with a kind of an emergence of complex tasks that come out when we try to take all these terminologies and put them together. Having said that, it would be silly to ignore the utility and work that has gone into the Metathesaurus, because one of the things that it can do is, especially for humans, help people navigate amongst all these different terminologies.
Now, one of the reasons that I proposed the plan that I proposed this morning is that the real test of all this is whether computer programs can do such things. If we had people sending in comparable descriptions in a variety of terminologies, some of which may be overlapping, let's suppose that there is more than one standard. There is no way that HIPAA can guarantee that you can give a list of codes which are not redundant. This is something that you have written about extensively on your website. If that is true, is the Metathesaurus going to be sufficient to sort out the overlap, so that when one description comes in in one terminology and another description comes in in another one, and the Metathesaurus said they should or shouldn't be joined or should or shouldn't be related, we would never want to ignore that information, but the question is, is it sufficient.
I don't know the answer. I think we need to test that. Does that start to answer your question or not?
DR. COHN: It begins to. If you have other written information you want to submit, we would appreciate it.
MR. BLAIR: We have a question from Bob Mayes here.
MR. MAYES: Actually, I just want to make a real brief comment. I agree completely with Mark's idea of marginality. I would however point out that you have to be sure of what margin you are putting it in. The government is not monolithic. The government rarely if ever funds infrastructure purely for infrastructure's sake; it is always to support a today project. Most government agencies are actually quite like most corporations, in that they have a very, very short strategic time frame that they work under.
So a lot more effort I think needs to be done in identifying appropriate types of government activities by specific agencies, and then going after those margins specifically. I get a lot of vendors who come in and talk to me for a couple of hours. I can remember once when somebody came and talked to me about terminology and how I might be able to use that within the Health Care Financing Administration.
Just to put a little point of reality here, it is useful to realize that the Congress has budgeted not a dime for HIPAA standards activities. That is all being picked up out of existing operational budgets in various agencies. So we just need to keep some reality here, as to how the government funds things.
MR. TUTTLE: Can you tell me what fraction -- is it 50 percent of the current American health care budget that is paid for by the government? Is that what it is?
MR. MAYES: HCFA pays 30 percent, HCFA alone, so you could add on top of that. I'm not arguing that there are lots of ways that we could be doing this. I continuously look at my own operational requirements to see how I can incorporate these. But it is difficult from the outside sometimes to pick what margin to put these things in.
MR. BLAIR: Let me just get in the last two questions. We have both Clem McDonald, and then someone who is standing at the microphone. So could we try to catch those two?
MR. MC DONALD: This is really the focus that has been on much of the discussion, but you mentioned, Keith, the free volunteer versus the cost, and I don't dispute any of that.
You brought us something quite interesting in terms of both NDC and in terms of LOINC, that is, go to the original source and vendors of these things and get them to volunteer. There is a volunteerism in that, but it is self-interested volunteering. They have two categories of terminology vocabulary. One of them are those that are based completely on artifacts of humans, like instrument based things. What comes out of that is what they say is going to come out of it, drugs that get manufactured.
I think I would like to see a focus even more on how we can get focus on that, that the artifact generators would make requests for formalism. The difference in the NDC right now is that they just do it all on their own, and you have to go back to this comma so you have a cross classification.
We could do an awful lot, and actually, artifacts are becoming increasingly the dominant source of information, if you project some of the DNA realities. We may not need as much of this talk stuff that is harder to classify and make terms of.
MR. BLAIR: I was just advised that apparently there is no place to eat in this facility, which means we go out. We tend to compete with the lunch crowd, so forgive me if we restrict it to one last comment, so we can scramble to lunch.
DR. CAMPBELL: My quick response is that I think that the infrastructure that supports large groups of paid people contributing content is the same infrastructure that supports large groups of volunteer people. The predominant concern I was trying to press is that we need infrastructure, and that that infrastructure is reusable and therefore could