[This Transcript is Unedited]

Department of Health and Human Services

National Committee on

 Vital and Health Statistics

Full Committee

January 10, 2018

Hubert H. Humphrey Building
200 Independence Ave., SW
Washington, D.C. 


Welcome – William Stead

UMLS Update and Discussion – Jerry Sheehan

ICD-11 Update and Discussion – Donna Pickett

Health terminologies and Vocabularies – Linda Kloss, Vivian Auld, Suzy Roy

Briefing on Fast Healthcare Interoperability Resources (FHIR) – Paula Braun

Health Data Landscape – Nancy Potok

NCVHS 2018 Review of Next Steps – Chair

Public Comment – Rebecca Hines

P R O C E E D I N G S      (8:30 a.m.)

Agenda Item: Welcome

DR. STEAD: Welcome back to day two. Welcome, also, our guests joining us from the NLM. Let’s start with the roll call, starting with the members.

MS. LOVE: Denise Love, National Association of Health Data Organizations, member of the Full Committee and member of Population Health and Standards Subcommittee, no conflicts.

MR. LANDEN: Rich Landen, member of the Full Committee, Standards Subcommittee, no conflicts.

DR. CORNELIUS: Good morning. Lee Cornelius. Member of the Full Committee, Population Health Subcommittee, no conflicts.

MR. COUSSOULE: Nick Coussoule, BlueCross BlueShield of Tennessee, co-chair of the Standard Subcommittee, member of the Full Committee and Privacy ad Confidentiality, Security Subcommittee, and I have no conflicts.

MS. GOSS: Good morning. Alix Goss with Imprado. I am a member of the Full Committee, Executive Committee, and co-chair of the Standards Subcommittee.

DR. STEAD: Bill Stead, Vanderbilt University, chair of the Full Committee, no conflicts.

MS. KLOSS: Linda Kloss, member of the Full Committee, co-chair of the Privacy, Confidentiality and Security Subcommittee, member of the Standards Subcommittee and no conflicts.

DR. PHILLIPS Bob Phillips, American Board of Family Medicine, member of the Full Committee, and co-chair of the Population Health Subcommittee, no conflicts.

DR. COHEN: Bruce Cohen, Massachusetts, member of the Full Committee, member of the Population Health Subcommittee, no conflicts.

MS. MONSON: Good morning. Jacki Monson, Sutter Health, member of the Full Committee and member of the Subcommittee of Privacy, Confidentiality and Security and no conflicts.

MS. STRICKLAND: Deb Strickland, CONDUENT, member of the Full Committee, member of the Population Health Subcommittee and the Standards Subcommittee, no conflicts.

DR. STEAD: Do we have a member on the phone?

DR. MAYS: Yes. Vickie Mays, University of California Los Angeles, member of the Full Committee, and member of the Pop and Privacy Committee. I have no conflicts.

(Introduction of staff and guests)

DR. STEAD: We do have a quorum. I will start just by briefly reviewing the game plan for today. We are going to start with three briefings that relate to our terminology and vocabulary work, starting with the team from the NLM sharing with us really how UMLS works to do what it does. Then, Donna Pickett will walk us back through ICD 11 with an eye to, again, thinking about what it actually takes to make these things work. We are going to have a discussion that Linda will lead around next steps in our terminology and vocabulary work followed by a briefing on the Fire Interoperability Platform, again, targeted at how all the pieces work together. Then we will have lunch, followed by a briefing from the chief statistician in our – around our health data landscape work. We will close with next steps and public comment.

That is the game plan. Does anybody have questions or suggestions about the agenda? Are we good to go? Okay.

Well, then let me introduce Jerry Sheehan. Jerry is the new Deputy – not so new anymore, since July – Director of the National Library of Medicine. He came into that role with Betsy Humphries retirement. You will recall that Betsy briefed us earlier. He – early roots within NCVHS because he was part of the study that Paul Clayton chaired with the Computer Science/Telecommunication Board on For the Record and was part of reporting that out to this group. He has been part of the, until last January, the office – for the White House – Office of Science and Technology Policy as an assistant director for scientific data and information. He is very interested in the intersection of data access and the intersection of clinical data, other kinds of data, and vocabulary. So, Jerry, welcome and thank you.

Agenda Item: UMLS Update and Discussion

MR. SHEEHAN: Thank you Bill. I want to thank you and the NCVHS for inviting me here this morning and my NLM colleagues, here, who are getting a little bit more time on the agenda than we might normally have at an NCVHS meeting. I know this is, in fact, the second meeting in a row at which you have had some presentations from NLM on health data standards and vocabularies and terminologies. I think that is reflective of the fact that we continue to have very strong commitment to that work at NLM, acting as the central coordinating unit for HHS and the Department on these terminologies.

As Bill mentioned, my first introduction to the NCVHS and almost by extension to HHS at the time, was almost 20 years ago, as the study on For the Record: Protecting Health Information was released and had some recommendations around health – not health data standards, but some security and privacy standards. It has probably been almost about that much time since I have been back here with the NCVHS, but I have been watching the work of this group very closely in the intervening years as I have been working around issues of health data policy and health information and data policies, I guess would be the broader way to put it. We see the importance of this committee in helping advance that agenda, as well.

I know at the last or previous meeting, probably back in June, when Betsy Humphries and Patty Brennan, the new director of the National Library of Medicine, gave an overview of some of the standard development activities at NLM and I think highlighted some of the particular terminologies that we support and develop in a variety of different ways at NLM. We welcome the chance to come back this morning and tell you about one piece of that that I don’t think you focused on at the last meeting, which is the UMLS system, which also traces its roots back well before my first interaction with the NCVHS. I think this was a project that began largely at the instigation and under the management of Betsy Humphries and Dr. Lindberg, the director of NLM at the time, to help advance terminology development and, in particular, I think their phrasing at the time was to help computers be able to talk to each other and understand the different language – medical languages that were being used.

I think as we talk about trying to introduce new standards and terminologies into the system, we recognize that there is still not just one that we use. There are many different standards sometimes related to different types of health information and health concepts, whether that be clinical terminologies, drug names, laboratory names, or whether it be related to specific sub-areas within the health and biomedical research that also develop their own terminologies and phrasing for describing what it is that we do.

UMLS is a project that began as an attempt to provide a way of disseminating those variety of different standards. Also a way to help interconnect them so that we could move easily, seamlessly between them and understand the language and the connections between these different terminologies, helping different communities to speak to each other and, again, in an electronic way.

What we want to do today is give you insight into where the UMLS essentially stands today and the various services that it offers and some of the capabilities that can be used. I think I am going to give an opportunity for two other staff members – I think I am the only one who is listed on the agenda. What I have learned is that NLM has quite a deep set of expertise around terminologies and standards. As I am still not – I am new, but not completely new in Betsy Humphries previous role. I am becoming increasingly reliant on our good NLM staff to help bring some of that expertise to other bodies.

I am going to – we are going to have two speakers for this item this morning. First, on my right, is Olivier Bodenreider from NLM, who is in our cognitive sciences branch within the Lister Hill Center at the National Library of Medicine, and Patrick McLaughlin, who is in charge of our – parts of our terminologies and user customer service branch within the library operations part of NLM.

Some of this is reflective of the fact that the UMLS started as something of an R&D project. I think always envisioned as a long-term R&D project within NLM. But an R&D project that has since become operational. Operational does not mean at all static. It is continually updated, as many of you know. It is updated, both in terms of terminologies that are available within the UMLS and it is updated in terms of its capabilities and it is updated in terms of other systems and services that it can provide. Between Olivier and Patrick, they will give you an overview of those current services and our current vision for where NLM goes.

I think the final thing I will say before passing the microphone over to Olivier is even as – you probably heard it the last time NLM was presenting here from Patricia Brennan. We are – we were then in the midst of a strategic planning exercise at the National Library of Medicine, which is now wrapping up. We hope to have our strategic plan published early this year, probably in the middle of next month when our Board of Regions will meet.

Three of the main themes that will be highlighted in that report reflecting growing interest around areas such as data science across the biomedical research community: We will continue to be NLM, the National Library of Medicine, as a platform for discovery. That means discovery and enabling discovery across a wide variety of information and data resources. So, as a key, central portion of that capability is the standards development activity because we need to ensure that the data, the information we make accessible is understandable, is conformant with standards, is interoperable in different ways so we can compute across different data sets in the research community, and so that we can continue to share information among the clinical care community. Of course, that will highlight the role of standards in underlying all of that capability.

The other two key pillars of the strategic plan that I will mention briefly are stakeholder engagement. This is going beyond our usual say NLM outreach and trying to make people aware of our services, but trying to engage with a wide variety of communities, public and private sectors in the care communities, the research community, vendor communities, and others, to understand how to better support their needs and understand the directions that they are going in the health data landscape.

The third being around human resource development and showing that the research community, the clinical care community, the library and information sciences community are well-positioned to take advantage of the increasing volume of data and information resources around them. Again, part of that is ensuring that they are well aware of and can make use of standards and terminologies that are relevant to their work.

So, I think standards will continue to be a central part of the NLM functions over the coming years. UMLS is a central part of our terminology services. So, I think we will hopefully have many opportunities to come back and present to you on what we are doing in the future. I hope this discussion today will help us inform some of the ideas that we will take forward with us as we move toward implementing our strategic plan and coming up with more specific recommendations for about how to carry our work forward.

With that as introduction, I want to hand the microphone over to Olivier. Maybe it is even easiest for us to change seats, here, so you can see the screen. We will be moving to the description of UMLS and then some of its uses. 

DR. BODENREIDER: Good morning. I am Olivier Bodenreider. I came to NLM 21 years ago. Actually, I came to NLM for post doc because I knew a little bit about the UMLS and I wanted to do more with it. I have been playing with the UMLS ever since, I guess.

So, we are going to provide you some more details about what Jerry talked about. I am going to give you pretty much the anatomy of the UMLS. Patrick is going to give you the physiology of the UMLS. I am going to talk about what it is and Patrick is going to talk about what it is for.

So, the UMLS standards for Unified Medical Language System, which says pretty much that there is such a thing as medical language, that it needs to be somewhat unified, and that doing so is going to take a system. That is the idea behind this.

It was mentioned this started in 1986 as a research and development project. What is interesting is that some of the goals were stated earlier on – early on in this project. The two main goals of the UMLS are the following: the first one is it attempts to record and collect the variety of ways in which medical concepts are expressed and put them together into a system. Without doing this, it is really hard for computers to be able to recognize medical concepts in text, for example, and make sense of them.

The other goal – the other mission that it was trying to address at the time was distribution of these standards, terminology standards. Because if you are interested in SNOMED, you can get it, but it comes in its own format, which is different from the format of the International Classification of Diseases, which is different from the format of something else. When you want to use several of these standards together, each time, you need to pretty much get the source, load it in some sort of a database, and learn the internal details of each source. The UMLS attempts to normalize, also, the presentation of the sources and facilitate their distribution and access.

DR. STEAD: I can’t resist trying to make sure certain things really get clear. So, until this time, efforts would have been to create a new vocabulary. What this is trying to do is to show the relationship of all the terms that exist. That makes it extensible. You are not picking which term you want. You are actually showing the relationship of all of them. Is that fair, Olivier?

DR. BODENREIDER: It is fair. I can actually say a little bit more about this. At the same time, in Europe, there was an attempt called the GALEN Project. It was a European Union supported project. That attempted to create the vocabulary or the ontology for biomedicine. It was ambitious. They spent four years – we are talking about the early 1990s or the mid-1990s. They spent the first four years of this project creating the formalism that would allow to represent medical knowledge properly. It was the early times of description logics. They formalized the language. By the end of the four years, they had nothing. It was a great project. Alan Rector, who led the project, is a great contributor – has been a great contributor to medical informatics. It is hard. They get another four years of funding, in which they fleshed out some of the concepts in there. The expectation was that the community would pick it up and actually create the ontology, the biomedical ontology for the world. This didn’t happen.

NLM took, under the leadership of Don Lindberg and Betsy Humphries, took a completely opposite perspective. They took the perspective of a library, which is let’s collect all of these things, let’s make them available, and let’s interconnect them as much as we can such that the use and access can be facilitated. That is a really important difference.

What I should say is that GALEN, although they did a very good job at creating something that was solid, didn’t take off – never took off. Nobody ever contributed significantly to it that it had enough content that it could be adopted in a broader way.

So, I should probably go a little bit faster because we will never make it. It was a great point. It shows some of the reasons for the success of the UMLS and its persistence through time.

So, when you get the UMLS, pretty much you get a database. That is what it was originally. Over time, we created a series of interfaces. For geeks, there are APIs. For the rest of us, there is a browser that you can – a web browser. What we called the UTS, the UMLS Terminology Services, serves actually two different functions. One is the browser to help you see the content of the UMLS. The other one is also being the gatekeeper for some of these terminologies that are – that have intellectual property restrictions. The reason we get these terminologies into the UMLS is because we accepted to protect their intellectual property. Therefore, we cannot make them publicly available. The UMLS is not publicly available. It is freely available, but it is not publicly available. Patrick is going to tell you a little bit more about the license.

I am going to skip the other two. The main idea here is that the UMLS is to be thought of as middleware – it was Betsy Humphries’ term – rather than end-user application.

In formal terms, there are three components to the UMLS. The larger component is the UMLS Metathesaurus that has all of the concepts and relations from all the terminologies. There is also Semantic Network. I will explain in one slide what it is later on. And there are lexical resources that have been built when NLM realized that medical terms are text in the first place and if you want to process medical terms, you need to know how to process text. So, we developed all of these text – not text mining, but text processing resources that are accompaniment to it. We largely use them to build the UMLS, itself. Of course, other people have found them useful and used them.

So, there are three elements to the UMLS metathesaurus and the semantic network that I want to emphasize. One is that the UMLS organizes terms, it organizes concepts, and it categorizes concepts. We are going to go through some more detail about these three aspects relatively briefly.

So, the UMLS currently contains 153 source terminologies, not counting the translations. You can see some of them, all of the main ones are there: SNOMED, RxNorm, LOINC, ICD, and ICD 10-CM for the U.S, CPT, MedDRA, the Human Phenotype Ontology, just to mention one that is not in the mainstream as much as the other ones. There are terms in 25 languages, although NLM doesn’t translate the terms. It collects translations. It doesn’t translate all of the terms. We are talking about 10 million names and 3.6 million concepts. What is important is that they all are normalized in their presentation, which is not the case when we look at the original sources.

So, if we look at Addison disease in the International Classification of Diseases – that internal classification, I had a typo here, my apologies. The Addison’s disease is characterized in ICD under primary adrenocortical insufficiency. It doesn’t even show as Addison disease here. The main rubric is primary adrenocortical insufficient. What is important is to recognize that this is the same thing as Addison disease. We can put the same – we can put these terms together and make it possible for a computer to understand it as a reflective of the same condition.

This is an example from the Medical Subject Headings, MeSH, which is used for indexing the medical literature. Here, Addison Disease is under two places: one, under adrenal insufficiency and under autoimmune disease.

In SNOMED CT, we can see how it is represented in this guise. I am going to skip this to focus on what the UMLS does to organize these terms.

So, it takes all of these terms that denote the same condition, if you wish, when it is for a condition, here, and puts them in a box, if you wish. It labels the box with two things. It gives this grouping of terms a unique identifier. That is the concept unique identifier that you have here, C000143 for Addison Disease. It picks on of the terms from the box. It doesn’t matter too much which one. It picks one and labels the box with this particular term, which we call the preferred term, but my preferred term might not be your preferred term and you can pick whichever you want.

What is important here is that NLM is not the terminology police. We don’t do curation of the sources. We are a library. We take whatever comes to us and we try to reflect the content as it comes to us. There is no curation of these terms. There is input from NLM to decide which terms of synonymous with each other, but that is pretty much where the curation stops. If some terminology says that Addison Disease is a brain disease, we are going to reflect this because they might have some good reasons to say that. Actually, yes, I am not going to get into the details of this.

So, by virtue of putting all of these terms for different terminologies in the same box and interconnecting them by – because they are in the same box, we do terminology integration. For example, as I showed, the term, Addison Disease, from MeSH, is in the same concept as the term Addison Disease from SNOMED, which makes it possible to take a SNOMED term from an electronic medical record, for example, and do a PubMed search for information about this condition after translation to MeSH. That is just an example.

By interconnecting terminologies, by integrating terminologies, what we do is that we also integrate the subdomains these terminologies are used in reference for. Just in the interest of time, I am just showing you some of these subdomains here. I mentioned clinical repositories indexed with – coded with SNOMED, the biomedical literature indexed with MeSH, the genome annotations annotated with the Gene Ontology, et cetera, et cetera. All of these terminologies are in the UMLS. Therefore, the corresponding subdomains are, de facto, interconnected or integrated.

The alternative to doing this would be to create mappings, point to point mappings between terms across terminologies. Of course, that wouldn’t be sustainable. It exists in some cases. It exists between SNOMED and ICD, for example, but we cannot create point to point mappings everywhere. It is much more economical to do mappings through a central integration authority. That is what the UMLS metathesaurus –

DR. STEAD: Do people get that? If you look at trying to get away from the brittle and – the long pathway – it really plays into some of the conversations we have had around the Predictability Roadmap. If we are attempting – if each of these things has to be maintained separately and then mapped separately, it becomes hopeless. Whereas, by simply being able to show the relationships, when any local terminology or standard curator makes it different that is reflected. You can now see those different relationships. Nobody actually has to stop and do anything differently. It is a much more – it is a much more agile approach. I think this might be a way of thinking about it.

DR. BODENREIDER: So, the point to point mappings can be derived from the mapping to the UMLS in the first place. So, that was for terms.

The UMLS also integrates concepts and their interrelations, if you wish. I am going to talk about this briefly. So, we can think of these various terminologies as being trees. Many of them are. ICD is. In other cases, they are directed acyclic graphs. When you integrate trees together, you get a graph that has all of the nodes, but has many more relations.

That is what the UMLS metathesaurus does. So, we can think of the UMLS metathesaurus as a gigantic graph with point to point arrows, edges in the graph, if you wish.

What is important to know here, also, is that the UMLS doesn’t do curation of these relations. Whatever edges are in the original terminologies, we are going to find them in the metathesaurus, even if they are conflicting. If one terminology says that A is a parent of B and another one says that B is a parent of A for different reasons, well, we are going to reflect the cycle between the two. We don’t take sides. So, the UMLS is not perfect, if you wish. The global view might not be useful for all use cases, but there are use cases for which it is better than having the view of only one vocabulary.

So, this is what Addison disease ancestors look like, if we pick SNOMED, MeSH, and ICD relations. So, this was SNOMED relations. MeSH relations are the ones in yellow. ICD brings different organization to it. The UMLS metathesaurus puts them all together and adds concept from different terminologies, including the thesaurus of the National Cancer Institute. This is, of course, a tiny view. It is only a handful of concepts related to Addison disease. Remember, the whole graph is going to include 3.4 million concepts and over 10 million relations. I couldn’t represent it on one slide.

The last thing that the UMLS does is categorize these concepts. If we want to know what Addison’s disease is a kind of – of course, it is a dumb example because it is a disease. It is in the name. If you take a protein or if you take a drug and you have no idea – when you are reading the name, you have no idea what it is. One way to figure this out is to climb up the hierarchies. If you climb up high enough, you are going to find that Addison disease is a kind of disease. You are going to find the disease concept in the ancestors.

That is a difficult process. What NLM decided to do is provide direct categorization for each concept using a small number of tags or semantic types, as we call them, from the semantic network. So, there is roughly 100 semantic types in the semantic network. They are used to categorize the 3.4 million concepts or 3.6, I think, at this point.

As an illustration of this, I am going to show you the concept “heart” in the metathesaurus with its related concepts. So, it is part of the mediastinum. It is a kind of saccular viscus. It has parts that are hard valves. Fetal heart is a special kind of heart. There are also concepts that are siblings to heart. The esophagus and the left phrenic nerves are organs or components that are in the mediastinum together with the heart. There are concepts that are related to heart for different reasons. Angina pectoris is a disease of the heart. Cardiotonic agents are drugs that act on the heart. Tissue donors are related to heart because heart is one of the organs that can be donated. That is the view of concepts that you can get from the metathesaurus by integrating all of these terminologies together.

On top of this – I am going to skip the green numbers. On top of this, there is this semantic network, which is this small network of interconnected semantic types. In between the two, there are these relations, these categorization relations that tell you, for example, that the heart and the esophagus are body part, organ, or organ components. That is the anatomy stuff, if you wish. There is a shallow hierarchy. So, body part, organ, organ components is a kind of fully formed anatomical structure is a kind of anatomical structure. That is in the semantic network. The interest of this is that if, for any reason, you want to get all of the body parts from all of the terminologies, regardless of their interrelations in the metathesaurus, you can pull the handle of the semantic type body part, organ, or organ component, and you are going to get them all across all terminologies. That is also an element of integration in the UMLS.

With that, I think that is going to be – I am going to turn it over to Patrick for what the UMLS is for.

MR. MCLAUGHLIN: Questions before I jump into the UMLS users and how it is being used? All right.

All right. So, before I get started telling you about the users and how they are using the UMLS, I wanted to just reiterate something that Oliver brought up, our UMLS Terminology Services. This is the portal with which our users interact with the data. This is how they browse the data, search the data. This includes the metathesaurus, which is the heart of the UMLS, all of the concepts Olivier was talking about, as well as SNOMED CT. We have a SNOMED CT browser that integrates some of the UMLS data to have a richer searching mechanism.

Also, as Olivier mentioned, we have APIs for programmatic access to the metathesaurus, RESTful and SOAP APIs.

It is a distribution point. You can download many of our terminology products and services. That is the UMLS. You can get the metathesaurus and the other knowledge sources, as well as RxNorm, its current prescribing subset, SNOMED CT, the international and U.S. edition, as well as all of the subsets of SNOMED, such as the Nursing Problem List.

The UTS is also used for licensing and user authentication. As Olivier mentioned, it is designed to protect the IP of the source vocabularies and code systems within the metathesaurus. It also provides access – excuse me. It provides access to these terminologies for research purposes. For some of the sources, there are additional restrictions, which are outlined in the license. We provide contact information for those source providers. So, for instance, AMA for CPT, for putting CPT data into a production system that users will interact with or any kind of for-profit mechanism, of course. Additional licensing would be required through AMA. We provide that contact information for our licensees.

The next thing is the user authentication for external and internal applications. Once you get the license, you have access to these data, but you have to then get the data from there. You an download the data one time. It only requires you to authenticate that first time. If you are interacting with any of our applications or APIs, you have to authenticate each time you are accessing those data. So, your username and password or API key would be used to do that. A lot of external applications use our user authentication mechanism. We have APIs allowing them to build that into their system. Users of their data can interact. Basically, they put in their username or password or API key with the external system. For instance, USHIK, they provide the CMS quality measures. They are exposing some of these terminologies within the metathesaurus. There is a user authentication component there. Then in a lot of our internal systems, we also have that user authentication mechanism.

DR. PHILLIPS: Who is providing the CMS quality measures?

MR. MCLAUGHLIN: USHIK is involved in that process. AHRQ. That is just one component, one example. There are many – many organizations involved with the CMS quality measure program.

As I mentioned, lots of internal applications use that user authentication system. Basically, you don’t have to jump over to the UTS and then jump back into something like VSAC or MetaMap to access these data. You can use it within the application.

Mainly, what I am going to talk about is this final component, which is the annual report system that we have. When our users accept the license, they are basically – part of that license agreement is that they will tell us how they are using the UMLS and the various vocabularies within it throughout the year. We have this annual report system that we use that for.

The current state of our UTS system is that we have over 26,000 licensees. That is quite a bit. Susie knows this all too well. The annual report – each year, in January, we email our users about the annual report process and say, hey, let us know how you are using that license or how you are using our data throughout the year. She was attempting to do a 26,000-user mail merge yesterday. We discovered that is not exactly so easy with our new cloud mail system. We are working through that. If any of you are UMLS licensees, you will be getting an email soon.

So, 78 percent of these licensees are based in the U.S. It is a fairly U.S.-centric system. Just to add onto that, 134 total countries are represented in our licensing system. Access to the Metathesaurus is throughout the globe. This graph, here – you don’t have to pay too much attention to the numbers here. The trend is that we are getting more and more licensees. This started back with the HITECH Act and some of the components within that, such as Meaningful Use. As more and more systems and users had to gain access to these required terminologies, we got more and more licensees for RxNorm and SNOMED CT access. So, you are seeing that trend here. It is a huge jump. We had a couple thousand licensees before. We are up to over 26,000 now. So, it is growing and growing.

So, again, I am going to ask you – I put a lot of numbers on the screen. You can generally ignore the numbers here. What I will tell you is that the data we are going to look at is from CY2016. Right now, users will be reporting on last year, 2017. So, we are going to be looking at 2016 data, when we had about 23,000 licensees. Each year, we get about a 50 percent response rate to that annual report. What we are going to look at here is about 11,000 respondents. We are specifically going to focus on the subset of users who mention that they use the UMLS, specifically. These are users who may or may not be using something like RxNorm or SNOMED CT separately from the UMLS, but they are, in fact, using the UMLS. So, those are the numbers we are going to focus on, that 4,500 user base.

This is just to give you a distribution of these – this is the users who are using the UMLS. This is a breakdown of their affiliations in the UTS system. It is mainly academic with the UMLS. About a third of our users are academic institutions. After that, you’ve got for-profit and not-for-profit entities. It drops down a bit. Actually, individual use makes up a high proportion of our licensing system. Then you’ve got government agencies and other from there.

So, some examples of those affiliations. These are the organizations that those licensees are affiliated with. This is not the full list. These groups are more represented or have a higher representation within our licensing system. You see a lot of those for-profit entities, IBM, Elsevier, Allscripts, and then you drop down into things like Mayo Clinic and MITRE, who works with the U.S. government quite a bit, and some of the government agencies. The VA and the FDA have a lot of users that have licenses. Within the NIH, we have a lot of users across NIH. This actually excludes NLM, so NIH throughout the clinical center and so forth is highly represented. Lots of academic institutions. This is not a full list here. These are some of the more highly represented organizations.

So, what are the use cases, the main use cases of the UMLS? This is what we are hearing about on our annual report. The first one is utilizing specific terminologies from the Metathesaurus. Olivier mentioned this a little earlier. This is not necessarily pulling terminologies out of the Metathesaurus and trying to recreate them. This is really taking a part of those terminologies and either grouping it with other terminologies or using some portion of the terminology, so maybe the codes, the CUI, the concept unique identifiers, and the descriptions, rather than the full aspect of the terminology.

Next is mapping between terminologies. This is sort of the – one of the foundations of the UMLS, as Olivier mentioned, going from one terminology to the next through those CUI mappings. I will just say briefly here that – just to point it out. These CUI mappings are not clinically validated in any way. We don’t claim to say you can take these CUI mappings and put them in your electronic health record system and they will give you accurate data at all times. They are not validated. Although we do have editors that go through, subject matter experts who are mapping these terminologies and making these concept groupings, we don’t have the validation there.
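
[Editor's note: the CUI-based mapping just described can be sketched over MRCONSO.RRF-style rows. The sample rows below are fabricated for illustration, and the field positions (CUI=0, SAB=11, CODE=13, STR=14) follow the Rich Release Format as commonly documented; verify against the current UMLS release documentation. As the speaker notes, these CUI groupings are not clinically validated mappings.]

```python
# Sketch: walking from one vocabulary's code to another via a shared CUI,
# using fabricated MRCONSO.RRF-style pipe-delimited rows.
from collections import defaultdict

SAMPLE_MRCONSO = [
    # CUI|LAT|TS|LUI|STT|SUI|ISPREF|AUI|SAUI|SCUI|SDUI|SAB|TTY|CODE|STR|SRL|SUPPRESS|CVF
    "C0011849|ENG|P|L0011849|PF|S0033298|Y|A000001||||SNOMEDCT_US|PT|73211009|Diabetes mellitus|0|N||",
    "C0011849|ENG|S|L0011849|VO|S0033299|N|A000002||||ICD10CM|HT|E11|Type 2 diabetes mellitus|0|N||",
]

def codes_by_cui(rows):
    """Group (source vocabulary, code, string) triples under each CUI."""
    index = defaultdict(list)
    for row in rows:
        f = row.split("|")
        index[f[0]].append((f[11], f[13], f[14]))
    return index

def crosswalk(rows, from_sab, code, to_sab):
    """Codes in to_sab that share a CUI with `code` in from_sab.
    Not a validated mapping -- just co-membership in a concept."""
    index = codes_by_cui(rows)
    hits = []
    for cui, entries in index.items():
        if any(sab == from_sab and c == code for sab, c, _ in entries):
            hits += [c for sab, c, _ in entries if sab == to_sab]
    return hits

print(crosswalk(SAMPLE_MRCONSO, "SNOMEDCT_US", "73211009", "ICD10CM"))  # → ['E11']
```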

Lots of other use cases here. Processing texts to extract concepts, relations, and knowledge. That is what the MetaMap tool does that we have built at NLM. A lot of other users, external users are doing similar things. So, I won’t go through the full list here, but you can see those on the screen.

Then some specific uses that are just like the use cases, but actual examples of uses at NLM. We’ve got the semi-automatic indexing of MEDLINE. We do have human editors – human indexers that go in and apply medical subject headings to the medical literature in MEDLINE. We have made huge leaps and gains in the semi-automatic indexing of these MEDLINE records. There is a lot of frontline indexing that goes on that is then reviewed by human indexers. There are suggested indexing terms through that system.

Information retrieval in PubMed and MedGen. That sort of speaks for itself. It builds on top of that indexing of MEDLINE and provides some richer mapping on top of the MeSH terms that we have in the MEDLINE data.

Consumer health information exchange for MedlinePlus Connect, which is basically our web service and API that allows EHRs to pull the consumer health monographs from MedlinePlus into electronic health record systems for patients.
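
[Editor's note: a sketch of the kind of request an EHR might issue to MedlinePlus Connect. The parameter names follow the HL7 Infobutton-style request described in the MedlinePlus Connect documentation, and the OID below is the one commonly cited for ICD-10-CM; treat all of these as assumptions to verify against the current NLM documentation.]

```python
# Sketch: building a MedlinePlus Connect lookup URL for a problem code.
from urllib.parse import urlencode

MLP_CONNECT = "https://connect.medlineplus.gov/service"
ICD10CM_OID = "2.16.840.1.113883.6.90"  # assumed code-system OID for ICD-10-CM

def problem_lookup_url(code: str, display: str) -> str:
    """URL requesting consumer-health material matching a problem code."""
    params = urlencode({
        "mainSearchCriteria.v.cs": ICD10CM_OID,
        "mainSearchCriteria.v.c": code,
        "mainSearchCriteria.v.dn": display,
        "knowledgeResponseType": "application/json",
    })
    return f"{MLP_CONNECT}?{params}"

url = problem_lookup_url("E11", "Type 2 diabetes mellitus")
```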

Finally, production of SNOMED CT subsets: the CORE Problem List Subset and the Nursing Problem List Subset. For instance, with the Nursing Problem List Subset, you’ve got the nursing terminologies like NANDA, NIC and NOC, the interventions and outcomes terminologies; those are grouped together with SNOMED in these UMLS concepts. There is a subset, a Problem List Subset, produced for nursing.

Some examples external to NLM. A heavily used application is the Mayo cTAKES application for text analysis and knowledge extraction from clinical notes and so forth.

VisualDx is a tool for diagnosis of skin conditions and disorders. I think Apple’s CEO was just talking about this. They have the app in the Apple store. They can now scan a skin disorder and actually give a diagnosis based on the image, which is incredible. They are utilizing UMLS for some of their disease classification and interrelatedness of the diseases in their system.

I stumbled upon, in our annual report data, this CrowdTruth application. It is a framework for crowdsourcing the annotation of datasets, text datasets. You could – basically, you could take data and put it on a system like the Amazon Mechanical Turk and you can have users throughout the world sort of annotate data for you through this framework. It is kind of cool.

OHDSI, the Observational Health Data Sciences and Informatics group, heavily uses the Metathesaurus for their vocabulary resources, which is combined with their OMOP data model. That allows for observational data studies across millions of patients worldwide. It is really incredible, the work they are doing.

The Indian Health Service has this resource and patient management system, which they can deploy throughout their Indian Health Service. That uses the UMLS within their system.

Finally, PatientsLikeMe, that is a patient health information portal. It allows patients to sort of go on there and describe their conditions and learn and talk with other patients about what they are going through, how they are experiencing or interacting with the healthcare system.

Those are just some neat examples. I am not endorsing any of them in any way, but I do think they are neat, myself.

Let’s move on to some feedback that we receive that I think is particularly relevant to this group. We were hearing it for a couple of years, but back I think in 2008, we started reducing the number of releases of the UMLS and the Metathesaurus throughout the year. I think we got up to four releases at one point. Our users told us to please stop. They can’t handle updating their systems with 20 gigabytes of data four times a year. So, we are down to two times a year, released in May and November. I think we are up to about 30 gigabytes of data when you do the full set of the UMLS. Of course, most people aren’t using the full set because you generally have a more specific use case and you are picking the terminologies you need. It is still a fairly large dataset with all of the files we provide.

We often get requests for more, more, more. People want a lot more languages, even though we have 25 represented at this point. They want more vocabularies. I will add in parentheses after this that they also want it to be simpler. So, they want more, more, more, but simpler, which is a difficult way to handle things. We work on that. We add new terminologies pretty regularly.

Finally, we are seeing increasing pressure in terms of the licensing burden. Basically, to use the UMLS data, you have to get that initial license, and then you have access to the data. In order to use it in any kind of production system, if you’ve got some of these more highly restricted terminologies, you have to get a separate license. There are restrictions on the usage of content that users find burdensome, and then also the need for authentication to access these data. What I mean by this is you can get the license, and that gives you the license to use the data. But if you want to keep getting the data – if you want to hit our API or VSAC or some of these other systems – you have to keep authenticating. Particularly with APIs and with FHIR, the new Fast Healthcare Interoperability Resources, it is kind of a burden to have to authenticate every time. We need to protect the IP, but people are asking, hey, why do I have to authenticate? Is there another way I can do this where I don’t have to authenticate every time and I can get the data? I have a license. I just don’t want to tell you every time. Those are just a few of the things we are hearing about the licensing there.
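
[Editor's note: an illustration of why per-call authentication is felt as a burden and how clients typically amortize it, authenticating once and caching the resulting token until it expires. The `authenticate` callable here is a stand-in; no real NLM endpoint or token lifetime is implied.]

```python
# Sketch: cache an auth token so repeated API calls don't each require
# a fresh round trip to the authentication service.
import time

class TokenCache:
    def __init__(self, authenticate, ttl_seconds: float = 8 * 3600):
        self._authenticate = authenticate  # callable returning a fresh token
        self._ttl = ttl_seconds
        self._token = None
        self._expires_at = 0.0

    def token(self) -> str:
        """Return a cached token, refreshing only when it has expired."""
        now = time.monotonic()
        if self._token is None or now >= self._expires_at:
            self._token = self._authenticate()
            self._expires_at = now + self._ttl
        return self._token

calls = []
cache = TokenCache(lambda: calls.append(1) or f"token-{len(calls)}")
first, second = cache.token(), cache.token()
# Both requests reuse one authentication round trip.
```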

I think we have made pretty good time here. We will accept questions.

MR. SHEEHAN: We look to our chair for guidance about time. I think we have left some time for questions.

MS. KLOSS: We have allocated until 10:25 for both presentations. We still have an hour. I think we can take 10 minutes for questions. You will have time, Donna.

DR. STEAD: Linda is going to lead off.

MS. KLOSS: How do you support licensees in their effective use of the tools? Do you have formal training programs? Is it one on one, some combination? Is that an issue? Or, is there just a self-selection that people who are smart enough to know how to use this are the ones that are trying to access it?

MR. MCLAUGHLIN: We have a range of users, from full developers, who can take this, read a spec on our RRF, our Rich Release Format, pop it into a system, and run with it. We don’t need to talk to them about it at all. We have got other users who want everything in an Excel spreadsheet. That is perfectly reasonable. We have to walk through – hold hands a little bit more with those types of users. We supply our data in pipe-delimited text files, which is basically an Excel file. You just need to nudge it a little and change the text around a little. So, we do work with our users.
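
[Editor's note: the "nudge" described above, sketched as code. The file contents and column headers are illustrative, not taken from any NLM release.]

```python
# Sketch: convert a pipe-delimited RRF file into a CSV that Excel opens
# directly. RRF rows are pipe-delimited and end with a trailing pipe.
import csv, io

def rrf_to_csv(rrf_text: str, headers: list) -> str:
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(headers)
    for line in rrf_text.splitlines():
        if line:
            writer.writerow(line.rstrip("|").split("|"))
    return out.getvalue()

sample = "C0011849|ENG|Diabetes mellitus|\n"
print(rrf_to_csv(sample, ["CUI", "LAT", "STR"]))
```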

We have training videos and training documentation. We used to do a formal program, where we were training – I think we would do like an eight-hour session in a day to train users. We have gotten away from that recently because they would have to travel to us. They would have to sit there. As much as I like the UMLS, I would never want to sit through eight hours of learning about the UMLS in one day. We do consultations, basically, for users that need a little more assistance. We will do phone calls and email interactions.

MS. KLOSS: I have become a licensee since this project started. I found it very easy to become a licensee. So, I am one of those individuals who will get a report and admit to having done nothing with it. I just kind of wanted to go through the process and open the kimono and see what it all looked like. What is the – is there a capacity? Maybe this is a really dumb question. If you have 26,000, do you run into a capacity issue? You are the hub of the vocabulary universe.

PARTICIPANT: Other than sending 10,000 emails in one day, on the terminology server side, as of right now, no.

MR. MCLAUGHLIN: No, we really don’t have any issues. We do review all of the applications. It is not a thorough process, but it does take resources. That would be where – that would be our one area where if this keeps increasing, we just have to devote more resources to it. In terms of the system, itself, the UTS can handle tons and tons of users. Our APIs can handle many, many users.

MS. GOSS: Kind of building on the licensing comment from Linda – and this presentation has been really informative and helpful. I live in the standards world and know much more about the transport and how the pieces come together to enable the good semantic vocabularies to be exchanged and business to happen. The licensing and IP protections I have a particular sensitivity to.

Can you talk a little bit more about this burden aspect in the authentication? It sounded to me, from the earlier presentation, like it is pretty easy. Once you kind of go through the process of getting licensed and terms and conditions, and you had to use your ID and a password, you could actually integrate them and they could be sort of transparent and just function in the API. I am not really getting this authentication and licensing snafu. Is it because of the cascade effect? What is it about?

MR. MCLAUGHLIN: It is a burden in the sense that an individual has to have a license. That individual can represent an organization, but you have to have a username and password or an API key for an account. If you are Elsevier or whoever, and you need to send multiple transactions, and you’ve got some system that is going to recur for many, many years, if that person leaves your organization – I think that is where the burden is coming in: the account management.

MS. GOSS: It is the traditional aspect of if you want access to something, we need to know you have it and that is a traditional part of our IT infrastructure.

MR. SHEEHAN: In our emerging world, if I can find it on the internet, it is free and I can use it without restriction. It is sort of a cultural difference. We find ourselves in a position, and not just with the UMLS and terminologies, but other resources, too, that we are making them available, but there are strings attached. In this case, there are limitations that are kind of passed through by us from the developers of the content. I think there is just that sense that if I found it on the internet, why is there a limit? We can explain that. We try to make it easy. Now, you can get access to all of the different terminologies with one license through us. There is also the fact that every time you come back, we need to know it is you and that you are the licensed user for this. Again, in the internet-enabled, web-enabled, I-get-it-on-my-phone world, that just seems like an unnecessary obstacle.

MS. GOSS: So, you may be providing this service for free, but the content to develop those vocabularies took a lot of time and effort and maintenance and those are not free. So, I get the conundrum here. I also then wonder, when we talk about free, because we are providing a service on behalf of the federal government to our citizens and beyond, how are you funded? Do you have a specific line item to cover your providing this service for free and managing all of these services you have just described?

MR. SHEEHAN: There are probably many dimensions to the answer to that. All of the services we provide at NLM are essentially provided through our appropriation. We have, within that, the support to a variety of different standards and terminologies, including the UMLS. We provide support on behalf of the U.S. government for SNOMED, participation in the international health data standards organizations, SNOMED International, and in some cases, we provide support for or we develop, ourselves, some terminologies like RxNorm terminology. So, there are many different mechanisms through which we are providing that kind of capability. In others, I think SNOMED being a good example, where we don’t develop SNOMED, but we provide essentially on behalf of the U.S. government, membership in the organization. That allows us to offer it with a license for U.S. use. We provide that.

MS. GOSS: So, predominantly, you are appropriation based, but you might get a little revenue from someplace else, but it is a small revenue?

MR. MCLAUGHLIN: We don’t take revenue. Sometimes we may have some interagency agreements from other parts of the government that use or want development of a particular capability to serve their needs. That all happens within the government.

MS. GOSS: Speaking of other sister agencies, we mentioned USHIK earlier and AHRQ’s involvement with them. I am thinking that we need to dive a little bit more into that one because it keeps popping up in various areas for the committee members. I just want to note that for Linda. Thank you.

MS. LOVE: Speaking of licensing, many of our health information structures are built on public use and public good and proprietary – based on proprietary tools. So, if I am a user, a licensee, if I just want to explore, dance around CPT to CCS and just on the surface, do I need to become a full license user of CPT?

MR. MCLAUGHLIN: If you are just – for the individual use, you have access to the data through the licensing agreement.

MS. LOVE: Excellent. Excellent. And on the individual versus – through your license. That is excellent. And then, individual versus organizational – now, sometimes, could like IHC get an organizational license that covers that organization versus the individual or does each individual within Intermountain have to have a logon for the API?

MS. AULD: This is actually one of the areas where some organizations consider this a burden because they are very concerned about legal aspects of one individual being tied to a license that is used by an organization. We are only set up for individual licenses. We do require that there be a person who is attached to the license. We have worked with some of these organizations, the Veterans Administration was one of them. We have worked with them to modify how we gather that information. So, you can tag yourself and say I am getting this license on behalf of my organization, but we still want an individual attached to that. We recommend that you – rather than using a personal email, that you use your group. So, for me, I might use my individual office email because that is a part of NLM. There are like 10 of us who read that. The same for your phone number. The basis is that especially if you are in a situation where you have high turnover, it is better, as much as possible, to tie that back to a group. No matter what, we are still going to need a person attached to it.

DR. STEAD: I will ask a little bit different question. The slide that you showed, Olivier, of the many different dimensions that now come together is impressive. My guess is it is a fraction of the scope of health, as we now are thinking about it. To what degree – what is the real denominator? If we were to have a UMLS that brought together all of the concepts and the related terminologies about health as we are now thinking about it, would UMLS be dealing with 10 percent of it or 90 percent of it or somewhere in the middle?

DR. BODENREIDER: The answer is it depends. So, when I give a presentation about the UMLS, especially the Metathesaurus, I get one of two comments and sometimes two comments at the same time. One is, oh, the UMLS is far too specific for my needs. I don’t care about all of these details. They get in the way. I want something higher level. Usually, immediately after this one, I get, oh, but there is always 10 percent missing. There is always something that I cannot find. It is not fine-grained enough. It just depends on your use case. We are trying to, you know, get as much as we can to reflect what people need when we know they need it.

We tend to emphasize coverage of subdomains that are not well covered. For example, a couple years ago, we brought in the Human Phenotype Ontology. We are about to bring in Orphanet for rare diseases because we know that traditionally these are domains – you know, those genetic diseases and these rare diseases, many of which are genetic diseases, have not been well covered by the mainstream terminologies. It is changing, also. We are trying to extend coverage whenever we can. We won’t be able to satisfy all of the use cases. That is one of the issues with the UMLS. It is not – it hasn’t been created for one purpose. It is made for many different purposes. Your mileage may vary. 

DR. STEAD: I think one of the things that we are really trying to understand is how we can make the terminology development and dissemination process, in general, more agile. In a world where new – we are recognizing new pathways, new biological and social relationships all the time. What are your thoughts about how we – how we can begin to represent those early, just-being-defined concepts, such as a new unexpected drug reaction? How can we begin to represent them just-in-time in a way that will then allow them to be curated as we got smarter, so that we could actually get the terminologies in some way representing the leading edge, not the trailing edge of what we know?

DR. BODENREIDER: The answer to that is – well, first of all, individual concepts don’t get in the UMLS. I think, at this point, you all get this, that the UMLS integrates terminologies. You cannot knock on NLM’s door and say, oh, I have this brand new biological function that was just discovered or mentioned yesterday in Science. It needs to get in. We don’t do that. It is going to get in if the Gene Ontology, for example, integrates it as a new biological function and we get the next version of the Gene Ontology in the next cycle. That is how it is going to get in.

So, what I would say is that NLM remains attentive to new developments and new terminologies being developed, especially when they are used in research projects, used in EHR systems – when there is usage that backs, if you wish, the validity of these terminologies and their importance in the community. The way we look at integrating new sources is also by paying attention to usage.

MS. ROY: Also, just wanted to remind the committee about the United States Content Request System for SNOMED CT. One of the benefits of having the U.S. extension or U.S. edition of SNOMED CT is for that quick, agile kind of maintenance of SNOMED CT. Just as an example, the recent Ebola epidemic that we had a few years ago, there were a lot of lesion morphology descriptions that needed to be quickly added from the CDC and others into SNOMED. They actually were able to submit it quickly to us. We were able to add it to the U.S. edition. That came out I think two months later in the U.S. edition and then a couple months later, it came out in the new edition of the UMLS. So, we were able to quickly receive that information and add it to SNOMED, which then fed into the UMLS, which were then mapped to the other terminologies.

MS. AULD: The conversations that happened in the Federal government that led to those additions to SNOMED weren’t just focused on SNOMED, but looking at all of the terminologies, bringing in CDC, CMS, all of the relevant players, bringing in LOINC, bringing in RxNorm, seeing where do we have gaps that we need to fill quickly to meet this need quickly. We all worked together to make that happen.

DR. BODENREIDER: Another element to this agility is also through MeSH, the Medical Subject Headings. So, MeSH is used to index the biomedical literature. There is a chance that these new concepts are going to pop up from the biomedical literature that needs to be indexed. For example, all of the new biological substances that get published – even before they get into clinical use – are reflected in the supplementary concepts in MeSH. By virtue of being in MeSH, they are inserted into the UMLS on a regular basis. It is a multipronged approach to agility, if you wish.

DR. STEAD: Thank you all. I think we need to move to the next piece. This was extraordinarily helpful. I could not be more grateful.

MR. SHEEHAN: Thank you. I appreciate the opportunity. I think many of us will be here for the rest of the day, too. I will be here at least up until lunch time. We can have some hallway conversations, too. Thanks, again, for the opportunity.

Agenda Item: ICD-11 Update and Discussion

MS. PICKETT: Good morning everyone. Thank you for having us back. I am Donna Pickett, National Center for Health Statistics, CDC. I am the Chief of the Classification and Public Health Data Standards staff, for those of you who don’t know me. I also have another title. That is head of the Collaborating Center for North America for WHO-FIC, which is the WHO-Family of International Classifications, which includes the ICD, but also includes ICF and work being done to develop an international procedure coding system. More to follow on that maybe in future meetings.

Again, thank you for having me back. Some of you know I was here in 2016 giving you an update on the status of ICD-11, which mainly focused on content issues at that time. You were kind enough to allow me to come back in June of last year – thank you – where more content information was provided, but we did a nod toward some of the other issues that would likely surface as ICD-11 is coming to – the development of it is coming to hopefully closure at some point, and what the possible implementation issues may be facing us and, therefore, facing the National Committee.

Just by way of history, the National Committee, in the past, has been very engaged and active in the area of evaluating code sets, specifically in my frame of reference, ICD. There was an extensive set of work done for ICD-10. The Committee was very engaged in looking at the developments of 10, the implications of implementation in the U.S. There actually is a great report. For those of you who haven’t seen it, I will hold it up. This was in 1990, where it was addressing the ICD-10 transition. I see some of you nodding, so you sort of remember it. Interestingly, in preparation for this presentation, I actually went back and looked at what were some of the issues. What was remarkable is that many of the issues are still the same, painfully so.

I will be touching on some of those, but there are some activities or things that were developed as part of the 1990 report and during discussions on the implementation of ICD-10-CM that I think are quite relevant to the committee work for ICD-11 and probably some areas where you may need to gather up your best thoughts about how some of what you did previously needs to be updated to reflect current use practices. Let’s face it, back in 1990, a lot of the work was focusing on inpatient and inpatient only. That is no longer the case for ICD-9-CM and now, happily, the replacement with ICD-10-CM for diagnosis. You are talking about a coding system that is used across the healthcare setting: inpatient, outpatient, home health, rehab. If you can name a setting, you will have – for diagnosis, you will have a use of ICD-10-CM. So, the usage is different.

Of course, we are also now in an electronic health record environment. There are many things that will be touching us as one might want to consider what to do in the world of a pending ICD-11.

This is just a historical slide. Some of you have seen it before. I can now say that we have an actual implementation date on that last line of the table. For years, it was pending. It had question marks in it. ICD-10-CM, as all of you know, was implemented October 1, 2015. Now, we have reached the two-year point in its use. Again, what is still striking here is that mortality implemented the WHO version of ICD-10 back in 1999. That is an important issue as we think about what is likely to happen in the world of ICD-11 and, if there is to be an implementation of 11 in the U.S., how that pathway works.

Yes, there is a need for ICD-11, though I think from the U.S. perspective, it is probably not as heartfelt for us as it is for others. Many countries implemented ICD-10 or variations of it beginning back in 1995. While we are sitting here having just implemented roughly two years ago, you have countries that have been using ICD-10 or a variation for quite a while. So, yes, they absolutely felt a need to update. There has been an updating process for ICD-10 by WHO, which was new because, again, in previous iterations of ICD there were no interim updates. Something was published. You used it for ten years, roughly. You waited for the next revision to give you any substantive changes in the healthcare system. Even with updates to 10, it couldn’t keep pace with the clinical information coming forward. That proved to be a problem. Again, you are dealing with structure issues and how do you fit something in that is quite structured.

There is increasing need to operate in an electronic environment. ICD-10 and previous iterations were curated in hardcopy. Different mindset completely. Fast-forward, we need to think about the current environment. That is exactly what WHO has done. And, we needed to capture more information for morbidity use cases. ICD, as you all know, had its heart and feet in the world of mortality. With the eighth edition, many countries, starting with the U.S., began developing clinical modifications to give a nod toward the more expansive use of the codeset for morbidity applications.

Not going to read through all of these, but WHO did have very specific goals in mind, as did the member states who contributed to the process, related to why they were needing an ICD-11 and what the basic principles were in defining what should be included and how it should be developed, how it would be curated, and moving away from the processes previously used to a more rigorous, scientifically-based, but electronically-based environment.

Again, important issue here, it is not based on mortality solely. You’ve got multipurpose and coherent classification for use for morbidity, mortality, primary care, research, public health, with a nod towards consistency and operability across different settings, not just inpatient and not just death certificates and underlying cause.

Based on previous presentations, I usually get a question after the fact, so I thought I would put it into the slide, as to whether or not the U.S. is actually engaged in the activities. The answer to that is yes. As you can see from this slide, Dr. Christopher Chute, who used to be with Mayo Clinic and is now with Johns Hopkins, served as the chair of the Revision Steering Group. That is the WHO group that was putting together the development of ICD-11. There was a small executive group that looked at specific issues. Dr. Chute was also the chair of that group. Also, at the detail level, the clinical level, they had topic advisory groups. They were basically body system related. We had the U.S. involved in that from the clinical groups. In fact, a U.S. representative from the American Academy of Pediatrics headed up the Pediatric TAG. Also, not being down in the weeds, the U.S. was also involved at a higher level, where the Mortality TAG was co-chaired by someone from the U.S. For the Morbidity TAG, I had the pleasure of serving as one of the co-chairs. We actually had somebody from ASPE serving as a co-chair of the Functioning TAG.

The Topic Advisory Groups were sunsetted in October of 2016 and replaced by the Joint Task Force. We still have U.S. involvement there, as well. Dr. Chute is still very much involved. He is part of the Joint Task Force. I still sit as part of that task force. We do have mortality representation. That person is from our Division of Vital Statistics, doing the death certificate work. You have actually had presentations from Delton previously. Robert Anderson, who works under Delton, has spearheaded the mortality side.

Two new groups have been stood up within the revision process. The Medical and Scientific Advisory Committee. We have U.S. representation there, again, in the way of Dr. Chute. For the Classification Scientific Advisory group, I am one of the representatives. Now, as head of the Collaborating Center, I must say it is not just U.S.-based. Canada is part of the Collaborating Center. So, we also do have participation from our Canadian colleagues, as well.

These overview slides identify the phases – I thank WHO for their slides. I just didn’t attribute them. I will be modifying the slides so I can give credit to WHO for developing these nice slides for us.

Phase 1: project planning, setting up the various working groups and topic advisory groups, setting up the infrastructure, including the IT platform support for the development of the classification, and setting up a review process. That all occurred between 2007 and 2015. Some of you who have heard the presentation before know we have had kind of baby steps towards getting started. It was ICD-11 in 2011. Then it was ICD-11 in 2013, et cetera. However, the current date, which seems to be the live date, will be June of 2018.

The next slide talks about phase 2, which covers the formation of the Joint Task Force, which is made up of morbidity and mortality folks. You are not siloed. You’ve got a cross-section of people all engaged in discussions on how to make this classification, which is intended to be a joint classification for morbidity and mortality, how to make that work, but also, simultaneously, looking at issues related to the impact on mortality trends and death certificates and how all of that is supposed to fit together so that there is no inadvertent disruption of your trended data and all of the issues that go into the statistics that are used in the United States.

Again, the Topic Advisory Groups have been sunsetted. There are still active and ongoing updates to the chapters. There is active field testing going on. Groups are actually looking at how you can apply this – I have some slides later in the slide deck that talk about some of the parameters that are being used for the field testing. There is the development of an implementation package. I have laid out some of the things that are included in that. There is the governance for the maintenance process for ICD-11 after 11 goes live.

DR. PHILLIPS: What is a statistical classification? What do you mean by that?

MS. PICKETT: The way the hierarchy is set up, a more detailed level for morbidity, for instance, you will have very specific codes, like we did in 9-CM and 10-CM, where you expand out the level of details so that it is not just the underlying cause that you are trying to capture, but the manifestations of that particular illness, which explains why you were admitted to the hospital. Did you have a complication of influenza? Do you have a complication of your diabetes? It is more detail. At the end of the day, when you are trying to produce statistics, you need to aggregate up so that you still have a way to produce your statistics so that your underlying cause of death is your diabetes on the mortality side. But, on the morbidity side, are you more concerned about the number of diabetic cases that you had that had retinopathy, renal manifestations, et cetera? Again, the idea for 11 is to sort of bring all of that together within one classification instead of having, as we did in the U.S., using ICD-10 WHO version for your mortality and a slightly different classification with more detail for your morbidity applications. The idea was to bring that together in ICD-11 such that you can produce your higher level underlying cause and possibly multiple cause of death coding, but at a more detailed level for your hospital encounters and your physician offices and ERs, you have something that is more useful from a clinical perspective.

MS. KLOSS: That is why when we attached the CM to ICD – the CM is the morbidity.

DR. PHILLIPS: My next question is then – sorry – is that only for mortality or can you roll up for statistical purposes across the whole classification scheme?

MS. PICKETT: You can roll up, but for mortality, when you are looking at underlying cause of death, they are looking for those buckets – your diabetes, your heart disease, et cetera. That usually is the statistical entity that you would be looking at. From a hospital perspective, when you are trying to look at your patient population, you are usually going to want more detail to know what is really going on with your patient, why are they really coming in. Yes, they have diabetes, again, as an example, but are they coming to your ophthalmology clinic? Are they going to kidney dialysis? At the end of the day, even if you wanted to compare your underlying cause of death statistics with what is going on at the hospital level, you can still take CM and aggregate it up and do your comparison at a higher level of aggregation. Currently, with ICD-10, it is like at the three-digit level, as it was with ICD-9 and ICD-9-CM.

Again, they still want to be able to generate statistics, but they also understand the morbidity use cases, which now includes quality and patient safety, benchmarking. There are so many other things that you need more detailed information for, but you can always aggregate up to give you that statistical information that you are looking for.
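The roll-up described above – detailed morbidity codes aggregating to broader statistical categories at the three-character level – can be sketched in a few lines of Python. The diagnosis codes and the truncation rule below are illustrative assumptions for the sketch, not an official aggregation mapping:

```python
from collections import Counter

# Hypothetical encounter-level diagnosis codes (ICD-10-CM style, illustrative only).
# E11.x = type 2 diabetes with various manifestations; I10 = essential hypertension.
encounters = ["E11.21", "E11.319", "E11.9", "I10", "E10.21"]

def three_char_category(code: str) -> str:
    """Roll a detailed code up to its three-character ICD category."""
    return code.split(".")[0][:3]

# Detailed codes answer clinical questions (which manifestation? which clinic?);
# the rolled-up counts answer the statistical ones (how many diabetes cases?).
stats = Counter(three_char_category(c) for c in encounters)
print(stats)  # Counter({'E11': 3, 'I10': 1, 'E10': 1})
```

In practice, official tabulation lists define which categories are reportable, but the principle is the same: detailed morbidity codes collapse into the broader buckets used for statistics such as underlying cause of death.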

So, 2018 to 2021, phase 3, again, WHO will be releasing, as hopefully planned, an implementation version of ICD-11 that they are hoping that countries will start to implement. Again, you will have varying implementation readiness. It depends on how a country actually uses the classification. Some countries only use it for mortality statistics and maybe are not electronically as engaged as other countries are in reporting out death certificates. So, you may have early implementers who are electronically capable. There is a lot of work going on in modifying MMDS and IRIS, the automated coding systems, to actually incorporate ICD-11. You may have early implementers there, assuming that all of those changes are actually ready to go. But those groups have already started meeting. WHO has convened those working groups. They have convened working groups looking at the statistical impact and making modifications and tweaks to ICD-11 to address some of the concerns that have been raised.

MSAC is the Medical Scientific Advisory Committee. What that replaces is the process we had for updating the ICD-10, which is the Update and Revision Committee, which basically was made up of representatives from the member states, classification experts, and some medical experts. There are going to be two groups now for ICD-11 update, where proposals to modify the classification will be specifically looking at the scientific need, the medical need to update the classification. Is it scientifically correct? Is it clinically correct? Does it belong in the classification? They will work in tandem with the Classification Scientific Advisory Group to figure out placement, multiple parenting, and ultimately, in a classification, you can only have a concept go to one particular spot in a classification. What is the appropriate spot for the placement within the classification?

MR. COUSSOULE: How does that differ from what currently happens with the changes in ICD-10?

MS. PICKETT: The WHO version of ICD-10? The current process, of which the U.S. is part, basically what happens is proposals will come forward to the Update and Revision Committee. Again, the membership is from member states, but you may have classification experts only from a given country, or you may have clinical expertise if the classification group from a given country also includes physicians. For the U.S. specifically, even though I have a medical officer in my program, basically, what I do is reach back out to the subject matter experts here, in the U.S. So, I reach out to AAP, the Academy of Neurology, Psychiatry, any of those groups, which are separate from the URC updating process, but I would reach back out to my subject matter experts to make sure I understand the clinical validity and need for a proposal before actually voting on a position within the Update and Revision Committee.

With the establishment of MSAC, WHO is actually bringing in the medical community as part of the process. It is part of the process. It is not kind of an adjunct to the process.

MR. COUSSOULE: I guess part of my question was is there any thought about pulling that up into the current 10 changes? Is it just 10 is going to stay the same and then the 11 process will – 

MS. PICKETT: For the ICD-10 changes, there are a lot of things that did come forward, Nick, that were problematic because while the proposal may have been clinically valid, the current structure of ICD-10 could not accommodate the change. There were many things that came forward to the Update and Revision Committee that had to be either withdrawn or forwarded over to ICD-11 for consideration.

MS. KLOSS: Maybe everybody understands this, but I wanted to just reiterate something we have talked about in our process – discussions in the past. We have a completely distinct coordination and maintenance process in the United States for ICD-10-CM. That will go forward. It means that the base system, the ICD-10 WHO system, that maintenance and update kind of freezes once World Health Organization switches to 11.

MS. PICKETT: To that point, it is a challenge. Even though WHO has an updating process for ICD-10, it is still sort of based on a mortality model, where countries can only handle major updates every three years from an underlying cause of death mortality perspective. For those countries like the U.S., who need it for morbidity purposes, we have a more active and, I would like to think, robust modification process for ICD-10-CM in the U.S. Plus, we have a mandate from Congress that, as necessary, we have the capability of updating twice a year. That is a process that WHO, at this point, really can’t handle. We are mandated to do so in the U.S.

What we have done as part of our update process in the U.S. for ICD-9-CM and now ICD-10-CM is whenever changes are made to the WHO version for ICD-10, if it is not already represented in 10-CM, we then actively bring that modification into ICD-10-CM. It also can go the other way. We have features in ICD-10-CM that don’t exist in 10 that other countries found useful. Again, we just implemented ICD-10-CM two years ago, so we weren’t actively bringing proposals to the table for the WHO update of 10. Things were going into 10. What WHO also did was to look at the clinical modifications, ICD-10-CM – and there are others, Australia, Canada, France, Germany, all have their own clinical modifications. In working on ICD-11, they looked across all of the clinical modifications or national modifications, as they are called, to see what were the common themes, what were the important issues that countries represented in their own national modifications, to have that as a starting point for 11 and not base it solely on what the content was of an ICD-10.

Finalization of mortality rules. I should have also included in there that, for the first time, there will be a more robust set of morbidity rules as part of the ICD-11 implementation. Updates to 11 are being compiled from member states who are already evaluating ICD-11. Once implemented, updates will come from the users, primarily the member states; WHO will also be getting feedback on updates to ICD-11 in the future. The first full update to ICD-11 is not anticipated until roughly three years out. There will likely be a freeze until countries can get up and running and actually get something implemented, and then a big update based on user feedback.

MS. KLOSS: So, that is why you have the range of 2018 to 2021. Assuming that first three years is –

MS. PICKETT: Right. Is countries looking at the implementation version, evaluating it, seeing what needs to be done, possibly identifying typos, errors, or things that maybe clinically don’t gel well that need to be modified. Sometimes you really don’t see what needs to be fixed until you actually try to implement it. That is pretty much in any code set.

Phase 4 – I apologize if I am speaking quickly, but I know we have some time constraints here, so I am trying to get through everything. We only have nine more slides. Actually, eight, because one of them is a thank you.

Phase 4 – regular maintenance, which would basically start with 2021. Then they would start to look at the updating process, scientific information coming in, the needs for different use cases and whether something is needed for quality measures or whether it is needed for case mix or whether it is for patient safety. There are a number of things that need to be looked at, and across the board, not siloed. Is it something that all of the use cases could benefit from versus is it something that only needs to be maybe considered for a specific use case?

User guidance, that was one of the issues, in terms of standardization of implementation. You need to have good guidance out there so that when everybody implements, they are doing it the same way.

And a robust index so that when you look for something, you are actually finding it. And the development of associated tools, which I have covered in past presentations, where it is curated quite differently from the index that we all know and love now that was part of ICD-10. That is the pathway.

You will see that the CSAC and the MSAC are mentioned, but the Morbidity Reference Group and the Mortality Reference Group are involved there, too. Those are the groups comprised of member state representatives that are active in those fields that really can kind of give the real examples and scenarios of how something is or isn’t working based on a mortality perspective or a morbidity perspective.

I love WHO because all of their things are animated.

Okay, field trials. Again, WHO has planned field trials. Unlike ICD-10 and some of its previous iterations, where you didn’t have a lot of field testing, with ICD-10, you may have had field testing of specific chapters, particularly those chapters that had changed the most, mental health being one of them for ICD-10, the injury chapter where they changed the axis, which confounded the heck out of a lot of people, but it changed anyway. Again, you didn’t have a lot of field testing. So, robust field testing. Again, test for fitness for multiple purposes, not just from the mortality perspective. Try to ensure comparability between ICD-10 and ICD-11, so there is no discontinuity. Purposefully, yes. Inadvertently, no. That is sometimes where the problems really hit. Then increase consistency, identify improvement paths, and reduce errors.

They are looking at ease of use, reliability. Is everybody coding it coming out with the same answer at the end of the day or not? Does that mean you need better instructional materials, better reference guides? Again, all of the tooling is being done as part and parcel of the development of ICD-11 and not as an afterthought in some instances. Again, looking at the different settings. Not just inpatient, but primary care, general health, research with a focus on pop health and clinical research. Again, kind of recognizing the fact that even though many still kind of look at ICD as being mortality based, it isn’t. The clinical modifications and/or national modifications have proven that. Even though ICD originally wasn’t for those purposes, that is what it has evolved into. There needs to be acknowledgement and recognition and incorporation of those into the classification.

For ICD-11 – this was a slide I used previously – there are actually tools that have been developed simultaneously. There is a coding tool. There is a browser. There is a proposal tool. Those who have actually registered for the site, if they see a gap, they can actually enter a proposal to say you missed this or we think you got this wrong or we think it might need a little bit of tweaking. It all goes through that platform. So, you are not getting individual proposals that are individually reviewed. It is all on a platform. Everybody can see it. You can actually see comments to the proposals. I think it is good. I think it will get better as more people become familiar with it and learn how to use it and also see themselves, in using the content, the outputs from this.

A translation tool – WHO is committed to having a tool that instead of just producing the ICD initially in English and French and then relying on countries to create their own language versions, that they actually have something that will do that automatically, so that these other language versions in the official languages of WHO are populated, but also done automatically in an automated fashion and not curated individually. Again, with French, it depends on which French you are looking at. Are you looking at French Canadian? Are you looking at French in one country and then when you line the two classifications up, you realize there are disconnects, which is something that happened with 10 and other language versions? Then again, you have some languages where nobody else is going to be using it. Maybe it doesn’t matter much, but when you are looking at comparability across countries, it starts to matter if things have been interpreted differently. Then there is a mapping tool.

MS. LOVE: Is that like Map It?

MS. PICKETT: No. Again, for those of you who really want to get a little bit more into the weeds, there are FAQs. There are videos. There are factsheets. There are quarterly newsletters that WHO produces relative to the work. Of course, you can – if you get lost on the website, you can always reach out to us and we can try to help you if you really want to get down into that level.

So, 2018, that is kind of – in many ways, the end of the road for WHO in terms of finalizing and stabilizing an ICD-11. In terms of updates, that isn’t the end of it. However, for countries, including the U.S., 2018 is the beginning of the road. You have implementation considerations and challenges. This slide tries to lay out some of them. It doesn’t lay out all of them. I hope that this kind of gives you an idea that will kind of guide discussions about what things need to be considered by the committee. Some of the activities and discussions will likely be very similar to the discussions that – while it took 20 odd years to get 10-CM implemented, some of those questions are likely to be the same, but tweaked because we are in a new world in terms of how we use classifications and what are the implications.

For this slide, I only picked some of them, mainly because they were things that were referenced in the 1990 report when we were talking about ICD-10 implementation. There are licensing implications possibly. For the ICD-10 discussions about licensing, the legal eagles did get involved in those discussions. I am assuming that is likely to happen again. What are the operational mechanisms regarding copyright restrictions? Well, those haven’t been entirely laid out, but WHO, as they did with 10, will have copyright on ICD-11. The agreement that we have with WHO for ICD-10-CM broadly defines what U.S. government purposes are. U.S. 10-CM is used everywhere. We don’t have any copyright licensing restrictions. However, I think those discussions will likely occur with the implementation or considerations for implementation for ICD-11.

As noted in that 1990 report, for government use, government purposes, it is not a single definition. Just by the way we have heard discussions over the last couple of days on how different classifications are used, especially ICD, what does government use mean? Is it Medicare using it for DRGs? Is it CDC NCHS producing mortality and vital statistics? Is it something else because other entities are using the codes for other reimbursement and case mix schemes? Is it AHRQ because they are using it for quality measures? Again, lots of issues that we probably didn’t have to address back in 1990, but I think are very much on the forefront of discussions, in terms of where we go with ICD-11.

Impact that the copyright could have on the cost and use in the U.S. and what are the vendor implications? For those with electronic records, where ICD is part of the electronic foundation, what are the implications there?

Again, I am kind of laying out questions/issues. The Committee will likely have to have hearings to identify what the broader sense of questions are, what the concerns are and the challenges are. This is kind of just a startup list, so to speak.

It has also been noted that WHO would like to limit the development of national modifications. The reason being when you look at, across the board, the number of national modifications, it is rather challenging and daunting because sometimes we did things differently. France may have done something differently with how they implemented a change for diabetes. The U.S. may have done it differently. Can they map? Yes, but from an international perspective, when you are looking for standardization and looking across the board, having at least 10 to 12 national modifications from a WHO perspective is deemed to be somewhat problematic. It is why they took that broad landscape look and said let’s look at what each of the national modifications included and how we can incorporate that into 11 so that you don’t necessarily need a national modification.

Again, different countries use the classifications in different ways, with the U.S. being somewhat unique in that, again, we use it inpatient, outpatient, home health. It is across the board. We don’t do spinoffs of the classification for different settings. It is one classification per HIPAA across the board. And our updating timeframes are different. How can a WHO process be resourced efficiently to address concerns of any of the countries that have national modifications, much less the broad use that we have here in the U.S. for our clinical modification? Again, discussions about how that needs to proceed is something where I think the National Committee certainly would be able to weigh in on and invite testimony on to form a broad-based consensus on how things could move forward.

MS. AULD: Along those lines, Donna and I have had a very, very preliminary conversation that if the WHO goes in a direction so that the U.S. can’t have a clinical modification and yet we feel that we still need that, because of the tie-in between ICD and SNOMED, we would look into whether or not we can use the U.S. extension to SNOMED to help cover that gap. As I said, that is very preliminary conversation.

MS. PICKETT: As Vivian pointed out, we have had a number of preliminary discussions. My office has also reached out to some of the standards development organizations to start looking at the implications for transitioning, recalling yesterday’s discussions about how sometimes some of the standards move slowly in modifications. As all of you likely recall, the 4010 could not support ICD-10-CM or ICD-10-PCS. We had to wait until there was a 5010. If anybody recalls how long it took to get from 4010 to a 5010, you likely understand some of the challenges that we will be faced with in terms of trying to kind of line up all of the ducks in a row to actually move things along. Those are some other considerations.

There are some new enhancements to 11 that may make some of the discussions more challenging because you have post coordination, clustering, where you will have additional codes to describe something that currently, in ICD-10-CM, we describe maybe with one code or two codes. In a post-coordinated fashion in ICD-11, it could result in five, six, or more codes. How do you handle that? How do you relate those things when you didn’t have to relate them before because all of the information you basically needed may have been in one, two, or possibly three codes? Some other things.

If you are wondering how post coordination works, let me give you an example. ICD-10-CM has a lot of codes where laterality, left or right, is embedded in the code. That is pre-coordination. In a post-coordination world, you would have additional codes, separate codes, to identify whether something was left or right or bilateral. Then how do you link all of those things back together when you’ve got an application, basically, that was not designed to do that? What does that mean in an X12 environment where you have 25 fields to report diagnosis codes, but in a post-coordinated way, if you are now doubling or more the number of codes needed for a given condition, what does that do for your administrative data, your claims adjudication, and things of that nature? A lot of things that need to be part of that thought process.
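The contrast between pre- and post-coordination can be sketched as data: one code with laterality embedded versus a stem code linked to extension codes as a cluster. Everything below is a hypothetical illustration – the stem and extension codes are invented placeholders, not actual ICD-10-CM or ICD-11 content:

```python
# Pre-coordinated (ICD-10-CM style): one code slot, with laterality
# baked into the code itself (hypothetical code for illustration).
precoordinated = ["H00.11"]  # e.g., a condition with "right side" embedded

# Post-coordinated (ICD-11 style): a stem code plus extension codes,
# reported together as a linked cluster (hypothetical codes).
postcoordinated = {
    "stem": "XA00",                 # placeholder stem code for the condition
    "extensions": ["XK8G", "XS25"], # placeholder laterality and severity extensions
}

def cluster_size(cluster: dict) -> int:
    """Count how many code slots a post-coordinated cluster consumes."""
    return 1 + len(cluster["extensions"])

# The same clinical fact now occupies three slots instead of one --
# the claims-field arithmetic Ms. Pickett raises for the X12 837.
print(cluster_size(postcoordinated))  # 3
```

The design question for administrative transactions is exactly this multiplication: a fixed number of diagnosis fields fills up faster when each condition expands into a cluster, and the linkage between stem and extensions has to survive transmission.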

MS. KLOSS: I am looking at the time. We made a decision we will have questions after the break.

DR. STEAD: We will go on and take the break. Let’s be back here at quarter of. We will deal with closing questions and begin to get into our next steps discussion. If some of the next step discussion has to carry over into the later afternoon, so be it.

MS. KLOSS: I think this is a great kind of a closing –

MS. PICKETT: The last two slides – some of you commented that when I used these slides back in June that you really liked them. I put them back in. This is the ICD-10 implementation timeline. You see that we start at the evaluation of 10 to see if it was fit for purpose to replace 9-CM, between 1994 and 1997. The National Committee held hearings between 1997 and 2003. We had an NPRM in 2008, a Final Rule, another NPRM, a Final Rule, an Interim Final Rule. Hopefully, those time – I won’t look at you, Denise. Hopefully, those timeframes would be shortened. I do want to point one thing out about some of the timeframes. Between 1997 and 2003, remember, we had the implementation of HIPAA. So, the first set of codes that were implemented under HIPAA were the de facto codes, which were basically your ICD-9-CM. So, you had some other intricacies going on there, but you still had the substantive discussions, also, about whether 10-CM should or should not replace 9-CM. The bottom line is I would hope that the timeline might be shortened a bit. I still think there would be a lot of discussion, as people really become more familiar with ICD-11 and some of the enhancements, but also what some of those enhancements mean as it relates to how stakeholders currently do data capture and what they do with it and output and analysis, et cetera.

The final slide – it was just my overlay of if we took what we know about ICD-10-CM implementation and we just applied dates to it for an ICD-11, starting with 2018 – you notice I stop with the first two bullets. I just could not. Again, we don’t know if all of this would happen. I think it is going to be an interesting set of discussions in terms of where we go from here. With that, I will stop because that is actually the last slide. Bill said we will come back for questions after the break.

DR. STEAD: We will combine questions with next steps after the break. Let’s be back on time, team.


MS. KLOSS: Before we move on to next steps, are there any burning questions for Donna? That is not your last chance because she is here and our discussion – but I actually just had one. What is the process for the US making a go-no-go decision on a clinical modification? Where is the authority and who makes that decision?

MS. PICKETT: I can only reflect on what happened when we were looking at possibly transitioning from 9-CM to 10-CM. One of the first things that the committee had asked was that there would be an evaluation of 9-CM compared to 10 to see if it was fit for purpose and whether or not a clinical modification would be necessary.

MS. KLOSS: This committee needs to be active enough to —

MS. PICKETT: Based on the recommendation that came from the committee, an evaluation was undertaken and the report of that evaluation was presented to the committee and used widely with others to show that while ICD-10 had some benefits, the fact was that once WHO stopped developing it, the US had a process that continued to modify 9-CM. We were continuously adding details. That will change with 11 because, again, they will continue to modify 11 and it will not be on a totally three-year cycle. They will be updating as information comes in. But, again, is it fit for purpose out of the gate for the uses that we have in the US when you consider how we use the code set in the US? That is kind of a key question.

MS. KLOSS: As we begin to frame a roadmap, I think as you and I talked, the system would be stable after the 2021 test period. After that would be a time for some evaluation as to what the benefits of that might be.

MS. PICKETT: A key question is does the US want to do an evaluation effective whenever WHO publishes the implementation version of ICD-11. That would be in advance of them actually taking the classification to the World Health Assembly for a vote. Or do you want to look perhaps at 2022, it seems so far out, but 2022, after the first full update to ICD-11, knowing that things will be identified in those first few years of implementation?

Do you want to benefit from the work that other countries have done if there are early adopters, and make your review and decision based on a more modified – not completely updated, but an updated – version, or something else? Which is why in the slide I had 2018 to 2021. But then I had 2022 with a question mark. Again, that will be part of the discussion – at what point do you evaluate, considering that 10-CM was just implemented two years ago and certainly is not at the breaking point that 9-CM was at, where we just basically had no capacity?

MS. KLOSS: Our revitalized journey with health terminologies and vocabulary – Alix.

MS. GOSS: Building on the question that you asked, Linda, and Donna’s thoughts, there is certainly benefit to having stability in the code set. However, there is also benefit in preparing people to start to use it, and in preparing the standards.

My sense from some discussions with fellow standards subcommittee members is that our standards bodies may not even have the right qualifiers to support the ability to report ICD-11. If we anticipate, for instance, an X12 standard coming forward to adopt a new version 7030, let’s just use that as an example, would 7030 then in fact have the qualifier that would be needed? Because it might be ten years that we live with 7030, or something far longer or less. We do not know. I think there are multiple aspects to implementation readiness. I think that we might want to think through some of the phases to make sure that the nuts and bolts to enable the usage, when it is the right time for the usage, are in fact there in the ecosystem. I am not sure from your perspective, Donna, if you have any sense of whether or not we even need to do that. Are we ready with the nuts and bolts for 11?

MS. PICKETT: Similar comments came up at the June hearings last year. Following those hearings, I started a dialogue with X12 to take a look, even based on what WHO had posted as a beta version, at what concerns there might be from an X12 perspective as it relates to structure, syntax, representation, and number of fields. We started a very preliminary discussion. But, again, taking that discussion further without having a more stable version was sort of risky.

I think conversations would likely be taken up again once WHO posts an implementation-ready version of ICD-11 in June. I really do not expect the structure to change that much. However, the issue of clustering and post-coordination, what that means in terms of the number of data fields you need for your claims and adjudication, et cetera, and how you represent that in an X12 837, needs to be looked at.

My recollection is that we had ongoing discussion when trying to move to – well, what turned out to be the 5010. Michelle Williamson, who was part of our office, worked very closely with X12 to try to prepare the standard to accept 10-CM and 10-PCS. That took a whole bunch of years before it actually occurred. Even though there may have been at some level a readiness for the industry to use 10-CM, the fact is you still needed a standard that could actually accept the codes. That process is like two trains running. How do you get all the trains in alignment so that everybody kind of reaches the endpoint at the same time?

MS. KLOSS: That is how we are doing the Predictability Roadmap.

Just to take us back to the overall project on health terminologies and vocabularies: your presentations today, the UMLS update and the ICD-11 deeper dive, kind of completed what the committee had identified as some gaps in the environmental scan process that we were doing, which was launched in June and continued in September. We took a hiatus with our November meeting. We are kind of at the point of having completed our environmental scan findings.

And what we have as an output from that, just to frame it – I am not going to go through this whole slide deck, but it has been an updating process. Our goals for this project were to take a contemporary look at the landscape and its implications for the timing of, and an approach to, health terminology and vocabulary standards adoption – hence this discussion and our pending retirement – and at needs, opportunities, and problems with development, dissemination, maintenance, and adoption. We were coming off of ICD-10 and saying we cannot go through that again. We have to anticipate what is coming forward. I just throw this out to the committee.

Now that we have been through our environmental scan, is this the right set of project goals, the right scope? Our scoping document is a living document. Do we want to raise some questions that are maybe more visionary about what the vocabulary terminology ecosystem should look like in the United States ten years from now? That could almost be a parallel path.

At any rate, I just want to represent this, not to say that we have time in the next 20 minutes to resolve it. As we move along, we can still look broadly while we are dealing with more nuts and bolts questions and more pragmatic questions.

MS. PICKETT: In my quest to move through things very quickly, I did want to note that there is a difference likely in pathways between morbidity and mortality. I specifically focused my remarks on the CM version and the implications for morbidity and the multiple morbidity use cases. But there is a pathway forward for mortality that is separate, but that the committee has also previously been engaged in discussion about. They are not covered by HIPAA. They have different challenges, I think, is a good way to put that, but they still have to be addressed. While some of it is the automated systems that would have to be revised, there are specific mortality issues that also would be part of the discussion. And, again, my presentation really focused on the morbidity aspects, but mortality is an important issue as well.

MS. KLOSS: Ruth, can I ask you to bring up the other —

DR. STEAD: May I make just a comment in response to your question? I think that those are still logical goals for this work. We need to think about what the answers are beginning to look like because the changing landscape – we need to package the environmental scan and then say what that tells us about the changing landscape and then that might guide where we want to place our emphasis as we work through the rest of this year to get to the first report following the environmental scan.

As I listen to both of these sets of presentations this morning, I sense a collision, whatever, between a rapidly changing understanding of biology and our approach to figuring out how to measure useful population health statistics and a rapid change in our health reimbursement that might decrease the need to codify some of the things that the US codifies that other people do not related to our old reimbursement structures.

I emailed a colleague in Nashville to just give me one example of the shift in our understanding of cancer. Right now, most of our accounting around cancer is organ or cell morphology based. What are some easy examples of pathways that we now know are shared and are actually the cause of cancers we used to think were distinct and separate and unrelated to one another? Anaplastic lymphoma kinase, first identified as part of anaplastic large cell lymphoma, is now recognized to be responsible for rare pediatric inflammatory myofibroblastic tumor and non-small-cell lung carcinoma.

How we think about aggregating diseases is flipping as we speak. As I look at this timeline, I think that could be the kind of thing that we might flesh out over the course of a year that might change what we want to do going forward. I do like the idea of locking down the environmental scan. I think it is a useful step forward. I like where we are headed, if that is helpful.

MS. KLOSS: I like the fact that we are going to have help from the NLM in locking down the current environmental scan. We just want to review the project timeline for this collaborative project over the next year or two years and figure out how we work collaboratively on this.

Vivian, do you want me to just scroll and you and Suzie can —

DR. STEAD: Again, the committee knows this, but just to be clear. Jerry and Dr. Brennan agreed to help us because this work so fit with the strategic plan of the NLM. Suzie and Vivian are going to provide staff support for this work going forward. We are immensely grateful. Thank you.

MR. SHEEHAN: And we appreciate the opportunity to say this is a good connection with our work now as we implement our strategic plan and are also trying to get a better sense of the landscape of health data standards and where things are moving. I will turn things over to Vivian.

DR. STEAD: While we are doing that, let me let Dave Ross read himself in.

DR. ROSS: I am sorry. I blame my tardiness this morning on the yellow line. There was a train delay. This is Dave Ross, member of the Full Committee, member of the Population Health Subcommittee and I have no conflicts.

MS. KLOSS: This is based on the NCVHS’ work plan. We have fleshed this out with regard to the support.

Agenda Item: Health Terminologies and Vocabularies

MS. AULD: Essentially, what we are going to do is first look back at the presentations that have taken place since June and do a summary report, so that in one place, in one concise format, you have a listing of all the relevant pieces, making sure that there are not any other pieces that we need to bring in as well.

And then we will also take a look at the previous PowerPoint presentation you were just showing, summarizing the questions from there and starting to look at how we might go about answering them. That will take place within the first quarter of calendar year 2018.

Over the summer, we will use that report as a basis for roundtable discussions to start to address some of those questions that were raised, whether there are other questions that need to be raised, and beginning to look at what recommendations would be needed from the committee to help move this in a useful direction. The intent is that that would take place over the summer.

In the fourth quarter of 2018 and the first quarter of 2019, the Full Committee would work through all those recommendations and finalize those and likely a recommendation will come out of that to the secretary. And then in that same period, we would determine whether additional next steps are going to be needed or whether we are all happy and done and want to move onto something else.

MS. KLOSS: The environmental scan report will be input into this roundtable process where we would invite and engage others who are stakeholders that perhaps have not come before the committee yet such as the AMA. We have had a number of discussions. They are very interested in participating in this discussion as are others. We look at that as a broader meeting.

DR. STEAD: As I think we have now done in a number of things – the work we have done around vitals, the work we are doing with the Predictability Roadmap and Beyond HIPAA – what we are going to try to do as we step back from the environmental scan is have a reasonably narrow focus for the roundtable, so that we ensure that we in fact get a deliverable out of that. The recommendations may not only include recommendations to the Secretary; they could also be things that we should do next, and things the industry should do next. This is in total a very big question that is in front of us.

What we are going to need to do is break it up in a way that really does get the environmental scan out now and then another deliverable before the end of the calendar year. That will then help us collectively decide how to best use resources going forward. We are going to try to be intentional about that.

That also, just so you know, plays into our other things, in that we will now be producing our 13th Report to Congress on HIPAA, and we will be writing that over the holidays next winter. We want this to feed into that.

DR. COHEN: A general observation from the initial question that you asked, Linda, wearing my population hat. I am very jealous of the fact that so much progress has been made with respect to developing unified medical language systems. It just points out to me how little progress has been made developing unified health language systems when we think about population health and all of the indicators that we feel impact individual and community health that are not in the clinical domain.

I think this committee as we do strategic planning moving forward should try to link or just have that concern how these efforts can help fit in or educate or influence any development we want to make in terms – the framework was a great push to get us to think about what are the indicators and what language and what are the variables that we need to look at community health. How can we link this work with developing clinical vocabularies to the need to describe population health? I think there is an enormous amount of work that can be done in that area. I would love for us to down the road think about how to integrate or align those efforts.

MS. AULD: Can I clarify? Is another way of phrasing that topic, social determinants of health?

DR. COHEN: Social determinants are a subset of that. We are talking about transportation, education, environment, influences. The estimates of how much clinical issues impact individual and population health range from 20 to 40 percent. There are another 60 to 80 percent of things that influence population health that we need to get a handle on and integrate with the phenomenal work you have done in developing comprehensive approaches to aligning clinical information.

MS. AULD: Just so you are aware, we are – NLM is getting increasingly tapped. The community that is interested in social determinants of health is becoming very vocal to the point that the IHTSDO, the SNOMED International folks, are recognizing that in the US that is a priority. Because we are one of their biggest members — we are their biggest member. They are promoting that as something more active to look at. We are on the edge of that.

We are in the position that now this is on our radar, but what exactly is needed. Working with you to scope that out would be excellent.

DR. COHEN: That would be a great project that cuts across a variety of the National Committee’s concerns and issues to think strategically about how to do that kind of development and integration.

DR. STEAD: That was square center in the NLM strategic plan drafts.

MR. SHEEHAN: Yes, it is. I was going to say at the entrance to Dr. Brennan’s door is the chart showing determinants of health and percentage that are clinical and social and it is based on some recent research. I know this is foremost in her mind. As Bill was saying, it is something that has come up in the context of strategic planning when we talk about stakeholder engagements and community engagements. It is how to get access to better information that informs health and health decisions related to these different types of measures beyond the clinical domain. The M in UMLS is largely an artifact of 1986 thinking. It has certainly evolved even in terms of its current content.

But I would also see this as being an incredibly valuable effort and a place where we could take some additional guidance as I am sure the committee could. There may be opportunities for collaborating as we have been trying to do in other domains with other US government agencies that collect some of the relevant indicators.

MS. HINES: That would just be huge. The idea that Bruce, you, have just put out, the parallelism to have a parallel UMLS type thing for the world of population health measurement would be profound.

MS. ROY: One of the keys with UMLS is to make sure that they are not parallel, but that the UMLS actually encompasses all of it, rather than having two parallel efforts that are not harmonized. Having that within the UMLS means you then have the linkage to the genomics fields, to the benoit fields, to the drug fields, and all the others.

DR. STEAD: This played into my question where I was trying to get at the denominator. I clearly did not ask it very well, because I think the response would be: can we identify source terminologies of any sort, in any space, that if incorporated in the UMLS would bring those dimensions in, and by doing that, you see all the relationships to all the things. It would be just like what happened when we put in genomics.

DR. PHILLIPS: I want to support this idea of moving from medical to health. The National Academy of Medicine had a report on integrating public health and primary care about eight years ago that actually has a chapter that talks about this need for a shared ontology and the ability to communicate, update, and share between the two. That might be a useful entrance into this.

My other question relates to the capacity of the UMLS. This was like another graduate course for me today. Thank you so much. The ICD classification system largely shredded the capacity to do epidemiologic work in primary care. It did not support for a long time the ability to do decision support based on that epidemiology. I am encouraged to hear about the – I do not know if they are new capacities, but the current capacity is to do roll ups, which might help rebuild that epidemiologic basis. But I am even more interested in UMLS in its capacity to support that in primary care.

My question is – I have a limited understanding of how you are doing this work. But is it being built to some functional capacity? Are there functions that you are building to, that this has to support, like epidemiology by setting?

MS. ROY: I can take that. Essentially, that would not necessarily be in the UMLS, but that ability to roll up or do across field specialization and for population health statistic kind of analyses can be done with SNOMED-CT. SNOMED-CT is a polyhierarchical terminology. By building or by essentially encoding electronic health records with SNOMED and with how it is structured, it has all the children for a particular domain. That is kind of that rollup that you heard with the ICD. It is already built into SNOMED.

Part of the concept model inherently with the polyhierarchical structure allows for that across field analyses where you can drill down to the very specifics or up at a higher level analysis.

Of course depending on the domain, there might be more information within the sub-hierarchies, but that is where we are starting to work with and why we want to continue to work with member countries as well as local groups what areas need to be developed to get all the way down to those specifics. You essentially can go all the way up to higher level classes within SNOMED.

MS. AULD: And because SNOMED is part of the UMLS, because ICD is part of the UMLS, because all these other terminologies are – if everything that you need is not in one of those, you can leverage the UMLS to extract the part that will represent everything you need and create the tools or resources you need to get the type of data that you want.
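The kind of roll-up described above can be sketched with a toy polyhierarchy, where a concept may have more than one parent. Every code and relationship below is invented for illustration; none of it is real SNOMED CT content.

```python
# Toy polyhierarchy (a directed acyclic graph), parent -> children.
# "pneumonia" deliberately has two parents, as SNOMED CT concepts can.
CHILDREN = {
    "disorder_of_lung": ["pneumonia", "lung_neoplasm"],
    "infectious_disease": ["pneumonia"],
    "pneumonia": ["bacterial_pneumonia", "viral_pneumonia"],
    "lung_neoplasm": [],
    "bacterial_pneumonia": [],
    "viral_pneumonia": [],
}

def descendants(code):
    """All codes subsumed by `code`, including the code itself."""
    found = {code}
    for child in CHILDREN.get(code, []):
        found |= descendants(child)
    return found

# Records coded at the most specific level available (invented counts).
record_counts = {"bacterial_pneumonia": 7, "viral_pneumonia": 3, "lung_neoplasm": 2}

def roll_up(code):
    """Aggregate record counts for a code and everything beneath it."""
    return sum(record_counts.get(c, 0) for c in descendants(code))

print(roll_up("pneumonia"))          # 10
print(roll_up("disorder_of_lung"))   # 12
print(roll_up("infectious_disease")) # 10
```

The point of the polyhierarchy is visible in the last two lines: the same specific records roll up along more than one axis (body system and etiology) without being stored twice.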

MS. KLOSS: It is getting to be a new world. I have Vickie.

DR. MAYS: — report that Bob just brought up because I think it is a critical way of looking at how difficult the shift will be in terms of social determinants and some of the things that we have to think about. The National Library of Medicine I think has been focused on the medicine part. It is great to hear that there is a desire to make the shift to do social determinants. Particularly in the area of social determinants, we get far afield of these medical terms.

And the question really has been in terms of what I understand the pushback has been is how to start incorporating things like understanding the non-medical terms and how that is going to interface. The shifting in the public health has been – if we are going to go even more far afield and as Bruce is saying, talk about things like transportation, talk about things like crime and all this other stuff, what is it that we can do to help with that shift to a more public health and then a social determinants perspective from your viewpoint?

MR. SHEEHAN: I will start an answer on that. I think for us maybe we would think of this not so much as a shift in perspective as an expansion and extension of what we can do with things like the UMLS. I think the UMLS probably had in its originators’ minds some particular kinds of applications, and over the 30 years it has now been in place, those have extended. As we said, this was a long-term R&D project, but it also has operational components. It is being used in many ways, as you could see on the slides that Patrick showed – in many ways we could not have anticipated.

We are also in a position where we have many different communities that would like it to do more with their own sorts of data. In the NIH community, there is a lot of interest in data science and big data to support research needs across the spectrum of things NIH invests in. Of course, we invest in some work that would relate to social determinants of health.

I think where NCVHS and other groups could be helpful to us is probably both in helping us understand what are certain priorities from different communities and different uses to which the UMLS could be put and what are some of the source vocabularies and others that may exist or maybe where there is an opportunity to develop new ones. As you say, at some point, we are getting far afield of our own expertise in these and we will need some guidance to help build those bridges between these different communities and trying to figure out both where there are opportunities and some degree of feasibility to make some concrete steps.

I think one of the things I always appreciate about our work at NIH and NLM in particular was have a vision to guide us, but we try to be very pragmatic because we try to build systems that people will come and use and have to work. I think we will need some additional guidance as we think more about these areas as we implement the strategic plan and it is still very early stages of that thinking.

MS. AULD: And one point to clarify, or reiterate a fine line that we have described here today: the UMLS is not a terminology. It is a collection of sources. You would not be coming to NLM saying we need codes within the UMLS. We are figuring out where those codes need to be put. Do they need to be in SNOMED, in LOINC, in RxNorm, in CPT, in ICD? From that perspective, what is needed, and then how can we help facilitate getting them in there? Then they automatically go into the UMLS. What tools can we use to better facilitate pulling them out so that they are useful?

DR. STEAD: One possible avenue: if you take the measurement framework for community health and well-being and the work that is underway with 100 Million Healthier Lives to build out metrics, that could turn into a terminology. It could be curated as a terminology and then handed to the UMLS, where it could be incorporated and all the links worked out. It is a possible way you could begin to get there.

MS. KLOSS: Denise, I think you get last question before we need to move on.

MS. LOVE: Actually, Vickie asked my question and then Bill sort of touched on the other. I have spent two days at Census, and we are working to blend public health data with Census data to get some of these indicators. I am assuming that kind of project could ultimately roll up, and I heard the answer was yes.

MS. AULD: We want this to be useful to everybody and we need guidance on where are the gaps, where are the holes so that we can work with the community with the source producers to fill in the terminology gaps and then also work on improving how the terminologies work together because they each have different unique scopes, which are not going to shift. But they also have places where they have to properly link together and people have to understand how they link together, how to use them together. Guidance on how we can help facilitate that conversation is useful so that it covers all parts of the equation.

MS. KLOSS: I know I am excited about moving forward with our project. In fact, we have a meeting with the Standards Subcommittee co-chairs, beginning at 3 o’clock today and we are launched. Thank you very much.

MS. HINES: Paula, are you on the phone?

MS. BRAUN: Yes, I am.

MS. HINES: Welcome.

MS. BRAUN: Thank you.

DR. STEAD: I will introduce Paula Braun, who is the Entrepreneur-in-Residence at the Division of Vital Statistics at the National Center for Health Statistics. She is going to give us a briefing on Fast Healthcare Interoperability Resources, FHIR. Take it away.

Agenda Item: Briefing on Fast Healthcare Interoperability Resources (FHIR)

MS. BRAUN: Thank you. It is really an honor to be here. Since I am presenting remotely, you will hear me say next slide quite a bit. Let’s go ahead and move to the next slide.

Just to give everybody a sense of the topics that I would like to discuss today, you may be wondering to yourself what in the world is an Entrepreneur-in-Residence. I will talk a little bit about what that program is and the types of projects that we work on, specifically through the lens of the project that I have been asked to work on with CDC.

Then we will shift to a more detailed discussion of FHIR; the bulk of the presentation will actually be in this section. I think to understand FHIR, it really helps if you can get a sense of the historical context of HL7 as an organization, to really tease out some of the characteristics of FHIR that are distinct from some of the other standards that HL7 and other health standards development organizations support. Then we will provide some concrete examples of these more theoretical concepts, both from within vital statistics, which is the area where I work, and across public health.

Finally, we would like to spend a little bit of time where we can talk about a framework and give you some intuitions on where the standards and health IT community is heading so that way you can have a more nuanced understanding and at least have some considerations directly on the top of your brain as you think about these issues and as you think about the questions that you might want to ask. And then we will end with a question and answer period.

The Entrepreneurs-in-Residence Program is a way to bring outside talent into CDC. It fits within the office of the chief technology officer at the Department of Health and Human Services. As one former chief technology officer proudly said, the EIR Program is a way to bring outside people into the government, be paid by the government, to cause a ruckus within the government.

I am a little bit unique from an EIR perspective in that I actually had substantial professional experience working with the federal government before I became an EIR. I actually started my career almost 13 years ago here at CDC as a presidential management fellow. I had done other things. I had gone off to work for the Government Accountability Office, where I audited wartime projects in Iraq and Afghanistan, and then later went back to school to really study this fairly new and emerging field of data science and analytics.

When a former colleague of mine said we would like for you to come back to CDC, I thought, I am working at a predictive analytics firm and I am teaching public health. You want me to come back inside of a bureaucracy? He said yes. I gave it some consideration. I really saw this as an opportunity because the EIR Program, the way it was set up, was more like a Silicon Valley type organization. But it was not just this concept of: we are going to take this techie loose cannon and put him in the federal government. It was actually very different.

A federal government entity within the Department of Health and Human Services like the CDC would say, these are the challenges that we are working on. We would like somebody to come in and help provide a fresh way of thinking about things. That really appealed to me. The opportunity to work very closely with the program people here at CDC, with the states and other people that we are trying to influence as well as some of the brightest minds in health technology.

There were synergies that were happening at that time that introduced me to FHIR. Mark Scrimshire, over at the Centers for Medicare and Medicaid Services, was in the process of FHIR-enabling the claims data. Adam Culbertson, who was a fellow working with the Office of the National Coordinator along with HIMSS, was thinking about ways to link patients together using something like FHIR. That was my baptism by FHIR, no pun intended, into this whole concept of FHIR.

The project that I was brought on to work on was this idea of how we modernize mortality data systems across the United States. I know that you have had several briefings on the National Vital Statistics System, so I am not going to belabor the issues here. But this was really my first step as an EIR: to think about how the data actually flow to CDC. It became apparent to me that there was really this complex ecosystem, this system of systems. In order for us at the National Center for Health Statistics to aggregate data from all these different death certificates from across the country, we were deeply dependent upon the work that the registrars performed at the state level. You see that in the middle of the picture with the red star and EDRS.

And ultimately all of us were dependent upon the people that figure out what were the cause and the manner of death. That is where you see the physicians, the medical examiners and coroners as well as the funeral home directors that were working with the families to collect all the demographic information.

When I first thought about this, I thought, how do all of these systems come together? I was imagining the Mouse Trap game that I played as a kid, where it had a seemingly simple purpose, which was to trap a mouse, but you had to have a guy jump in a barrel and then something would fall down. I could not even believe that all of this could come together.

But what I learned very quickly was that, because of serious dedication really at the state level, they had made tremendous progress on very little resources, investing in these electronic death registration systems. That had helped with a lot of the timeliness that we were seeing. The number of records that could get to the CDC within ten days or less just continued to increase. Because we had that momentum working in our favor, there was really an opportunity to think about how we engage particularly the medical examiners and coroners, along with the physicians, the ones who are making those important cause and manner of death determinations and summarizing that on the death certificates. How can we better integrate with what they are doing, as well as with the end users of mortality data? How can we offer data up to them in a way that also fits within their workflows?

After I mapped it all out and got to talk to a lot of people, I came up with this working definition of interoperability and what would interoperability look like within the context of mortality reporting. While this is specific to mortality, you could probably think about making a few changes here and there and thinking at a more abstract level to other aspects of public health or health care exchange.

In order for us to achieve our mission, we really had to come up with ways that we could provide more specific and more up-to-date information to the people who needed it. They had to trust the value of this system in order for us to get the investments that would be needed to allow the systems to continue to evolve. We had to do it in some very coordinated and consistent and secure way, because currently there is lots of variability in how this is accomplished. We had to do it in some way where, if you had made an enhancement in one system, in one part of the country, you would have some level of assurance that you could apply similar techniques and modernization in other parts of the country. It had to fit within the workflow of the people that we were asking to provide data to us, as well as the people who were requesting data on the other end. Fundamentally, out of all of this, we had to create more value than what it cost to make these types of upgrades.

Because FHIR had this way of making data accessible in an agreed-upon format using modern technologies, the same technologies that underpin the Internet, I really saw the potential for us. If we were able to use it to bolster the centuries-old tradition of public health surveillance that dates back to the bills of mortality, then maybe what I was learning within this limited project around mortality reporting could be applied to other aspects of public health, human services, and health care.
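As a rough illustration of that agreed-upon, web-native format: FHIR represents clinical data as discrete resources, typically exchanged as JSON over ordinary HTTP (for example, a GET request to a server URL like `https://example-server/Patient/123`). The minimal Patient resource below is hand-built for this sketch; the field values and server URL are invented, and real resources carry many more elements.

```python
import json

# A minimal, hypothetical FHIR Patient resource as it might come back
# from an HTTP request to a FHIR server (values invented).
patient_json = """
{
  "resourceType": "Patient",
  "id": "123",
  "name": [{"family": "Smith", "given": ["John"]}],
  "birthDate": "1950-01-01"
}
"""

# Because it is plain JSON, any ordinary web toolchain can consume it.
patient = json.loads(patient_json)
print(patient["resourceType"])       # Patient
print(patient["name"][0]["family"])  # Smith
```

The design point is that nothing health-specific is needed to move the data around; the health-specific part is the agreed structure (resource types, element names) layered on top of commodity web technology.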

Now, I am going to shift into really the meat and potatoes of this discussion and try to help tease out some of these issues around what FHIR is. FHIR is a community within Health Level Seven, HL7, a standards development organization. HL7’s mission in a nutshell is to help people get the data, make sense of it, and do more valuable things with it.

Health Level Seven has been around for a little over 30 years; they just celebrated their 30th anniversary last year. If we think about the original standards that were developed back in 1987, it helps to think about the way health care was being practiced at that time and the types of systems that existed. Even within a hospital, you would have a different system that would run your lab orders, for example. You would have a different system that nurses might use. The pharmacy might be another part of your hospital using a different system. Just the way that computer systems were designed at that time, you would really pull together the best in class for a specific purpose, and they had not really been designed to think about how to exchange information with one another.

The V2 standards that emerged at that time were really more akin to a telegram message: tell Dr. Jones that Mr. Smith's lab result was positive. It was discrete. It was sent, and where it worked, it was heavily adopted. Over time, starting around the mid '90s, there was a stretch where people began to say that there were so many semantic inconsistencies with how information is captured and maintained within health care.
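
That telegram style can be sketched in a few lines of code. The message content below is invented for illustration and greatly simplified relative to a real V2 feed, which carries many more segments and fields:

```python
# A minimal sketch of a 1980s-era HL7 v2 "telegram" and how it is parsed.
# Segments are lines; fields within a segment are separated by pipes.

def parse_v2_message(message: str) -> dict:
    """Split a pipe-delimited HL7 v2 message into {segment_id: fields}."""
    segments = {}
    for line in message.strip().split("\n"):
        fields = line.split("|")
        segments[fields[0]] = fields[1:]
    return segments

# "Tell Dr. Jones that Mr. Smith's lab result was positive."
raw = (
    "MSH|^~\\&|LAB|HOSP|EHR|HOSP|202301010800||ORU^R01|00001|P|2.3\n"
    "PID|1||12345||SMITH^JOHN\n"
    "OBX|1|ST|HIV^HIV Screen||POSITIVE"
)

msg = parse_v2_message(raw)
print(msg["PID"][4])   # SMITH^JOHN
print(msg["OBX"][4])   # POSITIVE
```

The simplicity is exactly the point the speaker makes: where this style fit the workflow, it was widely adopted, but nothing in the format itself guarantees that two hospitals mean the same thing by the same field.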

In order for us to really do this meaningful exchange of information, we have to have some level of assurance that the data are comparable, that we are doing this apples-to-apples comparison. There was this concept of a reference information model, where they would try to have an almost perfect representation that everybody could point to, so that everybody could use it and exchange information.

What happened there is that the types of assumptions and decisions that were made in trying to come up with that implementation were not always thoroughly tested in the real world at the earliest points of making those assumptions. As you can imagine, the further you go down the line to try to implement something, you just learn things along the way that if you had involved those implementers from the beginning, you might have made a different decision.

We began to see the Version 3 way of doing things emerge in the early to mid-2000s with the clinical document architecture way of exchanging information. And the way you want to think about a CDA is that, at that time, information was largely exchanged via fax machine. There was a desire to move from that paper-based, fax-heavy way to something that was more nimble.

The proposed idea there was to use the V3 and the RIM and everything that had been developed to create these secure emails that doctors could send to each other, especially when you had chronic disease patients, for example, who may be cared for by eight or ten or more physicians who all had to know what the others had done so that nobody fell through the cracks.

That was around the same time that a lot of things were happening in the regulatory environment. Because the CDA was the best available standard at the time, that is a lot of what had been named in regulation, the same thing with the V2 standards that had been in place for several years up to that point.

And then fast forward to about 2011, I guess about five to seven years ago at this point, which is when HL7 really decided to take a fresh look at its practices. This way of thinking happened around the same time that the Department of Health and Human Services had commissioned JASON, which is a group of advisors to the government that usually opines on things related to national security.

There were just a number of different entities that were all looking at health IT at the same time, and they basically came to the same conclusion about the way health care information was stored and structured across the United States. There were a number of challenges, but probably the most fundamental challenge that they were all trying to address was how do we get these disconnected systems to act more like centralized systems that could add additional functionality that had not been envisioned at the time the system was being developed. That is really where the FHIR group emerged within HL7.

It was this idea that by trying to represent everything in a standard and by not always doing the best job of involving implementers from the beginning, they created ways of exchanging information that were very complicated to implement and then people would implement them differently. You did not achieve the economies of scale that you would want to be able to achieve with electronic systems.

With all of that history of HL7 – it was quite a mouthful that I just presented to you there – you may be wondering: what actually is FHIR? I like to think of FHIR as three separate things that happen concurrently. At its core, FHIR is a community, initially of the early adopters, but it is becoming more and more common that people are adopting FHIR as an approach. There is this whole idea that you could take these multi-stakeholder teams that have proprietary concerns, as well as at times countervailing demands for what they want to do with the data, and you could get them together and get them to agree on what the priorities are going to be, how they are going to coordinate their work, and most importantly how they are going to use the modern technologies that are available and that have caused so much innovation in so many other industries. How can they use that for the exchange of health data? Also, a key aspect of this is how they can open up health data for secondary use. That is the community level of what FHIR does.

It is actually pretty sophisticated when you really pay attention to how they have leveraged the existing HL7 standards development work and turned it into this series of sprints and these design and test and build and learn feedback loops. They just iteratively build the standard over time.

The next piece of FHIR is this idea of how we are actually going to represent the data. That is the data models component of FHIR. You have to serve up data in a consistent way that people can look at and figure out very quickly whether it is going to meet their need or not. There is this concept of reuse by multiple parties and of breaking very complex clinical concepts down into small reusable chunks that can be perpetually recombined and remixed to do something very simple, such as grabbing a lab result, or something much more complex, such as building the electronic health record system for an entire country. The same building blocks would be set up to do that.
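
The "small reusable chunks" idea can be sketched with a couple of resources. The field values here are invented, and a real FHIR server enforces much richer structure, but the same Patient and Observation blocks serve both a simple lab lookup and a larger composition:

```python
# Two minimal FHIR resources, represented as Python dicts mirroring the JSON.
patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Smart", "given": ["Joe"]}],
    "birthDate": "1950-04-12",
}

lab_result = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"text": "Hemoglobin A1c"},
    "subject": {"reference": "Patient/example"},  # points back at the Patient
    "valueQuantity": {"value": 6.2, "unit": "%"},
}

def bundle(*resources):
    """Recombine individual resources into a FHIR Bundle."""
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [{"resource": r} for r in resources],
    }

doc = bundle(patient, lab_result)
print(len(doc["entry"]))  # 2
```

The same two building blocks could just as well be entries in a much larger document; nothing about them changes when they are reused.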

That actually goes hand in hand with the third point I want to make about FHIR’s set of modern, Internet-based technologies that lets you do things that could not easily be done at scale in health care before.

Let's talk about how FHIR is a little bit different from some of the other standards in the sense of what has emerged in the marketplace after FHIR became an option. One of the first things that people like to point to is that because we have FHIR, we are now able to have an app store for health care. And the SMART project, which is the project out of Harvard, is about Substitutable Medical – I forget what the R stands for. The idea is that it works just like your Smartphone. Ostensibly, the purpose of your phone is to be able to make calls. But because the cell phone companies make data such as your location, your acceleration, and all the other sensors that are embedded into your phone available in a consistent way, everybody can design for the Apple app store. Everybody can design for the Android app store. It is able to open up these new environments.

The folks over at Harvard had tried doing this with the previous versions of the technology that we just talked about. They ran into a number of barriers, particularly around scalability. FHIR was the first really viable standardized way of serving up health information that allowed this whole ecosystem to emerge.

And the other piece that is really important is that, through the process of developing this SMART app store for these applications, they thought about how you do the security components of it. They used the modern architectural components. OAuth is just a way of asking, are you actually authorized to access the data that you are requesting to access? The same thing with OpenID: are you actually who you say you are?
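
The first leg of that flow can be sketched as the redirect an app constructs to ask the EHR's authorization server for access. The endpoint URL, client ID, and scopes below are placeholders for illustration, not any real deployment:

```python
# A sketch of the OAuth 2.0 authorization-code request that a SMART-style
# app would send the user to. All names and URLs here are hypothetical.
from urllib.parse import urlencode

def build_authorize_url(auth_endpoint, client_id, redirect_uri, scope, state):
    """Construct the OAuth 2.0 authorization-code request URL."""
    params = {
        "response_type": "code",   # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,            # e.g. patient-level read access
        "state": state,            # anti-forgery token the app checks later
        "aud": "https://ehr.example.org/fhir",  # which FHIR server the token is for
    }
    return auth_endpoint + "?" + urlencode(params)

url = build_authorize_url(
    "https://ehr.example.org/oauth/authorize",
    "growth-chart-app",
    "https://app.example.org/callback",
    "launch/patient patient/Observation.read openid",
    "abc123",
)
```

If the user approves, the server redirects back with a code the app exchanges for an access token; only then can it call the FHIR endpoints.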

What they were then able to do was to take the existing electronic health record systems out there and begin to build new interfaces that were more meaningful. What I mean by interfaces – you can think of the example of pediatricians. They have a specialized patient population. They have to do things all the time such as plotting a growth chart for every single one of their patients. There aren't many products out there that were really designed to do that. It was the same manual process that they were doing over and over again.

The folks over at Harvard and at SMART had developed a really state-of-the-art application that would do that for the pediatricians. The amazing thing about what they did is that it could run across any of the electronic medical record products that had adhered to the FHIR protocol and that were using the SMART APIs, this way of serving up the data.

The next point I want to make is similar to what I had been saying about SMART. It is this focus on this nimble way of accessing information through the modern web. It is not just about grabbing data out of an electronic health record and showing it on a pretty app. It is actually much more complicated in that sense when you think about trying to connect different parts that are needed to help somebody make a decision.

The Internet is the most efficient and effective way of exchanging information that we have. If we could use the same architectural principles and the same technological underpinnings that allow the Internet to flourish then we can begin to take advantage of the multiple opportunities that are available.

Let me give you an example. One of the SMART apps was developed by, I believe, Partners HealthCare. They wanted to help their physicians do a better job of determining what vaccinations were needed. They wanted to tap into the rules engine that the American Immunization Registry Association makes available. And because that organization used the modern technology stack and made it available on the web, this application could run within the electronic health record. It could see what immunizations had already been administered to that patient. It could send those data up to the rules engine, and it would feed back the recommendations for additional vaccinations for that individual. It allowed the physicians to do a better job of vaccine forecasting.
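
The loop described here can be sketched with a toy rule table. A real forecasting engine encodes the full clinical schedules, and the resource fields and dose counts below are simplified placeholders:

```python
# A toy sketch of vaccine forecasting: pull Immunization resources,
# compare against a rules table, and return what is still due.
# The schedule here is invented for illustration.

SCHEDULE = {"MMR": 2, "HepB": 3}  # vaccine -> total doses recommended

def forecast(immunizations):
    """Count administered doses per vaccine and report doses still due."""
    given = {}
    for imm in immunizations:
        vaccine = imm["vaccineCode"]["text"]
        given[vaccine] = given.get(vaccine, 0) + 1
    return {v: n - given.get(v, 0)
            for v, n in SCHEDULE.items() if n > given.get(v, 0)}

history = [
    {"resourceType": "Immunization", "vaccineCode": {"text": "MMR"}},
    {"resourceType": "Immunization", "vaccineCode": {"text": "HepB"}},
]
print(forecast(history))  # {'MMR': 1, 'HepB': 2}
```

The point of the example in the talk is architectural: because the rules live behind a web service, the app never has to embed or maintain the schedule itself.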

If you can imagine that across any other type of clinical decision support that you would want to be able to make available to clinicians at the point of care, we begin to think about how we adhere to these modern technologies. Then it just becomes more like an exercise in LEGOs. If we have a level of assurance that the pieces are going to fit together, we can allow people to open their imagination and build whatever they think is a possibility, or we can start to develop certain recipes – certain things that we want to have built – and provide kits to make it easier for them to build them. Then you can really begin to get more out of the federal investment and state investment and other investments that have been made in technology.

The third piece of this that I think is incredibly important and somewhat distinct from the other standards development work that has happened up to this point is really this coalescence among these broad stakeholder groups. The first of those is the Argonaut Project. If you remember, I spoke previously about the JASON report that came out saying that you should use these APIs for health care, that you should use these more modern approaches for exchanging data within health care. The folks at HL7 apparently have a little bit of a sense of humor: let's put Jason in the Argonauts. What that was was an original collection of health systems such as Intermountain and electronic medical record vendors such as Cerner – maybe about a dozen of them or so – that all came together and said, yes, HL7, we like this concept of a fresh look. We will work together to come up with what the specification could look like – something that is actually implementable.

And even more than that, they said, we are going to use our own funds to fund it. It is this public/private partnership of industry articulating what their priorities are and government and other nonprofit entities articulating what their priorities are. And everybody is saying, let's look at the common 80 percent of data that we already exchange. How can we do that well so that more people can benefit from it?

We can think about our investments in technology like putting pennies in a piggy bank, where for each one that you put in, you only get one value stream out. Or we can begin to think about our investments in technologies as the possibility of having compounded interest. If we work together in a more synergistic way, you can put much smaller investments in that could potentially lead to much larger outputs and additional value streams. This first group of Argonauts really saw that potential and dedicated their time to fleshing out what the specs would look like. They also dedicated their time to implementing this SMART on FHIR application programming interface.

Another one that is emerging is called the Da Vinci Project. And they are so recent that I do not know their entire origin story. What I do know about them is that they are looking at value-based care and the ways it can be achieved using the FHIR standards.

As I mentioned before, we saw a lot of uptake of the clinical document architectures, and to some extent the V2 standards, because of federal regulation and other things that had been enacted. While FHIR has not been named in any specific statute or regulation that I am aware of yet, recent legislation has called for a FHIR-like approach to making health care data accessible.

And probably the best example of that is the 21st Century Cures Act, which requires the electronic medical record vendors to make data available without special effort. It did specifically call for the use of application programming interfaces, which is just that Internet-based way of exchanging health information.

Similarly, the meaningful use Stage 3 regulations require that patients have the ability to access their data through APIs. Again, they do not specify the difference between a standards-based API like FHIR versus a proprietary API that a vendor product may want to make available. But this concept of allowing more nimble, more granular access to data – it is important that it is specifically named in these constructs.

What is interesting about FHIR, I think largely because it grew up in and came out of the larger Health Level Seven movement, is that it had backward compatibility in mind from the beginning. While it advocates for this API way of exchanging data, it also allows for and makes room for the older ways – the messages and the documents.
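
What the API way adds on top of those older messages and documents is that each resource lives at a predictable URL, so granular reads and searches become ordinary web requests. A minimal sketch, assuming a hypothetical server base URL:

```python
# A sketch of FHIR's RESTful addressing: read a single resource by ID,
# or search with query parameters. The base URL is a placeholder.

BASE = "https://fhir.example.org/r4"

def resource_url(resource_type, resource_id=None, **search):
    """Build the REST URL for a FHIR read or search interaction."""
    if resource_id is not None:
        return f"{BASE}/{resource_type}/{resource_id}"        # read
    query = "&".join(f"{k}={v}" for k, v in search.items())
    return f"{BASE}/{resource_type}?{query}"                   # search

print(resource_url("Patient", "123"))
print(resource_url("Observation", subject="Patient/123", code="4548-4"))
```

An app issues an ordinary HTTP GET against these URLs with its access token; the same mechanism works whether it wants one lab result or a whole history.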

One of the most exciting things outside of the API support and backward compatibility component of FHIR is that it allows you to tap into those web services that we had talked about earlier, those trusted bodies of knowledge.

You can think of that really in two different streams. You can think of it as a trusted entity, such as public health, defining guidelines or rules that they would like for physicians or individuals to follow and putting them out there in a way that apps can tap into, providing information right when somebody is about to make a decision, right when they are likely to change their behavior. We have that level of precision potentially available.

The web services are also potentially useful for aggregating data in real time and building more of an empirical body of knowledge, kind of the way that Google gets better every time you type in a search query. It learns over time what people are using based on real data that is fed back to it. The modern web services approach allows you to also integrate those more analytics-driven, machine-learning-driven, deep-learning ideas into health.

We have spent a good chunk of time talking about FHIR. Now, we are going to spend a little more time looking at how I have used FHIR both in the mortality space and how I have connected people who know about FHIR with other public health partners and what we have learned from those experiences.

One of the first projects I was able to initiate on my own at CDC came from this idea that most physicians do not complete death certificates on a regular basis. Some physicians find it very difficult to complete the death certificate for several reasons. One, they may not know anything about the patient. Or, two, they may have just spent the last 24 hours trying to save this patient and that individual died. It elicits the idea that I may have done something wrong. I tried to understand what the human component of filling out a death certificate would be like, and whether there are ways that we could use technology, in this case the FHIR technology, to provide a tool that could give them the information they needed right at that moment, so they could make a more informed decision about what to put on the death certificate, and that would simplify the process of sending it to the state. Instead of having to break their workflow and sign into a new system, they could just use the tools that they were already using every single day.

Could we do it in a way that others could potentially tap into? For example, the National Center for Health Statistics has a vested interest in improving the quality of information on death certificates. We developed a validation and edits web service that we could potentially pull into this tool and make available. The same thing applies if you are trying to take health record data and present it to physicians in a reliable way. It is helpful if you can translate it to a common way of representing information so you know what to show. For that, we would tap into the UMLS web service.

We wanted to make sure that this tool was not just a replication of a paper form on an electronic screen. We really wanted something that could help people separate the signal from the noise and help further their understanding of etiological disease progression so that way they could have what they need at the moment that they needed it to complete the death certificate.

The other piece of this is could we make this bidirectional. Could we think about after the data had been aggregated and we wanted to send the coded information back to the care provider, the hospital team, would it be possible to do that? Those ICD-10 codes can be incredibly valuable if you want to better understand your population.

These were the goals. I was very fortunate to get to work with Michelle Williamson and Hetty Khan from NCHS who have done a great deal of work in the standards development process. This first part was actually fairly straightforward. We just took the death certificate and said which of these elements map to existing FHIR resources. And then the next piece of it is developing that clinical decision support tool, that piece that is going to grab the data from the electronic health record system and put it on a timeline so that way physicians would not have to say I do not know anything about this patient. They could tell something about the patient’s health history.
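
That first mapping step can be sketched as a lookup table from certificate items to FHIR resource types and elements. The subset below is illustrative only, not the actual NCHS mapping:

```python
# A sketch of mapping death certificate elements to existing FHIR resources.
# The field names and (resource, element) pairs are invented for illustration.

CERTIFICATE_TO_FHIR = {
    "decedent_name": ("Patient", "name"),
    "date_of_death": ("Patient", "deceasedDateTime"),
    "cause_of_death": ("Condition", "code"),
    "interval_onset_to_death": ("Observation", "valueQuantity"),
    "certifier": ("Practitioner", "name"),
}

def fhir_targets(certificate_fields):
    """Return the FHIR (resource, element) pair for each certificate field."""
    return {f: CERTIFICATE_TO_FHIR[f]
            for f in certificate_fields if f in CERTIFICATE_TO_FHIR}

targets = fhir_targets(["decedent_name", "cause_of_death"])
```

The payoff of reusing existing resources rather than inventing new structures is that the same Patient and Condition records an EHR already holds can pre-populate the certificate.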

And then the final piece of it which is by far the most challenging piece of it and that we are still working on was this concept of analytics and how can we develop a meaningful machine learning algorithm that would help the physicians determine the sequence of events that led to death for that particular patient.

This is just a representation of what that mapping looked like. I am assuming that many people here may have seen a death certificate before. But you have the pieces of it about the patient at the top. You have a great deal of medical information in the middle. You see that in blue. We just spent a little time saying how does FHIR represent these concepts and what pieces of that can we just reuse and not reinvent the wheel.

This is a rough prototype of what it looks like. What is nice about it is that SMART on FHIR gives you these launch parameters, meaning you tell it what patient you want clinical decision support made about and it spills out the information. Right at the top, we see that this is Joe Smart. We have the patient's ID. We know how old he is. We have the ability to go back and forth to other parts of the death certificate. We have the ability to go back into the electronic medical record if we want more information. But we can just hover over that timeline and really get a sense of the temporality of the diseases or events that unfolded in the time preceding this patient's death. You can zoom in very closely on the moments right before death or out to further back in the patient's health record.

The orange-yellow lines that you see underneath that are actually the much more complicated piece of that in thinking through how we develop the analytics to be able to make that type of informed recommendation. Very similar to when you are shopping on Amazon, it will tell you customers like you also preferred X, Y, and Z. That type of association analysis. But it is a bit more complicated when you have to factor in the timing of these things and when you have to factor in the complexities of medicine. We are really just at the beginning stages of that aspect of it. But the timeline piece of it and the ability to complete the form within the electronic health records are things that we are ready to start testing now.

This is the current state of this project. I cannot take credit for the title Death Worm. I thought it was pretty clever. But it was a registrar in the State of New Hampshire that came up with it. He came from a background of being a funeral director. He thought that we asked physicians to do too much with death certificates and that you needed to have a worm-like thing that could just crawl through the records and give you real-time results that would help you make better decisions. I have to thank Steve Wurtz of New Hampshire for the title.

As you can see here, the mappings are pretty much done. The next piece of it is the FHIR process called profiling, where you bring a number of stakeholders together and come to some understanding of how we are actually going to represent these data. How are we going to constrain them for the vital statistics use case? We are excited about that because we are testing it at connect-a-thons this January, next week, at the Cleveland IHE Connect-a-thon. Our state partner, Utah, which is going to be doing some of this other interoperability work there, asked us: can we use the code that you have developed for the Death Worm to also try it out? We said of course. It is open source. Anybody can use it.
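
Profiling, in this sense, can be sketched as constraining a base resource and checking instances against the constraints. The required-element list below is invented for illustration; a real profile is expressed as a FHIR StructureDefinition with far richer rules:

```python
# A toy sketch of what a FHIR profile does: the stakeholders agree on
# which elements of a base resource are mandatory for their use case,
# and instances are validated against that agreement.

VITAL_RECORDS_PATIENT_PROFILE = {
    "required": ["name", "birthDate", "deceasedDateTime"],  # hypothetical
}

def validate(resource, profile):
    """Return the list of required elements missing from the resource."""
    return [e for e in profile["required"] if e not in resource]

candidate = {"resourceType": "Patient", "name": [{"family": "Smart"}]}
print(validate(candidate, VITAL_RECORDS_PATIENT_PROFILE))
# ['birthDate', 'deceasedDateTime']
```

The value of doing this with stakeholders in the room is that the constraints encode decisions that would otherwise be rediscovered, differently, by every implementer.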

One of the things – I mentioned that there is broad adoption of these standards among certain types of groups.

The other thing that is really important and has emerged very recently is that the major electronic health record vendors – Epic and Cerner are two of the ones that come to mind – have developed their own galleries for hosting these applications. The reason why that is important is that they are beginning to recognize that they are not going to be able to cater to all of their customers' demands. They have used FHIR and the SMART on FHIR protocol and other access mechanisms to say: if you want to build a tool that you think meets a real need, here is how you can do it in the Epic environment. Here is how you can do it in the Cerner environment. Here is how we are implementing FHIR.

We were working with a team of students at the Georgia Institute of Technology to develop this application. They built it right out of the box using the Cerner app environment. That one just happened to be really well documented, so it was easy for them to do. We developed it against the Cerner box and now we are going to test it against the Epic box in January 2018. Never before was it as simple to build a tool once and run it across these various environments. As I mentioned earlier, the initial work that we have done around the analytics is very promising, but it is going to require additional investments.

At this point, when I originally started to learn about FHIR and I saw the opportunities within the realm of vital statistics, I thought to myself in order for this to be truly transformational, it would have to work for other parts of public health. Within the startup community, there is this idea of you just take a dandelion and you blow on it and you see where the seeds fall. You figure out whether the conditions were ripe for that dandelion to flourish and continue to grow.

Through a series of serendipitous events, I was able to meet Dr. Mark Braunstein from the Georgia Institute of Technology. He is a real expert on this FHIR standard. I learned a lot about FHIR through him. He teaches a class through Georgia Tech's online Master of Science in Computer Science. It is entirely scalable in the sense that there are literally people all over the world who meet the entry requirements for a Master of Science in Computer Science from Georgia Tech and who take their classes entirely online.

And the way Dr. Braunstein teaches this course, it is more like a practicum than a college course. He was looking to pair his students with project mentors in the real world and just try to do things that had honestly never been done before.

I have been the main person at CDC connecting people around public health to get them exposed to this FHIR standard and to see what types of projects they were interested in working on. How does this concept of standards fit within the respective missions that they are trying to achieve?

At this point, we have successfully mentored more than 30 of these projects over the last two years. I guess we are going on two and a half years now. The projects tend to cluster around three main ideas. The first one is how do we have more timely and automated reporting to public health. As much as we would love to believe that physicians will always report everything that is reportable as soon as they see it, the reality is unless they see something that really scares the heck out of them, it might fall lower in their list of priorities of things to do that day.

We paired people who were looking at ways to use this technology to help with reporting for birth certificates, case reporting for sexually transmitted diseases, reporting to cancer registries, initial case reporting for other infectious diseases. I think we are up to about 10 or 15 of those types of projects.

The next piece of this was a set of projects that really looked at this idea of how we can inform decision making in the clinical environment as well as provide patients the information that they need to make better informed health decisions. That would be more your traditional clinical decision support tools, kind of like the Death Worm project I just talked about, as well as how we get public health guidance into the hands of clinicians as they are making these decisions. There is a great deal of interest in things like how you identify expecting mothers who are at risk for Group B strep so that they can make sure they get the appropriate treatment protocol. How can we detect and prevent falls in the elderly? Are there ways that we can provide better support to young mothers who are breastfeeding? Are there ways that pediatricians can better refer patients who are at risk for unhealthy weight to resources in their community to make sure they are getting the appropriate level of nutrition?

The way we have approached this is to say: imagine that the information could just flow wherever you want it to go. How would you want to design that system? Who are the people that need to be involved, and what pieces of information do they need to do it? It has been really fun because it has had a multiplier effect around the agency. People who are ingrained in doing their work day in and day out take a moment where they pause and write up a specification of the change that they seek to make, and then they get paired with a team of students who, over the course of a semester, make something that they can touch and see and feel. It becomes real to them. They can begin to understand some of the assumptions that they had been making.

The third type of project that we have been working on with this Georgia Tech class is what I like to call the FHIR-centered health department. I would say that our colleagues in Utah are really setting a tremendous example of this. What they have done is they have taken various health systems that they know they want to connect within the health department and even sources outside of the health department.

For example, they had a master patient index that let them do things like link their birth registry with the Medicaid registry and with the death certificate data. But there were other parts of the health department that wanted it, particularly the infectious disease folks. By working with the students at Georgia Tech who understood FHIR, they were able to develop a standards-based way of doing that and again quickly learn how these tools and techniques can help support their practices.

When Dr. Braunstein tried to teach this course four years ago, before FHIR was really at this level of maturity, he had very little interest because computer science students want to be able to apply their skills across a broad array of domains. When you ask them to learn something that is domain specific, there are certain opportunity costs involved. Now that he has opened it up to FHIR, and it is something that these tech-savvy students can learn very quickly just through the documentation that is available online and the lectures that he also makes available online, you have people with hard-to-acquire technical expertise. You have created the conditions where it is easy for them to learn our domain and to start working together with us to transform the way that we do the things that we do.

Before we go into the Q&A, I am just going to offer some of the ways that I have thought about FHIR over my tenure as an Entrepreneur-in-Residence. After that, we will open it up to questions from the committee.

I think the most important thing, whether we are going to use a FHIR-based approach or some of the other standards that we talked about earlier, is that we are obviously not going to be able to make it perfect. There are ways that we can make it better. We want to have a sense of what the clear priorities are. Where is it worth the effort to put people in a room and get them to agree on something they may not have agreed to do previously?

And then the next idea is that we have to use technologies that people are using and that the best way of measuring whether something works or not is who else is using it. I think in the case of FHIR, it has a built-in ability to evolve because it is using that modern tech stack that is fueling innovation in other areas of the economy.

What is fun about these approaches is that some of them make it very easy to test and see what works. Again, regardless of what type of standards end up getting adopted, think about achieving a really complex goal through a series of sprints toward smaller, more achievable components. We have to design this within the tools that currently exist. It is fun to design for the future. But if we are thinking about implementation, how do we tease apart what is real and what is theoretical? And if there is something that is only theoretical, how can we advance it more quickly to make it real?

And then this idea of involving the implementers early and often. We cannot hear enough from our public health partners how valuable it was to be working side by side with the students as they are building these apps because they begin to see some assumptions that they have made or they begin to see some decisions that they were going to have to make that they did not even know needed to be made. That synergy of working together is really important.

Thinking about how can we provide tools that can perhaps even run automatically in the background so that way they do not have to do anything special to make it work. Something just as simple as pre-populating a third of the information on the form sends a psychological effect of this is not as much of a hurdle I need to overcome or making it happen seamlessly in the background. There is a lot of potential there.

When we think about the cost of these technologies, I really like to think of it in terms of three separate ideas. We have the price, which is the check that you write and what you pay. Then you have all the different types of costs and opportunity costs. What does it cost us now to do it the way that we are doing it?

And then there is this concept of value. I would like the committee to consider that even if the price is low, if the perceived value is less than the price then there is a mismatch. We have a potential sustainability challenge.

And obviously adopt best practices for privacy and security.

I think the next slide is pretty obvious, the types of questions that I think are important for me as I went about considering. I will stop there to make sure that we have at least some time for Q&A.

DR. STEAD: Thank you very much. Bob, is your card up?

MS. HINES: This is Vickie. Can I ask a question?

DR. STEAD: Yes, Vickie.

DR. MAYS: Thank you. First of all, incredible, awesome, wonderful presentation. It really does make for an incredible leap in the way in which the data can be pulled together in terms of the mortality records. I like that you were thinking about the people component part.

I have a couple of questions. In this approach, is there any way that you can make sure to attend to the quality aspect of the data in the sense of making sure that race/ethnicity is right, making sure that age, gender, things that sometimes are not necessarily thought about particularly the race/ethnicity. We end up with a birth certificate that has one race/ethnicity and then a death certificate that has another. If those kinds of things could be attended to that would be great.

And then the second would be can you see the potential for also moving this process to the National Violent Death Reporting System as well in terms of helping them with the narratives, being able to standardize the narratives that are done, trying to find out more about the cause of death.

MS. BRAUN: Great questions. Thank you. I think I heard your name was Vickie. The quality issue similar to the semantic interoperability issue persists. FHIR is not a magic wand that fixes that. What I am learning and what I am gaining in appreciation for is that as – people do not like to just put information into a database that goes nowhere. As we ask people to provide information to us, if they in turn are getting something valuable back from that system then there is a hypothesis at least that they are more invested in and engaged in the quality of information that they are entering because on some level they depend on the results that are returned back to them.

I think that the data accuracy issues are always going to be an ongoing challenge. But I think one of the best ways of addressing that is for them to see that the information does not just sit on a shelf. It is really being used for real purposes.

DR. MAYS: I think in this instance, you could populate it from other places. That is what I am saying. The race/ethnicity could come from the birth certificate. There are things that could come from other places so that they do not have to even do it.

MS. BRAUN: Yes, exactly. That is the potential. But if there was an error when it was initially entered — you could imagine where you have web services that are really focused on accuracy that could ping maybe five or six different systems simultaneously and give an informed recommendation of that does not quite look right. It could be flagged in some meaningful way. There is a lot of potential there.

Your question around the National Violent Death Reporting System. We have been in communication with our colleagues over the Injury Control Center. They are beginning to understand and appreciate this technology. We do have some pilots that will begin based on funding that we receive from other sources where this is definitely on our radar.

MS. BEBEE: Paula, this is Suzie Bebee at ASPE. It was a great presentation. I am working with Delton now. This is a question that has to do with systems changes on the vital statistics. Are you a part of those changes as well that are ongoing?

MS. BRAUN: Yes. I am very active in various components to what we are doing within the National Center of Health Statistics around vital statistics. We have a next generation EDRS effort that I am on the steering committee of. There is a number of pilot projects beyond what I had mentioned in this presentation that I am also either the PI on or I am heavily involved with.

MS. BEBEE: I could not remember seeing your name across that project that has to do with the PCOR funds. I just wanted to make sure you were part of that because this is going to be a systems change. Thanks.

MS. BRAUN: Yes, I am part of that.

DR. COHEN: Hi Paula. This is Bruce Cohen. I do not know how familiar you are with the work that we have been doing around the sustainability of vitals. And actually yesterday, we spent a significant amount of time thinking about how we move forward from the hearings we had in September around issues of vital statistics. I would love if you are not fully familiar with this to pull you into the conversations as we move forward in our activity around improving vitals.

MS. BRAUN: I welcome the opportunity. Thank you. There is a lot that we could learn from each other.

DR. COHEN: I know probably Kate would be the place to start. She can update you about what we are doing now and pull you into some of the conversations as we move forward. Thanks.

DR. STEAD: This is Bill Stead. I will close this out because we are at the end of our time first by thanking you for presenting such compelling examples of how agile this is and in a space that ties closely to some of the interests of the committee.

I think that we are likely to be asking more and more questions as we think about the potential intersections of this kind of thinking, not just with next generation vitals, which I think it may move quite faster than we thought was possible, but also potential implications for the terminology and vocabulary work and for the Predictability Roadmap. Thank you very much. It is awesome. We appreciate it.

MS. BRAUN: Thank you all. I really am honored to have the opportunity to present to the committee. I appreciate the willingness that you had to grapple with these newer ways of doing things.

DR. STEAD: I want to thank Donna for working with Paula and making this happen.

MS. PICKETT: I would like to thank Kate and also Lorraine Doo in working in the background to have this presentation occur. Thank you.

DR. STEAD: We will adjourn until 1:30. We will need to be ready to start at 1:30 because we will have Nancy Potok here to share her thoughts.

(Lunch Break)



Agenda Item: Health Data Landscape

DR. STEAD: Welcome back. I want to welcome Dr. Nancy Potok. She is the chief statistician of the United States and chief of the Statistical Science Policy Branch in the US Office of Management and Budget. Prior to 2017, she served as the deputy director and chief operating officer of the US Census Bureau. She has over 30 years of leadership experience in public, nonprofit, and private sectors. She served as a deputy undersecretary for economic affairs at the US Department of Commerce, principal associate director, and CFO at the Census Bureau, senior vice president for economic, labor, and population studies at ONRC at the University of Chicago, and chief operating officer at McManis & Monsalve Associates, a business analytics consulting firm. Also of note to us, she most recently served as a commissioner on the Bipartisan Commission for Evidence-Based Policymaking.

Nancy, welcome and we are hoping that you will shake up our view of the world a little bit. Thank you very much.

DR. POTOK: Thank you for inviting me. I am going to be talking a little bit about all the topics I love. I hope we have good discussion about where federal statistical data are headed and that includes some aspects of health data and other programmatic data that for evidence-based policymaking really will fall under the umbrella of what I would call statistical activities.

I guess I should start. A lot of people when they meet me they say what a cool title, Chief Statistician of the United States. Then there is a little pause. They say what do you do. I do not sit in a room all day doing statistics.

But to take a few minutes just so you know where I am coming from in the context that I am speaking from, I will tell you a little bit about my job, which is actually laid out in the Paperwork Reduction Act, which a lot of you may be familiar with, probably not in a positive way. I view it as very positive because there is a section in the Paperwork Reduction Act that says that the statistics produced by the US need to be objective, accurate, have integrity, be relevant and timely. It creates the position of the chief statistician to make sure that that happens. That is really kind of the crux of the job. It is done through a variety of activities.

As you very well know, we have a very decentralized statistical system in the US. There are 13 designated agencies. But there is also about 100 plus units around in different agencies doing statistical work even though they are not a designated statistical agency. There is a lot of data out there. It is problematic if everybody is doing their own thing.

Part of my office does is issue standards. We put out the race and ethnicity standards. We designate metropolitan statistical areas. We have several statistical directives that talk about how you release information to make sure that there is no political tampering with the information before it is released such as the economic indicators. We have directives that say how you do quality survey and what the expectations are if you are a federal agency conducting surveys, et cetera.

There is an Interagency Council on Statistical Policy, which consists of the agency heads of the 13 statistical agencies plus a rotating member. Right now, the rotating member is from the Veterans Administration because they have a lot of really interesting data and can contribute a lot I think to the discussions that we have about the federal data strategy overall for statistical data. And then I also get to do some fun things like represent the US internationally at the UN Statistical Commission, UNSC, the OECD and things like that.

It is a lot of different things, but what it causes me to do is to really think about the big picture of not just health data, but all data and how it connects and what all the agencies are doing, whether there are program agencies that have what I would call very high value data sets or whether there are actually statistical agencies. That is the landscape that I really want to talk about today, not just statistical data, but all the data and how it comes together.

That does kind of segue into what Bill mentioned, which was my role as a commissioner on the Commission for Evidence-Based Policymaking. One of the nice things about being a commissioner was I was able to go back to OMB and really light a fire and say what can we do to try to start implementing some of these recommendations whether or not we ever get actual legislation. There are a lot of things that we could start doing. What are the things that we really feel we would like to push for to try to get it to be an administration policy to maybe have additional legislation.

I noticed on your agenda that you got a briefing on the legislation as it stands so probably what in the enacted House Bill. What you probably noticed is there is nothing in there that says we are going to amend Title 13 to make Census data more available or we are going to amend Title 26 so that there is more access to tax data for research purposes or to build evidence or some of the other data sets that we know are protected and very difficult to get access to. The legislation does not address that at all.

I just point that out because a lot of the commission recommendations really not only go towards setting up some kind of capacity to do data linkages and provide secure access that protects confidentiality and privacy, but also making more data available. I think that is really the heart of the matter is access to data. I am going to talk a little bit about that.

But at OMB, we have set up a data council and we are looking at all these kinds of issues. And the new person who was nominated to be the deputy director for management at OMB is named Margaret Weichert. I think she just had her Senate confirmation hearing yesterday. She has made data and data management a very high priority, which is a very exciting development.

Let me talk a little bit. The commission report had three major themes, as you probably know: improving access to data, stronger privacy protections, and really creating more capacity to do rigorous evaluation work in other evidence-building activities. That is really what we are trying to follow up on.

We are putting together plans for how do we work with agencies productively to try to build to capacity for evidence building. How do we work with agencies to even implement some of the guidance that is already out there from OMB that says, for example, every agency needs to have an inventory of what data are available so that people know what they can access, whether it is in a protected environment where you are trying to get micro-data or it is open data or public use files or things like that. That guidance came out in 2013 and most agencies have not completed an inventory yet. But how do you put together really an organized way of accessing if you do not even know what the contents are?

We are looking at how we can advance those things, how we can get agencies themselves to better coordinate within the agencies. There are a lot of silos. You sort of have CIOs in one place and some agencies have chief data officers. And then you have chief privacy officers. Sometimes you have a chief evaluation officer. Sometimes the CFO is involved in data in some way. And then you have the data owners who are running programs who are sitting on top of all the data who actually are the subject matter experts and hopefully understand what they have as a resource. You have the statistical agencies, which are owners of not program data, but statistical data that is being created from various sources. How do you get all those people together to work productively to advance evidence building in agencies and get them out of their silos and get out of these turf wars that seem to occur in the agencies? Sometimes the legislative authorities are a little conflicting. It is not at all clear.

It is important that there is a cohesive policy in the agencies to advance these things and that people are working together in a coordinated way. Those are the types of things that I think OMB is positioned to address rather than just one of those single groups that have an interest in this area.

We are putting a lot of effort and thought into how do we make this work, not how do we put out more instructions to agencies that may not be followed completely. But how do we really create an environment where people understand the value of the data that they have and put it to use because we are not seeing enough of that?

What is it about the data that makes it harder or easier to use or access? Do we need to look at more standard meta data? Do we need to look at some of these interoperability issues in terms of linking data and those kinds of things? We really want to tackle that.

But what that is to me a subset of is really how do we modernize federal data collection and dissemination for the 21st century. I think in many instances with a few exceptions and some pockets of innovation around the government, we really spend a lot of time perfecting 20th century methods for data collection and quality. It is like how perfect can your survey get.

We are really at a point where if you look at information and how it is used in the world, what once was the biggest source of data, the federal government has not really shrunk relative to other sources of data that are out there. The federal government is now a small piece, but it is a very critical, important piece. I do not think I need to remind you why it is a very critical, important piece, but I will anyways because I like hearing it.

Because for the most part, it is accurate. It is as accurate as it can get. When it is not accurate, you have measurements of error. You understand it. Generally, you can tell where it came from. You can rely on it. It is released regularly. You can put together longitudinal data sets with it that are consistent. Usually it is consistent for the whole country. It is not dependent on some company’s business model. It does not have the biases in it that commercial data may have for commercial purpose. I am not saying there is no bias in it at all. Of course, any data is going to have some bias. You will understand what the bias is. It is not hidden in some algorithm someplace. It is not to benefit a commercial purpose. People have come to rely on federal statistical data as kind of a gold standard for quality.

But we are moving into a world where just relying solely on surveys and some censuses to collect federal data. I think everybody in this room if you are thinking about health data, you know that we are beyond that point. We have to get passed the surveys as a primary data collection vehicle and for many reasons. First and foremost, people hate doing surveys. It is getting harder and harder to keep the response rates up. When I say harder and harder, that means it costs a lot more money because you have to do a low of follow up to the non-responders.

I think we are all aware of the current budget environment, which is that there is a lot of pressure on the federal agencies to cut costs and be more efficient. If you are relying on mechanisms that are just going to keep costing more in an environment where you are trying to save money that does not leave you a lot of options. You either stop collecting some of the data and cut out surveys that are expensive or you have to find other means of getting similar type of data. Maybe if you are really creative and you are thinking about this, you can create new and better data products because the other environmental factor that is out there is the users.

Now, we have a set of data users who are not content to wait until the end of the year to get annual data regardless of the type of data it is. They do not want to wait even six weeks to get monthly data like retail sales at a national level. Data users out there want to know within a day what were sales in New York City. Anything that you can get at a lower geographic level and get it as quickly as possible.

There is starting to be more and more sources of data out there. The problem is it is questionable quality. Do we want people working at places like the Federal Reserve or other agencies or other places where they are looking at policy, getting data faster, more granular, but not really at the same quality that you would expect from federal statistical data that is being released? Of course, the answer is no.

What we have to do is change the way we are producing federal statistical data to meet the needs of the users. That means moving away from a very heavy reliance on surveys and looking more at combining data from other sources. There is a lot of activity going on in that area within the federal statistical system.

I have to say just to be clear. When I talk about the federal statistical system, I am not just talking about the designated statistical agencies. I am talking about statistical activity under a much broader umbrella. But I call it statistical activity because I am really talking about data that is used at an aggregated level to inform policy or for program evaluation or for other things as opposed to data that is used by agencies to just improve their own operations. I think you understand the distinction that I am making here. And not open data either.

When I talk about statistical activities, it means there is probably confidential data, individual micro records that have to be protected, but will ultimately be aggregated to come out with a product that will inform evidence building for policy or related activities.

I do focus a lot on the statistical agencies because they are big producers of this data. I think as the world changes, we have to think about ways that we are going to take some of these administrative records. People do not want to continue to give information to the government in multiple ways. If the government has already collected it and has it in program records, why would you go out and ask them the same information in a survey? Most of the work that I have seen that focuses specifically on comparing data that is collected in a survey with the administrative record shows that the administrative record is much more complete for a variety of reasons, not just non-response on the surveys but also because usually the administrative record can give you 100 percent of the universe of the population that you are looking at and also because people have bad memories.

I think anybody who has worked in survey business for a while knows. People have bad memories. They forget things or they do not understand the questions. But if you have the program record, you can see if somebody was receiving benefits in a program during the year. Did they have Medicare or Medicaid coverage? Were they receiving SNAP benefits? Because if you just go to a house and you ask people, you might get somebody who does not know the answer to that. I think the folks in this room understand that issue.

How do we move forward in more uniform way? As I say, we have pockets of research going on, which is great. But we have seen this situation in government before where agencies go off on their own. They are doing really exciting work. And then you want to get data from different places and you find you have apples and oranges. They have used different methodologies.

There really is not a standard out there right now for how you measure and describe quality of combined data when you are using multiple sources, whether it is coming from commercial records or administrative records or a survey. There is a lot. If any of you have ever looked at OMB Statistical Directive Number 2, it is a very deep dive into survey methodology of what constitutes a good survey and how you talk about the quality of data you have collected through a random sample survey.

But there is not a directive out there. There is no common vocabulary. There is a lot that is developing in the private sector. There is not anything comparable that is developing in the federal arena to really standardize how we are talking about some of these quality type things.

I have been working really closely with the Interagency Council on Statistical Policy, which I mentioned earlier, to set priorities. We have been working on this for the last 12 months. We continue to be a very high priority area for us is how can we develop standards and methods for federal agencies that are working with combined statistical data. How do we improve access to that data for research purposes and for evidence building? How do we get what is publicly available out to the public? How are we going to get people with these new skills that we need to work with combine data? We need not only the subject matter experts and some of the people that we normally think of as being engaged in these activities from the social sciences, but we also need data scientists.

I think the data scientists are starting to realize that if their work is really going to be adopted and useful in multiple environments, they are going to have to link up with the statisticians and the subject matter experts and the demographers so that there is meaning to some of the work that they are doing with machine learning artificial intelligence, some of the modeling and the data mining that is going on so that it actually makes sense to people who are trying to develop data sets that really say something that are useful.

Do we bring data scientists into the federal government? Do we put them into the data agencies? Are we going to partner with academia and with the private sector? There is always competition for a lot of the new graduates who are coming out of school with these types of skills. I will say the private sector in general pays a lot more, but it does not have as interesting of a mission as many of the federal agencies. We are thinking through all of these things and how do we sustain that.

And then the other piece, which I am sure you will appreciate, is that a lot of the data that we are really interested in is actually collected by the states and then given to the federal government. Part of the issue there is that if the states are not using it, if they are just reporting because they have a reporting requirement and the federal agency gave them a template and they are supposed to report on a grant or they are supposed to report on like a certain file for SNAP or WIC or something like that. What ends up happening is the data are incomplete generally. There is not a lot of good quality control in there.

There are a couple of ways to address that. You could have the heavy hand of the federal government to try to come down on 50 states and say you will do this. That is usually not a very successful approach in the end especially if there is no money that comes with it and there is not likely to be a lot of money that comes with something like that.

What seems to be a very promising approach that I have seen and would love to see grown and replicated and generalized a little bit is pilot projects that are springing up around the country where local governments in particular, but state governments as well, have partnered with academia and with the federal program people to produce data that is useful at all these levels of running the program.

I have seen foundations funding some of these pilot projects. There are really incredible learning labs right now. It is very exciting. To me, that is probably one of the most exciting things that is happening in the data arena is getting together these intergovernmental groups with intergovernmental data with people like data scientists and others who can really help manipulate the data and put it together in ways that make it decisionable and actionable and informative.

I do not know where all of those pilot projects are going. I have seen some consortiums formed at different universities and some centers formed where they want to start pulling that type of learning and information together so that there are places, resources where you can go to if you are interested in participating in things like that.

From my standpoint, I really want to do what I can and encourage the federal agencies that are collecting all these data to reach out and start working in a very productive way with their state and local partners on these things. If we are going to be using these data and relying on them to create new statistical data products and data files that can be used for research and for evidence building, we really want to improve the quality and completeness of them.

The way to do that that provides the most incentives is people are actually using the data. Then they start to see what the issues are in the data files and then they will fix them because they are using it. If they are not using it, it is hard to get people to pay attention to those things. It is just here is the file, here is the file. Good luck. That also is a priority as to really think about how we can best leverage some of the things that we can do at the federal level to encourage more of those joint activities.

And then we of course we cannot abandon surveys entirely. I think surveys are still a really important method of data collection. They may in the end turn out to be smaller because you can collect some of the information from other sources. They may be less frequent. But we still have to worry about the public and the public’s trust in the government and their willingness to provide very personal information in some instances, in businesses, willing to take the time to respond to business surveys. We have these issues across the board. We still have to look at response rates and what can we do to help people understand the importance of participating and responding to surveys.

The next steps that we are taking to work on this consist of some of the things that I will just describe to you now. There is a federal committee on statistical methodology. That is a committee that is under the umbrella of OMB so my office. It consists of career federal employees who are identified as being the most expert among their peers on statistical data and all aspects of it. That committee has been around for a very long time.

What I am trying to do now is to focus on some of these issues on quality that relate to combining data with survey data, administrative records, and commercial data primarily.

The Federal Committee on Statistical Methodology has sponsored three workshops related to data collection, data processing, and data dissemination. They held their first workshop in December. The second workshop is this month, I think January 26, which looks at processing issues, editing, things like that, related to when you are combining data from different sources. And the third one is on dissemination, which really also has to touch on a growing problem of re-identification of confidential data in the world because of all the competing capacity and all the data that is out there. What do we do on the dissemination end? It is getting to be more and more of an issue.

Also, the February workshop will summarize what we learned and what was talked about in all three of the workshops. In the beginning of March, I think the 7th and 8th and 9th roughly in that timeframe, we will have a big research conference where the results of the workshops will be presented. Other people in the federal environment who have been doing research in this area will have an opportunity to present their research. It is a three or four-day conference. It has a lot of tracks on it. I expect pretty high attendance there. It is open to anybody who wants to sign up for it. It is co-sponsored by the Federal Committee on Statistical Methodology and COPAFS, if you are familiar with COPAFS, which is the Council of Professional Associations on Federal Statistics.

Hopefully, out of the three workshops and the research conference, we will start building a body of knowledge about at least identifying what are some of these quality issues that we have to tackle that we would like to try to have standard approaches to across federal data creation and dissemination. We are also really working with other sources to feed into that.

This Friday I am expecting a Federal Register Notice to be published. It is in the system. I am going out with an RFI, a request for information that will be open for two months to ask the research community to please send in a description of research you are doing in these areas that relate to quality of combined data. What are some of the issues that you are finding in your research? What are some of the things we should be looking at? What are you addressing? Because I would like to bring a lot of academic research to bear on this problem.

Ultimately, in a relatively short period of time – once that research is in and we can take a look at it, or at least the descriptions of the research – we will put together a research agenda on this topic for the world really. But really to kind of focus the federal research and any folks in academia who are interested in moving this topic forward. These are the research topics that we are interested in. I am hoping that by putting out a research agenda like that and having a body of this is the research that is out there that will also attract some research money from foundations and others who are very interested in research on data and combined data. They are out there. I really want to move this ahead. Friday that notice will come out. Look in your Federal Register on Friday if you are interested.

The National Academy of Sciences has a committee on national statistics called CNSTAT. They have been very actively engaged and helpful. They had a panel that convened that looked at some of these issues around combining data for multiple sources. They put out an interim report and then a final report that is not like the official fully cleared final report, but it is available on the National Academy website.

It really lays out I think in a very organized and helpful way some of the things that we can tackle systematically to really look at these quality issues. The people who served on that panel and others from CNSTAT have been really generous with their time in helping and trying to think through some of these issues and help us connect the dots on some of this stuff.

And then of course a lot of work has been done internationally particularly in the European countries that have moved much farther ahead of us with administration records and have more organized health records, for example, because they have national health insurance and some other things that we are trying to see what are the issues you are tackling. What are the lessons learned? We do not want to reinvent the wheel. For some of the countries that have been working with these records for a long time, I think there is a lot for us to learn from.

Just so it is not too vague, when I am talking about dimensions of quality, I just want to be a little more specific about some of the things that we are looking at. Transparency is big. Transparency is kind of like an overarching thing that we want to make sure that anybody who is using federal data really understands how that data set was put together in all of these other dimensions of quality that I will go through.

The other thing that I think is really important for users is the fitness for use for various purposes because we do not want to say we are not really sure of the quality here. We are not going to put the data out or make it available. But I think if we can get good at saying it is good if you want to use it for this, but it is not really going to work for you if you are trying to use it in this way. We do that now with survey data. If you want real small area data on some things, we describe in a lot of detail the American Community Survey, exactly what you are getting, when you would want to use it, when you maybe do not want to use it. Maybe you want to model the data if it is a small area, things like that. We want to really be able to describe to people. If you are looking for certain information, this is what you are getting.

And even thinking about is there some way that we can come up with a standard quality type of rating, even a broad one like bronze, silver, gold. This one is really great. This one is a little sketchy so if you are using it, beware, but it is there kind of thing. We are not sure where we are going with that. We have to really think it through.
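(Editor’s note: the fitness-for-use description and broad bronze/silver/gold rating floated above could be sketched as structured metadata attached to a data release. Everything below – the tier names, the fields, and the American Community Survey entry – is a hypothetical illustration, not an official federal scheme.)

```python
from dataclasses import dataclass
from enum import Enum

class QualityTier(Enum):
    # Broad labels of the bronze/silver/gold kind discussed above.
    BRONZE = 1
    SILVER = 2
    GOLD = 3

@dataclass
class DatasetRelease:
    """Release metadata that tells users what a data set is fit for."""
    name: str
    tier: QualityTier
    fit_for: tuple        # uses the producing agency vouches for
    not_fit_for: tuple    # uses the agency warns against

acs_small_area = DatasetRelease(
    name="American Community Survey (small-area tables)",
    tier=QualityTier.SILVER,
    fit_for=("county-level estimates",),
    not_fit_for=("block-level point estimates",),
)
print(acs_small_area.tier.name)  # prints "SILVER"
```

The point of a structure like this is transparency: the warning about unfit uses travels with the data set rather than living in a separate methodology report.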

The privacy protections, the disclosure avoidance and re-identification avoidance. What are the elements of the micro data that we would be putting together? Ownership issues. Who owns it when it comes from multiple sources in the end especially commercial data that you might be getting under a license? What happens with that?

Breaks in series. What do we do if the sources disappear, come or go? What are the risks involved with relying on third parties to get your data as opposed to your own survey? And then there are a lot of issues with the post-collection processing, the editing, imputation, all kinds of things.

That is really what we are dealing with. I think I will stop here to leave time for discussion and questions because I think we are ending at 2:30. That leaves about 20 minutes. There we are.

DR. STEAD: Thank you. I see tents beginning to rise. Dave and Nick.

MR. COUSSOULE: Thank you very much for that. One question. Absent a good inventory, it seems it would be incredibly difficult to even understand the value derived from what has been created, not to mention the gaps in value that might be available that people do not know about. How do you see or how might that process get accelerated to create that better visibility?

DR. POTOK: It is an excellent question. I wish I had a simple answer to that kind of like all we have to do is. It is going to take a lot of work to make sure that agencies are dedicating the resources they need. But I think an important part of this is prioritizing. Which data sets are the most important ones to identify and have out there? You could spend a lot of effort for data that actually nobody wants to use anyway. Probably having agencies prioritize with input from the data users is like what are the most important things agency by agency that we should be getting out there.

You do not want to wait or depend on actual implementation of some of the things that the Commission on Evidence-Based Policymaking was recommending, which did include an inventory. It included an inventory of all the data sets that had been linked, which would be a big boost to transparency so people even knew what was there and what was being linked and how their information was being used.

It is a matter of, again, finding the right incentives. You can tell agencies to do something and they can say, yes, the effort is underway. But if it is months and months and months, this happened, that happened. It is very challenging to do that in a federal environment with the whole federal government because big agencies are doing other things. It is not always a priority for them.

But I think if we can focus on high-value data sets initially and at least get those inventoried and out there, that would be a big step forward.

DR. ROSS: Thank you, Nancy, for the presentation. I really enjoyed it. I have two different kinds of questions. One, when you talked about data scientists and integrating or bringing in data scientists into agencies and bringing new approaches, it raised for me a question about do you now or do you envision maybe under the Evidence-Based Policymaking initiative being able to sponsor demonstrations that actually test new data science approaches that could potentially be adjunct to or replace existing methods.

Let me give you an example. In public health, one of the things that CDC focuses heavily on is flu. One of the major systems they run is called Flu Trends. They are the ones who put out the regular projections of the flu season. A few years back, it probably has been six or seven now, Google decided to try to look at click data. You know about Google Flu Trends. It was a nice, interesting, clever idea. The trouble was that Google Flu Trends came up with data that seemed in a time series way to be about two weeks ahead of predicting sort of the same information as CDC Flu Trends. But there really was not any way to formally say let’s look at this. In fact, the opposite. It was sort of like the agency saying we ought to protect ourselves. We do not want our data set or our system to be undermined. This would be outrageous. Is Google going to take over predicting flu? It was a threat.

Have you thought about how you position this in ways so that it is not a threat to change, but actually provokes really positive change? The end of that story was that after a lot of investigation, it was concluded that nobody could quite figure out what Google Flu Trends – what that signal meant. And CDC Flu Trends was scientifically valid and did know what it was signaling. How do you address that?

DR. POTOK: I see it kind of differently. I think the risk is in not changing. The risk is in the status quo. It is very evident that the world has changed in that there are a lot of private sector data providers out there. Rather than being a threat though, it should be a spur to innovation.

DR. ROSS: Part of my question though is that I do not see the incentive structure for agencies right now particularly on the budget side. Who pays for that test then to explore that? If you are an agency and divert your resources to do that test, you are taking them away from something else you have been mandated to do.

DR. POTOK: You absolutely are. And I have to tell you. It is just over a year ago I was the chief operating officer of the Census Bureau. Very familiar with the federal budgeting environment and not getting enough money and trying to be innovative and try to really change a lot of what you are doing in an environment where you cannot get Congress to pay for it.

Quite honestly, the only thing you can do in that environment is you partner with people outside of the federal government with foundations that have the money to pay for it or you stop doing your lower priority stuff and you reallocate the money. There is no magic money falling from the heavens to pay for this.

You either have to pay for it outside of the government and become a partner to other people who are leading it, but who care a lot about your data or you stop doing other things and pay for it. There really aren’t a lot of other alternatives.

DR. STEAD: We have a hard stop – we will come back if we have time.

MS. KLOSS: Thank you very much. I am asking a couple of questions with a hat on of a co-chair of our Privacy, Confidentiality, and Security Subcommittee. I was interested in your comments about trying to look at dimensions of quality, including some rating system. Do you anticipate that there might be ways of looking at ways to do risk assessment of harm from re-disclosure?

This has come up as we have looked at the adequacy of de-identification protocols under HIPAA. Some of the testimony we have received said we just need to learn how to do a better risk analysis. We are not going to prevent re-identification, but to understand the consequences and again I suppose manage a little bit better by exception.

DR. POTOK: I think you bring up an excellent point. There is no zero risk. Generally, what you have to do is kind of figure out what your risk appetite is. How much risk are you willing to accept? Finding good ways of measuring the risk is a key part of that when you are making a decision about how much data to release, which data sets you are making available and in which ways. I know you know all this.

Again, it is an area that there is a lot of research underway but I do think in terms of risk analysis, there is a whole body of work out there particularly in areas looking at enterprise risk management. I happen to teach enterprise risk management. It is one of the things I do at GW. There are many methods for doing risk assessments. Some are quite quantitative and others are a little more qualitative, but they are out there. I think people do not utilize those tools. A lot of risk assessment is more subjective. There are some very solid proven methods for doing risk analysis and risk assessment out there.

It is a matter of getting people more familiar with those methods and adopting them and applying them in this particular situation, which is one where I think the risk has been looked at as if we do disclosure review and disclosure avoidance techniques and mask some of the data, that will work. That probably did work about 25 years ago, but it does not work now.

On some of these things, I know researchers have this, but I really think we are going to see more synthetic data out there. I am not sure what the alternatives are. But there are people who are working to look at how do we make life better for the researcher if we are using synthetic data. There are proposals out there to do things like we will create a synthetic data set, give it to the researchers to do their work then take the results of the research and run it against the real data set and then help in an iterative way improve the results of the research and the modeling and all of that without actually putting the real data out there. I think we are going to see more and more of that. It really does start with figuring out what is the level of risk you are willing to accept.
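(Editor’s note: the synthetic-data loop described above – researchers work only on a synthetic file, and the custodian validates results against the real data without releasing it – might be sketched roughly as follows. The normal-distribution model, names, and tolerance are assumptions for illustration, not an actual agency implementation.)

```python
import random
import statistics

random.seed(42)  # deterministic for the example

# Hypothetical confidential microdata held by the custodian.
real_incomes = [random.gauss(50_000, 12_000) for _ in range(1_000)]

def make_synthetic(data, n):
    """Release draws from a simple model of the real data (here a
    normal fit) instead of the real records themselves."""
    mu, sigma = statistics.mean(data), statistics.stdev(data)
    return [random.gauss(mu, sigma) for _ in range(n)]

def validate(researcher_estimate, data, tolerance=0.05):
    """Custodian step: re-run the statistic on the real data and
    report only whether the researcher's estimate agrees."""
    true_value = statistics.mean(data)
    return abs(researcher_estimate - true_value) / true_value <= tolerance

synthetic = make_synthetic(real_incomes, 1_000)   # what the researcher sees
estimate = statistics.mean(synthetic)             # researcher's analysis
print(validate(estimate, real_incomes))           # prints "True"
```

In the iterative version described in the discussion, the validate step would return enough feedback for the researcher to refine the model, round after round, while the real records never leave the custodian.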

But I would point people towards the existing field of risk management to see what we can take advantage of there that already exists, that people have been doing for quite a long time to assess risk, and up the game on that a little bit.

DR. STEAD: I get what you are laying out in terms of the research agenda, et cetera to figure out how we can take advantage of these new approaches. I resonate with your statement that there might be more risk in not acting than in acting or a sense of that.

We just had a briefing around ICD-10 and 11, which were in essence how people have tried to be able to compare causes of mortality and other things internationally. In a time when our understanding of biology is changing dramatically in three to five years, the cycles for these have now turned into 20-year cycles.

I at least think we need to think about whether there is another way to assess, to compare, mortality and morbidity. It might be helpful if you could help us develop a way to measure the fact that the method we are using now does not scale. It may be sound. But the population is growing faster than it can deal with. The complexity is growing faster.

So that we actually have a concrete way of saying we have to try something different. Rather than just having the new thing be able to prove itself, we have somehow got to find a way to stop doing the stuff that no longer scales, even if it is good. Do you have any thoughts about that?

DR. POTOK: I guess I would say that because we look at data in so many different areas, that is an effort that I would really want to see a lot of what I would call subject matter expert involvement of people who really understand the data, what it is supposed to be showing and design the research partnering up to think about what are alternatives. What is out there? What would that produce and how good would that be? That is a perfect example of the kinds of things that we have to be looking at and addressing. It is a method that worked in the past that is not going to work in the future. We have to develop new methods. It is a partnership. I think as we have been saying, you cannot just take a data scientist or somebody in a vacuum to do that.

And what I would like to see is some of the federal agencies recognizing those problems and leading the way in thinking about how do we develop new methods. Who are the experts out there that we could call upon and what are we learning from that that we can also pass along for others who are in a similar situation? I would want to see that from other research. There might be something out there that could really help you that people are already doing.

A lot of this is pulling in what is going on. What are the questions that people need? What are the pressing issues that we should be addressing and how can we collectively put our minds to moving this forward?

DR. LANDEN: We had a central theme about the increasing difficulties with survey methodologies. One of the things you indicated you are exploring is to be able to substitute for surveys, pulling data from different program data sets. What is your thinking about matching the data from those disparate data sets? Are you thinking in terms of we will need to solve the problem of the individual identifier across the data sets or is there some sort of methodology that you could pull the data from the disparate data sets and process it without identifying the individual across all the data sources?

DR. POTOK: There may be other methods. I would like to find them if there are. Right now, the methods that I am most familiar with use substitute personal identifiers. I am most familiar obviously with what the Census Bureau does for record linkages. There is a Center for Administrative Records Research there where they are basically providing a service across the federal government for record linkages. What you start with is a file. You have to bring in lots of records from different places. They do have personal identifiers on them, but you have a few people who specialize in stripping the identifiers and substituting in these randomly generated numbers as a substitute. You have a number to link records across different files and unlink them when you are done if you need to do that. But it is a substitute random number. It is not like a social security number. As far as I know, most of the record linkages that are going on right now do use – you have to make sure it is the same person too.

In some of the things, you want more than one identifier to make sure you are actually linking the same person or if it is a person or business if you are linking businesses. There are maybe four or five variables that you want to check besides let’s say a social security number to make sure that it is actually the same individuals like date of birth and maybe place of birth and those kinds of things.
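(Editor’s note: the substitute-identifier linkage described above – a small trusted group strips direct identifiers, swaps in a randomly generated number, and matches on more than one variable – could look roughly like this. The class, field names, and records are hypothetical; this is a sketch of the idea, not the Census Bureau’s actual system.)

```python
import secrets

class IdentifierService:
    """Trusted unit: sees real identifiers, hands analysts only a
    random substitute number; the crosswalk never leaves this group."""

    def __init__(self, match_fields=("ssn", "dob")):
        self.match_fields = match_fields   # match on more than one variable
        self.crosswalk = {}                # real key -> substitute id

    def deidentify(self, record):
        key = tuple(record[f] for f in self.match_fields)
        if key not in self.crosswalk:
            self.crosswalk[key] = secrets.token_hex(8)
        clean = {k: v for k, v in record.items() if k not in self.match_fields}
        clean["sub_id"] = self.crosswalk[key]
        return clean

def link(file_a, file_b):
    """Analyst step: join de-identified files on the substitute id only."""
    index = {r["sub_id"]: r for r in file_b}
    return [{**a, **index[a["sub_id"]]} for a in file_a if a["sub_id"] in index]

svc = IdentifierService()
tax = [svc.deidentify({"ssn": "123-45-6789", "dob": "1970-01-01", "income": 52_000})]
health = [svc.deidentify({"ssn": "123-45-6789", "dob": "1970-01-01", "visits": 3})]
linked = link(tax, health)
print(linked[0]["income"], linked[0]["visits"])  # prints "52000 3"
```

The substitute id can be unlinked (by discarding or restricting the crosswalk) once the linkage is done, which is the property the discussion highlights: analysts link records without ever handling a social security number.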

It would be great if there was a way. It may be that it exists in a proprietary business model in a company like Google or Facebook that is working with these really large data sets. They may have already figured out how to do it without a personal identifier and how you link somebody. Although I think they are using people’s accounts when they do that.

As far as I know within the federal government in the civilian agencies, which is what I know about, you still have to use an identifier that you de-identify.

DR. STEAD: Thank you. We need to respect your hard stop and thank you very much for sharing with us. It has been extraordinarily helpful.

MS. HINES: Nancy, will the reports from the workshops you mentioned be available publicly?

DR. POTOK: The Federal Committee on Statistical Methodology has a website. All of that should be on their website. I think it is FCSM.gov probably. All the information about the conferences, the workshops – it is all up there.

Agenda Item: NCVHS 2018 Review of Next Steps

DR. STEAD: Team, we are now where we need to huddle and agree on any next step work we need to do to come out of the chute. We are just trying to close the loop on any work we need to do around next steps. I did not know, for example – you and I had some conversations about the CIO Forum, whether there were pieces we would still need to talk about now. I think we did a good job of getting back to where I think you have a pretty clear idea of what you are thinking about doing. Would it be helpful to think through what the current view of timing of the forum might be to give you something to work back from or is that – right now, it is out there in the April to June timeframe with April lightly penciled in.

My gut is that it may be sooner than practical given the kind of people you are going to try to get in the room. Do we need some further alignment conversation there or are you good to work that at the subcommittee level?

MR. COUSSOULE: I think we can work it. We probably just have to start from the end and work backwards and try to line all that up. I know we have had a couple of conversations like that – put it down on paper —

MS. GOSS: I agree with you, Nick, that I think we need an offline conversation to process the information – feedback, add that to the existing subcommittee discussions. Clearly, April is just too aggressive. June may be too. I think you may be spot on, Bill. I do not want to forecast anything at this point.

MR. COUSSOULE: I think my gut would be the same.

DR. STEAD: What we will at least do is just leave it at this juncture as Q2, but drop the word April from the description.

MS. GOSS: I could see it actually being more like the next quarter, like August maybe. The thing is that I do not want to wait too long because I really would like, as with Nick, to make progress on the roadmap to get to the recommendations. There is a little bit of a chicken and egg dynamic that we do not want this to go too far. But you are right. We need to get the right people.

DR. STEAD: Where I am hoping we may have helped was to narrow the scope, both of the types of people we are going to try to bring, in essence information officers, not innovation, investment, et cetera, and to really focus predominantly around the issues of the roadmap. I am hoping that might make it easier to bring it to closure.

I think with all of our work, we are going to be helped if we can narrow the scope to get a meaningful deliverable that both gets something out to HHS and industry and helps us figure out how to correctly scope the next step. I think that is a common theme in many of the things we have been discussing, all of which tend to be pretty big things we are trying to bite off. That sounds like enough said on that.

Now that we have the letter in with the request from the DSMO, you will decide how to best process that at the subcommittee level.

MS. GOSS: Meaning that we have now formally received a letter, which we have not fully read and need to process so we can schedule a virtual hearing, which is the nice part. It is going to be a full hearing, but it is going to be done virtually for advancing the next version of the National Council for Prescription Drug Program standards.

DR. STEAD: Very helpful. The letter we received mentioned that we will receive something else under separate coverage that I at least could not find, but I figured Lorraine will get that for us in some way.

MS. GOSS: I was actually looking for Nancy Spector who had been sitting in the audience for the last two days. She might have a sense of what the DSMO is planning to send us. We will get that.

DR. STEAD: It seems to me we are good to go from a point of view of standards. We are planning to huddle around TNV after this. Anything else we need or did we get you to where you wanted to be for that huddle? It sounds good.

Beyond HIPAA, we agreed on the two at least tentatively on the scope of problem cases.

MS. KLOSS: Our next step will be to try to get regular biweekly calls for the subcommittee on our calendars probably beginning first of February so we can move the scoping documents for those two or maybe we do it in one scoping document so we could have the two-way arrow. But at any rate, get that prepared and ready for a discussion in May.

DR. STEAD: Next Gen Vitals. You are really set to go with the report. We will go on and get the environmental scan out, post it.

It seems to me that what we heard with the FHIR briefing may in fact influence our thinking about alternatives. We may need to figure out how to bring in some idea of how fast that is in fact moving because my sense is it has actually been moving pretty fast. That may make that more robust.

MS. BRETT: We did talk a little bit about our next steps of putting out to the participants at the hearing from last September what we have heard today and getting their feedback, like we talked about yesterday, and also bringing in more information about what is going on with FHIR.

DR. STEAD: Will we try to do anything? My sense was there was fair agreement that it would be useful if we could figure out who to partner with or who to – that we might want to do something in terms of getting some of the data Gib was suggesting. And rather than us necessarily doing that, the way our charter reads – we are encouraged to put the need on the table with other potential partners and to help them address the need and to bring that back in. It sounds to me like that is another thing that could be moved in a fairly concrete way while we are thinking about some of the more abstract ones. We might have a sub-deliverable somewhere along the way.

PARTICIPANT: That is a good point.

DR. STEAD: Any other things in that space? We agreed to – I think we are largely in stay tuned mode around CEP although I was encouraged to hear that OMB is basically working without – doing what they can without waiting for legislation while obviously working to have the legislation do what it needs to do. That is encouraging.

In addition to us getting the information about the workshop report, it would probably be useful to share with her the health data framework report, which had some of the – a pretty detailed table of the kind of metadata that would get at some of the things she is talking about in terms of tagging the data sets. She may have all that. But it seemed to me that referring her to that part of the report, which is short, might be helpful.

Data access issues. I think that is going to be worked at the level of pop health in terms of whether there is any additional data finding that needs to be done there.

Is there anything else we need in terms of next steps? Are we good to go? If so, we can move to public comment.

MS. HINES: Just to let the Full Committee know process wise that the Executive Subcommittee meets monthly. When you are in your monthly subcommittee calls, make sure if there is something you think that we should be making a decision about that your co-chairs, Bruce, Alix, Nick, Linda, et cetera, that that is where we sort of move things. As you have probably noticed, we have moved quite a ways from September. That is no accident. We will continue to do that. Just to understand that we will be meeting until we all come together in May to keep everything moving.

DR. STEAD: We are now basically working with a regular cadence of whenever we do not have a Full Committee meeting a month, we have an Executive Subcommittee meeting that lets us continue to move and do the planning work that sets us up for the Full Committee.

MS. HINES: And coordinating the different one offs, the hearings, the roundtables, the forums so they do not happen on the same week or even the same month.

DR. STEAD: Anything else before we open it up to public comment?

Agenda Item: Public Comment

MS. HINES: I believe on the WebEx, the public can see how to submit a comment through the comment box, and nothing has come in. How long has that instruction been up?

PARTICIPANT: We put it up when we started speaking about this.

MS. HINES: The public – you have ten minutes to send us your comments. I have not received any emails. You also have the instruction for that. As far as we can tell, there is nothing virtually. Is there anyone in the room who has a public comment? I believe this portion of the agenda is taken care of. Of course, the public is welcome to email me as instructed on the slide that you are looking at on the WebEx at any time. Thank you for your attendance.

DR. STEAD: I think we are adjourned. Thanks everybody. It was an awesome meeting.

(Whereupon, at 2:45 p.m., the meeting adjourned.)