Shaping a Vision for 21st Century Health Statistics

Dorothy P. Rice
Professor Emeritus
Institute for Health & Aging
University of California, San Francisco
3333 California Street, Suite 340
San Francisco, CA 94605

On this special occasion of the 50th anniversary of the National Committee on Vital and Health Statistics, I want to thank NCVHS, CNSTAT, and Dr. Ed Sondik, Director, NCHS, for their dedication and commitment to shaping a vision for 21st century health statistics. I am excited and pleased that this effort has begun so effectively. We now have an excellent document, the Interim Report, that identifies the gaps and the cross-cutting issues shaping the vision. This report sets forth ten important principles for the 21st century vision. I heartily endorse these principles, and will comment on what I regard as the most glaring gaps and cross-cutting issues, and talk about the next steps.


Subpopulation Groups and Minorities

As efforts continue to reduce health disparities among special population groups of low-income persons, racial and ethnic minorities, and persons with disabilities, it is recognized that data are needed to monitor our progress toward eliminating these disparities. Except for the data derived from the decennial Census and from the vital registration system (birth and death statistics), the existing sources of health data do not permit examination of socioeconomic differences for any but the three largest race and ethnic categories: non-Hispanic white persons, non-Hispanic black persons, and persons of Hispanic or Mexican origin. Data shown for broad groupings usually mask significant differences among subgroups. For example, “Asian or Pacific Islander” includes persons with ancestry in such countries as China, Vietnam, the Philippines, Japan, and Samoa, while “Hispanic” combines persons whose origins are in Cuba, Puerto Rico, Mexico, or any other countries of Central or South America. These subgroups often have very diverse health status and risk behavior. It is essential that our health statistical systems at the national and state levels capture this diversity.

Longitudinal Data

The surveys, surveillance, and vital statistics programs meet many of the current needs for health data. The cross-sectional survey data give a “snapshot” at a point in time of the health status of people at different stages in their lives and allow periodic examinations of changes over time. Still needed, however, are large-scale longitudinal efforts that record in sequence the health events of life. Longitudinal efforts in the health area are limited. Recent examples of relatively short term followups of survey participants include the NHANES I Epidemiologic Followup Study, the Longitudinal Study on Aging (LSOA), NHIS Disability Supplement, MEPS, and MCBS.

The need for additional longitudinal studies has been specifically addressed in at least ten of the 62 reports dealing with some aspects of health data and data systems published by the National Academy of Sciences (NAS) since 1985. One such recommendation is that NCHS develop and implement a continuous, longitudinal survey of individuals’ health care utilization and expenditures, and of their health care providers, using cohorts of individuals selected from among NHIS survey respondents.

Health Delivery System Statistics

The organization, delivery and financing of health care services in the United States is complex, comprising an interdependence of the private and government sectors of the economy. This pluralistic health care economy, with its pragmatic mix of public and private organizations, has produced a wide range of data bases that enable us to monitor the health of the nation.

Health care expenditures have been rising rapidly in the United States and claiming a larger share of national resources during the past three decades. In 1965, only $41.1 billion was spent for health care, comprising 5.7 percent of the nation’s gross domestic product (GDP). In 1998, health care expenditures in the United States totaled $1.1 trillion, an average of $4,094 per person, comprising 13.5 percent of the GDP.
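
As a rough check on the scale of this growth, the figures quoted above can be combined arithmetically. The short Python sketch below is only illustrative: it uses the rounded numbers from the text, works in nominal dollars with no inflation adjustment, and the derived values (implied GDP, implied population, average annual growth) are approximations rather than official estimates.

```python
# Figures quoted in the text (the 1998 total is rounded to $1.1 trillion).
SPENDING_1965 = 41.1e9      # dollars
GDP_SHARE_1965 = 0.057      # 5.7 percent of GDP
SPENDING_1998 = 1.1e12      # dollars
PER_CAPITA_1998 = 4094.0    # dollars per person
GDP_SHARE_1998 = 0.135      # 13.5 percent of GDP


def implied_total(part, share):
    """Return the total implied by a part and its share of that total."""
    return part / share


# GDP implied by spending and its share of GDP (approximate).
gdp_1965 = implied_total(SPENDING_1965, GDP_SHARE_1965)
gdp_1998 = implied_total(SPENDING_1998, GDP_SHARE_1998)

# Population implied by total and per-person spending (approximate).
population_1998 = SPENDING_1998 / PER_CAPITA_1998

# Nominal growth over the 33 years, and the equivalent compound annual rate.
growth_factor = SPENDING_1998 / SPENDING_1965
years = 1998 - 1965
annual_growth = growth_factor ** (1 / years) - 1

print(f"Implied GDP, 1965: ${gdp_1965 / 1e9:,.0f} billion")
print(f"Implied GDP, 1998: ${gdp_1998 / 1e12:,.2f} trillion")
print(f"Implied population, 1998: {population_1998 / 1e6:,.0f} million")
print(f"Nominal spending grew {growth_factor:.1f}-fold, "
      f"about {annual_growth:.1%} per year")
```

Because the inputs are rounded, the implied population and GDP will differ somewhat from published totals; the point is only that spending grew much faster than the economy as a whole.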

The American pluralistic health care economy presents special problems for data collection, analysis, and dissemination. Health statistics systems have grown rapidly with the growth of the industry and the expansion of private health insurance coverage and public health care programs.

There is general agreement that data are needed to monitor the health of the nation, to plan and develop better health services, to deliver those services in an effective, efficient, and equitable manner, to measure their effectiveness, to make decisions on resource allocation, and to conduct research. Data also are needed to facilitate effective policymaking, planning, management, and evaluation. Private organizations of health professionals, health service providers, health insurers, and many others have important interests in the collection and use of health data. The federal government needs a variety of data to support its major role in improving health and medical care delivery systems throughout the nation. State and local government agencies also have key roles in disease prevention, delivery of health services, and health planning and evaluation that require timely and reliable health statistics. As pointed out in the Vision Document, the fragmentation of health care delivery today makes it essential to have integrated, effective statistical systems in order to better understand the health care system and how people fare in it.

HMOs – An Example

The growth of HMOs in recent years, now covering 77.6 million people, about 30% of the population, has resulted in some problems and gaps for health statistics because encounter and visit data are not uniformly reported for the population covered under capitated payment systems.

Outcomes and Quality

Little information is available on outcomes of care or on the quality of care provided. Furthermore, there is no agreement on what measures should be used to assess quality of care. There are measures such as HEDIS, QALYs, DALYs, and many variations of them.

Health Status, Health Care Utilization, and Medical Care Costs

Statistics abound on health status and use of medical care services at the federal, state, and local levels. Here we see an array of Federal data systems.

NHIS and NHANES are only two of the many national federal surveys that collect data on health status, medical care utilization, and insurance coverage. Other important federal surveys that collect similar data as well as data on medical care expenditures include:

  • The National Immunization Survey (NIS).
  • The Medical Expenditure Panel Survey (MEPS) conducted by the Agency for Healthcare Research and Quality (AHRQ) is a study of approximately 9,000 households. MEPS is a subsample of NHIS participants, providing health status and other data for enhanced analytical capacity. Use of NHIS data in concert with the data collected in the 1996 MEPS provides the capacity for longitudinal analysis. Each sample panel is interviewed a total of five times over 30 months to yield annual use and expenditure data for two calendar years.
  • The National Household Survey on Drug Abuse (NHSDA), conducted by SAMHSA, focuses on the incidence, prevalence, consequences and patterns of substance use and abuse. In 1997, the NHSDA was expanded from 18,000 respondents to about 25,000 respondents to generate estimates for the nation and for two states (California and Arizona). In 1999, the NHSDA was further expanded to 70,000 respondents to generate estimates for all 50 states.
  • The Medicare Current Beneficiary Survey (MCBS), conducted by the Health Care Financing Administration (HCFA), is an ongoing rotating panel survey of approximately 12,000 aged and disabled Medicare beneficiaries, consisting of four overlapping panels of Medicare beneficiaries surveyed each year. Each panel contains a nationally representative sample of beneficiaries who are interviewed 12 times in the community or a long-term care facility to collect 3 complete years of utilization data. The survey provides comprehensive data on health and functional status, use of medical services, covered and non-covered health care expenditures, and health insurance for Medicare beneficiaries.
  • The National Health Care Survey (NHCS) is a family of NCHS provider-based surveys that measure the utilization of health services. Included are hospitals (National Hospital Discharge Survey), physicians (National Ambulatory Medical Care Survey), emergency and outpatient departments (National Hospital Ambulatory Medical Care Survey), ambulatory care centers (National Survey of Ambulatory Surgery), nursing homes (National Nursing Home Survey), and health agencies providing home health care services and hospice care (National Home and Hospice Care Survey).
  • The National Survey of Family Growth (NSFG) is a periodic survey of women ages 15-44 years. The purpose of the survey is to provide national data on factors affecting birth and pregnancy rates, adoption, and maternal and infant health.
  • The Healthcare Cost and Utilization Project (HCUP), conducted by the Agency for Healthcare Research and Quality, consists of the State Inpatient Database (SID) and the Nationwide Inpatient Sample (NIS). SID contains all hospitals and all discharges from 22 participating states. AHRQ receives the data from each statewide data organization, processes the data into a uniform format, and then returns the uniform SID files to the statewide data organization. The NIS database contains a sample of hospitals selected from SID. The NIS comes with weights that can be used to produce national estimates, regional estimates, and state estimates for participating states.
  • The Current Population Survey (CPS) is a monthly sample survey of about 50,000 households conducted by the U.S. Bureau of the Census for the Bureau of Labor Statistics. The CPS is the primary source of information on labor force characteristics of the U.S. population. Monthly estimates from the CPS include employment, unemployment, earnings, hours of work, and other indicators. The annual March supplement produces national and state estimates on health insurance coverage, including private health insurance, Medicare, Medicaid, CHAMPUS, or military health care.
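
The HCUP entry above notes that the NIS comes with weights for producing national, regional, and state estimates. The Python sketch below illustrates the general idea of a weighted estimate only. The record layout, field names, and values are hypothetical and are not the actual NIS file format, and real estimates would also need design-based variance calculations.

```python
from dataclasses import dataclass


@dataclass
class SampledDischarge:
    """One sampled hospital discharge (hypothetical layout, not the NIS format)."""
    weight: float          # number of discharges in the population this record represents
    length_of_stay: int    # days in hospital


def weighted_total(records):
    """Estimate the number of discharges in the population."""
    return sum(r.weight for r in records)


def weighted_mean_stay(records):
    """Estimate the population mean length of stay from weighted records."""
    total_weight = weighted_total(records)
    if total_weight == 0:
        raise ValueError("weights sum to zero; cannot compute a mean")
    return sum(r.weight * r.length_of_stay for r in records) / total_weight


# Made-up sample of three records, for illustration only.
sample = [
    SampledDischarge(weight=5.0, length_of_stay=3),
    SampledDischarge(weight=4.0, length_of_stay=6),
    SampledDischarge(weight=6.0, length_of_stay=2),
]

estimated_discharges = weighted_total(sample)
estimated_mean_stay = weighted_mean_stay(sample)
print(estimated_discharges, round(estimated_mean_stay, 2))
```

Each record stands in for as many population discharges as its weight indicates, which is why the unweighted sample mean and the weighted estimate can differ.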

State and Private Sector Data Systems & Surveys

In addition to the federal health statistics surveys and programs briefly discussed above, each of the 50 states and the private sector maintain data systems and conduct many surveys of hospitals, health professionals, and health care organizations. The private health sector includes organizations of health service providers, health professionals, health insurance payers, consumers, industry, and private philanthropy. Many national and state data collection activities are conducted by these private organizations, but their quality is variable. The result of all these statistical efforts is a set of duplicative and overlapping data systems in the public and private sectors. Hospital inpatient data, for example, are collected in the public sector by NCHS, AHRQ, HCFA, SAMHSA, the Veterans Administration, and others. Most states have their own hospital discharge data systems conducted by state rate setting, planning offices, and health systems agencies. In the private sector, hospital data are collected by the American Hospital Association, many abstracting organizations, Blue Cross, Professional Standards Review Organizations, and health maintenance organizations. It is recognized that hospital data are necessary to understand, monitor, and evaluate programs related to hospital-based delivery of health care. The reporting burden on hospitals, however, is great; the recording, storing, abstracting, and processing of medical records is expensive for both the institution and the users. These overlapping and duplicative hospital inpatient data systems are difficult to justify.

California Health Interview Survey

Let me tell you of one collaborative and successful effort in my own State of California. The California Health Interview Survey (CHIS) is a collaboration of the Department of Health Services, the UCLA Center for Health Policy Research, and the Public Health Institute in Berkeley. The estimated cost is $10.8 million for the first cycle, including data collection by a survey contractor, analysis, and dissemination. The State of California, the National Cancer Institute, and CDC have together committed necessary funding for the first cycle of CHIS. Other federal funding agencies are being asked to provide the remaining funds.


Privacy Protection and Confidentiality

The conflict between freedom of information and invasion of privacy in relation to data collection has received increasing attention in recent years. A balance must be struck between the public’s right to know and the right of individuals and institutions to protect their privacy. Even in those programs where strong legal safeguards and technical procedures protect the confidentiality of the information collected, there remains a persistent fear that this vast complex of information might be used as an instrument of social control, if not for commercial purposes.

Advances in technology and the increasing collection of personal data for public and private decision-making are raising concerns among many Americans about the confidentiality of the information they provide for use in government surveys. Both individuals and businesses are questioning how the information is used and who has access to it. At the same time, data users, especially those outside of the government, are increasingly frustrated by limits on the amount of detailed information they can obtain from statistical agencies.

Data Sharing and Data Linkage

One aspect of the privacy and confidentiality issue that could have a significant impact on reduction of duplicative and overlapping reporting systems is data sharing and data linkage among government agencies. It has long been recognized that the development of comprehensive data systems concerning the interrelations among various aspects of social and economic patterns sometimes requires that various data sets be combined. In the past, recommendations have been made for exchange of statistical data under legislatively mandated “protected enclaves” for selected statistical and research agencies within the federal government. There are current efforts to designate eight federal statistical agencies as statistical data centers, which would allow limited sharing of statistical information by other agencies with these data centers and sharing among the statistical data centers. If approved, this legislation will go a long way toward enabling the sharing of data among federal agencies. For example, NCHS could use business data from the Census Bureau or BLS to construct sampling frames for surveys of employers or health providers, or use Census Bureau data to augment population samples.

Linking public and private data is an area with tremendous potential for analysis if issues of confidentiality can be overcome. Linking such data is especially important for medical effectiveness and outcomes research, which examines the effects of alternative treatments of a given medical condition on the eventual outcomes realized by the patient.

Data integration among current large data collection activities should be carried out to maximize the results of separate efforts. Linkage of data files should be encouraged when there is a good reason to believe that the results of a specific linkage program will be sufficiently complete for the specific purpose and that biases and limitations of linkage studies will not be so severe as to vitiate results.

Quality and Reliability of Data

Many organizations, especially federal statistical agencies, such as NCHS, have made and continue to make considerable and commendable efforts to maintain and improve the quality of their major statistical series. In the private sector, however, the quality and reliability of data are uneven and unknown in many ongoing databases. Survey results are subject to sampling, reporting, processing, and nonresponse errors; the data cannot be fully understood and properly used unless these errors are reported. Standard errors are routinely reported in federal statistical reports on survey results but are unavailable in most reports emanating from facility and manpower surveys conducted in the private sector. The improvement of the quality and reliability of health statistics in the private sector is most urgently needed.
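
To make concrete what reporting a standard error involves, the Python sketch below computes the standard error and an approximate 95 percent confidence interval for a survey proportion. It assumes simple random sampling, which is a simplification: major federal surveys use complex multistage designs and need design-based variance estimation. The function names and the example values (15 percent, 2,000 respondents) are hypothetical.

```python
import math


def proportion_standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return math.sqrt(p * (1.0 - p) / n)


def confidence_interval(p, n, z=1.96):
    """Approximate confidence interval (normal approximation, z=1.96 for 95%)."""
    se = proportion_standard_error(p, n)
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical example: 15 percent of 2,000 respondents report a condition.
estimate = 0.15
sample_size = 2000
se = proportion_standard_error(estimate, sample_size)
low, high = confidence_interval(estimate, sample_size)
print(f"estimate {estimate:.1%}, standard error {se:.2%}, "
      f"95% CI {low:.1%} to {high:.1%}")
```

A report that gives only the point estimate leaves the reader unable to judge whether a difference between two surveys, or between two years, is larger than sampling error alone would produce.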

Standardization of Data Elements

Health data are collected by many organizations and at multiple geopolitical levels for a variety of uses. Standardization of data elements across programs is necessary to permit comparisons and avoid duplicative efforts. Considerable progress has been made at the federal level by the Office of Management and Budget in providing standards for data collection, analysis, and distribution. Even so, surveys often produce different estimates because of the lack of standardization.

Some progress has been made in the development of uniform minimum data sets under the auspices of the National Committee on Vital and Health Statistics (NCVHS), but this effort must be continued as the needs for data at the state and local levels continue to grow.

Investment in Health Statistics

There is a lack of political commitment to invest in quality health statistics in the public and private sectors.


I reluctantly conclude that despite improvements, health statistics production in this country presents a picture of fragmented data collection, lack of common definitions and uniformity of reporting, duplicative and overlapping systems, and resistance to data sharing. It is encouraging that progress has been made along some fronts, but we have a long way to go to fill the data gaps and to provide the health statistics needed for the 21st century.


Recognizing the philosophy that the federal government should only be in the business of doing things that cannot be adequately done by states and/or the private sector, it may be necessary to reassess the core programs of the federal statistical system.

Regardless of what changes must be made in the core programs, we must ensure that an information base continues to be available that will provide baseline data, be useful for monitoring trends, and have the ability to quickly detect any changes or aberrations in the economic, social, or health characteristics of the nation. The appropriate federal role in statistics is to produce national level data useful for those purposes as well as provide norms to which sub-national data can be compared. The data must be of high quality, produced in a timely manner, and relevant to issues of the day.

  • Develop and Promulgate Standards

Federal statistical agencies must assume responsibility for activities that cannot reasonably or feasibly be assumed by individual states, local governments, and the private sector. The federal role must include the development and promulgation of standards and procedures for assuring the validity, reliability, comparability, and quality of statistical products and the provision of technical assistance in these areas. Federal statistical agencies also must anticipate future needs for information and design today’s systems to meet those needs.

  • Enhance Resources

In considering future prospects for improved health statistics to meet the needs of the 21st century, we must recognize that resources will not grow parallel to demands for data and services. The demands for health data are greater than our ability to produce them. Budgetary pressures are requiring assessment of current data collection and dissemination procedures. Statistical agencies must make choices between data collection, research, and analysis, and among needed data sets. NCHS needs a constituency, and we need a cultural change to invest in the 21st century vision for health statistics.

  • Implement the 21st Century Vision for Health Statistics

As we move closer to our objective of a national and systematic approach to meeting the information needs for health policy development and program evaluation, we also need to coordinate our data collection activities, both within the federal establishment and between government and the private sector. Although considerable progress has been made in coordination, we must continue to avoid unnecessary and costly duplication, to encourage comparability of information collected by different systems, and to use the ongoing data collection programs to provide specific information for many organizations. More effort is needed to provide essential data, yet reduce the burden on individual and institutional respondents. We must develop, articulate, and implement a 21st century vision for health statistics.

I want to add my thanks to the NCVHS for 50 years of leadership in health statistics.