Testimony of the cancer Biomedical Informatics Grid™ (caBIG™)
Data Sharing and Intellectual Capital (DSIC) Workspace

Before the
National Committee on Vital and Health Statistics (NCVHS)
Ad Hoc Work Group on Secondary Uses of Health Data

By Wendy Ehrenkranz Patterson, Esq.
National Cancer Institute

August 1, 2007

Washington, D.C.

Thank you for the opportunity to speak to you today on behalf of the National Cancer Institute’s cancer Biomedical Informatics Grid™ (caBIG™) about the uses of health data in biomedical research  and the framework for sharing such data within the caBIG™ infrastructure.  My name is Wendy Patterson and I am a Senior Advisor in the National Cancer Institute’s Technology Transfer Center where I provide guidance on data sharing and intellectual property matters arising from NCI grants, contracts and cooperative agreements.  One of my responsibilities is to serve as the NCI facilitator for the caBIG Data Sharing and Intellectual Capital (DSIC) Workspace, which seeks to address the legal, regulatory, ethical, policy, academic, proprietary and contractual barriers to data exchange for public health and research purposes.


Toward Personalized Medicine: The Convergence of Clinical Care and Research

Biomedicine in the Twenty-First Century is characterized by the progressive transition to the personalized medicine paradigm, in which the unique characteristics of an individual patient and his or her disease drive delivery of prevention and care.  While medicine at the point of care has always focused on the distinct traits of a presenting patient, the explosion of knowledge of disease at the molecular level allows care providers to target the specific etiologic underpinnings of an individual’s disease, condition, or risk.

The National Cancer Institute (NCI) is charged by Congress through the National Cancer Act to lead the nation’s cancer research efforts.  In service of this mission, the NCI seeks to improve quality of care and outcomes by utilizing personalized medicine to drive the selection of medications, therapies or preventive measures that are particularly suited to individual patients at the time of administration.1-3

The NCI senior leadership has recognized that barriers to the use of information technology represent a critical and early stumbling block to leveraging the benefits4 of personalized medicine.  In response, the NCI has established the strategic goal of utilizing biomedical informatics to create a virtual web of interconnecting data, individuals, and organizations to redefine how biomedical research is conducted, clinical care is provided, and patients interact with the biomedical research enterprise.  To achieve this objective, NCI launched the caBIG™ initiative in February 2004.  NCI’s intent is to create a standards-based distributed informatics infrastructure – bridging individual institutional and organizational silos in the broader cancer community.  The caBIG™ vision is “a full cycle of integrated cancer research, extending from bench to bedside, and back again.”5

This shift will require an extraordinary synthesis of multidimensional data and the joining of diverse communities.  Yet it is not without precedent in the cancer community.  Treatment of childhood cancers today relies on a model that joins diverse biomedical  communities – including researchers, care providers, and data repositories – to direct care based on the molecular characteristics of disease.  This model arguably is responsible for the tremendous successes that have been observed during the past several decades.

As is soon expected to be the case among adults, cancer is the chief cause of death by disease among children between the ages of one and 14.6  However, unlike adult cancer mortality rates, childhood cancer mortality rates have declined nearly 50% since 1975.7  These numbers are even more impressive when attention is focused on the leading form of childhood cancer, acute lymphoblastic leukemia (ALL).8  Among children ages 0-14 with ALL, mortality rates have decreased more than 70% since 1975 and five-year survival is more than 87%.8

The pattern of care associated with childhood cancer differs remarkably from that of adult cancer.  First, childhood cancer is treated in a context that blends care delivery and clinical research.  On average, more than 50% of children receive treatment in a clinical trials setting; for treatment of ALL at pediatric centers, this number is nearly 85%.9  ALL in children is also treated with consideration of the individual’s biomarkers that are believed to reflect the molecular origin of the disease.  These molecular markers are currently used to tailor the intensity of therapy to minimize toxicity.10

The success of the childhood cancer treatment paradigm offers important lessons that can be extrapolated into broader biomedical contexts.  At the clinical end of the spectrum, researchers and practitioners are able to correlate experimental lab data with patient data (treatment, histories, pathology, outcomes, and the like).  At the molecular end of the spectrum, researchers are able to cross-reference and study accumulating bodies of multi-dimensional molecular information.  The information forms a continuous cycle so that the clinical data are utilized to evaluate outcomes and the results of those evaluations are fed back both to researchers to develop and better refine evidence-based standards at an increasingly individualized level and to care providers to improve quality through better adherence to those standards.  Ultimately, one can envision that a patient’s encounter with the health care system will be informed by the seamless and continuous flow of information generated by these same techniques – diagnosis, treatment and prevention of disease on an individualized molecular level.

Impact of the Current Regulatory Environment on Personalized Medicine

The promise of personalized medicine can be realized only by assuring appropriate use and flow of data from bench to bedside to bench again.  “Central to the ability to deliver safe, effective, and patient-centered care is a need for better and timelier evidence on which to base clinical decisions about which medical interventions are best, for whom, and under what circumstances.”11  Data derived from multiple and diverse sources – care providers, researchers, and patients themselves – must be integrated and made accessible to patients and their care providers to enable truly informed identification and assessment of treatment options.  In practice this occurs in some settings even today.  But advances in information technology offer unprecedented opportunities to pursue the ultimate goal of delivering care targeted to individual patients based on their individual characteristics to enhance safety and improve outcomes.

Thus the lines typically drawn in today’s regulatory environment and among some experts12 between “primary” activities (medical treatment and related payment and operations such as patient safety and quality improvement initiatives) and “secondary” uses (research and public health activities, for example), are increasingly artificial, and in many cases will be indistinguishable.  For example, when a patient receives a cancer therapeutic agent through a clinical trial, related information will be used for both her personal clinical care and for the research.  Moreover, the information will inform expectations regarding the drug’s risks and benefits if it is approved for distribution to patients.  Increasingly, researchers and care providers will not know a priori what information may later be needed for clinical care, quality improvement, or research purposes.  Because secondary use typically is not considered central to health care delivery, and thus is disadvantaged in health information exchange and health privacy regulations, these artificial lines also threaten to impede the vision of individualized health care.  Indeed, the devastating impact on research and public health of such differential treatment has been well-documented.13-15  The impact is exacerbated by the extraordinarily complex web of laws and regulations that govern health information exchange, which cause confusion among health care providers and researchers in understanding their obligations.[*] 16, 17

Numerous approaches to health information exchange regulation and privacy protection have been advanced.  Some argue that individual autonomy and privacy protection are absolute imperatives and that patients must be permitted to exercise absolute control over the use of their data, even for treatment and payment activities.  Indeed, legislation recently was introduced that would move substantially toward that objective.18  Others argue that a fledgling health information exchange industry should be left alone and that the market will reward entities that successfully build trust relationships with their customers.  The burgeoning growth in the transfer of health data for commercial purposes gives many pause.  For example, internet search firms solicit health data from individuals and others to target advertisements for products or services and to profile users’ internet browsing.  Data brokers obtain access to health information from a variety of sources, then analyze or resell it for various commercial purposes.

While it is beyond the scope of this statement to discuss in detail, we note that these commercial uses, though they may prove beneficial to the delivery of individualized health care, are largely unregulated, at least at the federal level, because the involved firms often are not “covered entities” under the Health Insurance Portability and Accountability Act of 1996 (“HIPAA”).19  Thus, they present far higher risks to individual privacy than do uses for federally regulated research or provider-driven quality improvement activities.

To avoid the unintended negative consequences of overly broad restrictions on the use of health information, restrictions that can eviscerate the ability to deliver personalized health care, we urge the Work Group to consider a middle ground that would recognize medical treatment and related activities, including research, as central to the health care delivery system.  To assure ethical use of health information in the conduct of research studies (e.g., consistent with an individual’s informed consent or under an IRB-approved waiver in appropriate circumstances), such preferential treatment of research uses and disclosures might be permitted only for activities that are approved by and subject to the oversight of an institutional review board (IRB) operating under a Federalwide Assurance.  New models under development may provide additional alternatives.20, 21

The National Cancer Institute, through its caBIG™ initiative, is working to eliminate technological barriers to data sharing and to assist health care providers and researchers in addressing various legal and regulatory challenges to advance the move toward personalized medicine.  We believe it essential to the success of this and related initiatives that the Work Group and other stakeholders recognize the centrality of research to the clinical enterprise and reject constructs that elevate quality improvement or public health activities as both distinguishable from and more essential than research.


Notwithstanding the advancing paradigm shifts in health care delivery, it remains imperative that patients and research participants have confidence that the deployment of an electronic system of health information exchange does not jeopardize their privacy interests.22-24  To this end, caBIG™ is developing a framework for managing data exchange, which may provide a useful national model.  This approach recognizes that there are varying levels of sensitivity of health information and that many data exchanges require agreements, validation of users, authorization of intended uses, and so forth.  The caBIG™ program does not focus on the nature of the “use” of the data, but rather on the controls that are needed to appropriately secure access depending on the data’s sensitivity.  Because the caBIG™ infrastructure is premised on the concept of federation, individual entities that control access to data are responsible for assessing the risk and consequent protection required for any data set to be shared.  Thus, for example, an entity in a state with very restrictive privacy laws likely will impose more restrictions on access to its data (e.g., prospective written individual patient consent) than an entity in a state with less restrictive requirements; the caBIG™ infrastructure enables a differential setting of terms of access.

The concept of federation within caBIG™ rests on a technical infrastructure that enables the sharing of data and analytical tools across multiple participating institutions and at different time points.  The participating institutions retain legal responsibility for research data generated within their institutions in accordance with applicable laws, regulations and policies.  Hence, the institutions determine who will be authorized to obtain access to data that they control, and under what terms and conditions (e.g., levels of security controls, types of contractual provisions, and so forth).  For the exchange of sensitive data (e.g., identifiable information about research participants, or data with significant proprietary value), trust agreements will be necessary.  However, because the legal, regulatory, ethical, and contractual limitations on data exchange facing various participating institutions are similar, a considerable amount of even moderately sensitive data can be exchanged using common template agreements or contractual provisions devised with input from a diverse group of stakeholders.  Where highly sensitive data are to be exchanged, data providers and users can negotiate any additional protections that may be warranted.

The caBIG™ Infrastructure

The developing caBIG™ infrastructure25 utilizes community-defined standards and architecture to support interoperable software applications and enable the sharing of cancer data.  This infrastructure leverages existing resources and supplements these with new applications, toolkits, and other devices developed by experts in the caBIG™ community to:

  • provide scientists with the ability to collaborate and integrate data and findings to accelerate research;
  • assist the cancer community with priority setting, decision-making, and participation to accelerate completion of clinical trials;
  • empower advocacy groups and individual patients to participate in clinical research; and
  • help health care providers become patients’ partners in the research enterprise and educated consumers of research findings.

All products created with NCI caBIG™ program funds are made available on an open development, open source and open access basis.

The caBIG™ infrastructure is designed to promote the emerging paradigm of personalized, molecularly-based medicine by creating the capacity to integrate and aggregate information that has been collected at different times, in different locations, by different clinical and research groups.  We have already begun to pilot a set of caBIG™ tools that integrate clinical trials data and high throughput molecular analysis to support the transition to tailored therapy.  These applications allow researchers to access and analyze clinical and experimental data collected across multiple participating institutions and study time points.  Examples of NCI initiatives that are using these tools include:

  • REMBRANDT (REpository of Molecular BRAin Neoplasia DaTa), which integrates genetic and clinical information from brain tumor clinical trials for improved research, disease diagnosis, and treatment.   Ultimately, the REMBRANDT project will produce a national molecular/genetic/clinical database of several thousand primary brain tumors that will help physicians give patients a more accurate prognosis and select treatments more likely to be effective in any individual tumor.26, 27
  • I-SPY (Investigation of Serial studies to Predict Your Therapeutic Response with Imaging And moLecular analysis), a correlative science study for women undergoing neoadjuvant chemotherapy for breast cancer who are evaluated with serial core biopsies and magnetic resonance imaging (MRI) during the course of their treatment. The effort involves the integration of specific data types associated with the study including gene copy number, gene expression, protein expression, MRI, demographics, and short- and long-term responses to therapies throughout the course of treatment.  By collecting and sharing biomedical research data across organizations the I-SPY team hopes to integrate the research results for analysis in support of translational research, in particular by developing a database of breast cancer biomarkers that predict response to therapy during the course of the cancer treatment.

The caBIG™ Community and the Data Sharing and Intellectual Capital (DSIC) Workspace

The caBIG™ community convenes through numerous teleconferences and face-to-face meetings, which are open and available to anyone who chooses to participate.  Participants are drawn from academic institutions, industry, standards development organizations, advocacy groups, government sponsors and regulatory agencies.  They are organized through various Workspaces, which in turn identify and execute caBIG™ priorities:

  • Domain Workspaces focus on informatics problems in a particular domain of cancer research – clinical trials, imaging, tissue banks, genomics, proteomics, epidemiology and population sciences.
  • Cross-Cutting Workspaces integrate the Domain Workspaces together into a common framework with consistent architecture and standards.
  • Strategic Workspaces address issues of concern to both the Domain and Cross-Cutting Workspaces and set the overall guidelines and goals for the caBIG program.

The Data Sharing and Intellectual Capital (DSIC) Workspace is a strategic level caBIG™ Workspace whose members include biomedical researchers, clinicians, technology transfer experts, intellectual property and regulatory attorneys, policy specialists, patient advocates, bioethicists, and bioinformaticists.  Members participate in Workspace-wide activities and through two individual Special Interest Groups (SIGs): one that focuses on regulatory issues and another that focuses on intellectual property and proprietary concerns.

caBIG™ participants represent a broad range of organizations including health care providers and researchers, patients and research participants, public and private sponsors, application developers, and more.  The diversity of this community creates substantial challenges to data sharing as noted below:

  • caBIG™ participants have varying obligations under federal and state data privacy28 and security29 laws and standards, including HIPAA and associated privacy and security rules,30 FDA security regulations,31 and FISMA.32
  • Human research is subject to oversight by a broad range of ethical review boards – IRBs – whose local requirements regarding collection, maintenance, and use of identifiable data often vary substantially based in part on the requirements of the Common Rule,33 FDA regulations,34 and other federal, state, and voluntary regulations, standards and codes.
  • Academic considerations (the need to secure grants and publish research results in peer-reviewed literature) often discourage sharing, particularly during early stages of research.
  • Researcher and sponsor concerns regarding ownership and control of intellectual property are substantial; industry funding and material transfer agreements often require at least temporal restrictions on data sharing.
  • Patient safety concerns related to premature access to unvalidated information discourage researchers who otherwise might be inclined to share data from doing so.
  • Public perceptions regarding privacy, security and confidentiality of health information, informed at least in part by widespread distrust of electronic data storage and of the human research enterprise more broadly, make it difficult for researchers and research institutions to champion data sharing initiatives.

DSIC’s mission is to facilitate data sharing between and among caBIG™  participants by addressing the legal, regulatory, ethical, policy, academic, proprietary and contractual barriers to data exchange for public health and research purposes.  Our members believe that strong confidentiality, privacy and security measures are both necessary and feasible in any electronic health information exchange environment, and that the measures can be scaled to accommodate a broad range of participants, without unnecessarily impeding scientific discovery and medical progress.



The caBIG™ community has, from the start, anticipated the need to accommodate diverse stakeholders’ varying needs for data confidentiality, privacy and security standards and assurances, and continues to work to eliminate or reduce identified barriers to the broad data sharing that is necessary to advance scientific progress and speed medical discovery.  The caBIG™ community’s efforts have focused on three areas: a federated architecture for data sharing – to maintain local control of clinical and research records; an analytical framework designed to encourage consistent analysis of legal, regulatory, ethical and other barriers to data sharing and identify solutions; and standards, tools and infrastructure broadly available to members of the caBIG™ community and beyond to facilitate data sharing.

Federated Architecture

The technical infrastructure of the caBIG™ program is based on a set of technologies known as caGrid. This infrastructure is designed as a standards-based, service-oriented, federated biomedical information network that allows systems constructed according to a series of compatibility guidelines35 to interoperate with each other, and with properly authorized and authenticated end users.  Like its overall architecture, the security infrastructure of caGrid is federated, allowing users to authenticate (assert their identity) at their local institutions, while allowing data providers to retain local control of the decision to authorize access to any particular data resource.  This process is implemented through a combination of technology and trust agreements among the entities managing elements of the caGrid infrastructure and local data and identity providers, which can be enforced through a combination of laws, regulations, formal contractual commitments and informal terms of use.  The caGrid technical framework operates within the preexisting legal and regulatory framework for biomedical information exchange, that is, within the Federal, state and local requirements that govern access to health information as well as the body of contract law and other commercial laws that govern contractual relationships.  Thus the “federation” comprises a coalition or community of participants that are bound to one another via individual, voluntary commitments under a community-developed set of governance procedures.  The legitimacy of the governance framework is conferred by the open and transparent process pursuant to which the framework is developed.
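
The division of labor described above, in which a user's home institution authenticates and the data provider separately authorizes, can be illustrated with a minimal sketch.  It is an illustration only and does not represent caGrid software or its interfaces: the class names, the toy credential store, and the institution names are all hypothetical, and the set of trusted issuers simply stands in for the trust agreements that would be in force.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    """Identity claim issued by a user's home institution (hypothetical shape)."""
    user: str
    issuer: str


@dataclass
class IdentityProvider:
    """A local institution that authenticates its own users."""
    name: str
    # Toy credential store for illustration only; a real deployment would
    # rely on certificates or a comparable mechanism.
    credentials: dict = field(default_factory=dict)

    def authenticate(self, user: str, secret: str) -> Optional[Assertion]:
        """Return an assertion if the local credential check passes."""
        if user in self.credentials and self.credentials[user] == secret:
            return Assertion(user=user, issuer=self.name)
        return None


@dataclass
class DataProvider:
    """An institution that keeps local control over access to its data."""
    name: str
    trusted_issuers: set = field(default_factory=set)  # trust agreements in force
    authorized: set = field(default_factory=set)       # (issuer, user) pairs

    def authorize(self, assertion: Optional[Assertion]) -> bool:
        """Grant access only to authorized users vouched for by a trusted issuer."""
        if assertion is None:
            return False
        if assertion.issuer not in self.trusted_issuers:
            return False
        return (assertion.issuer, assertion.user) in self.authorized


def demo() -> None:
    idp = IdentityProvider("University A", {"alice": "correct-secret"})
    provider = DataProvider(
        "Cancer Center B",
        trusted_issuers={"University A"},
        authorized={("University A", "alice")},
    )
    print(provider.authorize(idp.authenticate("alice", "correct-secret")))
    print(provider.authorize(idp.authenticate("alice", "wrong-secret")))


if __name__ == "__main__":
    demo()
```

In this sketch the data provider never sees the user's credentials; it relies on the issuer's assertion and on its own local list of who may access the resource, which mirrors the separation between identity providers and data providers described above.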

The NCI’s “instance”[†] of the caGrid infrastructure is known as ‘NCI-caGrid.’  The federated, distributed nature of caGrid technology also allows for the creation of a series of caGrid-connected networks that are managed independently but are interoperable with the NCI-caGrid. This flexibility allows an individual medical center, cooperative group, or other entity to set up its own instance of caGrid that can operate behind a firewall, or with different security requirements than those of the NCI-caGrid.  However, participants in a local caGrid can interact with the NCI-caGrid so long as an appropriate trust agreement is implemented.  Similarly, participants in the NCI-caGrid can interact with other large-scale implementations of caGrid technology that are expected to be compatible with caBIG standards, such as the upcoming CardioVascular Research Grid (CVRG) or the United Kingdom’s National Cancer Research Institute Oncology Information Exchange (NCRI ONIX).36, 37

Strategic Implementation: Analytical Framework

DSIC recognizes that different data require different types of protection.  Some data, such as individually identifiable human health information, are highly sensitive and require significant protection to address legal, regulatory and ethical constraints on access and use.  Other data, for example, highly aggregated or completely de-identified datasets (that is, datasets containing no personal identifiers), typically do not require such stringent protection.  To address these differences and facilitate data sharing within the caBIG™ community, DSIC has developed the caBIG DSIC WS Framework for Data Sharing Terms and Conditions (“Framework”) (see Attachment 1, also available on the caBIG website) that we believe can be helpful in analyzing challenges and identifying opportunities for the caBIG™ community and for electronic health information initiatives more generally.

The Framework is designed to empower and encourage individuals and institutions seeking to share data to consistently analyze any constraints on such efforts, grouped into four broad categories: (i) economic or proprietary concerns of researchers and research institutions; (ii) federal and state privacy and security laws and regulations and institutional policies; (iii) ethical considerations, reflected in explicit consumer- and IRB-imposed constraints on data sharing, including restrictions specified in informed consent documents; and (iv) contractual restrictions imposed by research sponsors.  DSIC believes that, once identified, many of those barriers can be reduced or even eliminated, at least for some subsets of data.  These objectives are being accomplished through the series of existing and planned standards, tools, and infrastructure arrangements described below.

The Framework encourages data providers to determine access in accordance with a systematic analysis of these various requirements and offers a mechanism for speeding the electronic transmission of data consistent with an institution’s judgment regarding the sensitivity of a particular dataset.  Individually identifiable health data are available under the stewardship of the providing institution, which determines who is authorized to access the data consistent with its legal, regulatory, ethical and internal policy obligations.  Thus, for example, a health care provider might make available through the NCI-caGrid to the broader caBIG community only data that have been completely de-identified.  That same provider might be willing to share identifiable extracts of the same dataset only with prior patient (or subject) consent, or only with a researcher at another institution (typically through a locally operated instance of caGrid), and then only if the researcher has secured appropriate IRB approval for the project and has agreed to certain assurances, either through standardized trust agreements or less formal click-through terms of use that meet applicable regulatory requirements (e.g., data use agreement assurances), or through non-standard bilateral written agreements.
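
The example in the preceding paragraph can be restated as a small decision rule.  The sketch below encodes one hypothetical institution's policy, not the Framework itself: the two sensitivity levels, the field names, and the function name are assumptions introduced for illustration, and a real institution would apply its own legal, regulatory and ethical analysis.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    """Two illustrative sensitivity levels (an assumption, not a caBIG taxonomy)."""
    DEIDENTIFIED = "de-identified"
    IDENTIFIABLE = "identifiable"


@dataclass(frozen=True)
class AccessRequest:
    """What the providing institution knows about a request (hypothetical fields)."""
    requester: str
    has_patient_consent: bool = False
    has_irb_approval: bool = False
    accepted_assurances: bool = False  # trust agreement or click-through terms


def may_release(sensitivity: Sensitivity, request: AccessRequest) -> bool:
    """Apply the example policy from the text for a single dataset."""
    # Completely de-identified data are offered to the broader community.
    if sensitivity is Sensitivity.DEIDENTIFIED:
        return True
    # Identifiable extracts: released with prior patient (or subject) consent ...
    if request.has_patient_consent:
        return True
    # ... or to a researcher with IRB approval who has agreed to the assurances.
    return request.has_irb_approval and request.accepted_assurances


if __name__ == "__main__":
    researcher = AccessRequest("researcher-1", has_irb_approval=True,
                               accepted_assurances=True)
    print(may_release(Sensitivity.IDENTIFIABLE, researcher))
```

Because each providing institution evaluates its own rule, two institutions holding similar data could reach different answers for the same request, which is the differential setting of terms of access that a federated approach permits.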

Technical Implementation: Standards, Tools, and Infrastructure

DSIC and other caBIG™ Workspaces are developing and implementing standards, tools and infrastructure to support data sharing consistent with the constraints described above.  For example, to complement the framework described above, DSIC is developing:

  • Web-based terms of use and standardized contractual provisions for trust agreements designed to facilitate data sharing consistent with HIPAA and other applicable privacy and security laws and with human research protection regulations and accreditation standards.
  • Model language for applications submitted to IRBs designed to educate their members regarding caBIG™ and the NCI-caGrid, the benefits and risks of data sharing, and the mechanisms used in various caBIG™ tools to mitigate risks.
  • Model language for informed consent and authorization documents, designed to encourage consumers to participate in the caBIG™ initiative, consistent with legal and regulatory requirements specified in the Common Rule, FDA regulations, accreditation standards, and HIPAA.

The caBIG™ Tissue Banks and Pathology Tools (TBPT) Workspace has developed caTISSUE Core,  a tool designed to track the extent to which individual patients or research participants have given permission for their collected biospecimens and related data to be used for research purposes.  These permissions frequently are granted during the course of a patient encounter or clinical trial.  This tool allows users to track different tiers of consent as well as decisions by research participants to withdraw consent for the use of specific specimens.  The primary drivers for this tool are the need to fulfill ethical obligations to patients and research participants to honor their expressed wishes, and the desire to reduce ambiguity with respect to the ability to utilize biospecimens for translational research.  Other tools being deployed to support other components of the clinical research endeavor include caExchange, under development by the Clinical Trials Management Systems (CTMS) Workspace, the National Cancer Imaging Archive, implemented by the In Vivo Imaging (IMAG) Workspace, and caArray, produced through the Integrative Cancer Research (ICR) Workspace.
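
The kind of bookkeeping described for consent tracking, recording which uses a participant has permitted for a specimen and honoring a later withdrawal, might be sketched as follows.  This is not the caTISSUE Core data model or interface: the class names, method names, and the use labels (such as "cancer-research") are hypothetical and chosen only to illustrate tiered consent with withdrawal.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRecord:
    """Permissions a participant has granted for one specimen (hypothetical shape)."""
    specimen_id: str
    permitted_uses: set = field(default_factory=set)
    withdrawn: bool = False


class ConsentRegistry:
    """Tracks tiered consent per specimen and honors withdrawal."""

    def __init__(self) -> None:
        self._records = {}

    def record(self, specimen_id: str, permitted_uses) -> None:
        """Store (or replace) the permitted uses for a specimen."""
        self._records[specimen_id] = ConsentRecord(specimen_id, set(permitted_uses))

    def withdraw(self, specimen_id: str) -> None:
        """Mark consent as withdrawn; raises KeyError for an unknown specimen."""
        self._records[specimen_id].withdrawn = True

    def is_permitted(self, specimen_id: str, use: str) -> bool:
        """A use is allowed only if consent exists, covers it, and is not withdrawn."""
        rec = self._records.get(specimen_id)
        if rec is None or rec.withdrawn:
            return False
        return use in rec.permitted_uses


if __name__ == "__main__":
    registry = ConsentRegistry()
    registry.record("SPEC-001", {"cancer-research"})
    print(registry.is_permitted("SPEC-001", "cancer-research"))
    registry.withdraw("SPEC-001")
    print(registry.is_permitted("SPEC-001", "cancer-research"))
```

Defaulting to "not permitted" when no record exists reflects the stated aim of reducing ambiguity about whether a biospecimen may be used.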

Finally, DSIC provides support to the NCI-caGrid Security Working Group (SWG) on security policy matters.  The primary authority for implementation of security policy and procedures for the NCI-caGrid infrastructure is the NCI Center for Biomedical Informatics and Information Technology (NCI CBIIT), which approves all policies and procedures relating to security on the NCI-caGrid. To support this activity, the NCI CBIIT has authorized the creation of the SWG, which develops recommendations for security policies and requirements covering systems that are attached to or access the NCI-caGrid. This group is considering input from caBIG™ domain workspace participants, the NCI, regulatory agencies, and members of the public and  will then make recommendations regarding policies and procedures to the NCI CBIIT through the NCI’s general contractor for the caBIG™ program.38  The SWG mission statement can be found on the NCI’s GForge website.

The SWG will offer recommendations in the following areas: 

  • Periodic security risk assessment procedures for caGrid infrastructure and portions of NCI-caGrid-facing services or components;
  • Security policies regarding federated authentication, certificate management/provisioning, group-based authorization, protection of sensitive data, user security policies and procedures; and
  • Security policy implementation procedures for use in NCI-caGrid-facing components across caBIG.

The SWG is presently creating a set of baseline policies that will allow a low barrier to entry of data via the NCI-caGrid, particularly for systems that carry non-sensitive information.

The SWG then will develop a set of policies and procedures sufficient for an entity with highly sensitive data to confidently permit access to those data through the NCI-caGrid.  Such confidence can be achieved only if these policies and procedures are created through an open process that seeks input from all members of the diverse caBIG™ community, as described above.


Resetting the Context

The implication of the context-setting questions initially presented to DSIC by the NCVHS Work Group is that “direct patient care” is somehow distinguishable from quality improvement and other activities such as research.  We understand from the subsequent questions that in fact there is more openness to dispensing with these increasingly artificial distinctions.  As described above, we believe that research is an integral component of health care delivery in the 21st Century.  Thus, we encourage the Work Group to recognize the critical importance of research in developing its future recommendations.

Adequacy of Existing Privacy Regulations

We take no explicit position on the adequacy of existing privacy protections.  Opinions vary tremendously across the country and even within our own workspace.  Consensus on the point cannot be achieved until we agree first on our priorities: do we embrace privacy in the name of patient autonomy and to the exclusion of public health and other societal objectives?  Even assuming the answer is “yes,” can patients make an informed choice in the current environment?  Are they aware, for example, that by withholding information for clinical care, quality improvement, outcomes evaluation, or research, they will contribute to a reduction in the validity of results and potentially negatively impact their own individual care?  Responses to these questions need to be developed carefully within the broader context of the public’s expectations for cost-effective and technology-driven health care delivery in the 21st Century.

We do believe that HIPAA is, at best, an imperfect approach to health privacy, particularly in the context of research and the emerging personalized medicine paradigm.  It offers no protection at all to PHI maintained by a broad range of commercial entities.  And although the Privacy and Security Rules apply to most health care providers conducting research, even one of the Privacy Rule’s primary authors acknowledges that, when initially drafted, HIPAA “was not a regulation about research.  Research was not a central consideration, nor the thing that got the most attention, and it was also a difficult issue. . . . So, as a more difficult conversation that was not central to the policy debate, it was put off until late in the process.  In the end, research did get a fair amount of attention, although not from people who were intimately familiar with how the research world operated.”39  Unsurprisingly, HIPAA suffers well-documented shortcomings that impede medical progress without significantly advancing individual privacy or autonomy.14, 40, 41  Thus, notwithstanding widespread commitment by researchers to respecting study participants’ privacy and securing the confidentiality of their data, there is little appetite within the research community to extend HIPAA’s detailed and onerous administrative requirements and restrictions beyond their current application.  By contrast, there is likely to be broad support for enhancing penalties – and enforcement – against truly deleterious misconduct.42

Given these concerns, we encourage the Work Group to consider carefully any recommendations that may impact research and the drive toward personalized medicine – and to assure that the research community remains centrally involved in the development of any new standards.


Personalized medicine offers the opportunity to identify and apply evidence-based standards “in real time,” thereby improving our ability to deliver effective prevention and treatment to patients fighting cancer and other diseases.  Personalized medicine cannot exist apart from research, which includes clinical investigations, quality improvement activities, population science and epidemiology/public health surveillance.  Electronic health information initiatives offer the research community the ability to leverage existing data both to expedite medical discovery and to optimize individual patients’ care.  While we are tailoring our security policies and infrastructure to accommodate current rules concerning patient privacy, we are concerned that these frequently onerous compliance obligations will not scale to permit the next generation of health care delivery as the vision of personalized medicine firmly takes hold.  We encourage NCVHS to support efforts that enhance, not further limit, the ability to effectively conduct research while protecting individual patients’ privacy and respecting their autonomy.  If NCVHS recommends additional legislative or regulatory action to assure adherence to specified confidentiality, privacy and security policies and procedures in connection with the use of “secondary” data, we urge the Work Group, and the NCVHS as a whole, to recommend actions that do not undermine existing accommodations for the research enterprise.


I would like to acknowledge the invaluable contributions of the following individuals, who helped me shape the ideas and opinions described in my testimony and prepare my materials: Elaine Brock, J.D., M.H.S.A., University of Michigan; Kenneth Buetow, Ph.D., NCI; Deborah Collyar, B.A., Patient Advocates in Research; Margia Corner, B.A., University of Michigan; George Komatsoulis, Ph.D., NCI; Rachel Nosowsky, J.D., University of Michigan; Patricia Weeks, B.S., Fox Chase Cancer Center; and Marsha Young, J.D., Booz Allen Hamilton. I also wish to thank the members of the DSIC Workspace who reviewed drafts of this document and contributed to its development.


  1. Meadows M. Genomics and Personalized Medicine. FDA Consumer Magazine 2005;39(6).
  2. Personalized Medicine. Wikipedia, 2007. (Accessed June 17, 2007, at http://en.wikipedia.org/wiki/Personalized_medicine.)
  3. HHS Secretary Leavitt Announces Steps Toward a Future of “Personalized Health Care”. 2007. (Accessed June 20, 2007, at http://www.hhs.gov/news/press/2007pres/03/pr20070323b.html.)
  4. Muller AJ, Scherle PA. Targeting the mechanisms of tumoral immune tolerance with small-molecule inhibitors. Nat Rev Cancer 2006;6(8):613-25.
  5. About caBIG(tm). National Institutes of Health, 2007. (Accessed June 17, 2007, at https://cabig.nci.nih.gov/overview.)
  6. American Cancer Society. Cancer in Children. Atlanta, GA; Feb. 20, 2007.
  7. American Cancer Society. Cancer Facts and Figures. Atlanta, GA; 2007.
  8. Ries LAG, Melbert D, Krapcho M, et al. SEER Cancer Statistics Review, 1975-2004. Bethesda, MD, http://seer.cancer.gov/csr/1975_2004/: National Cancer Institute; 2007.
  9. Shochat SJ, Fremgen AM, Murphy SB, et al. Childhood Cancer: Patterns of Protocol Participation in a National Survey. CA Cancer J Clin 2001;51(2):119-30.
  10. Rubnitz JE, Pui C-H. Childhood Acute Lymphoblastic Leukemia. Oncologist 1997;2(6):374-80.
  11. Roundtable on Evidence-Based Medicine. Institute of Medicine, 2007. (Accessed June 20, 2007, at http://www.iom.edu/CMS/28312/RT-EBM.aspx.)
  12. Safran C, Bloomrosen M, Hammond WE, et al. Toward a National Framework for the Secondary Use of Health Data: An American Medical Informatics Association White Paper. J Am Med Inform Assoc 2007;14(1):1-9.
  13. Aronovitz LG. Health Information: First-Year Experiences under the Federal Privacy Rule. Washington, DC: Government Accountability Office; 2004.
  14. Nosowsky R, Giordano TJ. THE HEALTH INSURANCE PORTABILITY AND ACCOUNTABILITY ACT OF 1996 (HIPAA) PRIVACY RULE: Implications for Clinical Research. Annual Review of Medicine 2006;57(1):575-90.
  15. Melton LJ. The Threat to Medical-Records Research. N Engl J Med 1997;337(20):1466-70.
  16. Gross J. Keeping Patients’ Details Private, Even from Kin. New York Times July 3, 2007.
  17. Testimony Before the Confidentiality, Privacy, and Security Workgroup of the U.S. Department of Health and Human Services, American Health Information Community. (Accessed July 21, 2007, at http://www.hhs.gov/healthit/ahic/materials/06_07/cps/flstatement.pdf.)
  18. S. 1814: Health Information Privacy and Security Act. Introduced July 17, 2007.
  19. Health Insurance Portability and Accountability Act. In: 42 US Code § 1320d; 1996.
  20. Kohane IS, Mandl KD, Taylor PL, Holm IA, Nigrin DJ, Kunkel LM. MEDICINE: Reestablishing the Researcher-Patient Compact. Science 2007;316(5826):836-7.
  21. Wendler D. One-Time General Consent for Research on Biological Samples: Is It Compatible With the Health Insurance Portability and Accountability Act? Arch Intern Med 2006;166(14):1449-52.
  22. Lin Z, Owen AB, Altman RB. GENETICS: Genomic Research and Human Subject Privacy. Science 2004;305(5681):183-.
  23. McGuire AL, Gibbs RA. GENETICS: No Longer De-Identified. Science 2006;312(5772):370-1.
  24. Malin B. Testimony Before the Confidentiality, Privacy, and Security Workgroup of the U.S. Department of Health and Human Services, American Health Information Community (AHIC). June 22, 2007; Washington, DC.
  25. MITRE Corporation. caBIG(tm) Overview. McLean, VA May 2006.
  26. caIntegrator/REMBRANDT – Repository for Molecular Brain Neoplasia Data. National Cancer Institute, 2007. (Accessed July 20, 2007, at https://caintegrator.nci.nih.gov/rembrandt/.)
  27. The Nation’s Investment in Cancer Research: A Plan and Budget Proposal for Fiscal Year 2006. National Cancer Institute, 2005. (Accessed July 21, 2007, at http://plan2006.cancer.gov/pdf/nci_2006_plan.pdf.)
  28. Health Privacy. National Conference of State Legislatures, 2007. (Accessed June 17, 2007, at http://www.ncsl.org/programs/lis/privacy/medprivacy.htm.)
  29. State Laws Governing Security Breach Notification. Crowell & Moring, 2007. (Accessed June 17, 2007, at http://www.crowell.com/pdf/SecurityBreachTable.pdf.)
  30. U.S. Department of Health and Human Services. Standards for Privacy of Individually Identifiable Health Information and Security Standards for the Protection of Electronic Protected Health Information (HIPAA Privacy and Security Rules). In: 45 C.F.R. Parts 160 and 164, ed.; 2000.
  31. U.S. Food and Drug Administration. Electronic Records; Electronic Signatures. In: 21 C.F.R. Part 11, ed.; 1997-2004.
  32. Federal Information Security Management Act of 2002. In: 44 USC §§ 3501-3549; 2002.
  33. U.S. Department of Health and Human Services: Office for Human Research Protections. Basic HHS Policy for Protection of Human Research Subjects. In: 45 C.F.R. Part 46 Subpart A, ed.; 1991-2005.
  34. U.S. Food and Drug Administration. Informed Consent of Human Subjects. In: 21 C.F.R. Part 50 Subpart B, ed.; 1981-2006.
  35. caBIG(tm). The Cancer Biomedical Informatics Grid(tm) Program: caBIG(tm) Compatibility Guidelines. 2005.
  36. McKinney M. The Wisdom of Grids. Government Health IT 2007.
  37. NCRI Informatics Initiative, Implementation Plan. (Accessed July 21, 2007, at http://www.cancerinformatics.org.uk/Documents/Imp_Plan/Implementation%20Plan%20FAQ_final.pdf.)
  38. NCI caBIG Security Working Group Mission Statement. 2007. (Accessed July 27, 2007, at https://gforge.nci.nih.gov/frs/download.php/2150/NCI-caGrid-SWG_Mission_Statement.doc.)
  39. Herdman R, Moses H. Effect of the HIPAA Privacy Rule on Health Research: Proceedings of a Workshop Presented to the National Cancer Policy Forum. Washington, DC: Institute of Medicine of the National Academies; 2006.
  40. Aronovitz LG. Health Information: First-Year Experiences under the Federal Privacy Rule. Washington, DC: Government Accountability Office; 2004.
  41. Ness RB. A year is a terrible thing to waste: early experience with HIPAA. Annals of Epidemiology 2005;15(2):85-6.
  42. Bradbury SG. Scope of Criminal Enforcement Under 42 U.S.C. § 1320d-6. In: U.S. Department of Justice, ed.; 2005.

[*]       The boundaries are especially blurred for smaller providers and community hospitals that want to engage in clinical research activities.  For example, we are informed that a large nonprofit health system of community-based hospitals participating in an NCI program believes that HIPAA prevents a primary care physician from sharing protected health information (PHI) with a specialist for a consult without written patient authorization. (Personal communication with Dr. Kenneth H. Buetow, July 17, 2007.)  Notwithstanding explicit guidance to the contrary from the HHS Office for Civil Rights (HIPAA-Frequent Questions: Providers and Other Covered Entities-Treatment & Payment & Health Care Operations.  U.S. Department of Health and Human Services, 2007.  [Accessed July 21, 2007 at http://www.hhs.gov/hipaafaq/providers/treatment/481.html.]), the hospital seems to believe that this type of use is not part of the primary treatment of the patient.

[†]       The term “instance” is borrowed from the language of software design. Most modern computer programming languages organize data into “classes,” collections of data that are related. For example, a ‘Patient’ class might contain a patient’s name, address, phone number and the name of their physician. A class, however, is more like a blueprint: it specifies the structure of the data but does not contain the actual data. When it becomes necessary to work with the actual data, an “instance” of a class is created that contains the information, much as one might create a real house from a set of plans and physical building materials. Thus, an “instance” of caGrid is an installation of all of the hardware and software components required for its operation. An instance of caGrid is the basic unit for policy development, as different instances may implement different policies.
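
For readers who find a concrete illustration helpful, the distinction between a class and an instance can be sketched in a few lines of Python. This is offered only as an illustration of the general programming concept; it is not drawn from the caGrid software, and the field names and patient details below are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """The blueprint: it names the fields but holds no patient data."""
    name: str
    address: str
    phone: str
    physician: str


# Two instances built from the same blueprint.
# All values here are made up for illustration.
first = Patient("Jane Doe", "1 Main St", "555-0100", "Dr. Smith")
second = Patient("John Roe", "2 Oak Ave", "555-0101", "Dr. Jones")

# Each instance carries its own data, independent of the other.
print(first.name)
print(second.physician)
```

In the same way that the two patient records share one structure but hold different data, two instances of caGrid would share the same components while being free to implement different policies.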