Guidelines for Determination of Need for IRB Review
Research involving collection or study of existing data, documents, and records can be exempted under
Category 4 of the federal regulations if: (i) the sources of such data are publicly available; or (ii) the
information is recorded by the investigator in such a manner that subjects cannot be identified, directly
or through identifiers linked to the subjects.
The latter condition of this category applies in cases where the investigators initially have access to
identifiable private information but abstract the data needed for the research in such a way that the
information can no longer be connected to the identity of the subjects. This means that the abstracted
data set does not include direct identifiers (names, social security numbers, addresses, phone numbers,
etc.) or indirect identifiers (codes or pseudonyms that are linked to the subject's identity). Furthermore,
it must not be possible to identify subjects by combining a number of characteristics (e.g., date of birth,
gender, position, and place of employment). This is especially relevant in smaller datasets, where the
population is confined to a limited subject pool.
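As a rough illustration of the combination risk described above, the following Python sketch counts how many records share each combination of characteristics. The field names, the sample records, and the function name smallest_group_size are hypothetical and not part of any regulation. A smallest group of size 1 means at least one subject is uniquely described by those characteristics. This is only a screening aid under those assumptions, not a determination of exemption.

```python
from collections import Counter


def smallest_group_size(records, fields):
    """Return the size of the smallest group of records that share
    the same values for every field in `fields` (0 if no records)."""
    if not records:
        return 0
    # Count how many records fall into each combination of values.
    groups = Counter(tuple(r[f] for f in fields) for r in records)
    return min(groups.values())


# Hypothetical abstracted data set with no direct identifiers.
records = [
    {"birth_year": 1970, "gender": "F", "position": "nurse"},
    {"birth_year": 1970, "gender": "F", "position": "nurse"},
    {"birth_year": 1982, "gender": "M", "position": "director"},
]

fields = ["birth_year", "gender", "position"]
# The single director is the only record in its group, so the
# combination of these characteristics could single that person out.
print(smallest_group_size(records, fields))  # 1
```

A small result suggests that the characteristics should be generalized (for example, age ranges instead of birth year) or dropped before the data set is treated as non-identifiable.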
The following do not qualify for exemption: research involving prisoners, research that involves
collecting protected health information from HIPAA-covered entities, and FDA-regulated research.
When does the secondary use of existing data not require review?
In general, the secondary analysis of existing data does not require IRB review when it does not fall
within the regulatory definition of research involving human subjects, as referenced above.
In order for the committee to evaluate research which includes secondary analysis, the researcher will
need to provide:
1. A complete protocol for the secondary study;
2. The details of primary data collection (which may include the original protocol, consent and approval,
if research), or the source of publicly available data; and
3. If the data are not publicly available, a letter from the source authorizing access to the data or, if the
data were purchased commercially, a copy of the contract authorizing the use of the data.
After these documents are submitted, the committee will be able to decide whether the research is
exempt, is non-exempt, requires new consent, or does not need further review.
Terms useful in discussing Secondary Analysis of Existing Data:
Existing data are data that exist at the time the research is proposed.
Existing samples must already be "on the shelf" (meaning, they must have already been gathered) at
the time the research is proposed. Examples include existing blood samples, existing tissue samples,
completed surveys, existing interview notes, and existing audio and video tapes.
Public data: Public use data sets (such as portions of U.S. Census data, data from the National Center for
Education Statistics, National Center for Health Statistics, etc.) are data sets prepared with the intent
of making them available to the public. The data available to the public are not individually identifiable
and therefore their analysis would not involve human subjects.
De-identified data are data from which all identifiers have been removed. Identifiers include obvious
information such as name, address, telephone number, social security or medical record numbers, and
photographs, as well as biometric identifiers (voice prints and fingerprints) and even zip code, if there
are fewer than 20,000 people in the geographic area. A birth date coupled with a diagnosis may be
sufficient to identify an individual in many research populations.
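The removal of identifiers described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the field names in DIRECT_IDENTIFIERS, the zip_code field, the area_population lookup table, and the function name deidentify are all hypothetical. It does not address combinations such as birth date plus diagnosis, which still need to be assessed separately.

```python
# Hypothetical field names for the identifiers listed in the definition.
DIRECT_IDENTIFIERS = {
    "name", "address", "telephone", "ssn", "medical_record_number",
    "photograph", "voice_print", "fingerprint",
}

# Zip codes are treated as identifying below this population size.
SMALL_AREA_THRESHOLD = 20_000


def deidentify(record, area_population):
    """Return a copy of `record` without direct identifiers.

    `area_population` maps a zip code to the number of people living
    in that area; unknown zip codes are treated as small areas.
    """
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    zip_code = cleaned.get("zip_code")
    if zip_code is not None and area_population.get(zip_code, 0) < SMALL_AREA_THRESHOLD:
        del cleaned["zip_code"]
    return cleaned


# Hypothetical record and population table.
record = {"name": "Jane Doe", "zip_code": "00001", "diagnosis": "asthma"}
print(deidentify(record, {"00001": 5_000}))
# {'diagnosis': 'asthma'}
```

Treating unknown zip codes as small areas is a deliberately conservative choice for this sketch: when the population is not known, the zip code is dropped rather than kept.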