This guide was created by Galvin Library to help IIT researchers and staff understand data management, including the data management and data sharing provisions of government grant programs from the National Institutes of Health (NIH) and the National Science Foundation (NSF).
What is data?
The Office of Management and Budget (OMB) defines data as "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings, but not any of the following: preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues." (OMB Circular A-110 Revised 11/19/93 as further amended 9/30/99, subpart C, sec __.36(d)(2)(i)).
What is data management?
Data management, sometimes referred to as data curation, is more than just the safe storage of datasets. The complete life cycle of the data must be considered, such as how long to retain datasets, how to migrate from obsolete file formats, and how to properly dispose of datasets at their end of life. Also, because the datasets need to be shared in order to have value, they need to be appropriately and meaningfully “tagged” with metadata. Data management also includes identifying restrictions on the use of or access to the datasets and the means to control access.
Data management questions to consider
- What type of data will be produced? Will it be reproducible? What would happen if it got lost or became unusable later?
- How much data will it be, and at what growth rate? How often will it change?
- Who will use it now, and later?
- Who controls it (PI, student, lab, IIT, funder)?
- How long should it be retained? e.g. 3-5 years, 10-20 years, permanently
- Are there tools or software needed to create/process/visualize the data?
- Any special privacy or security requirements? e.g. personal data, high-security data
- Any sharing requirements? e.g. funder data sharing policy
- Any other funder requirements? e.g. data management plan in proposal
- Is there good project and data documentation?
- What directory and file naming convention will be used?
- What project and data identifiers will be assigned?
- What file formats? Are they long-lived?
- Storage and backup strategy?
- When will I publish it and where?
- Is there an ontology or other community standard for data sharing/integration?
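As an illustration of the directory and file naming question above, here is a minimal Python sketch of one possible convention: project identifier, short description, creation date, and version, joined by underscores. The function names, the field order, and the example values are all assumptions made for illustration; they are not an IIT or funder standard, and a lab should adapt them to its own needs.

```python
import re
from datetime import date


def slugify(text: str) -> str:
    """Lowercase the text and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project_id: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Build a file name of the (hypothetical) form
    <project>_<description>_<YYYYMMDD>_v<NN>.<ext>."""
    parts = [
        slugify(project_id),
        slugify(description),
        created.strftime("%Y%m%d"),   # sortable date stamp
        f"v{version:02d}",            # zero-padded version number
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


# Example with placeholder values (not a real project):
print(build_filename("PROJ01", "Soil samples (site A)", date(2024, 1, 15), 2, "CSV"))
# prints "proj01_soil-samples-site-a_20240115_v02.csv"
```

Putting the date in year-month-day order and zero-padding the version keeps files in chronological order when sorted by name, and avoiding spaces and punctuation makes the names safer to move between systems.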
What is metadata?
Metadata is data about data: it fully and meaningfully describes the data and makes it possible to efficiently search a large data repository for specific datasets. In the case of datasets, this metadata may include:
- the identification of the creators for proper attribution by users
- a general description of the actual datasets, including size, format, and dates created
- information about the experiments or processes that generated the data, including the techniques used for capturing and recording the data
- a definition of the researcher's intellectual property rights to the data
How can I manage my datasets?
IIT’s institutional repository, repository.iit, provides the framework for storing and tagging your datasets. It offers a robust system not only for storage but also for assigning metadata and access restrictions. For more information, contact firstname.lastname@example.org, or visit Galvin Library’s guide to repository.iit.
How do I create the metadata I need to define my datasets?
Some of the metadata fields available in IIT's data repository are:
- Title: The main title for the dataset
- Creators: A list of all authors or creators of the dataset being described
- Date of Content Issue: Has the data been published or publicly distributed before being added to the data repository?
- Date of Digital Creation: When was this dataset created?
- Other Contributors: Anyone other than the main authors or creators who contributed to the creation of this dataset
- Series/Report No.: Use this if the dataset is part of a report
- Identifiers: Use if the data is part of a report or paper having an identifier, such as ISSN, ISBN, or government document number
- Type: In this case, the type should be dataset
- Language: The language of the description and any supporting documentation, or of the dataset itself if textual
- Subject Keywords: This is the meat of the descriptive fields. Any keywords that will help to fully and appropriately identify the nature of this dataset go here
- Abstract: A brief description of the data and how it was created
- Sponsors: If the research that created the data was sponsored, enter that here
- Description: A fuller description of the data and the methods used to create it
Good metadata is essential for effective data management and sharing. Librarians at Galvin Library have the expertise to help you with all your metadata requirements. Contact your subject librarian or Adam Strohm (email@example.com, 7-5107) for any questions you may have.
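To make the metadata fields listed above more concrete, here is a minimal Python sketch of how a research group might draft a record before entering it in the repository. The class name, the attribute names, the choice of which fields count as required, and the example values are all assumptions for illustration; this is not the repository's actual schema or submission format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical draft record mirroring the field list in this guide."""
    title: str
    creators: List[str]
    date_of_digital_creation: str
    subject_keywords: List[str]
    abstract: str
    description: str = ""
    date_of_content_issue: str = ""
    other_contributors: List[str] = field(default_factory=list)
    series_report_no: str = ""
    identifiers: List[str] = field(default_factory=list)
    resource_type: str = "dataset"   # the guide says Type should be "dataset"
    language: str = "en"
    sponsors: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields treated here as required (an assumption)
        that are still empty."""
        required = ("title", "creators", "date_of_digital_creation",
                    "subject_keywords", "abstract")
        return [name for name in required if not getattr(self, name)]


# Placeholder values only; not a real dataset.
record = DatasetMetadata(
    title="Example dataset",
    creators=["A. Researcher"],
    date_of_digital_creation="2024-01-15",
    subject_keywords=[],
    abstract="Placeholder abstract.",
)

print(record.missing_fields())             # lists fields still to fill in
print(json.dumps(asdict(record), indent=2))
```

Drafting the record in a structured form like this makes it easy to spot empty fields, such as the missing subject keywords in the example, before contacting a librarian or depositing the dataset.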
IIT Office of Sponsored Research and Programs (OSRP)
OSRP provides comprehensive services for faculty & staff on the preparation of proposals and the administration of grants and contracts for sponsored programs.
Office of Research Compliance and Proposal Development
- If your research involves human participants, animal studies, rDNA, biological materials, or potentially hazardous agents, it must be approved by one of these review boards
- Use the Graduate College's Proposal Development services if you are looking for help developing the content of your proposal, or are looking for funding opportunities for your research
research @ iit home page
Main portal to research at IIT
Federal Funding Agencies: Data Management and Sharing Policies
Curious which United States federal funding agencies require data management and data sharing? The California Digital Library has put together a useful summary, complete with direct links to policy and guideline documents.