Research Guides: Library Portal for AgriLife: Research Data Management

Research Data Management Services

USDA Ag Data Commons

The Ag Data Commons is a data access system maintained by the US Department of Agriculture's National Agricultural Library. It is compliant with U.S. Project Open Data standards for federal agencies providing access to publicly-funded data. Ag Data Commons holds data files managed directly by NAL and also links to datasets and resources located on other websites.

Ag Data Commons Instructional Tutorial

University Libraries' Research Data Management Services

The University Libraries provides services and tools to help you optimize your research data management (RDM), create an RDM plan, and preserve and provide access to your research data. See our Research Data Management Guide for more in-depth information.

The DMPTool is an online platform that guides the development of a data management plan (DMP) according to the requirements of specific funding agencies. Texas A&M University researchers log in with their NetID and password.

DMPTool
Get started with the DMPTool by creating an account. Choose option 1 and search for Texas A&M University.
Quick Start Guide
Learn how to log in and navigate your dashboard.

Components of a Data Management Plan

Data description

Who will be responsible for the data and their management?
What types of data will be collected or created?
How will data be collected or created, and processed?

Standards

Which file formats will be used for working data?
What additional information is needed to make data meaningful, and how will it be captured?
What forms will additional descriptive material take?
Which metadata standards will be followed?

Policies for access, sharing and re-use

How will data be stored, backed up, and maintained during the project?
How will you manage access and security?
Are there any ethical, privacy, or legal issues?
How will the data be shared or made available? Are there any restrictions?
What are intended or anticipated future uses?

Long-term storage and management

Which data will be preserved for the long-term?
Which file formats will be used for long-term accessibility?
Where will data be archived and how will they be maintained to ensure preservation?

Template Text

Template text for using the Texas Data Repository in grant applications

The Texas Data Repository (TDR) is the Texas A&M University institutional data repository. It is a flexible online platform for researchers to publish and archive datasets and data products. If appropriate for your needs, use this template language in your DMP to indicate your strategy for data preservation and sharing.

Texas Data Repository Template Language
Use this template language about the Texas Data Repository in your data management plan.
Preferred Data Formats
Research data can be structured and stored in a variety of file formats (refer to Recommended File Formats created by Texas A&M University Libraries).

Data Sharing Requirements

Identifying Funder Policies

The Scholarly Publishing and Academic Resources Coalition (SPARC) tracks article and data sharing requirements for federal agencies. See current and forthcoming policies by federal agency.

Data Repositories

What is a Data Repository?

Data repositories are tools for sharing and preserving research data. There are hundreds of repositories worldwide. Some cater to a specific research community, while others are general-purpose. Repositories may be called data centers, data archives, or scientific databases.

They are often divided into three categories:

Institutional Repositories (IRs) are affiliated with a researcher’s institution. Faculty, staff, and students from Texas A&M can use the Texas Data Repository for small datasets that contain non-sensitive data.

Domain-specific or Disciplinary Repositories (DRs) are discipline-specific and often operated by a professional organization, a consortium of researchers, or a similar group.

General-purpose or Open Repositories (ORs) allow researchers to deposit and make their data available regardless of disciplinary or institutional affiliation. See this chart to help you select a general-purpose repository.

About the TDR

The Texas Data Repository (TDR) is the Texas A&M University institutional data repository, made available to researchers by the University Libraries. It is a flexible online platform for researchers to publish and archive datasets and data products.

The TDR is available to individual researchers at Texas A&M University, as well as labs and groups, seeking to meet data sharing and preservation requirements from funding agencies and publishers, or seeking to publish collections of archived data.

The TDR is intended for small data projects.

Full datasets (which may include multiple files) should be no more than 50GB.

Individual files must be under 4GB (note some users may have trouble downloading files over 2GB over wireless connections).

The TDR uses DataCite to assign DOIs to datasets, making it easy for others to cite data.
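Before depositing, it can help to check a local copy of a dataset against the size limits stated above. The following is a minimal sketch, not an official TDR tool. It assumes that "GB" means 10**9 bytes (the guide does not specify decimal or binary units), and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: "GB" means 10**9 bytes; the guide does not say whether
# decimal or binary units are intended.
GB = 10**9
MAX_DATASET_BYTES = 50 * GB   # full datasets: no more than 50GB
MAX_FILE_BYTES = 4 * GB       # individual files: must be under 4GB
WARN_FILE_BYTES = 2 * GB      # files over 2GB may be hard to download


def check_sizes(sizes):
    """Check a mapping of file name -> size in bytes against the limits.

    Returns a list of human-readable problem strings (empty if none).
    """
    problems = []
    for name, size in sorted(sizes.items()):
        if size >= MAX_FILE_BYTES:
            problems.append(f"{name}: file is not under 4GB")
        elif size > WARN_FILE_BYTES:
            problems.append(f"{name}: over 2GB, may be hard to download")
    if sum(sizes.values()) > MAX_DATASET_BYTES:
        problems.append("dataset: total size exceeds 50GB")
    return problems


def scan_directory(root):
    """Collect the size of every file under a local directory."""
    root = Path(root)
    return {
        str(p.relative_to(root)): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }
```

For example, check_sizes(scan_directory("my_dataset")) lists any files or totals that would need attention, where "my_dataset" is a placeholder for your own folder.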

NOTE: If you are depositing data in the TDR, once you have logged in with your NetID, you need to navigate to https://dataverse.tdl.org/dataverse/tamu and/or click on the TAMU icon (see screenshot below) before clicking the Add Data button. See also our dataset cheatsheet and our dataverse repository cheatsheet for more information.

[Image: arrows pointing to the TAMU logo in the TDR]

Using the TDR

For more information and to use the Texas Data Repository, review the Terms of Use.

About Disciplinary Repositories

Disciplinary data repositories are set up to accommodate the data needs of a specific research community. They are the most likely to offer both the specialist domain knowledge and the data management expertise needed to ensure data are properly kept and used.

They may provide the ideal solution to meet data archiving and public access expectations of funding agencies, publishers, and the researcher community. However, they are also the most likely to be selective, requiring advance planning to meet standards for metadata and documentation.

Using Disciplinary Data Repositories

Since there are many data repositories, it is important to review terms and conditions before use.

1. Is the repository reputable and who supports it?
It may be listed in re3data or FAIRSharing, or be broadly recognized by the research community. Better yet, it may be endorsed by a journal, funder, or professional society.

2. Will it take data you want to deposit and how are data deposited?
Data may need to be of a particular type and file format. Some repositories allow self-deposit while others mediate deposit.

3. Will the repository be safe in legal terms?
Some repositories may be capable of safely storing sensitive or restricted data, while others may not. Ideally the repository allows depositors to assign terms of use and licenses.

4. Will the repository sustain the data value?
A repository can add value by making data findable, accessible, interoperable, and reusable (FAIR) for the long term. This includes assigning persistent identifiers (like DOIs) to datasets, requiring standard metadata for discoverability, and conducting file preservation activities.

5. Will it support analysis and track data usage?
Repositories may also provide citation information to users and usage tracking for the depositor.

From: Whyte, A. (2015). ‘Where to keep research data: DCC checklist for evaluating data repositories’ v.1. Edinburgh: Digital Curation Centre. Available online: www.dcc.ac.uk/resources/how-guides

Finding Data Repositories

Registry of Research Data Repositories
Re3data is a registry of data repositories, covering a wide range of disciplines from around the world. It allows researchers to search for repositories in their discipline and to identify relevant policies and terms of use.

FAIRSharing Catalog
FAIRSharing is a curated catalog of databases, along with associated standards and policies. It also includes standards and databases recommended by journal or funder data policies.

Scientific Data Recommended Repositories
A list of disciplinary and open repositories evaluated to ensure that they meet the data access, preservation and stability requirements of Nature's Scientific Data journal.

NIH Data Repositories
National Institutes of Health-supported data repositories that make data accessible for reuse. Most accept submissions of appropriate data from NIH-funded investigators (and others), but some restrict data submission to only those researchers involved in a specific research network.

If you are going to post data in a data repository, you may need to inform human subjects, and gain their consent, that data about them will be posted online for long-term preservation and possibly curated. In the consent form, explain how you will protect the subjects' identities, anonymize the data if necessary, and restrict access to the dataset if necessary. As a best practice, collect only the data you need.

Here is some Sample Language you can use (specifically for the Texas Data Repository):

Will my study-related information be used for future research?

De-identified data collected as part of this research study may be used for future, unspecified research or shared with other researchers without additional consent. [Explain]. In addition, a copy of the de-identified data will be stored in the Texas Data Repository, a platform for publishing and archiving datasets (and other data products) created by faculty, staff, and students at Texas A&M University. This repository is publicly available and searchable via the web; however, identifying information about you will be stripped from the data prior to storage in this platform.

Data anonymization is conducted to protect people's privacy. It is the process of removing personally identifiable information from datasets so that the people whom the data describe remain anonymous.
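As a small illustration of removing personally identifiable information, the sketch below drops direct identifier fields from a list of records. The column names and sample records are hypothetical, and the function is an assumption made for illustration, not a prescribed method. Dropping such fields is only one step: the remaining fields can sometimes identify a person in combination, so review them as well.

```python
# Hypothetical column names; adjust to the identifiers in your own dataset.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def strip_identifiers(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records without the listed identifier fields."""
    return [
        {key: value for key, value in record.items() if key not in identifiers}
        for record in records
    ]


# Made-up sample records for illustration only.
participants = [
    {"name": "Participant A", "email": "a@example.org",
     "age_group": "30-39", "score": 12},
    {"name": "Participant B", "email": "b@example.org",
     "age_group": "40-49", "score": 15},
]

# The original records are left unchanged; new dictionaries are returned.
deidentified = strip_identifiers(participants)
```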

The UK Anonymisation Network's Anonymisation Decision Making Framework (ADF) provides a booklet, templates, and tools to consider when you are de-identifying, aggregating, or anonymizing your data.

Guide for writing a README file
A README file provides information about a data file and is intended to help ensure that the data can be correctly interpreted. This site includes best practices and a template to write your own README file.
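A README can also be generated from a short script so that it stays consistent across datasets. The sketch below is one possible approach. The section names (Title, Creators, Description, Files) and all sample values are hypothetical placeholders and are not taken from the template referenced above.

```python
def build_readme(title, creators, description, files):
    """Build plain-text README content.

    `files` maps each file name to a one-line description.
    The section names used here are hypothetical placeholders.
    """
    lines = [
        f"Title: {title}",
        f"Creators: {', '.join(creators)}",
        "",
        "Description:",
        description,
        "",
        "Files:",
    ]
    # List files alphabetically so the output is stable between runs.
    for name, note in sorted(files.items()):
        lines.append(f"  {name}: {note}")
    return "\n".join(lines) + "\n"


# Made-up example values for illustration only.
readme_text = build_readme(
    title="Example field trial data",
    creators=["A. Researcher"],
    description="Placeholder description of how the data were collected.",
    files={
        "plots.csv": "one row per plot",
        "codebook.txt": "variable definitions",
    },
)
```

The resulting text can be saved next to the data files, for example as README.txt.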