Data Management

Data Management 101 for Scientists

Fall 2018 - Present
Video · Slides

Scientists work with lots of data both big and small, and in many formats and systems. This workshop will introduce data management practices for scientists to consider and apply throughout the research lifecycle. Good data management practices pertaining to planning, organization, documentation, storage and backup, sharing, citation, and preservation will be presented through a sciences lens using discipline-based, concrete examples. While good general data management practices are relevant across disciplines, participants working specifically within the sciences are the intended audience for this workshop.

Data Management 101 for Social Scientists

Fall 2018 - Present
Video · Slides

Social scientists work with lots of data in their research, be it qualitative or quantitative, primary or secondary. This workshop will introduce data management practices for social scientists to consider and apply throughout the research lifecycle. Good data management practices pertaining to planning, organization, documentation, storage and backup, sharing, citation, and preservation will be presented through a social sciences lens using discipline-based, concrete examples. While good general data management practices are relevant across disciplines, participants working specifically within the social sciences are the intended audience for this workshop.

Data Management 101 for Humanists

Fall 2018 - Present
Video · Slides

Humanists work with various media, content, and materials (sources) as part of their research. These sources can be considered data. This workshop will introduce data management practices for humanities researchers to consider and apply throughout the research lifecycle. Good data management practices pertaining to planning, organization, documentation, storage and backup, sharing, citation, and preservation will be presented through a humanities lens using discipline-based, concrete examples. While good general data management practices are relevant across disciplines, participants working specifically within the humanities are the intended audience for this workshop.

Data Management 201: How and where to publish your data

Spring 2020 - Present
Slides

Data management practices help researchers take care of their data throughout the entire research process from the planning phase to the end of a project when data might be shared or “published” within a repository. Building upon the foundational concepts covered in the Data Management 101 courses offered this year, this workshop will provide hands-on experience where participants will learn strategies for how to prepare data for publishing by “curating” an example dataset and identifying common data issues. Participants will also learn about the overall role of repositories within the data sharing landscape and learn strategies for locating and assessing repositories.

Research Reproducibility: Tips and Tools

Spring 2020 - Present

In response to a growing focus on the importance of reproducibility, replication, and transparency in the research endeavor, scholars are adapting their practices and learning new skills and tools. This workshop will introduce some foundational strategies that can increase the reproducibility of your work. You will also learn about specific tools and protocols that you might use within your research workflows including the TIER protocol, git and GitHub, and online containerization tools such as Binder and Code Ocean.

Data Management 101 with Disciplinary Discussions

Fall 2019
Slides

Researchers work with lots of data both big and small, in many different formats and across various digital systems. The first hour of this workshop will introduce good digital data management practices and how they can be practically applied throughout the research lifecycle. Good data management practices cover pre-project planning, active workflow organization, documentation, storage and backup strategies, and optimizing your final “data package” to fulfill research quality and reproducibility requirements and facilitate new research. The second hour of this workshop will allow participants to break into broad disciplinary groups (sciences and engineering, social sciences, and humanities) for a facilitated discussion on how to apply good data management in your field. Be prepared to share your own tips, tricks, and challenges for this portion of the workshop.

Data Management 101 with Tool Demonstrations

Fall 2019
Slides

Researchers work with lots of data both big and small, in many different formats and across various digital systems. The first hour of this workshop will introduce good digital data management practices and how they can be practically applied throughout the research lifecycle. Good data management practices cover pre-project planning, active workflow organization, documentation, storage and backup strategies, and optimizing your final “data package” to fulfill research quality and reproducibility requirements and facilitate new research. The second hour of this workshop will offer a mini “tour” of research data management tools including GitHub, LabArchives, and Tropy. Participants will be able to attend two of these demonstrations in the time allotted. We will end with a share-out about what you’ve learned and how you might apply the tools to your own work.

Data Management 201: Preparing Data for Publishing

Fall 2019
Slides

Data management practices help researchers take care of their data throughout the entire research process from the planning phase to the end of a project when data might be shared or “published” within a repository. Building upon the foundational concepts covered in the Data Management 101 courses offered this year, this workshop will provide hands-on experience where participants will learn strategies for “curating” a dataset for formal sharing. Participants will identify common data issues, determine recommendations to optimize the dataset, generate metadata and documentation, and consider how these practices might be applied to their own research.

Open Science: General Principles and Practices

Spring 2019
Slides

Open Science is a growing movement that advocates for research to be transparent and openly available to all others for the purposes of engagement, validation, and extension. This workshop will present an overview of the Open Science movement and the general principles of the movement including the importance of access to data, publications, and the underlying research process as well as new initiatives within scholarly communications that support “openness” of the research endeavor such as preprints, registered reports, persistent identifiers, and community engagement platforms.

Introduction to Duke's Research Data Repository

Spring 2019 - Fall 2019
Slides

This workshop will provide an overview of Duke's Research Data Repository. The general functionalities of the platform as well as tips for submitting data will be discussed. Participants will also have an opportunity to discuss how the RDR or other repositories can help them comply with funder and journal policies as well as meet growing standards around data stewardship and sharing, such as the FAIR Guiding Principles.

Managing Sensitive Data

Spring 2018 - Present
Video · Slides

In the course of your research you may collect, interact with, or analyze data that are classified as “Sensitive” or “Restricted” according to Duke’s data classification standard. In this workshop we will examine common sensitive data types, how Duke’s IRB and Information Technology Security Office (ITSO) expect you to protect that data throughout your project’s lifecycle, and the resources available to you for sensitive data storage and analysis, data de-identification, and data archiving and sharing.

Finding a Home for Your Data: An Introduction to Archives and Repositories

Fall 2017, Fall 2018, Fall 2019
Video · Slides

Publishing and preserving research data within a trusted repository helps researchers comply with funder and journal data sharing policies, supports the discovery of and access to data, and can result in more visibility and higher impact for research projects. This workshop will provide an overview of the different types of repositories and the overall role of repositories within the data sharing landscape. Key repositories in various disciplines will be explored, and attendees will learn about resources for locating and assessing repositories. Attendees will also have an opportunity to locate appropriate repositories for their own research.

Building Blocks for Reproducibility: Concepts and Practices

Fall 2018
Video · Slides

In response to a growing focus on the importance of reproducibility, replication, and transparency in the research endeavor, scholars are adapting their practices and learning new skills and tools. DVS is offering a workshop series that will introduce the concepts, practices and tools that will help increase the reproducibility of your work. This workshop will introduce the concept of reproducibility, its impact on science, and basic best practices that you can apply to make your work more transparent and reproducible. The workshop will be taught by a guest instructor, April Clyburne-Sherin, from Code Ocean. The general functionalities of the computational reproducibility platform Code Ocean will also be presented during the workshop.

Building Blocks for Reproducibility: Open Science Framework

Fall 2018
Video · Slides

In response to a growing focus on the importance of reproducibility, replication, and transparency in the research endeavor, scholars are adapting their practices and learning new skills and tools. DVS is offering a workshop series that will introduce the concepts, practices and tools that will help increase the reproducibility of your work. This workshop will introduce the Open Science Framework (OSF), which is a free, open source project management tool developed and maintained by the Center for Open Science. The OSF can help scholars manage their workflow, organize their materials, and share all or part of a project with the broader research community. This workshop will demonstrate some of the key functionalities of the tool including how to structure your materials, manage permissions, version content, integrate with third-party tools (such as Box, GitHub, or Mendeley), share materials, register projects, and track usage.

Building Blocks for Reproducibility: TIER Protocol

Fall 2018
Video · Slides

This workshop will introduce the TIER Protocol, which outlines a specification and process for maintaining well-organized documentation and producing more reproducible research projects. An example of using the TIER Protocol in conjunction with the Open Science Framework will also be presented.

Reproducibility: Data Management, Git, and RStudio

Fall 2017 - Spring 2018
Video · Slides

This workshop will introduce some general data management strategies that can increase the reproducibility of your work. You will also learn through hands-on exercises how to harness two specific tools, Git and RStudio, to support the execution of more reproducible research projects. Git is a powerful version control system and RStudio is an open-source integrated development environment for the R statistical programming language. The hands-on part of this workshop focuses on the practical aspects of configuring RStudio with Git. If you don't intend to use the R programming language, you may want to take a different workshop.

Data Management Fundamentals

Spring 2017 - Spring 2018
Video · Slides

This workshop introduces data management practices to consider throughout the research lifecycle: planning, organization, documentation, storage and backup, sharing, citation, and preservation. The workshop will offer an overview of general recommendations that are relevant across disciplines and will point attendees to additional resources at Duke and beyond.

Introduction to the Open Science Framework

Spring 2018
Video · Slides

The Open Science Framework (OSF) is a free, open source project management tool developed and maintained by the Center for Open Science. The OSF can help scholars manage their workflow, organize their materials, and share all or part of a project with the broader research community. This workshop will demonstrate some of the key functionalities of the tool including how to structure your materials, manage permissions, version content, integrate with third-party tools (such as Box, GitHub, or Mendeley), share materials, register projects, and track usage.

OSF + TIER Protocol: Designing a Reproducible Workflow

Spring 2018
Video · Slides

The Open Science Framework is a free online system for managing and sharing research materials throughout the research cycle, and the TIER Protocol is a workflow for maintaining well-organized documentation of your data and analyses. This workshop will introduce you to key features of the OSF and demonstrate how it can be used in conjunction with TIER to facilitate reproducible research practices, collaboration, and best practices in data management.

Publishing Data with Research and Other Strategies for Increasing Your Impact

Spring 2018
Video · Slides

Scholars can and do communicate their research in various ways. While peer-reviewed journal publications remain the primary outlet for sharing the key results of research projects, there are growing norms (and expectations) that the underlying data from projects should also be published. In this workshop, we will look at 1) strategies to effectively publish data; 2) journal policies related to data sharing; 3) new types of publications such as data articles and registered reports; and 4) strategies for increasing and measuring the impact of your research. There will also be a hands-on portion of the workshop where participants will create their own ORCID identifier.

Writing a Data Management Plan

Fall 2017
Slides

This workshop will be a deep dive into the process of writing a data management plan (DMP) using the DMPTool. To make the most of this workshop, attendees are encouraged to bring a “live” DMP that they are ready to begin or are currently in the process of writing. Attendees without active DMPs may write a “test” DMP based on the funder they would typically apply to for research funding. The “test” DMP can then serve as a useful reference when it is time to write a live plan. The instructors (both Research Data Management Consultants) will be on hand to provide individual help during the writing portion of the workshop as needed and, following the workshop, are available to review plans through the DMPTool at any point up to final submission.

Research Collaboration Strategies and Tools

Fall 2017
Slides

Scholars increasingly work on collaborative research projects. Collaborative projects often bring together partners across disciplines, institutions, and sectors. These projects present opportunities for innovation but also raise challenges for the development of efficient and effective workflows and the management of data. This workshop will examine considerations for collaborative research and present some strategies for developing and documenting workflows as well as methods for storing and sharing data. We will also look at some tools (e.g., Box, OSF, and PRDN) available at Duke that can be used to support these types of projects.

Data Management and Grants: Complying with Mandates

Spring 2017
Slides

Today, researchers are increasingly faced with requirements from both federal and private funders to share, archive, and plan for the management of their data. This trend began in 2003, when the NIH released its data sharing policy, and continued in 2011, when the NSF began requiring that all grant proposals include a two-page Data Management Plan (DMP). Then, in 2013, an Office of Science and Technology Policy memo directed all federal agencies with over $100 million in annual research and development expenditures to develop plans to make research products, including data, openly accessible. This workshop will provide an overview of funding agencies’ DMP requirements, the primary components of data management plans, and suggestions for integrating data management updates into grant reporting. Attendees will also learn about tools and resources that can help them write a DMP that complies with funder mandates. A portion of this workshop will include a hands-on data management plan exercise.

Data Management Tools: The Dataverse Project

Spring 2017
Slides

The Dataverse is an open source repository software platform for sharing, preserving, citing, discovering, exploring, and analyzing research data. This workshop will provide an overview of the Dataverse Project and demonstrate how the Dataverse can be used to discover research data and manage and share data in compliance with best practices.

Data Management Tools: Colectica for Excel

Spring 2017
Slides

Are you an avid Excel user? Would you like to know how to add helpful documentation into your Excel files and generate codebooks automatically? If so, I’d like to introduce you to Colectica. While there is a paid version, there is a free version that is more than adequate if you plan to do all of your analysis in Excel. Visit http://www.colectica.com/software/colecticaforexcel and download the software before the workshop (my apologies to Mac users in advance - this software only works with Windows). You are encouraged to bring a laptop and your own Excel file(s) to take some time to get to know the tool. Feel free to bring your lunch. In the spirit of Love Your Data Week and Valentine’s Day, chocolate will be provided.

Data Management and Reproducibility: Enabling Open and Transparent Research through Data Sharing

Spring 2017
Slides

Making data available within repositories is an essential aspect of supporting open and transparent research. Today as science is tackling the so-called “reproducibility crisis”, researchers are increasingly faced with journal requirements to share their data for the purposes of verification. This workshop will explore the concept of reproducibility, the growth in journal data sharing policies, and present strategies to help researchers share data that meet standards for reproducibility and reuse.

Data Management Plans: Grants, Strategies and Considerations

Fall 2012 - Spring 2014
Guide

In the last few years, granting agencies across the disciplines have increasingly required that grant proposals include data management plans detailing strategies to manage, share, and preserve the research data produced by the funded project. The NSF, the NIH, the National Endowment for the Humanities, and other organizations have similar requirements, and Duke policy requires that research records (including digital data) be kept for at least five years. How should researchers respond? In this presentation, we’ll give an overview of research data management challenges and opportunities and describe some approaches for meeting them. We’ll ask the audience to share how they do data management now, and we’ll talk about planning underway for new services to help with data management at Duke.