• Tue, Sep 28, 2021 — 10:00 AM

    Presented by Drew Keener

    [In-person] The R language has become a popular option for working with geospatial data. Compared to traditional GIS software, the code-driven approach of R can be more reproducible and efficient. This workshop gives participants the skills to perform geospatial workflows entirely within R. We will discuss how different types of geospatial data work in R, walk through examples of data operations, and explore common analysis methods for geospatial data.

    This workshop is a companion to Geospatial Data in R: Mapping, which focuses on visualization more than analysis.

    Attendees will need basic familiarity with R and RStudio to follow along with the exercises. Knowledge of tidyverse packages such as ggplot2 and dplyr is also helpful.

    This workshop will be held in person in Bostock 127 (The Edge Workshop Room), but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Mapping & GIS, Data Science

  • Thu, Sep 30, 2021 — 10:00 AM

    Presented by Mark Thomas

    [In-person] Are you looking for an open source option for GIS to make maps or to analyze geospatial data? In this workshop we will demonstrate how to import and analyze data in QGIS and discuss the benefits of using QGIS over other GIS software. In the process, we'll go over some general GIS concepts such as layers, types of GIS files, and projections, with an emphasis on feature (vector) layers. This is an introductory class, and no prior GIS experience is needed.

    This workshop will be held in person in Bostock Library (room 023, in the basement corridor leading to Perkins), but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    There are 15 computers available, and attendees are welcome to bring laptops if they've loaded the necessary software. Attendees using their own computers should install QGIS beforehand.

    • Go to the QGIS Downloads webpage.
    • Download and install the QGIS installer for either the latest version or the latest long-term (most stable) release, in 64-bit or 32-bit. Versions for several platforms are available.

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Mapping & GIS

  • Wed, Oct 6, 2021 — 10:00 AM

    Presented by Eric Monson

    [In-person] While Python is my preferred programming language for scripted data transformations, I have avoided routinely doing data visualization in Python. I could follow examples for the many Python visualization libraries, but in the end they all seemed confusing and made it hard to do the types of exploratory visualization that Tableau made easy. Finally, Altair has emerged as a viable alternative for me, because of the way it "thinks about" data and the visualization process.

    Altair is a declarative statistical visualization library for Python, built on top of the well-designed and powerful Vega-Lite visualization grammar. (Vega-Lite was built for the web, includes interaction, and is being adopted as a standard by high-profile websites and tools.) It works well for small to medium-sized tabular data (like spreadsheets).

    In this workshop, I’ll run you through both some introductory and some more complex examples using Altair with Python in Jupyter notebooks, so you can get a feeling for how you might use it in your own work. 

    Note: This is an introductory workshop on Altair, but you’ll probably be more comfortable following along if you have at least a little bit of experience using the Python programming language, since I won’t be spending time on the language itself. (Feel free to sign up no matter what your experience level, but past students with no Python or programming experience have found it too confusing to be useful.) 

    Bring a computer if you can!

    We hope there will be some computers in the training room with Python installed, but if you have a laptop with recent (from the past six months) versions of Python, Jupyter Lab and Altair installed, that will lessen the chance that you’ll have to share machines. Instructions are below.

    Anaconda Python distribution (Individual Edition): 
    https://www.anaconda.com/products/individual

    I strongly recommend that you install the Anaconda Python Distribution to use in class. In principle, if you have something above Python 3.7 or so, plus all the necessary modules, everything should work fine. But, the Anaconda Distribution is packaged nicely, can be installed without admin privileges, and comes with everything you’ll need. If you have another version of Python already installed and you’re going to install Anaconda, it’s best to uninstall the other version first. It can get to be a mess if you have multiple versions of Python installed on one machine. 

    • Go to the link above, hit Download, and choose the version for your operating system. I recommend installing just for “yourself”, not for all users of the machine, since that way it will install everything in your Users/username folder and won’t require admin privileges.
    • If you’re on Mac and aren’t comfortable with shell scripts on the command line, choose the Graphical Installer. 
    • On Windows, I would choose the 64-bit installer, unless you know you’re still running a 32-bit version of Windows on an older machine. 
    • If you’re sticking with your non-Anaconda version of Python, make sure you have JupyterLab, Pandas, Altair, and all of their respective dependencies installed.

    Please try to launch Python and JupyterLab before class to make sure they’re working! JupyterLab can be started from the Anaconda Navigator application, or from the Anaconda Prompt (Windows) or a Terminal (Mac) by typing (without quotes) “jupyter lab” and hitting return. From a Python notebook or an interactive Python prompt, you can test out the main modules you’ll need by typing this and executing the code cell:

    import pandas as pd
    import altair as alt
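
    If you would rather get a summary than an error message, a small check along these lines may help. It is only a sketch: the helper name missing_modules is made up for this example, and the module names are assumed to match the packages listed above (JupyterLab is assumed to be importable as jupyterlab).

```python
import importlib.util
import sys

# Top-level module names assumed to correspond to Pandas, Altair and JupyterLab.
REQUIRED_MODULES = ["pandas", "altair", "jupyterlab"]


def missing_modules(names):
    """Return the names that cannot be found in this Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The text above suggests Python 3.7 or later as a rough minimum.
python_ok = sys.version_info >= (3, 7)

print("Python version OK:", python_ok)
print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

    An empty list of missing modules means the modules can be located. It does not prove they are recent enough, so launching JupyterLab before class is still worth doing.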

    This workshop will be held in person in the Bostock 023 Training Room, but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Data Visualization

  • Thu, Oct 7, 2021 — 10:00 AM

    Presented by Mark Thomas

    [In-person] Tableau is a software package that is increasingly popular for creating striking visualizations, such as charts and graphs, from tabular data. It also has an increasing number of capabilities to create maps. Source data can include native geospatial files (such as shapefiles or GeoJSON files), but also tabular data (such as CSV or Excel files) that include locational values, such as place names or coordinate data. This workshop will cover how to create maps in Tableau, how to manipulate the data, and how to symbolize it effectively on a map.

    Please see this blog post for some background on mapping using Tableau.

    The workshop will be held in person in Bostock Library (room 023, in the basement corridor leading to Perkins), but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    There are 15 computers available, and attendees are welcome to bring laptops if they've loaded the necessary software. If you're using your own computer, you need to have Tableau installed before the class. You can install Tableau for free as a university student: https://www.tableau.com/academic/students

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Mapping & GIS, Data Visualization

  • Wed, Oct 13, 2021 — 1:00 PM

    Presented by Jen Darragh & Sophia Lafferty-Hess

    [In-person] In this workshop participants will learn strategies for how to prepare data for publishing by “curating” an example dataset and identifying common data issues. Participants will also learn about the overall role of data repositories within the data sharing landscape and apply strategies for locating and assessing repositories. The workshop will include short lectures and group work via break-out rooms. As data sharing is increasingly required by journals and funders, this workshop will help early career researchers build the skills necessary to make their data FAIR (Findable, Accessible, Interoperable, and Reusable).

    This workshop contains the same material as the “RDM 201: How and where to publish your data” workshop offered in previous semesters; participants who attended that workshop should not attend.

    This workshop (GS717.05) is eligible for 2 hours of Graduate School RCR Credit. 

    This workshop will be held in person in Bostock 127 (The Edge Workshop Room), but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Data Management

  • Thu, Oct 21, 2021 — 10:00 AM

    Presented by Eric Monson

    [In-person] Networks (or graphs) are a compelling way of studying relationships between people, places, objects, ideas, etc. Generating network data and visualizations, however, can be an involved and unintuitive process requiring specialized tools. This workshop will explore some of the easier ways to produce, load, and visualize network data using Gephi, an open source, multi-platform network analysis and visualization application.

    No previous experience with Gephi or network data is required. In this workshop you will get hands-on experience with an easy network data format and with visualizing it in the Gephi software.
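
    To give a feel for what simple network data can look like, here is a sketch that writes a CSV edge list, one format Gephi can import. Whether this is the format used in the workshop is an assumption, and the names in the example are made up. Each row is one relationship, with the two connected nodes in columns named Source and Target.

```python
import csv
import io

# Hypothetical relationships, for illustration only.
edges = [("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Carol")]


def edge_list_csv(pairs):
    """Return CSV text with a Source,Target header and one row per edge."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(pairs)
    return buffer.getvalue()


csv_text = edge_list_csv(edges)
print(csv_text)
```

    Saving csv_text to a file with a .csv extension gives a table that can be opened in a spreadsheet program as well.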

    Bring a computer if you can!

    We hope there will be some computers in the training room with Gephi installed, but if you have a laptop with a recent (from the past six months) version of Gephi installed, that will lessen the chance that you’ll have to share machines. Gephi can be downloaded from https://gephi.org/ for Windows, Mac and Linux. Please try to launch the program at least a day before the workshop begins.

    This workshop will be held in person in Bostock 023, but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Data Visualization

  • Tue, Oct 26, 2021 — 1:00 PM

    Presented by Jen Darragh & Sophia Lafferty-Hess

    [In-person] This workshop will explore strategies and best practices for sharing and publishing data to support open science, reproducibility, and future innovation. Topics covered will include the use of data and metadata standards to support interoperability and harmonization. An overview of repository options and examples of disciplinary repositories will be explored as well as methods to publish data to increase the impact of research projects. Participants will also engage in discussions regarding how academia and communities can develop policies, norms, and procedures that enable data sharing in line with the FAIR Guiding Principles (i.e., Findable, Accessible, Interoperable, and Reusable).

    This workshop is eligible for the 200-level faculty and staff RCR. 

    This workshop will be held in person in Bostock 127 (The Edge Workshop Room), but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Data Management

  • Tue, Nov 9, 2021 — 10:00 AM

    Presented by Sophia Lafferty-Hess & John Little

    [In-person] Part of the Rfun series. The importance of reproducibility, replication, and transparency in the research endeavor is increasingly discussed in academia. This workshop will introduce foundational strategies that can increase the reproducibility of your work and present a potential end-to-end reproducible workflow using a suite of tools, including git, RStudio, Binder, and Zenodo. Configuration for the hands-on portion of the workshop will be sent to participants one week before the workshop. Participants are expected to bring their laptop already configured for the workshop. 

    Prerequisites:

    • Introductory familiarity with R (consider attending an Introduction to R workshop or watching a prerecorded workshop)

    • A GitHub account

    This workshop will be held in person in Bostock 127 (The Edge Workshop Room), but may move to virtual depending on Duke Coronavirus policies, health or equipment considerations. Please watch your email for updates. All participants are required to follow current Duke Coronavirus policies to attend the workshop.

    The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.

    Data Science, Data Management

Mailing List

Interested in keeping up to date with workshops and events in the Center for Data and Visualization Sciences? Subscribe to the cdvs-announce listserv, follow us on Twitter @duke_data, or look for announcements on our blog.