Upcoming Events
[RCR; Online] Digital Humanities Text Analysis: Building a Corpus
Before you can undertake computational text analysis, it's necessary to obtain a corpus of digitized texts and, in many instances, take steps to prepare them for further processing. This digital humanities workshop focuses on the technical and ethical dimensions of corpus development. We will explore:
- the risks, benefits, and implications of depending on optical character recognition (OCR) to transcribe text;
- best practices for preserving the integrity and usability of a corpus via file formatting, naming, and organization choices;
- the ethics of data cleaning and preparation;
- common sources for textual research data; and
- ways in which AI can (and can't) assist with these challenges, and whether it should.
Note: No previous experience with any of these topics is assumed, but this workshop includes hands-on exploration in small groups and requires active participation.
[RCR; Online] Digital Humanities Text Analysis: Topic Modeling
This workshop will equip students with a general understanding of topic modeling techniques for research. Topic modeling refers to a number of statistical techniques, but all of them are ways of discovering commonalities among documents based on patterns or groups of words that tend to occur together.
To facilitate a hands-on approach with a focus on process, we'll use existing sample corpora and open-source tools. The class won't require programming or command-line use. Participants will learn what topic modeling can (and can't) reveal about a body of texts, best ethical practices for discussing the results of topic modeling, and ways to apply such techniques to their own research. No previous experience with text analysis or statistics is required.
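For readers curious what "words that tend to occur together" looks like in practice, here is a minimal Python sketch. It is not part of the workshop, which requires no programming, and it is not a topic model. It only counts word pairs that share a document, a simplified version of the kind of pattern topic modeling methods build on. The toy corpus and the function name `cooccurrence_counts` are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered word pair, how many documents contain both words."""
    counts = Counter()
    for text in documents:
        # Use the set of distinct words so a pair is counted once per document.
        words = sorted(set(text.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


# Hypothetical toy corpus: three very short "documents".
docs = [
    "whale ship sea",
    "ship sea storm",
    "garden rose bloom",
]

counts = cooccurrence_counts(docs)
print(counts.most_common(1))  # [(('sea', 'ship'), 2)]
```

In this toy corpus, "sea" and "ship" appear together in two documents, while no garden word ever appears alongside a sea word. Real topic modeling tools work from similar signals using statistical models rather than raw counts.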
Getting It Published: Generative AI for Research
Generative AI (GAI) tools, such as ChatGPT, are profoundly changing the way people search for, write, and communicate their research. In this workshop, we’ll look under the hood of GAI research tools to better understand how they work and which ethical considerations students and researchers should keep in mind when using them. We will look at several AI research tools (Elicit, Inquisite, and Consensus) that can help you find and analyze vast networks of scientific literature.
Jenna Strawbridge is the Librarian for the Nicholas School of the Environment and Chemistry.
Hannah Rozear is the Librarian for Biological Science and Global Health at Duke University.
** A Zoom link will be sent via email to registered participants of this workshop **
The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the presenter at any time prior to the conclusion of the workshop.
Hosted by Duke University Libraries in collaboration with Pratt Graduate Communications and Intercultural Programs.
[RCR] Structuring Primary Sources as Data
Are you looking for ways to apply your research question to a collection of newspapers, photographs, letters, or other primary source materials? Have you ever thought a spreadsheet or database might help your research?
In this workshop, you will learn how to organize primary source data around your research question(s) with an eye toward ethical concerns and common humanities data challenges, including consistency, uncertainty, ambiguity, and scale. We will consider a range of examples and collaborate on an exercise that gives you hands-on experience creating a data structure for primary sources. You will also be introduced to key methods for digital humanities analysis, including matching data structures to visualization types and applying data feminism principles* such as recognizing and addressing source biases, creating an intentional data collection and structuring practice, and making labor and decision-making in data collection and structuring visible.
By the end of this workshop, you will be able to:
- Identify when gathering data may be an appropriate way to address your research question(s);
- Articulate when you need a spreadsheet as opposed to a database;
- Structure primary source data around your research question(s);
- Identify and address common humanities data challenges, including consistency, uncertainty, ambiguity, and scale; and
- Apply data feminism principles to data structures by, for example, choosing when to de-identify data, making your labor and decision-making explicit, rethinking binaries and hierarchies, including multiple perspectives, and maintaining key contextual features.
*Data feminism scholarship we will be drawing from includes the Feminist Data Manifest-No and Data Feminism.
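As a rough illustration of what "creating a data structure for primary sources" can mean, the following Python sketch models one record for a letter. It is an assumption-laden example, not the workshop's method: every field name, the `SourceRecord` class, and the `normalize_year` helper are hypothetical. It shows one way to keep the original wording of a source, flag uncertainty, and record the decisions made while structuring the data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourceRecord:
    """One record describing a primary source (all field names are hypothetical)."""
    source_id: str
    source_type: str                       # e.g. "letter", "photograph"
    date_as_written: str                   # transcribed exactly, preserving ambiguity
    year_normalized: Optional[int] = None  # None when no year can be determined
    date_is_uncertain: bool = False
    decision_notes: List[str] = field(default_factory=list)  # keeps choices visible


def normalize_year(record, year, reason, uncertain=False):
    """Set a normalized year and log why, without overwriting the original wording."""
    record.year_normalized = year
    record.date_is_uncertain = uncertain
    record.decision_notes.append(reason)
    return record


# Hypothetical example: a letter whose year is only partly legible.
letter = SourceRecord("L-001", "letter", "Tuesday, [18]62?")
normalize_year(letter, 1862, "Year inferred from a bracketed editorial insertion", uncertain=True)
print(letter.year_normalized, letter.date_is_uncertain)  # 1862 True
```

Keeping `date_as_written` alongside the normalized value is one possible design choice for handling uncertainty and ambiguity: the structured field supports sorting and visualization, while the original text and the notes preserve context and make the interpretive labor explicit.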
This workshop will be facilitated by Hannah Jacobs, Digital Humanities Consultant with Duke Libraries' ScholarWorks Center for Open Scholarship.
Location: Zoom
Participation: You will be invited to participate in the hands-on exercise and discussions via mic or chat. Use of cameras during interactive sections is encouraged.
Audience: Graduate Students & Faculty
All are welcome to register; however, if there is a wait list, priority seating will be given to graduate students and faculty, particularly those in the humanities and social sciences. This workshop provides 2 credit hours towards the Duke Graduate School's Responsible Conduct of Research (RCR) credit requirements (GS714.19) and RCR200 credit for faculty. Faculty need to attend the event for 60 minutes to receive credit.