In partnership with departments across Duke and practitioners across the Research Triangle, Digital Scholarship and Publishing Services (DSPS) offers a variety of open workshops and symposia focused on digital projects, methods, tools, and best practices. Subscribe to our listserv for upcoming DSPS events or follow us on Twitter @DukeDSPS (Digital Scholarship and Publishing Services) / @MurthyDigital (Murthy Digital Studio).

Fall 2019 event series:

Questions about training possibilities for yourself or your project team, or want to suggest topics for future programs? Contact askdigital (at) 

RCR Workshops: Text/Data Series (fall 2019)

This series of workshops is about research methods for working with textual data.  It's geared toward scholars in the humanities, although some sessions may also be relevant to the social sciences.  The workshops move in a trajectory from close reading (markup, encoding, creating electronic editions) to "distant reading" (topic modeling, machine learning, aggregating and analyzing large corpora), and they aim to provide participants with a general knowledge of approaches to working with text.  Registration is required (see links below); when registering, please be sure to indicate whether you plan to receive RCR credit. Participants who are taking the workshops for RCR credit will receive priority registration.

Text/Data: Introduction to XML, TEI, and Structured Markup

Thursday, October 3, 2019 / 9:00 - 11:00 AM
Murthy Digital Studio (Bostock 121)

This session introduces the concept of semantic markup and distinguishes between markup and automated textual analysis. Using TEI as our platform for learning, we will study approaches to and reasons for marking up documents.  Participants will have a chance to work with documents directly and will encounter some of the real-life decisions that TEI editors must make.  Sample projects will illustrate the range of research and discovery made possible by TEI-encoded texts. Learning Outcomes: Participants will understand structured markup, the XML metalanguage, and the function of TEI.  Register now:

Text/Data: Applications of TEI for Research

Thursday, October 10, 2019 / 9:00 - 11:00 AM
Murthy Digital Studio (Bostock 121)

Building on the content of the previous workshop, this session takes an in-depth look at the use of TEI as a research tool.  We'll look in detail at sample projects and explore common technologies for searching and disseminating TEI (XSLT, XQuery).  Participants will get hands-on experience generating various outputs from TEI source (PDFs, HTML, RTF, TeX, KML for geographic data) and will learn how XQuery/XPath allow structured searching and cross-referencing in complex texts such as critical editions and diplomatic transcriptions.  Learning Outcomes: Participants will gain a sense of how TEI has been used in major scholarly projects and learn how to create TEI-encoded texts in support of their own research.  Register now:
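As a rough illustration of the structured searching described above, the following sketch uses Python's standard library (which supports a limited subset of XPath) to pull names out of a TEI-style fragment. The sample text and the helper find_names are hypothetical and are not workshop materials; the fragment is deliberately minimal and omits the teiHeader that a valid TEI document requires.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; the "tei" prefix is our own local choice for queries.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, deliberately minimal fragment (no teiHeader, so not schema-valid).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p>Letter from <persName>Ada Lovelace</persName> sent from
         <placeName>London</placeName> to <persName>Charles Babbage</persName>.</p>
    </body>
  </text>
</TEI>"""


def find_names(tei_xml, element):
    """Return the text of every <element> in the document, in document order."""
    root = ET.fromstring(tei_xml)
    matches = root.iterfind(f".//tei:{element}", TEI_NS)
    # Join nested text and collapse any line breaks or runs of spaces.
    return [" ".join("".join(node.itertext()).split()) for node in matches]


people = find_names(SAMPLE, "persName")
places = find_names(SAMPLE, "placeName")
print(people)
print(places)
```

Because the names are marked up semantically, the query distinguishes people from places without any guessing about capitalization or context, which is the practical payoff of encoding a text before searching it.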

Text/Data: Acquiring and Preparing a Corpus of Texts

Thursday, October 17, 2019 / 9:00 - 11:00 AM
Murthy Digital Studio (Bostock 121)

This session focuses on the technical dimensions of corpus development.  Using an array of printed matter -- from digital facsimiles of incunabula to modern letterpress/offset books -- we will explore the risks and benefits of optical character recognition (OCR); file formatting and naming issues; organization strategies for large corpora; and problems of data cleaning and preparation.  We will also look at some sources for textual research data, such as Project Gutenberg, the Internet Archive, and Google Books, and discuss some common legal concerns around the use of textual corpora.  Learning Outcomes: Participants will learn how (and where) to assemble a body of texts for analysis, what characteristics those texts should exhibit, and what potential pitfalls -- legal and technical -- exist in the process of corpus acquisition.  Register now:
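To give a flavor of the data cleaning mentioned above, here is a minimal sketch of one common preparation step: repairing line-break hyphenation and stray whitespace in OCR output. The function clean_ocr_text and the sample string are hypothetical; real corpora usually need rules tuned to the source material.

```python
import re


def clean_ocr_text(raw):
    """Apply a few conservative repairs to raw OCR text."""
    # Rejoin words that were split with a hyphen at the end of a line.
    # Caveat: this also joins genuine compounds broken at a line end
    # (e.g. "well-\nknown" becomes "wellknown").
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Expand two typographic ligatures that OCR engines sometimes emit.
    text = text.replace("\ufb01", "fi").replace("\ufb02", "fl")
    # Collapse line breaks and runs of spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


raw_page = "The quick brown fox jum-\nped over   the\nlazy dog."
print(clean_ocr_text(raw_page))
```

Each rule trades recall for risk, so it is worth spot-checking cleaned output against the page images before running an analysis on it.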

Text/Data: Analyzing Documents in Context

Thursday, October 24, 2019 / 9:00 - 11:00 AM
Murthy Digital Studio (Bostock 121)

This session focuses on the kinds of textual analysis that are possible using concordances, collocates, measures of distinctiveness (e.g., tf-idf) and other tools of style or content analysis based on the characteristics of tokens (words) within documents relative to general characteristics of a corpus.  We will look at how these techniques have been used in research (e.g., Mendenhall's early work on Shakespeare, Morton's stylometric study of the Pauline epistles, Mosteller and Wallace's analysis of the Federalist Papers, and Foster's examination of the novel Primary Colors).  Learning Outcomes: Participants will learn how to undertake some basic types of automated textual analysis using, e.g., metrics of document similarity.  Sample projects will illustrate the range of approaches and suggest the domains of relevant research questions. Register now:
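As a small worked example of a distinctiveness measure such as tf-idf, the sketch below scores each word in a document by its relative frequency there, weighted by how rare it is across the corpus. It uses one common formulation (term frequency times the natural log of N over document frequency); other variants exist, and the three toy documents are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Very simple tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {token: score} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each token appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            token: (count / total) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return scores


# Hypothetical toy corpus.
DOCS = [
    "the whale swam in the sea",
    "the ship sailed on the sea",
    "the captain chased the whale",
]
scores = tf_idf(DOCS)
for token, score in sorted(scores[0].items(), key=lambda item: -item[1]):
    print(f"{token:8s} {score:.3f}")
```

A word that appears in every document ("the") scores zero, while a word unique to one document ("swam") scores highest there, which is the sense in which tf-idf measures what is distinctive about a text relative to its corpus.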

Text/Data: Analyzing Text with Python (1/2)

Thursday, October 31, 2019 / 9:00 - 11:00 AM
Murthy Digital Studio (Bostock 121)

Python is a programming language that is well-suited to working with textual data: it has a clear syntax, it's easy to learn, and there are many libraries available for processing text.  We'll use some of its capabilities in this workshop as we discover how to code our own tools for analyzing individual texts and textual corpora.  This workshop does not assume any previous experience with Python, and it will include an accessible introduction to the language itself.  Like other workshops in the series, its goal is to match research questions with methods rather than to belabor technical details.  Learning Outcomes: Participants will learn how to use the Python programming language to perform basic text analysis.  Register now:
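For a sense of what basic text analysis in Python can look like, here is a minimal sketch that counts word frequencies using only the standard library. The function word_frequencies and the sample sentence are illustrative assumptions and are not taken from the workshop itself.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    # Treat runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


sample = "It was the best of times, it was the worst of times."
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

The same few lines work unchanged on a whole novel read from a file, which is part of why Python is a convenient starting point for this kind of work.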

Text/Data: Analyzing Text with Python (2/2)

Thursday, November 7, 2019 / 9:00 - 11:00 AM
Murthy Digital Studio (Bostock 121)

Continuing where the previous workshop left off, this session is focused on the Python Natural Language Toolkit (NLTK).  Using the TextBlob library, a user-friendly wrapper for NLTK, we'll examine how Python allows simple yet powerful exploration of corpora beyond word frequency and n-grams. Learning Outcomes: Participants will learn how to use the Python programming language and the NLTK library to analyze textual corpora.  Register now:

Text/Data: Topic Modeling for Humanities Research

Thursday, November 14, 2019 / 9:00 - 11:00 AM
Murthy Digital Studio (Bostock 121)

Participants in this session will acquire a general understanding of topic modeling, the automated analysis technique often referred to as "text mining."  Topic modeling can refer to a number of different algorithms, which are computationally intensive and mathematically complex. To facilitate a hands-on approach with a focus on process, this workshop uses the open-source MALLET toolkit as a platform for exploring topic modeling with LDA (Latent Dirichlet Allocation) and will not offer a comparison of algorithms.  Learning Outcomes: Participants will learn what topic modeling can reveal about a body of texts, how to interpret the results of topic modeling or document classification processes, and how to use the open-source MALLET software to analyze a body of texts.  Register now:

Munch & Mull (fall 2019)

Munch & Mull is an informal brown-bag discussion about topics at the intersections of scholarly publishing, academic libraries, and the digital humanities. Sessions take place on the second and fourth Thursdays of the month, except for the weeks of Thanksgiving and Christmas.  All Duke Libraries staff are welcome to attend.  To stay up-to-date on M&M topics, presenters, and news, consider subscribing to the Munch & Mull mailing list at  (This is a low-traffic list that generally receives 2-3 messages per month.) 

(Re)Introducing ScholarWorks

Thursday, September 12, 2019 / 12:00 - 1:00 PM
Murthy Digital Studio (Bostock 121)

ScholarWorks co-directors Paolo Mangiafico and Liz Milewicz will share their vision for the center and answer questions about how ScholarWorks can help librarians and researchers with a range of issues related to publishing and digital scholarship. We want to hear from Libraries staff about potential partnerships; about how we can support, coordinate, and communicate about our collective expertise in topics related to publishing; and about how we can serve as a resource for you and researchers in your departments. 

Information Maintenance as a Practice of Care

Thursday, September 26, 2019 / 12:00 - 1:00 PM
Murthy Digital Studio (Bostock 121)

Join us for a conversation about "Information Maintenance as a Practice of Care: An Invitation to Reflect and Share." Encouraging discussion about the relationship between an ethic of care and the practices of information maintenance, this reading addresses problems, opportunities, and occupational roles around preservation and the long-term upkeep of information -- digital or otherwise. This paper is by the Information Maintainers, a collective "with a common, vested interest in information maintenance and its role within our day-to-day work, our organizations, and our infrastructure."  Arianne Hartsell-Gundy leads the discussion.  

For more about the Maintainers, see; to download "Information Maintenance as a Practice of Care," please visit

Persistent Identifiers and Open Scholarship

Thursday, October 10, 2019 / 12:00 - 1:00 PM
Murthy Digital Studio (Bostock 121)

Haley Walton leads a discussion about persistent identifiers (such as Digital Object Identifiers [DOIs]) for open scholarship. We'll talk about DOIs and impact metrics, how to obtain a stable identifier for scholarly works, and the challenges of creating such identifiers for dynamic, evolving digital scholarship projects. For some background about the DOI system, it may be helpful to check out the website of the International DOI Foundation.

Library Services and Tenure & Promotion at Duke

Thursday, October 24, 2019 / 12:00 - 1:00 PM
Murthy Digital Studio (Bostock 121)

All are invited to join us for a discussion of the ways in which library staff and services play a role--or might potentially play a role--in the tenure and promotion process for faculty.  Readings, reports, and possible discussion questions will be circulated in advance of this Munch & Mull. 

Legal Issues + Best Practices for Digital Scholarship

Thursday, November 14, 2019 / 12:00 - 1:00 PM
Murthy Digital Studio (Bostock 121)

Arnetta Girardeau (Copyright & Information Policy Consultant, Duke Libraries) facilitates a discussion about some of the legal and privacy issues that intersect with digital scholarship projects (for example, copyright, student identity protection, and the rights of contributors and collaborators). As we think about developing a "best practices" guide on these subjects for digital scholarship at Duke, we welcome the perspectives of Libraries staff on what information would be useful for both librarians and our faculty and graduate student partners.

Beyond the PDF: Archiving and Sharing Different Kinds of Scholarship

Thursday, December 12, 2019 / 12:00 - 1:00 PM
Murthy Digital Studio (Bostock 121)

Join us for a conversation about the challenges of archiving, preserving, maintaining, and sharing forms of scholarship that range beyond the conventional monograph. We'll discuss the roles of library staff in facilitating the preservation of new forms of scholarship at all points in the lifecycle of scholarly communication. (Selected readings and resources will be circulated in advance of this M&M.)


Duke Libraries' Digital Scholarship & Publishing Services department collaborates with researchers in the humanities and interpretive social sciences, at any level of study, to plan and build digital research projects. We supply consultation on technical matters, project management, and best practices for a wide range of technologically engaged research. We also encourage learning and experimentation in digital scholarship through exploratory projects, programs of hands-on instruction, graduate student internships, and resources and programming in The Edge / Murthy Digital Studio.