Digital preservation is the process of managing and maintaining digital resources so they can be accessed and used in the future. This topic can be complex. For example, what does it mean to "preserve" a digital resource? Are back-ups sufficient? What does "long term" preservation mean with regard to digital objects? How are libraries engaging in preserving digital resources? And what are some of the best ways to ensure that personal documents will be reusable in the future?
While the answers to some of these questions are still emerging, the Duke University Libraries can help you begin to think about how to keep your content available for other users over time by highlighting some agreed-upon preservation strategies, as well as some of the services we are able to provide to the Duke community.
This page explains key concepts and commonly asked questions. For more formal guidelines, read about our Digital Preservation Policies.
Where to start
There are many excellent sources of information about personal digital archiving. The Library of Congress recommends a four-step approach to starting the digital preservation process:
- Identify where your digital resources are located. Are they on a digital camera? Your phone? Your personal laptop computer? Try to get an inventory of what you want to keep and where it lives.
- Decide which resources are worth preserving! We produce enormous amounts of digital data, but not all of it is necessarily worth keeping long-term.
- Organize your resources. This can seem onerous or time-consuming, but it is important. Make sure your resources at least have a descriptive file name that is distinctive, human readable and avoids special characters, and that they are organized into some kind of structure.
- Finally, make copies. Backing up your data in as many modes and storage media as possible helps ensure that it will be recoverable should something go disastrously wrong. Think about things like external hard drives and online storage.
- Personal Digital Archiving Day Kit, from the Library of Congress
- Federal Agencies Digital Guidelines Initiative
- Smithsonian Data Management Best Practices: Naming and Organizing Files
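If you have many files to rename, the file-naming advice above can be partly automated. The following is a minimal Python sketch, not a Library of Congress tool. The function name `safe_file_name` and its specific rules (lowercase, underscores between words, ASCII only) are assumptions chosen for illustration, so adjust them to your own conventions.

```python
import re
import unicodedata


def safe_file_name(name: str) -> str:
    """Return a lowercase, ASCII-only version of a file name, keeping its extension."""
    stem, dot, ext = name.rpartition(".")
    if not dot or not stem:
        # No extension (or a name that starts with a dot): treat it all as the stem
        stem, ext = name, ""
    # Split accented letters into base letter + accent, then drop non-ASCII characters
    stem = unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    # Replace each run of spaces or special characters with a single underscore
    stem = re.sub(r"[^A-Za-z0-9-]+", "_", stem).strip("_").lower()
    return f"{stem}.{ext.lower()}" if ext else stem


print(safe_file_name("Field Notes (Draft #2).DOCX"))  # field_notes_draft_2.docx
```

A cleaned-up name is still only as descriptive as the original, so it is worth checking the results by eye before renaming anything.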
How to make your data last
Selecting, organizing, and making back-ups of your data are important first steps, but there are some other considerations to keep in mind when thinking about how to make your data available long term.
Consider file formats carefully
File formats have historically proliferated along with the different kinds of content types that make up digital data, and can vary widely in terms of data encoding complexity. Not all file formats are equally appropriate for preservation. If you've ever tried to work with documents created by commercial word-processing software from the 1990s, you will understand this well! File format obsolescence occurs when new generations of software abandon support for older formats, and it can render your data unusable.
There are some characteristics of certain formats that can make them relatively resistant to obsolescence. Generally, the more open (non-proprietary or non-commercial), commonly used, and well-documented the format, the better. Data saved as plain text, JPEG, WAV, or CSV files not only stand a better chance of being usable in the future, but are also more likely to be usable today by a broad array of software applications.
Our Recommended File Formats page describes preferred file formats for digital preservation. We’ve based them on best practices as described by a number of heritage institutions such as the Library of Congress and the National Archives and Records Administration. Our recommendations favor formats for which the Libraries have a reasonable level of confidence in future re-use.
- File formats and standards, from the Digital Preservation Coalition's Digital Preservation Handbook
- Library of Congress Recommended Formats Statement
- U.S. National Archives Digital Preservation Framework
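To get a rough sense of which formats you are actually holding, you could count the files in a folder by extension. The sketch below is only an illustration. `PREFERRED_EXTENSIONS` is a hypothetical starting list built from the formats named above, not an official recommended-formats list, and a file extension is only a hint about the format actually inside a file.

```python
from collections import Counter
from pathlib import Path

# Hypothetical starting list based on the formats named above; edit to suit your needs
PREFERRED_EXTENSIONS = {".txt", ".csv", ".jpg", ".jpeg", ".wav"}


def summarize_formats(folder):
    """Count files under `folder` by extension, split into preferred and other."""
    preferred, other = Counter(), Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            ext = path.suffix.lower() or "(no extension)"
            if ext in PREFERRED_EXTENSIONS:
                preferred[ext] += 1
            else:
                other[ext] += 1
    return preferred, other


if __name__ == "__main__":
    preferred, other = summarize_formats(".")
    print("Preferred formats:", dict(preferred))
    print("Worth a closer look:", dict(other))
```

Files in the second group are not necessarily at risk; they are simply the ones to compare against a recommended-formats list first.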
Ensuring continued reuse with bit-level preservation
Even recommended formats are not totally immune from format obsolescence, which can happen to both open and proprietary formats. File format migration, in which the data encoded in an obsolete format are migrated to a more current version, may help to ensure continued long-term reuse of the data. Because it involves potentially changing the structure and content of data, format migration may introduce error, data loss, or formatting changes. In addition, monitoring for file format obsolescence can be resource-intensive, especially when your data are encoded in a wide variety of formats.
For this reason, if the data are encoded in a format that does not lend itself to format migration, often the most robust level of preservation that can be applied is bit-level preservation: ensuring that the bits in the preservation environment remain unchanged, with no guarantee that future software programs will display the file successfully. A file's fixity (the assurance that the digital file hasn't changed) can be monitored by calculating and comparing checksums.
A checksum is an algorithmic hash that acts as a digital fingerprint for the file. Any change to a file's contents will cause the checksum to change, alerting the owner of the file to the fact that something may have gone wrong. Checksums can help ensure both that a file has been properly transferred (either from one storage environment to another, or from user to user), and that a file's fixity is maintained while it resides in the storage environment.
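As a concrete illustration, here is a minimal Python sketch that calculates a checksum and compares it with one recorded earlier. The choice of SHA-256 and the function names `sha256_checksum` and `fixity_ok` are assumptions made for this example; other algorithms are also in common use.

```python
import hashlib


def sha256_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, expected_checksum):
    """Return True if the file's current checksum matches the recorded one."""
    return sha256_checksum(path) == expected_checksum.lower()
```

To check a transfer, record `sha256_checksum(original)` before copying and call `fixity_ok(copy, recorded_value)` afterwards. A mismatch means the copy differs from the original and should be transferred again.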
- Fixity and checksums, from the Digital Preservation Coalition's Digital Preservation Handbook
- "B is for bit preservation", from the Library of Congress blog, The Signal
Checksums and fixity are only a part of the digital preservation puzzle. Other concerns that are equally important include:
- Keeping multiple copies of your data, ideally in diverse environments across storage media and geographic location
- Making sure that your data are accompanied by adequate documentation and description
- Data encryption, access controls, and other information security issues relevant to the long-term security of data, especially as preservation storage environments transition to the cloud
Long-term data protection & the Libraries
Duke University Libraries are committed to the preservation of a wide array of digital materials. In addition to maintaining a growing collection of digitized and born-digital library-managed materials, we operate preservation environments for Duke-originated scholarship, especially Open Access publications and research data.
While the Libraries can't provide archiving or preservation for personal digital resources (we can't take vacation photos!), we are available to help you think through how digital preservation best practices might apply to research projects, publications, course materials, or university records.
For certain materials, like research data generated by scholars at Duke, the use of a formal preservation environment like the Duke Research Data Repository tackles many of the issues raised above. When you submit your research data to the Libraries for publication, library staff will:
- Review files in your dataset for preservation-suitability
- Review documentation, to be sure that the files and the software or equipment used to generate them are adequately described
- Generate checksums when we move your files into our system. The system will then periodically recalculate to be sure nothing about the files has changed while they're under our stewardship.
- Provide you with a Digital Object Identifier (DOI), which acts as a persistent link to data so that other researchers will be able to access your dataset long term.
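The checksum step in the list above (record checksums on deposit, then recalculate periodically) can be imitated on a small scale for your own files. The sketch below is not the repository's actual implementation. It is a simplified illustration in which `build_manifest` and `audit` are hypothetical helper names and SHA-256 is an assumed algorithm.

```python
import hashlib
from pathlib import Path


def file_checksum(path):
    """Return the SHA-256 checksum of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to `folder`) to its checksum."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): file_checksum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def audit(folder, manifest):
    """Compare the files now in `folder` with a previously recorded manifest."""
    current = build_manifest(folder)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "unexpected": sorted(set(current) - set(manifest)),
        "changed": sorted(
            name
            for name in set(manifest) & set(current)
            if manifest[name] != current[name]
        ),
    }
```

Build the manifest once when you finish organizing a folder, keep it somewhere outside that folder, and run `audit` on a schedule. Three empty lists mean nothing has been lost, added, or altered since the manifest was recorded.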
Many faculty create or participate in collaborative or team-based research projects, such as Bass Connections, Story+, or Data+ projects. For guidance on preserving these projects, please review the Duke University Libraries' Guidelines for Preserving and Disseminating Research Products from Team Based Research. If you have questions regarding these guidelines, please contact ScholarWorks at firstname.lastname@example.org.
Don't hesitate to reach out to the Libraries to assist with digital preservation needs!
For questions about publishing and preserving scholarly publications and other forms of scholarly work, visit ScholarWorks: A Center for Scholarly Publishing at Duke University Libraries, or send your inquiry to email@example.com.
To learn more about preserving other kinds of Duke University-related materials, feel free to contact University Archives.
For all other digital preservation needs, drop us a line at firstname.lastname@example.org.